middle-end IFN_ASSUME support [PR106654]
Commit Message
Hi!
My earlier patches gimplify the simplest non-side-effects assumptions
into if (cond) ; else __builtin_unreachable (); and throw the rest
on the floor.
The following patch attempts to do something with the rest too.
For -O0, it actually throws even the simplest assumptions on the floor,
we don't expect optimizations and the assumptions are there to allow
optimizations. Otherwise, it keeps the handling of the simplest
assumptions as is, and otherwise arranges for the assumptions to be
visible in the IL as
.ASSUME (_Z2f4i._assume.0, i_1(D));
call where there is an artificial function like:
bool _Z2f4i._assume.0 (int i)
{
bool _2;
<bb 2> [local count: 1073741824]:
_2 = i_1(D) == 43;
return _2;
}
with the semantics that there is UB unless the assumption function
would return true.
Aldy, could ranger handle this? If it sees .ASSUME call,
walk the body of such function from the edge(s) to exit with the
assumption that the function returns true, so above set _2 [true, true]
and from there derive that i_1(D) [43, 43] and then map the argument
in the assumption function to argument passed to IFN_ASSUME (note,
args there are shifted by 1)?
During gimplification it actually gimplifies it into
D.2591 = .ASSUME ();
if (D.2591 != 0) goto <D.2593>; else goto <D.2592>;
<D.2593>:
{
i = i + 1;
D.2591 = i == 44;
}
<D.2592>:
.ASSUME (D.2591);
with the condition wrapped into a GIMPLE_BIND (I admit the above isn't
extra clean but it is just something to hold it from gimplifier until
gimple low pass; it reassembles if (condition_never_true) { cond; };
an alternative would be introduce GOMP_ASSUME statement that would have
the guard var as operand and the GIMPLE_BIND as body, but for the
few passes (tree-nested and omp lowering) in between that looked like
an overkill to me) which is then pattern matched during gimple lowering
and outlined into a separate function. Variables declared inside of the
condition (both static and automatic) just change context, automatic
variables from the caller are turned into parameters (note, as the code
is never executed, I handle this way even non-POD types, we don't need to
bother pretending there would be user copy constructors etc. involved).
The assume_function artificial functions are then optimized until RTL
expansion, which isn't done for them nor any later pass, on the other
side bodies are not freed when done.
Earlier version of the patch has been successfully bootstrapped/regtested
on x86_64-linux and i686-linux and all assume tests have passed even with
this version. Ok for trunk if it succeeds on another bootstrap/regtest?
There are a few further changes I'd like to do, like ignoring the
.ASSUME calls in inlining size estimations (but haven't figured out where
it is done), or for LTO arrange for the assume functions to be emitted
in all partitions that reference those (usually there will be just one,
unless code with the assumption got inlined, versioned etc.).
2022-10-10 Jakub Jelinek <jakub@redhat.com>
PR c++/106654
gcc/
* function.h (struct function): Add assume_function bitfield.
* gimplify.cc (gimplify_call_expr): For -O0, always throw
away assumptions on the floor immediately. Otherwise if the
assumption isn't simple enough, expand it into IFN_ASSUME
guarded block.
* gimple-low.cc (create_assumption_fn): New function.
(struct lower_assumption_data): New type.
(find_assumption_locals_r, assumption_copy_decl,
adjust_assumption_stmt_r, adjust_assumption_stmt_op,
lower_assumption): New functions.
(lower_stmt): Handle IFN_ASSUME guarded block.
* tree-ssa-ccp.cc (pass_fold_builtins::execute): Remove
IFN_ASSUME calls.
* lto-streamer-out.cc (output_struct_function_base): Pack
assume_function bit.
* lto-streamer-in.cc (input_struct_function_base): And unpack it.
* cgraphunit.cc (cgraph_node::expand): Don't verify assume_function
has TREE_ASM_WRITTEN set and don't release its body.
* cfgexpand.cc (pass_expand::execute): Don't expand assume_function
into RTL, just destroy loops and exit.
* internal-fn.cc (expand_ASSUME): Remove gcc_unreachable.
* passes.cc (pass_rest_of_compilation::gate): Return false also for
fun->assume_function.
* tree-vectorizer.cc (pass_vectorize::gate,
pass_slp_vectorize::gate): Likewise.
* ipa-icf.cc (sem_function::parse): Punt for func->assume_function.
gcc/cp/
* parser.cc (cp_parser_omp_assumption_clauses): Wrap IFN_ASSUME
argument with fold_build_cleanup_point_expr.
* cp-gimplify.cc (process_stmt_assume_attribute): Likewise.
* pt.cc (tsubst_copy_and_build): Likewise.
gcc/testsuite/
* g++.dg/cpp23/attr-assume5.C: New test.
* g++.dg/cpp23/attr-assume6.C: New test.
* g++.dg/cpp23/attr-assume7.C: New test.
Jakub
Comments
On 10/10/22 04:54, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> My earlier patches gimplify the simplest non-side-effects assumptions
> into if (cond) ; else __builtin_unreachable (); and throw the rest
> on the floor.
> The following patch attempts to do something with the rest too.
> For -O0, it actually throws even the simplest assumptions on the floor,
> we don't expect optimizations and the assumptions are there to allow
> optimizations.
I'd think we should trap on failed assume at -O0 (i.e. with
-funreachable-traps).
Jason
On Mon, Oct 10, 2022 at 05:09:29PM -0400, Jason Merrill wrote:
> On 10/10/22 04:54, Jakub Jelinek via Gcc-patches wrote:
> > My earlier patches gimplify the simplest non-side-effects assumptions
> > into if (cond) ; else __builtin_unreachable (); and throw the rest
> > on the floor.
> > The following patch attempts to do something with the rest too.
> > For -O0, it actually throws even the simplest assumptions on the floor,
> > we don't expect optimizations and the assumptions are there to allow
> > optimizations.
>
> I'd think we should trap on failed assume at -O0 (i.e. with
> -funreachable-traps).
For the simple conditions? Perhaps. But for the side-effects cases
that doesn't seem to be easily possible.
Jakub
On 10/10/22 04:54, Jakub Jelinek via Gcc-patches wrote:
> Hi!
>
> My earlier patches gimplify the simplest non-side-effects assumptions
> into if (cond) ; else __builtin_unreachable (); and throw the rest
> on the floor.
> The following patch attempts to do something with the rest too.
> For -O0, it actually throws even the simplest assumptions on the floor,
> we don't expect optimizations and the assumptions are there to allow
> optimizations. Otherwise, it keeps the handling of the simplest
> assumptions as is, and otherwise arranges for the assumptions to be
> visible in the IL as
> .ASSUME (_Z2f4i._assume.0, i_1(D));
> call where there is an artificial function like:
> bool _Z2f4i._assume.0 (int i)
> {
> bool _2;
>
> <bb 2> [local count: 1073741824]:
> _2 = i_1(D) == 43;
> return _2;
>
> }
> with the semantics that there is UB unless the assumption function
> would return true.
>
> Aldy, could ranger handle this? If it sees .ASSUME call,
> walk the body of such function from the edge(s) to exit with the
> assumption that the function returns true, so above set _2 [true, true]
> and from there derive that i_1(D) [43, 43] and then map the argument
> in the assumption function to argument passed to IFN_ASSUME (note,
> args there are shifted by 1)?
Ranger GORI component could assume the return value is [1,1] and work
backwards from there. Single basic blocks would be trivial. The problem
becomes when there are multiple blocks. The gori engine has no real
restriction other than it works from within a basic block only
I see no reason we couldn't wire something up that continues propagating
values out the top of the block evaluating things for more complicated
cases. you would end up with a set of ranges for names which are the
"maximal" possible range based on the restriction that the return value
is [1,1].
> During gimplification it actually gimplifies it into
> D.2591 = .ASSUME ();
> if (D.2591 != 0) goto <D.2593>; else goto <D.2592>;
> <D.2593>:
> {
> i = i + 1;
> D.2591 = i == 44;
> }
> <D.2592>:
> .ASSUME (D.2591);
> with the condition wrapped into a GIMPLE_BIND (I admit the above isn't
> extra clean but it is just something to hold it from gimplifier until
> gimple low pass; it reassembles if (condition_never_true) { cond; };
What we really care about is what the SSA form looks like.. thats what
ranger will deal with.
Is this function inlined? If it isn't then you'd need LTO/IPA to
propagate the ranges we calculated above for the function. Or some
special pass that reads assumes, does the processing you mention above
and applies it? Is that what you are thinking?
Looking at assume7.C, I see:
int bar (int x)
{
<bb 2> [local count: 1073741824]:
.ASSUME (_Z3bari._assume.0, x_1(D));
return x_1(D);
And:
bool _Z3bari._assume.0 (int x)
{
bool _2;
<bb 2> [local count: 1073741824]:
_2 = x_1(D) == 42;
return _2;
Using the above approach, GORI could tell you that if _2 is [1,1] that
x_1 must be [42,42].
If you are parsing that ASSUME, you could presumably match things pu and
we could make x_1 have a range of [42,42] in bar() at that call.
this would require a bit of processing in fold_using_range for handling
function calls, checking for this case and so on, but quite doable.
looking at the more complicated case for
bool _Z3bazi._assume.0 (int x)
it seems that the answer is determines without processing most of the
function. ie:, work from the bottom up:
<bb 5> [local count: 670631318]:
_8 = x_3 == 43; x_3 = [43,43]
<bb 6> [local count: 1073741824]:
# _1 = PHI <0(2), _8(5)> _8 = [1,1] 2->6 cant happen
return _1; _1 = [1,1]
you only care about x, so as soon as you find a result that that, you'd
actually be done. However, I can imagine cases where you do need to go
all the way back to the top of the assume function.. and combine values. Ie
bool assume (int x, int y)
{
if (y > 10)
return x == 2;
return x > 20;
}
<bb 2> [local count: 1073741824]:
if (y_2(D) > 10)
goto <bb 3>; [34.00%]
else
goto <bb 4>; [66.00%]
<bb 3> [local count: 365072224]:
_5 = x_3(D) == 2; x_3 = [2,2]
goto <bb 5>; [100.00%]
<bb 4> [local count: 708669601]:
_4 = x_3(D) > 20; x_3 = [21, +INF]
<bb 5> [local count: 1073741824]:
# _1 = PHI <_5(3), _4(4)> _5 = [1,1], _4 = [1,1]
return _1;
And we'd have a range of [2,2][21, +INF]
if you wanted to be able to plug values of Y in, things would get more
complicated, but the framework would all be there.
OK, enough pontificating, it wouldn't be ranger, but yes, you could
wire GORI into a pass which evaluates what we think the range of a set
of inputs would be if we return 1. I don't think It would be very
difficult, just a bit of work to work the IL in reverse. And then you
should be able to wire in a range update for those parameters in
fold_using_range (or wherever else I suppose) with a little more work.
It seems to me that if you were to "optimize" the function via this new
pass, assuming 1 was returned, you could probably eliminate a lot of
the extraneous code.. for what thats worth.
Can the assume function produce more than one assumption you care about?
Andrew
On Tue, Oct 11, 2022 at 02:05:52PM -0400, Andrew MacLeod wrote:
> > Aldy, could ranger handle this? If it sees .ASSUME call,
> > walk the body of such function from the edge(s) to exit with the
> > assumption that the function returns true, so above set _2 [true, true]
> > and from there derive that i_1(D) [43, 43] and then map the argument
> > in the assumption function to argument passed to IFN_ASSUME (note,
> > args there are shifted by 1)?
>
>
> Ranger GORI component could assume the return value is [1,1] and work
> backwards from there. Single basic blocks would be trivial. The problem
> becomes when there are multiple blocks. The gori engine has no real
> restriction other than it works from within a basic block only
>
> I see no reason we couldn't wire something up that continues propagating
> values out the top of the block evaluating things for more complicated
> cases. you would end up with a set of ranges for names which are the
> "maximal" possible range based on the restriction that the return value is
> [1,1].
>
>
> > During gimplification it actually gimplifies it into
> > D.2591 = .ASSUME ();
> > if (D.2591 != 0) goto <D.2593>; else goto <D.2592>;
> > <D.2593>:
> > {
> > i = i + 1;
> > D.2591 = i == 44;
> > }
> > <D.2592>:
> > .ASSUME (D.2591);
> > with the condition wrapped into a GIMPLE_BIND (I admit the above isn't
> > extra clean but it is just something to hold it from gimplifier until
> > gimple low pass; it reassembles if (condition_never_true) { cond; };
>
>
> What we really care about is what the SSA form looks like.. thats what
> ranger will deal with.
Sure.
> Is this function inlined? If it isn't then you'd need LTO/IPA to propagate
Never (the code is not supposed to be actually executed at runtime ever,
it is purely as if, if this function would be executed, then it would return
true, otherwise it would be UB). But the goal is of course to inline stuff
into it and optimize the function even post IPA.
> the ranges we calculated above for the function. Or some special pass that
> reads assumes, does the processing you mention above and applies it? Is
> that what you are thinking?
The options would be to evaluate it each time ranger processes .ASSUME,
or to perform this backwards range propagation somewhere late during post
IPA optimizations of the cfun->assume_function and remember it somewhere
(e.g. in SSA_NAME_RANGE_INFO of the default defs of the params) and then
when visiting .ASSUME just look those up. I think the latter is better,
we'd do it only once - the assumption that the function returns true after
the assume function itself is optimized will always be the same.
It could be a separate pass (gated on fun->assume_function, so done only
for them) somewhere shortly before expansion to RTL (which is what isn't
done and nothing later for those), or could be done say in VRP2 or some
other existing late pass.
> Looking at assume7.C, I see:
>
> int bar (int x)
> {
> <bb 2> [local count: 1073741824]:
> .ASSUME (_Z3bari._assume.0, x_1(D));
> return x_1(D);
>
> And:
>
> bool _Z3bari._assume.0 (int x)
> {
> bool _2;
>
> <bb 2> [local count: 1073741824]:
> _2 = x_1(D) == 42;
> return _2;
>
>
> Using the above approach, GORI could tell you that if _2 is [1,1] that x_1
> must be [42,42].
>
> If you are parsing that ASSUME, you could presumably match things pu and we
> could make x_1 have a range of [42,42] in bar() at that call.
If we cache the range info for the assume_function arguments the above way
on SSA_NAME_RANGE_INFO, then you'd just see .ASSUME call and for (n+1)th
argument find nth argument of the 1st argument FUNCTION_DECL's
DECL_ARGUMENTS, ssa_default_def (DECL_STRUCT_FUNCTION (assume_fndecl), parm)
and just union the current range of (n+1)th argument with
SSA_NAME_RANGE_INFO of the ssa_default_def (if non-NULL).
>
> this would require a bit of processing in fold_using_range for handling
> function calls, checking for this case and so on, but quite doable.
>
> looking at the more complicated case for
>
> bool _Z3bazi._assume.0 (int x)
>
> it seems that the answer is determines without processing most of the
> function. ie:, work from the bottom up:
>
> <bb 5> [local count: 670631318]:
> _8 = x_3 == 43; x_3 = [43,43]
>
> <bb 6> [local count: 1073741824]:
> # _1 = PHI <0(2), _8(5)> _8 = [1,1] 2->6 cant happen
> return _1; _1 = [1,1]
>
> you only care about x, so as soon as you find a result that that, you'd
> actually be done. However, I can imagine cases where you do need to go all
> the way back to the top of the assume function.. and combine values. Ie
>
> bool assume (int x, int y)
> {
> if (y > 10)
> return x == 2;
> return x > 20;
> }
>
> <bb 2> [local count: 1073741824]:
> if (y_2(D) > 10)
> goto <bb 3>; [34.00%]
> else
> goto <bb 4>; [66.00%]
>
> <bb 3> [local count: 365072224]:
> _5 = x_3(D) == 2; x_3 = [2,2]
> goto <bb 5>; [100.00%]
>
> <bb 4> [local count: 708669601]:
> _4 = x_3(D) > 20; x_3 = [21, +INF]
>
> <bb 5> [local count: 1073741824]:
> # _1 = PHI <_5(3), _4(4)> _5 = [1,1], _4 = [1,1]
>
> return _1;
>
> And we'd have a range of [2,2][21, +INF]
> if you wanted to be able to plug values of Y in, things would get more
> complicated, but the framework would all be there.
Yeah. Note, it is fine to handle say single block assume functions (after
optimizations) first and improve incrementally later, the goal is that
people actually see useful optimizations with simpler (but not simplest)
assume conditions, so they don't say they aren't completely useless, and if
they come up with something more complex that we don't handle yet, they
can file enhancement requests. Of course, trying to walk all the bbs
backwards would be nicer, though even then it is important to be primarily
correct and so punting on anything we can't handle is fine (e.g. if there
are loops etc.).
> It seems to me that if you were to "optimize" the function via this new
> pass, assuming 1 was returned, you could probably eliminate a lot of the
> extraneous code.. for what thats worth.
>
> Can the assume function produce more than one assumption you care about?
The assume function can have many arguments (one is created for each
automatic var referenced or set by the condition), so it would be nice to
track all of them rather than just hardcoding the first. And, the argument
doesn't necessarily have to be a scalar, so perhaps later on we could derive
ranges even for structure members etc. if needed. Or just handle
assume_function in IPA-SRA somehow at least.
The C++23 paper mentions
[[assume(size > 0)]];
[[assume(size % 32 == 0)]];
[[assume(std::isfinite(data[i]))]];
[[assume(*pn >= 1)]];
[[assume(i == 42)]];
[[assume(++i == 43)]];
[[assume((std::cin >> i, i == 42))]];
[[assume(++almost_last == last)]];
among other things, the first and fifth are already handled the
if (!cond) __builtin_unreachable (); way and so work even without this
patch, the (std::cin >> i, i == 42) case is not worth bothering for now
and the rest would be single block assumptions that can be handled easily
(except the last one which would have record type arguments and so we'd need
SRA).
Jakub
On 10/12/22 06:15, Jakub Jelinek wrote:
>
>> the ranges we calculated above for the function. Or some special pass that
>> reads assumes, does the processing you mention above and applies it? Is
>> that what you are thinking?
> The options would be to evaluate it each time ranger processes .ASSUME,
> or to perform this backwards range propagation somewhere late during post
> IPA optimizations of the cfun->assume_function and remember it somewhere
> (e.g. in SSA_NAME_RANGE_INFO of the default defs of the params) and then
> when visiting .ASSUME just look those up. I think the latter is better,
> we'd do it only once - the assumption that the function returns true after
> the assume function itself is optimized will always be the same.
> It could be a separate pass (gated on fun->assume_function, so done only
> for them) somewhere shortly before expansion to RTL (which is what isn't
> done and nothing later for those), or could be done say in VRP2 or some
> other existing late pass.
>
I agree, I think it would be better to process once, and store the
results some where. I could provide a routine which attempts the
evaluation of the current function, and returns a "safe" range for each
of the parameters. By safe, I mean it would assume VARYING for every
unknown value in the function, reduced by whatever the backward walk
determined. We can refine that later by wiring this call in after a
full ranger walk of VRP for instance to get more precise values, but
that is not necessary at the moment.
I can also make it so that we always try to look up values from the
.ASSUME in fold_using_range, which means any VRP or other ranger client
will pick up the results. If there is nothing available, it would
return VARYING as the default. Any current range would be intersected
with what the ASSUME query returns.
I presume you are looking to get this working for this release, making
the priority high? :-)
>> Looking at assume7.C, I see:
>>
>> int bar (int x)
>> {
>> <bb 2> [local count: 1073741824]:
>> .ASSUME (_Z3bari._assume.0, x_1(D));
>> return x_1(D);
>>
>> And:
>>
>> bool _Z3bari._assume.0 (int x)
>> {
>> bool _2;
>>
>> <bb 2> [local count: 1073741824]:
>> _2 = x_1(D) == 42;
>> return _2;
>>
>>
>> Using the above approach, GORI could tell you that if _2 is [1,1] that x_1
>> must be [42,42].
>>
>> If you are parsing that ASSUME, you could presumably match things pu and we
>> could make x_1 have a range of [42,42] in bar() at that call.
> If we cache the range info for the assume_function arguments the above way
> on SSA_NAME_RANGE_INFO, then you'd just see .ASSUME call and for (n+1)th
> argument find nth argument of the 1st argument FUNCTION_DECL's
> DECL_ARGUMENTS, ssa_default_def (DECL_STRUCT_FUNCTION (assume_fndecl), parm)
> and just union the current range of (n+1)th argument with
> SSA_NAME_RANGE_INFO of the ssa_default_def (if non-NULL).
Intersection I believe...? I think the value from the assume's should
add restrictions to the range..
>> this would require a bit of processing in fold_using_range for handling
>> function calls, checking for this case and so on, but quite doable.
>>
>> looking at the more complicated case for
>>
>> bool _Z3bazi._assume.0 (int x)
>>
>> it seems that the answer is determines without processing most of the
>> function. ie:, work from the bottom up:
>>
>> <bb 5> [local count: 670631318]:
>> _8 = x_3 == 43; x_3 = [43,43]
>>
>> <bb 6> [local count: 1073741824]:
>> # _1 = PHI <0(2), _8(5)> _8 = [1,1] 2->6 cant happen
>> return _1; _1 = [1,1]
>>
>> you only care about x, so as soon as you find a result that that, you'd
>> actually be done. However, I can imagine cases where you do need to go all
>> the way back to the top of the assume function.. and combine values. Ie
>>
>> bool assume (int x, int y)
>> {
>> if (y > 10)
>> return x == 2;
>> return x > 20;
>> }
>>
>> <bb 2> [local count: 1073741824]:
>> if (y_2(D) > 10)
>> goto <bb 3>; [34.00%]
>> else
>> goto <bb 4>; [66.00%]
>>
>> <bb 3> [local count: 365072224]:
>> _5 = x_3(D) == 2; x_3 = [2,2]
>> goto <bb 5>; [100.00%]
>>
>> <bb 4> [local count: 708669601]:
>> _4 = x_3(D) > 20; x_3 = [21, +INF]
>>
>> <bb 5> [local count: 1073741824]:
>> # _1 = PHI <_5(3), _4(4)> _5 = [1,1], _4 = [1,1]
>>
>> return _1;
>>
>> And we'd have a range of [2,2][21, +INF]
>> if you wanted to be able to plug values of Y in, things would get more
>> complicated, but the framework would all be there.
> Yeah. Note, it is fine to handle say single block assume functions (after
> optimizations) first and improve incrementally later, the goal is that
> people actually see useful optimizations with simpler (but not simplest)
> assume conditions, so they don't say they aren't completely useless, and if
> they come up with something more complex that we don't handle yet, they
> can file enhancement requests. Of course, trying to walk all the bbs
> backwards would be nicer, though even then it is important to be primarily
> correct and so punting on anything we can't handle is fine (e.g. if there
> are loops etc.).
Single blocks for the first cut and POC. easier. then look at expansion
>> It seems to me that if you were to "optimize" the function via this new
>> pass, assuming 1 was returned, you could probably eliminate a lot of the
>> extraneous code.. for what thats worth.
>>
>> Can the assume function produce more than one assumption you care about?
> The assume function can have many arguments (one is created for each
> automatic var referenced or set by the condition), so it would be nice to
> track all of them rather than just hardcoding the first. And, the argument
> doesn't necessarily have to be a scalar, so perhaps later on we could derive
> ranges even for structure members etc. if needed. Or just handle
> assume_function in IPA-SRA somehow at least.
>
> The C++23 paper mentions
> [[assume(size > 0)]];
> [[assume(size % 32 == 0)]];
> [[assume(std::isfinite(data[i]))]];
> [[assume(*pn >= 1)]];
> [[assume(i == 42)]];
> [[assume(++i == 43)]];
> [[assume((std::cin >> i, i == 42))]];
> [[assume(++almost_last == last)]];
> among other things, the first and fifth are already handled the
> if (!cond) __builtin_unreachable (); way and so work even without this
> patch, the (std::cin >> i, i == 42) case is not worth bothering for now
> and the rest would be single block assumptions that can be handled easily
> (except the last one which would have record type arguments and so we'd need
> SRA).
I figured as much, I was just wondering if there might be some way to
"simplify" certain things by processing it and turning each parameter
query into a smaller function returning the range we determined from the
main one... but perhaps that is more complicated.
Andrew
On Wed, Oct 12, 2022 at 10:31:00AM -0400, Andrew MacLeod wrote:
> I presume you are looking to get this working for this release, making the
> priority high? :-)
Yes. So that we can claim we actually support C++23 Portable Assumptions
and OpenMP assume directive's hold clauses for something non-trivial so
people won't be afraid to actually use it.
Of course, first the posted patch needs to be reviewed and only once it gets
in, the ranger/GORI part can follow. As the latter is only an optimization,
it can be done incrementally.
> Intersection I believe...? I think the value from the assume's should add
> restrictions to the range..
Sure, sorry.
> I figured as much, I was just wondering if there might be some way to
> "simplify" certain things by processing it and turning each parameter query
> into a smaller function returning the range we determined from the main
> one... but perhaps that is more complicated.
We don't really know what the condition is, it can be pretty arbitrary
expression (well, e.g. for C++ conditional expression, so say
[[assume (var = foo ())]];
is not valid but
[[assume ((var = foo ()))]];
is. And with GNU statement expressions it can do a lot of stuff and until
we e.g. inline into it and optimize it a little, we don't really know what
it will be like.
Jakub
On 10/12/22 10:39, Jakub Jelinek wrote:
> On Wed, Oct 12, 2022 at 10:31:00AM -0400, Andrew MacLeod wrote:
>> I presume you are looking to get this working for this release, making the
>> priority high? :-)
> Yes. So that we can claim we actually support C++23 Portable Assumptions
> and OpenMP assume directive's hold clauses for something non-trivial so
> people won't be afraid to actually use it.
> Of course, first the posted patch needs to be reviewed and only once it gets
> in, the ranger/GORI part can follow. As the latter is only an optimization,
> it can be done incrementally.
I will start poking at something to find ranges for parameters from the
return backwards.
>> Intersection I believe...? I think the value from the assume's should add
>> restrictions to the range..
> Sure, sorry.
>
>> I figured as much, I was just wondering if there might be some way to
>> "simplify" certain things by processing it and turning each parameter query
>> into a smaller function returning the range we determined from the main
>> one... but perhaps that is more complicated.
> We don't really know what the condition is, it can be pretty arbitrary
> expression (well, e.g. for C++ conditional expression, so say
> [[assume (var = foo ())]];
> is not valid but
> [[assume ((var = foo ()))]];
> is. And with GNU statement expressions it can do a lot of stuff and until
> we e.g. inline into it and optimize it a little, we don't really know what
> it will be like.
>
>
No, I just meant that once we finally process the complicated function,
and decide the final range we are storing is for x_1 is say [20,30], we
could replace the assume call site with something like
int assume03_x (x) { if (x>= 20 || x <= 30) return x;
gcc_unreachable(); }
then at call sites:
x_5 = assume03_x(x_3);
For that matter, once all the assume functions have been processed, we
could textually replace the assume call with an expression which
represents the determined range... Kind of our own mini inlining?
Maybe thats even better than adding any kind of support in
fold_using_range.. just let things naturally fall into place?
.ASSUME_blah ( , , x_4);
where if x is determined to be [20, 30][50,60] could be textually
"expanded" in the IL with
if (x<20 || x>60 || (x>30 && x < 50)) gcc_unreachcable();
for each of the parameters? If we processed this like early inlining,
we could maybe expose the entire thing to optimization that way?
Andrew
On Wed, 12 Oct 2022, Andrew MacLeod wrote:
>
> On 10/12/22 10:39, Jakub Jelinek wrote:
> > On Wed, Oct 12, 2022 at 10:31:00AM -0400, Andrew MacLeod wrote:
> >> I presume you are looking to get this working for this release, making the
> >> priority high? :-)
> > Yes. So that we can claim we actually support C++23 Portable Assumptions
> > and OpenMP assume directive's hold clauses for something non-trivial so
> > people won't be afraid to actually use it.
> > Of course, first the posted patch needs to be reviewed and only once it gets
> > in, the ranger/GORI part can follow. As the latter is only an optimization,
> > it can be done incrementally.
>
> I will start poking at something to find ranges for parameters from the return
> backwards.
If the return were
if (return_val)
return return_val;
you could use path-ranger with the parameter SSA default defs as
"interesting". So you "only" need to somehow interpret the return
statement as such and do path rangers compute_ranges ()
>
> >> Intersection I believe...? I think the value from the assume's should add
> >> restrictions to the range..
> > Sure, sorry.
> >
> >> I figured as much, I was just wondering if there might be some way to
> >> "simplify" certain things by processing it and turning each parameter query
> >> into a smaller function returning the range we determined from the main
> >> one... but perhaps that is more complicated.
> > We don't really know what the condition is, it can be pretty arbitrary
> > expression (well, e.g. for C++ conditional expression, so say
> > [[assume (var = foo ())]];
> > is not valid but
> > [[assume ((var = foo ()))]];
> > is. And with GNU statement expressions it can do a lot of stuff and until
> > we e.g. inline into it and optimize it a little, we don't really know what
> > it will be like.
> >
> >
>
> No, I just meant that once we finally process the complicated function, and
> decide the final range we are storing is for x_1 is say [20,30], we could
> replace the assume call site with something like
>
> int assume03_x (x) { if (x>= 20 || x <= 30) return x; gcc_unreachable(); }
>
> then at call sites:
>
> x_5 = assume03_x(x_3);
>
> For that matter, once all the assume functions have been processed, we could
> textually replace the assume call with an expression which represents the
> determined range... Kind of our own mini inlining? Maybe thats even better
> than adding any kind of support in fold_using_range.. just let things
> naturally fall into place?
>
> .ASSUME_blah ( , , x_4);
>
> where if x is determined to be [20, 30][50,60] could be textually "expanded"
> in the IL with
>
> if (x<20 || x>60 || (x>30 && x < 50)) gcc_unreachcable();
>
> for each of the parameters? If we processed this like early inlining, we
> could maybe expose the entire thing to optimization that way?
>
> Andrew
>
>
>
On Thu, Oct 13, 2022 at 08:11:53AM +0000, Richard Biener wrote:
> On Wed, 12 Oct 2022, Andrew MacLeod wrote:
>
> >
> > On 10/12/22 10:39, Jakub Jelinek wrote:
> > > On Wed, Oct 12, 2022 at 10:31:00AM -0400, Andrew MacLeod wrote:
> > >> I presume you are looking to get this working for this release, making the
> > >> priority high? :-)
> > > Yes. So that we can claim we actually support C++23 Portable Assumptions
> > > and OpenMP assume directive's hold clauses for something non-trivial so
> > > people won't be afraid to actually use it.
> > > Of course, first the posted patch needs to be reviewed and only once it gets
> > > in, the ranger/GORI part can follow. As the latter is only an optimization,
> > > it can be done incrementally.
> >
> > I will start poking at something to find ranges for parameters from the return
> > backwards.
>
> If the return were
>
> if (return_val)
> return return_val;
>
> you could use path-ranger with the parameter SSA default defs as
> "interesting". So you "only" need to somehow interpret the return
> statement as such and do path rangers compute_ranges ()
If it was easier for handling, another possible representation of the
assume_function could be not that it returns a bool where [1,1] returned
means defined behavior, otherwise UB, but that the function returns void
and the assumption is that it returns, the other paths would be
__builtin_unreachable (). But still in both cases it needs a specialized
backwards walk from the assumption that either it returns [1,1] or that it
returns through GIMPLE_RETURN to be defined behavior. In either case,
external exceptions, or infinite loops or other reasons why the function
might not return normally (exit/abort/longjmp/non-local goto etc.) are still
UB for assumptions.
Say normally, if we have:
extern void foo (int);
bool
assume1 (int x)
{
foo (x);
if (x != 42)
__builtin_unreachable ();
return true;
}
we can't through backwards ranger walk determine that x_1(D) at the start of
the function has [42,42] range, we can just say it is true at the end of the
function, because foo could do if (x != 42) exit (0); or if (x != 42) throw
1; or if (x != 42) longjmp (buf, 1); or while (x != 42) ; or if (x != 42)
abort ();
But with assumption functions we actually can and stick [42, 42] on the
parameters even when we know nothing about foo function.
Of course, perhaps initially, we can choose to ignore those extra
guarantees.
Jakub
On Wed, Oct 12, 2022 at 12:12:38PM -0400, Andrew MacLeod wrote:
> No, I just meant that once we finally process the complicated function, and
> decide the final range we are storing is for x_1 is say [20,30], we could
> replace the assume call site with something like
>
> int assume03_x (x) { if (x>= 20 || x <= 30) return x; gcc_unreachable(); }
>
> then at call sites:
>
> x_5 = assume03_x(x_3);
>
> For that matter, once all the assume functions have been processed, we could
> textually replace the assume call with an expression which represents the
> determined range... Kind of our own mini inlining? Maybe thats even better
> than adding any kind of support in fold_using_range.. just let things
> naturally fall into place?
>
> .ASSUME_blah ( , , x_4);
>
> where if x is determined to be [20, 30][50,60] could be textually "expanded"
> in the IL with
>
> if (x<20 || x>60 || (x>30 && x < 50)) gcc_unreachcable();
>
> for each of the parameters? If we processed this like early inlining, we
> could maybe expose the entire thing to optimization that way?
That could work for integral parameters, but doesn't work for floating point
nor when builtins are involved. We do not want to put floating point
comparisons into the IL as __builtin_unreachable (); guards because they
have observable side-effects (floating point exceptions/traps) and we
wouldn't DCE them for those reasons. Similarly, if there are builtins
involved we don't want to call the corresponding library functions because
something wasn't DCEd.
Jakub
On 10/13/22 05:53, Jakub Jelinek wrote:
> On Thu, Oct 13, 2022 at 08:11:53AM +0000, Richard Biener wrote:
>> On Wed, 12 Oct 2022, Andrew MacLeod wrote:
>>
>>> On 10/12/22 10:39, Jakub Jelinek wrote:
>>>> On Wed, Oct 12, 2022 at 10:31:00AM -0400, Andrew MacLeod wrote:
>>>>> I presume you are looking to get this working for this release, making the
>>>>> priority high? :-)
>>>> Yes. So that we can claim we actually support C++23 Portable Assumptions
>>>> and OpenMP assume directive's hold clauses for something non-trivial so
>>>> people won't be afraid to actually use it.
>>>> Of course, first the posted patch needs to be reviewed and only once it gets
>>>> in, the ranger/GORI part can follow. As the latter is only an optimization,
>>>> it can be done incrementally.
>>> I will start poking at something to find ranges for parameters from the return
>>> backwards.
>> If the return were
>>
>> if (return_val)
>> return return_val;
>>
>> you could use path-ranger with the parameter SSA default defs as
>> "interesting". So you "only" need to somehow interpret the return
>> statement as such and do path rangers compute_ranges ()
> If it was easier for handling, another possible representation of the
> assume_function could be not that it returns a bool where [1,1] returned
> means defined behavior, otherwise UB, but that the function returns void
> and the assumption is that it returns, the other paths would be
> __builtin_unreachable (). But still in both cases it needs a specialized
> backwards walk from the assumption that either it returns [1,1] or that it
> returns through GIMPLE_RETURN to be defined behavior. In either case,
> external exceptions, or infinite loops or other reasons why the function
> might not return normally (exit/abort/longjmp/non-local goto etc.) are still
> UB for assumptions.
> Say normally, if we have:
> extern void foo (int);
>
> bool
> assume1 (int x)
> {
> foo (x);
> if (x != 42)
> __builtin_unreachable ();
> return true;
> }
> we can't through backwards ranger walk determine that x_1(D) at the start of
> the function has [42,42] range, we can just say it is true at the end of the
> function, because foo could do if (x != 42) exit (0); or if (x != 42) throw
> 1; or if (x != 42) longjmp (buf, 1); or while (x != 42) ; or if (x != 42)
> abort ();
> But with assumption functions we actually can and stick [42, 42] on the
> parameters even when we know nothing about foo function.
>
> Of course, perhaps initially, we can choose to ignore those extra
> guarantees.
>
I dont think we need to change anything. All I intend to do is provide
something that looks for the returns, wire GORI in, and reuse a global
range table in to a reverse dependency walk to starting with a range of
[1,1] for whatever is on the return stmt.
From GORIs point of view, thats all outgoing_edge_range_p () does,
except it picks up the initial value of [0,0] or [1,1] from the
specified edge instead.
Initially It'll stop at the top of the block, but I don't think its too
much work beyond that provide "simple" processing of PHIs and edges
coming into the block.. In the absence of loops it should be pretty
straightforward. "All" you do is feed the value of the phi argument to
the previous block. Of course it'll probably be a little more
complicated than that, but the basic premise seems pretty straightforward.
The result produced would be vector over the ssa-names in the function
with any ranges that were determined. You could use that to more
efficiently store just the values of the parameters somewhere and
somehow associate them with the assume function decl.
I'll try to get to this shortly.
Andrew
Am Montag, den 10.10.2022, 10:54 +0200 schrieb Jakub Jelinek:
> Hi!
>
> My earlier patches gimplify the simplest non-side-effects assumptions
> into if (cond) ; else __builtin_unreachable (); and throw the rest
> on the floor.
> The following patch attempts to do something with the rest too.
My recommendation would be to only process side-effect-free
assumptions and warn about the rest (clang does this for
__builtin_assume). I do not think this is worth the
complexity and I am not so sure the semantics of a
hypothetical evaluation are terribly well defined.
That you can not verify this properly by turning it
into traps in debug mode (as this would execute the
side effects) also makes this a terrible feature IMHO.
MSVC which this feature was based does not seem to make
much to sense to me: https://godbolt.org/z/4Ebar3G9b
Martin
> For -O0, it actually throws even the simplest assumptions on the floor,
> we don't expect optimizations and the assumptions are there to allow
> optimizations. Otherwise, it keeps the handling of the simplest
> assumptions as is, and otherwise arranges for the assumptions to be
> visible in the IL as
> .ASSUME (_Z2f4i._assume.0, i_1(D));
> call where there is an artificial function like:
> bool _Z2f4i._assume.0 (int i)
> {
> bool _2;
>
> <bb 2> [local count: 1073741824]:
> _2 = i_1(D) == 43;
> return _2;
>
> }
> with the semantics that there is UB unless the assumption function
> would return true.
>
> Aldy, could ranger handle this? If it sees .ASSUME call,
> walk the body of such function from the edge(s) to exit with the
> assumption that the function returns true, so above set _2 [true, true]
> and from there derive that i_1(D) [43, 43] and then map the argument
> in the assumption function to argument passed to IFN_ASSUME (note,
> args there are shifted by 1)?
>
> During gimplification it actually gimplifies it into
> D.2591 = .ASSUME ();
> if (D.2591 != 0) goto <D.2593>; else goto <D.2592>;
> <D.2593>:
> {
> i = i + 1;
> D.2591 = i == 44;
> }
> <D.2592>:
> .ASSUME (D.2591);
> with the condition wrapped into a GIMPLE_BIND (I admit the above isn't
> extra clean but it is just something to hold it from gimplifier until
> gimple low pass; it reassembles if (condition_never_true) { cond; };
> an alternative would be introduce GOMP_ASSUME statement that would have
> the guard var as operand and the GIMPLE_BIND as body, but for the
> few passes (tree-nested and omp lowering) in between that looked like
> an overkill to me) which is then pattern matched during gimple lowering
> and outlined into a separate function. Variables declared inside of the
> condition (both static and automatic) just change context, automatic
> variables from the caller are turned into parameters (note, as the code
> is never executed, I handle this way even non-POD types, we don't need to
> bother pretending there would be user copy constructors etc. involved).
>
> The assume_function artificial functions are then optimized until RTL
> expansion, which isn't done for them nor any later pass, on the other
> side bodies are not freed when done.
>
> Earlier version of the patch has been successfully bootstrapped/regtested
> on x86_64-linux and i686-linux and all assume tests have passed even with
> this version. Ok for trunk if it succeeds on another bootstrap/regtest?
>
> There are a few further changes I'd like to do, like ignoring the
> .ASSUME calls in inlining size estimations (but haven't figured out where
> it is done), or for LTO arrange for the assume functions to be emitted
> in all partitions that reference those (usually there will be just one,
> unless code with the assumption got inlined, versioned etc.).
>
> 2022-10-10 Jakub Jelinek <jakub@redhat.com>
>
> PR c++/106654
> gcc/
> * function.h (struct function): Add assume_function bitfield.
> * gimplify.cc (gimplify_call_expr): For -O0, always throw
> away assumptions on the floor immediately. Otherwise if the
> assumption isn't simple enough, expand it into IFN_ASSUME
> guarded block.
> * gimple-low.cc (create_assumption_fn): New function.
> (struct lower_assumption_data): New type.
> (find_assumption_locals_r, assumption_copy_decl,
> adjust_assumption_stmt_r, adjust_assumption_stmt_op,
> lower_assumption): New functions.
> (lower_stmt): Handle IFN_ASSUME guarded block.
> * tree-ssa-ccp.cc (pass_fold_builtins::execute): Remove
> IFN_ASSUME calls.
> * lto-streamer-out.cc (output_struct_function_base): Pack
> assume_function bit.
> * lto-streamer-in.cc (input_struct_function_base): And unpack it.
> * cgraphunit.cc (cgraph_node::expand): Don't verify assume_function
> has TREE_ASM_WRITTEN set and don't release its body.
> * cfgexpand.cc (pass_expand::execute): Don't expand assume_function
> into RTL, just destroy loops and exit.
> * internal-fn.cc (expand_ASSUME): Remove gcc_unreachable.
> * passes.cc (pass_rest_of_compilation::gate): Return false also for
> fun->assume_function.
> * tree-vectorizer.cc (pass_vectorize::gate,
> pass_slp_vectorize::gate): Likewise.
> * ipa-icf.cc (sem_function::parse): Punt for func->assume_function.
> gcc/cp/
> * parser.cc (cp_parser_omp_assumption_clauses): Wrap IFN_ASSUME
> argument with fold_build_cleanup_point_expr.
> * cp-gimplify.cc (process_stmt_assume_attribute): Likewise.
> * pt.cc (tsubst_copy_and_build): Likewise.
> gcc/testsuite/
> * g++.dg/cpp23/attr-assume5.C: New test.
> * g++.dg/cpp23/attr-assume6.C: New test.
> * g++.dg/cpp23/attr-assume7.C: New test.
>
> --- gcc/function.h.jj 2022-10-10 09:31:22.051478926 +0200
> +++ gcc/function.h 2022-10-10 09:59:49.283646705 +0200
> @@ -438,6 +438,10 @@ struct GTY(()) function {
>
> /* Set if there are any OMP_TARGET regions in the function. */
> unsigned int has_omp_target : 1;
> +
> + /* Set for artificial function created for [[assume (cond)]].
> + These should be GIMPLE optimized, but not expanded to RTL. */
> + unsigned int assume_function : 1;
> };
>
> /* Add the decl D to the local_decls list of FUN. */
> --- gcc/gimplify.cc.jj 2022-10-10 09:31:57.518983613 +0200
> +++ gcc/gimplify.cc 2022-10-10 09:59:49.285646677 +0200
> @@ -3556,6 +3556,12 @@ gimplify_call_expr (tree *expr_p, gimple
>
> if (ifn == IFN_ASSUME)
> {
> + /* If not optimizing, ignore the assumptions. */
> + if (!optimize)
> + {
> + *expr_p = NULL_TREE;
> + return GS_ALL_DONE;
> + }
> if (simple_condition_p (CALL_EXPR_ARG (*expr_p, 0)))
> {
> /* If the [[assume (cond)]]; condition is simple
> @@ -3569,7 +3575,46 @@ gimplify_call_expr (tree *expr_p, gimple
> fndecl, 0));
> return GS_OK;
> }
> - /* FIXME: Otherwise expand it specially. */
> + /* Temporarily, until gimple lowering, transform
> + .ASSUME (cond);
> + into:
> + guard = .ASSUME ();
> + if (guard) goto label_true; else label_false;
> + label_true:;
> + {
> + guard = cond;
> + }
> + label_false:;
> + .ASSUME (guard);
> + such that gimple lowering can outline the condition into
> + a separate function easily. */
> + tree guard = create_tmp_var (boolean_type_node);
> + gcall *call = gimple_build_call_internal (ifn, 0);
> + gimple_call_set_nothrow (call, TREE_NOTHROW (*expr_p));
> + gimple_set_location (call, loc);
> + gimple_call_set_lhs (call, guard);
> + gimple_seq_add_stmt (pre_p, call);
> + *expr_p = build2 (MODIFY_EXPR, void_type_node, guard,
> + CALL_EXPR_ARG (*expr_p, 0));
> + *expr_p = build3 (BIND_EXPR, void_type_node, NULL, *expr_p, NULL);
> + tree label_false = create_artificial_label (UNKNOWN_LOCATION);
> + tree label_true = create_artificial_label (UNKNOWN_LOCATION);
> + gcond *cond_stmt = gimple_build_cond (NE_EXPR, guard,
> + boolean_false_node,
> + label_true, label_false);
> + gimplify_seq_add_stmt (pre_p, cond_stmt);
> + gimplify_seq_add_stmt (pre_p, gimple_build_label (label_true));
> + push_gimplify_context ();
> + gimple_seq body = NULL;
> + gimple *g = gimplify_and_return_first (*expr_p, &body);
> + pop_gimplify_context (g);
> + gimplify_seq_add_seq (pre_p, body);
> + gimplify_seq_add_stmt (pre_p, gimple_build_label (label_false));
> + call = gimple_build_call_internal (ifn, 1, guard);
> + gimple_call_set_nothrow (call, TREE_NOTHROW (*expr_p));
> + gimple_set_location (call, loc);
> + gimple_seq_add_stmt (pre_p, call);
> + *expr_p = NULL_TREE;
> return GS_ALL_DONE;
> }
>
> --- gcc/gimple-low.cc.jj 2022-10-10 09:31:22.107478144 +0200
> +++ gcc/gimple-low.cc 2022-10-10 10:22:05.891999132 +0200
> @@ -33,6 +33,13 @@ along with GCC; see the file COPYING3.
> #include "predict.h"
> #include "gimple-predict.h"
> #include "gimple-fold.h"
> +#include "cgraph.h"
> +#include "tree-ssa.h"
> +#include "value-range.h"
> +#include "stringpool.h"
> +#include "tree-ssanames.h"
> +#include "tree-inline.h"
> +#include "gimple-walk.h"
>
> /* The differences between High GIMPLE and Low GIMPLE are the
> following:
> @@ -237,6 +244,383 @@ lower_omp_directive (gimple_stmt_iterato
> gsi_next (gsi);
> }
>
> +static tree
> +create_assumption_fn (location_t loc)
> +{
> + tree name = clone_function_name_numbered (current_function_decl, "_assume");
> + /* For now, will be changed later. */
> + tree type = TREE_TYPE (current_function_decl);
> + tree decl = build_decl (loc, FUNCTION_DECL, name, type);
> + TREE_STATIC (decl) = 1;
> + TREE_USED (decl) = 1;
> + DECL_ARTIFICIAL (decl) = 1;
> + DECL_IGNORED_P (decl) = 1;
> + DECL_NAMELESS (decl) = 1;
> + TREE_PUBLIC (decl) = 0;
> + DECL_UNINLINABLE (decl) = 1;
> + DECL_EXTERNAL (decl) = 0;
> + DECL_CONTEXT (decl) = NULL_TREE;
> + DECL_INITIAL (decl) = make_node (BLOCK);
> + BLOCK_SUPERCONTEXT (DECL_INITIAL (decl)) = decl;
> + DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl)
> + = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (current_function_decl);
> + DECL_FUNCTION_SPECIFIC_TARGET (decl)
> + = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl);
> + DECL_FUNCTION_VERSIONED (decl)
> + = DECL_FUNCTION_VERSIONED (current_function_decl);
> + tree t = build_decl (DECL_SOURCE_LOCATION (decl),
> + RESULT_DECL, NULL_TREE, boolean_type_node);
> + DECL_ARTIFICIAL (t) = 1;
> + DECL_IGNORED_P (t) = 1;
> + DECL_CONTEXT (t) = decl;
> + DECL_RESULT (decl) = t;
> + push_struct_function (decl);
> + cfun->function_end_locus = loc;
> + init_tree_ssa (cfun);
> + return decl;
> +}
> +
> +struct lower_assumption_data
> +{
> + copy_body_data id;
> + tree return_false_label;
> + tree guard_copy;
> + auto_vec<tree> decls;
> +};
> +
> +/* Helper function for lower_assumptions. Find local vars and labels
> + in the assumption sequence and remove debug stmts. */
> +
> +static tree
> +find_assumption_locals_r (gimple_stmt_iterator *gsi_p, bool *,
> + struct walk_stmt_info *wi)
> +{
> + lower_assumption_data *data = (lower_assumption_data *) wi->info;
> + gimple *stmt = gsi_stmt (*gsi_p);
> + tree lhs = gimple_get_lhs (stmt);
> + if (lhs && TREE_CODE (lhs) == SSA_NAME)
> + {
> + gcc_assert (SSA_NAME_VAR (lhs) == NULL_TREE);
> + data->id.decl_map->put (lhs, NULL_TREE);
> + data->decls.safe_push (lhs);
> + }
> + switch (gimple_code (stmt))
> + {
> + case GIMPLE_BIND:
> + for (tree var = gimple_bind_vars (as_a <gbind *> (stmt));
> + var; var = DECL_CHAIN (var))
> + if (VAR_P (var)
> + && !DECL_EXTERNAL (var)
> + && DECL_CONTEXT (var) == data->id.src_fn)
> + {
> + data->id.decl_map->put (var, var);
> + data->decls.safe_push (var);
> + }
> + break;
> + case GIMPLE_LABEL:
> + {
> + tree label = gimple_label_label (as_a <glabel *> (stmt));
> + data->id.decl_map->put (label, label);
> + break;
> + }
> + case GIMPLE_RETURN:
> + /* If something in assumption tries to return from parent function,
> + if it would be reached in hypothetical evaluation, it would be UB,
> + so transform such returns into return false; */
> + {
> + gimple *g = gimple_build_assign (data->guard_copy, boolean_false_node);
> + gsi_insert_before (gsi_p, g, GSI_SAME_STMT);
> + gimple_return_set_retval (as_a <greturn *> (stmt), data->guard_copy);
> + break;
> + }
> + case GIMPLE_DEBUG:
> + /* As assumptions won't be emitted, debug info stmts in them
> + are useless. */
> + gsi_remove (gsi_p, true);
> + wi->removed_stmt = true;
> + break;
> + default:
> + break;
> + }
> + return NULL_TREE;
> +}
> +
> +/* Create a new PARM_DECL that is indentical in all respect to DECL except that
> + DECL can be either a VAR_DECL, a PARM_DECL or RESULT_DECL. The original
> + DECL must come from ID->src_fn and the copy will be part of ID->dst_fn. */
> +
> +static tree
> +assumption_copy_decl (tree decl, copy_body_data *id)
> +{
> + tree type = TREE_TYPE (decl);
> +
> + if (is_global_var (decl))
> + return decl;
> +
> + gcc_assert (VAR_P (decl)
> + || TREE_CODE (decl) == PARM_DECL
> + || TREE_CODE (decl) == RESULT_DECL);
> + tree copy = build_decl (DECL_SOURCE_LOCATION (decl),
> + PARM_DECL, DECL_NAME (decl), type);
> + if (DECL_PT_UID_SET_P (decl))
> + SET_DECL_PT_UID (copy, DECL_PT_UID (decl));
> + TREE_ADDRESSABLE (copy) = TREE_ADDRESSABLE (decl);
> + TREE_READONLY (copy) = TREE_READONLY (decl);
> + TREE_THIS_VOLATILE (copy) = TREE_THIS_VOLATILE (decl);
> + DECL_NOT_GIMPLE_REG_P (copy) = DECL_NOT_GIMPLE_REG_P (decl);
> + DECL_BY_REFERENCE (copy) = DECL_BY_REFERENCE (decl);
> + DECL_ARG_TYPE (copy) = type;
> + ((lower_assumption_data *) id)->decls.safe_push (decl);
> + return copy_decl_for_dup_finish (id, decl, copy);
> +}
> +
> +/* Transform gotos out of the assumption into return false. */
> +
> +static tree
> +adjust_assumption_stmt_r (gimple_stmt_iterator *gsi_p, bool *,
> + struct walk_stmt_info *wi)
> +{
> + lower_assumption_data *data = (lower_assumption_data *) wi->info;
> + gimple *stmt = gsi_stmt (*gsi_p);
> + tree lab = NULL_TREE;
> + unsigned int idx = 0;
> + if (gimple_code (stmt) == GIMPLE_GOTO)
> + lab = gimple_goto_dest (stmt);
> + else if (gimple_code (stmt) == GIMPLE_COND)
> + {
> + repeat:
> + if (idx == 0)
> + lab = gimple_cond_true_label (as_a <gcond *> (stmt));
> + else
> + lab = gimple_cond_false_label (as_a <gcond *> (stmt));
> + }
> + else if (gimple_code (stmt) == GIMPLE_LABEL)
> + {
> + tree label = gimple_label_label (as_a <glabel *> (stmt));
> + DECL_CONTEXT (label) = current_function_decl;
> + }
> + if (lab)
> + {
> + if (!data->id.decl_map->get (lab))
> + {
> + if (!data->return_false_label)
> + data->return_false_label
> + = create_artificial_label (UNKNOWN_LOCATION);
> + if (gimple_code (stmt) == GIMPLE_GOTO)
> + gimple_goto_set_dest (as_a <ggoto *> (stmt),
> + data->return_false_label);
> + else if (idx == 0)
> + gimple_cond_set_true_label (as_a <gcond *> (stmt),
> + data->return_false_label);
> + else
> + gimple_cond_set_false_label (as_a <gcond *> (stmt),
> + data->return_false_label);
> + }
> + if (gimple_code (stmt) == GIMPLE_COND && idx == 0)
> + {
> + idx = 1;
> + goto repeat;
> + }
> + }
> + return NULL_TREE;
> +}
> +
> +/* Adjust trees in the assumption body. Called through walk_tree. */
> +
> +static tree
> +adjust_assumption_stmt_op (tree *tp, int *, void *datap)
> +{
> + struct walk_stmt_info *wi = (struct walk_stmt_info *) datap;
> + lower_assumption_data *data = (lower_assumption_data *) wi->info;
> + tree t = *tp;
> + tree *newt;
> + switch (TREE_CODE (t))
> + {
> + case SSA_NAME:
> + newt = data->id.decl_map->get (t);
> + /* There shouldn't be SSA_NAMEs other than ones defined in the
> + assumption's body. */
> + gcc_assert (newt);
> + *tp = *newt;
> + break;
> + case LABEL_DECL:
> + newt = data->id.decl_map->get (t);
> + if (newt)
> + *tp = *newt;
> + break;
> + case VAR_DECL:
> + case PARM_DECL:
> + case RESULT_DECL:
> + *tp = remap_decl (t, &data->id);
> + break;
> + default:
> + break;
> + }
> + return NULL_TREE;
> +}
> +
> +/* Lower assumption.
> + The gimplifier transformed:
> + .ASSUME (cond);
> + into:
> + guard = .ASSUME ();
> + if (guard) goto label_true; else label_false;
> + label_true:;
> + {
> + guard = cond;
> + }
> + label_false:;
> + .ASSUME (guard);
> + which we should transform into:
> + .ASSUME (&artificial_fn, args...);
> + where artificial_fn will look like:
> + bool artificial_fn (args...)
> + {
> + guard = cond;
> + return guard;
> + }
> + with any debug stmts in the block removed and jumps out of
> + the block or return stmts replaced with return false; */
> +
> +static void
> +lower_assumption (gimple_stmt_iterator *gsi, struct lower_data *data)
> +{
> + gimple *stmt = gsi_stmt (*gsi);
> + tree guard = gimple_call_lhs (stmt);
> + location_t loc = gimple_location (stmt);
> + gcc_assert (guard);
> + gsi_remove (gsi, true);
> + stmt = gsi_stmt (*gsi);
> + gcond *cond = as_a <gcond *> (stmt);
> + gcc_assert (gimple_cond_lhs (cond) == guard
> + || gimple_cond_rhs (cond) == guard);
> + tree l1 = gimple_cond_true_label (cond);
> + tree l2 = gimple_cond_false_label (cond);
> + gsi_remove (gsi, true);
> + stmt = gsi_stmt (*gsi);
> + glabel *lab = as_a <glabel *> (stmt);
> + gcc_assert (gimple_label_label (lab) == l1
> + || gimple_label_label (lab) == l2);
> + gsi_remove (gsi, true);
> + gimple *bind = gsi_stmt (*gsi);
> + gcc_assert (gimple_code (bind) == GIMPLE_BIND);
> +
> + lower_assumption_data lad;
> + hash_map<tree, tree> decl_map;
> + memset (&lad.id, 0, sizeof (lad.id));
> + lad.return_false_label = NULL_TREE;
> + lad.id.src_fn = current_function_decl;
> + lad.id.dst_fn = create_assumption_fn (loc);
> + lad.id.src_cfun = DECL_STRUCT_FUNCTION (lad.id.src_fn);
> + lad.id.decl_map = &decl_map;
> + lad.id.copy_decl = assumption_copy_decl;
> + lad.id.transform_call_graph_edges = CB_CGE_DUPLICATE;
> + lad.id.transform_parameter = true;
> + lad.id.do_not_unshare = true;
> + lad.id.do_not_fold = true;
> + cfun->curr_properties = lad.id.src_cfun->curr_properties;
> + lad.guard_copy = create_tmp_var (boolean_type_node);
> + decl_map.put (lad.guard_copy, lad.guard_copy);
> + decl_map.put (guard, lad.guard_copy);
> + cfun->assume_function = 1;
> +
> + struct walk_stmt_info wi;
> + memset (&wi, 0, sizeof (wi));
> + wi.info = (void *) &lad;
> + walk_gimple_stmt (gsi, find_assumption_locals_r, NULL, &wi);
> + unsigned int sz = lad.decls.length ();
> + for (unsigned i = 0; i < sz; ++i)
> + {
> + tree v = lad.decls[i];
> + tree newv;
> + if (TREE_CODE (v) == SSA_NAME)
> + {
> + newv = make_ssa_name (remap_type (TREE_TYPE (v), &lad.id));
> + decl_map.put (v, newv);
> + }
> + else if (VAR_P (v))
> + {
> + if (is_global_var (v) && !DECL_ASSEMBLER_NAME_SET_P (v))
> + DECL_ASSEMBLER_NAME (v);
> + TREE_TYPE (v) = remap_type (TREE_TYPE (v), &lad.id);
> + DECL_CONTEXT (v) = current_function_decl;
> + }
> + }
> + memset (&wi, 0, sizeof (wi));
> + wi.info = (void *) &lad;
> + walk_gimple_stmt (gsi, adjust_assumption_stmt_r,
> + adjust_assumption_stmt_op, &wi);
> + gsi_remove (gsi, false);
> +
> + gimple_seq body = NULL;
> + gimple *g = gimple_build_assign (lad.guard_copy, boolean_false_node);
> + gimple_seq_add_stmt (&body, g);
> + gimple_seq_add_stmt (&body, bind);
> + greturn *gr = gimple_build_return (lad.guard_copy);
> + gimple_seq_add_stmt (&body, gr);
> + if (lad.return_false_label)
> + {
> + g = gimple_build_label (lad.return_false_label);
> + gimple_seq_add_stmt (&body, g);
> + g = gimple_build_assign (lad.guard_copy, boolean_false_node);
> + gimple_seq_add_stmt (&body, g);
> + gr = gimple_build_return (lad.guard_copy);
> + gimple_seq_add_stmt (&body, gr);
> + }
> + bind = gimple_build_bind (NULL_TREE, body, NULL_TREE);
> + body = NULL;
> + gimple_seq_add_stmt (&body, bind);
> + gimple_set_body (current_function_decl, body);
> + pop_cfun ();
> +
> + tree parms = NULL_TREE;
> + tree parmt = void_list_node;
> + auto_vec<tree, 8> vargs;
> + vargs.safe_grow (1 + (lad.decls.length () - sz), true);
> + vargs[0] = build_fold_addr_expr (lad.id.dst_fn);
> + for (unsigned i = lad.decls.length (); i > sz; --i)
> + {
> + tree *v = decl_map.get (lad.decls[i - 1]);
> + gcc_assert (v && TREE_CODE (*v) == PARM_DECL);
> + DECL_CHAIN (*v) = parms;
> + parms = *v;
> + parmt = tree_cons (NULL_TREE, TREE_TYPE (*v), parmt);
> + vargs[i - sz] = lad.decls[i - 1];
> + if (is_gimple_reg_type (TREE_TYPE (vargs[i - sz]))
> + && !is_gimple_val (vargs[i - sz]))
> + {
> + tree t = make_ssa_name (TREE_TYPE (vargs[i - sz]));
> + g = gimple_build_assign (t, vargs[i - sz]);
> + gsi_insert_before (gsi, g, GSI_SAME_STMT);
> + vargs[i - sz] = t;
> + }
> + }
> + DECL_ARGUMENTS (lad.id.dst_fn) = parms;
> + TREE_TYPE (lad.id.dst_fn) = build_function_type (boolean_type_node, parmt);
> +
> + cgraph_node::add_new_function (lad.id.dst_fn, false);
> +
> + for (unsigned i = 0; i < sz; ++i)
> + {
> + tree v = lad.decls[i];
> + if (TREE_CODE (v) == SSA_NAME)
> + release_ssa_name (v);
> + }
> +
> + stmt = gsi_stmt (*gsi);
> + lab = as_a <glabel *> (stmt);
> + gcc_assert (gimple_label_label (lab) == l1
> + || gimple_label_label (lab) == l2);
> + gsi_remove (gsi, true);
> + stmt = gsi_stmt (*gsi);
> + gcc_assert (gimple_call_internal_p (stmt, IFN_ASSUME)
> + && gimple_call_num_args (stmt) == 1
> + && gimple_call_arg (stmt, 0) == guard);
> + data->cannot_fallthru = false;
> + gcall *call = gimple_build_call_internal_vec (IFN_ASSUME, vargs);
> + gimple_set_location (call, loc);
> + gsi_replace (gsi, call, true);
> +}
>
> /* Lower statement GSI. DATA is passed through the recursion. We try to
> track the fallthruness of statements and get rid of unreachable return
> @@ -354,6 +738,13 @@ lower_stmt (gimple_stmt_iterator *gsi, s
> tree decl = gimple_call_fndecl (stmt);
> unsigned i;
>
> + if (gimple_call_internal_p (stmt, IFN_ASSUME)
> + && gimple_call_num_args (stmt) == 0)
> + {
> + lower_assumption (gsi, data);
> + return;
> + }
> +
> for (i = 0; i < gimple_call_num_args (stmt); i++)
> {
> tree arg = gimple_call_arg (stmt, i);
> --- gcc/tree-ssa-ccp.cc.jj 2022-10-10 09:31:22.472473047 +0200
> +++ gcc/tree-ssa-ccp.cc 2022-10-10 09:59:49.286646663 +0200
> @@ -4253,6 +4253,12 @@ pass_fold_builtins::execute (function *f
> }
>
> callee = gimple_call_fndecl (stmt);
> + if (!callee
> + && gimple_call_internal_p (stmt, IFN_ASSUME))
> + {
> + gsi_remove (&i, true);
> + continue;
> + }
> if (!callee || !fndecl_built_in_p (callee, BUILT_IN_NORMAL))
> {
> gsi_next (&i);
> --- gcc/lto-streamer-out.cc.jj 2022-10-10 09:31:22.331475016 +0200
> +++ gcc/lto-streamer-out.cc 2022-10-10 09:59:49.287646649 +0200
> @@ -2278,6 +2278,7 @@ output_struct_function_base (struct outp
> bp_pack_value (&bp, fn->calls_eh_return, 1);
> bp_pack_value (&bp, fn->has_force_vectorize_loops, 1);
> bp_pack_value (&bp, fn->has_simduid_loops, 1);
> + bp_pack_value (&bp, fn->assume_function, 1);
> bp_pack_value (&bp, fn->va_list_fpr_size, 8);
> bp_pack_value (&bp, fn->va_list_gpr_size, 8);
> bp_pack_value (&bp, fn->last_clique, sizeof (short) * 8);
> --- gcc/lto-streamer-in.cc.jj 2022-10-10 09:31:22.329475044 +0200
> +++ gcc/lto-streamer-in.cc 2022-10-10 09:59:49.287646649 +0200
> @@ -1318,6 +1318,7 @@ input_struct_function_base (struct funct
> fn->calls_eh_return = bp_unpack_value (&bp, 1);
> fn->has_force_vectorize_loops = bp_unpack_value (&bp, 1);
> fn->has_simduid_loops = bp_unpack_value (&bp, 1);
> + fn->assume_function = bp_unpack_value (&bp, 1);
> fn->va_list_fpr_size = bp_unpack_value (&bp, 8);
> fn->va_list_gpr_size = bp_unpack_value (&bp, 8);
> fn->last_clique = bp_unpack_value (&bp, sizeof (short) * 8);
> --- gcc/cgraphunit.cc.jj 2022-10-10 09:31:21.647484568 +0200
> +++ gcc/cgraphunit.cc 2022-10-10 09:59:49.288646635 +0200
> @@ -1882,6 +1882,16 @@ cgraph_node::expand (void)
> ggc_collect ();
> timevar_pop (TV_REST_OF_COMPILATION);
>
> + if (DECL_STRUCT_FUNCTION (decl)
> + && DECL_STRUCT_FUNCTION (decl)->assume_function)
> + {
> + /* Assume functions aren't expanded into RTL, on the other side
> + we don't want to release their body. */
> + if (cfun)
> + pop_cfun ();
> + return;
> + }
> +
> /* Make sure that BE didn't give up on compiling. */
> gcc_assert (TREE_ASM_WRITTEN (decl));
> if (cfun)
> --- gcc/cfgexpand.cc.jj 2022-10-10 09:31:21.554485867 +0200
> +++ gcc/cfgexpand.cc 2022-10-10 09:59:49.288646635 +0200
> @@ -6597,6 +6597,14 @@ pass_expand::execute (function *fun)
> rtx_insn *var_seq, *var_ret_seq;
> unsigned i;
>
> + if (cfun->assume_function)
> + {
> + /* Assume functions should not be expanded to RTL. */
> + cfun->curr_properties &= ~PROP_loops;
> + loop_optimizer_finalize ();
> + return 0;
> + }
> +
> timevar_push (TV_OUT_OF_SSA);
> rewrite_out_of_ssa (&SA);
> timevar_pop (TV_OUT_OF_SSA);
> --- gcc/internal-fn.cc.jj 2022-10-10 09:31:22.246476203 +0200
> +++ gcc/internal-fn.cc 2022-10-10 09:59:49.289646621 +0200
> @@ -4526,5 +4526,4 @@ expand_TRAP (internal_fn, gcall *)
> void
> expand_ASSUME (internal_fn, gcall *)
> {
> - gcc_unreachable ();
> }
> --- gcc/passes.cc.jj 2022-10-10 09:31:22.379474346 +0200
> +++ gcc/passes.cc 2022-10-10 09:59:49.289646621 +0200
> @@ -647,11 +647,12 @@ public:
> {}
>
> /* opt_pass methods: */
> - bool gate (function *) final override
> + bool gate (function *fun) final override
> {
> /* Early return if there were errors. We can run afoul of our
> consistency checks, and there's not really much point in fixing them. */
> - return !(rtl_dump_and_exit || flag_syntax_only || seen_error ());
> + return !(rtl_dump_and_exit || fun->assume_function
> + || flag_syntax_only || seen_error ());
> }
>
> }; // class pass_rest_of_compilation
> --- gcc/tree-vectorizer.cc.jj 2022-10-10 09:31:22.516472432 +0200
> +++ gcc/tree-vectorizer.cc 2022-10-10 09:59:49.290646607 +0200
> @@ -1213,6 +1213,10 @@ public:
> /* opt_pass methods: */
> bool gate (function *fun) final override
> {
> + /* Vectorization makes range analysis of assume functions
> + harder, not easier. */
> + if (fun->assume_function)
> + return false;
> return flag_tree_loop_vectorize || fun->has_force_vectorize_loops;
> }
>
> @@ -1490,7 +1494,14 @@ public:
>
> /* opt_pass methods: */
> opt_pass * clone () final override { return new pass_slp_vectorize (m_ctxt); }
> - bool gate (function *) final override { return flag_tree_slp_vectorize != 0; }
> + bool gate (function *fun) final override
> + {
> + /* Vectorization makes range analysis of assume functions harder,
> + not easier. */
> + if (fun->assume_function)
> + return false;
> + return flag_tree_slp_vectorize != 0;
> + }
> unsigned int execute (function *) final override;
>
> }; // class pass_slp_vectorize
> --- gcc/ipa-icf.cc.jj 2022-06-28 13:03:30.834690968 +0200
> +++ gcc/ipa-icf.cc 2022-10-10 10:29:31.187766299 +0200
> @@ -1517,6 +1517,9 @@ sem_function::parse (cgraph_node *node,
> if (!func || (!node->has_gimple_body_p () && !node->thunk))
> return NULL;
>
> + if (func->assume_function)
> + return NULL;
> +
> if (lookup_attribute_by_prefix ("omp ", DECL_ATTRIBUTES (node->decl)) != NULL)
> return NULL;
>
> --- gcc/cp/parser.cc.jj 2022-10-10 09:31:57.405985191 +0200
> +++ gcc/cp/parser.cc 2022-10-10 09:59:49.295646537 +0200
> @@ -46031,6 +46031,8 @@ cp_parser_omp_assumption_clauses (cp_par
> t = contextual_conv_bool (t, tf_warning_or_error);
> if (is_assume && !error_operand_p (t))
> {
> + if (!processing_template_decl)
> + t = fold_build_cleanup_point_expr (TREE_TYPE (t), t);
> t = build_call_expr_internal_loc (eloc, IFN_ASSUME,
> void_type_node, 1, t);
> finish_expr_stmt (t);
> --- gcc/cp/cp-gimplify.cc.jj 2022-10-10 09:31:57.309986531 +0200
> +++ gcc/cp/cp-gimplify.cc 2022-10-10 09:59:49.296646524 +0200
> @@ -3139,6 +3139,8 @@ process_stmt_assume_attribute (tree std_
> arg = contextual_conv_bool (arg, tf_warning_or_error);
> if (error_operand_p (arg))
> continue;
> + if (!processing_template_decl)
> + arg = fold_build_cleanup_point_expr (TREE_TYPE (arg), arg);
> statement = build_call_expr_internal_loc (attrs_loc, IFN_ASSUME,
> void_type_node, 1, arg);
> finish_expr_stmt (statement);
> --- gcc/cp/pt.cc.jj 2022-10-10 09:31:21.947480379 +0200
> +++ gcc/cp/pt.cc 2022-10-10 09:59:49.299646482 +0200
> @@ -21105,6 +21105,8 @@ tsubst_copy_and_build (tree t,
> ret = error_mark_node;
> break;
> }
> + if (!processing_template_decl)
> + arg = fold_build_cleanup_point_expr (TREE_TYPE (arg), arg);
> ret = build_call_expr_internal_loc (EXPR_LOCATION (t),
> IFN_ASSUME,
> void_type_node, 1,
> --- gcc/testsuite/g++.dg/cpp23/attr-assume5.C.jj 2022-10-10 09:59:49.299646482 +0200
> +++ gcc/testsuite/g++.dg/cpp23/attr-assume5.C 2022-10-10 09:59:49.299646482 +0200
> @@ -0,0 +1,5 @@
> +// P1774R8 - Portable assumptions
> +// { dg-do run { target c++11 } }
> +// { dg-options "-O2" }
> +
> +#include "attr-assume1.C"
> --- gcc/testsuite/g++.dg/cpp23/attr-assume6.C.jj 2022-10-10 09:59:49.300646468 +0200
> +++ gcc/testsuite/g++.dg/cpp23/attr-assume6.C 2022-10-10 09:59:49.300646468 +0200
> @@ -0,0 +1,5 @@
> +// P1774R8 - Portable assumptions
> +// { dg-do run { target c++11 } }
> +// { dg-options "-O2" }
> +
> +#include "attr-assume3.C"
> --- gcc/testsuite/g++.dg/cpp23/attr-assume7.C.jj 2022-10-10 09:59:49.300646468 +0200
> +++ gcc/testsuite/g++.dg/cpp23/attr-assume7.C 2022-10-10 10:05:51.472593600 +0200
> @@ -0,0 +1,42 @@
> +// P1774R8 - Portable assumptions
> +// { dg-do compile { target c++11 } }
> +// { dg-options "-O2" }
> +
> +int
> +foo (int x)
> +{
> + [[assume (x == 42)]];
> + return x;
> +}
> +
> +int
> +bar (int x)
> +{
> + [[assume (++x == 43)]];
> + return x;
> +}
> +
> +int
> +baz (int x)
> +{
> + [[assume (({ int z = ++x; static int w; ++w; if (z == 51) return -1; if (z == 53) goto lab1; if
> (z == 64) throw 1; z == 43; }))]];
> +lab1:
> + return x;
> +}
> +
> +struct S { S (); S (const S &); ~S (); int a, b; int foo (); };
> +
> +int
> +qux ()
> +{
> + S s;
> + [[assume (s.a == 42 && s.b == 43)]];
> + return s.a + s.b;
> +}
> +
> +int
> +S::foo ()
> +{
> + [[assume (a == 42 && b == 43)]];
> + return a + b;
> +}
>
> Jakub
>
On Fri, Oct 14, 2022 at 10:43:16PM +0200, Martin Uecker wrote:
> Am Montag, den 10.10.2022, 10:54 +0200 schrieb Jakub Jelinek:
> > Hi!
> >
> > My earlier patches gimplify the simplest non-side-effects assumptions
> > into if (cond) ; else __builtin_unreachable (); and throw the rest
> > on the floor.
> > The following patch attempts to do something with the rest too.
>
> My recommendation would be to only process side-effect-free
> assumptions and warn about the rest (clang does this for
> __builtin_assume). I do not think this is worth the
I think that is bad choice and makes it useless.
> complexity and I am not so sure the semantics of a
> hypothetical evaluation are terribly well defined.
I think the C++23 paper is quite clear. Yes, you can't verify it
in debug mode, but there are many other UBs that are hard to verify
through runtime instrumentation.
And, OpenMP has a similar feature (though, in that case it is even
a stronger guarantee where something is guaranteed to hold across
a whole region rather than just on its entry.
> That you can not verify this properly by turning it
> into traps in debug mode (as this would execute the
> side effects) also makes this a terrible feature IMHO.
>
> MSVC which this feature was based does not seem to make
> much to sense to me: https://godbolt.org/z/4Ebar3G9b
So maybe their feature is different from what is in C++23,
or is badly implemented?
I think with what we have in the works for GCC we'll be able to optimize
in
int f(int i)
{
[[assume(1 == i++)]];
return (1 == i++);
}
int g(int i)
{
[[assume(1 == ++i)]];
return (1 == ++i);
}
extern int i;
int h(void)
{
[[assume(1 == ++i)]];
return (1 == ++i);
}
int k(int i)
{
[[assume(42 == ++i)]];
return i;
}
at least f/g to return 1; and k to return 41;
The h case might take a while to take advantage of.
Jakub
Am Freitag, den 14.10.2022, 23:20 +0200 schrieb Jakub Jelinek:
> On Fri, Oct 14, 2022 at 10:43:16PM +0200, Martin Uecker wrote:
> > Am Montag, den 10.10.2022, 10:54 +0200 schrieb Jakub Jelinek:
> > > Hi!
> > >
> > > My earlier patches gimplify the simplest non-side-effects assumptions
> > > into if (cond) ; else __builtin_unreachable (); and throw the rest
> > > on the floor.
> > > The following patch attempts to do something with the rest too.
> >
> > My recommendation would be to only process side-effect-free
> > assumptions and warn about the rest (clang does this for
> > __builtin_assume). I do not think this is worth the
>
> I think that is bad choice and makes it useless.
I think
[[assume(10 == i + 1)]]
could be useful as it is nicer syntax than
if (10 != i + 1)
unreachable();
but
[[assume(10 == i++)]]
is confusing / untestable and therefor seems a bit
dangerous.
But we do not have to agree, I just stating my opinion
here. I would personally then suggest to have an
option for warning about side effects in assume.
> > complexity and I am not so sure the semantics of a
> > hypothetical evaluation are terribly well defined.
>
> I think the C++23 paper is quite clear. Yes, you can't verify it
> in debug mode, but there are many other UBs that are hard to verify
> through runtime instrumentation.
And none are good. Some are very useful though
(or even required in the context of C/C++). But I think
there should be a very good reason for adding more.
Personally, I do not see [[assume]] how side effects
is useful enough to justify adding another kind of
untestable UB.
Especially also because it has somewhat vaguely
defined semantics. I don't know any formalization the
proposed semantics and the normative wording
"the converted expression would evaluate to true"
is probably good starting point for a PhD thesis.
> And, OpenMP has a similar feature (though, in that case it is even
> a stronger guarantee where something is guaranteed to hold across
> a whole region rather than just on its entry.
>
> > That you can not verify this properly by turning it
> > into traps in debug mode (as this would execute the
> > side effects) also makes this a terrible feature IMHO.
> >
> > MSVC which this feature was based does not seem to make
> > much to sense to me: https://godbolt.org/z/4Ebar3G9b
>
> So maybe their feature is different from what is in C++23,
> or is badly implemented?
The paper says as the first sentence in the abstract:
"We propose a standard facility providing the semantics
of existing compiler built-ins such as
__builtin_assume (Clang) and __assume (MSVC, ICC)."
But Clang does not support side effects and MSVC
is broken. But yes, the paper then describes these
extended semantics for expression with side effects.
According to the author this was based on discussions
with MSVC developers, but - to me - the fact that MSVC
still implements something else which does not seem
to make sense speaks against this feature.
> I think with what we have in the works for GCC we'll be able to optimize
> in
> int f(int i)
> {
> [[assume(1 == i++)]];
> return (1 == i++);
> }
>
> int g(int i)
> {
> [[assume(1 == ++i)]];
> return (1 == ++i);
> }
>
> extern int i;
>
> int h(void)
> {
> [[assume(1 == ++i)]];
> return (1 == ++i);
> }
>
>
> int k(int i)
> {
> [[assume(42 == ++i)]];
> return i;
> }
> at least f/g to return 1; and k to return 41;
> The h case might take a while to take advantage of.
But why? Do we really want to encourage people to
write such code?
Martin
On Sat, Oct 15, 2022 at 10:07:46AM +0200, Martin Uecker wrote:
> But why? Do we really want to encourage people to
> write such code?
Of course these ++ cases inside of expressions are just obfuscation.
But the point is to support using predicates that can be inlined and
simplified into something really simple the optimizers can understand.
The paper shows as useful e.g. being able to assert something is finite:
[[assume (std::isfinite (x)]];
and with the recent changes on the GCC side it is now or shortly will be
possible to take advantage of such predicates.
It is true that
[[assume (__builtin_isfinite (x)]];
could work if we check TREE_SIDE_EFFECTS on the GCC side because
it is a const function, but that is a GNU extension, so the standard
can't count with that. std::isfinite isn't even marked const in libstdc++
and one can figure that out during IPA propagation only.
There are many similar predicates, or user could have some that are useful
to his program. And either in the end it wouldn't have side-effects
but the compiler doesn't know, or would but those side-effects would be
unimportant to the optimizations the compiler can derive from those.
As the spec defines it well what happens with the side-effects and it
is an attribute, not a function and the languages have non-evaluated
contexts in other places, I don't see where a user confusion could come.
We don't warn for sizeof (i++) and similar either.
__builtin_assume (i++) is a bad choice because it looks like a function
call (after all, the compilers have many similar builtins) and its argument
looks like normal argument to the function, so it is certainly unexpected
that the side-effects aren't evaluated.
Jakub
Am Samstag, den 15.10.2022, 10:53 +0200 schrieb Jakub Jelinek:
> On Sat, Oct 15, 2022 at 10:07:46AM +0200, Martin Uecker wrote:
> > But why? Do we really want to encourage people to
> > write such code?
>
> Of course these ++ cases inside of expressions are just obfuscation.
> But the point is to support using predicates that can be inlined and
> simplified into something really simple the optimizers can understand.
This makes sense,.
> The paper shows as useful e.g. being able to assert something is finite:
> [[assume (std::isfinite (x)]];
> and with the recent changes on the GCC side it is now or shortly will be
> possible to take advantage of such predicates.
> It is true that
> [[assume (__builtin_isfinite (x)]];
> could work if we check TREE_SIDE_EFFECTS on the GCC side because
> it is a const function, but that is a GNU extension, so the standard
> can't count with that. std::isfinite isn't even marked const in libstdc++
> and one can figure that out during IPA propagation only.
Hm, that already seems to work with
if (!std::isfinite(x))
__builtin_unreachable();
https://godbolt.org/z/hj3WrEhjb
> There are many similar predicates, or user could have some that are useful
> to his program. And either in the end it wouldn't have side-effects
> but the compiler doesn't know, or would but those side-effects would be
> unimportant to the optimizations the compiler can derive from those.
I still have the feeling that relying on something
such as the pure and const attributes might then
be a better approach for this.
From the standards point of view, this is OK
as GCC can just set its own rules as long as it is
a subset of what the standard allows.
> As the spec defines it well what happens with the side-effects and it
> is an attribute, not a function and the languages have non-evaluated
> contexts in other places, I don't see where a user confusion could come.
The user confusion might come when somebody writes
something such as [[assume(1 == ++i)]] and I expect
that people will start doing this once this works.
But I am also a a bit worried about the slipperly slope
of exploiting this more because what "would evaluate to true"
implies in case of I/O, atomic accesses, volatile accesses
etc. does not seem clear to me. But maybe I am worrying
too much.
> We don't warn for sizeof (i++) and similar either.
Which is also confusing and clang does indeed
warn about it outside of macros and I think GCC
should too.
> __builtin_assume (i++) is a bad choice because it looks like a function
> call (after all, the compilers have many similar builtins) and its argument
> looks like normal argument to the function, so it is certainly unexpected
> that the side-effects aren't evaluated.
I agree.
Best
Martin
> The assume function can have many arguments (one is created for each
> automatic var referenced or set by the condition), so it would be nice to
> track all of them rather than just hardcoding the first. And, the argument
> doesn't necessarily have to be a scalar, so perhaps later on we could derive
> ranges even for structure members etc. if needed. Or just handle
> assume_function in IPA-SRA somehow at least.
>
> The C++23 paper mentions
> [[assume(size > 0)]];
> [[assume(size % 32 == 0)]];
> [[assume(std::isfinite(data[i]))]];
> [[assume(*pn >= 1)]];
> [[assume(i == 42)]];
> [[assume(++i == 43)]];
> [[assume((std::cin >> i, i == 42))]];
> [[assume(++almost_last == last)]];
> among other things, the first and fifth are already handled the
> if (!cond) __builtin_unreachable (); way and so work even without this
> patch, the (std::cin >> i, i == 42) case is not worth bothering for now
> and the rest would be single block assumptions that can be handled easily
> (except the last one which would have record type arguments and so we'd need
> SRA).
>
> Jakub
I put together an initial prototype, attached is the 2 patches so far. I
applied this on top of one of your sets of patches to try it out.
The first patch has the initial simple version, and the second patch
hacks VRP to add a loop over all the ssa-names in the function and show
what assume_range_p would return for them.
First, I added another routine to ranger:
*bool gimple_ranger::assume_range_p (vrange &r, tree name)*
This is the routine that is called to determine what the range of NAME
is at the end of the function if the function returns [1,1]. It is
painfully simple, only working on names in the definition chain of the
return variable. It returns TRUE if it finds a non-varying result. I
will next expand on this to look back in the CFG and be more flexible.
To apply any assumed values, I added a routine to be called
*bool query_assume_call (vrange &r, tree assume_id, tree name);*
This routine would be what is called to lookup if there is any range
associated with NAME in the assume function ASSUME_ID. I hacked one
up to return [42, 42] for any integer query just for POC. You'd need to
supply this routine somewhere instead.
As the ASSUME function has no defs, we can't produce results for the
parameters in normal ways, so I leverage the inferred range code. When
doing a stmt walk, when VRP is done processing a stmt, it applies any
side effects of the statement going forward. The gimple_inferred_range
constructor now also looks for assume calls, and calls query_assume_call
() on each argument, and if it gets back TRUE, applies an inferred range
record for that range at that stmt. (This also means those ASSUME
ranges will only show up in a VRP walk.)
These seems like it might be functional enough for you to experiment with.
For the simple
int
bar (int x)
{
[[assume (++x == 43)]];
return x;
}
The VRP hack for ther assume function shows :
for an assume function, x_2(D) would have a range of [irange] int [42,
42] NONZERO 0x2a.
I also tried it for
bool foo1 (int x, int y) { return x < 10 || x > 20 || x == 12; }
or an assume function, x_5(D) would have a range of [irange] int [-INF,
9][12, 12][21, +INF]
bool foo2 (int x, int y) { return (x >= 10 && x <= 20) || x == 2; }
for an assume function, x_5(D) would have a range of [irange] int [2,
2][10, 20] NONZERO 0x1f
for:
int
bar (int x)
{
[[assume (++x == 43)]];
return x;
}
As for propagating assumed values, the hacked up version returning 42
shows it propagates into the return:
query_assume_call injection
_Z3bari._assume.0 assume inferred range of x_1(D) to [irange] int [42,
42] NONZERO 0x2a
int bar (int x)
{
<bb 2> :
.ASSUME (_Z3bari._assume.0, x_1(D));
return 42;
}
So in principle, I think theres enough infrastructure there to get
going. You can query parameter ranges by creating a ranger, and
querying the parameters via *assume_range_p () *. You can do that
anywhere, as the hack I put in vrp shows, it creates a new ranger,
simply queries each SSA_NAME, then disposes of the ranger before
invoking VRP on a fresh ranger. The you just wire up a proper
*query_assume_call(*) to return those ranges.
Thats the basic APi to deal with... call one function, supply another.
Does this model seem like it would work OK for you? Or do we need to
tweak it?
I am planning to extend assume_range_p to handle other basic blocks, as
well as pick up a few of the extra things that outgoing_edge_range_p does.
Andrew
PS. It also seems to me that the assume_range_p() infrastructure may
have some other uses when it comes to inlining or LTO or IPA. This
particular version works with a return value of [1,1], but that value is
manually supplied to GORI by the routine. If any other pass has some
reason to know that the return value was within a certain range, we
could use that and query what the incoming ranges of any parameter might
have to be. Just a thought.
@@ -438,6 +438,10 @@ struct GTY(()) function {
/* Set if there are any OMP_TARGET regions in the function. */
unsigned int has_omp_target : 1;
+
+ /* Set for artificial function created for [[assume (cond)]].
+ These should be GIMPLE optimized, but not expanded to RTL. */
+ unsigned int assume_function : 1;
};
/* Add the decl D to the local_decls list of FUN. */
@@ -3556,6 +3556,12 @@ gimplify_call_expr (tree *expr_p, gimple
if (ifn == IFN_ASSUME)
{
+ /* If not optimizing, ignore the assumptions. */
+ if (!optimize)
+ {
+ *expr_p = NULL_TREE;
+ return GS_ALL_DONE;
+ }
if (simple_condition_p (CALL_EXPR_ARG (*expr_p, 0)))
{
/* If the [[assume (cond)]]; condition is simple
@@ -3569,7 +3575,46 @@ gimplify_call_expr (tree *expr_p, gimple
fndecl, 0));
return GS_OK;
}
- /* FIXME: Otherwise expand it specially. */
+ /* Temporarily, until gimple lowering, transform
+ .ASSUME (cond);
+ into:
+ guard = .ASSUME ();
+ if (guard) goto label_true; else label_false;
+ label_true:;
+ {
+ guard = cond;
+ }
+ label_false:;
+ .ASSUME (guard);
+ such that gimple lowering can outline the condition into
+ a separate function easily. */
+ tree guard = create_tmp_var (boolean_type_node);
+ gcall *call = gimple_build_call_internal (ifn, 0);
+ gimple_call_set_nothrow (call, TREE_NOTHROW (*expr_p));
+ gimple_set_location (call, loc);
+ gimple_call_set_lhs (call, guard);
+ gimple_seq_add_stmt (pre_p, call);
+ *expr_p = build2 (MODIFY_EXPR, void_type_node, guard,
+ CALL_EXPR_ARG (*expr_p, 0));
+ *expr_p = build3 (BIND_EXPR, void_type_node, NULL, *expr_p, NULL);
+ tree label_false = create_artificial_label (UNKNOWN_LOCATION);
+ tree label_true = create_artificial_label (UNKNOWN_LOCATION);
+ gcond *cond_stmt = gimple_build_cond (NE_EXPR, guard,
+ boolean_false_node,
+ label_true, label_false);
+ gimplify_seq_add_stmt (pre_p, cond_stmt);
+ gimplify_seq_add_stmt (pre_p, gimple_build_label (label_true));
+ push_gimplify_context ();
+ gimple_seq body = NULL;
+ gimple *g = gimplify_and_return_first (*expr_p, &body);
+ pop_gimplify_context (g);
+ gimplify_seq_add_seq (pre_p, body);
+ gimplify_seq_add_stmt (pre_p, gimple_build_label (label_false));
+ call = gimple_build_call_internal (ifn, 1, guard);
+ gimple_call_set_nothrow (call, TREE_NOTHROW (*expr_p));
+ gimple_set_location (call, loc);
+ gimple_seq_add_stmt (pre_p, call);
+ *expr_p = NULL_TREE;
return GS_ALL_DONE;
}
@@ -33,6 +33,13 @@ along with GCC; see the file COPYING3.
#include "predict.h"
#include "gimple-predict.h"
#include "gimple-fold.h"
+#include "cgraph.h"
+#include "tree-ssa.h"
+#include "value-range.h"
+#include "stringpool.h"
+#include "tree-ssanames.h"
+#include "tree-inline.h"
+#include "gimple-walk.h"
/* The differences between High GIMPLE and Low GIMPLE are the
following:
@@ -237,6 +244,383 @@ lower_omp_directive (gimple_stmt_iterato
gsi_next (gsi);
}
+static tree
+create_assumption_fn (location_t loc)
+{
+ tree name = clone_function_name_numbered (current_function_decl, "_assume");
+ /* For now, will be changed later. */
+ tree type = TREE_TYPE (current_function_decl);
+ tree decl = build_decl (loc, FUNCTION_DECL, name, type);
+ TREE_STATIC (decl) = 1;
+ TREE_USED (decl) = 1;
+ DECL_ARTIFICIAL (decl) = 1;
+ DECL_IGNORED_P (decl) = 1;
+ DECL_NAMELESS (decl) = 1;
+ TREE_PUBLIC (decl) = 0;
+ DECL_UNINLINABLE (decl) = 1;
+ DECL_EXTERNAL (decl) = 0;
+ DECL_CONTEXT (decl) = NULL_TREE;
+ DECL_INITIAL (decl) = make_node (BLOCK);
+ BLOCK_SUPERCONTEXT (DECL_INITIAL (decl)) = decl;
+ DECL_FUNCTION_SPECIFIC_OPTIMIZATION (decl)
+ = DECL_FUNCTION_SPECIFIC_OPTIMIZATION (current_function_decl);
+ DECL_FUNCTION_SPECIFIC_TARGET (decl)
+ = DECL_FUNCTION_SPECIFIC_TARGET (current_function_decl);
+ DECL_FUNCTION_VERSIONED (decl)
+ = DECL_FUNCTION_VERSIONED (current_function_decl);
+ tree t = build_decl (DECL_SOURCE_LOCATION (decl),
+ RESULT_DECL, NULL_TREE, boolean_type_node);
+ DECL_ARTIFICIAL (t) = 1;
+ DECL_IGNORED_P (t) = 1;
+ DECL_CONTEXT (t) = decl;
+ DECL_RESULT (decl) = t;
+ push_struct_function (decl);
+ cfun->function_end_locus = loc;
+ init_tree_ssa (cfun);
+ return decl;
+}
+
+struct lower_assumption_data
+{
+ copy_body_data id;
+ tree return_false_label;
+ tree guard_copy;
+ auto_vec<tree> decls;
+};
+
+/* Helper function for lower_assumptions. Find local vars and labels
+ in the assumption sequence and remove debug stmts. */
+
+static tree
+find_assumption_locals_r (gimple_stmt_iterator *gsi_p, bool *,
+ struct walk_stmt_info *wi)
+{
+ lower_assumption_data *data = (lower_assumption_data *) wi->info;
+ gimple *stmt = gsi_stmt (*gsi_p);
+ tree lhs = gimple_get_lhs (stmt);
+ if (lhs && TREE_CODE (lhs) == SSA_NAME)
+ {
+ gcc_assert (SSA_NAME_VAR (lhs) == NULL_TREE);
+ data->id.decl_map->put (lhs, NULL_TREE);
+ data->decls.safe_push (lhs);
+ }
+ switch (gimple_code (stmt))
+ {
+ case GIMPLE_BIND:
+ for (tree var = gimple_bind_vars (as_a <gbind *> (stmt));
+ var; var = DECL_CHAIN (var))
+ if (VAR_P (var)
+ && !DECL_EXTERNAL (var)
+ && DECL_CONTEXT (var) == data->id.src_fn)
+ {
+ data->id.decl_map->put (var, var);
+ data->decls.safe_push (var);
+ }
+ break;
+ case GIMPLE_LABEL:
+ {
+ tree label = gimple_label_label (as_a <glabel *> (stmt));
+ data->id.decl_map->put (label, label);
+ break;
+ }
+ case GIMPLE_RETURN:
+ /* If something in assumption tries to return from parent function,
+ if it would be reached in hypothetical evaluation, it would be UB,
+ so transform such returns into return false; */
+ {
+ gimple *g = gimple_build_assign (data->guard_copy, boolean_false_node);
+ gsi_insert_before (gsi_p, g, GSI_SAME_STMT);
+ gimple_return_set_retval (as_a <greturn *> (stmt), data->guard_copy);
+ break;
+ }
+ case GIMPLE_DEBUG:
+ /* As assumptions won't be emitted, debug info stmts in them
+ are useless. */
+ gsi_remove (gsi_p, true);
+ wi->removed_stmt = true;
+ break;
+ default:
+ break;
+ }
+ return NULL_TREE;
+}
+
+/* Create a new PARM_DECL that is indentical in all respect to DECL except that
+ DECL can be either a VAR_DECL, a PARM_DECL or RESULT_DECL. The original
+ DECL must come from ID->src_fn and the copy will be part of ID->dst_fn. */
+
+static tree
+assumption_copy_decl (tree decl, copy_body_data *id)
+{
+ tree type = TREE_TYPE (decl);
+
+ if (is_global_var (decl))
+ return decl;
+
+ gcc_assert (VAR_P (decl)
+ || TREE_CODE (decl) == PARM_DECL
+ || TREE_CODE (decl) == RESULT_DECL);
+ tree copy = build_decl (DECL_SOURCE_LOCATION (decl),
+ PARM_DECL, DECL_NAME (decl), type);
+ if (DECL_PT_UID_SET_P (decl))
+ SET_DECL_PT_UID (copy, DECL_PT_UID (decl));
+ TREE_ADDRESSABLE (copy) = TREE_ADDRESSABLE (decl);
+ TREE_READONLY (copy) = TREE_READONLY (decl);
+ TREE_THIS_VOLATILE (copy) = TREE_THIS_VOLATILE (decl);
+ DECL_NOT_GIMPLE_REG_P (copy) = DECL_NOT_GIMPLE_REG_P (decl);
+ DECL_BY_REFERENCE (copy) = DECL_BY_REFERENCE (decl);
+ DECL_ARG_TYPE (copy) = type;
+ ((lower_assumption_data *) id)->decls.safe_push (decl);
+ return copy_decl_for_dup_finish (id, decl, copy);
+}
+
+/* Transform gotos out of the assumption into return false. */
+
+static tree
+adjust_assumption_stmt_r (gimple_stmt_iterator *gsi_p, bool *,
+ struct walk_stmt_info *wi)
+{
+ lower_assumption_data *data = (lower_assumption_data *) wi->info;
+ gimple *stmt = gsi_stmt (*gsi_p);
+ tree lab = NULL_TREE;
+ unsigned int idx = 0;
+ if (gimple_code (stmt) == GIMPLE_GOTO)
+ lab = gimple_goto_dest (stmt);
+ else if (gimple_code (stmt) == GIMPLE_COND)
+ {
+ repeat:
+ if (idx == 0)
+ lab = gimple_cond_true_label (as_a <gcond *> (stmt));
+ else
+ lab = gimple_cond_false_label (as_a <gcond *> (stmt));
+ }
+ else if (gimple_code (stmt) == GIMPLE_LABEL)
+ {
+ tree label = gimple_label_label (as_a <glabel *> (stmt));
+ DECL_CONTEXT (label) = current_function_decl;
+ }
+ if (lab)
+ {
+ if (!data->id.decl_map->get (lab))
+ {
+ if (!data->return_false_label)
+ data->return_false_label
+ = create_artificial_label (UNKNOWN_LOCATION);
+ if (gimple_code (stmt) == GIMPLE_GOTO)
+ gimple_goto_set_dest (as_a <ggoto *> (stmt),
+ data->return_false_label);
+ else if (idx == 0)
+ gimple_cond_set_true_label (as_a <gcond *> (stmt),
+ data->return_false_label);
+ else
+ gimple_cond_set_false_label (as_a <gcond *> (stmt),
+ data->return_false_label);
+ }
+ if (gimple_code (stmt) == GIMPLE_COND && idx == 0)
+ {
+ idx = 1;
+ goto repeat;
+ }
+ }
+ return NULL_TREE;
+}
+
+/* Adjust trees in the assumption body. Called through walk_tree. */
+
+static tree
+adjust_assumption_stmt_op (tree *tp, int *, void *datap)
+{
+ struct walk_stmt_info *wi = (struct walk_stmt_info *) datap;
+ lower_assumption_data *data = (lower_assumption_data *) wi->info;
+ tree t = *tp;
+ tree *newt;
+ switch (TREE_CODE (t))
+ {
+ case SSA_NAME:
+ newt = data->id.decl_map->get (t);
+ /* There shouldn't be SSA_NAMEs other than ones defined in the
+ assumption's body. */
+ gcc_assert (newt);
+ *tp = *newt;
+ break;
+ case LABEL_DECL:
+ newt = data->id.decl_map->get (t);
+ if (newt)
+ *tp = *newt;
+ break;
+ case VAR_DECL:
+ case PARM_DECL:
+ case RESULT_DECL:
+ *tp = remap_decl (t, &data->id);
+ break;
+ default:
+ break;
+ }
+ return NULL_TREE;
+}
+
+/* Lower assumption.
+ The gimplifier transformed:
+ .ASSUME (cond);
+ into:
+ guard = .ASSUME ();
+ if (guard) goto label_true; else label_false;
+ label_true:;
+ {
+ guard = cond;
+ }
+ label_false:;
+ .ASSUME (guard);
+ which we should transform into:
+ .ASSUME (&artificial_fn, args...);
+ where artificial_fn will look like:
+ bool artificial_fn (args...)
+ {
+ guard = cond;
+ return guard;
+ }
+ with any debug stmts in the block removed and jumps out of
+ the block or return stmts replaced with return false; */
+
+static void
+lower_assumption (gimple_stmt_iterator *gsi, struct lower_data *data)
+{
+ gimple *stmt = gsi_stmt (*gsi);
+ tree guard = gimple_call_lhs (stmt);
+ location_t loc = gimple_location (stmt);
+ gcc_assert (guard);
+ gsi_remove (gsi, true);
+ stmt = gsi_stmt (*gsi);
+ gcond *cond = as_a <gcond *> (stmt);
+ gcc_assert (gimple_cond_lhs (cond) == guard
+ || gimple_cond_rhs (cond) == guard);
+ tree l1 = gimple_cond_true_label (cond);
+ tree l2 = gimple_cond_false_label (cond);
+ gsi_remove (gsi, true);
+ stmt = gsi_stmt (*gsi);
+ glabel *lab = as_a <glabel *> (stmt);
+ gcc_assert (gimple_label_label (lab) == l1
+ || gimple_label_label (lab) == l2);
+ gsi_remove (gsi, true);
+ gimple *bind = gsi_stmt (*gsi);
+ gcc_assert (gimple_code (bind) == GIMPLE_BIND);
+
+ lower_assumption_data lad;
+ hash_map<tree, tree> decl_map;
+ memset (&lad.id, 0, sizeof (lad.id));
+ lad.return_false_label = NULL_TREE;
+ lad.id.src_fn = current_function_decl;
+ lad.id.dst_fn = create_assumption_fn (loc);
+ lad.id.src_cfun = DECL_STRUCT_FUNCTION (lad.id.src_fn);
+ lad.id.decl_map = &decl_map;
+ lad.id.copy_decl = assumption_copy_decl;
+ lad.id.transform_call_graph_edges = CB_CGE_DUPLICATE;
+ lad.id.transform_parameter = true;
+ lad.id.do_not_unshare = true;
+ lad.id.do_not_fold = true;
+ cfun->curr_properties = lad.id.src_cfun->curr_properties;
+ lad.guard_copy = create_tmp_var (boolean_type_node);
+ decl_map.put (lad.guard_copy, lad.guard_copy);
+ decl_map.put (guard, lad.guard_copy);
+ cfun->assume_function = 1;
+
+ struct walk_stmt_info wi;
+ memset (&wi, 0, sizeof (wi));
+ wi.info = (void *) &lad;
+ walk_gimple_stmt (gsi, find_assumption_locals_r, NULL, &wi);
+ unsigned int sz = lad.decls.length ();
+ for (unsigned i = 0; i < sz; ++i)
+ {
+ tree v = lad.decls[i];
+ tree newv;
+ if (TREE_CODE (v) == SSA_NAME)
+ {
+ newv = make_ssa_name (remap_type (TREE_TYPE (v), &lad.id));
+ decl_map.put (v, newv);
+ }
+ else if (VAR_P (v))
+ {
+ if (is_global_var (v) && !DECL_ASSEMBLER_NAME_SET_P (v))
+ DECL_ASSEMBLER_NAME (v);
+ TREE_TYPE (v) = remap_type (TREE_TYPE (v), &lad.id);
+ DECL_CONTEXT (v) = current_function_decl;
+ }
+ }
+ memset (&wi, 0, sizeof (wi));
+ wi.info = (void *) &lad;
+ walk_gimple_stmt (gsi, adjust_assumption_stmt_r,
+ adjust_assumption_stmt_op, &wi);
+ gsi_remove (gsi, false);
+
+ gimple_seq body = NULL;
+ gimple *g = gimple_build_assign (lad.guard_copy, boolean_false_node);
+ gimple_seq_add_stmt (&body, g);
+ gimple_seq_add_stmt (&body, bind);
+ greturn *gr = gimple_build_return (lad.guard_copy);
+ gimple_seq_add_stmt (&body, gr);
+ if (lad.return_false_label)
+ {
+ g = gimple_build_label (lad.return_false_label);
+ gimple_seq_add_stmt (&body, g);
+ g = gimple_build_assign (lad.guard_copy, boolean_false_node);
+ gimple_seq_add_stmt (&body, g);
+ gr = gimple_build_return (lad.guard_copy);
+ gimple_seq_add_stmt (&body, gr);
+ }
+ bind = gimple_build_bind (NULL_TREE, body, NULL_TREE);
+ body = NULL;
+ gimple_seq_add_stmt (&body, bind);
+ gimple_set_body (current_function_decl, body);
+ pop_cfun ();
+
+ tree parms = NULL_TREE;
+ tree parmt = void_list_node;
+ auto_vec<tree, 8> vargs;
+ vargs.safe_grow (1 + (lad.decls.length () - sz), true);
+ vargs[0] = build_fold_addr_expr (lad.id.dst_fn);
+ for (unsigned i = lad.decls.length (); i > sz; --i)
+ {
+ tree *v = decl_map.get (lad.decls[i - 1]);
+ gcc_assert (v && TREE_CODE (*v) == PARM_DECL);
+ DECL_CHAIN (*v) = parms;
+ parms = *v;
+ parmt = tree_cons (NULL_TREE, TREE_TYPE (*v), parmt);
+ vargs[i - sz] = lad.decls[i - 1];
+ if (is_gimple_reg_type (TREE_TYPE (vargs[i - sz]))
+ && !is_gimple_val (vargs[i - sz]))
+ {
+ tree t = make_ssa_name (TREE_TYPE (vargs[i - sz]));
+ g = gimple_build_assign (t, vargs[i - sz]);
+ gsi_insert_before (gsi, g, GSI_SAME_STMT);
+ vargs[i - sz] = t;
+ }
+ }
+ DECL_ARGUMENTS (lad.id.dst_fn) = parms;
+ TREE_TYPE (lad.id.dst_fn) = build_function_type (boolean_type_node, parmt);
+
+ cgraph_node::add_new_function (lad.id.dst_fn, false);
+
+ for (unsigned i = 0; i < sz; ++i)
+ {
+ tree v = lad.decls[i];
+ if (TREE_CODE (v) == SSA_NAME)
+ release_ssa_name (v);
+ }
+
+ stmt = gsi_stmt (*gsi);
+ lab = as_a <glabel *> (stmt);
+ gcc_assert (gimple_label_label (lab) == l1
+ || gimple_label_label (lab) == l2);
+ gsi_remove (gsi, true);
+ stmt = gsi_stmt (*gsi);
+ gcc_assert (gimple_call_internal_p (stmt, IFN_ASSUME)
+ && gimple_call_num_args (stmt) == 1
+ && gimple_call_arg (stmt, 0) == guard);
+ data->cannot_fallthru = false;
+ gcall *call = gimple_build_call_internal_vec (IFN_ASSUME, vargs);
+ gimple_set_location (call, loc);
+ gsi_replace (gsi, call, true);
+}
/* Lower statement GSI. DATA is passed through the recursion. We try to
track the fallthruness of statements and get rid of unreachable return
@@ -354,6 +738,13 @@ lower_stmt (gimple_stmt_iterator *gsi, s
tree decl = gimple_call_fndecl (stmt);
unsigned i;
+ if (gimple_call_internal_p (stmt, IFN_ASSUME)
+ && gimple_call_num_args (stmt) == 0)
+ {
+ lower_assumption (gsi, data);
+ return;
+ }
+
for (i = 0; i < gimple_call_num_args (stmt); i++)
{
tree arg = gimple_call_arg (stmt, i);
@@ -4253,6 +4253,12 @@ pass_fold_builtins::execute (function *f
}
callee = gimple_call_fndecl (stmt);
+ if (!callee
+ && gimple_call_internal_p (stmt, IFN_ASSUME))
+ {
+ gsi_remove (&i, true);
+ continue;
+ }
if (!callee || !fndecl_built_in_p (callee, BUILT_IN_NORMAL))
{
gsi_next (&i);
@@ -2278,6 +2278,7 @@ output_struct_function_base (struct outp
bp_pack_value (&bp, fn->calls_eh_return, 1);
bp_pack_value (&bp, fn->has_force_vectorize_loops, 1);
bp_pack_value (&bp, fn->has_simduid_loops, 1);
+ bp_pack_value (&bp, fn->assume_function, 1);
bp_pack_value (&bp, fn->va_list_fpr_size, 8);
bp_pack_value (&bp, fn->va_list_gpr_size, 8);
bp_pack_value (&bp, fn->last_clique, sizeof (short) * 8);
@@ -1318,6 +1318,7 @@ input_struct_function_base (struct funct
fn->calls_eh_return = bp_unpack_value (&bp, 1);
fn->has_force_vectorize_loops = bp_unpack_value (&bp, 1);
fn->has_simduid_loops = bp_unpack_value (&bp, 1);
+ fn->assume_function = bp_unpack_value (&bp, 1);
fn->va_list_fpr_size = bp_unpack_value (&bp, 8);
fn->va_list_gpr_size = bp_unpack_value (&bp, 8);
fn->last_clique = bp_unpack_value (&bp, sizeof (short) * 8);
@@ -1882,6 +1882,16 @@ cgraph_node::expand (void)
ggc_collect ();
timevar_pop (TV_REST_OF_COMPILATION);
+ if (DECL_STRUCT_FUNCTION (decl)
+ && DECL_STRUCT_FUNCTION (decl)->assume_function)
+ {
+ /* Assume functions aren't expanded into RTL, on the other side
+ we don't want to release their body. */
+ if (cfun)
+ pop_cfun ();
+ return;
+ }
+
/* Make sure that BE didn't give up on compiling. */
gcc_assert (TREE_ASM_WRITTEN (decl));
if (cfun)
@@ -6597,6 +6597,14 @@ pass_expand::execute (function *fun)
rtx_insn *var_seq, *var_ret_seq;
unsigned i;
+ if (cfun->assume_function)
+ {
+ /* Assume functions should not be expanded to RTL. */
+ cfun->curr_properties &= ~PROP_loops;
+ loop_optimizer_finalize ();
+ return 0;
+ }
+
timevar_push (TV_OUT_OF_SSA);
rewrite_out_of_ssa (&SA);
timevar_pop (TV_OUT_OF_SSA);
@@ -4526,5 +4526,4 @@ expand_TRAP (internal_fn, gcall *)
void
expand_ASSUME (internal_fn, gcall *)
{
- gcc_unreachable ();
}
@@ -647,11 +647,12 @@ public:
{}
/* opt_pass methods: */
- bool gate (function *) final override
+ bool gate (function *fun) final override
{
/* Early return if there were errors. We can run afoul of our
consistency checks, and there's not really much point in fixing them. */
- return !(rtl_dump_and_exit || flag_syntax_only || seen_error ());
+ return !(rtl_dump_and_exit || fun->assume_function
+ || flag_syntax_only || seen_error ());
}
}; // class pass_rest_of_compilation
@@ -1213,6 +1213,10 @@ public:
/* opt_pass methods: */
bool gate (function *fun) final override
{
+ /* Vectorization makes range analysis of assume functions
+ harder, not easier. */
+ if (fun->assume_function)
+ return false;
return flag_tree_loop_vectorize || fun->has_force_vectorize_loops;
}
@@ -1490,7 +1494,14 @@ public:
/* opt_pass methods: */
opt_pass * clone () final override { return new pass_slp_vectorize (m_ctxt); }
- bool gate (function *) final override { return flag_tree_slp_vectorize != 0; }
+ bool gate (function *fun) final override
+ {
+ /* Vectorization makes range analysis of assume functions harder,
+ not easier. */
+ if (fun->assume_function)
+ return false;
+ return flag_tree_slp_vectorize != 0;
+ }
unsigned int execute (function *) final override;
}; // class pass_slp_vectorize
@@ -1517,6 +1517,9 @@ sem_function::parse (cgraph_node *node,
if (!func || (!node->has_gimple_body_p () && !node->thunk))
return NULL;
+ if (func->assume_function)
+ return NULL;
+
if (lookup_attribute_by_prefix ("omp ", DECL_ATTRIBUTES (node->decl)) != NULL)
return NULL;
@@ -46031,6 +46031,8 @@ cp_parser_omp_assumption_clauses (cp_par
t = contextual_conv_bool (t, tf_warning_or_error);
if (is_assume && !error_operand_p (t))
{
+ if (!processing_template_decl)
+ t = fold_build_cleanup_point_expr (TREE_TYPE (t), t);
t = build_call_expr_internal_loc (eloc, IFN_ASSUME,
void_type_node, 1, t);
finish_expr_stmt (t);
@@ -3139,6 +3139,8 @@ process_stmt_assume_attribute (tree std_
arg = contextual_conv_bool (arg, tf_warning_or_error);
if (error_operand_p (arg))
continue;
+ if (!processing_template_decl)
+ arg = fold_build_cleanup_point_expr (TREE_TYPE (arg), arg);
statement = build_call_expr_internal_loc (attrs_loc, IFN_ASSUME,
void_type_node, 1, arg);
finish_expr_stmt (statement);
@@ -21105,6 +21105,8 @@ tsubst_copy_and_build (tree t,
ret = error_mark_node;
break;
}
+ if (!processing_template_decl)
+ arg = fold_build_cleanup_point_expr (TREE_TYPE (arg), arg);
ret = build_call_expr_internal_loc (EXPR_LOCATION (t),
IFN_ASSUME,
void_type_node, 1,
@@ -0,0 +1,5 @@
+// P1774R8 - Portable assumptions
+// { dg-do run { target c++11 } }
+// { dg-options "-O2" }
+
+#include "attr-assume1.C"
@@ -0,0 +1,5 @@
+// P1774R8 - Portable assumptions
+// { dg-do run { target c++11 } }
+// { dg-options "-O2" }
+
+#include "attr-assume3.C"
@@ -0,0 +1,42 @@
+// P1774R8 - Portable assumptions
+// { dg-do compile { target c++11 } }
+// { dg-options "-O2" }
+
+int
+foo (int x)
+{
+ [[assume (x == 42)]];
+ return x;
+}
+
+int
+bar (int x)
+{
+ [[assume (++x == 43)]];
+ return x;
+}
+
+int
+baz (int x)
+{
+ [[assume (({ int z = ++x; static int w; ++w; if (z == 51) return -1; if (z == 53) goto lab1; if (z == 64) throw 1; z == 43; }))]];
+lab1:
+ return x;
+}
+
+struct S { S (); S (const S &); ~S (); int a, b; int foo (); };
+
+int
+qux ()
+{
+ S s;
+ [[assume (s.a == 42 && s.b == 43)]];
+ return s.a + s.b;
+}
+
+int
+S::foo ()
+{
+ [[assume (a == 42 && b == 43)]];
+ return a + b;
+}