tree-ssa-loop-ivopts : Add live analysis in regs used in decision making
Checks
Commit Message
tree-ssa-loop-ivopts : Add live analysis in regs used in decision making.
Add live anaysis in regs used calculation in decision making of
selecting ivopts candidates.
2023-11-08 Ajit Kumar Agarwal <aagarwa1@linux.ibm.com>
gcc/ChangeLog:
* tree-ssa-loop-ivopts.cc (get_regs_used): New function.
(determine_set_costs): Call to get_regs_used to use live
analysis.
---
gcc/tree-ssa-loop-ivopts.cc | 73 +++++++++++++++++++++++++++++++++++--
1 file changed, 70 insertions(+), 3 deletions(-)
Comments
Hello Richard:
On 09/11/23 6:21 pm, Richard Biener wrote:
> On Wed, Nov 8, 2023 at 4:00 PM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>>
>> tree-ssa-loop-ivopts : Add live analysis in regs used in decision making.
>>
>> Add live anaysis in regs used calculation in decision making of
>> selecting ivopts candidates.
>>
>> 2023-11-08 Ajit Kumar Agarwal <aagarwa1@linux.ibm.com>
>>
>> gcc/ChangeLog:
>>
>> * tree-ssa-loop-ivopts.cc (get_regs_used): New function.
>> (determine_set_costs): Call to get_regs_used to use live
>> analysis.
>> ---
>> gcc/tree-ssa-loop-ivopts.cc | 73 +++++++++++++++++++++++++++++++++++--
>> 1 file changed, 70 insertions(+), 3 deletions(-)
>>
>> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
>> index c3336603778..e02fe7d434b 100644
>> --- a/gcc/tree-ssa-loop-ivopts.cc
>> +++ b/gcc/tree-ssa-loop-ivopts.cc
>> @@ -6160,6 +6160,68 @@ ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs,
>> return cost + n_cands;
>> }
>>
>> +/* Return regs used based on live-in and liveout of given ssa variables. */
>
> Please explain how the following code relates to anything like "live
> analysis" and
> where it uses live-in and live-out. And what "live-in/out of a given
> SSA variable"
> should be.
>
> Also explain why you are doing this at all. The patch doesn't come
> with a testcase
> or with any other hint that motivated you.
>
> Richard.
>
The function get_regs_used increments the regs_used based on live-in
and live-out analysis of given ssa name. Instead of setting live-in and
live-out bitmap I increment the regs_used.
Below is how I identify live-in and live-out and increments the regs_used
variable:
a) For a given def_bb of gimple statement of ssa name there should be
live-out and increments the regs_used.
b) Visit each use of SSA_NAME and if it isn't in the same block as the def,
we identify live on entry blocks and increments regs_used.
The below function is the modification of set_var_live_on_entry of tree-ssa-live.cc
Where we set the bitmap of liveout and livein of basic block. Instead of setting bitmap, regs_used is incremented.
I identify regs_used as the number of live-in and liveout of given ssa name variable.
For each iv candiate ssa variables I identify regs_used and take maximum of regs
used for all the iv candidates that will be used in ivopts_estimate_register_pressure
cost analysis.
Motivation behind doing this optimization is I get good performance improvement
for several spec cpu 2017 benchmarks for FP and INT around 2% to 7%.
Also setting regs_used as number of iv candiates, which is not
optimized and robust way of decision making for ivopts optimization I decide
on live-in and live-out analysis which is more correct and appropriate way of
identifying regs_used.
And also there are no regressions in bootstrapped/regtested on powerpc64-linux-gnu.
Thanks & Regards
Ajit
>> +static unsigned
>> +get_regs_used (tree ssa_name)
>> +{
>> + unsigned regs_used = 0;
>> + gimple *stmt;
>> + use_operand_p use;
>> + basic_block def_bb = NULL;
>> + imm_use_iterator imm_iter;
>> +
>> + stmt = SSA_NAME_DEF_STMT (ssa_name);
>> + if (stmt)
>> + {
>> + def_bb = gimple_bb (stmt);
>> + /* Mark defs in liveout bitmap temporarily. */
>> + if (def_bb)
>> + regs_used++;
>> + }
>> + else
>> + def_bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
>> +
>> + /* An undefined local variable does not need to be very alive. */
>> + if (virtual_operand_p (ssa_name)
>> + || ssa_undefined_value_p (ssa_name, false))
>> + return 0;
>> +
>> + /* Visit each use of SSA_NAME and if it isn't in the same block as the def,
>> + add it to the list of live on entry blocks. */
>> + FOR_EACH_IMM_USE_FAST (use, imm_iter, ssa_name)
>> + {
>> + gimple *use_stmt = USE_STMT (use);
>> + basic_block add_block = NULL;
>> +
>> + if (gimple_code (use_stmt) == GIMPLE_PHI)
>> + {
>> + /* Uses in PHI's are considered to be live at exit of the SRC block
>> + as this is where a copy would be inserted. Check to see if it is
>> + defined in that block, or whether its live on entry. */
>> + int index = PHI_ARG_INDEX_FROM_USE (use);
>> + edge e = gimple_phi_arg_edge (as_a <gphi *> (use_stmt), index);
>> + if (e->src != def_bb)
>> + add_block = e->src;
>> + }
>> + else if (is_gimple_debug (use_stmt))
>> + continue;
>> + else
>> + {
>> + /* If its not defined in this block, its live on entry. */
>> + basic_block use_bb = gimple_bb (use_stmt);
>> + if (use_bb != def_bb)
>> + add_block = use_bb;
>> + }
>> +
>> + /* If there was a live on entry use, increment register used. */
>> + if (add_block)
>> + {
>> + regs_used++;
>> + }
>> + }
>> + return regs_used;
>> +}
>> +
>> /* For each size of the induction variable set determine the penalty. */
>>
>> static void
>> @@ -6200,15 +6262,20 @@ determine_set_costs (struct ivopts_data *data)
>> n++;
>> }
>>
>> + unsigned max = 0;
>> EXECUTE_IF_SET_IN_BITMAP (data->relevant, 0, j, bi)
>> {
>> struct version_info *info = ver_info (data, j);
>> -
>> if (info->inv_id && info->has_nonlin_use)
>> - n++;
>> + {
>> + tree ssa_name = ssa_name (j);
>> + n = get_regs_used (ssa_name);
>> + if (n >= max)
>> + max = n;
>> + }
>> }
>>
>> - data->regs_used = n;
>> + data->regs_used = max;
>> if (dump_file && (dump_flags & TDF_DETAILS))
>> fprintf (dump_file, " regs_used %d\n", n);
>>
>> --
>> 2.39.3
>>
>>
Hello Richard:
On 10/11/23 7:29 pm, Richard Biener wrote:
> On Fri, Nov 10, 2023 at 7:42 AM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>>
>> Hello Richard:
>>
>>
>> On 09/11/23 6:21 pm, Richard Biener wrote:
>>> On Wed, Nov 8, 2023 at 4:00 PM Ajit Agarwal <aagarwa1@linux.ibm.com> wrote:
>>>>
>>>> tree-ssa-loop-ivopts : Add live analysis in regs used in decision making.
>>>>
>>>> Add live anaysis in regs used calculation in decision making of
>>>> selecting ivopts candidates.
>>>>
>>>> 2023-11-08 Ajit Kumar Agarwal <aagarwa1@linux.ibm.com>
>>>>
>>>> gcc/ChangeLog:
>>>>
>>>> * tree-ssa-loop-ivopts.cc (get_regs_used): New function.
>>>> (determine_set_costs): Call to get_regs_used to use live
>>>> analysis.
>>>> ---
>>>> gcc/tree-ssa-loop-ivopts.cc | 73 +++++++++++++++++++++++++++++++++++--
>>>> 1 file changed, 70 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/gcc/tree-ssa-loop-ivopts.cc b/gcc/tree-ssa-loop-ivopts.cc
>>>> index c3336603778..e02fe7d434b 100644
>>>> --- a/gcc/tree-ssa-loop-ivopts.cc
>>>> +++ b/gcc/tree-ssa-loop-ivopts.cc
>>>> @@ -6160,6 +6160,68 @@ ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs,
>>>> return cost + n_cands;
>>>> }
>>>>
>>>> +/* Return regs used based on live-in and liveout of given ssa variables. */
>>>
>>> Please explain how the following code relates to anything like "live
>>> analysis" and
>>> where it uses live-in and live-out. And what "live-in/out of a given
>>> SSA variable"
>>> should be.
>>>
>>> Also explain why you are doing this at all. The patch doesn't come
>>> with a testcase
>>> or with any other hint that motivated you.
>>>
>>> Richard.
>>>
>>
>> The function get_regs_used increments the regs_used based on live-in
>> and live-out analysis of given ssa name. Instead of setting live-in and
>> live-out bitmap I increment the regs_used.
>>
>> Below is how I identify live-in and live-out and increments the regs_used
>> variable:
>>
>> a) For a given def_bb of gimple statement of ssa name there should be
>> live-out and increments the regs_used.
>>
>> b) Visit each use of SSA_NAME and if it isn't in the same block as the def,
>> we identify live on entry blocks and increments regs_used.
>>
>> The below function is the modification of set_var_live_on_entry of tree-ssa-live.cc
>> Where we set the bitmap of liveout and livein of basic block. Instead of setting bitmap, regs_used is incremented.e
>
> It clearly doesn't work that way, and the number doesn't in any way relate to
> the number of registers used or register pressure.
>
I agree with you that actual regs_used is not actually the registers used calculated
based on livein and liveout.
Above decision making is using the variable reg_used which is not actually related
to registers used or registers used.
My decision making is based on livein and liveout instead of actual registers used.
I tried to sync up with variables names same as used in ivopts_estimate_register_pressure.
My logic is changing the actual implementation of ivopts_estimate_register_pressure
considering the livein and liveout instead of actual registers used. Idea behind
is to use the livein and liveout considering the regions that doing ivopts increases
or decreases the register pressure based on livein and liveout. My calculation of register
pressure should be based livein and liveout across the region based on ivopts
instead of calculating the register used based on number of iv candidates.
This is how my notion of register pressure.
I can change code to give variables names meaningful stated in above decison making.
>> I identify regs_used as the number of live-in and liveout of given ssa name variable.
>>
>> For each iv candiate ssa variables I identify regs_used and take maximum of regs
>> used for all the iv candidates that will be used in ivopts_estimate_register_pressure
>> cost analysis.
>>
>> Motivation behind doing this opttks for FP and INT around 2% to 7%.
>
> An interesting GIGO effect.
Why you think its GIGO effect. The gains are happening because of decision making
on register pressure stated above.
Please elaborate if you think otherwise.
Thanks & Regards
Ajit
>
>> Also setting regs_used as number of iv candiates, which is not
>> optimized and robust way of decision making for ivopts optimization I decide
>> on live-in and live-out analysis which is more correct and appropriate way of
>> identifying regs_used.
>>
>> And also there are no regressions in bootstrapped/regtested on powerpc64-linux-gnu.
>>
>> Thanks & Regards
>> Ajit
>>
>>>> +static unsigned
>>>> +get_regs_used (tree ssa_name)
>>>> +{
>>>> + unsigned regs_used = 0;
>>>> + gimple *stmt;
>>>> + use_operand_p use;
>>>> + basic_block def_bb = NULL;
>>>> + imm_use_iterator imm_iter;
>>>> +
>>>> + stmt = SSA_NAME_DEF_STMT (ssa_name);
>>>> + if (stmt)
>>>> + {
>>>> + def_bb = gimple_bb (stmt);
>>>> + /* Mark defs in liveout bitmap temporarily. */
>>>> + if (def_bb)
>>>> + regs_used++;
>>>> + }
>>>> + else
>>>> + def_bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
>>>> +
>>>> + /* An undefined local variable does not need to be very alive. */
>>>> + if (virtual_operand_p (ssa_name)
>>>> + || ssa_undefined_value_p (ssa_name, false))
>>>> + return 0;
>>>> +
>>>> + /* Visit each use of SSA_NAME and if it isn't in the same block as the def,
>>>> + add it to the list of live on entry blocks. */
>>>> + FOR_EACH_IMM_USE_FAST (use, imm_iter, ssa_name)
>>>> + {
>>>> + gimple *use_stmt = USE_STMT (use);
>>>> + basic_block add_block = NULL;
>>>> +
>>>> + if (gimple_code (use_stmt) == GIMPLE_PHI)
>>>> + {
>>>> + /* Uses in PHI's are considered to be live at exit of the SRC block
>>>> + as this is where a copy would be inserted. Check to see if it is
>>>> + defined in that block, or whether its live on entry. */
>>>> + int index = PHI_ARG_INDEX_FROM_USE (use);
>>>> + edge e = gimple_phi_arg_edge (as_a <gphi *> (use_stmt), index);
>>>> + if (e->src != def_bb)
>>>> + add_block = e->src;
>>>> + }
>>>> + else if (is_gimple_debug (use_stmt))
>>>> + continue;
>>>> + else
>>>> + {
>>>> + /* If its not defined in this block, its live on entry. */
>>>> + basic_block use_bb = gimple_bb (use_stmt);
>>>> + if (use_bb != def_bb)
>>>> + add_block = use_bb;
>>>> + }
>>>> +
>>>> + /* If there was a live on entry use, increment register used. */
>>>> + if (add_block)
>>>> + {
>>>> + regs_used++;
>>>> + }
>>>> + }
>>>> + return regs_used;
>>>> +}
>>>> +
>>>> /* For each size of the induction variable set determine the penalty. */
>>>>
>>>> static void
>>>> @@ -6200,15 +6262,20 @@ determine_set_costs (struct ivopts_data *data)
>>>> n++;
>>>> }
>>>>
>>>> + unsigned max = 0;
>>>> EXECUTE_IF_SET_IN_BITMAP (data->relevant, 0, j, bi)
>>>> {
>>>> struct version_info *info = ver_info (data, j);
>>>> -
>>>> if (info->inv_id && info->has_nonlin_use)
>>>> - n++;
>>>> + {
>>>> + tree ssa_name = ssa_name (j);
>>>> + n = get_regs_used (ssa_name);
>>>> + if (n >= max)
>>>> + max = n;
>>>> + }
>>>> }
>>>>
>>>> - data->regs_used = n;
>>>> + data->regs_used = max;
>>>> if (dump_file && (dump_flags & TDF_DETAILS))
>>>> fprintf (dump_file, " regs_used %d\n", n);
>>>>
>>>> --
>>>> 2.39.3
>>>>
>>>>
@@ -6160,6 +6160,68 @@ ivopts_estimate_reg_pressure (struct ivopts_data *data, unsigned n_invs,
return cost + n_cands;
}
+/* Return regs used based on live-in and liveout of given ssa variables. */
+static unsigned
+get_regs_used (tree ssa_name)
+{
+ unsigned regs_used = 0;
+ gimple *stmt;
+ use_operand_p use;
+ basic_block def_bb = NULL;
+ imm_use_iterator imm_iter;
+
+ stmt = SSA_NAME_DEF_STMT (ssa_name);
+ if (stmt)
+ {
+ def_bb = gimple_bb (stmt);
+ /* Mark defs in liveout bitmap temporarily. */
+ if (def_bb)
+ regs_used++;
+ }
+ else
+ def_bb = ENTRY_BLOCK_PTR_FOR_FN (cfun);
+
+ /* An undefined local variable does not need to be very alive. */
+ if (virtual_operand_p (ssa_name)
+ || ssa_undefined_value_p (ssa_name, false))
+ return 0;
+
+ /* Visit each use of SSA_NAME and if it isn't in the same block as the def,
+ add it to the list of live on entry blocks. */
+ FOR_EACH_IMM_USE_FAST (use, imm_iter, ssa_name)
+ {
+ gimple *use_stmt = USE_STMT (use);
+ basic_block add_block = NULL;
+
+ if (gimple_code (use_stmt) == GIMPLE_PHI)
+ {
+ /* Uses in PHI's are considered to be live at exit of the SRC block
+ as this is where a copy would be inserted. Check to see if it is
+ defined in that block, or whether its live on entry. */
+ int index = PHI_ARG_INDEX_FROM_USE (use);
+ edge e = gimple_phi_arg_edge (as_a <gphi *> (use_stmt), index);
+ if (e->src != def_bb)
+ add_block = e->src;
+ }
+ else if (is_gimple_debug (use_stmt))
+ continue;
+ else
+ {
+ /* If its not defined in this block, its live on entry. */
+ basic_block use_bb = gimple_bb (use_stmt);
+ if (use_bb != def_bb)
+ add_block = use_bb;
+ }
+
+ /* If there was a live on entry use, increment register used. */
+ if (add_block)
+ {
+ regs_used++;
+ }
+ }
+ return regs_used;
+}
+
/* For each size of the induction variable set determine the penalty. */
static void
@@ -6200,15 +6262,20 @@ determine_set_costs (struct ivopts_data *data)
n++;
}
+ unsigned max = 0;
EXECUTE_IF_SET_IN_BITMAP (data->relevant, 0, j, bi)
{
struct version_info *info = ver_info (data, j);
-
if (info->inv_id && info->has_nonlin_use)
- n++;
+ {
+ tree ssa_name = ssa_name (j);
+ n = get_regs_used (ssa_name);
+ if (n >= max)
+ max = n;
+ }
}
- data->regs_used = n;
+ data->regs_used = max;
if (dump_file && (dump_flags & TDF_DETAILS))
fprintf (dump_file, " regs_used %d\n", n);