In the pipeline, UNRECOG INSN is not executed in advance if it starts a live range.

Message ID 20230529105120.1703-1-jinma@linux.alibaba.com
State Accepted
Headers
Series In the pipeline, UNRECOG INSN is not executed in advance if it starts a live range. |

Checks

Context Check Description
snail/gcc-patch-check success Github commit url

Commit Message

Jin Ma May 29, 2023, 10:51 a.m. UTC
  Unrecog insns (such as CLOBBER, USE) does not represent real instructions, but in the
process of pipeline optimization, they will wait for transmission in ready list like
other insns, without considering resource conflicts and cycles. This results in a
multi-issue CPU architecture that can be issued at any time if other regular insns
have resource conflicts or cannot be launched for other reasons. As a result, its
position is advanced in the generated insns sequence, which will affect register
allocation and often lead to more redundant mov instructions.

A simple example:
https://github.com/majin2020/gcc-test/blob/master/test.c
This is a function in the dhrystone benchmark.

https://github.com/majin2020/gcc-test/blob/0b08c1a13de9663d7d9aba7539b960ec0607ca24/test.c.299r.sched1
This is a log of the pass 'sched1' When issue_rate == 2. Among them, insn 13 and 14 are
much ahead of schedule, which risks generating redundant mov instructions, which seems
unreasonable.

Therefore, I submit patch again on the basis of the last review opinions to try to solve
this problem.

This is the new log of shed1 after patch is added.
https://github.com/majin2020/gcc-test/commit/efcb43e3369e771bde702955048bfe3f501263dd

gcc/ChangeLog:

        * haifa-sched.cc (unrecog_insn_for_forw_only_p): New.
        (prune_ready_list): UNRECOG INSN is not executed in advance if it starts a
	live range.
---
 gcc/haifa-sched.cc | 44 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 39 insertions(+), 5 deletions(-)
  

Comments

Jin Ma June 8, 2023, 1:50 a.m. UTC | #1
ping: https://gcc.gnu.org/pipermail/gcc-patches/2023-May/619951.html

Ref: http://patchwork.ozlabs.org/project/gcc/patch/20230323080734.423-1-jinma@linux.alibaba.com/
  
Jeff Law June 9, 2023, 11:40 p.m. UTC | #2
On 5/29/23 04:51, Jin Ma wrote:
>    Unrecog insns (such as CLOBBER, USE) does not represent real instructions, but in the
> process of pipeline optimization, they will wait for transmission in ready list like
> other insns, without considering resource conflicts and cycles. This results in a
> multi-issue CPU architecture that can be issued at any time if other regular insns
> have resource conflicts or cannot be launched for other reasons. As a result, its
> position is advanced in the generated insns sequence, which will affect register
> allocation and often lead to more redundant mov instructions.
> 
> A simple example:
> https://github.com/majin2020/gcc-test/blob/master/test.c
> This is a function in the dhrystone benchmark.
> 
> https://github.com/majin2020/gcc-test/blob/0b08c1a13de9663d7d9aba7539b960ec0607ca24/test.c.299r.sched1
> This is a log of the pass 'sched1' When issue_rate == 2. Among them, insn 13 and 14 are
> much ahead of schedule, which risks generating redundant mov instructions, which seems
> unreasonable.
> 
> Therefore, I submit patch again on the basis of the last review opinions to try to solve
> this problem.
> 
> This is the new log of shed1 after patch is added.
> https://github.com/majin2020/gcc-test/commit/efcb43e3369e771bde702955048bfe3f501263dd
> 
> gcc/ChangeLog:
> 
>          * haifa-sched.cc (unrecog_insn_for_forw_only_p): New.
>          (prune_ready_list): UNRECOG INSN is not executed in advance if it starts a
> 	live range.
> ---
>   gcc/haifa-sched.cc | 44 +++++++++++++++++++++++++++++++++++++++-----
>   1 file changed, 39 insertions(+), 5 deletions(-)
> 
> diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
> index 2c881ede0ec..205680a4936 100644
> --- a/gcc/haifa-sched.cc
> +++ b/gcc/haifa-sched.cc
> @@ -765,6 +765,23 @@ real_insn_for_shadow (rtx_insn *insn)
>     return pair->i1;
>   }
>   
> +/* Return true if INSN is unrecog that starts a live range.  */
I would rewrite this as

/* Return TRUE if INSN (a USE or CLOBBER) starts a new live
    range, FALSE otherwise.  */

> +
> +static bool
> +unrecog_insn_for_forw_only_p (rtx_insn *insn)
I would call this "use_or_clobber_starts_range_p" or something like that.


> +{
> +  if (insn && !INSN_P (insn) && recog_memoized (insn) >= 0)
> +    return false;
I would drop the test that INSN is not NULL in this test.  There's no 
way it can ever be NULL here.

If you really want to check that, then I'd do something like

gcc_assert (INSN);

Instead of checking it in that condition.




> @@ -6320,11 +6337,28 @@ prune_ready_list (state_t temp_state, bool first_cycle_insn_p,
>   	    }
>   	  else if (recog_memoized (insn) < 0)
>   	    {
> -	      if (!first_cycle_insn_p
> -		  && (GET_CODE (PATTERN (insn)) == ASM_INPUT
> -		      || asm_noperands (PATTERN (insn)) >= 0))
> -		cost = 1;
> -	      reason = "asm";
> +	      if (GET_CODE (PATTERN (insn)) == ASM_INPUT
> +		  || asm_noperands (PATTERN (insn)) >= 0)
> +		{
> +		  reason = "asm";
> +		  if (!first_cycle_insn_p)
> +		    cost = 1;
> +		}
> +	      else if (unrecog_insn_for_forw_only_p (insn))
> +		{
> +		  reason = "unrecog insn";
> +		  if (!first_cycle_insn_p)
> +		    cost = 1;
> +		  else
> +		    {
> +		      int j = i;
> +		      while (n > ++j)
> +			if (!unrecog_insn_for_forw_only_p (ready_element (&ready, j)))
> +			  break;
> +
> +		      cost = (j == n) ? 0 : 1;
> +		    }
Why do you need a different cost based on what's in the ready list? 
Isn't the only property we're looking for whether or not the USE/CLOBBER 
opens a live range?

Jeff
  
Jin Ma June 12, 2023, 3:38 a.m. UTC | #3
> On 5/29/23 04:51, Jin Ma wrote:
> >    Unrecog insns (such as CLOBBER, USE) does not represent real instructions, but in the
> > process of pipeline optimization, they will wait for transmission in ready list like
> > other insns, without considering resource conflicts and cycles. This results in a
> > multi-issue CPU architecture that can be issued at any time if other regular insns
> > have resource conflicts or cannot be launched for other reasons. As a result, its
> > position is advanced in the generated insns sequence, which will affect register
> > allocation and often lead to more redundant mov instructions.
> > 
> > A simple example:
> > https://github.com/majin2020/gcc-test/blob/master/test.c
> > This is a function in the dhrystone benchmark.
> > 
> > https://github.com/majin2020/gcc-test/blob/0b08c1a13de9663d7d9aba7539b960ec0607ca24/test.c.299r.sched1
> > This is a log of the pass 'sched1' When issue_rate == 2. Among them, insn 13 and 14 are
> > much ahead of schedule, which risks generating redundant mov instructions, which seems
> > unreasonable.
> > 
> > Therefore, I submit patch again on the basis of the last review opinions to try to solve
> > this problem.
> > 
> > This is the new log of shed1 after patch is added.
> > https://github.com/majin2020/gcc-test/commit/efcb43e3369e771bde702955048bfe3f501263dd
> > 
> > gcc/ChangeLog:
> > 
> >          * haifa-sched.cc (unrecog_insn_for_forw_only_p): New.
> >          (prune_ready_list): UNRECOG INSN is not executed in advance if it starts a
> > 	live range.
> > ---
> >   gcc/haifa-sched.cc | 44 +++++++++++++++++++++++++++++++++++++++-----
> >   1 file changed, 39 insertions(+), 5 deletions(-)
> > 
> > diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
> > index 2c881ede0ec..205680a4936 100644
> > --- a/gcc/haifa-sched.cc
> > +++ b/gcc/haifa-sched.cc
> > @@ -765,6 +765,23 @@ real_insn_for_shadow (rtx_insn *insn)
> >     return pair->i1;
> >   }
> >   
> > +/* Return true if INSN is unrecog that starts a live range.  */
> I would rewrite this as
> 
> /* Return TRUE if INSN (a USE or CLOBBER) starts a new live
>     range, FALSE otherwise.  */

Ok.

> > +
> > +static bool
> > +unrecog_insn_for_forw_only_p (rtx_insn *insn)
> I would call this "use_or_clobber_starts_range_p" or something like that.

Ok.

> > +{
> > +  if (insn && !INSN_P (insn) && recog_memoized (insn) >= 0)
> > +    return false;
> I would drop the test that INSN is not NULL in this test.  There's no 
> way it can ever be NULL here.
> 
> If you really want to check that, then I'd do something like
> 
> gcc_assert (INSN);
> 
> Instead of checking it in that condition.

Ok.

> > @@ -6320,11 +6337,28 @@ prune_ready_list (state_t temp_state, bool first_cycle_insn_p,
> >   	    }
> >   	  else if (recog_memoized (insn) < 0)
> >   	    {
> > -	      if (!first_cycle_insn_p
> > -		  && (GET_CODE (PATTERN (insn)) == ASM_INPUT
> > -		      || asm_noperands (PATTERN (insn)) >= 0))
> > -		cost = 1;
> > -	      reason = "asm";
> > +	      if (GET_CODE (PATTERN (insn)) == ASM_INPUT
> > +		  || asm_noperands (PATTERN (insn)) >= 0)
> > +		{
> > +		  reason = "asm";
> > +		  if (!first_cycle_insn_p)
> > +		    cost = 1;
> > +		}
> > +	      else if (unrecog_insn_for_forw_only_p (insn))
> > +		{
> > +		  reason = "unrecog insn";
> > +		  if (!first_cycle_insn_p)
> > +		    cost = 1;
> > +		  else
> > +		    {
> > +		      int j = i;
> > +		      while (n > ++j)
> > +			if (!unrecog_insn_for_forw_only_p (ready_element (&ready, j)))
> > +			  break;
> > +
> > +		      cost = (j == n) ? 0 : 1;
> > +		    }
> Why do you need a different cost based on what's in the ready list? 
> Isn't the only property we're looking for whether or not the USE/CLOBBER 
> opens a live range?
> 
> Jeff

For this, I found that if I only look for the USE/CLOBBER  that opens a live range,
when there is only the USE/CLOBBERs left in the ready list, there will be an infinite
loop, because we will always postpone it to the next cycle(cost = 1), causing it to
never be emitted and always be in the ready list.

So I think (may not be correct) when there is only the USE/CLOBBERs left in the ready
list, the cost should be set to 0, and the USE/CLOBBER can be emitted immediately.

Maybe there's a better way?
  
Jeff Law Nov. 11, 2023, 6:51 p.m. UTC | #4
On 6/11/23 21:38, Jin Ma wrote:

>> Why do you need a different cost based on what's in the ready list?
>> Isn't the only property we're looking for whether or not the USE/CLOBBER
>> opens a live range?
>>
>> Jeff
> 
> For this, I found that if I only look for the USE/CLOBBER  that opens a live range,
> when there is only the USE/CLOBBERs left in the ready list, there will be an infinite
> loop, because we will always postpone it to the next cycle(cost = 1), causing it to
> never be emitted and always be in the ready list.
> 
> So I think (may not be correct) when there is only the USE/CLOBBERs left in the ready
> list, the cost should be set to 0, and the USE/CLOBBER can be emitted immediately.
> 
> Maybe there's a better way?
Yea, I guess this makes sense.  Let me take a look at your V2 with that 
in mind.

Sorry for the long delays here.

jeff
  

Patch

diff --git a/gcc/haifa-sched.cc b/gcc/haifa-sched.cc
index 2c881ede0ec..205680a4936 100644
--- a/gcc/haifa-sched.cc
+++ b/gcc/haifa-sched.cc
@@ -765,6 +765,23 @@  real_insn_for_shadow (rtx_insn *insn)
   return pair->i1;
 }
 
+/* Return true if INSN is unrecog that starts a live range.  */
+
+static bool
+unrecog_insn_for_forw_only_p (rtx_insn *insn)
+{
+  if (insn && !INSN_P (insn) && recog_memoized (insn) >= 0)
+    return false;
+
+  if ((GET_CODE (PATTERN (insn)) == CLOBBER
+       || GET_CODE (PATTERN (insn)) == USE)
+      && !sd_lists_empty_p (insn, SD_LIST_FORW)
+      && sd_lists_empty_p (insn, SD_LIST_BACK))
+    return true;
+
+  return false;
+}
+
 /* For a pair P of insns, return the fixed distance in cycles from the first
    insn after which the second must be scheduled.  */
 static int
@@ -6320,11 +6337,28 @@  prune_ready_list (state_t temp_state, bool first_cycle_insn_p,
 	    }
 	  else if (recog_memoized (insn) < 0)
 	    {
-	      if (!first_cycle_insn_p
-		  && (GET_CODE (PATTERN (insn)) == ASM_INPUT
-		      || asm_noperands (PATTERN (insn)) >= 0))
-		cost = 1;
-	      reason = "asm";
+	      if (GET_CODE (PATTERN (insn)) == ASM_INPUT
+		  || asm_noperands (PATTERN (insn)) >= 0)
+		{
+		  reason = "asm";
+		  if (!first_cycle_insn_p)
+		    cost = 1;
+		}
+	      else if (unrecog_insn_for_forw_only_p (insn))
+		{
+		  reason = "unrecog insn";
+		  if (!first_cycle_insn_p)
+		    cost = 1;
+		  else
+		    {
+		      int j = i;
+		      while (n > ++j)
+			if (!unrecog_insn_for_forw_only_p (ready_element (&ready, j)))
+			  break;
+
+		      cost = (j == n) ? 0 : 1;
+		    }
+		}
 	    }
 	  else if (sched_pressure != SCHED_PRESSURE_NONE)
 	    {