[v1] Mode-Switching: Add optional EMIT_AFTER hook
Checks
Commit Message
From: Pan Li <pan2.li@intel.com>
We have EMIT hook in mode switching already, which will insert the
insn before in most cases. However, in some arch like RISC-V, it
requires the additional insn to be inserted after when meet a call.
|
| <- EMIT HOOK, insert the insn before.
+-----------+
| ptr->insn |
+-----------+
| <- EMIT_AFTER HOOK, insert the insn after.
|
Thus, this patch would like to add one optional EMIT_AFTER hook, which
will try to insert the emitted insn after. The end-user can either
implement this HOOK or leave it NULL as is.
If the backend ignore this optinal hook, there is no impact to the
original mode switching stuff. If the backend implement this optional
hook, the mode switching will try to insert the insn after. Please note
the EMIT_AFTER doen't have any impact to EMIT hook.
Passed both the regression and bootstrap test in x86.
Signed-off-by: Pan Li <pan2.li@intel.com>
gcc/ChangeLog:
* doc/tm.texi: Add hook def and update the description.
* doc/tm.texi.in: Ditto.
* mode-switching.cc (optimize_mode_switching): Insert the
emitted insn after ptr->insn.
* target.def (insn): Define emit_after hook.
---
gcc/doc/tm.texi | 12 ++++++++++--
gcc/doc/tm.texi.in | 6 ++++--
gcc/mode-switching.cc | 45 +++++++++++++++++++++++++++++++++++++++++++
gcc/target.def | 9 +++++++++
4 files changed, 68 insertions(+), 4 deletions(-)
Comments
On 8/21/23 01:26, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
>
> We have EMIT hook in mode switching already, which will insert the
> insn before in most cases. However, in some arch like RISC-V, it
> requires the additional insn to be inserted after when meet a call.
>
> |
> | <- EMIT HOOK, insert the insn before.
> +-----------+
> | ptr->insn |
> +-----------+
> | <- EMIT_AFTER HOOK, insert the insn after.
> |
>
> Thus, this patch would like to add one optional EMIT_AFTER hook, which
> will try to insert the emitted insn after. The end-user can either
> implement this HOOK or leave it NULL as is.
>
> If the backend ignore this optinal hook, there is no impact to the
> original mode switching stuff. If the backend implement this optional
> hook, the mode switching will try to insert the insn after. Please note
> the EMIT_AFTER doen't have any impact to EMIT hook.
>
> Passed both the regression and bootstrap test in x86.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
>
> gcc/ChangeLog:
>
> * doc/tm.texi: Add hook def and update the description.
> * doc/tm.texi.in: Ditto.
> * mode-switching.cc (optimize_mode_switching): Insert the
> emitted insn after ptr->insn.
> * target.def (insn): Define emit_after hook.
Not a full review. I think I need to know a bit more about why you need
these additional hooks.
Presumably you can't use the current ".emit" hook because it doesn't
give you access to the block or insn that you can then iterate on for
insertion on the outgoing edges?
> @@ -831,6 +833,49 @@ optimize_mode_switching (void)
> emit_insn_before (mode_set, ptr->insn_ptr);
> }
>
> + if (targetm.mode_switching.emit_after)
> + {
> + if (control_flow_insn_p (ptr->insn_ptr)
> + && ptr->insn_ptr == BB_END (bb))
I'm not aware of a case where we can have an insn with control flow that
isn't the end of the block. So perhaps then that second conditional
into an assertion inside the true arm?
> + {
> + edge eg;
> + edge_iterator eg_iterator;
> +
> + FOR_EACH_EDGE (eg, eg_iterator, bb->succs)
> + {
> + start_sequence ();
> + targetm.mode_switching.emit_after (entity_map[j],
> + ptr->mode, cur_mode, ptr->regs_live);
> + mode_set = get_insns ();
> + end_sequence ();
> +
> + if (mode_set != NULL_RTX)
> + {
> + if (eg->flags & EDGE_ABNORMAL)
> + insert_insn_end_basic_block (mode_set, bb);
> + else
> + insert_insn_on_edge (mode_set, eg);
Is this really correct for EDGE_ABNORMAL? If the abnormal edge is
created by, say a nonlocal goto, exception handling, etc, then the insn
you insert at the end of the block will never be executed.
This is a classic problem with these classes of algorithms and I suspect
there's code elsewhere to deal with these cases.
Jeff
Thanks Jeff for comments, and sorry for late response.
The background comes from the CALL insn. For the RISC-V dynamic rounding mode we need to
1. restore the frm BEFORE call, to avoid the static rounding mode pollute the call.
2. Backup the frm AFTER call, to ensure the frm value after call is live.
Currently, we don’t take care of it elegantly but we would like to refine this part by the optional EMIT_AFTER.
> I'm not aware of a case where we can have an insn with control flow that
> isn't the end of the block. So perhaps then that second conditional
> into an assertion inside the true arm?
Not very sure my understanding is correct, but there may be a call insn in the middle of the bb,
And can be considered as control flow?
> Is this really correct for EDGE_ABNORMAL? If the abnormal edge is
> created by, say a nonlocal goto, exception handling, etc, then the insn
> you insert at the end of the block will never be executed.
Got it, let me have a try for this, as well as there is somewhere take care of this already.
Pan
-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com>
Sent: Monday, August 21, 2023 10:24 PM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
On 8/21/23 01:26, pan2.li@intel.com wrote:
> From: Pan Li <pan2.li@intel.com>
>
> We have EMIT hook in mode switching already, which will insert the
> insn before in most cases. However, in some arch like RISC-V, it
> requires the additional insn to be inserted after when meet a call.
>
> |
> | <- EMIT HOOK, insert the insn before.
> +-----------+
> | ptr->insn |
> +-----------+
> | <- EMIT_AFTER HOOK, insert the insn after.
> |
>
> Thus, this patch would like to add one optional EMIT_AFTER hook, which
> will try to insert the emitted insn after. The end-user can either
> implement this HOOK or leave it NULL as is.
>
> If the backend ignore this optinal hook, there is no impact to the
> original mode switching stuff. If the backend implement this optional
> hook, the mode switching will try to insert the insn after. Please note
> the EMIT_AFTER doen't have any impact to EMIT hook.
>
> Passed both the regression and bootstrap test in x86.
>
> Signed-off-by: Pan Li <pan2.li@intel.com>
>
> gcc/ChangeLog:
>
> * doc/tm.texi: Add hook def and update the description.
> * doc/tm.texi.in: Ditto.
> * mode-switching.cc (optimize_mode_switching): Insert the
> emitted insn after ptr->insn.
> * target.def (insn): Define emit_after hook.
Not a full review. I think I need to know a bit more about why you need
these additional hooks.
Presumably you can't use the current ".emit" hook because it doesn't
give you access to the block or insn that you can then iterate on for
insertion on the outgoing edges?
> @@ -831,6 +833,49 @@ optimize_mode_switching (void)
> emit_insn_before (mode_set, ptr->insn_ptr);
> }
>
> + if (targetm.mode_switching.emit_after)
> + {
> + if (control_flow_insn_p (ptr->insn_ptr)
> + && ptr->insn_ptr == BB_END (bb))
I'm not aware of a case where we can have an insn with control flow that
isn't the end of the block. So perhaps then that second conditional
into an assertion inside the true arm?
> + {
> + edge eg;
> + edge_iterator eg_iterator;
> +
> + FOR_EACH_EDGE (eg, eg_iterator, bb->succs)
> + {
> + start_sequence ();
> + targetm.mode_switching.emit_after (entity_map[j],
> + ptr->mode, cur_mode, ptr->regs_live);
> + mode_set = get_insns ();
> + end_sequence ();
> +
> + if (mode_set != NULL_RTX)
> + {
> + if (eg->flags & EDGE_ABNORMAL)
> + insert_insn_end_basic_block (mode_set, bb);
> + else
> + insert_insn_on_edge (mode_set, eg);
Is this really correct for EDGE_ABNORMAL? If the abnormal edge is
created by, say a nonlocal goto, exception handling, etc, then the insn
you insert at the end of the block will never be executed.
This is a classic problem with these classes of algorithms and I suspect
there's code elsewhere to deal with these cases.
Jeff
On 8/23/23 00:03, Li, Pan2 wrote:
> Thanks Jeff for comments, and sorry for late response.
>
> The background comes from the CALL insn. For the RISC-V dynamic rounding mode we need to
>
> 1. restore the frm BEFORE call, to avoid the static rounding mode pollute the call.
> 2. Backup the frm AFTER call, to ensure the frm value after call is live.
>
> Currently, we don’t take care of it elegantly but we would like to refine this part by the optional EMIT_AFTER.
Understood. So the natural question is why does x86/sh not need this
for its mode switching? Don't all the same issues exist on those
targets as well?
>
>> I'm not aware of a case where we can have an insn with control flow that
>> isn't the end of the block. So perhaps then that second conditional
>> into an assertion inside the true arm?
>
> Not very sure my understanding is correct, but there may be a call insn in the middle of the bb,
> And can be considered as control flow?
In the case where the call is control flow, then it'll end the block.
Examples of this would be if the call could throw or perform a nonlocal
goto. For "normal" calls, they are not considered control flow and can
show up in the middle of a block.
>
>> Is this really correct for EDGE_ABNORMAL? If the abnormal edge is
>> created by, say a nonlocal goto, exception handling, etc, then the insn
>> you insert at the end of the block will never be executed.
>
> Got it, let me have a try for this, as well as there is somewhere take care of this already.
You might also peek at the RTL gcse/pre code which is also LCM based and
has the same class of problems.
jeff
Thanks Jeff for comments.
> Understood. So the natural question is why does x86/sh not need this
> for its mode switching? Don't all the same issues exist on those
> targets as well?
AFAIK, it comes from the different design principle between the risc-v and x86/arm intrinsic API.
The risc-v rvv FP rounding mode intrinsic API has one abstract level above the insn itself, while
the x86/arm only indicates the semantics of the insn.
For example, if one vector instruction VFADD doesn't have static rounding mode (aka encoding rm in insn),
there is no such a intrinsic API contains rounding mode argument in x86/arm. While the risc-v fp
vector intrinsic will always have static rounding mode API if the frm is honored.
In short, the risc-v intrinsic API is closer to the end-user, while the x86/arm instrinsic API is closer to insn itself.
For the rest part, will have a try based on your suggestion soon as I am in the middle of something.
Pan
-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com>
Sent: Wednesday, August 23, 2023 10:25 PM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
On 8/23/23 00:03, Li, Pan2 wrote:
> Thanks Jeff for comments, and sorry for late response.
>
> The background comes from the CALL insn. For the RISC-V dynamic rounding mode we need to
>
> 1. restore the frm BEFORE call, to avoid the static rounding mode pollute the call.
> 2. Backup the frm AFTER call, to ensure the frm value after call is live.
>
> Currently, we don’t take care of it elegantly but we would like to refine this part by the optional EMIT_AFTER.
Understood. So the natural question is why does x86/sh not need this
for its mode switching? Don't all the same issues exist on those
targets as well?
>
>> I'm not aware of a case where we can have an insn with control flow that
>> isn't the end of the block. So perhaps then that second conditional
>> into an assertion inside the true arm?
>
> Not very sure my understanding is correct, but there may be a call insn in the middle of the bb,
> And can be considered as control flow?
In the case where the call is control flow, then it'll end the block.
Examples of this would be if the call could throw or perform a nonlocal
goto. For "normal" calls, they are not considered control flow and can
show up in the middle of a block.
>
>> Is this really correct for EDGE_ABNORMAL? If the abnormal edge is
>> created by, say a nonlocal goto, exception handling, etc, then the insn
>> you insert at the end of the block will never be executed.
>
> Got it, let me have a try for this, as well as there is somewhere take care of this already.
You might also peek at the RTL gcse/pre code which is also LCM based and
has the same class of problems.
jeff
On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
>
>> Understood. So the natural question is why does x86/sh not need this
>> for its mode switching? Don't all the same issues exist on those
>> targets as well?
>
> AFAIK, it comes from the different design principle between the risc-v and x86/arm intrinsic API.
> The risc-v rvv FP rounding mode intrinsic API has one abstract level above the insn itself, while
> the x86/arm only indicates the semantics of the insn.
>
> For example, if one vector instruction VFADD doesn't have static rounding mode (aka encoding rm in insn),
> there is no such a intrinsic API contains rounding mode argument in x86/arm. While the risc-v fp
> vector intrinsic will always have static rounding mode API if the frm is honored.
>
> In short, the risc-v intrinsic API is closer to the end-user, while the x86/arm instrinsic API is closer to insn itself.
OK, but I'm still strugging to see how the distinction is important
here. Ultimately there's a state at a call site. We need to make sure
that state from the current function doesn't impact the callee and we
need to make sure that the callee doesn't impact the state in the caller.
That implies a save/restore pair around the call (possibly optimized so
that we minimize the number of save/restores). I would have expected
x86 to already be doing this. But maybe there's some ABI thing around
mmx vs x86 state that allows it to be avoided....
>
> For the rest part, will have a try based on your suggestion soon as I am in the middle of something.
No problem. Get to it when you can. I think it affects you more than
me :-)
jeff
Thanks Jeff.
> That implies a save/restore pair around the call (possibly optimized so
> that we minimize the number of save/restores). I would have expected
> x86 to already be doing this. But maybe there's some ABI thing around
> mmx vs x86 state that allows it to be avoided....
Very similar to save/restore but optional.
If no static rounding mode instrinsic here, it is unnecessary to add save/restore
pair around the call. I bet mode-switching take care of this already.
Pan
-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com>
Sent: Thursday, August 24, 2023 7:27 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
>
>> Understood. So the natural question is why does x86/sh not need this
>> for its mode switching? Don't all the same issues exist on those
>> targets as well?
>
> AFAIK, it comes from the different design principle between the risc-v and x86/arm intrinsic API.
> The risc-v rvv FP rounding mode intrinsic API has one abstract level above the insn itself, while
> the x86/arm only indicates the semantics of the insn.
>
> For example, if one vector instruction VFADD doesn't have static rounding mode (aka encoding rm in insn),
> there is no such a intrinsic API contains rounding mode argument in x86/arm. While the risc-v fp
> vector intrinsic will always have static rounding mode API if the frm is honored.
>
> In short, the risc-v intrinsic API is closer to the end-user, while the x86/arm instrinsic API is closer to insn itself.
OK, but I'm still strugging to see how the distinction is important
here. Ultimately there's a state at a call site. We need to make sure
that state from the current function doesn't impact the callee and we
need to make sure that the callee doesn't impact the state in the caller.
That implies a save/restore pair around the call (possibly optimized so
that we minimize the number of save/restores). I would have expected
x86 to already be doing this. But maybe there's some ABI thing around
mmx vs x86 state that allows it to be avoided....
>
> For the rest part, will have a try based on your suggestion soon as I am in the middle of something.
No problem. Get to it when you can. I think it affects you more than
me :-)
jeff
Hi Jeff,
> You might also peek at the RTL gcse/pre code which is also LCM based and
> has the same class of problems.
I found a similar approach to take care of this in gcse.cc/pre_edge_insert with some comments as below.
/* We can't insert anything on an abnormal and
critical edge, so we insert the insn at the end of
the previous block. There are several alternatives
detailed in Morgans book P277 (sec 10.5) for
handling this situation. This one is easiest for
now. */
if (eg->flags & EDGE_ABNORMAL)
insert_insn_end_basic_block (index_map[j], bb);
else
{
insn = process_insert_insn (index_map[j]);
insert_insn_on_edge (insn, eg);
}
It looks the insert_insn_end_basic_block is designed to handle the ABNORMAL edge by inserting at end of previous block from the comments.
Pan
-----Original Message-----
From: Gcc-patches <gcc-patches-bounces+pan2.li=intel.com@gcc.gnu.org> On Behalf Of Li, Pan2 via Gcc-patches
Sent: Thursday, August 24, 2023 12:54 PM
To: Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
Thanks Jeff.
> That implies a save/restore pair around the call (possibly optimized so
> that we minimize the number of save/restores). I would have expected
> x86 to already be doing this. But maybe there's some ABI thing around
> mmx vs x86 state that allows it to be avoided....
Very similar to save/restore but optional.
If no static rounding mode instrinsic here, it is unnecessary to add save/restore
pair around the call. I bet mode-switching take care of this already.
Pan
-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com>
Sent: Thursday, August 24, 2023 7:27 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
>
>> Understood. So the natural question is why does x86/sh not need this
>> for its mode switching? Don't all the same issues exist on those
>> targets as well?
>
> AFAIK, it comes from the different design principle between the risc-v and x86/arm intrinsic API.
> The risc-v rvv FP rounding mode intrinsic API has one abstract level above the insn itself, while
> the x86/arm only indicates the semantics of the insn.
>
> For example, if one vector instruction VFADD doesn't have static rounding mode (aka encoding rm in insn),
> there is no such a intrinsic API contains rounding mode argument in x86/arm. While the risc-v fp
> vector intrinsic will always have static rounding mode API if the frm is honored.
>
> In short, the risc-v intrinsic API is closer to the end-user, while the x86/arm instrinsic API is closer to insn itself.
OK, but I'm still strugging to see how the distinction is important
here. Ultimately there's a state at a call site. We need to make sure
that state from the current function doesn't impact the callee and we
need to make sure that the callee doesn't impact the state in the caller.
That implies a save/restore pair around the call (possibly optimized so
that we minimize the number of save/restores). I would have expected
x86 to already be doing this. But maybe there's some ABI thing around
mmx vs x86 state that allows it to be avoided....
>
> For the rest part, will have a try based on your suggestion soon as I am in the middle of something.
No problem. Get to it when you can. I think it affects you more than
me :-)
jeff
Hi Jeff,
Kindly ping for the Patch V2 as below.
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628508.html
Pan
-----Original Message-----
From: Li, Pan2 <pan2.li@intel.com>
Sent: Friday, August 25, 2023 8:45 PM
To: Li, Pan2 <pan2.li@intel.com>; Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
Hi Jeff,
> You might also peek at the RTL gcse/pre code which is also LCM based and
> has the same class of problems.
I found a similar approach to take care of this in gcse.cc/pre_edge_insert with some comments as below.
/* We can't insert anything on an abnormal and
critical edge, so we insert the insn at the end of
the previous block. There are several alternatives
detailed in Morgans book P277 (sec 10.5) for
handling this situation. This one is easiest for
now. */
if (eg->flags & EDGE_ABNORMAL)
insert_insn_end_basic_block (index_map[j], bb);
else
{
insn = process_insert_insn (index_map[j]);
insert_insn_on_edge (insn, eg);
}
It looks the insert_insn_end_basic_block is designed to handle the ABNORMAL edge by inserting at end of previous block from the comments.
Pan
-----Original Message-----
From: Gcc-patches <gcc-patches-bounces+pan2.li=intel.com@gcc.gnu.org> On Behalf Of Li, Pan2 via Gcc-patches
Sent: Thursday, August 24, 2023 12:54 PM
To: Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
Thanks Jeff.
> That implies a save/restore pair around the call (possibly optimized so
> that we minimize the number of save/restores). I would have expected
> x86 to already be doing this. But maybe there's some ABI thing around
> mmx vs x86 state that allows it to be avoided....
Very similar to save/restore but optional.
If no static rounding mode instrinsic here, it is unnecessary to add save/restore
pair around the call. I bet mode-switching take care of this already.
Pan
-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com>
Sent: Thursday, August 24, 2023 7:27 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
>
>> Understood. So the natural question is why does x86/sh not need this
>> for its mode switching? Don't all the same issues exist on those
>> targets as well?
>
> AFAIK, it comes from the different design principle between the risc-v and x86/arm intrinsic API.
> The risc-v rvv FP rounding mode intrinsic API has one abstract level above the insn itself, while
> the x86/arm only indicates the semantics of the insn.
>
> For example, if one vector instruction VFADD doesn't have static rounding mode (aka encoding rm in insn),
> there is no such a intrinsic API contains rounding mode argument in x86/arm. While the risc-v fp
> vector intrinsic will always have static rounding mode API if the frm is honored.
>
> In short, the risc-v intrinsic API is closer to the end-user, while the x86/arm instrinsic API is closer to insn itself.
OK, but I'm still strugging to see how the distinction is important
here. Ultimately there's a state at a call site. We need to make sure
that state from the current function doesn't impact the callee and we
need to make sure that the callee doesn't impact the state in the caller.
That implies a save/restore pair around the call (possibly optimized so
that we minimize the number of save/restores). I would have expected
x86 to already be doing this. But maybe there's some ABI thing around
mmx vs x86 state that allows it to be avoided....
>
> For the rest part, will have a try based on your suggestion soon as I am in the middle of something.
No problem. Get to it when you can. I think it affects you more than
me :-)
jeff
Almost forget about this patch, sorry for disturbing and kindly ping again.
Pan
-----Original Message-----
From: Gcc-patches <gcc-patches-bounces+pan2.li=intel.com@gcc.gnu.org> On Behalf Of Li, Pan2 via Gcc-patches
Sent: Monday, September 11, 2023 4:37 PM
To: Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
Hi Jeff,
Kindly ping for the Patch V2 as below.
https://gcc.gnu.org/pipermail/gcc-patches/2023-August/628508.html
Pan
-----Original Message-----
From: Li, Pan2 <pan2.li@intel.com>
Sent: Friday, August 25, 2023 8:45 PM
To: Li, Pan2 <pan2.li@intel.com>; Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
Hi Jeff,
> You might also peek at the RTL gcse/pre code which is also LCM based and
> has the same class of problems.
I found a similar approach to take care of this in gcse.cc/pre_edge_insert with some comments as below.
/* We can't insert anything on an abnormal and
critical edge, so we insert the insn at the end of
the previous block. There are several alternatives
detailed in Morgans book P277 (sec 10.5) for
handling this situation. This one is easiest for
now. */
if (eg->flags & EDGE_ABNORMAL)
insert_insn_end_basic_block (index_map[j], bb);
else
{
insn = process_insert_insn (index_map[j]);
insert_insn_on_edge (insn, eg);
}
It looks the insert_insn_end_basic_block is designed to handle the ABNORMAL edge by inserting at end of previous block from the comments.
Pan
-----Original Message-----
From: Gcc-patches <gcc-patches-bounces+pan2.li=intel.com@gcc.gnu.org> On Behalf Of Li, Pan2 via Gcc-patches
Sent: Thursday, August 24, 2023 12:54 PM
To: Jeff Law <jeffreyalaw@gmail.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: RE: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
Thanks Jeff.
> That implies a save/restore pair around the call (possibly optimized so
> that we minimize the number of save/restores). I would have expected
> x86 to already be doing this. But maybe there's some ABI thing around
> mmx vs x86 state that allows it to be avoided....
Very similar to save/restore but optional.
If no static rounding mode instrinsic here, it is unnecessary to add save/restore
pair around the call. I bet mode-switching take care of this already.
Pan
-----Original Message-----
From: Jeff Law <jeffreyalaw@gmail.com>
Sent: Thursday, August 24, 2023 7:27 AM
To: Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
On 8/23/23 08:54, Li, Pan2 wrote:
> Thanks Jeff for comments.
>
>> Understood. So the natural question is why does x86/sh not need this
>> for its mode switching? Don't all the same issues exist on those
>> targets as well?
>
> AFAIK, it comes from the different design principle between the risc-v and x86/arm intrinsic API.
> The risc-v rvv FP rounding mode intrinsic API has one abstract level above the insn itself, while
> the x86/arm only indicates the semantics of the insn.
>
> For example, if one vector instruction VFADD doesn't have static rounding mode (aka encoding rm in insn),
> there is no such a intrinsic API contains rounding mode argument in x86/arm. While the risc-v fp
> vector intrinsic will always have static rounding mode API if the frm is honored.
>
> In short, the risc-v intrinsic API is closer to the end-user, while the x86/arm instrinsic API is closer to insn itself.
OK, but I'm still strugging to see how the distinction is important
here. Ultimately there's a state at a call site. We need to make sure
that state from the current function doesn't impact the callee and we
need to make sure that the callee doesn't impact the state in the caller.
That implies a save/restore pair around the call (possibly optimized so
that we minimize the number of save/restores). I would have expected
x86 to already be doing this. But maybe there's some ABI thing around
mmx vs x86 state that allows it to be avoided....
>
> For the rest part, will have a try based on your suggestion soon as I am in the middle of something.
No problem. Get to it when you can. I think it affects you more than
me :-)
jeff
On 8/25/23 06:44, Li, Pan2 wrote:
> Hi Jeff,
>
>> You might also peek at the RTL gcse/pre code which is also LCM based and
>> has the same class of problems.
>
> I found a similar approach to take care of this in gcse.cc/pre_edge_insert with some comments as below.
>
> /* We can't insert anything on an abnormal and
> critical edge, so we insert the insn at the end of
> the previous block. There are several alternatives
> detailed in Morgans book P277 (sec 10.5) for
> handling this situation. This one is easiest for
> now. */
>
> if (eg->flags & EDGE_ABNORMAL)
> insert_insn_end_basic_block (index_map[j], bb);
> else
> {
> insn = process_insert_insn (index_map[j]);
> insert_insn_on_edge (insn, eg);
> }
>
> It looks the insert_insn_end_basic_block is designed to handle the ABNORMAL edge by inserting at end of previous block from the comments.
That's probably dead code at this point. IIRC rth did further work in
this space because inserting in the end of the block with the abnormal
edge isn't semantically correct.
It's been 20+ years, but IIRC he adjusted the PRE bitmaps so that we
never would need to do an insertion on an abnormal edge. Search for
EDGE_ABNORMAL in gcse.cc.
Jeff
On 8/23/23 22:53, Li, Pan2 wrote:
> Thanks Jeff.
>
>> That implies a save/restore pair around the call (possibly optimized so
>> that we minimize the number of save/restores). I would have expected
>> x86 to already be doing this. But maybe there's some ABI thing around
>> mmx vs x86 state that allows it to be avoided....
>
> Very similar to save/restore but optional.
> If no static rounding mode instrinsic here, it is unnecessary to add save/restore
> pair around the call. I bet mode-switching take care of this already.
But I still fail to see how this is relevant.
If we go back to the x86 mode switching case. We ultimately still have
to ensure that the caller does not impact the callee and the callee does
not impact the caller. That implies there must be a mechanism to
save/restore the mode at call sites unless the x86 ABI somehow defines
that problem away.
Conceptually the rounding mode is just a property. The call, in effect,
should demand a "normal" rounding mode and set the rounding mode to
unknown if I understand how this is supposed to work. If my
understanding is wrong, then maybe that's where we should start -- with
a good description of the problem ;-)
jeff
> Conceptually the rounding mode is just a property. The call, in
> effect, should demand a "normal" rounding mode and set the rounding
> mode to unknown if I understand how this is supposed to work. If my
> understanding is wrong, then maybe that's where we should start --
> with a good description of the problem ;-)
That's also what I what struggled with last time this was discussed.
Normally, mode switching is used to switch to a requested mode for
an insn or a call and potentially switch back afterwards.
For those riscv intrinsics that specify a variable, non-default rounding
mode we have two options:
- Save and restore before and after each mode-changing intrinsic
fegetround old_rounding
fesetround new_rounding
actual instruction
fesetround old_rounding)
- Have mode switching do it for us (lazily) to avoid most of the
storing of the old rounding mode by storing an (e.g.) function-level
rounding-mode backup value. The backup value is used to lazily
restore the currently valid rounding mode.
The problem with this now is that whenever fesetround gets called
our backup is outdated. Therefore we need to update our backup after
each function call (as fesetround can of course be present anywhere)
and this is where most of the complications come from.
So in that case the callee _does_ impact the caller via the backup
clobbering. That was one of my complaints about the whole procedure
last time. Besides, I didn't see the need for those intrinsics
anyway and would much rather have explicit fesetround calls but well :)
Having said that, it looks like Pan's patch just tries to move some of
the dirty work from the backend to the mode-switching pass by making it
easier to do something after a call. I believe I asked for that back in
one of the reviews even?
Regards
Robin
Thanks Jeff and Robin for comments, sorry for late reply.
> Conceptually the rounding mode is just a property. The call, in effect,
> should demand a "normal" rounding mode and set the rounding mode to
> unknown if I understand how this is supposed to work. If my
> understanding is wrong, then maybe that's where we should start -- with
> a good description of the problem ;-)
I think we are on the same page of how it works, I may need to take a look at how x86 taking care of this.
> That's probably dead code at this point. IIRC rth did further work in
> this space because inserting in the end of the block with the abnormal
> edge isn't semantically correct.
> It's been 20+ years, but IIRC he adjusted the PRE bitmaps so that we
> never would need to do an insertion on an abnormal edge. Search for
> EDGE_ABNORMAL in gcse.cc.
That is quite old up to a point, will have a try for the EDGE_ABNORMAL case.
> Having said that, it looks like Pan's patch just tries to move some of
> the dirty work from the backend to the mode-switching pass by making it
> easier to do something after a call. I believe I asked for that back in
> one of the reviews even?
Yes, that is what I would like to do in this PATCH, as the following up of some comments from Robin in previous.
Pan
-----Original Message-----
From: Robin Dapp <rdapp.gcc@gmail.com>
Sent: Monday, October 2, 2023 4:26 PM
To: Jeff Law <jeffreyalaw@gmail.com>; Li, Pan2 <pan2.li@intel.com>; gcc-patches@gcc.gnu.org
Cc: rdapp.gcc@gmail.com; juzhe.zhong@rivai.ai; Wang, Yanzhang <yanzhang.wang@intel.com>; kito.cheng@gmail.com
Subject: Re: [PATCH v1] Mode-Switching: Add optional EMIT_AFTER hook
> Conceptually the rounding mode is just a property. The call, in
> effect, should demand a "normal" rounding mode and set the rounding
> mode to unknown if I understand how this is supposed to work. If my
> understanding is wrong, then maybe that's where we should start --
> with a good description of the problem ;-)
That's also what I what struggled with last time this was discussed.
Normally, mode switching is used to switch to a requested mode for
an insn or a call and potentially switch back afterwards.
For those riscv intrinsics that specify a variable, non-default rounding
mode we have two options:
- Save and restore before and after each mode-changing intrinsic
fegetround old_rounding
fesetround new_rounding
actual instruction
fesetround old_rounding)
- Have mode switching do it for us (lazily) to avoid most of the
storing of the old rounding mode by storing an (e.g.) function-level
rounding-mode backup value. The backup value is used to lazily
restore the currently valid rounding mode.
The problem with this now is that whenever fesetround gets called
our backup is outdated. Therefore we need to update our backup after
each function call (as fesetround can of course be present anywhere)
and this is where most of the complications come from.
So in that case the callee _does_ impact the caller via the backup
clobbering. That was one of my complaints about the whole procedure
last time. Besides, I didn't see the need for those intrinsics
anyway and would much rather have explicit fesetround calls but well :)
Having said that, it looks like Pan's patch just tries to move some of
the dirty work from the backend to the mode-switching pass by making it
easier to do something after a call. I believe I asked for that back in
one of the reviews even?
Regards
Robin
@@ -10334,8 +10334,8 @@ return nonzero for any @var{entity} that needs mode-switching.
If you define this macro, you also have to define
@code{NUM_MODES_FOR_MODE_SWITCHING}, @code{TARGET_MODE_NEEDED},
@code{TARGET_MODE_PRIORITY} and @code{TARGET_MODE_EMIT}.
-@code{TARGET_MODE_AFTER}, @code{TARGET_MODE_ENTRY}, and @code{TARGET_MODE_EXIT}
-are optional.
+@code{TARGET_MODE_AFTER}, @code{TARGET_MODE_ENTRY}, @code{TARGET_MODE_EMIT_AFTER},
+and @code{TARGET_MODE_EXIT} are optional.
@end defmac
@defmac NUM_MODES_FOR_MODE_SWITCHING
@@ -10359,6 +10359,14 @@ to switch from. Sets of a lower numbered entity will be emitted before
sets of a higher numbered entity to a mode of the same or lower priority.
@end deftypefn
+@deftypefn {Target Hook} void TARGET_MODE_EMIT_AFTER (int @var{entity}, int @var{mode}, int @var{prev_mode}, HARD_REG_SET @var{regs_live})
+Generate one or more insns to set @var{entity} to @var{mode}.
+@var{hard_reg_live} is the set of hard registers live at the point where
+the insn(s) are to be inserted after. @var{prev_moxde} indicates the mode
+to switch from. Sets of a lower numbered entity will be emitted before
+sets of a higher numbered entity to a mode of the same or lower priority.
+@end deftypefn
+
@deftypefn {Target Hook} int TARGET_MODE_NEEDED (int @var{entity}, rtx_insn *@var{insn})
@var{entity} is an integer specifying a mode-switched entity.
If @code{OPTIMIZE_MODE_SWITCHING} is defined, you must define this macro
@@ -6911,8 +6911,8 @@ return nonzero for any @var{entity} that needs mode-switching.
If you define this macro, you also have to define
@code{NUM_MODES_FOR_MODE_SWITCHING}, @code{TARGET_MODE_NEEDED},
@code{TARGET_MODE_PRIORITY} and @code{TARGET_MODE_EMIT}.
-@code{TARGET_MODE_AFTER}, @code{TARGET_MODE_ENTRY}, and @code{TARGET_MODE_EXIT}
-are optional.
+@code{TARGET_MODE_AFTER}, @code{TARGET_MODE_ENTRY}, @code{TARGET_MODE_EMIT_AFTER},
+and @code{TARGET_MODE_EXIT} are optional.
@end defmac
@defmac NUM_MODES_FOR_MODE_SWITCHING
@@ -6930,6 +6930,8 @@ switch is needed / supplied.
@hook TARGET_MODE_EMIT
+@hook TARGET_MODE_EMIT_AFTER
+
@hook TARGET_MODE_NEEDED
@hook TARGET_MODE_AFTER
@@ -34,6 +34,8 @@ along with GCC; see the file COPYING3. If not see
#include "lcm.h"
#include "cfgcleanup.h"
#include "tree-pass.h"
+#include "cfgbuild.h"
+#include "gcse.h"
/* We want target macros for the mode switching code to be able to refer
to instruction attribute values. */
@@ -831,6 +833,49 @@ optimize_mode_switching (void)
emit_insn_before (mode_set, ptr->insn_ptr);
}
+ if (targetm.mode_switching.emit_after)
+ {
+ if (control_flow_insn_p (ptr->insn_ptr)
+ && ptr->insn_ptr == BB_END (bb))
+ {
+ edge eg;
+ edge_iterator eg_iterator;
+
+ FOR_EACH_EDGE (eg, eg_iterator, bb->succs)
+ {
+ start_sequence ();
+ targetm.mode_switching.emit_after (entity_map[j],
+ ptr->mode, cur_mode, ptr->regs_live);
+ mode_set = get_insns ();
+ end_sequence ();
+
+ if (mode_set != NULL_RTX)
+ {
+ if (eg->flags & EDGE_ABNORMAL)
+ insert_insn_end_basic_block (mode_set, bb);
+ else
+ insert_insn_on_edge (mode_set, eg);
+
+ emitted = true;
+ need_commit = true;
+ }
+ }
+ }
+ else
+ {
+ start_sequence ();
+ targetm.mode_switching.emit_after (entity_map[j],
+ ptr->mode, cur_mode, ptr->regs_live);
+ mode_set = get_insns ();
+ end_sequence ();
+
+ if (mode_set != NULL_RTX)
+ {
+ emit_insn_after (mode_set, ptr->insn_ptr);
+ emitted = true;
+ }
+ }
+ }
default_rtl_profile ();
}
@@ -7005,6 +7005,15 @@ to switch from. Sets of a lower numbered entity will be emitted before\n\
sets of a higher numbered entity to a mode of the same or lower priority.",
void, (int entity, int mode, int prev_mode, HARD_REG_SET regs_live), NULL)
+DEFHOOK
+(emit_after,
+ "Generate one or more insns to set @var{entity} to @var{mode}.\n\
+@var{hard_reg_live} is the set of hard registers live at the point where\n\
+the insn(s) are to be inserted after. @var{prev_moxde} indicates the mode\n\
+to switch from. Sets of a lower numbered entity will be emitted before\n\
+sets of a higher numbered entity to a mode of the same or lower priority.",
+ void, (int entity, int mode, int prev_mode, HARD_REG_SET regs_live), NULL)
+
DEFHOOK
(needed,
"@var{entity} is an integer specifying a mode-switched entity.\n\