[v2] swap: Fix incorrect lane extraction by vec_extract() [PR106770]
Checks
Commit Message
swap: Fix incorrect lane extraction by vec_extract() [PR106770]
In the routine rs6000_analyze_swaps(), special handling of swappable
instructions is done even if the webs that contain the swappable
instructions are not optimized, i.e., the webs do not contain any
permuting load/store instructions along with the associated register
swap instructions. Doing special handling in such webs will result in
the extracted lane being adjusted unnecessarily for vec_extract.
Another issue is that existing code treats non-permuting loads/stores
as special swappables. Non-permuting loads/stores (that have not yet
been split into a permuting load/store and a swap) are handled by
converting them into a permuting load/store (which effectively removes
the swap). As a result, if special swappables are handled only in webs
containing permuting loads/stores, then non-optimal code is generated
for non-permuting loads/stores.
Hence, in this patch, all webs containing either permuting loads/
stores or non-permuting loads/stores are marked as requiring special
handling of swappables. Swaps associated with permuting loads/stores
are marked for removal, and non-permuting loads/stores are converted to
permuting loads/stores. Then the special swappables in the webs are
fixed up.
Another issue with always handling swappable instructions is that it is
incorrect to do so in webs where loads/stores on quad word aligned
addresses are changed to lvx/stvx. Similarly, in webs where
swap(load(vector constant)) instructions are replaced with
load(swapped vector constant), the swappable instructions should not be
modified.
2023-09-10 Surya Kumari Jangala <jskumari@linux.ibm.com>
gcc/
PR rtl-optimization/PR106770
* config/rs6000/rs6000-p8swap.cc (non_permuting_mem_insn): New
function.
(handle_non_permuting_mem_insn): New function.
(rs6000_analyze_swaps): Handle swappable instructions only in
certain webs.
(web_requires_special_handling): New instance variable.
(handle_special_swappables): Remove handling of non-permuting
load/store instructions.
gcc/testsuite/
PR rtl-optimization/PR106770
* gcc.target/powerpc/pr106770.c: New test.
---
Comments
Ping
On 10/09/23 10:58 pm, Surya Kumari Jangala wrote:
> swap: Fix incorrect lane extraction by vec_extract() [PR106770]
>
> In the routine rs6000_analyze_swaps(), special handling of swappable
> instructions is done even if the webs that contain the swappable
> instructions are not optimized, i.e., the webs do not contain any
> permuting load/store instructions along with the associated register
> swap instructions. Doing special handling in such webs will result in
> the extracted lane being adjusted unnecessarily for vec_extract.
>
> Another issue is that existing code treats non-permuting loads/stores
> as special swappables. Non-permuting loads/stores (that have not yet
> been split into a permuting load/store and a swap) are handled by
> converting them into a permuting load/store (which effectively removes
> the swap). As a result, if special swappables are handled only in webs
> containing permuting loads/stores, then non-optimal code is generated
> for non-permuting loads/stores.
>
> Hence, in this patch, all webs containing either permuting loads/
> stores or non-permuting loads/stores are marked as requiring special
> handling of swappables. Swaps associated with permuting loads/stores
> are marked for removal, and non-permuting loads/stores are converted to
> permuting loads/stores. Then the special swappables in the webs are
> fixed up.
>
> Another issue with always handling swappable instructions is that it is
> incorrect to do so in webs where loads/stores on quad word aligned
> addresses are changed to lvx/stvx. Similarly, in webs where
> swap(load(vector constant)) instructions are replaced with
> load(swapped vector constant), the swappable instructions should not be
> modified.
>
> 2023-09-10 Surya Kumari Jangala <jskumari@linux.ibm.com>
>
> gcc/
> PR rtl-optimization/PR106770
> * config/rs6000/rs6000-p8swap.cc (non_permuting_mem_insn): New
> function.
> (handle_non_permuting_mem_insn): New function.
> (rs6000_analyze_swaps): Handle swappable instructions only in
> certain webs.
> (web_requires_special_handling): New instance variable.
> (handle_special_swappables): Remove handling of non-permuting
> load/store instructions.
>
> gcc/testsuite/
> PR rtl-optimization/PR106770
> * gcc.target/powerpc/pr106770.c: New test.
> ---
>
> diff --git a/gcc/config/rs6000/rs6000-p8swap.cc b/gcc/config/rs6000/rs6000-p8swap.cc
> index 0388b9bd736..3a695aa1318 100644
> --- a/gcc/config/rs6000/rs6000-p8swap.cc
> +++ b/gcc/config/rs6000/rs6000-p8swap.cc
> @@ -179,6 +179,13 @@ class swap_web_entry : public web_entry_base
> unsigned int special_handling : 4;
> /* Set if the web represented by this entry cannot be optimized. */
> unsigned int web_not_optimizable : 1;
> + /* Set if the swappable insns in the web represented by this entry
> + have to be fixed. Swappable insns have to be fixed in :
> + - webs containing permuting loads/stores and the swap insns
> + in such webs have been marked for removal
> + - webs where non-permuting loads/stores have been converted
> + to permuting loads/stores */
> + unsigned int web_requires_special_handling : 1;
> /* Set if this insn should be deleted. */
> unsigned int will_delete : 1;
> };
> @@ -1468,14 +1475,6 @@ handle_special_swappables (swap_web_entry *insn_entry, unsigned i)
> if (dump_file)
> fprintf (dump_file, "Adjusting subreg in insn %d\n", i);
> break;
> - case SH_NOSWAP_LD:
> - /* Convert a non-permuting load to a permuting one. */
> - permute_load (insn);
> - break;
> - case SH_NOSWAP_ST:
> - /* Convert a non-permuting store to a permuting one. */
> - permute_store (insn);
> - break;
> case SH_EXTRACT:
> /* Change the lane on an extract operation. */
> adjust_extract (insn);
> @@ -2401,6 +2400,25 @@ recombine_lvx_stvx_patterns (function *fun)
> free (to_delete);
> }
>
> +/* Return true if insn is a non-permuting load/store. */
> +static bool
> +non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
> +{
> + return (insn_entry[i].special_handling == SH_NOSWAP_LD ||
> + insn_entry[i].special_handling == SH_NOSWAP_ST);
> +}
> +
> +/* Convert a non-permuting load/store insn to a permuting one. */
> +static void
> +handle_non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
> +{
> + rtx_insn *insn = insn_entry[i].insn;
> + if (insn_entry[i].special_handling == SH_NOSWAP_LD)
> + permute_load (insn);
> + else if (insn_entry[i].special_handling == SH_NOSWAP_ST)
> + permute_store (insn);
> +}
> +
> /* Main entry point for this pass. */
> unsigned int
> rs6000_analyze_swaps (function *fun)
> @@ -2624,25 +2642,56 @@ rs6000_analyze_swaps (function *fun)
> dump_swap_insn_table (insn_entry);
> }
>
> - /* For each load and store in an optimizable web (which implies
> - the loads and stores are permuting), find the associated
> - register swaps and mark them for removal. Due to various
> - optimizations we may mark the same swap more than once. Also
> - perform special handling for swappable insns that require it. */
> + /* There are two kinds of optimizations that can be performed on an
> + optimizable web:
> + 1. Remove the register swaps associated with permuting load/store
> + in an optimizable web
> + 2. Convert the vanilla loads/stores (that have not yet been split
> + into a permuting load/store and a swap) into a permuting
> + load/store (which effectively removes the swap)
> + In both the cases, swappable instructions in the webs need
> + special handling to fix them up. */
> for (i = 0; i < e; ++i)
> + /* For each permuting load/store in an optimizable web, find
> + the associated register swaps and mark them for removal.
> + Due to various optimizations we may mark the same swap more
> + than once. */
> if ((insn_entry[i].is_load || insn_entry[i].is_store)
> && insn_entry[i].is_swap)
> {
> swap_web_entry* root_entry
> = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
> if (!root_entry->web_not_optimizable)
> - mark_swaps_for_removal (insn_entry, i);
> + {
> + mark_swaps_for_removal (insn_entry, i);
> + root_entry->web_requires_special_handling = true;
> + }
> }
> - else if (insn_entry[i].is_swappable && insn_entry[i].special_handling)
> + /* Convert the non-permuting loads/stores into a permuting
> + load/store. */
> + else if (insn_entry[i].is_swappable
> + && non_permuting_mem_insn (insn_entry, i))
> {
> swap_web_entry* root_entry
> = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
> if (!root_entry->web_not_optimizable)
> + {
> + handle_non_permuting_mem_insn (insn_entry, i);
> + root_entry->web_requires_special_handling = true;
> + }
> + }
> +
> + /* Perform special handling for swappable insns that require it.
> + Note that special handling should be done only for those
> + swappable insns that are present in webs marked as requiring
> + special handling. */
> + for (i = 0; i < e; ++i)
> + if (insn_entry[i].is_swappable && insn_entry[i].special_handling &&
> + !non_permuting_mem_insn (insn_entry, i))
> + {
> + swap_web_entry* root_entry
> + = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
> + if (root_entry->web_requires_special_handling)
> handle_special_swappables (insn_entry, i);
> }
>
> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106770.c b/gcc/testsuite/gcc.target/powerpc/pr106770.c
> new file mode 100644
> index 00000000000..11efe39abc5
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 " } */
> +/* The 2 xxpermdi instructions are generated by the two
> + calls to vec_promote() */
> +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
> +
> +/* Test case to resolve PR106770 */
> +
> +#include <altivec.h>
> +
> +int cmp2(double a, double b)
> +{
> + vector double va = vec_promote(a, 1);
> + vector double vb = vec_promote(b, 1);
> + vector long long vlt = (vector long long)vec_cmplt(va, vb);
> + vector long long vgt = (vector long long)vec_cmplt(vb, va);
> + vector signed long long vr = vec_sub(vlt, vgt);
> +
> + return vec_extract(vr, 1);
> +}
>
Ping
On 20/09/23 7:31 am, Surya Kumari Jangala wrote:
> Ping
>
> On 10/09/23 10:58 pm, Surya Kumari Jangala wrote:
>> swap: Fix incorrect lane extraction by vec_extract() [PR106770]
>>
>> In the routine rs6000_analyze_swaps(), special handling of swappable
>> instructions is done even if the webs that contain the swappable
>> instructions are not optimized, i.e., the webs do not contain any
>> permuting load/store instructions along with the associated register
>> swap instructions. Doing special handling in such webs will result in
>> the extracted lane being adjusted unnecessarily for vec_extract.
>>
>> Another issue is that existing code treats non-permuting loads/stores
>> as special swappables. Non-permuting loads/stores (that have not yet
>> been split into a permuting load/store and a swap) are handled by
>> converting them into a permuting load/store (which effectively removes
>> the swap). As a result, if special swappables are handled only in webs
>> containing permuting loads/stores, then non-optimal code is generated
>> for non-permuting loads/stores.
>>
>> Hence, in this patch, all webs containing either permuting loads/
>> stores or non-permuting loads/stores are marked as requiring special
>> handling of swappables. Swaps associated with permuting loads/stores
>> are marked for removal, and non-permuting loads/stores are converted to
>> permuting loads/stores. Then the special swappables in the webs are
>> fixed up.
>>
>> Another issue with always handling swappable instructions is that it is
>> incorrect to do so in webs where loads/stores on quad word aligned
>> addresses are changed to lvx/stvx. Similarly, in webs where
>> swap(load(vector constant)) instructions are replaced with
>> load(swapped vector constant), the swappable instructions should not be
>> modified.
>>
>> 2023-09-10 Surya Kumari Jangala <jskumari@linux.ibm.com>
>>
>> gcc/
>> PR rtl-optimization/PR106770
>> * config/rs6000/rs6000-p8swap.cc (non_permuting_mem_insn): New
>> function.
>> (handle_non_permuting_mem_insn): New function.
>> (rs6000_analyze_swaps): Handle swappable instructions only in
>> certain webs.
>> (web_requires_special_handling): New instance variable.
>> (handle_special_swappables): Remove handling of non-permuting
>> load/store instructions.
>>
>> gcc/testsuite/
>> PR rtl-optimization/PR106770
>> * gcc.target/powerpc/pr106770.c: New test.
>> ---
>>
>> diff --git a/gcc/config/rs6000/rs6000-p8swap.cc b/gcc/config/rs6000/rs6000-p8swap.cc
>> index 0388b9bd736..3a695aa1318 100644
>> --- a/gcc/config/rs6000/rs6000-p8swap.cc
>> +++ b/gcc/config/rs6000/rs6000-p8swap.cc
>> @@ -179,6 +179,13 @@ class swap_web_entry : public web_entry_base
>> unsigned int special_handling : 4;
>> /* Set if the web represented by this entry cannot be optimized. */
>> unsigned int web_not_optimizable : 1;
>> + /* Set if the swappable insns in the web represented by this entry
>> + have to be fixed. Swappable insns have to be fixed in :
>> + - webs containing permuting loads/stores and the swap insns
>> + in such webs have been marked for removal
>> + - webs where non-permuting loads/stores have been converted
>> + to permuting loads/stores */
>> + unsigned int web_requires_special_handling : 1;
>> /* Set if this insn should be deleted. */
>> unsigned int will_delete : 1;
>> };
>> @@ -1468,14 +1475,6 @@ handle_special_swappables (swap_web_entry *insn_entry, unsigned i)
>> if (dump_file)
>> fprintf (dump_file, "Adjusting subreg in insn %d\n", i);
>> break;
>> - case SH_NOSWAP_LD:
>> - /* Convert a non-permuting load to a permuting one. */
>> - permute_load (insn);
>> - break;
>> - case SH_NOSWAP_ST:
>> - /* Convert a non-permuting store to a permuting one. */
>> - permute_store (insn);
>> - break;
>> case SH_EXTRACT:
>> /* Change the lane on an extract operation. */
>> adjust_extract (insn);
>> @@ -2401,6 +2400,25 @@ recombine_lvx_stvx_patterns (function *fun)
>> free (to_delete);
>> }
>>
>> +/* Return true if insn is a non-permuting load/store. */
>> +static bool
>> +non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
>> +{
>> + return (insn_entry[i].special_handling == SH_NOSWAP_LD ||
>> + insn_entry[i].special_handling == SH_NOSWAP_ST);
>> +}
>> +
>> +/* Convert a non-permuting load/store insn to a permuting one. */
>> +static void
>> +handle_non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
>> +{
>> + rtx_insn *insn = insn_entry[i].insn;
>> + if (insn_entry[i].special_handling == SH_NOSWAP_LD)
>> + permute_load (insn);
>> + else if (insn_entry[i].special_handling == SH_NOSWAP_ST)
>> + permute_store (insn);
>> +}
>> +
>> /* Main entry point for this pass. */
>> unsigned int
>> rs6000_analyze_swaps (function *fun)
>> @@ -2624,25 +2642,56 @@ rs6000_analyze_swaps (function *fun)
>> dump_swap_insn_table (insn_entry);
>> }
>>
>> - /* For each load and store in an optimizable web (which implies
>> - the loads and stores are permuting), find the associated
>> - register swaps and mark them for removal. Due to various
>> - optimizations we may mark the same swap more than once. Also
>> - perform special handling for swappable insns that require it. */
>> + /* There are two kinds of optimizations that can be performed on an
>> + optimizable web:
>> + 1. Remove the register swaps associated with permuting load/store
>> + in an optimizable web
>> + 2. Convert the vanilla loads/stores (that have not yet been split
>> + into a permuting load/store and a swap) into a permuting
>> + load/store (which effectively removes the swap)
>> + In both the cases, swappable instructions in the webs need
>> + special handling to fix them up. */
>> for (i = 0; i < e; ++i)
>> + /* For each permuting load/store in an optimizable web, find
>> + the associated register swaps and mark them for removal.
>> + Due to various optimizations we may mark the same swap more
>> + than once. */
>> if ((insn_entry[i].is_load || insn_entry[i].is_store)
>> && insn_entry[i].is_swap)
>> {
>> swap_web_entry* root_entry
>> = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
>> if (!root_entry->web_not_optimizable)
>> - mark_swaps_for_removal (insn_entry, i);
>> + {
>> + mark_swaps_for_removal (insn_entry, i);
>> + root_entry->web_requires_special_handling = true;
>> + }
>> }
>> - else if (insn_entry[i].is_swappable && insn_entry[i].special_handling)
>> + /* Convert the non-permuting loads/stores into a permuting
>> + load/store. */
>> + else if (insn_entry[i].is_swappable
>> + && non_permuting_mem_insn (insn_entry, i))
>> {
>> swap_web_entry* root_entry
>> = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
>> if (!root_entry->web_not_optimizable)
>> + {
>> + handle_non_permuting_mem_insn (insn_entry, i);
>> + root_entry->web_requires_special_handling = true;
>> + }
>> + }
>> +
>> + /* Perform special handling for swappable insns that require it.
>> + Note that special handling should be done only for those
>> + swappable insns that are present in webs marked as requiring
>> + special handling. */
>> + for (i = 0; i < e; ++i)
>> + if (insn_entry[i].is_swappable && insn_entry[i].special_handling &&
>> + !non_permuting_mem_insn (insn_entry, i))
>> + {
>> + swap_web_entry* root_entry
>> + = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
>> + if (root_entry->web_requires_special_handling)
>> handle_special_swappables (insn_entry, i);
>> }
>>
>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106770.c b/gcc/testsuite/gcc.target/powerpc/pr106770.c
>> new file mode 100644
>> index 00000000000..11efe39abc5
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
>> @@ -0,0 +1,21 @@
>> +/* { dg-do compile } */
>> +/* { dg-require-effective-target powerpc_p8vector_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power8 -O2 " } */
>> +/* The 2 xxpermdi instructions are generated by the two
>> + calls to vec_promote() */
>> +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
>> +
>> +/* Test case to resolve PR106770 */
>> +
>> +#include <altivec.h>
>> +
>> +int cmp2(double a, double b)
>> +{
>> + vector double va = vec_promote(a, 1);
>> + vector double vb = vec_promote(b, 1);
>> + vector long long vlt = (vector long long)vec_cmplt(va, vb);
>> + vector long long vgt = (vector long long)vec_cmplt(vb, va);
>> + vector signed long long vr = vec_sub(vlt, vgt);
>> +
>> + return vec_extract(vr, 1);
>> +}
>>
Ping
On 03/10/23 3:53 pm, Surya Kumari Jangala wrote:
> Ping
>
> On 20/09/23 7:31 am, Surya Kumari Jangala wrote:
>> Ping
>>
>> On 10/09/23 10:58 pm, Surya Kumari Jangala wrote:
>>> swap: Fix incorrect lane extraction by vec_extract() [PR106770]
>>>
>>> In the routine rs6000_analyze_swaps(), special handling of swappable
>>> instructions is done even if the webs that contain the swappable
>>> instructions are not optimized, i.e., the webs do not contain any
>>> permuting load/store instructions along with the associated register
>>> swap instructions. Doing special handling in such webs will result in
>>> the extracted lane being adjusted unnecessarily for vec_extract.
>>>
>>> Another issue is that existing code treats non-permuting loads/stores
>>> as special swappables. Non-permuting loads/stores (that have not yet
>>> been split into a permuting load/store and a swap) are handled by
>>> converting them into a permuting load/store (which effectively removes
>>> the swap). As a result, if special swappables are handled only in webs
>>> containing permuting loads/stores, then non-optimal code is generated
>>> for non-permuting loads/stores.
>>>
>>> Hence, in this patch, all webs containing either permuting loads/
>>> stores or non-permuting loads/stores are marked as requiring special
>>> handling of swappables. Swaps associated with permuting loads/stores
>>> are marked for removal, and non-permuting loads/stores are converted to
>>> permuting loads/stores. Then the special swappables in the webs are
>>> fixed up.
>>>
>>> Another issue with always handling swappable instructions is that it is
>>> incorrect to do so in webs where loads/stores on quad word aligned
>>> addresses are changed to lvx/stvx. Similarly, in webs where
>>> swap(load(vector constant)) instructions are replaced with
>>> load(swapped vector constant), the swappable instructions should not be
>>> modified.
>>>
>>> 2023-09-10 Surya Kumari Jangala <jskumari@linux.ibm.com>
>>>
>>> gcc/
>>> PR rtl-optimization/PR106770
>>> * config/rs6000/rs6000-p8swap.cc (non_permuting_mem_insn): New
>>> function.
>>> (handle_non_permuting_mem_insn): New function.
>>> (rs6000_analyze_swaps): Handle swappable instructions only in
>>> certain webs.
>>> (web_requires_special_handling): New instance variable.
>>> (handle_special_swappables): Remove handling of non-permuting
>>> load/store instructions.
>>>
>>> gcc/testsuite/
>>> PR rtl-optimization/PR106770
>>> * gcc.target/powerpc/pr106770.c: New test.
>>> ---
>>>
>>> diff --git a/gcc/config/rs6000/rs6000-p8swap.cc b/gcc/config/rs6000/rs6000-p8swap.cc
>>> index 0388b9bd736..3a695aa1318 100644
>>> --- a/gcc/config/rs6000/rs6000-p8swap.cc
>>> +++ b/gcc/config/rs6000/rs6000-p8swap.cc
>>> @@ -179,6 +179,13 @@ class swap_web_entry : public web_entry_base
>>> unsigned int special_handling : 4;
>>> /* Set if the web represented by this entry cannot be optimized. */
>>> unsigned int web_not_optimizable : 1;
>>> + /* Set if the swappable insns in the web represented by this entry
>>> + have to be fixed. Swappable insns have to be fixed in :
>>> + - webs containing permuting loads/stores and the swap insns
>>> + in such webs have been marked for removal
>>> + - webs where non-permuting loads/stores have been converted
>>> + to permuting loads/stores */
>>> + unsigned int web_requires_special_handling : 1;
>>> /* Set if this insn should be deleted. */
>>> unsigned int will_delete : 1;
>>> };
>>> @@ -1468,14 +1475,6 @@ handle_special_swappables (swap_web_entry *insn_entry, unsigned i)
>>> if (dump_file)
>>> fprintf (dump_file, "Adjusting subreg in insn %d\n", i);
>>> break;
>>> - case SH_NOSWAP_LD:
>>> - /* Convert a non-permuting load to a permuting one. */
>>> - permute_load (insn);
>>> - break;
>>> - case SH_NOSWAP_ST:
>>> - /* Convert a non-permuting store to a permuting one. */
>>> - permute_store (insn);
>>> - break;
>>> case SH_EXTRACT:
>>> /* Change the lane on an extract operation. */
>>> adjust_extract (insn);
>>> @@ -2401,6 +2400,25 @@ recombine_lvx_stvx_patterns (function *fun)
>>> free (to_delete);
>>> }
>>>
>>> +/* Return true if insn is a non-permuting load/store. */
>>> +static bool
>>> +non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
>>> +{
>>> + return (insn_entry[i].special_handling == SH_NOSWAP_LD ||
>>> + insn_entry[i].special_handling == SH_NOSWAP_ST);
>>> +}
>>> +
>>> +/* Convert a non-permuting load/store insn to a permuting one. */
>>> +static void
>>> +handle_non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
>>> +{
>>> + rtx_insn *insn = insn_entry[i].insn;
>>> + if (insn_entry[i].special_handling == SH_NOSWAP_LD)
>>> + permute_load (insn);
>>> + else if (insn_entry[i].special_handling == SH_NOSWAP_ST)
>>> + permute_store (insn);
>>> +}
>>> +
>>> /* Main entry point for this pass. */
>>> unsigned int
>>> rs6000_analyze_swaps (function *fun)
>>> @@ -2624,25 +2642,56 @@ rs6000_analyze_swaps (function *fun)
>>> dump_swap_insn_table (insn_entry);
>>> }
>>>
>>> - /* For each load and store in an optimizable web (which implies
>>> - the loads and stores are permuting), find the associated
>>> - register swaps and mark them for removal. Due to various
>>> - optimizations we may mark the same swap more than once. Also
>>> - perform special handling for swappable insns that require it. */
>>> + /* There are two kinds of optimizations that can be performed on an
>>> + optimizable web:
>>> + 1. Remove the register swaps associated with permuting load/store
>>> + in an optimizable web
>>> + 2. Convert the vanilla loads/stores (that have not yet been split
>>> + into a permuting load/store and a swap) into a permuting
>>> + load/store (which effectively removes the swap)
>>> + In both the cases, swappable instructions in the webs need
>>> + special handling to fix them up. */
>>> for (i = 0; i < e; ++i)
>>> + /* For each permuting load/store in an optimizable web, find
>>> + the associated register swaps and mark them for removal.
>>> + Due to various optimizations we may mark the same swap more
>>> + than once. */
>>> if ((insn_entry[i].is_load || insn_entry[i].is_store)
>>> && insn_entry[i].is_swap)
>>> {
>>> swap_web_entry* root_entry
>>> = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
>>> if (!root_entry->web_not_optimizable)
>>> - mark_swaps_for_removal (insn_entry, i);
>>> + {
>>> + mark_swaps_for_removal (insn_entry, i);
>>> + root_entry->web_requires_special_handling = true;
>>> + }
>>> }
>>> - else if (insn_entry[i].is_swappable && insn_entry[i].special_handling)
>>> + /* Convert the non-permuting loads/stores into a permuting
>>> + load/store. */
>>> + else if (insn_entry[i].is_swappable
>>> + && non_permuting_mem_insn (insn_entry, i))
>>> {
>>> swap_web_entry* root_entry
>>> = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
>>> if (!root_entry->web_not_optimizable)
>>> + {
>>> + handle_non_permuting_mem_insn (insn_entry, i);
>>> + root_entry->web_requires_special_handling = true;
>>> + }
>>> + }
>>> +
>>> + /* Perform special handling for swappable insns that require it.
>>> + Note that special handling should be done only for those
>>> + swappable insns that are present in webs marked as requiring
>>> + special handling. */
>>> + for (i = 0; i < e; ++i)
>>> + if (insn_entry[i].is_swappable && insn_entry[i].special_handling &&
>>> + !non_permuting_mem_insn (insn_entry, i))
>>> + {
>>> + swap_web_entry* root_entry
>>> + = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
>>> + if (root_entry->web_requires_special_handling)
>>> handle_special_swappables (insn_entry, i);
>>> }
>>>
>>> diff --git a/gcc/testsuite/gcc.target/powerpc/pr106770.c b/gcc/testsuite/gcc.target/powerpc/pr106770.c
>>> new file mode 100644
>>> index 00000000000..11efe39abc5
>>> --- /dev/null
>>> +++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
>>> @@ -0,0 +1,21 @@
>>> +/* { dg-do compile } */
>>> +/* { dg-require-effective-target powerpc_p8vector_ok } */
>>> +/* { dg-options "-mdejagnu-cpu=power8 -O2 " } */
>>> +/* The 2 xxpermdi instructions are generated by the two
>>> + calls to vec_promote() */
>>> +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
>>> +
>>> +/* Test case to resolve PR106770 */
>>> +
>>> +#include <altivec.h>
>>> +
>>> +int cmp2(double a, double b)
>>> +{
>>> + vector double va = vec_promote(a, 1);
>>> + vector double vb = vec_promote(b, 1);
>>> + vector long long vlt = (vector long long)vec_cmplt(va, vb);
>>> + vector long long vgt = (vector long long)vec_cmplt(vb, va);
>>> + vector signed long long vr = vec_sub(vlt, vgt);
>>> +
>>> + return vec_extract(vr, 1);
>>> +}
>>>
Hi!
Please say "rs6000/p8swap:" in the subject, not "swap:" :-)
On Sun, Sep 10, 2023 at 10:58:32PM +0530, Surya Kumari Jangala wrote:
> Another issue with always handling swappable instructions is that it is
> incorrect to do so in webs where loads/stores on quad word aligned
> addresses are changed to lvx/stvx.
Why? Please say why in the commit message (the message you send with
your patch should be the exact eventual commit message!)
> gcc/
> PR rtl-optimization/PR106770
> * config/rs6000/rs6000-p8swap.cc (non_permuting_mem_insn): New
> function.
Please don't break commit message / changelog lines early unnecessarily.
Lines are 80 chars, the leading tab counts as 8.
> + /* Set if the swappable insns in the web represented by this entry
> + have to be fixed. Swappable insns have to be fixed in :
(no space before colon)
> +static bool
> +non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
> +{
> + return (insn_entry[i].special_handling == SH_NOSWAP_LD ||
> + insn_entry[i].special_handling == SH_NOSWAP_ST);
> +}
"return" is not a function, you don't need parens here.
> +/* Convert a non-permuting load/store insn to a permuting one. */
> +static void
> +handle_non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
A better name would be good, "handle" is a weaselword :-) It is a
static, so a shorter name is completely acceptable (say, one that
wouldn't be acceptable with bigger than file scope).
> + rtx_insn *insn = insn_entry[i].insn;
> + if (insn_entry[i].special_handling == SH_NOSWAP_LD)
> + permute_load (insn);
> + else if (insn_entry[i].special_handling == SH_NOSWAP_ST)
> + permute_store (insn);
Lose the "else"? The compiler can do micro-optimisations a million
times better than any user could. Simpler, more readable (better
understandable!) code is much preferred.
> + /* Perform special handling for swappable insns that require it.
That is a completely contentless sentence :-(
> + Note that special handling should be done only for those
> + swappable insns that are present in webs marked as requiring
> + special handling. */
This one isn't much better.
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
> @@ -0,0 +1,21 @@
> +/* { dg-do compile } */
This is the default, you do not need this.
> +/* { dg-require-effective-target powerpc_p8vector_ok } */
> +/* { dg-options "-mdejagnu-cpu=power8 -O2 " } */
> +/* The 2 xxpermdi instructions are generated by the two
> + calls to vec_promote() */
> +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
Please enclose in {}. Use double quotes in Tcl only when tou want the
interpolation they cause. Default to using {} instead.
So please fix those things, and write a better commit message. Ideally
the commit messsage will tell everything needed to understand the patch
(so also to review the patch). Maybe add examples where needed. So
reviewing the code in the patch should be an easy thing to do, after
reading the commit message :-)
Segher
Hi Segher,
My replies are inlined:
On 29/10/23 10:16 am, Segher Boessenkool wrote:
> Hi!
>
> Please say "rs6000/p8swap:" in the subject, not "swap:" :-)
>
> On Sun, Sep 10, 2023 at 10:58:32PM +0530, Surya Kumari Jangala wrote:
>> Another issue with always handling swappable instructions is that it is
>> incorrect to do so in webs where loads/stores on quad word aligned
>> addresses are changed to lvx/stvx.
>
> Why? Please say why in the commit message (the message you send with
> your patch should be the exact eventual commit message!)
ok, I will add more explanation.
>
>> gcc/
>> PR rtl-optimization/PR106770
>> * config/rs6000/rs6000-p8swap.cc (non_permuting_mem_insn): New
>> function.
>
> Please don't break commit message / changelog lines early unnecessarily.
> Lines are 80 chars, the leading tab counts as 8.
ok.
>
>> + /* Set if the swappable insns in the web represented by this entry
>> + have to be fixed. Swappable insns have to be fixed in :
>
> (no space before colon)
ok.
>
>> +static bool
>> +non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
>> +{
>> + return (insn_entry[i].special_handling == SH_NOSWAP_LD ||
>> + insn_entry[i].special_handling == SH_NOSWAP_ST);
>> +}
>
> "return" is not a function, you don't need parens here.
ok.
>
>> +/* Convert a non-permuting load/store insn to a permuting one. */
>> +static void
>> +handle_non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
>
> A better name would be good, "handle" is a weaselword :-) It is a
> static, so a shorter name is completely acceptable (say, one that
> wouldn't be acceptable with bigger than file scope).
Ok. How does convert_mem_insn() sound?
Note: "handle" is used as a prefix for other functions in rs6000-p8swap.cc (such as handle_special_swappables()).
>
>> + rtx_insn *insn = insn_entry[i].insn;
>> + if (insn_entry[i].special_handling == SH_NOSWAP_LD)
>> + permute_load (insn);
>> + else if (insn_entry[i].special_handling == SH_NOSWAP_ST)
>> + permute_store (insn);
>
> Lose the "else"? The compiler can do micro-optimisations a million
> times better than any user could. Simpler, more readable (better
> understandable!) code is much preferred.
>
>> + /* Perform special handling for swappable insns that require it.
>
> That is a completely contentless sentence :-(
>
This line was present in the original code. This is not something I added.
Let me try to add some more comments to make the explanation better.
>> + Note that special handling should be done only for those
>> + swappable insns that are present in webs marked as requiring
>> + special handling. */
>
> This one isn't much better.>
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.target/powerpc/pr106770.c
>> @@ -0,0 +1,21 @@
>> +/* { dg-do compile } */
>
> This is the default, you do not need this.
ok.
>
>> +/* { dg-require-effective-target powerpc_p8vector_ok } */
>> +/* { dg-options "-mdejagnu-cpu=power8 -O2 " } */
>> +/* The 2 xxpermdi instructions are generated by the two
>> + calls to vec_promote() */
>> +/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
>
> Please enclose in {}. Use double quotes in Tcl only when tou want the
> interpolation they cause. Default to using {} instead.
ok.
Regards,
Surya
>
> So please fix those things, and write a better commit message. Ideally
> the commit messsage will tell everything needed to understand the patch
> (so also to review the patch). Maybe add examples where needed. So
> reviewing the code in the patch should be an easy thing to do, after
> reading the commit message :-)
>
>
> Segher
@@ -179,6 +179,13 @@ class swap_web_entry : public web_entry_base
unsigned int special_handling : 4;
/* Set if the web represented by this entry cannot be optimized. */
unsigned int web_not_optimizable : 1;
+ /* Set if the swappable insns in the web represented by this entry
+ have to be fixed. Swappable insns have to be fixed in :
+ - webs containing permuting loads/stores and the swap insns
+ in such webs have been marked for removal
+ - webs where non-permuting loads/stores have been converted
+ to permuting loads/stores */
+ unsigned int web_requires_special_handling : 1;
/* Set if this insn should be deleted. */
unsigned int will_delete : 1;
};
@@ -1468,14 +1475,6 @@ handle_special_swappables (swap_web_entry *insn_entry, unsigned i)
if (dump_file)
fprintf (dump_file, "Adjusting subreg in insn %d\n", i);
break;
- case SH_NOSWAP_LD:
- /* Convert a non-permuting load to a permuting one. */
- permute_load (insn);
- break;
- case SH_NOSWAP_ST:
- /* Convert a non-permuting store to a permuting one. */
- permute_store (insn);
- break;
case SH_EXTRACT:
/* Change the lane on an extract operation. */
adjust_extract (insn);
@@ -2401,6 +2400,25 @@ recombine_lvx_stvx_patterns (function *fun)
free (to_delete);
}
+/* Return true if insn is a non-permuting load/store. */
+static bool
+non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
+{
+ return (insn_entry[i].special_handling == SH_NOSWAP_LD ||
+ insn_entry[i].special_handling == SH_NOSWAP_ST);
+}
+
+/* Convert a non-permuting load/store insn to a permuting one. */
+static void
+handle_non_permuting_mem_insn (swap_web_entry *insn_entry, unsigned int i)
+{
+ rtx_insn *insn = insn_entry[i].insn;
+ if (insn_entry[i].special_handling == SH_NOSWAP_LD)
+ permute_load (insn);
+ else if (insn_entry[i].special_handling == SH_NOSWAP_ST)
+ permute_store (insn);
+}
+
/* Main entry point for this pass. */
unsigned int
rs6000_analyze_swaps (function *fun)
@@ -2624,25 +2642,56 @@ rs6000_analyze_swaps (function *fun)
dump_swap_insn_table (insn_entry);
}
- /* For each load and store in an optimizable web (which implies
- the loads and stores are permuting), find the associated
- register swaps and mark them for removal. Due to various
- optimizations we may mark the same swap more than once. Also
- perform special handling for swappable insns that require it. */
+ /* There are two kinds of optimizations that can be performed on an
+ optimizable web:
+ 1. Remove the register swaps associated with permuting load/store
+ in an optimizable web
+ 2. Convert the vanilla loads/stores (that have not yet been split
+ into a permuting load/store and a swap) into a permuting
+ load/store (which effectively removes the swap)
+ In both the cases, swappable instructions in the webs need
+ special handling to fix them up. */
for (i = 0; i < e; ++i)
+ /* For each permuting load/store in an optimizable web, find
+ the associated register swaps and mark them for removal.
+ Due to various optimizations we may mark the same swap more
+ than once. */
if ((insn_entry[i].is_load || insn_entry[i].is_store)
&& insn_entry[i].is_swap)
{
swap_web_entry* root_entry
= (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
if (!root_entry->web_not_optimizable)
- mark_swaps_for_removal (insn_entry, i);
+ {
+ mark_swaps_for_removal (insn_entry, i);
+ root_entry->web_requires_special_handling = true;
+ }
}
- else if (insn_entry[i].is_swappable && insn_entry[i].special_handling)
+ /* Convert the non-permuting loads/stores into a permuting
+ load/store. */
+ else if (insn_entry[i].is_swappable
+ && non_permuting_mem_insn (insn_entry, i))
{
swap_web_entry* root_entry
= (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
if (!root_entry->web_not_optimizable)
+ {
+ handle_non_permuting_mem_insn (insn_entry, i);
+ root_entry->web_requires_special_handling = true;
+ }
+ }
+
+ /* Perform special handling for swappable insns that require it.
+ Note that special handling should be done only for those
+ swappable insns that are present in webs marked as requiring
+ special handling. */
+ for (i = 0; i < e; ++i)
+ if (insn_entry[i].is_swappable && insn_entry[i].special_handling &&
+ !non_permuting_mem_insn (insn_entry, i))
+ {
+ swap_web_entry* root_entry
+ = (swap_web_entry*)((&insn_entry[i])->unionfind_root ());
+ if (root_entry->web_requires_special_handling)
handle_special_swappables (insn_entry, i);
}
new file mode 100644
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p8vector_ok } */
+/* { dg-options "-mdejagnu-cpu=power8 -O2 " } */
+/* The 2 xxpermdi instructions are generated by the two
+ calls to vec_promote() */
+/* { dg-final { scan-assembler-times "xxpermdi" 2 } } */
+
+/* Test case to resolve PR106770 */
+
+#include <altivec.h>
+
+int cmp2(double a, double b)
+{
+ vector double va = vec_promote(a, 1);
+ vector double vb = vec_promote(b, 1);
+ vector long long vlt = (vector long long)vec_cmplt(va, vb);
+ vector long long vgt = (vector long long)vec_cmplt(vb, va);
+ vector signed long long vr = vec_sub(vlt, vgt);
+
+ return vec_extract(vr, 1);
+}