[v3] rs6000: Rework ELFv2 support for -fpatchable-function-entry* [PR99888]
Commit Message
Hi,
As PR99888 and its related show, the current support for
-fpatchable-function-entry on powerpc ELFv2 doesn't work
well with global entry existence. For example, with one
command line option -fpatchable-function-entry=3,2, it got
below w/o this patch:
.LPFE1:
nop
nop
.type foo, @function
foo:
nop
.LFB0:
.cfi_startproc
.LCF0:
0: addis 2,12,.TOC.-.LCF0@ha
addi 2,2,.TOC.-.LCF0@l
.localentry foo,.-foo
, the assembly is unexpected since the patched nops have
no effects when being entered from local entry.
This patch is to update the nops patched before and after
local entry, it looks like:
.type foo, @function
foo:
.LFB0:
.cfi_startproc
.LCF0:
0: addis 2,12,.TOC.-.LCF0@ha
addi 2,2,.TOC.-.LCF0@l
nop
nop
.localentry foo,.-foo
nop
Bootstrapped and regtested on powerpc64-linux-gnu P7 & P8,
and powerpc64le-linux-gnu P9 & P10.
v3: Modify NOP(s) to nop(s), update test case patchable_function_entry-default.c
with extra comments, pr99888-{1,2,3} by removing required targets as
Segher's review comments.
v2: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599617.html
v1: https://gcc.gnu.org/pipermail/gcc-patches/2022-August/599461.html
Is it ok for trunk?
BR,
Kewen
-----
PR target/99888
PR target/105649
gcc/ChangeLog:
* config/rs6000/rs6000-internal.h
(rs6000_print_patchable_function_entry): New function declaration.
* config/rs6000/rs6000-logue.cc (rs6000_output_function_prologue):
Support patchable-function-entry by emitting nops before and after
local entry for the function that needs global entry.
* config/rs6000/rs6000.cc (rs6000_print_patchable_function_entry): Skip
the function that needs global entry till global entry has been
emitted.
* config/rs6000/rs6000.h (struct machine_function): New bool member
global_entry_emitted.
gcc/testsuite/ChangeLog:
* gcc.target/powerpc/pr99888-1.c: New test.
* gcc.target/powerpc/pr99888-2.c: New test.
* gcc.target/powerpc/pr99888-3.c: New test.
* gcc.target/powerpc/pr99888-4.c: New test.
* gcc.target/powerpc/pr99888-5.c: New test.
* gcc.target/powerpc/pr99888-6.c: New test.
* c-c++-common/patchable_function_entry-default.c: Adjust for
powerpc_elfv2 to avoid compilation error.
---
gcc/config/rs6000/rs6000-internal.h | 5 +++
gcc/config/rs6000/rs6000-logue.cc | 32 ++++++++++++++
gcc/config/rs6000/rs6000.cc | 10 ++++-
gcc/config/rs6000/rs6000.h | 4 ++
.../patchable_function_entry-default.c | 3 ++
gcc/testsuite/gcc.target/powerpc/pr99888-1.c | 43 +++++++++++++++++++
gcc/testsuite/gcc.target/powerpc/pr99888-2.c | 43 +++++++++++++++++++
gcc/testsuite/gcc.target/powerpc/pr99888-3.c | 11 +++++
gcc/testsuite/gcc.target/powerpc/pr99888-4.c | 13 ++++++
gcc/testsuite/gcc.target/powerpc/pr99888-5.c | 13 ++++++
gcc/testsuite/gcc.target/powerpc/pr99888-6.c | 14 ++++++
11 files changed, 189 insertions(+), 2 deletions(-)
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-1.c
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-2.c
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-3.c
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-4.c
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-5.c
create mode 100644 gcc/testsuite/gcc.target/powerpc/pr99888-6.c
--
2.27.0
Comments
Hi!
On Thu, Aug 18, 2022 at 10:12:48AM +0800, Kewen.Lin wrote:
> As PR99888 and its related show, the current support for
> -fpatchable-function-entry on powerpc ELFv2 doesn't work
> well with global entry existence.
> + /* Emit the NOPs after local entry. */
Please do not say "NOPs". It is not an acronym. I know some of our
documentation has this bug already, but please do not spread it further.
The machine instruction is "nop", lowercase.
Please fix this.
So, this patch overloads the meaning of the two parameters here to have
more meaning than explained in the documentation for the option. There
isn't much that can be done about this, so adding some new option would
only be extra work for everyone. But, could you add a line or two in
the documentation? "For PowerPC with the ELFv2 ABI, there will be M
nops before the local entry point, and N-M after", something like that?
Segher
Hi Segher,
Thanks for the review!
on 2022/8/19 01:34, Segher Boessenkool wrote:
> Hi!
>
> On Thu, Aug 18, 2022 at 10:12:48AM +0800, Kewen.Lin wrote:
>> As PR99888 and its related show, the current support for
>> -fpatchable-function-entry on powerpc ELFv2 doesn't work
>> well with global entry existence.
>
>> + /* Emit the NOPs after local entry. */
>
> Please do not say "NOPs". It is not an acronym. I know some of our
> documentation has this bug already, but please do not spread it further.
>
> The machine instruction is "nop", lowercase.
>
> Please fix this.
Whoops, I thought it's only used in commit log mistakenly, will fix!
>
> So, this patch overloads the meaning of the two parameters here to have
> more meaning than explained in the documentation for the option. There
> isn't much that can be done about this, so adding some new option would
> only be extra work for everyone. But, could you add a line or two in
> the documentation? "For PowerPC with the ELFv2 ABI, there will be M
> nops before the local entry point, and N-M after", something like that?
>
Since you proposed to update the documentation, I'm thinking if we can
reconsider Fangrui's proposal in the PR which Alan seconded: Put preceding
nops before GEP and succeeding nops after LEP. Previously I had the concern
that the nops inserted doesn't respect to a same function entry, it looks
inconsistent to the documentation, and you also noted that "The nops have
to be consecutive". If we want to update the documentation, could we reword
it for PowerPC ELFv2 ABI?
What's your opinion?
BR,
Kewen
On Fri, Aug 19, 2022 at 10:40:10AM +0800, Kewen.Lin wrote:
> Since you proposed to update the documentation, I'm thinking if we can
> reconsider Fangrui's proposal in the PR which Alan seconded: Put preceding
> nops before GEP and succeeding nops after LEP. Previously I had the concern
> that the nops inserted doesn't respect to a same function entry, it looks
> inconsistent to the documentation, and you also noted that "The nops have
> to be consecutive". If we want to update the documentation, could we reword
> it for PowerPC ELFv2 ABI?
>
> What's your opinion?
I'm not sure what the question is, sorry.
If you want different semantics for ELFv2 (which might well be useful),
we need some new command line option for that.
I suggested here to just describe in the existing doc what is done for
global and local entry points on ELFv2.
Segher
Hi Segher,
on 2022/8/23 22:33, Segher Boessenkool wrote:
> On Fri, Aug 19, 2022 at 10:40:10AM +0800, Kewen.Lin wrote:
>> Since you proposed to update the documentation, I'm thinking if we can
>> reconsider Fangrui's proposal in the PR which Alan seconded: Put preceding
>> nops before GEP and succeeding nops after LEP. Previously I had the concern
>> that the nops inserted doesn't respect to a same function entry, it looks
>> inconsistent to the documentation, and you also noted that "The nops have
>> to be consecutive". If we want to update the documentation, could we reword
>> it for PowerPC ELFv2 ABI?
>>
>> What's your opinion?
>
> I'm not sure what the question is, sorry.
>
Sorry for confusion. The question is that if we can consider the proposal
in [1], by noting the particularity on ppc64le in documentation.
btw, I did some searching on why the feature supports preceding nops, and
commented it in [2].
> If you want different semantics for ELFv2 (which might well be useful),
> we need some new command line option for that.
>
Not sure if we really needs one new command line option, dual entries on
ppc64le is special comparing with a normal unique function entry, couldn't
it be a special case in the documentation?
> I suggested here to just describe in the existing doc what is done for
> global and local entry points on ELFv2.
>
Yeah, you have suggested nice wordings "For PowerPC with the ELFv2 ABI,
there will be M nops before the local entry point, and N-M after".
I thought if we can consider [1] and updated the documentation similarly
like "For PowerPC with the ELFv2 ABI, there will be M nops before the global
entry point, and N-M after the local entry point".
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888#c5
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=99888#c10
BR,
Kewen
On Wed, Aug 24, 2022 at 03:30:51PM +0800, Kewen.Lin wrote:
> on 2022/8/23 22:33, Segher Boessenkool wrote:
> I thought if we can consider [1] and updated the documentation similarly
> like "For PowerPC with the ELFv2 ABI, there will be M nops before the global
> entry point, and N-M after the local entry point".
But that does not agree with the documentation. The N nops have to be
consecutive. If you want to support adding separate nop regions before
the LEP and GEP entry points, that is fine, but it will need a separate
command line option.
Segher
on 2022/8/24 22:01, Segher Boessenkool wrote:
> On Wed, Aug 24, 2022 at 03:30:51PM +0800, Kewen.Lin wrote:
>> on 2022/8/23 22:33, Segher Boessenkool wrote:
>> I thought if we can consider [1] and updated the documentation similarly
>> like "For PowerPC with the ELFv2 ABI, there will be M nops before the global
>> entry point, and N-M after the local entry point".
>
> But that does not agree with the documentation. The N nops have to be
> consecutive. If you want to support adding separate nop regions before
> the LEP and GEP entry points, that is fine, but it will need a separate
> command line option.
>
OK, previously I thought if we can claim GEP and LEP (and the area between) as
one special function entry (area) in documentation, but admittedly it's too tricky.
Adding one separated command line option now just for one potential use case in
future seems not a good idea. Following the previous proposal, I just posted
v4 at:
https://gcc.gnu.org/pipermail/gcc-patches/2022-August/600277.html
BR,
Kewen
@@ -182,10 +182,15 @@ extern tree rs6000_fold_builtin (tree fndecl ATTRIBUTE_UNUSED,
tree *args ATTRIBUTE_UNUSED,
bool ignore ATTRIBUTE_UNUSED);
+extern void rs6000_print_patchable_function_entry (FILE *,
+ unsigned HOST_WIDE_INT,
+ bool);
+
extern bool rs6000_passes_float;
extern bool rs6000_passes_long_double;
extern bool rs6000_passes_vector;
extern bool rs6000_returns_struct;
extern bool cpu_builtin_p;
+
#endif
@@ -4013,11 +4013,43 @@ rs6000_output_function_prologue (FILE *file)
fprintf (file, "\tadd 2,2,12\n");
}
+ unsigned short patch_area_size = crtl->patch_area_size;
+ unsigned short patch_area_entry = crtl->patch_area_entry;
+ /* Need to emit the patching area. */
+ if (patch_area_size > 0)
+ {
+ cfun->machine->global_entry_emitted = true;
+ /* As ELFv2 ABI shows, the allowable bytes between the global
+ and local entry points are 0, 4, 8, 16, 32 and 64 when
+ there is a local entry point. Considering there are two
+ non-prefixed instructions for global entry point prologue
+ (8 bytes), the count for patchable NOPs before local entry
+ point would be 2, 6 and 14. It's possible to support those
+ other counts of NOPs by not making a local entry point, but
+ we don't have clear user cases for them, so leave them
+ unsupported for now. */
+ if (patch_area_entry > 0)
+ {
+ if (patch_area_entry != 2
+ && patch_area_entry != 6
+ && patch_area_entry != 14)
+ error ("unsupported number of nops before function entry (%u)",
+ patch_area_entry);
+ rs6000_print_patchable_function_entry (file, patch_area_entry,
+ true);
+ patch_area_size -= patch_area_entry;
+ }
+ }
+
fputs ("\t.localentry\t", file);
assemble_name (file, name);
fputs (",.-", file);
assemble_name (file, name);
fputs ("\n", file);
+ /* Emit the NOPs after local entry. */
+ if (patch_area_size > 0)
+ rs6000_print_patchable_function_entry (file, patch_area_size,
+ patch_area_entry == 0);
}
else if (rs6000_pcrel_p ())
@@ -14898,8 +14898,14 @@ rs6000_print_patchable_function_entry (FILE *file,
if (!(TARGET_64BIT && DEFAULT_ABI != ABI_ELFv2)
&& HAVE_GAS_SECTION_LINK_ORDER)
flags |= SECTION_LINK_ORDER;
- default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
- flags);
+ bool global_entry_needed_p = rs6000_global_entry_point_prologue_needed_p ();
+ /* For a function which needs global entry point, we will emit the
+ patchable area before and after local entry point under the control of
+ cfun->machine->global_entry_emitted, see the handling in function
+ rs6000_output_function_prologue. */
+ if (!global_entry_needed_p || cfun->machine->global_entry_emitted)
+ default_print_patchable_function_entry_1 (file, patch_area_size, record_p,
+ flags);
}
enum rtx_code
@@ -2439,6 +2439,10 @@ typedef struct GTY(()) machine_function
bool lr_is_wrapped_separately;
bool toc_is_wrapped_separately;
bool mma_return_type_error;
+ /* Indicate global entry is emitted, only useful when the function requires
+ global entry. It helps to control the patchable area before and after
+ local entry. */
+ bool global_entry_emitted;
} machine_function;
#endif
@@ -1,6 +1,9 @@
/* { dg-do compile { target { ! { nvptx*-*-* visium-*-* } } } } */
/* { dg-options "-O2 -fpatchable-function-entry=3,1" } */
/* { dg-additional-options "-fno-pie" { target sparc*-*-* } } */
+/* See PR99888, one single preceding nop isn't allowed on powerpc_elfv2,
+ so overriding with two preceding nops to make it pass there. */
+/* { dg-additional-options "-fpatchable-function-entry=3,2" { target powerpc_elfv2 } } */
/* { dg-final { scan-assembler-times "nop|NOP|SWYM" 3 { target { ! { alpha*-*-* } } } } } */
/* { dg-final { scan-assembler-times "bis" 3 { target alpha*-*-* } } } */
new file mode 100644
@@ -0,0 +1,43 @@
+/* Verify no errors for different nops after local entry on ELFv2. */
+
+extern int a;
+
+__attribute__ ((noipa, patchable_function_entry (1, 0)))
+int test1 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (2, 0)))
+int test2 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (3, 0)))
+int test3 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (4, 0)))
+int test4 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (5, 0)))
+int test5 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (6, 0)))
+int test6 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (7, 0)))
+int test7 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (8, 0)))
+int test8 (int b) {
+ return a + b;
+}
new file mode 100644
@@ -0,0 +1,43 @@
+/* Verify no errors for 2, 6 and 14 nops before local entry on ELFv2. */
+
+extern int a;
+
+__attribute__ ((noipa, patchable_function_entry (2, 2)))
+int test1 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (4, 2)))
+int test2 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (6, 6)))
+int test3 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (8, 6)))
+int test4 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (16, 6)))
+int test5 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (14, 14)))
+int test6 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (28, 14)))
+int test7 (int b) {
+ return a + b;
+}
+
+__attribute__ ((noipa, patchable_function_entry (64, 14)))
+int test8 (int b) {
+ return a + b;
+}
new file mode 100644
@@ -0,0 +1,11 @@
+/* { dg-options "-fpatchable-function-entry=1" } */
+
+/* Verify no errors on ELFv2, using command line option instead of
+ function attribute. */
+
+extern int a;
+
+int test (int b) {
+ return a + b;
+}
+
new file mode 100644
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* There is no global entry point prologue with pcrel. */
+/* { dg-options "-mno-pcrel -fpatchable-function-entry=1,1" } */
+
+/* Verify one error emitted for unexpected 1 NOP before local
+ entry. */
+
+extern int a;
+
+int test (int b) {
+ return a + b;
+}
+/* { dg-error "unsupported number of nops before function entry \\(1\\)" "" { target *-*-* } .-1 } */
new file mode 100644
@@ -0,0 +1,13 @@
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* There is no global entry point prologue with pcrel. */
+/* { dg-options "-mno-pcrel -fpatchable-function-entry=7,3" } */
+
+/* Verify one error emitted for unexpected 3 NOP before local
+ entry. */
+
+extern int a;
+
+int test (int b) {
+ return a + b;
+}
+/* { dg-error "unsupported number of nops before function entry \\(3\\)" "" { target *-*-* } .-1 } */
new file mode 100644
@@ -0,0 +1,14 @@
+/* { dg-require-effective-target powerpc_elfv2 } */
+/* There is no global entry point prologue with pcrel. */
+/* { dg-options "-mno-pcrel" } */
+
+/* Verify one error emitted for unexpected 4 NOP before local
+ entry. */
+
+extern int a;
+
+__attribute__ ((patchable_function_entry (20, 4)))
+int test (int b) {
+ return a + b;
+}
+/* { dg-error "unsupported number of nops before function entry \\(4\\)" "" { target *-*-* } .-1 } */