[v2,2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.
Checks
Commit Message
Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent so that the
linker can infer the PC of pcalau12i to apply relocations to lu32i.d and lu52i.d.
Otherwise, the results would be incorrect if these four instructions are not in
the same 4KiB page.
See the link for details:
https://github.com/loongson/la-abi-specs/blob/release/laelf.adoc#extreme-code-model.
gcc/ChangeLog:
* config/loongarch/loongarch.cc (loongarch_symbol_extreme_p): Add
function declaration.
(loongarch_explicit_relocs_p): Use the macro instruction to get
the symbol address when loongarch_symbol_extreme_p returns true.
gcc/testsuite/ChangeLog:
* gcc.target/loongarch/attr-model-1.c: Modify the content of the search
string in the test case.
* gcc.target/loongarch/attr-model-2.c: Likewise.
* gcc.target/loongarch/attr-model-3.c: Likewise.
* gcc.target/loongarch/attr-model-4.c: Likewise.
* gcc.target/loongarch/func-call-extreme-1.c: Likewise.
* gcc.target/loongarch/func-call-extreme-2.c: Likewise.
* gcc.target/loongarch/func-call-extreme-3.c: Likewise.
* gcc.target/loongarch/func-call-extreme-4.c: Likewise.
---
gcc/config/loongarch/loongarch.cc | 11 +++++++++++
gcc/testsuite/gcc.target/loongarch/attr-model-1.c | 2 +-
gcc/testsuite/gcc.target/loongarch/attr-model-2.c | 2 +-
gcc/testsuite/gcc.target/loongarch/attr-model-3.c | 2 +-
gcc/testsuite/gcc.target/loongarch/attr-model-4.c | 2 +-
.../gcc.target/loongarch/func-call-extreme-1.c | 6 +++---
.../gcc.target/loongarch/func-call-extreme-2.c | 6 +++---
.../gcc.target/loongarch/func-call-extreme-3.c | 6 +++---
.../gcc.target/loongarch/func-call-extreme-4.c | 6 +++---
9 files changed, 27 insertions(+), 16 deletions(-)
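To make the constraint in the commit message concrete, here is a minimal C sketch of how the four immediates depend on the PC of pcalau12i. It is an illustration only: the split below is a simplified, self-consistent model derived from the instruction semantics, not the relocation formulas from the la-abi-specs document, and every `toy_` name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Immediate fields carried by the four instructions (hypothetical struct).  */
struct toy_fields
{
  uint32_t hi20; /* pcalau12i */
  uint32_t lo12; /* addi.d    */
  uint32_t lo20; /* lu32i.d   */
  uint32_t hi12; /* lu52i.d   */
};

/* Sign-extend the low BITS bits of V (BITS < 64).  */
static uint64_t
toy_sext (uint64_t v, int bits)
{
  uint64_t sign = 1ULL << (bits - 1);
  v &= (1ULL << bits) - 1;
  return (v ^ sign) - sign;
}

/* Split SYM into the four fields, relative to the page of PC.
   Simplified model, not the official relocation arithmetic.  */
static struct toy_fields
toy_split (uint64_t sym, uint64_t pc)
{
  uint64_t pc_page = pc & ~0xfffULL;
  /* Low 32 bits that addi.d leaves in the second register.  */
  uint64_t lo32 = toy_sext (sym, 12) & 0xffffffffULL;
  uint64_t delta = sym - pc_page - lo32;
  /* Upper word, corrected for the sign extension done by pcalau12i.  */
  uint64_t hi32 = (delta - toy_sext (delta, 32)) >> 32;
  struct toy_fields f;
  f.hi20 = (uint32_t) ((delta >> 12) & 0xfffff);
  f.lo12 = (uint32_t) (sym & 0xfff);
  f.lo20 = (uint32_t) (hi32 & 0xfffff);
  f.hi12 = (uint32_t) ((hi32 >> 20) & 0xfff);
  return f;
}

/* Emulate the four instructions plus the final add, with pcalau12i at PC.  */
static uint64_t
toy_run (struct toy_fields f, uint64_t pc)
{
  uint64_t t0 = (pc & ~0xfffULL) + toy_sext ((uint64_t) f.hi20 << 12, 32);
  uint64_t t1 = toy_sext (f.lo12, 12);                         /* addi.d  */
  t1 = (toy_sext (f.lo20, 20) << 32) | (t1 & 0xffffffffULL);   /* lu32i.d */
  t1 = ((uint64_t) f.hi12 << 52) | (t1 & 0xfffffffffffffULL);  /* lu52i.d */
  return t0 + t1;
}
```

In this model, fields computed against one page reproduce the symbol address only when pcalau12i executes in that same page. Taking hi20 and lo12 from a split for one page and lo20 and hi12 from a split for the neighbouring page can be off by a multiple of 2^32, which is the failure mode the commit message is guarding against.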
Comments
On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> bool
> loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> {
> + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
> + so that the linker can infer the PC of pcalau12i to apply relocations
> + to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
> + these four instructions are not in the same 4KiB page.
> + Therefore, macro instructions are used when cmodel=extreme. */
> + if (loongarch_symbol_extreme_p (type))
> + return false;
I think this is a bit strange. With -mexplicit-relocs={auto,always}
we should still use explicit relocs, but emit all 4 instructions
together as
"pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
Give me several hours to try implementing this...
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
>> bool
>> loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
>> {
>> + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
>> + so that the linker can infer the PC of pcalau12i to apply relocations
>> + to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
>> + these four instructions are not in the same 4KiB page.
>> + Therefore, macro instructions are used when cmodel=extreme. */
>> + if (loongarch_symbol_extreme_p (type))
>> + return false;
> I think this is a bit of strange. With -mexplicit-relocs={auto,always}
> we should still use explicit relocs, but coding all 4 instructions
> altogether as
>
> "pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
>
> Give me several hours trying to implement this...
>
Do you mean taking the final add instruction out and emitting it separately?
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
>> bool
>> loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
>> {
>> + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
>> + so that the linker can infer the PC of pcalau12i to apply relocations
>> + to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
>> + these four instructions are not in the same 4KiB page.
>> + Therefore, macro instructions are used when cmodel=extreme. */
>> + if (loongarch_symbol_extreme_p (type))
>> + return false;
> I think this is a bit of strange. With -mexplicit-relocs={auto,always}
> we should still use explicit relocs, but coding all 4 instructions
> altogether as
>
> "pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
>
> Give me several hours trying to implement this...
>
I think there is no difference between the macros and these instructions
emitted together. If we implement it in a split form, I think I can try it
through TARGET_SCHED_MACRO_FUSION_PAIR_P.
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
>
> On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > bool
> > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > {
> > > + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
> > > + so that the linker can infer the PC of pcalau12i to apply relocations
> > > + to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
> > > + these four instructions are not in the same 4KiB page.
> > > + Therefore, macro instructions are used when cmodel=extreme. */
> > > + if (loongarch_symbol_extreme_p (type))
> > > + return false;
> > I think this is a bit of strange. With -mexplicit-relocs={auto,always}
> > we should still use explicit relocs, but coding all 4 instructions
> > altogether as
> >
> > "pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> >
> > Give me several hours trying to implement this...
> >
> I think there is no difference between macros and these instructions put
> together. If implement it in a split form, I think I can try it through
> TARGET_SCHED_MACRO_FUSION_PAIR_P
There is a difference:
int x;
int t() { return x; }
pcalau12i t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
ldx.w a0, t0, t1
is slightly better than
pcalau12i t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
add.d t0, t0, t1
ld.w a0, t0, 0
And generating macros when -mexplicit-relocs=always can puzzle people
(it says "always" :-\ ).
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> >
> > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > bool
> > > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > > {
> > > > + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
> > > > + so that the linker can infer the PC of pcalau12i to apply relocations
> > > > + to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
> > > > + these four instructions are not in the same 4KiB page.
> > > > + Therefore, macro instructions are used when cmodel=extreme. */
> > > > + if (loongarch_symbol_extreme_p (type))
> > > > + return false;
> > > I think this is a bit of strange. With -mexplicit-relocs={auto,always}
> > > we should still use explicit relocs, but coding all 4 instructions
> > > altogether as
> > >
> > > "pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > >
> > > Give me several hours trying to implement this...
> > >
> > I think there is no difference between macros and these instructions put
> > together. If implement it in a split form, I think I can try it through
> > TARGET_SCHED_MACRO_FUSION_PAIR_P
We don't need to split the insn. We can just add a "large insn"
containing the assembly output we want.
See the attached patch. Note that TLS LE/LD/GD needs a fix too because
they are basically a variation of GOT addressing.
I've run some small tests and am now trying to bootstrap GCC with
-mcmodel=extreme in BOOT_CFLAGS...
>
> There is a difference:
>
> int x;
> int t() { return x; }
>
> pcalau12i t0, %pc_hi20(x)
> addi.d t1, r0, %pc_lo12(x)
> lu32i.d t1, %pc64_lo20(x)
> lu52i.d t1, t1, %pc64_hi12(x)
> ldx.w a0, t0, t1
>
> is slightly better than
>
> pcalau12i t0, %pc_hi20(x)
> addi.d t1, r0, %pc_lo12(x)
> lu32i.d t1, %pc64_lo20(x)
> lu52i.d t1, t1, %pc64_hi12(x)
> add.d t0, t0, t1
> ld.w a0, t0, 0
>
> And generating macros when -mexplicit-relocs=always can puzzle people
> (it says "always" :-\ ).
>
On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
>> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
>>> On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
>>>> On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
>>>>> bool
>>>>> loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
>>>>> {
>>>>> + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
>>>>> + so that the linker can infer the PC of pcalau12i to apply relocations
>>>>> + to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
>>>>> + these four instructions are not in the same 4KiB page.
>>>>> + Therefore, macro instructions are used when cmodel=extreme. */
>>>>> + if (loongarch_symbol_extreme_p (type))
>>>>> + return false;
>>>> I think this is a bit of strange. With -mexplicit-relocs={auto,always}
>>>> we should still use explicit relocs, but coding all 4 instructions
>>>> altogether as
>>>>
>>>> "pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
>>>>
>>>> Give me several hours trying to implement this...
>>>>
>>> I think there is no difference between macros and these instructions put
>>> together. If implement it in a split form, I think I can try it through
>>> TARGET_SCHED_MACRO_FUSION_PAIR_P
> We don't need to split the insn. We can just add a "large insn"
> containing the assembly output we want.
>
> See the attached patch. Note that TLS LE/LD/GD needs a fix too because
> they are basically an variation of GOT addressing.
>
> I've ran some small tests and now trying to bootstrap GCC with -
> mcmodel=extreme in BOOT_CFLAGS...
>
>> There is a difference:
>>
>> int x;
>> int t() { return x; }
>>
>> pcalau12i t0, %pc_hi20(x)
>> addi.d t1, r0, %pc_lo12(x)
>> lu32i.d t1, %pc64_lo20(x)
>> lu52i.d t1, t1, %pc64_hi12(x)
>> ldx.w a0, t0, t1
>>
>> is slightly better than
>>
>> pcalau12i t0, %pc_hi20(x)
>> addi.d t1, r0, %pc_lo12(x)
>> lu32i.d t1, %pc64_lo20(x)
>> lu52i.d t1, t1, %pc64_hi12(x)
>> add.d t0, t0, t1
>> ld.w a0, t0, 0
>>
>> And generating macros when -mexplicit-relocs=always can puzzle people
>> (it says "always" :-\ ).
>>
Thumbs up! This method is much better than mine, and I learned
something. Thank you!
But I still have to test its correctness.
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote:
>
> On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > > > bool
> > > > > > loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > > > > {
> > > > > > + /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
> > > > > > + so that the linker can infer the PC of pcalau12i to apply relocations
> > > > > > + to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
> > > > > > + these four instructions are not in the same 4KiB page.
> > > > > > + Therefore, macro instructions are used when cmodel=extreme. */
> > > > > > + if (loongarch_symbol_extreme_p (type))
> > > > > > + return false;
> > > > > I think this is a bit of strange. With -mexplicit-relocs={auto,always}
> > > > > we should still use explicit relocs, but coding all 4 instructions
> > > > > altogether as
> > > > >
> > > > > "pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > > > >
> > > > > Give me several hours trying to implement this...
> > > > >
> > > > I think there is no difference between macros and these instructions put
> > > > together. If implement it in a split form, I think I can try it through
> > > > TARGET_SCHED_MACRO_FUSION_PAIR_P
> > We don't need to split the insn. We can just add a "large insn"
> > containing the assembly output we want.
> >
> > See the attached patch. Note that TLS LE/LD/GD needs a fix too because
> > they are basically an variation of GOT addressing.
> >
> > I've ran some small tests and now trying to bootstrap GCC with -
> > mcmodel=extreme in BOOT_CFLAGS...
> >
> > > There is a difference:
> > >
> > > int x;
> > > int t() { return x; }
> > >
> > > pcalau12i t0, %pc_hi20(x)
> > > addi.d t1, r0, %pc_lo12(x)
> > > lu32i.d t1, %pc64_lo20(x)
> > > lu52i.d t1, t1, %pc64_hi12(x)
> > > ldx.w a0, t0, t1
> > >
> > > is slightly better than
> > >
> > > pcalau12i t0, %pc_hi20(x)
> > > addi.d t1, r0, %pc_lo12(x)
> > > lu32i.d t1, %pc64_lo20(x)
> > > lu52i.d t1, t1, %pc64_hi12(x)
> > > add.d t0, t0, t1
> > > ld.w a0, t0, 0
> > >
> > > And generating macros when -mexplicit-relocs=always can puzzle people
> > > (it says "always" :-\ ).
> > >
> Thumbs up! This method is much better than my method, I learned
> something! grateful!
> But I still have to test the accuracy.
I found an issue when bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code that
UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or we'll
see millions of lines of messages like
../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 4f89c4af323..410e1b5e693 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10868,6 +10868,24 @@ loongarch_asm_code_end (void)
#undef DUMP_FEATURE
}
+static rtx loongarch_delegitimize_address (rtx op)
+{
+ if (GET_CODE (op) == UNSPEC)
+ {
+ int unspec = XINT (op, 1);
+ switch (unspec)
+ {
+ case UNSPEC_LA_PCREL_64_PART1:
+ case UNSPEC_LA_PCREL_64_PART2:
+ return XVECEXP (op, 0, 0);
+ default:
+ return op;
+ }
+ }
+
+ return op;
+}
+
/* Initialize the GCC target structure. */
#undef TARGET_ASM_ALIGNED_HI_OP
#define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -11129,6 +11147,10 @@ loongarch_asm_code_end (void)
#define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
loongarch_builtin_support_vector_misalignment
+#undef TARGET_DELEGITIMIZE_ADDRESS
+#define TARGET_DELEGITIMIZE_ADDRESS \
+ loongarch_delegitimize_address
+
struct gcc_target targetm = TARGET_INITIALIZER;
#include "gt-loongarch.h"
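For readers unfamiliar with the hook, the unwrapping idea in the patch above can be shown with a self-contained toy in C. This is only a sketch: `toy_rtx`, the `TOY_` constants and `toy_delegitimize_address` are hypothetical stand-ins for GCC's rtx and UNSPEC numbers, and the real hook operates on GCC's own data structures.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for rtx codes and UNSPEC numbers.  */
enum toy_code { TOY_SYMBOL_REF, TOY_UNSPEC };
enum toy_unspec { TOY_PCREL_64_PART1 = 1, TOY_PCREL_64_PART2 = 2, TOY_OTHER = 3 };

struct toy_rtx
{
  enum toy_code code;
  int unspec;                /* meaningful only for TOY_UNSPEC */
  const struct toy_rtx *op0; /* first wrapped operand, if any */
  const char *name;          /* symbol name for TOY_SYMBOL_REF */
};

/* Return the wrapped symbol for the two PC-relative wrappers and
   leave every other expression untouched.  */
static const struct toy_rtx *
toy_delegitimize_address (const struct toy_rtx *op)
{
  if (op->code == TOY_UNSPEC)
    switch (op->unspec)
      {
      case TOY_PCREL_64_PART1:
      case TOY_PCREL_64_PART2:
        return op->op0;
      default:
        break;
      }
  return op;
}
```

The toy mirrors the shape of the switch in the patch: only the two wrapper UNSPECs are peeled off, so any other UNSPEC is still reported as it was before.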
> I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> we need a target hook to tell the generic code
> UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> see millions lines of messages like
>
> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>
I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
reproduced the problem you mentioned.
$ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
--with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
--disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
$ make BOOT_FLAGS="-mcmodel=extreme"
What did I do wrong? :-(
On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > we need a target hook to tell the generic code
> > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> > see millions lines of messages like
> >
> > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>
> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
>
> $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
> --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
> $ make BOOT_FLAGS="-mcmodel=extreme"
>
> What did I do wrong?:-(
BOOT_CFLAGS, not BOOT_FLAGS :).
On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
>
>>> I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
>>> we need a target hook to tell the generic code
>>> UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
>>> see millions lines of messages like
>>>
>>> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
>>> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
>>
>> $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>> --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
>> --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
>> $ make BOOT_FLAGS="-mcmodel=extreme"
>>
>> What did I do wrong?:-(
> BOOT_CFLAGS, not BOOT_FLAGS :).
>
This is so strange. My build stopped here because of syntax problems,
and I still haven't reproduced the messages you mentioned about
UNSPEC_LA_PCREL_64_PART1.
On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
>
> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> >
> > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > > > we need a target hook to tell the generic code
> > > > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> > > > see millions lines of messages like
> > > >
> > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
> > >
> > > $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > > --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
> > > --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
> > > $ make BOOT_FLAGS="-mcmodel=extreme"
> > >
> > > What did I do wrong?:-(
> > BOOT_CFLAGS, not BOOT_FLAGS :).
> >
> This is so strange. My compilation here stopped due to syntax problems,
>
> and I still haven't reproduced the information you mentioned about
> UNSPEC_LA_PCREL_64_PART1.
I used:
../gcc/configure --with-system-zlib --disable-fixincludes \
--enable-default-ssp --enable-default-pie \
--disable-werror --disable-multilib \
--prefix=/home/xry111/gcc-dev
and then
make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" |& tee gcc-build.log
I guess "-g" is needed to reproduce the issue as well, since the messages
were produced during DWARF generation.
On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
>> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
>>> On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
>>>
>>>>> I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
>>>>> we need a target hook to tell the generic code
>>>>> UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
>>>>> see millions lines of messages like
>>>>>
>>>>> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
>>>>> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>>>> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
>>>>
>>>> $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>>>> --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
>>>> --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
>>>> $ make BOOT_FLAGS="-mcmodel=extreme"
>>>>
>>>> What did I do wrong?:-(
>>> BOOT_CFLAGS, not BOOT_FLAGS :).
>>>
>> This is so strange. My compilation here stopped due to syntax problems,
>>
>> and I still haven't reproduced the information you mentioned about
>> UNSPEC_LA_PCREL_64_PART1.
> I used:
>
> ../gcc/configure --with-system-zlib --disable-fixincludes \
> --enable-default-ssp --enable-default-pie \
> --disable-werror --disable-multilib \
> --prefix=/home/xry111/gcc-dev
>
> and then
>
> make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
> BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log
>
> I guess "-g" is needed to reproduce the issue as well as the messages
> were produced in dwarf generation.
Oh, okay, I'll try this method!:-)
>
On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
>
> On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> > On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > > > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > >
> > > > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > > > > > we need a target hook to tell the generic code
> > > > > > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> > > > > > see millions lines of messages like
> > > > > >
> > > > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > > > I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
> > > > >
> > > > > $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > > > > --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
> > > > > --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
> > > > > $ make BOOT_FLAGS="-mcmodel=extreme"
> > > > >
> > > > > What did I do wrong?:-(
> > > > BOOT_CFLAGS, not BOOT_FLAGS :).
> > > >
> > > This is so strange. My compilation here stopped due to syntax problems,
> > >
> > > and I still haven't reproduced the information you mentioned about
> > > UNSPEC_LA_PCREL_64_PART1.
> > I used:
> >
> > ../gcc/configure --with-system-zlib --disable-fixincludes \
> > --enable-default-ssp --enable-default-pie \
> > --disable-werror --disable-multilib \
> > --prefix=/home/xry111/gcc-dev
> >
> > and then
> >
> > make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
> > BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log
> >
> > I guess "-g" is needed to reproduce the issue as well as the messages
> > were produced in dwarf generation.
> >
> I have reproduced this problem, and it can be solved by adding a hook.
>
> But unfortunately, when using '-mcmodel=extreme -mexplicit-relocs=always'
>
> to test spec2006 403.gcc, an error will occur. Others have not been
> tested yet.
>
> I roughly debugged it, and the problem should be this:
>
> The problem is that the address of the instruction ‘ldx.d $r12, $r25,
> $r6’ is wrong.
>
> Wrong assembly:
>
> 5826 pcalau12i $r13,%got_pc_hi20(recog_data)
> 5827 addi.d $r12,$r0,%got_pc_lo12(recog_data)
> 5828 lu32i.d $r12,%got64_pc_lo20(recog_data)
> 5829 lu52i.d $r12,$r12,%got64_pc_hi12(recog_data)
> 5830 ldx.d $r12,$r13,$r12
> 5831 ld.b $r8,$r12,997
> 5832 .loc 1 829 18 discriminator 1 view .LVU1527
> 5833 ble $r8,$r0,.L476
> 5834 ld.d $r6,$r3,16
> 5835 ld.d $r9,$r3,88
> 5836 .LBB189 = .
> 5837 .loc 1 839 24 view .LVU1528
> 5838 alsl.d $r7,$r19,$r19,2
> 5839 ldx.d $r12,$r25,$r6
> 5840 addi.d $r17,$r3,120
> 5841 .LBE189 = .
> 5842 .loc 1 829 18 discriminator 1 view .LVU1529
> 5843 or $r13,$r0,$r0
> 5844 addi.d $r4,$r12,992
>
> Assembly that works fine using macros:
>
> 3040 la.global $r12,$r13,recog_data
> 3041 ld.b $r9,$r12,997
> 3042 ble $r9,$r0,.L475
> 3043 alsl.d $r5,$r16,$r16,2
> 3044 la.global $r15,$r17,recog_data
> 3045 addi.d $r4,$r12,992
> 3046 addi.d $r18,$r3,48
> 3047 or $r13,$r0,$r0
>
> Comparing the assembly, we can see that lines 5844 and 3045 have the
> same function,
>
> but there is a problem with the base address register optimization at
> line 5844.
>
> regrename.c.283r.loop2_init:
>
> (insn 6 497 2741 34 (set (reg:DI 180 [ ivtmp.713D.15724 ])
> (const_int 0 [0])) "regrename.c":829:18 discrim 1 156
> {*movdi_64bit}
> (nil))
> (insn 2741 6 2744 34 (parallel [
> (set (reg:DI 1502)
> (unspec:DI [
> (symbol_ref:DI ("recog_data") [flags 0xc0]
> <var_decl 0x7f8c5ffd66c0 recog_data>)
> ] UNSPEC_LA_PCREL_64_PART1))
> (set (reg/f:DI 1479)
> (unspec:DI [
> (symbol_ref:DI ("recog_data") [flags 0xc0]
> <var_decl 0x7f8c5ffd66c0 recog_data>)
> ] UNSPEC_LA_PCREL_64_PART2))
> ]) -1
> (expr_list:REG_UNUSED (reg/f:DI 1479)
> (nil)))
> (insn 2744 2741 2745 34 (set (reg/f:DI 1503)
> (mem:DI (plus:DI (reg/f:DI 1479)
> (reg:DI 1502)) [0 S8 A8])) 156 {*movdi_64bit}
> (expr_list:REG_EQUAL (symbol_ref:DI ("recog_data") [flags 0xc0]
> <var_decl 0x7f8c5ffd66c0 recog_data>)
> (nil)))
>
>
> Virtual register 1479 will be used in insn 2744, but register 1479 was
> assigned the REG_UNUSED attribute in the previous instruction.
>
> The attached file is the wrong file.
> The compilation command is as follows:
>
> $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
> -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
> -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
> -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
> -Wno-incompatible-pointer-types -version -o regrename.s
> -mexplicit-relocs=always -fdump-rtl-all-all
I've seen some "guality" test failures in GCC test suite as well.
Normally I just ignore the guality failures but this time they look very
suspicious. I'll investigate these issues...
On 2024/1/17 at 5:50 PM, Xi Ruoyao wrote:
> On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
>> On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
>>> On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
>>>> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
>>>>> On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
>>>>>
>>>>>>> I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
>>>>>>> we need a target hook to tell the generic code
>>>>>>> UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
>>>>>>> see millions lines of messages like
>>>>>>>
>>>>>>> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
>>>>>>> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>>>>>> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
>>>>>>
>>>>>> $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>>>>>> --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
>>>>>> --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
>>>>>> $ make BOOT_FLAGS="-mcmodel=extreme"
>>>>>>
>>>>>> What did I do wrong?:-(
>>>>> BOOT_CFLAGS, not BOOT_FLAGS :).
>>>>>
>>>> This is so strange. My compilation here stopped due to syntax problems,
>>>>
>>>> and I still haven't reproduced the information you mentioned about
>>>> UNSPEC_LA_PCREL_64_PART1.
>>> I used:
>>>
>>> ../gcc/configure --with-system-zlib --disable-fixincludes \
>>> --enable-default-ssp --enable-default-pie \
>>> --disable-werror --disable-multilib \
>>> --prefix=/home/xry111/gcc-dev
>>>
>>> and then
>>>
>>> make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
>>> BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log
>>>
>>> I guess "-g" is needed to reproduce the issue as well as the messages
>>> were produced in dwarf generation.
>>>
>> I have reproduced this problem, and it can be solved by adding a hook.
>>
>> But unfortunately, when using '-mcmodel=extreme -mexplicit-relocs=always'
>>
>> to test spec2006 403.gcc, an error will occur. Others have not been
>> tested yet.
>>
>> I roughly debugged it, and the problem should be this:
>>
>> The problem is that the address of the instruction ‘ldx.d $r12, $r25,
>> $r6’ is wrong.
>>
>> Wrong assembly:
>>
>> 5826 pcalau12i $r13,%got_pc_hi20(recog_data)
>> 5827 addi.d $r12,$r0,%got_pc_lo12(recog_data)
>> 5828 lu32i.d $r12,%got64_pc_lo20(recog_data)
>> 5829 lu52i.d $r12,$r12,%got64_pc_hi12(recog_data)
>> 5830 ldx.d $r12,$r13,$r12
>> 5831 ld.b $r8,$r12,997
>> 5832 .loc 1 829 18 discriminator 1 view .LVU1527
>> 5833 ble $r8,$r0,.L476
>> 5834 ld.d $r6,$r3,16
>> 5835 ld.d $r9,$r3,88
>> 5836 .LBB189 = .
>> 5837 .loc 1 839 24 view .LVU1528
>> 5838 alsl.d $r7,$r19,$r19,2
>> 5839 ldx.d $r12,$r25,$r6
>> 5840 addi.d $r17,$r3,120
>> 5841 .LBE189 = .
>> 5842 .loc 1 829 18 discriminator 1 view .LVU1529
>> 5843 or $r13,$r0,$r0
>> 5844 addi.d $r4,$r12,992
>>
>> Assembly that works fine using macros:
>>
>> 3040 la.global $r12,$r13,recog_data
>> 3041 ld.b $r9,$r12,997
>> 3042 ble $r9,$r0,.L475
>> 3043 alsl.d $r5,$r16,$r16,2
>> 3044 la.global $r15,$r17,recog_data
>> 3045 addi.d $r4,$r12,992
>> 3046 addi.d $r18,$r3,48
>> 3047 or $r13,$r0,$r0
>>
>> Comparing the assembly, we can see that lines 5844 and 3045 have the
>> same function,
>>
>> but there is a problem with the base address register optimization at
>> line 5844.
>>
>> regrename.c.283r.loop2_init:
>>
>> (insn 6 497 2741 34 (set (reg:DI 180 [ ivtmp.713D.15724 ])
>> (const_int 0 [0])) "regrename.c":829:18 discrim 1 156
>> {*movdi_64bit}
>> (nil))
>> (insn 2741 6 2744 34 (parallel [
>> (set (reg:DI 1502)
>> (unspec:DI [
>> (symbol_ref:DI ("recog_data") [flags 0xc0]
>> <var_decl 0x7f8c5ffd66c0 recog_data>)
>> ] UNSPEC_LA_PCREL_64_PART1))
>> (set (reg/f:DI 1479)
>> (unspec:DI [
>> (symbol_ref:DI ("recog_data") [flags 0xc0]
>> <var_decl 0x7f8c5ffd66c0 recog_data>)
>> ] UNSPEC_LA_PCREL_64_PART2))
>> ]) -1
>> (expr_list:REG_UNUSED (reg/f:DI 1479)
>> (nil)))
>> (insn 2744 2741 2745 34 (set (reg/f:DI 1503)
>> (mem:DI (plus:DI (reg/f:DI 1479)
>> (reg:DI 1502)) [0 S8 A8])) 156 {*movdi_64bit}
>> (expr_list:REG_EQUAL (symbol_ref:DI ("recog_data") [flags 0xc0]
>> <var_decl 0x7f8c5ffd66c0 recog_data>)
>> (nil)))
>>
>>
>> Virtual register 1479 will be used in insn 2744, but register 1479 was
>> assigned the REG_UNUSED attribute in the previous instruction.
>>
>> The attached file is the wrong file.
>> The compilation command is as follows:
>>
>> $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
>> -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
>> -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
>> -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
>> -Wno-incompatible-pointer-types -version -o regrename.s
>> -mexplicit-relocs=always -fdump-rtl-all-all
> I've seen some "guality" test failures in GCC test suite as well.
> Normally I just ignore the guality failures but this time they look very
> suspicious. I'll investigate these issues...
>
I've also seen this kind of regression test failure, and I'll keep
looking into this issue as well.
On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
> > > Virtual register 1479 will be used in insn 2744, but register 1479 was
> > > assigned the REG_UNUSED attribute in the previous instruction.
> > >
> > > The attached file is the wrong file.
> > > The compilation command is as follows:
> > >
> > > $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
> > > -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
> > > -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
> > > -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
> > > -Wno-incompatible-pointer-types -version -o regrename.s
> > > -mexplicit-relocs=always -fdump-rtl-all-all
> > I've seen some "guality" test failures in GCC test suite as well.
> > Normally I just ignore the guality failures but this time they look very
> > suspicious. I'll investigate these issues...
> >
> I've also seen this type of failed regression tests and I'll continue to
> look at this issue as well.
The guality regression is simple: I didn't call
delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
the custom implementation.

The failure of this test case was because the compiler believed that two
(UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends on PC. Thus
(pc) needs to be included in the RTX, like:

  [(set (match_operand:DI 0 "register_operand" "=r")
        (unspec:DI [(match_operand:DI 2 "") (pc)] UNSPEC_LA_PCREL_64_PART1))
   (set (match_operand:DI 1 "register_operand" "=r")
        (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]

With this the buggy REG_UNUSED notes are gone. But it then prevented
CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLS LD symbols in a function it would emit 10 instances of "la.global
__tls_get_addr"), so I added a REG_EQUAL note for it. For symbols other
than __tls_get_addr such notes are added automatically by optimization
passes.

Updated patch attached.
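
To illustrate why two PART2 computations for the same symbol cannot be
treated as equal, here is a minimal C sketch. It uses a simplified model
(an assumption, not the exact relocation arithmetic from the LoongArch ELF
psABI): the upper 32 bits come from the distance between the symbol and
the 4KiB page of the pcalau12i. The function name and the addresses in the
note below are hypothetical.

```c
#include <stdint.h>

/* Simplified model (assumption): the value materialized by the
   lu32i.d/lu52i.d pair is the upper 32 bits of the distance from the
   4KiB page containing the pcalau12i to the symbol.  The real
   relocations apply additional adjustments that are omitted here.  */
static uint32_t
pcrel64_upper32 (uint64_t symbol, uint64_t pc)
{
  uint64_t page = pc & ~(uint64_t) 0xfff;

  /* Unsigned subtraction wraps around, which matches the
     two's-complement encoding of a PC-relative distance.  */
  return (uint32_t) ((symbol - page) >> 32);
}
```

In this model a symbol at 0x200000000 yields 0 when addressed from PC
0x100001000 but 1 when addressed from PC 0x1000, so an UNSPEC whose only
operand is the symbol under-describes the computation.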
On 2024/1/19 at 1:46 PM, Xi Ruoyao wrote:
> The guality regression is simple: I didn't call
> delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
> the custom implementation.
>
> The failure of this test case was because the compiler believes that two
> (UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always produce the
> same result, but this isn't true because the result depends on PC. Thus
> (pc) needed to be included in the RTX, like:
>
> [(set (match_operand:DI 0 "register_operand" "=r")
> (unspec:DI [(match_operand:DI 2 "") (pc)] UNSPEC_LA_PCREL_64_PART1))
> (set (match_operand:DI 1 "register_operand" "=r")
> (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
>
> With this the buggy REG_UNUSED notes were gone. But it then prevented
> the CSE when loading the address of __tls_get_addr (i.e. if we address
> 10 TLS LD symbols in a function it would emit 10 instances of "la.global
> __tls_get_addr") so I added an REG_EQUAL note for it. For symbols other
> than __tls_get_addr such notes are added automatically by optimization
> passes.
>
> Updated patch attached.
>
I'm working on eliminating the redundant la.global macro instructions in my
macro implementation.
I will test this patch.
On 2024/1/19 at 4:51 PM, chenglulu wrote:
> I'm eliminating redundant la.global directives in my macro
> implementation.
>
> I will be testing this patch.
With this patch, spec2006 passes, but the spec2017 621 and 654 tests
fail.
I haven't debugged the specific cause of the problem yet.
On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> With this patch, spec2006 can pass the test, but spec2017 621 and 654
> tests fail.
> I haven't debugged the specific cause of the problem yet.
Try removing the TARGET_DELEGITIMIZE_ADDRESS hook? After eating some
<del>unhealthy</del> food at midnight I realized the hook only
papers over the same issue that caused the spec2006 failure. I tried a
bootstrap with BOOT_CFLAGS="-O2 -g -mcmodel=extreme" and
TARGET_DELEGITIMIZE_ADDRESS commented out, and there were no more spurious
"note: non-delegitimized UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in
variable location" messages.
I feel that this hook is still written in a buggy way, so maybe removing
it will solve the spec2017 issue.
On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote:
> On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
>> With this patch, spec2006 can pass the test, but spec2017 621 and 654
>> tests fail.
>> I haven't debugged the specific cause of the problem yet.
> Try removing the TARGET_DELEGITIMIZE_ADDRESS hook? After eating some
> <del>unhealthy</del> food in the midnight I realized the hook only
> papers over the same issue caused spec2006 failure. I tried a bootstrap
> with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
> commented out, and there is no more spurious "note: non-delegitimized
> UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
> I feel that this hook is still written in a buggy way, so maybe removing
> it will solve the spec2017 issue.
>
I found the problem. When converting TLS IE to TLS LE, Binutils did not
take the four-instruction sequence into account, which caused an incorrect
conversion.
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote:
>
> On 2024/1/24 at 3:36 AM, Xi Ruoyao wrote:
> > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > With this patch, spec2006 can pass the test, but spec2017 621 and 654
> > > tests fail.
> > > I haven't debugged the specific cause of the problem yet.
> > Try removing the TARGET_DELEGITIMIZE_ADDRESS hook? After eating some
> > <del>unhealthy</del> food in the midnight I realized the hook only
> > papers over the same issue caused spec2006 failure. I tried a bootstrap
> > with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
> > commented out, and there is no more spurious "note: non-delegitimized
> > UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
> > I feel that this hook is still written in a buggy way, so maybe removing
> > it will solve the spec2017 issue.
> >
> I found the problem. Binutils did not consider the four instructions
> when converting the type from TLS IE to TLS LE, which caused the conversion error.
Oooops. We'd better fix this quickly, as the Binutils 2.42 release is
imminent.
Maybe we can just disable the TLS linker optimization once we see an
R_LARCH_TLS_DESC64* or R_LARCH_TLS_IE64* relocation.
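
As a rough illustration of that idea (not Binutils code), the check could
look like the C sketch below. It works on relocation names as strings so
that the two wildcard prefixes from the sentence above can be matched
directly; has_prefix and tls_relax_allowed are hypothetical helpers
invented for this example.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if NAME starts with PREFIX.  */
static bool
has_prefix (const char *name, const char *prefix)
{
  return strncmp (name, prefix, strlen (prefix)) == 0;
}

/* Hypothetical helper, not Binutils code: decide whether the TLS
   linker optimization may be attempted, given the names of the
   relocations seen in the input.  Any 64-bit (extreme code model) TLS
   relocation disables it.  */
static bool
tls_relax_allowed (const char *const *reloc_names, size_t count)
{
  for (size_t i = 0; i < count; i++)
    if (has_prefix (reloc_names[i], "R_LARCH_TLS_DESC64")
        || has_prefix (reloc_names[i], "R_LARCH_TLS_IE64"))
      return false;

  return true;
}
```

An input that only uses R_LARCH_TLS_IE_PC_HI20 and R_LARCH_TLS_IE_PC_LO12
would still be eligible, while one that also carries
R_LARCH_TLS_IE64_PC_LO20 would not.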
@@ -264,6 +264,9 @@ const char *const
loongarch_fp_conditions[16]= {LARCH_FP_CONDITIONS (STRINGIFY)};
#undef STRINGIFY
+static bool
+loongarch_symbol_extreme_p (enum loongarch_symbol_type type);
+
/* Size of guard page. */
#define STACK_CLASH_PROTECTION_GUARD_SIZE \
(1 << param_stack_clash_protection_guard_size)
@@ -1963,6 +1966,14 @@ loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_type *symbol_type)
bool
loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
{
+ /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
+ so that the linker can infer the PC of pcalau12i to apply relocations
+ to lu32i.d and lu52i.d. Otherwise, the results would be incorrect if
+ these four instructions are not in the same 4KiB page.
+ Therefore, macro instructions are used when cmodel=extreme. */
+ if (loongarch_symbol_extreme_p (type))
+ return false;
+
if (la_opt_explicit_relocs != EXPLICIT_RELOCS_AUTO)
return la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS;
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-options "-mexplicit-relocs -mcmodel=normal -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 2 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 2 } } */
#define ATTR_MODEL_TEST
#include "attr-model-test.c"
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-options "-mexplicit-relocs -mcmodel=extreme -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 3 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 3 } } */
#define ATTR_MODEL_TEST
#include "attr-model-test.c"
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-options "-mexplicit-relocs=auto -mcmodel=normal -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 2 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 2 } } */
#define ATTR_MODEL_TEST
#include "attr-model-test.c"
@@ -1,6 +1,6 @@
/* { dg-do compile } */
/* { dg-options "-mexplicit-relocs=auto -mcmodel=extreme -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 3 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 3 } } */
#define ATTR_MODEL_TEST
#include "attr-model-test.c"
@@ -1,8 +1,8 @@
/* { dg-do compile } */
/* { dg-options "-mabi=lp64d -O0 -fno-pic -fno-plt -mexplicit-relocs -mcmodel=extreme" } */
-/* { dg-final { scan-assembler "test:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test1:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
-/* { dg-final { scan-assembler "test2:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
+/* { dg-final { scan-assembler "test:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test1:.*la\.local.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test2:.*la\.local.*,\\\$r15," } } */
extern void g (void);
void
@@ -1,8 +1,8 @@
/* { dg-do compile } */
/* { dg-options "-mabi=lp64d -O0 -fpic -fno-plt -mexplicit-relocs -mcmodel=extreme" } */
-/* { dg-final { scan-assembler "test:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test1:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test2:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
+/* { dg-final { scan-assembler "test:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test1:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test2:.*la\.local.*,\\\$r15," } } */
extern void g (void);
void
@@ -1,7 +1,7 @@
/* { dg-do compile } */
/* { dg-options "-mabi=lp64d -O0 -fno-pic -fno-plt -mexplicit-relocs=auto -mcmodel=extreme" } */
-/* { dg-final { scan-assembler "test:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test1:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
-/* { dg-final { scan-assembler "test2:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
+/* { dg-final { scan-assembler "test:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test1:.*la\.local.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test2:.*la\.local.*,\\\$r15," } } */
#include "func-call-extreme-1.c"
@@ -1,7 +1,7 @@
/* { dg-do compile } */
/* { dg-options "-mabi=lp64d -O0 -fpic -fno-plt -mexplicit-relocs=auto -mcmodel=extreme" } */
-/* { dg-final { scan-assembler "test:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test1:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test2:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
+/* { dg-final { scan-assembler "test:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test1:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test2:.*la\.local.*,\\\$r15," } } */
#include "func-call-extreme-1.c"
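
The precedence that the loongarch.cc hunk gives to the extreme code model
can be summarized with a small standalone C sketch. The enum is declared
locally for illustration, and the auto_choice parameter is an assumption
standing in for the rest of loongarch_explicit_relocs_p, which is not
shown in the hunk.

```c
#include <stdbool.h>

/* Local stand-ins for the -mexplicit-relocs option values.  */
enum explicit_relocs
{
  EXPLICIT_RELOCS_AUTO,
  EXPLICIT_RELOCS_NONE,
  EXPLICIT_RELOCS_ALWAYS
};

/* Sketch of the decision order: an extreme-model symbol always uses
   the macro instruction, whatever -mexplicit-relocs says.  AUTO_CHOICE
   is a placeholder for the part of the real function that handles
   -mexplicit-relocs=auto.  */
static bool
use_explicit_relocs (bool symbol_is_extreme, enum explicit_relocs opt,
                     bool auto_choice)
{
  if (symbol_is_extreme)
    return false;

  if (opt != EXPLICIT_RELOCS_AUTO)
    return opt == EXPLICIT_RELOCS_ALWAYS;

  return auto_choice;
}
```

With this ordering, -mexplicit-relocs=always combined with an extreme
symbol still returns false, which is why the updated tests scan for
la.local and la.global instead of the %pc64_hi12 style operators.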