[v2,2/2] LoongArch: When the code model is extreme, the symbol address is obtained through macro instructions regardless of the value of -mexplicit-relocs.

Message ID 20240105034021.30177-3-chenglulu@loongson.cn
State Unresolved
Series When cmodel=extreme, add macro support and only support macros.

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

chenglulu Jan. 5, 2024, 3:40 a.m. UTC
  The instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent so that
the linker can infer the PC of pcalau12i when applying relocations to lu32i.d and
lu52i.d.  Otherwise, the results would be incorrect if these four instructions are
not in the same 4KiB page.

See the link for details:
https://github.com/loongson/la-abi-specs/blob/release/laelf.adoc#extreme-code-model.
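
To make the adjacency requirement concrete, here is a minimal sketch in C.
It is a simplified model, not the exact relocation formulas from the ABI
document: it assumes the linker recovers the PC of pcalau12i by subtracting
a fixed 8 bytes from the address of lu32i.d (two 4-byte instructions
earlier), and all function and macro names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Mask that clears the low 12 bits, i.e. rounds down to a 4KiB page.  */
#define LA_PAGE_MASK (~(uint64_t) 0xfff)

/* Page containing ADDR.  */
uint64_t page_of(uint64_t addr)
{
  return addr & LA_PAGE_MASK;
}

/* PC that a linker would infer for pcalau12i from the address of
   lu32i.d, ASSUMING the layout pcalau12i, addi.d, lu32i.d, lu52i.d
   with 4 bytes per instruction (hypothetical helper).  */
uint64_t inferred_pcalau12i_pc(uint64_t lu32i_addr)
{
  assert(lu32i_addr >= 8);
  return lu32i_addr - 8;
}

/* True if the inferred PC lies in the same 4KiB page as the real
   pcalau12i, so a page-relative computation would still be right.  */
bool inference_matches(uint64_t pcalau12i_addr, uint64_t lu32i_addr)
{
  return page_of(inferred_pcalau12i_pc(lu32i_addr))
         == page_of(pcalau12i_addr);
}
```

In this model, an adjacent sequence that straddles a page boundary
(pcalau12i at 0x1ff8, lu32i.d at 0x2000) is still handled correctly, while
a sequence whose lu32i.d has been moved 16 bytes further away (0x2010)
makes the inferred PC fall into the wrong page.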

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_symbol_extreme_p): Add
	function declaration.
	(loongarch_explicit_relocs_p): Use the macro instruction to get
	the symbol address when loongarch_symbol_extreme_p returns true.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/attr-model-1.c: Modify the content of the search
	string in the test case.
	* gcc.target/loongarch/attr-model-2.c: Likewise.
	* gcc.target/loongarch/attr-model-3.c: Likewise.
	* gcc.target/loongarch/attr-model-4.c: Likewise.
	* gcc.target/loongarch/func-call-extreme-1.c: Likewise.
	* gcc.target/loongarch/func-call-extreme-2.c: Likewise.
	* gcc.target/loongarch/func-call-extreme-3.c: Likewise.
	* gcc.target/loongarch/func-call-extreme-4.c: Likewise.
---
 gcc/config/loongarch/loongarch.cc                     | 11 +++++++++++
 gcc/testsuite/gcc.target/loongarch/attr-model-1.c     |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-2.c     |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-3.c     |  2 +-
 gcc/testsuite/gcc.target/loongarch/attr-model-4.c     |  2 +-
 .../gcc.target/loongarch/func-call-extreme-1.c        |  6 +++---
 .../gcc.target/loongarch/func-call-extreme-2.c        |  6 +++---
 .../gcc.target/loongarch/func-call-extreme-3.c        |  6 +++---
 .../gcc.target/loongarch/func-call-extreme-4.c        |  6 +++---
 9 files changed, 27 insertions(+), 16 deletions(-)
  

Comments

Xi Ruoyao Jan. 5, 2024, 8:37 a.m. UTC | #1
On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
>  bool
>  loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
>  {
> +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> +     so that the linker can infer the PC of pcalau12i to apply relocations
> +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> +     these four instructions are not in the same 4KiB page.
> +     Therefore, macro instructions are used when cmodel=extreme.  */
> +  if (loongarch_symbol_extreme_p (type))
> +    return false;

I think this is a bit strange.  With -mexplicit-relocs={auto,always}
we should still use explicit relocs, but emit all four instructions
together as

"pcalau12i\t%1,%pc_hi20(%2)\n\taddi.d\t%0,$r0,%pc_lo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"

Give me several hours trying to implement this...
  
chenglulu Jan. 5, 2024, 8:51 a.m. UTC | #2
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
>>   bool
>>   loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
>>   {
>> +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
>> +     so that the linker can infer the PC of pcalau12i to apply relocations
>> +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
>> +     these four instructions are not in the same 4KiB page.
>> +     Therefore, macro instructions are used when cmodel=extreme.  */
>> +  if (loongarch_symbol_extreme_p (type))
>> +    return false;
> I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> we should still use explicit relocs, but coding all 4 instructions
> altogether as
>
> "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
>
> Give me several hours trying to implement this...
>
Do you mean taking the final add instruction out separately?
  
chenglulu Jan. 5, 2024, 9:57 a.m. UTC | #3
On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
>>   bool
>>   loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
>>   {
>> +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
>> +     so that the linker can infer the PC of pcalau12i to apply relocations
>> +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
>> +     these four instructions are not in the same 4KiB page.
>> +     Therefore, macro instructions are used when cmodel=extreme.  */
>> +  if (loongarch_symbol_extreme_p (type))
>> +    return false;
> I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> we should still use explicit relocs, but coding all 4 instructions
> altogether as
>
> "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
>
> Give me several hours trying to implement this...
>
I think there is no difference between macros and these instructions put
together. If it is to be implemented in a split form, I think I can try it
through TARGET_SCHED_MACRO_FUSION_PAIR_P.
  
Xi Ruoyao Jan. 5, 2024, 10:25 a.m. UTC | #4
On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> 
> On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > >   bool
> > >   loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > >   {
> > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > > +     these four instructions are not in the same 4KiB page.
> > > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > > +  if (loongarch_symbol_extreme_p (type))
> > > +    return false;
> > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > we should still use explicit relocs, but coding all 4 instructions
> > altogether as
> > 
> > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > 
> > Give me several hours trying to implement this...
> > 
> I think there is no difference between macros and these instructions put 
> together. If implement it in a split form, I think I can try it through 
> TARGET_SCHED_MACRO_FUSION_PAIR_P

There is a difference:

int x;
int t() { return x; }

pcalau12i t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
ldx.w a0, t0, t1

is slightly better than

pcalau12i t0, %pc_hi20(x)
addi.d t1, r0, %pc_lo12(x)
lu32i.d t1, %pc64_lo20(x)
lu52i.d t1, t1, %pc64_hi12(x)
add.d t0, t0, t1
ld.w a0, t0, 0

And generating macros when -mexplicit-relocs=always can puzzle people
(it says "always" :-\ ).
  
Xi Ruoyao Jan. 5, 2024, 11:55 a.m. UTC | #5
On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > 
> > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > >   bool
> > > >   loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > >   {
> > > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > > > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > > > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > > > +     these four instructions are not in the same 4KiB page.
> > > > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > > > +  if (loongarch_symbol_extreme_p (type))
> > > > +    return false;
> > > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > > we should still use explicit relocs, but coding all 4 instructions
> > > altogether as
> > > 
> > > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > > 
> > > Give me several hours trying to implement this...
> > > 
> > I think there is no difference between macros and these instructions put 
> > together. If implement it in a split form, I think I can try it through 
> > TARGET_SCHED_MACRO_FUSION_PAIR_P

We don't need to split the insn.  We can just add a "large insn"
containing the assembly output we want.

See the attached patch.  Note that TLS LE/LD/GD needs a fix too because
they are basically a variation of GOT addressing.

I've run some small tests and am now trying to bootstrap GCC with
-mcmodel=extreme in BOOT_CFLAGS...

> 
> There is a difference:
> 
> int x;
> int t() { return x; }
> 
> pcalau12i.d t0, %pc_hi20(x)
> addi.d t1, r0, %pc_lo12(x)
> lu32i.d t1, %pc64_lo20(x)
> lu52i.d t1, t1, %pc64_hi12(x)
> ldx.w a0, t0, t1
> 
> is slightly better than
> 
> pcalau12i.d t0, %pc_hi20(x)
> addi.d t1, r0, %pc_lo12(x)
> lu32i.d t1, %pc64_lo20(x)
> lu52i.d t1, t1, %pc64_hi12(x)
> addi.d t0, t0, t1
> ld.w a0, t0, 0
> 
> And generating macros when -mexplicit-relocs=always can puzzle people
> (it says "always" :-\ ).
>
  
chenglulu Jan. 5, 2024, 12:45 p.m. UTC | #6
On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
>> On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
>>> On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
>>>> On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
>>>>>    bool
>>>>>    loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
>>>>>    {
>>>>> +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
>>>>> +     so that the linker can infer the PC of pcalau12i to apply relocations
>>>>> +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
>>>>> +     these four instructions are not in the same 4KiB page.
>>>>> +     Therefore, macro instructions are used when cmodel=extreme.  */
>>>>> +  if (loongarch_symbol_extreme_p (type))
>>>>> +    return false;
>>>> I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
>>>> we should still use explicit relocs, but coding all 4 instructions
>>>> altogether as
>>>>
>>>> "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
>>>>
>>>> Give me several hours trying to implement this...
>>>>
>>> I think there is no difference between macros and these instructions put
>>> together. If implement it in a split form, I think I can try it through
>>> TARGET_SCHED_MACRO_FUSION_PAIR_P
> We don't need to split the insn.  We can just add a "large insn"
> containing the assembly output we want.
>
> See the attached patch.  Note that TLS LE/LD/GD needs a fix too because
> they are basically an variation of GOT addressing.
>
> I've ran some small tests and now trying to bootstrap GCC with -
> mcmodel=extreme in BOOT_CFLAGS...
>
>> There is a difference:
>>
>> int x;
>> int t() { return x; }
>>
>> pcalau12i.d t0, %pc_hi20(x)
>> addi.d t1, r0, %pc_lo12(x)
>> lu32i.d t1, %pc64_lo20(x)
>> lu52i.d t1, t1, %pc64_hi12(x)
>> ldx.w a0, t0, t1
>>
>> is slightly better than
>>
>> pcalau12i.d t0, %pc_hi20(x)
>> addi.d t1, r0, %pc_lo12(x)
>> lu32i.d t1, %pc64_lo20(x)
>> lu52i.d t1, t1, %pc64_hi12(x)
>> addi.d t0, t0, t1
>> ld.w a0, t0, 0
>>
>> And generating macros when -mexplicit-relocs=always can puzzle people
>> (it says "always" :-\ ).
>>
Thumbs up! This method is much better than mine. I learned something,
thank you!
I still need to test its correctness, though.
  
Xi Ruoyao Jan. 5, 2024, 2:16 p.m. UTC | #7
On Fri, 2024-01-05 at 20:45 +0800, chenglulu wrote:
> 
> On 2024/1/5 at 7:55 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-05 at 18:25 +0800, Xi Ruoyao wrote:
> > > On Fri, 2024-01-05 at 17:57 +0800, chenglulu wrote:
> > > > On 2024/1/5 at 4:37 PM, Xi Ruoyao wrote:
> > > > > On Fri, 2024-01-05 at 11:40 +0800, Lulu Cheng wrote:
> > > > > >    bool
> > > > > >    loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
> > > > > >    {
> > > > > > +  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjancent
> > > > > > +     so that the linker can infer the PC of pcalau12i to apply relocations
> > > > > > +     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
> > > > > > +     these four instructions are not in the same 4KiB page.
> > > > > > +     Therefore, macro instructions are used when cmodel=extreme.  */
> > > > > > +  if (loongarch_symbol_extreme_p (type))
> > > > > > +    return false;
> > > > > I think this is a bit of strange.  With -mexplicit-relocs={auto,always}
> > > > > we should still use explicit relocs, but coding all 4 instructions
> > > > > altogether as
> > > > > 
> > > > > "pcalau12i.d\t%1,%pc64_hi12(%2)\n\taddi.d\t%0,$r0,%pclo12(%2)\n\tlu32i.d\t%0,%pc64_lo20(%2)\n\tlu52i.d\t%0,%0,%pc64_hi12(%2)"
> > > > > 
> > > > > Give me several hours trying to implement this...
> > > > > 
> > > > I think there is no difference between macros and these instructions put
> > > > together. If implement it in a split form, I think I can try it through
> > > > TARGET_SCHED_MACRO_FUSION_PAIR_P
> > We don't need to split the insn.  We can just add a "large insn"
> > containing the assembly output we want.
> > 
> > See the attached patch.  Note that TLS LE/LD/GD needs a fix too because
> > they are basically an variation of GOT addressing.
> > 
> > I've ran some small tests and now trying to bootstrap GCC with -
> > mcmodel=extreme in BOOT_CFLAGS...
> > 
> > > There is a difference:
> > > 
> > > int x;
> > > int t() { return x; }
> > > 
> > > pcalau12i.d t0, %pc_hi20(x)
> > > addi.d t1, r0, %pc_lo12(x)
> > > lu32i.d t1, %pc64_lo20(x)
> > > lu52i.d t1, t1, %pc64_hi12(x)
> > > ldx.w a0, t0, t1
> > > 
> > > is slightly better than
> > > 
> > > pcalau12i.d t0, %pc_hi20(x)
> > > addi.d t1, r0, %pc_lo12(x)
> > > lu32i.d t1, %pc64_lo20(x)
> > > lu52i.d t1, t1, %pc64_hi12(x)
> > > addi.d t0, t0, t1
> > > ld.w a0, t0, 0
> > > 
> > > And generating macros when -mexplicit-relocs=always can puzzle people
> > > (it says "always" :-\ ).
> > > 
> Thumbs up! This method is much better than my method, I learned 
> something! grateful!
> But I still have to test the accuracy.

I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
we need a target hook to tell the generic code that
UNSPEC_LA_PCREL_64_PART{1,2} are just wrappers around symbols, or we'll
see millions of lines of messages like

../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
UNSPEC_LA_PCREL_64_PART1 (42) found in variable location

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 4f89c4af323..410e1b5e693 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -10868,6 +10868,24 @@ loongarch_asm_code_end (void)
 #undef DUMP_FEATURE
 }
 
+static rtx loongarch_delegitimize_address (rtx op)
+{
+  if (GET_CODE (op) == UNSPEC)
+  {
+    int unspec = XINT (op, 1);
+    switch (unspec)
+      {
+      case UNSPEC_LA_PCREL_64_PART1:
+      case UNSPEC_LA_PCREL_64_PART2:
+	return XVECEXP (op, 0, 0);
+      default:
+	return op;
+      }
+  }
+
+  return op;
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -11129,6 +11147,10 @@ loongarch_asm_code_end (void)
 #define TARGET_VECTORIZE_SUPPORT_VECTOR_MISALIGNMENT \
   loongarch_builtin_support_vector_misalignment
 
+#undef TARGET_DELEGITIMIZE_ADDRESS
+#define TARGET_DELEGITIMIZE_ADDRESS \
+  loongarch_delegitimize_address
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-loongarch.h"
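
As a rough illustration of what the hook above does, here is a toy model
in plain C. It does not use GCC's real rtx type or accessors; the struct,
enums and function name are all hypothetical stand-ins. The idea is only
that an UNSPEC wrapper of one of the two PC-relative kinds is peeled off to
expose the symbol underneath, and everything else is returned unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Toy expression codes; stand-ins for GCC's rtx codes.  */
enum toy_code { TOY_SYMBOL_REF, TOY_UNSPEC };

/* Toy unspec kinds; stand-ins for UNSPEC_LA_PCREL_64_PART{1,2}.  */
enum toy_unspec { TOY_PCREL_64_PART1, TOY_PCREL_64_PART2, TOY_OTHER };

struct toy_rtx
{
  enum toy_code code;
  enum toy_unspec unspec;     /* Meaningful only for TOY_UNSPEC.  */
  const struct toy_rtx *arg;  /* Wrapped operand for TOY_UNSPEC.  */
  const char *name;           /* Symbol name for TOY_SYMBOL_REF.  */
};

/* Strip the wrapper from the two PC-relative unspec kinds so that a
   consumer (for example a debug-info writer) sees the plain symbol.
   Any other expression is returned as-is.  */
const struct toy_rtx *toy_delegitimize(const struct toy_rtx *op)
{
  assert(op != NULL);
  if (op->code != TOY_UNSPEC)
    return op;

  switch (op->unspec)
    {
    case TOY_PCREL_64_PART1:
    case TOY_PCREL_64_PART2:
      return op->arg;
    default:
      return op;
    }
}
```

Returning the operand unchanged in the default case mirrors the patch: an
unrecognized wrapper is left for the generic code to report.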
  
chenglulu Jan. 12, 2024, 1:46 a.m. UTC | #8
> I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> we need a target hook to tell the generic code
> UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> see millions lines of messages like
>
> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>
I built GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't
reproduced the problem you mentioned.

    $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
        --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
        --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
    $ make BOOT_FLAGS="-mcmodel=extreme"

What did I do wrong? :-(
  
Xi Ruoyao Jan. 12, 2024, 11:42 a.m. UTC | #9
On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:

> > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > we need a target hook to tell the generic code
> > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> > see millions lines of messages like
> > 
> > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> 
> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
> 
>     $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>         --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
>         --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
>     $ make BOOT_FLAGS="-mcmodel=extreme"
> 
> What did I do wrong?:-(

BOOT_CFLAGS, not BOOT_FLAGS :).
  
chenglulu Jan. 13, 2024, 7:01 a.m. UTC | #10
On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
>
>>> I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
>>> we need a target hook to tell the generic code
>>> UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
>>> see millions lines of messages like
>>>
>>> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
>>> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
>>
>>      $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>>          --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
>>          --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
>>      $ make BOOT_FLAGS="-mcmodel=extreme"
>>
>> What did I do wrong?:-(
> BOOT_CFLAGS, not BOOT_FLAGS :).
>
This is so strange. My build here stopped due to syntax problems,
and I still haven't reproduced the messages you mentioned about
UNSPEC_LA_PCREL_64_PART1.
  
Xi Ruoyao Jan. 13, 2024, 1:05 p.m. UTC | #11
On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> 
> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > 
> > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > > > we need a target hook to tell the generic code
> > > > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> > > > see millions lines of messages like
> > > > 
> > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
> > > 
> > >      $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > >          --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
> > >          --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
> > >      $ make BOOT_FLAGS="-mcmodel=extreme"
> > > 
> > > What did I do wrong?:-(
> > BOOT_CFLAGS, not BOOT_FLAGS :).
> > 
> This is so strange. My compilation here stopped due to syntax problems,
> 
> and I still haven't reproduced the information you mentioned about 
> UNSPEC_LA_PCREL_64_PART1.

I used:

../gcc/configure --with-system-zlib --disable-fixincludes \
                 --enable-default-ssp --enable-default-pie \
                 --disable-werror --disable-multilib \
                 --prefix=/home/xry111/gcc-dev

and then

make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
     BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" |& tee gcc-build.log

I guess "-g" is needed to reproduce the issue as well, since the messages
were produced during DWARF generation.
  
chenglulu Jan. 13, 2024, 2:05 p.m. UTC | #12
On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
>> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
>>> On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
>>>
>>>>> I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
>>>>> we need a target hook to tell the generic code
>>>>> UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
>>>>> see millions lines of messages like
>>>>>
>>>>> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
>>>>> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>>>> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
>>>>
>>>>       $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>>>>           --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
>>>>           --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
>>>>       $ make BOOT_FLAGS="-mcmodel=extreme"
>>>>
>>>> What did I do wrong?:-(
>>> BOOT_CFLAGS, not BOOT_FLAGS :).
>>>
>> This is so strange. My compilation here stopped due to syntax problems,
>>
>> and I still haven't reproduced the information you mentioned about
>> UNSPEC_LA_PCREL_64_PART1.
> I used:
>
> ../gcc/configure --with-system-zlib --disable-fixincludes \
>                   --enable-default-ssp --enable-default-pie \
>                   --disable-werror --disable-multilib \
>                   --prefix=/home/xry111/gcc-dev
>
> and then
>
> make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
>       BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log
>
> I guess "-g" is needed to reproduce the issue as well as the messages
> were produced in dwarf generation.

Oh, okay, I'll try this method! :-)
  
Xi Ruoyao Jan. 17, 2024, 9:50 a.m. UTC | #13
On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
> 
> On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
> > On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
> > > On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
> > > > On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
> > > > 
> > > > > > I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
> > > > > > we need a target hook to tell the generic code
> > > > > > UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
> > > > > > see millions lines of messages like
> > > > > > 
> > > > > > ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
> > > > > > UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
> > > > > I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
> > > > > 
> > > > >       $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
> > > > >           --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
> > > > >           --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
> > > > >       $ make BOOT_FLAGS="-mcmodel=extreme"
> > > > > 
> > > > > What did I do wrong?:-(
> > > > BOOT_CFLAGS, not BOOT_FLAGS :).
> > > > 
> > > This is so strange. My compilation here stopped due to syntax problems,
> > > 
> > > and I still haven't reproduced the information you mentioned about
> > > UNSPEC_LA_PCREL_64_PART1.
> > I used:
> > 
> > ../gcc/configure --with-system-zlib --disable-fixincludes \
> >                   --enable-default-ssp --enable-default-pie \
> >                   --disable-werror --disable-multilib \
> >                   --prefix=/home/xry111/gcc-dev
> > 
> > and then
> > 
> > make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
> >       BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log
> > 
> > I guess "-g" is needed to reproduce the issue as well as the messages
> > were produced in dwarf generation.
> > 
> I have reproduced this problem, and it can be solved by adding a hook.
> 
> Unfortunately, an error occurs when testing SPEC2006 403.gcc with
> '-mcmodel=extreme -mexplicit-relocs=always'. The other benchmarks have
> not been tested yet.
> 
> I debugged it roughly, and the problem appears to be that the address
> used by the instruction 'ldx.d $r12, $r25, $r6' is wrong.
> 
> Wrong assembly:
> 
>   5826         pcalau12i       $r13,%got_pc_hi20(recog_data)
>   5827         addi.d  $r12,$r0,%got_pc_lo12(recog_data)
>   5828         lu32i.d $r12,%got64_pc_lo20(recog_data)
>   5829         lu52i.d $r12,$r12,%got64_pc_hi12(recog_data)
>   5830         ldx.d   $r12,$r13,$r12
>   5831         ld.b    $r8,$r12,997
>   5832         .loc 1 829 18 discriminator 1 view .LVU1527
>   5833         ble     $r8,$r0,.L476
>   5834         ld.d    $r6,$r3,16
>   5835         ld.d    $r9,$r3,88
>   5836 .LBB189 = .
>   5837         .loc 1 839 24 view .LVU1528
>   5838         alsl.d  $r7,$r19,$r19,2
>   5839         ldx.d   $r12,$r25,$r6
>   5840         addi.d  $r17,$r3,120
>   5841 .LBE189 = .
>   5842         .loc 1 829 18 discriminator 1 view .LVU1529
>   5843         or      $r13,$r0,$r0
>   5844         addi.d  $r4,$r12,992
> 
> Assembly that works fine using macros:
> 
> 3040         la.global       $r12,$r13,recog_data
> 3041         ld.b    $r9,$r12,997
> 3042         ble     $r9,$r0,.L475
> 3043         alsl.d  $r5,$r16,$r16,2
> 3044         la.global       $r15,$r17,recog_data
> 3045         addi.d  $r4,$r12,992
> 3046         addi.d  $r18,$r3,48
> 3047         or      $r13,$r0,$r0
> 
> Comparing the two listings, lines 5844 and 3045 do the same thing, but
> something goes wrong with the base address register optimization at
> line 5844.
> 
> regrename.c.283r.loop2_init:
> 
> (insn 6 497 2741 34 (set (reg:DI 180 [ ivtmp.713D.15724 ])
>         (const_int 0 [0])) "regrename.c":829:18 discrim 1 156 {*movdi_64bit}
>      (nil))
> (insn 2741 6 2744 34 (parallel [
>             (set (reg:DI 1502)
>                 (unspec:DI [
>                         (symbol_ref:DI ("recog_data") [flags 0xc0] <var_decl 0x7f8c5ffd66c0 recog_data>)
>                     ] UNSPEC_LA_PCREL_64_PART1))
>             (set (reg/f:DI 1479)
>                 (unspec:DI [
>                         (symbol_ref:DI ("recog_data") [flags 0xc0] <var_decl 0x7f8c5ffd66c0 recog_data>)
>                     ] UNSPEC_LA_PCREL_64_PART2))
>         ]) -1
>      (expr_list:REG_UNUSED (reg/f:DI 1479)
>         (nil)))
> (insn 2744 2741 2745 34 (set (reg/f:DI 1503)
>         (mem:DI (plus:DI (reg/f:DI 1479)
>                 (reg:DI 1502)) [0  S8 A8])) 156 {*movdi_64bit}
>      (expr_list:REG_EQUAL (symbol_ref:DI ("recog_data") [flags 0xc0] <var_decl 0x7f8c5ffd66c0 recog_data>)
>         (nil)))
> 
> 
> Pseudo register 1479 is used in insn 2744, but it was given a REG_UNUSED
> note in the previous instruction (insn 2741).
> 
> The attached file is the one that is compiled incorrectly.
> The compilation command is as follows:
> 
> $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c 
> -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64 
> -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2 
> -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration 
> -Wno-incompatible-pointer-types -version -o regrename.s 
> -mexplicit-relocs=always -fdump-rtl-all-all

I've seen some "guality" test failures in the GCC test suite as well.
Normally I just ignore the guality failures, but this time they look very
suspicious.  I'll investigate these issues...
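
The inconsistency in the quoted dump (a register carrying a REG_UNUSED
note in one insn and then being read by the next) can be stated as a small
check. The following is only a toy sketch in C: the struct is a
hypothetical stand-in for an insn and has nothing to do with GCC's real
data structures or passes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_REGS 4
#define NO_REG (-1)

/* Toy stand-in for an RTL insn; not GCC's real representation.  */
struct toy_insn
{
  int sets[MAX_REGS];  /* Registers written, NO_REG-terminated.  */
  int uses[MAX_REGS];  /* Registers read, NO_REG-terminated.  */
  int unused_note;     /* Register marked REG_UNUSED, or NO_REG.  */
};

/* True if REG appears in the NO_REG-terminated list REGS.  */
static bool contains(const int *regs, int reg)
{
  for (size_t i = 0; i < MAX_REGS && regs[i] != NO_REG; i++)
    if (regs[i] == reg)
      return true;
  return false;
}

/* Return the index of the first insn that reads a register which an
   earlier insn marked as unused, with no redefinition in between.
   Return -1 if the sequence is consistent.  */
int find_unused_note_violation(const struct toy_insn *insns, size_t n)
{
  for (size_t i = 0; i < n; i++)
    {
      int reg = insns[i].unused_note;
      if (reg == NO_REG)
        continue;
      for (size_t j = i + 1; j < n; j++)
        {
          if (contains(insns[j].uses, reg))
            return (int) j;
          if (contains(insns[j].sets, reg))
            break;  /* Redefined: the old note no longer applies.  */
        }
    }
  return -1;
}
```

Modelling insn 2741 as setting registers 1502 and 1479 with 1479 marked
unused, and insn 2744 as reading both, the check flags the second insn.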
  
chenglulu Jan. 17, 2024, 9:57 a.m. UTC | #14
On 2024/1/17 at 5:50 PM, Xi Ruoyao wrote:
> On Wed, 2024-01-17 at 17:38 +0800, chenglulu wrote:
>> On 2024/1/13 at 9:05 PM, Xi Ruoyao wrote:
>>> On Sat, 2024-01-13 at 15:01 +0800, chenglulu wrote:
>>>> On 2024/1/12 at 7:42 PM, Xi Ruoyao wrote:
>>>>> On Fri, 2024-01-12 at 09:46 +0800, chenglulu wrote:
>>>>>
>>>>>>> I found an issue bootstrapping GCC with -mcmodel=extreme in BOOT_CFLAGS:
>>>>>>> we need a target hook to tell the generic code
>>>>>>> UNSPEC_LA_PCREL_64_PART{1,2} are just a wrapper around symbols, or we'll
>>>>>>> see millions lines of messages like
>>>>>>>
>>>>>>> ../../gcc/gcc/tree.h:4171:1: note: non-delegitimized UNSPEC
>>>>>>> UNSPEC_LA_PCREL_64_PART1 (42) found in variable location
>>>>>> I build GCC with -mcmodel=extreme in BOOT_CFLAGS, but I haven't reproduced the problem you mentioned.
>>>>>>
>>>>>>        $ ../configure --host=loongarch64-linux-gnu --target=loongarch64-linux-gnu --build=loongarch64-linux-gnu \
>>>>>>            --with-arch=loongarch64 --with-abi=lp64d --enable-tls --enable-languages=c,c++,fortran,lto --enable-plugin \
>>>>>>            --disable-multilib --disable-host-shared --enable-bootstrap --enable-checking=release
>>>>>>        $ make BOOT_FLAGS="-mcmodel=extreme"
>>>>>>
>>>>>> What did I do wrong?:-(
>>>>> BOOT_CFLAGS, not BOOT_FLAGS :).
>>>>>
>>>> This is so strange. My compilation here stopped due to syntax problems,
>>>>
>>>> and I still haven't reproduced the information you mentioned about
>>>> UNSPEC_LA_PCREL_64_PART1.
>>> I used:
>>>
>>> ../gcc/configure --with-system-zlib --disable-fixincludes \
>>>                    --enable-default-ssp --enable-default-pie \
>>>                    --disable-werror --disable-multilib \
>>>                    --prefix=/home/xry111/gcc-dev
>>>
>>> and then
>>>
>>> make STAGE1_{C,CXX}FLAGS="-O2 -g" -j8 \
>>>        BOOT_{C,CXX}FLAGS="-O2 -g -mcmodel=extreme" &| tee gcc-build.log
>>>
>>> I guess "-g" is needed to reproduce the issue as well as the messages
>>> were produced in dwarf generation.
>>>
>> I have reproduced this problem, and it can be solved by adding a hook.
>>
>> But unfortunately, when using '-mcmodel=extreme -mexplicit-relocs=always'
>>
>> to test spec2006 403.gcc, an error will occur. Others have not been
>> tested yet.
>>
>> I roughly debugged it, and the problem should be this:
>>
>> The problem is that the address of the instruction ‘ldx.d $r12, $r25,
>> $r6’ is wrong.
>>
>> Wrong assembly:
>>
>>      5826         pcalau12i       $r13,%got_pc_hi20(recog_data)
>>    5827         addi.d  $r12,$r0,%got_pc_lo12(recog_data)
>>    5828         lu32i.d $r12,%got64_pc_lo20(recog_data)
>>    5829         lu52i.d $r12,$r12,%got64_pc_hi12(recog_data)
>>    5830         ldx.d   $r12,$r13,$r12
>>    5831         ld.b    $r8,$r12,997
>>    5832         .loc 1 829 18 discriminator 1 view .LVU1527
>>    5833         ble     $r8,$r0,.L476
>>    5834         ld.d    $r6,$r3,16
>>    5835         ld.d    $r9,$r3,88
>>    5836 .LBB189 = .
>>    5837         .loc 1 839 24 view .LVU1528
>>    5838         alsl.d  $r7,$r19,$r19,2
>>    5839         ldx.d   $r12,$r25,$r6
>>    5840         addi.d  $r17,$r3,120
>>    5841 .LBE189 = .
>>    5842         .loc 1 829 18 discriminator 1 view .LVU1529
>>    5843         or      $r13,$r0,$r0
>>    5844         addi.d  $r4,$r12,992
>>
>> Assembly that works fine using macros:
>>
>> 3040         la.global       $r12,$r13,recog_data
>> 3041         ld.b    $r9,$r12,997
>> 3042         ble     $r9,$r0,.L475
>> 3043         alsl.d  $r5,$r16,$r16,2
>> 3044         la.global       $r15,$r17,recog_data
>> 3045         addi.d  $r4,$r12,992
>> 3046         addi.d  $r18,$r3,48
>> 3047         or      $r13,$r0,$r0
>>
>> Comparing the assembly, we can see that lines 5844 and 3045 have the
>> same function,
>>
>> but there is a problem with the base address register optimization at
>> line 5844.
>>
>> regrename.c.283r.loop2_init:
>>
>> (insn 6 497 2741 34 (set (reg:DI 180 [ ivtmp.713D.15724 ])
>>           (const_int 0 [0])) "regrename.c":829:18 discrim 1 156
>> {*movdi_64bit}
>> (nil))
>> (insn 2741 6 2744 34 (parallel [
>>               (set (reg:DI 1502)
>>                   (unspec:DI [
>>                           (symbol_ref:DI ("recog_data") [flags 0xc0]
>> <var_decl 0x7f8c5ffd66c0 recog_data>)
>>                       ] UNSPEC_LA_PCREL_64_PART1))
>>               (set (reg/f:DI 1479)
>>                   (unspec:DI [
>>                           (symbol_ref:DI ("recog_data") [flags 0xc0]
>> <var_decl 0x7f8c5ffd66c0 recog_data>)
>>                       ] UNSPEC_LA_PCREL_64_PART2))
>>           ]) -1
>>        (expr_list:REG_UNUSED (reg/f:DI 1479)
>> (nil)))
>> (insn 2744 2741 2745 34 (set (reg/f:DI 1503)
>>           (mem:DI (plus:DI (reg/f:DI 1479)
>>                   (reg:DI 1502)) [0  S8 A8])) 156 {*movdi_64bit}
>>        (expr_list:REG_EQUAL (symbol_ref:DI ("recog_data") [flags 0xc0]
>> <var_decl 0x7f8c5ffd66c0 recog_data>)
>> (nil)))
>>
>>
>> Virtual register 1479 will be used in insn 2744, but register 1479 was
>> assigned the REG_UNUSED attribute in the previous instruction.
>>
>> The attached file is the wrong file.
>> The compilation command is as follows:
>>
>> $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
>> -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
>> -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
>> -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
>> -Wno-incompatible-pointer-types -version -o regrename.s
>> -mexplicit-relocs=always -fdump-rtl-all-all
> I've seen some "guality" test failures in GCC test suite as well.
> Normally I just ignore the guality failures but this time they look very
> suspicious.  I'll investigate these issues...
>
I've also seen this kind of regression test failure, and I'll keep
looking into this issue as well.
  
Xi Ruoyao Jan. 19, 2024, 5:46 a.m. UTC | #15
On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
> > > Virtual register 1479 will be used in insn 2744, but register 1479 was
> > > assigned the REG_UNUSED attribute in the previous instruction.
> > > 
> > > The attached file is the wrong file.
> > > The compilation command is as follows:
> > > 
> > > $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
> > > -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
> > > -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
> > > -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
> > > -Wno-incompatible-pointer-types -version -o regrename.s
> > > -mexplicit-relocs=always -fdump-rtl-all-all
> > I've seen some "guality" test failures in GCC test suite as well.
> > Normally I just ignore the guality failures but this time they look very
> > suspicious.  I'll investigate these issues...
> > 
> I've also seen this type of failed regression tests and I'll continue to 
> look at this issue as well.

The guality regression is simple: I didn't call
delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
the custom implementation.

This test case failed because the compiler believed that two
(UNSPEC_LA_PCREL_64_PART2 [(symbol)]) instances would always produce the
same result, but this isn't true because the result depends on PC.  Thus
(pc) needed to be included in the RTX, like:

  [(set (match_operand:DI 0 "register_operand" "=r")
    (unspec:DI [(match_operand:DI 2 "") (pc)] UNSPEC_LA_PCREL_64_PART1))
   (set (match_operand:DI 1 "register_operand" "=r")
    (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]

With this the buggy REG_UNUSED notes were gone.  But it then prevented
CSE when loading the address of __tls_get_addr (i.e. if we address
10 TLS LD symbols in a function it would emit 10 instances of "la.global
__tls_get_addr"), so I added a REG_EQUAL note for it.  For symbols other
than __tls_get_addr such notes are added automatically by optimization
passes.
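
A minimal sketch of why the result depends on PC (an illustration only,
not part of the patch).  It assumes a simplified model in which the
PC-relative parts are derived from the distance between the 4KiB page of
the instruction and the 4KiB page of the symbol; the sign adjustments the
real relocations apply are ignored, and page_delta and the addresses
mentioned below are hypothetical:

```c
#include <stdint.h>

/* Simplified model (an assumption, not the exact relocation formula):
   distance from the 4KiB page containing PC to the 4KiB page
   containing SYM.  */
static int64_t
page_delta (uint64_t pc, uint64_t sym)
{
  const uint64_t page_mask = ~(uint64_t) 0xfff;
  return (int64_t) ((sym & page_mask) - (pc & page_mask));
}

/* Return nonzero if computing the page distance to SYM at PC_A and at
   PC_B gives the same value, i.e. if reusing one result for the other
   would be harmless in this model.  */
static int
same_result_p (uint64_t pc_a, uint64_t pc_b, uint64_t sym)
{
  return page_delta (pc_a, sym) == page_delta (pc_b, sym);
}
```

For example, same_result_p (0x10000ffc, 0x10001000, 0x123456789000) is 0:
the two instruction addresses are only 4 bytes apart but sit in different
pages, so the two values differ by 0x1000.  Treating the two instances as
interchangeable is therefore only safe when both are evaluated in the same
page.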

Updated patch attached.
  
chenglulu Jan. 19, 2024, 8:51 a.m. UTC | #16
在 2024/1/19 下午1:46, Xi Ruoyao 写道:
> On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
>>>> Virtual register 1479 will be used in insn 2744, but register 1479 was
>>>> assigned the REG_UNUSED attribute in the previous instruction.
>>>>
>>>> The attached file is the wrong file.
>>>> The compilation command is as follows:
>>>>
>>>> $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase regrename.c
>>>> -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
>>>> -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
>>>> -Wno-int-conversion -Wno-implicit-int -Wno-implicit-function-declaration
>>>> -Wno-incompatible-pointer-types -version -o regrename.s
>>>> -mexplicit-relocs=always -fdump-rtl-all-all
>>> I've seen some "guality" test failures in GCC test suite as well.
>>> Normally I just ignore the guality failures but this time they look very
>>> suspicious.  I'll investigate these issues...
>>>
>> I've also seen this type of failed regression tests and I'll continue to
>> look at this issue as well.
> The guality regression is simple: I didn't call
> delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
> the custom implementation.
>
> The failure of this test case was because the compiler believes that two
> (UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
> same result, but this isn't true because the result depends on PC.  Thus
> (pc) needed to be included in the RTX, like:
>
>    [(set (match_operand:DI 0 "register_operand" "=r")
>      (unspec:DI [(match_operand:DI 2 "") (pc)] UNSPEC_LA_PCREL_64_PART1))
>     (set (match_operand:DI 1 "register_operand" "=r")
>      (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
>
> With this the buggy REG_UNUSED notes were gone.  But it then prevented
> the CSE when loading the address of __tls_get_addr (i.e. if we address
> 10 TLE_LD symbols in a function it would emit 10 instances of "la.global
> __tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
> than __tls_get_addr such notes are added automatically by optimization
> passes.
>
> Updated patch attached.
>
I'm eliminating redundant la.global directives in my macro implementation.

I will be testing this patch.
  
chenglulu Jan. 22, 2024, 7:27 a.m. UTC | #17
在 2024/1/19 下午4:51, chenglulu 写道:
>
> 在 2024/1/19 下午1:46, Xi Ruoyao 写道:
>> On Wed, 2024-01-17 at 17:57 +0800, chenglulu wrote:
>>>>> Virtual register 1479 will be used in insn 2744, but register 1479 
>>>>> was
>>>>> assigned the REG_UNUSED attribute in the previous instruction.
>>>>>
>>>>> The attached file is the wrong file.
>>>>> The compilation command is as follows:
>>>>>
>>>>> $ ./gcc/cc1 -fpreprocessed regrename.i -quiet -dp -dumpbase 
>>>>> regrename.c
>>>>> -dumpbase-ext .c -mno-relax -mabi=lp64d -march=loongarch64 -mfpu=64
>>>>> -msimd=lasx -mcmodel=extreme -mtune=loongarch64 -g3 -O2
>>>>> -Wno-int-conversion -Wno-implicit-int 
>>>>> -Wno-implicit-function-declaration
>>>>> -Wno-incompatible-pointer-types -version -o regrename.s
>>>>> -mexplicit-relocs=always -fdump-rtl-all-all
>>>> I've seen some "guality" test failures in GCC test suite as well.
>>>> Normally I just ignore the guality failures but this time they look 
>>>> very
>>>> suspicious.  I'll investigate these issues...
>>>>
>>> I've also seen this type of failed regression tests and I'll 
>>> continue to
>>> look at this issue as well.
>> The guality regression is simple: I didn't call
>> delegitimize_mem_from_attrs (the default TARGET_DELEGITIMIZE_ADDRESS) in
>> the custom implementation.
>>
>> The failure of this test case was because the compiler believes that two
>> (UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
>> same result, but this isn't true because the result depends on PC.  Thus
>> (pc) needed to be included in the RTX, like:
>>
>>    [(set (match_operand:DI 0 "register_operand" "=r")
>>      (unspec:DI [(match_operand:DI 2 "") (pc)] 
>> UNSPEC_LA_PCREL_64_PART1))
>>     (set (match_operand:DI 1 "register_operand" "=r")
>>      (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
>>
>> With this the buggy REG_UNUSED notes were gone.  But it then prevented
>> the CSE when loading the address of __tls_get_addr (i.e. if we address
>> 10 TLE_LD symbols in a function it would emit 10 instances of "la.global
>> __tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
>> than __tls_get_addr such notes are added automatically by optimization
>> passes.
>>
>> Updated patch attached.
>>
> I'm eliminating redundant la.global directives in my macro 
> implementation.
>
> I will be testing this patch.
>
>
>
>
With this patch, spec2006 passes, but the spec2017 621 and 654
tests fail.
I haven't debugged the specific cause of the problem yet.
  
Xi Ruoyao Jan. 23, 2024, 7:36 p.m. UTC | #18
On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > The failure of this test case was because the compiler believes that two
> > > (UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
> > > same result, but this isn't true because the result depends on PC.  Thus
> > > (pc) needed to be included in the RTX, like:
> > > 
> > >    [(set (match_operand:DI 0 "register_operand" "=r")
> > >      (unspec:DI [(match_operand:DI 2 "") (pc)] 
> > > UNSPEC_LA_PCREL_64_PART1))
> > >     (set (match_operand:DI 1 "register_operand" "=r")
> > >      (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
> > > 
> > > With this the buggy REG_UNUSED notes were gone.  But it then prevented
> > > the CSE when loading the address of __tls_get_addr (i.e. if we address
> > > 10 TLE_LD symbols in a function it would emit 10 instances of "la.global
> > > __tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
> > > than __tls_get_addr such notes are added automatically by optimization
> > > passes.
> > > 
> > > Updated patch attached.
> > > 
> > I'm eliminating redundant la.global directives in my macro 
> > implementation.
> > 
> > I will be testing this patch.
> > 
> > 
> > 
> > 
> With this patch, spec2006 can pass the test, but spec2017 621 and 654 
> tests fail.
> I haven't debugged the specific cause of the problem yet.

Try removing the TARGET_DELEGITIMIZE_ADDRESS hook?  After eating some
<del>unhealthy</del> food at midnight I realized the hook only papers
over the same issue that caused the spec2006 failure.  I tried a bootstrap
with BOOT_CFLAGS="-O2 -g -mcmodel=extreme" and TARGET_DELEGITIMIZE_ADDRESS
commented out, and there are no more spurious "note: non-delegitimized
UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" messages.
I feel that this hook is still written in a buggy way, so maybe removing
it will solve the spec2017 issue.
  
chenglulu Jan. 25, 2024, 12:48 a.m. UTC | #19
在 2024/1/24 上午3:36, Xi Ruoyao 写道:
> On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
>>>> The failure of this test case was because the compiler believes that two
>>>> (UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
>>>> same result, but this isn't true because the result depends on PC.  Thus
>>>> (pc) needed to be included in the RTX, like:
>>>>
>>>>     [(set (match_operand:DI 0 "register_operand" "=r")
>>>>       (unspec:DI [(match_operand:DI 2 "") (pc)]
>>>> UNSPEC_LA_PCREL_64_PART1))
>>>>      (set (match_operand:DI 1 "register_operand" "=r")
>>>>       (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
>>>>
>>>> With this the buggy REG_UNUSED notes were gone.  But it then prevented
>>>> the CSE when loading the address of __tls_get_addr (i.e. if we address
>>>> 10 TLE_LD symbols in a function it would emit 10 instances of "la.global
>>>> __tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
>>>> than __tls_get_addr such notes are added automatically by optimization
>>>> passes.
>>>>
>>>> Updated patch attached.
>>>>
>>> I'm eliminating redundant la.global directives in my macro
>>> implementation.
>>>
>>> I will be testing this patch.
>>>
>>>
>>>
>>>
>> With this patch, spec2006 can pass the test, but spec2017 621 and 654
>> tests fail.
>> I haven't debugged the specific cause of the problem yet.
> Try removing the TARGET_DELEGITIMIZE_ADDRESS hook?  After eating some
> <del>unhealthy</del> food in the midnight I realized the hook only
> papers over the same issue caused spec2006 failure.  I tried a bootstrap
> with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
> commented out, and there is no more spurious "note: non-delegitimized
> UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
> I feel that this hook is still written in a buggy way, so maybe removing
> it will solve the spec2017 issue.
>
I found the problem. Binutils did not consider the four instructions
when converting the type from TLS IE to TLS LE, which caused the
conversion error.
  
Xi Ruoyao Jan. 25, 2024, 7:59 a.m. UTC | #20
On Thu, 2024-01-25 at 08:48 +0800, chenglulu wrote:
> 
> 在 2024/1/24 上午3:36, Xi Ruoyao 写道:
> > On Mon, 2024-01-22 at 15:27 +0800, chenglulu wrote:
> > > > > The failure of this test case was because the compiler believes that two
> > > > > (UNSPEC_PCREL_64_PART2 [(symbol)]) instances would always produce the
> > > > > same result, but this isn't true because the result depends on PC.  Thus
> > > > > (pc) needed to be included in the RTX, like:
> > > > > 
> > > > >     [(set (match_operand:DI 0 "register_operand" "=r")
> > > > >       (unspec:DI [(match_operand:DI 2 "") (pc)]
> > > > > UNSPEC_LA_PCREL_64_PART1))
> > > > >      (set (match_operand:DI 1 "register_operand" "=r")
> > > > >       (unspec:DI [(match_dup 2) (pc)] UNSPEC_LA_PCREL_64_PART2))]
> > > > > 
> > > > > With this the buggy REG_UNUSED notes were gone.  But it then prevented
> > > > > the CSE when loading the address of __tls_get_addr (i.e. if we address
> > > > > 10 TLE_LD symbols in a function it would emit 10 instances of "la.global
> > > > > __tls_get_addr") so I added an REG_EQUAL note for it.  For symbols other
> > > > > than __tls_get_addr such notes are added automatically by optimization
> > > > > passes.
> > > > > 
> > > > > Updated patch attached.
> > > > > 
> > > > I'm eliminating redundant la.global directives in my macro
> > > > implementation.
> > > > 
> > > > I will be testing this patch.
> > > > 
> > > > 
> > > > 
> > > > 
> > > With this patch, spec2006 can pass the test, but spec2017 621 and 654
> > > tests fail.
> > > I haven't debugged the specific cause of the problem yet.
> > Try removing the TARGET_DELEGITIMIZE_ADDRESS hook?  After eating some
> > <del>unhealthy</del> food in the midnight I realized the hook only
> > papers over the same issue caused spec2006 failure.  I tried a bootstrap
> > with BOOT_CFLAGS=-O2 -g -mcmodel=extreme and TARGET_DELEGITIMIZE_ADDRESS
> > commented out, and there is no more spurious "note: non-delegitimized
> > UNSPEC UNSPEC_LA_PCREL_64_PART1 (42) found in variable location" things.
> > I feel that this hook is still written in a buggy way, so maybe removing
> > it will solve the spec2017 issue.
> > 
> I found the problem. Binutils did not consider the four instructions 
> when converting the type from TLS IE to TLS LE, which caused the conversion error.

Oooops.  We'd better fix this quickly as the Binutils 2.42 release is
imminent.

Maybe we can just disable TLS linker optimization once we see an
R_LARCH_TLS_DESC64* or R_LARCH_TLS_IE64*.
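
A rough sketch of that idea, for illustration only (this is not Binutils
code).  The helper names is_tls_64bit_reloc_name and
tls_transition_allowed are hypothetical, and relocations are matched by
name prefix here purely to keep the example self-contained; a real linker
would compare relocation type numbers instead:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if NAME starts with one of the two prefixes mentioned
   above (R_LARCH_TLS_IE64* or R_LARCH_TLS_DESC64*).  */
static bool
is_tls_64bit_reloc_name (const char *name)
{
  static const char *const prefixes[] = {
    "R_LARCH_TLS_IE64_",
    "R_LARCH_TLS_DESC64_",
  };

  for (size_t i = 0; i < sizeof prefixes / sizeof prefixes[0]; i++)
    if (strncmp (name, prefixes[i], strlen (prefixes[i])) == 0)
      return true;
  return false;
}

/* Hypothetical policy: allow the TLS transition only when none of the
   N relocation names in RELOC_NAMES belongs to the extreme code model
   sequences.  */
static bool
tls_transition_allowed (const char *const *reloc_names, size_t n)
{
  for (size_t i = 0; i < n; i++)
    if (is_tls_64bit_reloc_name (reloc_names[i]))
      return false;
  return true;
}
```

With this policy, an input that contains only R_LARCH_TLS_IE_PC_HI20 and
R_LARCH_TLS_IE_PC_LO12 would still be optimized, while a single
R_LARCH_TLS_IE64_PC_LO20 would switch the optimization off.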
  

Patch

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 6a3321327ea..3b4b28f3bcc 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -264,6 +264,9 @@  const char *const
 loongarch_fp_conditions[16]= {LARCH_FP_CONDITIONS (STRINGIFY)};
 #undef STRINGIFY
 
+static bool
+loongarch_symbol_extreme_p (enum loongarch_symbol_type type);
+
 /* Size of guard page.  */
 #define STACK_CLASH_PROTECTION_GUARD_SIZE \
   (1 << param_stack_clash_protection_guard_size)
@@ -1963,6 +1966,14 @@  loongarch_symbolic_constant_p (rtx x, enum loongarch_symbol_type *symbol_type)
 bool
 loongarch_explicit_relocs_p (enum loongarch_symbol_type type)
 {
+  /* Instructions pcalau12i, addi.d, lu32i.d and lu52i.d must be adjacent
+     so that the linker can infer the PC of pcalau12i to apply relocations
+     to lu32i.d and lu52i.d.  Otherwise, the results would be incorrect if
+     these four instructions are not in the same 4KiB page.
+     Therefore, macro instructions are used when cmodel=extreme.  */
+  if (loongarch_symbol_extreme_p (type))
+    return false;
+
   if (la_opt_explicit_relocs != EXPLICIT_RELOCS_AUTO)
     return la_opt_explicit_relocs == EXPLICIT_RELOCS_ALWAYS;
 
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-1.c b/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
index 916d715b98b..65acb29162c 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-1.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs -mcmodel=normal -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 2 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 2 } } */
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-2.c b/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
index a74c795ac3e..cf0f079e39a 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-2.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs -mcmodel=extreme -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 3 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 3 } } */
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-3.c b/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
index 5622d508678..7c270d462f7 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-3.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs=auto -mcmodel=normal -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 2 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 2 } } */
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-model-4.c b/gcc/testsuite/gcc.target/loongarch/attr-model-4.c
index 482724bb974..627d630c36d 100644
--- a/gcc/testsuite/gcc.target/loongarch/attr-model-4.c
+++ b/gcc/testsuite/gcc.target/loongarch/attr-model-4.c
@@ -1,6 +1,6 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mexplicit-relocs=auto -mcmodel=extreme -O2" } */
-/* { dg-final { scan-assembler-times "%pc64_hi12" 3 } } */
+/* { dg-final { scan-assembler-times "la\.local\t\\\$r\[0-9\]+,\\\$r15," 3 } } */
 
 #define ATTR_MODEL_TEST
 #include "attr-model-test.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/func-call-extreme-1.c b/gcc/testsuite/gcc.target/loongarch/func-call-extreme-1.c
index db1e0f85396..46318f3d23f 100644
--- a/gcc/testsuite/gcc.target/loongarch/func-call-extreme-1.c
+++ b/gcc/testsuite/gcc.target/loongarch/func-call-extreme-1.c
@@ -1,8 +1,8 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mabi=lp64d -O0 -fno-pic -fno-plt -mexplicit-relocs -mcmodel=extreme" } */
-/* { dg-final { scan-assembler "test:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test1:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
-/* { dg-final { scan-assembler "test2:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
+/* { dg-final { scan-assembler "test:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test1:.*la\.local.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test2:.*la\.local.*,\\\$r15," } } */
 
 extern void g (void);
 void
diff --git a/gcc/testsuite/gcc.target/loongarch/func-call-extreme-2.c b/gcc/testsuite/gcc.target/loongarch/func-call-extreme-2.c
index 21bf81ae837..14b6e658ca1 100644
--- a/gcc/testsuite/gcc.target/loongarch/func-call-extreme-2.c
+++ b/gcc/testsuite/gcc.target/loongarch/func-call-extreme-2.c
@@ -1,8 +1,8 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mabi=lp64d -O0 -fpic -fno-plt -mexplicit-relocs -mcmodel=extreme" } */
-/* { dg-final { scan-assembler "test:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test1:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test2:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
+/* { dg-final { scan-assembler "test:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test1:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test2:.*la\.local.*,\\\$r15," } } */
 
 extern void g (void);
 void
diff --git a/gcc/testsuite/gcc.target/loongarch/func-call-extreme-3.c b/gcc/testsuite/gcc.target/loongarch/func-call-extreme-3.c
index a4da44b4a3d..2ccbd2deb7c 100644
--- a/gcc/testsuite/gcc.target/loongarch/func-call-extreme-3.c
+++ b/gcc/testsuite/gcc.target/loongarch/func-call-extreme-3.c
@@ -1,7 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mabi=lp64d -O0 -fno-pic -fno-plt -mexplicit-relocs=auto -mcmodel=extreme" } */
-/* { dg-final { scan-assembler "test:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test1:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
-/* { dg-final { scan-assembler "test2:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
+/* { dg-final { scan-assembler "test:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test1:.*la\.local.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test2:.*la\.local.*,\\\$r15," } } */
 
 #include "func-call-extreme-1.c"
diff --git a/gcc/testsuite/gcc.target/loongarch/func-call-extreme-4.c b/gcc/testsuite/gcc.target/loongarch/func-call-extreme-4.c
index 16b00f4c5f2..0067024ef7d 100644
--- a/gcc/testsuite/gcc.target/loongarch/func-call-extreme-4.c
+++ b/gcc/testsuite/gcc.target/loongarch/func-call-extreme-4.c
@@ -1,7 +1,7 @@ 
 /* { dg-do compile } */
 /* { dg-options "-mabi=lp64d -O0 -fpic -fno-plt -mexplicit-relocs=auto -mcmodel=extreme" } */
-/* { dg-final { scan-assembler "test:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test1:.*pcalau12i.*%got_pc_hi20.*\n\taddi\.d.*%got_pc_lo12.*\n\tlu32i\.d.*%got64_pc_lo20.*\n\tlu52i\.d.*%got64_pc_hi12.*\n\tldx\.d" } } */
-/* { dg-final { scan-assembler "test2:.*pcalau12i.*%pc_hi20.*\n\taddi\.d.*%pc_lo12.*\n\tlu32i\.d.*%pc64_lo20.*\n\tlu52i\.d.*pc64_hi12.*\n\tadd\.d" } } */
+/* { dg-final { scan-assembler "test:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test1:.*la\.global.*,\\\$r15," } } */
+/* { dg-final { scan-assembler "test2:.*la\.local.*,\\\$r15," } } */
 
 #include "func-call-extreme-1.c"