[v5] LoongArch: add movable attribute

Message ID aea3cef0ace1cef1a63f0ff3556174789601f31a.camel@xry111.site
State New, archived

Commit Message

Xi Ruoyao Aug. 1, 2022, 10:07 a.m. UTC
  Changes v4 -> v5: Fix changelog.  No code change.

Changes v3 -> v4:

 * Use "movable" as the attribute name as Huacai says it's already used
   in downstream GCC fork.
 * Remove an inaccurate line from the doc. (Initially I tried to
   implement a "model(...)" like IA64 or M32R. Then I changed my mind
   but forgot to remove the line copied from M32R doc.)

-- >8 --

A linker script and/or a section attribute may locate a local object in
some way unexpected by the code model, leading to a link failure.  This
happens when the Linux kernel loads a module with "local" per-CPU
variables.

Add an attribute to explicitly mark a variable whose address is not
limited by the code model, so that we can work around such
problems.

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_attribute_table):
	New attribute table.
	(TARGET_ATTRIBUTE_TABLE): Define the target hook.
	(loongarch_handle_movable_attribute): New static function.
	(loongarch_classify_symbol): Return SYMBOL_GOT_DISP for
	SYMBOL_REF_DECL with movable attribute.
	(loongarch_use_anchors_for_symbol_p): New static function.
	(TARGET_USE_ANCHORS_FOR_SYMBOL_P): Define the target hook.
	* doc/extend.texi (Variable Attributes): Document new
	LoongArch specific attribute.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/attr-movable.c: New test.
---
 gcc/config/loongarch/loongarch.cc             | 63 +++++++++++++++++++
 gcc/doc/extend.texi                           | 16 +++++
 .../gcc.target/loongarch/attr-movable.c       | 29 +++++++++
 3 files changed, 108 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/attr-movable.c
  

Comments

Xi Ruoyao Aug. 3, 2022, 1:36 a.m. UTC | #1
Is it OK for trunk, or do I need to change something?

By the way, I'm looking into the possibility of including this in 12.2.  Then
only 12.1 would lack this attribute, and we can just say
"building the kernel needs GCC 12.2 or later".

On Mon, 2022-08-01 at 18:07 +0800, Xi Ruoyao wrote:
> Changes v4 -> v5: Fix changelog.  No code change.
> 
> Changes v3 -> v4:
> 
>  * Use "movable" as the attribute name as Huacai says it's already used
>    in downstream GCC fork.
>  * Remove an inaccurate line from the doc. (Initially I tried to
>    implement a "model(...)" like IA64 or M32R. Then I changed my mind
>    but forgot to remove the line copied from M32R doc.)
> 
> -- >8 --
> 
> A linker script and/or a section attribute may locate a local object in
> some way unexpected by the code model, leading to a link failure.  This
> happens when the Linux kernel loads a module with "local" per-CPU
> variables.
> 
> Add an attribute to explicitly mark a variable whose address is not
> limited by the code model, so that we can work around such
> problems.
> 
> gcc/ChangeLog:
> 
>         * config/loongarch/loongarch.cc (loongarch_attribute_table):
>         New attribute table.
>         (TARGET_ATTRIBUTE_TABLE): Define the target hook.
>         (loongarch_handle_movable_attribute): New static function.
>         (loongarch_classify_symbol): Return SYMBOL_GOT_DISP for
>         SYMBOL_REF_DECL with movable attribute.
>         (loongarch_use_anchors_for_symbol_p): New static function.
>         (TARGET_USE_ANCHORS_FOR_SYMBOL_P): Define the target hook.
>         * doc/extend.texi (Variable Attributes): Document new
>         LoongArch specific attribute.
> 
> gcc/testsuite/ChangeLog:
> 
>         * gcc.target/loongarch/attr-movable.c: New test.
> ---
>  gcc/config/loongarch/loongarch.cc             | 63 +++++++++++++++++++
>  gcc/doc/extend.texi                           | 16 +++++
>  .../gcc.target/loongarch/attr-movable.c       | 29 +++++++++
>  3 files changed, 108 insertions(+)
>  create mode 100644 gcc/testsuite/gcc.target/loongarch/attr-movable.c
> 
> diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
> index 79687340dfd..6b6026700a6 100644
> --- a/gcc/config/loongarch/loongarch.cc
> +++ b/gcc/config/loongarch/loongarch.cc
> @@ -1643,6 +1643,15 @@ loongarch_classify_symbol (const_rtx x)
>        && !loongarch_symbol_binds_local_p (x))
>      return SYMBOL_GOT_DISP;
>  
> +  if (SYMBOL_REF_P (x))
> +    {
> +      tree decl = SYMBOL_REF_DECL (x);
> +      /* A movable symbol may be moved away from the +/- 2GiB range around
> +        the PC, so we have to use GOT.  */
> +      if (decl && lookup_attribute ("movable", DECL_ATTRIBUTES (decl)))
> +       return SYMBOL_GOT_DISP;
> +    }
> +
>    return SYMBOL_PCREL;
>  }
>  
> @@ -6068,6 +6077,54 @@ loongarch_starting_frame_offset (void)
>    return crtl->outgoing_args_size;
>  }
>  
> +static tree
> +loongarch_handle_movable_attribute (tree *node, tree name, tree, int,
> +                                   bool *no_add_attrs)
> +{
> +  tree decl = *node;
> +  if (TREE_CODE (decl) == VAR_DECL)
> +    {
> +      if (DECL_CONTEXT (decl)
> +         && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL
> +         && !TREE_STATIC (decl))
> +       {
> +         error_at (DECL_SOURCE_LOCATION (decl),
> +                   "%qE attribute cannot be specified for local "
> +                   "variables", name);
> +         *no_add_attrs = true;
> +       }
> +    }
> +  else
> +    {
> +      warning (OPT_Wattributes, "%qE attribute ignored", name);
> +      *no_add_attrs = true;
> +    }
> +  return NULL_TREE;
> +}
> +
> +static const struct attribute_spec loongarch_attribute_table[] =
> +{
> +  /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
> +       affects_type_identity, handler, exclude } */
> +  { "movable", 0, 0, true, false, false, false,
> +    loongarch_handle_movable_attribute, NULL },
> +  /* The last attribute spec is set to be NULL.  */
> +  {}
> +};
> +
> +bool
> +loongarch_use_anchors_for_symbol_p (const_rtx symbol)
> +{
> +  tree decl = SYMBOL_REF_DECL (symbol);
> +
> +  /* A movable attribute indicates the linker may move the symbol away,
> +     so the use of anchor may cause relocation overflow.  */
> +  if (decl && lookup_attribute ("movable", DECL_ATTRIBUTES (decl)))
> +    return false;
> +
> +  return default_use_anchors_for_symbol_p (symbol);
> +}
> +
>  /* Initialize the GCC target structure.  */
>  #undef TARGET_ASM_ALIGNED_HI_OP
>  #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
> @@ -6256,6 +6313,12 @@ loongarch_starting_frame_offset (void)
>  #undef  TARGET_HAVE_SPECULATION_SAFE_VALUE
>  #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
>  
> +#undef  TARGET_ATTRIBUTE_TABLE
> +#define TARGET_ATTRIBUTE_TABLE loongarch_attribute_table
> +
> +#undef  TARGET_USE_ANCHORS_FOR_SYMBOL_P
> +#define TARGET_USE_ANCHORS_FOR_SYMBOL_P loongarch_use_anchors_for_symbol_p
> +
>  struct gcc_target targetm = TARGET_INITIALIZER;
>  
>  #include "gt-loongarch.h"
> diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
> index 7fe7f8817cd..322d8c05a04 100644
> --- a/gcc/doc/extend.texi
> +++ b/gcc/doc/extend.texi
> @@ -7314,6 +7314,7 @@ attributes.
>  * Blackfin Variable Attributes::
>  * H8/300 Variable Attributes::
>  * IA-64 Variable Attributes::
> +* LoongArch Variable Attributes::
>  * M32R/D Variable Attributes::
>  * MeP Variable Attributes::
>  * Microsoft Windows Variable Attributes::
> @@ -8098,6 +8099,21 @@ defined by shared libraries.
>  
>  @end table
>  
> +@node LoongArch Variable Attributes
> +@subsection LoongArch Variable Attributes
> +
> +One attribute is currently defined for the LoongArch.
> +
> +@table @code
> +@item movable
> +@cindex @code{movable} variable attribute, LoongArch
> +Use this attribute on LoongArch to mark an object that may be moved by
> +the linker, so that its address is not limited to the local data section
> +range specified by the code model, even if the object is defined locally.
> +This attribute is mostly useful if a @code{section} attribute and/or a
> +linker script will move the object somewhere unexpected by the code model.
> +@end table
> +
>  @node M32R/D Variable Attributes
>  @subsection M32R/D Variable Attributes
>  
> diff --git a/gcc/testsuite/gcc.target/loongarch/attr-movable.c b/gcc/testsuite/gcc.target/loongarch/attr-movable.c
> new file mode 100644
> index 00000000000..85b1dd4c59a
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/loongarch/attr-movable.c
> @@ -0,0 +1,29 @@
> +/* { dg-do compile } */
> +/* { dg-options "-mexplicit-relocs -mcmodel=normal -O2" } */
> +/* { dg-final { scan-assembler-not "%pc" } } */
> +/* { dg-final { scan-assembler-times "%got_pc_hi20" 3 } } */
> +
> +/* movable attribute should mark x and y possibly outside of the local
> +   data range defined by the code model, so GOT should be used instead of
> +   PC-relative.  */
> +
> +int x __attribute__((movable));
> +int y __attribute__((movable));
> +
> +int
> +test(void)
> +{
> +  return x + y;
> +}
> +
> +/* The following will be used for kernel per-cpu storage implementation. */
> +
> +register char *per_cpu_base __asm__("r21");
> +static int counter __attribute__((section(".data..percpu"), movable));
> +
> +void
> +inc_counter(void)
> +{
> +  int *ptr = (int *)(per_cpu_base + (long)&counter);
> +  (*ptr)++;
> +}
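
The checks in loongarch_handle_movable_attribute above can be summarized with a small usage sketch. This is only an illustration under assumptions: MOVABLE is a hypothetical helper macro (not part of the patch) that expands to the attribute when the compiler reports support for it and to nothing otherwise, so the file also builds with compilers that lack the attribute. The accepted and rejected cases follow the handler as quoted; they have not been verified against a built compiler.

```c
/* MOVABLE is a hypothetical macro for this sketch only: it expands to
   the attribute where the compiler knows it, and to nothing elsewhere.  */
#ifdef __has_attribute
# if __has_attribute(movable)
#  define MOVABLE __attribute__((movable))
# endif
#endif
#ifndef MOVABLE
# define MOVABLE
#endif

/* Accepted: a file-scope variable is a VAR_DECL outside any function.  */
int shared_counter MOVABLE;

/* Accepted: a locally-binding variable, the case the attribute targets.  */
static int module_counter MOVABLE;

int
bump (void)
{
  /* Expected to be accepted: a function-local static has static storage,
     so the handler's !TREE_STATIC test does not reject it.  */
  static int calls MOVABLE;

  /* Per the handler, these would NOT be accepted:
       int tmp MOVABLE;          error: automatic local variable
       void f (void) MOVABLE;    -Wattributes warning: not a variable  */
  calls++;
  return ++shared_counter + ++module_counter + calls;
}
```

With the attribute active, each access to these variables should go through the GOT instead of a PC-relative pair, which is what the scan-assembler directives in the test check for.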
  
WANG Xuerui Aug. 3, 2022, 2:59 a.m. UTC | #2
On 2022/8/3 09:36, Xi Ruoyao wrote:
> Is it OK for trunk, or do I need to change something?
>
> By the way, I'm looking into the possibility of including this in 12.2.  Then
> only 12.1 would lack this attribute, and we can just say
> "building the kernel needs GCC 12.2 or later".
>
> On Mon, 2022-08-01 at 18:07 +0800, Xi Ruoyao wrote:
>> Changes v4 -> v5: Fix changelog.  No code change.
>>
>> Changes v3 -> v4:
>>
>>   * Use "movable" as the attribute name as Huacai says it's already used
>>     in downstream GCC fork.

I don't think mindlessly catering to vendor forks is always correct. In
fact I find the name "movable" too generic, and something like
"force_got_access" could be better.

I don't currently have time to test this, unfortunately, due to my day job.
I might be able to give it a whirl in a week or two, though...

  
Xi Ruoyao Aug. 3, 2022, 3:10 a.m. UTC | #3
On Wed, 2022-08-03 at 10:55 +0800, chenglulu@loongson.cn wrote:
> I think there is no problem with this patch. But I have a question. The
> visibility attribute works, so is it necessary to add the movable
> attribute?

1. My use of -fPIC and visibility is not the way ELF visibility was
designed to be used.  It's hard to tell whether it's a legitimate use or a
misuse.
2. Adding -fPIC can have unwanted side effects.  In particular, if we add
some optimizations only suitable for -fno-PIC, we'll miss them when using
-fPIC.  Note that -fPIC does not only mean "produce position independent
code", but "produce position independent code *suitable for ELF dynamic
libraries*".  So to other people it will look ridiculous to use -fPIC for
the kernel.
3. Huacai said he didn't like using __attribute__((visibility)) like
this (on the kernel ML) and I share his feeling.

> I'd like to wait for the kernel team to test the performance data of
> the two implementations before deciding whether to support this
> attribute.
> 
> What do you think?

Perhaps.  I can't access my dev system now anyway (I'd configured
SSH access, but then a sudden power surge happened and I hadn't
configured automatic power-on :( )
  
Xi Ruoyao Aug. 3, 2022, 3:15 a.m. UTC | #4
On Wed, 2022-08-03 at 10:59 +0800, WANG Xuerui wrote:

> I don't think mindlessly caring for vendor forks is always correct. In
> fact I find the name "movable" too generic, and something like 
> "force_got_access" could be better.

The problem is "how will this behave *if* we later add some code model
without GOT".  If it's named "movable", we would generate a full 4-instruction
absolute (or PC-relative) address loading sequence when GOT is disabled.
If it's named "force_got_access", we would report an error and reject the code
when GOT is disabled.

> I don't currently have time to test this, unfortunately, due to day job. 
> Might be able to give it a whirl one or two week later though...

Unfortunately, I can't access my dev system via SSH either: while
I was away, a sudden power surge happened and I had forgotten to configure
automatic power-on.

I'm in a bit of a rush because I want to get this into 12.2, leaving 12.1 the
only release that cannot build Linux >= 6.0.  But maybe it just can't be
backported anyway.
  
Xi Ruoyao Aug. 4, 2022, 7:47 a.m. UTC | #5
On Wed, 2022-08-03 at 11:10 +0800, Xi Ruoyao via Gcc-patches wrote:

> > I'd like to wait for the kernel team to test the performance data of
> > the two implementations before deciding whether to support this
> > attribute.
> > 
> > What do you think?
> 
> Perhaps.  I can't access my dev system now anyway (I'd configured
> SSH access, but then a sudden power surge happened and I hadn't
> configured automatic power-on :( )

Hi folks,

Can someone run a benchmark to see whether a four-instruction immediate load
sequence can outperform GOT or vice versa?  I cannot access my test
system for at least 1 week, and I may be busy preparing the Linux From
Scratch 11.2 release for the remainder of August.

Note: if the four-instruction immediate load sequence outperforms GOT,
we should consider using immediate load instead of GOT for -fno-PIC by
default.

P.S. It seems I have trouble accessing gcc400.fsffrance.org.  I have a C
Farm account and I've already put

   Host gcc400.fsffrance.org
       Port 25465

in ~/.ssh/config, and I can access other C farm machines w/o problem. 
But:

   $ ssh gcc400.fsffrance.org 
   xry111@gcc400.fsffrance.org: Permission denied (publickey,keyboard-interactive).
   
If you know the administrator of the C farm machine, could you ask them to
check the configuration?  If I can access it, I may spend some time
running the benchmark (in userspace of course) myself.  Thanks.
  
chenglulu Aug. 5, 2022, 1:05 a.m. UTC | #6
I'm looking at how variable attributes are implemented for other
architectures. Whether the address is obtained through the GOT or
through 4 instructions, there is not much difference in performance.
Would it be more reasonable for us to follow the implementation of the model
attribute on the IA64 architecture? I will compare the performance of
the two soon. Do you know the approximate release date of GCC 12.2? I
also want to fix this before 12.2 is released. Thanks!

On 2022/8/4 3:47 PM, Xi Ruoyao wrote:
> On Wed, 2022-08-03 at 11:10 +0800, Xi Ruoyao via Gcc-patches wrote:
>
>>> I'd like to wait for the kernel team to test the performance data of
>>> the two implementations before deciding whether to support this
>>> attribute.
>>>
>>> What do you think?
>> Perhaps, I can't access my dev system now anyway (I've configured the
>> SSH access but then a sudden power surge happened and I didn't
>> configured automatically power on :( )
> Hi folks,
>
> Can someone perform a bench to see if a four-instruction immediate load
> sequence can outperform GOT or vice versa?  I cannot access my test
> system in at least 1 week, and I may be busy preparing Linux From
> Scratch 11.2 release in the remaining of August.
>
> Note: if the four-instruction immediate load sequence outperforms GOT,
> we should consider use immediate load instead of GOT for -fno-PIC by
> default.
>
> P.S. It seems I have trouble accessing gcc400.fsffrance.org.  I have a C
> Farm account and I've already put
>
>     Host gcc400.fsffrance.org
>         Port 25465
>
> in ~/.ssh/config, and I can access other C farm machines w/o problem.
> But:
>
>     $ ssh gcc400.fsffrance.org
>     xry111@gcc400.fsffrance.org: Permission denied (publickey,keyboard-interactive).
>     
> If you know the administrator of the C farm machine, can you tell him to
> check the configuration?  If I can access it I may use some time to
> perform the bench (in userspace of course) myself.  Thanks.
>
  
Xi Ruoyao Aug. 5, 2022, 1:28 a.m. UTC | #7
On Fri, 2022-08-05 at 09:05 +0800, Lulu Cheng wrote:
> I'm working on the implementation of specifying attributes of variables for other architectures.
> If the address is obtained through the GOT table and 4 instructions, there is not much difference in performance.

In this case I still prefer a GOT entry, because for a 4-instruction
absolute addressing sequence we'd need to implement 4 new relocation
types in the kernel module loader.
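
To make the "4 instructions, 4 relocation types" point concrete, here is a minimal sketch that models a four-instruction absolute address load in plain C. It assumes the sequence is lu12i.w, ori, lu32i.d, lu52i.d and that each instruction takes one bit field of the 64-bit address (bits 31:12, 11:0, 51:32 and 63:52); the function names are made up for this illustration, and the mapping of fields to actual relocation types is an assumption rather than something stated in this thread.

```c
#include <stdint.h>

/* Sign-extend the low BITS bits of V to 64 bits (result kept unsigned).  */
uint64_t
sext (uint64_t v, int bits)
{
  uint64_t sign = UINT64_C (1) << (bits - 1);
  v &= (sign << 1) - 1;
  return (v ^ sign) - sign;
}

/* lu12i.w rd, si20: rd = sign-extended 32-bit value (si20 << 12).  */
uint64_t
lu12i_w (uint64_t si20)
{
  return sext ((si20 & 0xfffff) << 12, 32);
}

/* ori rd, rj, ui12: rd = rj | zero-extended ui12.  */
uint64_t
ori (uint64_t rj, uint64_t ui12)
{
  return rj | (ui12 & 0xfff);
}

/* lu32i.d rd, si20: bits 63:32 = sign-extended si20, bits 31:0 kept.  */
uint64_t
lu32i_d (uint64_t rd, uint64_t si20)
{
  return (sext (si20, 20) << 32) | (rd & UINT64_C (0xffffffff));
}

/* lu52i.d rd, rj, si12: bits 63:52 = si12, bits 51:0 taken from rj.  */
uint64_t
lu52i_d (uint64_t rj, uint64_t si12)
{
  return ((si12 & 0xfff) << 52) | (rj & UINT64_C (0xfffffffffffff));
}

/* Materialize ADDR the way the four-instruction sequence would.  Each
   shifted field below is what one relocation would have to patch in.  */
uint64_t
load_abs64 (uint64_t addr)
{
  uint64_t r = lu12i_w (addr >> 12);   /* bits 31:12 */
  r = ori (r, addr);                   /* bits 11:0  */
  r = lu32i_d (r, addr >> 32);         /* bits 51:32 */
  r = lu52i_d (r, addr >> 52);         /* bits 63:52 */
  return r;
}
```

Because every instruction carries a different slice of the address, a module loader would need one relocation handler per slice. A GOT access only needs the existing GOT-related handling plus a 64-bit data relocation for the entry.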

> Is it more reasonable for us to refer to the implementation of the model attribute under the IA64 architecture?

Maybe we can use "model(got)", "model(abs)", "model(pcrel)" etc.

> I will compare the performance of the two soon. Do you know the approximate release date of GCC 12.2?
> I also want to fix this before 12.2 is released.

GCC 12.2 rc1 will be frozen on Aug 12th.
  
chenglulu Aug. 5, 2022, 2:38 a.m. UTC | #8
On 2022/8/5 9:28 AM, Xi Ruoyao wrote:
> On Fri, 2022-08-05 at 09:05 +0800, Lulu Cheng wrote:
>> I'm working on the implementation of specifying attributes of variables for other architectures.
>> If the address is obtained through the GOT table and 4 instructions, there is not much difference in performance.
> In this case I still prefer a GOT table entry because for 4-instruction
> absolute addressing sequence we'll need to implement 4 new relocation
> types in the kernel module loader.

If it is accessed through the GOT, dynamic relocation is required
when the module is loaded. And accessing the GOT may incur a cache
miss.

>> Is it more reasonable for us to refer to the implementation of the model attribute under the IA64 architecture?
> Maybe we can use "model(got)", "model(abs)", "model(pcrel)" etc.

We have an instruction sequence that can compute a 64-bit PC-relative
offset:

   "pcalau12i %1,%%pc_hi20(%3);"
   "addi.d %2,$r0,%%pc_lo12(%3);"
   "lu32i.d %2,%%pc64_lo20(%3);"
   "lu52i.d %2,%2,%%pc64_hi12(%3);"
   "add.d %1,%1,%2;"

The instructions in this sequence can be selected according to the size
of the offset:

   "pcalau12i %1,%%pc_hi20(%3);"
   "addi.d %2,$r0,%%pc_lo12(%3);"
   "lu32i.d %2,%%pc64_lo20(%3);"
   "add.d %1,%1,%2;"

for an offset within signed 52 bits, or

   "pcalau12i %1,%%pc_hi20(%3);"
   "addi.d %2,$r0,%%pc_lo12(%3);"
   "lu32i.d %2,%%pc64_lo20(%3);"
   "lu52i.d %2,%2,%%pc64_hi12(%3);"
   "add.d %1,%1,%2;"

for an offset within signed 64 bits.

So my idea is "model(normal)", "model(large)", etc.
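
The selection rule described above can be sketched as a small C helper. This is an illustration, not code from the patch: the enum and function names are hypothetical, the two-instruction baseline for offsets within roughly +/- 2GiB is taken from the comment in the patch rather than from this message, and the exact range boundaries are simplified to plain signed-width checks.

```c
#include <stdbool.h>
#include <stdint.h>

/* Candidate sequences, valued by their instruction count.  */
enum pcrel_seq
{
  PCREL_32 = 2,  /* pcalau12i + addi.d (assumed baseline, +/- 2GiB) */
  PCREL_52 = 4,  /* pcalau12i + addi.d + lu32i.d + add.d */
  PCREL_64 = 5   /* pcalau12i + addi.d + lu32i.d + lu52i.d + add.d */
};

/* True if VALUE is representable as a signed integer of BITS bits
   (BITS must be below 64 here).  */
bool
fits_signed (int64_t value, int bits)
{
  int64_t limit = INT64_C (1) << (bits - 1);
  return value >= -limit && value < limit;
}

/* Pick the shortest sequence able to reach a symbol OFFSET bytes away.  */
enum pcrel_seq
choose_pcrel_seq (int64_t offset)
{
  if (fits_signed (offset, 32))
    return PCREL_32;
  if (fits_signed (offset, 52))
    return PCREL_52;
  return PCREL_64;
}
```

A "model(...)" attribute would in effect let the programmer make this choice per variable, since the compiler cannot know the final offset when a linker script or the module loader places the object.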

>> I will compare the performance of the two soon. Do you know the approximate release date of GCC 12.2?
>> I also want to fix this before 12.2 is released.
> GCC 12.2 rc1 will be frozen on Aug 12th.
>
  
Xi Ruoyao Aug. 5, 2022, 2:51 a.m. UTC | #9
On Fri, 2022-08-05 at 10:38 +0800, Lulu Cheng wrote:

> > > I'm working on the implementation of specifying attributes of variables for other architectures.
> > > If the address is obtained through the GOT table and 4 instructions, there is not much difference in performance.
> > In this case I still prefer a GOT table entry because for 4-instruction
> > absolute addressing sequence we'll need to implement 4 new relocation
> > types in the kernel module loader.
> If it is accessed through the GOT table, dynamic relocation is required when the module is loaded.

Dynamic relocation is required when the module is loaded anyway.  The
.ko modules are actually relocatable ELF objects (produced by ld -r) and
the module loader has to perform some work of a normal linker.

> And accessing the got table may have a cache miss.

/* snip */

> So my idea is "model(normal)","model (large)" etc.

Then should we have an option to disable GOT globally?  Maybe for the kernel
we'd just use "-mno-got -mcmodel=large" (or "extreme"?  The kernel image is
loaded at 0x9000000000000000 and the modules are above
0xffff000000000000, so we need to handle 64-bit offsets).
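
As a quick check of the 64-bit claim, the sketch below computes the distance between the two addresses quoted above and the signed width needed to hold it. The helper names are hypothetical, and the lowest possible module address is used as a lower bound for the distance.

```c
#include <stdint.h>

/* Signed distance from FROM to TO, computed with wrap-around arithmetic.  */
int64_t
distance (uint64_t from, uint64_t to)
{
  return (int64_t) (to - from);
}

/* Smallest width, in bits, of a two's complement integer holding VALUE.  */
int
signed_bits_needed (int64_t value)
{
  /* A negative value needs as many bits as its bitwise complement.  */
  uint64_t magnitude = value < 0 ? ~(uint64_t) value : (uint64_t) value;
  int bits = 1;  /* the sign bit */

  while (magnitude != 0)
    {
      bits++;
      magnitude >>= 1;
    }
  return bits;
}
```

For these two addresses the distance is 0x6fff000000000000, which needs all 64 bits as a signed value, so neither a 32-bit nor a 52-bit PC-relative offset would reach from the kernel image to a module.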
  
Xi Ruoyao Aug. 5, 2022, 3:34 a.m. UTC | #10
On Fri, 2022-08-05 at 10:51 +0800, Xi Ruoyao via Gcc-patches wrote:

> > If it is accessed through the GOT table, dynamic relocation is required when the module is loaded.
> 
> Dynamic relocation is required when the module is loaded anyway.  The
> .ko modules are actually relocatable ELF objects (produced by ld -r) and
> the module loader has to perform some work of a normal linker.
> 
> > And accessing the got table may have a cache miss.
> 
> /* snip */
> 
> > So my idea is "model(normal)","model (large)" etc.
> 
> Then should we have an option to disable GOT globally?  Maybe for kernel
> we'll just "-mno-got -mcmodel=large" (or "extreme"?  The kernel image is
> loaded at 0x9000000000000000 and the modules are above
> 0xffff000000000000 so we need to handle 64-bit offset).

Or maybe we should just use PC-relative addressing with 4 instructions
instead of GOT for -fno-PIC?  Both ways consume 16 bytes (4 instructions
for PC-relative, 2 instructions and a 64-bit GOT entry for GOT) and
PC-relative may be more cache friendly.  But such a major change cannot be
backported for 12.2 IMO.
  
Xi Ruoyao Aug. 5, 2022, 3:45 a.m. UTC | #11
On Fri, 2022-08-05 at 11:34 +0800, Xi Ruoyao via Gcc-patches wrote:

> Or maybe we should just use a PC-relative addressing with 4 instructions
> instead of GOT for -fno-PIC?

Not possible: Glibc does not support R_LARCH_PCALA* relocations in
ld.so.  So we still need a -mno-got (or something) option to disable GOT
for special cases like the kernel.

> Both ways consume 16 bytes (4 instructions
> for PC-relative, 2 instructions and a 64-bit GOT entry for GOT) and
> PC-relative may be more cache friendly.  But such a major change cannot be
> backported for 12.2 IMO.
  
chenglulu Aug. 5, 2022, 4:01 a.m. UTC | #12
On 2022/8/5 11:45 AM, Xi Ruoyao wrote:
> On Fri, 2022-08-05 at 11:34 +0800, Xi Ruoyao via Gcc-patches wrote:
>
>> Or maybe we should just use a PC-relative addressing with 4 instructions
>> instead of GOT for -fno-PIC?
> Not possible, Glibc does not support R_LARCH_PCALA* relocations in
> ld.so.  So we still need a -mno-got (or something) option to disable GOT
> for special cases like the kernel.
>
>> Both ways consume 16 bytes (4 instructions
>> for PC-relative, 2 instructions and a 64-bit GOT entry for GOT) and
>> PC-relative may be more cache friendly.  But such a major change cannot be
>> backported for 12.2 IMO.

I'm very sorry, my understanding of the per-CPU variable was wrong. I just
read the kernel code you submitted: this per-CPU variable not only
has a large offset but also has an address that is unknown at compile time, so
whether it is addressed PC-relatively or through the GOT, it needs
dynamic relocation when loading. It seems that accessing it through the GOT
is the better choice.

The name "movable" describes this function in the kernel very vividly,
indicating that the address of the variable can be changed at
will. But this name is more difficult to understand within GCC. I have no
other comments; can this name be changed?
  
Xi Ruoyao Aug. 5, 2022, 6:03 a.m. UTC | #13
On Fri, 2022-08-05 at 12:01 +0800, Lulu Cheng wrote:
> 
> On 2022/8/5 11:45 AM, Xi Ruoyao wrote:
> > On Fri, 2022-08-05 at 11:34 +0800, Xi Ruoyao via Gcc-patches wrote:
> > 
> > > Or maybe we should just use a PC-relative addressing with 4 instructions
> > > instead of GOT for -fno-PIC?
> > Not possible, Glibc does not support R_LARCH_PCALA* relocations in
> > ld.so.  So we still need a -mno-got (or something) option to disable GOT
> > for special cases like the kernel.
> > 
> > > Both ways consume 16 bytes (4 instructions
> > > for PC-relative, 2 instructions and a 64-bit GOT entry for GOT) and
> > > PC-relative may be more cache friendly.  But such a major change cannot be
> > > backported for 12.2 IMO.
> 
> I'm very sorry, my understanding of the per-CPU variable is wrong,
> I just read the code of the kernel you submitted, this per-CPU variable
> not only has a large offset but also has an uncertain address when compiling,
> so no matter whether it is addressed with pcrel or GOT, it needs
> dynamic relocation when loading. It seems that accessing through the GOT
> is a better choice.
> 
> The name movable is also very vivid to describe this function in the kernel,
> indicating that the address of the variable can be changed at will.
> 
> But this name is more difficult to understand in gcc, I have no opinion on other,
> can this name be changed?

Yes, we don't need to be compatible with the old vendor compiler IMO.

"force_got_access" as Xuerui suggested?
  
chenglulu Aug. 5, 2022, 7:19 a.m. UTC | #14
在 2022/8/5 下午2:03, Xi Ruoyao 写道:
> On Fri, 2022-08-05 at 12:01 +0800, Lulu Cheng wrote:
>> On 2022/8/5 11:45 AM, Xi Ruoyao wrote:
>>> On Fri, 2022-08-05 at 11:34 +0800, Xi Ruoyao via Gcc-patches wrote:
>>>> Or maybe we should just use a PC-relative addressing with 4 instructions
>>>> instead of GOT for -fno-PIC?
>>> Not possible, Glibc does not support R_LARCH_PCALA* relocations in
>>> ld.so.  So we still need a -mno-got (or something) option to disable GOT
>>> for special cases like the kernel.
>>>
>>>> Both way consumes 16 bytes (4 instructions
>>>> for PC-relative, 2 instructions and a 64-bit GOT entry for GOT) and PC-
>>>> relative may be more cache friendly.   But such a major change cannot be
>>>> backported for 12.2 IMO.
>>
>> I'm very sorry, my understanding of the precpu variable is wrong,
>> I just read the code of the kernel you submitted, this precpu variable
>> not only has a large offset but also has an uncertain address when compiling,
>> so no matter whether it is addressed with pcrel Still got addressing needs
>> dynamic relocation when loading. It seems that accessing through the got table
>> is a better choice.
>>
>> The name movable is also very vivid to describe this function in the kernel,
>> indicating that the address of the variable can be changed at will.
>>
>> But this name is more difficult to understand in gcc, I have no opinion on other,
>> can this name be changed?
> Yes, we don't need to be compatible with old vendor compiler IMO.
>
>
> "force_got_access" as Xuerui suggested?

Compared with these names, I think addr_global is better.
  
WANG Xuerui Aug. 5, 2022, 7:41 a.m. UTC | #15
On 2022/8/5 15:19, Lulu Cheng wrote:
> On 2022/8/5 2:03 PM, Xi Ruoyao wrote:
>> On Fri, 2022-08-05 at 12:01 +0800, Lulu Cheng wrote:
>>> On 2022/8/5 11:45 AM, Xi Ruoyao wrote:
>>>
>>>> On Fri, 2022-08-05 at 11:34 +0800, Xi Ruoyao via Gcc-patches wrote:
>>>>
>>>>> Or maybe we should just use a PC-relative addressing with 4 instructions
>>>>> instead of GOT for -fno-PIC?
>>>> Not possible, Glibc does not support R_LARCH_PCALA* relocations in
>>>> ld.so.  So we still need a -mno-got (or something) option to disable GOT
>>>> for special cases like the kernel.
>>>>
>>>>> Both way consumes 16 bytes (4 instructions
>>>>> for PC-relative, 2 instructions and a 64-bit GOT entry for GOT) and PC-
>>>>> relative may be more cache friendly.   But such a major change cannot be
>>>>> backported for 12.2 IMO.
>>> I'm very sorry, my understanding of the precpu variable is wrong,
>>> I just read the code of the kernel you submitted, this precpu variable
>>> not only has a large offset but also has an uncertain address when compiling,
>>> so no matter whether it is addressed with pcrel Still got addressing needs
>>> dynamic relocation when loading. It seems that accessing through the got table
>>> is a better choice.
>>>
>>> The name movable is also very vivid to describe this function in the kernel,
>>> indicating that the address of the variable can be changed at will.
>>>
>>> But this name is more difficult to understand in gcc, I have no opinion on other,
>>> can this name be changed?
>> Yes, we don't need to be compatible with old vendor compiler IMO.
>>
>>
>> "force_got_access" as Xuerui suggested?
> Compared with these names, I think addr_global is better.

Actually if "model(...)" can be implemented I'd prefer a descriptive 
word/phrase inside model(). Because it may well be the case that more 
peculiar ways of accessing some special data will have to be supported 
in the future, and all of them are kind of "data models" so we'd be able 
to nicely group them with model(...).

Otherwise I actually don't have a particularly strong opinion, aside 
from "movable" which IMO should definitely not be taken.
  
chenglulu Aug. 5, 2022, 7:58 a.m. UTC | #16
On 2022/8/5 3:41 PM, WANG Xuerui wrote:
> On 2022/8/5 15:19, Lulu Cheng wrote:
>> On 2022/8/5 2:03 PM, Xi Ruoyao wrote:
>>> On Fri, 2022-08-05 at 12:01 +0800, Lulu Cheng wrote:
>>>> On 2022/8/5 11:45 AM, Xi Ruoyao wrote:
>>>>
>>>>> On Fri, 2022-08-05 at 11:34 +0800, Xi Ruoyao via Gcc-patches wrote:
>>>>>
>>>>>> Or maybe we should just use a PC-relative addressing with 4 instructions
>>>>>> instead of GOT for -fno-PIC?
>>>>> Not possible, Glibc does not support R_LARCH_PCALA* relocations in
>>>>> ld.so.  So we still need a -mno-got (or something) option to disable GOT
>>>>> for special cases like the kernel.
>>>>>
>>>>>> Both way consumes 16 bytes (4 instructions
>>>>>> for PC-relative, 2 instructions and a 64-bit GOT entry for GOT) and PC-
>>>>>> relative may be more cache friendly.   But such a major change cannot be
>>>>>> backported for 12.2 IMO.
>>>> I'm very sorry, my understanding of the precpu variable is wrong,
>>>> I just read the code of the kernel you submitted, this precpu variable
>>>> not only has a large offset but also has an uncertain address when compiling,
>>>> so no matter whether it is addressed with pcrel Still got addressing needs
>>>> dynamic relocation when loading. It seems that accessing through the got table
>>>> is a better choice.
>>>>
>>>> The name movable is also very vivid to describe this function in the kernel,
>>>> indicating that the address of the variable can be changed at will.
>>>>
>>>> But this name is more difficult to understand in gcc, I have no opinion on other,
>>>> can this name be changed?
>>> Yes, we don't need to be compatible with old vendor compiler IMO.
>>>
>>>
>>> "force_got_access" as Xuerui suggested?
>> Compared with these names, I think addr_global is better.
>
> Actually if "model(...)" can be implemented I'd prefer a descriptive 
> word/phrase inside model(). Because it may well be the case that more 
> peculiar ways of accessing some special data will have to be supported 
> in the future, and all of them are kind of "data models" so we'd be 
> able to nicely group them with model(...).
>
> Otherwise I actually don't have a particularly strong opinion, aside 
> from "movable" which IMO should definitely not be taken.
>
>
I think the model for per-CPU variables is not easy to describe. 
model(got)? model(global)? I would also like the model attribute and 
-mcmodel to work together, but this is just an initial idea. What do 
you think?
  
Xi Ruoyao Aug. 5, 2022, 9:53 a.m. UTC | #17
On Fri, 2022-08-05 at 15:58 +0800, Lulu Cheng wrote:
> I think the model of precpu is not very easy to describe. model(got)?model(global)? 
> I also want to use attribute model and -mcmodel together, but this is just an initial idea, 
> what do you think?

It seems I had some misunderstanding about IA-64 model attribute.  IA-64
actually does not have -mcmodel= options.  And a code model only
specifies where "the GOT and the local symbols" are, but our new
attribute should apply for both local symbols and global symbols.  So I
don't think we should strongly bind the new attribute and -mcmodel.

Maybe __attribute__((addressing_model(got/pcrel32/pcrel64/abs32/abs64)))?
I think they are explicit enough (we can implement got and pcrel32
first, and add the others when we implement other code models).
  
chenglulu Aug. 8, 2022, 4:53 a.m. UTC | #18
On 2022/8/5 5:53 PM, Xi Ruoyao wrote:
> On Fri, 2022-08-05 at 15:58 +0800, Lulu Cheng wrote:
>> I think the model of precpu is not very easy to describe. model(got)?model(global)?
>> I also want to use attribute model and -mcmodel together, but this is just an initial idea,
>> what do you think?
> It seems I had some misunderstanding about IA-64 model attribute.  IA-64
> actually does not have -mcmodel= options.  And a code model only
> specifies where "the GOT and the local symbols" are, but our new
> attribute should apply for both local symbols and global symbols.  So I
> don't think we should strongly bind the new attribute and -mcmodel.
>
> Maybe, __attribute__((addressing_model(got/pcrel32/pcrel64/abs32/abs64))
> ?  I think they are explicit enough (we can implement got and pc32
> first, and adding the others when we implement other code models).

I still think it makes a little more sense to put attribute(model) 
and -mcmodel together.

-mcmodel sets the access range of all symbols in a single file, and 
attribute(model) sets the access range of a single symbol in a file. 
For example, __attribute__((model(normal/large/extreme))).
  
Xi Ruoyao Aug. 9, 2022, 11:30 a.m. UTC | #19
Sorry for late reply, I'm rebuilding my entire Linux system (from
scratch) for Glibc-2.36 and Binutils-2.39 update and I just reached the
mail client.

On Mon, 2022-08-08 at 12:53 +0800, Lulu Cheng wrote:
> I still think it makes a little bit more sense to put attribute(model)
> and -mcmodel together.
> 
> -mcmodel sets the access range of all symbols in a single file, and
> attribute (model) sets the access range of a single symbol in a file.
> For example __attribute__((model(normal/large/extreme))).

It might make sense, but then it would not be what we want for per-CPU
symbols.  What we want here is "treat a local symbol as-if it's global",
while each code model may already treat local symbol and global symbol
differently.

Disambiguation: here "local" means "defined in this TU", "global"
otherwise (not "local variable" in C).

I'll send v6 with the name "addr_global" if no objection.
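
For illustration only (not part of the patch): the per-CPU case works
because the link-time address of such a symbol is used as an offset into
a per-CPU area rather than as a location that is addressed directly,
which is the pattern exercised by the test case in this patch.  A minimal
portable sketch, assuming a hypothetical fixed NR_CPUS and a hypothetical
struct percpu_template standing in for the .data..percpu section:

```c
#include <stddef.h>

/* Hypothetical CPU count, chosen only for this sketch.  */
enum { NR_CPUS = 4 };

/* Hypothetical stand-in for the layout of .data..percpu: each member
   plays the role of one per-CPU variable.  */
struct percpu_template
{
  int counter;
  long stats;
};

/* One copy of the template per CPU.  In the kernel these copies are
   allocated at run time, far away from the template itself.  */
static struct percpu_template percpu_area[NR_CPUS];

/* Resolve a per-CPU variable as "base of this CPU's area + offset of
   the variable inside the template".  */
static int *
this_cpu_counter (unsigned cpu)
{
  char *base = (char *) &percpu_area[cpu];
  return (int *) (base + offsetof (struct percpu_template, counter));
}

static void
inc_counter (unsigned cpu)
{
  (*this_cpu_counter (cpu))++;
}
```

Calling inc_counter (1) twice touches only percpu_area[1].counter.  The
sketch uses offsetof where the kernel uses the address of the symbol
itself, so the compiler must not assume that address lies near the code.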
  
chenglulu Aug. 9, 2022, 1:03 p.m. UTC | #20
On 2022/8/9 7:30 PM, Xi Ruoyao wrote:
> Sorry for late reply, I'm rebuilding my entire Linux system (from
> scratch) for Glibc-2.36 and Binutils-2.39 update and I just reached the
> mail client.
>
> On Mon, 2022-08-08 at 12:53 +0800, Lulu Cheng wrote:
>> I still think it makes a little bit more sense to put attribute(model)
>> and -mcmodel together.
>>
>> -mcmodel sets the access range of all symbols in a single file, and
>> attribute (model) sets the access range of a single symbol in a file.
>> For example __attribute__((model(normal/large/extreme))).
> It might make sense, but then it would not be what we want for per-CPU
> symbols.  What we want here is "treat a local symbol as-if it's global",
> while each code model may already treat local symbol and global symbol
> differently.
>
> Disambiguation: here "local" means "defined in this TU", "global"
> otherwise (not "local variable" in C).
>
> I'll send v6 with the name "addr_global" if no objection.
>
I am implementing the cmodel=extreme mode. In this mode the relative 
offset is a signed 64-bit value, so it can solve the access problem for 
the kernel's per-CPU variables.

So I wonder whether it is still necessary to add another attribute like 
addr_global?
  
Xi Ruoyao Aug. 9, 2022, 2:04 p.m. UTC | #21
On Tue, 2022-08-09 at 21:03 +0800, Lulu Cheng wrote:
> On 2022/8/9 7:30 PM, Xi Ruoyao wrote:
> > Sorry for late reply, I'm rebuilding my entire Linux system (from
> > scratch) for Glibc-2.36 and Binutils-2.39 update and I just reached the
> > mail client.
> > 
> > On Mon, 2022-08-08 at 12:53 +0800, Lulu Cheng wrote:
> > > I still think it makes a little bit more sense to put attribute(model)
> > > and -mcmodel together.
> > > 
> > > -mcmodel sets the access range of all symbols in a single file, and
> > > attribute (model) sets the access range of a single symbol in a file.
> > > For example __attribute__((model(normal/large/extreme))).
> > It might make sense, but then it would not be what we want for per-CPU
> > symbols.  What we want here is "treat a local symbol as-if it's global",
> > while each code model may already treat local symbol and global symbol
> > differently.
> > 
> > Disambiguation: here "local" means "defined in this TU", "global"
> > otherwise (not "local variable" in C).
> > 
> > I'll send v6 with the name "addr_global" if no objection.
> > 
> I am implementing the mode of cmodel=extreme.
> In this mode, the value of the relative offset is a signed 64-bit value,
> so this can solve the access problem of the variables of the kernel precpu.
> So I wonder if it is necessary to add another attribute like addr_global?

If we use GOT, I only need to implement the PC_HI20 and PC_LO12 relocs
in the kernel module loader. If we use extreme, I'll need to implement
4 ABS relocations along with them.

But "the less the better" is not a very strong reason anyway.
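
For illustration only: a hi20/lo12 pair splits a page-relative
displacement into a 20-bit high part and a 12-bit low part.  A minimal
sketch of that arithmetic, assuming pcalau12i-style semantics (the
page-aligned PC plus a sign-extended 20-bit immediate shifted left by 12,
followed by an add of a sign-extended 12-bit immediate).  The names
split_pcrel and rebuild are hypothetical, and this is not the kernel
module loader's code:

```c
#include <stdbool.h>
#include <stdint.h>

struct hi_lo
{
  int64_t hi20;   /* page delta, must fit in 20 signed bits */
  uint32_t lo12;  /* low 12 bits of the target address */
  bool fits;      /* false if the target is outside about +/- 2GiB */
};

/* Split TARGET into a page delta relative to PC and a low 12-bit part.
   The 0x800 bias compensates for the sign extension of the low part.  */
static struct hi_lo
split_pcrel (uint64_t pc, uint64_t target)
{
  struct hi_lo r;
  int64_t delta = (int64_t) ((target + 0x800) >> 12) - (int64_t) (pc >> 12);

  r.hi20 = delta;
  r.lo12 = (uint32_t) (target & 0xfff);
  r.fits = delta >= -(1 << 19) && delta < (1 << 19);
  return r;
}

/* Recombine the two parts the way the instruction pair would.  */
static uint64_t
rebuild (uint64_t pc, struct hi_lo r)
{
  int64_t lo = (int64_t) r.lo12;

  if (lo >= 0x800)
    lo -= 0x1000;  /* sign-extend the 12-bit immediate */
  return (pc & ~(uint64_t) 0xfff) + ((uint64_t) r.hi20 << 12) + (uint64_t) lo;
}
```

When fits is false the displacement cannot be encoded in the pair, which
corresponds to the relocation overflow this patch is meant to avoid by
going through the GOT.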
  

Patch

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 79687340dfd..6b6026700a6 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -1643,6 +1643,15 @@  loongarch_classify_symbol (const_rtx x)
       && !loongarch_symbol_binds_local_p (x))
     return SYMBOL_GOT_DISP;
 
+  if (SYMBOL_REF_P (x))
+    {
+      tree decl = SYMBOL_REF_DECL (x);
+      /* A movable symbol may be moved away from the +/- 2GiB range around
+	 the PC, so we have to use GOT.  */
+      if (decl && lookup_attribute ("movable", DECL_ATTRIBUTES (decl)))
+	return SYMBOL_GOT_DISP;
+    }
+
   return SYMBOL_PCREL;
 }
 
@@ -6068,6 +6077,54 @@  loongarch_starting_frame_offset (void)
   return crtl->outgoing_args_size;
 }
 
+static tree
+loongarch_handle_movable_attribute (tree *node, tree name, tree, int,
+				    bool *no_add_attrs)
+{
+  tree decl = *node;
+  if (TREE_CODE (decl) == VAR_DECL)
+    {
+      if (DECL_CONTEXT (decl)
+	  && TREE_CODE (DECL_CONTEXT (decl)) == FUNCTION_DECL
+	  && !TREE_STATIC (decl))
+	{
+	  error_at (DECL_SOURCE_LOCATION (decl),
+		    "%qE attribute cannot be specified for local "
+		    "variables", name);
+	  *no_add_attrs = true;
+	}
+    }
+  else
+    {
+      warning (OPT_Wattributes, "%qE attribute ignored", name);
+      *no_add_attrs = true;
+    }
+  return NULL_TREE;
+}
+
+static const struct attribute_spec loongarch_attribute_table[] =
+{
+  /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
+       affects_type_identity, handler, exclude } */
+  { "movable", 0, 0, true, false, false, false,
+    loongarch_handle_movable_attribute, NULL },
+  /* The last attribute spec is set to be NULL.  */
+  {}
+};
+
+bool
+loongarch_use_anchors_for_symbol_p (const_rtx symbol)
+{
+  tree decl = SYMBOL_REF_DECL (symbol);
+
+  /* A movable attribute indicates the linker may move the symbol away,
+     so the use of anchor may cause relocation overflow.  */
+  if (decl && lookup_attribute ("movable", DECL_ATTRIBUTES (decl)))
+    return false;
+
+  return default_use_anchors_for_symbol_p (symbol);
+}
+
 /* Initialize the GCC target structure.  */
 #undef TARGET_ASM_ALIGNED_HI_OP
 #define TARGET_ASM_ALIGNED_HI_OP "\t.half\t"
@@ -6256,6 +6313,12 @@  loongarch_starting_frame_offset (void)
 #undef  TARGET_HAVE_SPECULATION_SAFE_VALUE
 #define TARGET_HAVE_SPECULATION_SAFE_VALUE speculation_safe_value_not_needed
 
+#undef  TARGET_ATTRIBUTE_TABLE
+#define TARGET_ATTRIBUTE_TABLE loongarch_attribute_table
+
+#undef  TARGET_USE_ANCHORS_FOR_SYMBOL_P
+#define TARGET_USE_ANCHORS_FOR_SYMBOL_P loongarch_use_anchors_for_symbol_p
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 #include "gt-loongarch.h"
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 7fe7f8817cd..322d8c05a04 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -7314,6 +7314,7 @@  attributes.
 * Blackfin Variable Attributes::
 * H8/300 Variable Attributes::
 * IA-64 Variable Attributes::
+* LoongArch Variable Attributes::
 * M32R/D Variable Attributes::
 * MeP Variable Attributes::
 * Microsoft Windows Variable Attributes::
@@ -8098,6 +8099,21 @@  defined by shared libraries.
 
 @end table
 
+@node LoongArch Variable Attributes
+@subsection LoongArch Variable Attributes
+
+One attribute is currently defined for the LoongArch.
+
+@table @code
+@item movable
+@cindex @code{movable} variable attribute, LoongArch
+Use this attribute on the LoongArch to mark an object possible to be moved
+by the linker, so its address is unlimited by the local data section range
+specified by the code model even if the object is defined locally.  This
+attribute is mostly useful if a @code{section} attribute and/or a linker
+script will move the object somewhere unexpected by the code model.
+@end table
+
 @node M32R/D Variable Attributes
 @subsection M32R/D Variable Attributes
 
diff --git a/gcc/testsuite/gcc.target/loongarch/attr-movable.c b/gcc/testsuite/gcc.target/loongarch/attr-movable.c
new file mode 100644
index 00000000000..85b1dd4c59a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/attr-movable.c
@@ -0,0 +1,29 @@ 
+/* { dg-do compile } */
+/* { dg-options "-mexplicit-relocs -mcmodel=normal -O2" } */
+/* { dg-final { scan-assembler-not "%pc" } } */
+/* { dg-final { scan-assembler-times "%got_pc_hi20" 3 } } */
+
+/* movable attribute should mark x and y possibly outside of the local
+   data range defined by the code model, so GOT should be used instead of
+   PC-relative.  */
+
+int x __attribute__((movable));
+int y __attribute__((movable));
+
+int
+test(void)
+{
+  return x + y;
+}
+
+/* The following will be used for kernel per-cpu storage implementation. */
+
+register char *per_cpu_base __asm__("r21");
+static int counter __attribute__((section(".data..percpu"), movable));
+
+void
+inc_counter(void)
+{
+  int *ptr = (int *)(per_cpu_base + (long)&counter);
+  (*ptr)++;
+}