[v2] RISC-V: Implement TLS Descriptors.

Message ID 20230908104923.31154-1-ishitatsuyuki@gmail.com
State Unresolved
Headers
Series [v2] RISC-V: Implement TLS Descriptors. |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Tatsuyuki Ishi Sept. 8, 2023, 10:49 a.m. UTC
  This implements TLS Descriptors (TLSDESC) as specified in [1].

In TLSDESC instruction sequence, the first instruction relocates against
the target TLS variable, while subsequent instructions relocates against
the address of the first. Such usage of labels are not well-supported
within GCC. Due to this, the 4-instruction sequence is implemented as a
single RTX insn.

The default remains to be the traditional TLS model, but can be configured
with --with_tls={trad,desc}. The choice can be revisited once toolchain
and libc support ships.

[1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373.

gcc/Changelog:
    * config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
    * config.gcc: Add --with_tls configuration option to change the default
    TLS flavor.
    * config/riscv/riscv.h: Add TARGET_TLSDESC determined from
    -mtls-dialect and with_tls defaults.
    * config/riscv/riscv-opts.h: Define enum riscv_tls_type for the two TLS
    flavors.
    * config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
    * config/riscv/riscv.md: Add instruction sequence for TLSDESC.
    * config/riscv/riscv.cc (riscv_symbol_insns): Add instruction sequence
    length data for TLSDESC.
    (riscv_legitimize_tls_address): Add lowering of TLSDESC.
---
No regression in gcc tests for rv64gc, tested alongside the binutils and
glibc implementation. Tested with --with_tls=desc.

v2: Add with_tls configuration option, and a few readability improvements.
    Added Changelog.
    Thanks Kito Cheng for the review.

This contribution is made on behalf of Blue Whale Systems, which has
copyright assignment on file with the FSF.

 gcc/config.gcc                  | 15 ++++++++++++++-
 gcc/config/riscv/riscv-opts.h   |  6 ++++++
 gcc/config/riscv/riscv-protos.h |  5 +++--
 gcc/config/riscv/riscv.cc       | 24 ++++++++++++++++++++----
 gcc/config/riscv/riscv.h        |  9 +++++++--
 gcc/config/riscv/riscv.md       | 21 ++++++++++++++++++++-
 gcc/config/riscv/riscv.opt      | 14 ++++++++++++++
 7 files changed, 84 insertions(+), 10 deletions(-)
  

Comments

Kito Cheng Oct. 2, 2023, 2:10 p.m. UTC | #1
Just one nit and one more comment for doc:

Could you add some doc something like that? mostly I grab from other
target, so you can just included in the patch.

diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
index 31f2234640f..39396668da2 100644
--- a/gcc/doc/install.texi
+++ b/gcc/doc/install.texi
@@ -1174,6 +1174,9 @@ Specify the default TLS dialect, for systems
were there is a choice.
For ARM targets, possible values for @var{dialect} are @code{gnu} or
@code{gnu2}, which select between the original GNU dialect and the GNU TLS
descriptor-based dialect.
+For RISC-V targets, possible values for @var{dialect} are @code{trad} or
+@code{desc}, which select between the traditional GNU dialect and the GNU TLS
+descriptor-based dialect.

@item --enable-multiarch
Specify whether to enable or disable multiarch support.  The default is
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 4085fc90907..459e266d426 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1239,7 +1239,8 @@ See RS/6000 and PowerPC Options.
-minline-atomics  -mno-inline-atomics
-minline-strlen  -mno-inline-strlen
-minline-strcmp  -mno-inline-strcmp
--minline-strncmp  -mno-inline-strncmp}
+-minline-strncmp  -mno-inline-strncmp
+-mtls-dialect=desc  -mtls-dialect=trad}

@emph{RL78 Options}
@gccoptlist{-msim  -mmul=none  -mmul=g13  -mmul=g14  -mallregs
@@ -29538,6 +29539,17 @@ which register to use as base register for
reading the canary,
and from what offset from that base register. There is no default
register or offset as this is entirely for use within the Linux
kernel.
+
+@opindex mtls-dialect=desc
+@item -mtls-dialect=desc
+Use TLS descriptors as the thread-local storage mechanism for dynamic accesses
+of TLS variables.  This is the default.
+
+@opindex mtls-dialect=trad
+@item -mtls-dialect=traditional
+Use traditional TLS as the thread-local storage mechanism for dynamic accesses
+of TLS variables.
+
@end table

@node RL78 Options




> +(define_insn "@tlsdesc<mode>"
> +  [(set (reg:P A0_REGNUM)
> +           (unspec:P
> +                       [(match_operand:P 0 "symbolic_operand" "")
> +                        (match_operand:P 1 "const_int_operand")]
> +                       UNSPEC_TLSDESC))
> +   (clobber (reg:SI T0_REGNUM))]

P rather than SI here.

> +  "TARGET_TLSDESC"
> +  {
> +    return ".LT%1: auipc\ta0, %%tlsdesc_hi(%0)\;"
> +           "<load>\tt0,%%tlsdesc_load_lo(.LT%1)(a0)\;"
> +           "addi\ta0,a0,%%tlsdesc_add_lo(.LT%1)\;"
> +           "jalr\tt0,t0,%%tlsdesc_call(.LT%1)";
> +  }
> +  [(set_attr "type" "multi")
> +   (set_attr "length" "16")
> +   (set_attr "mode" "<MODE>")])
> +
>  (define_insn "auipc<mode>"
>    [(set (match_operand:P           0 "register_operand" "=r")
>         (unspec:P
  
Jeff Law Nov. 16, 2023, 1:07 a.m. UTC | #2
On 9/8/23 04:49, Tatsuyuki Ishi via Gcc-patches wrote:
> This implements TLS Descriptors (TLSDESC) as specified in [1].
> 
> In TLSDESC instruction sequence, the first instruction relocates against
> the target TLS variable, while subsequent instructions relocates against
> the address of the first. Such usage of labels are not well-supported
> within GCC. Due to this, the 4-instruction sequence is implemented as a
> single RTX insn.
> 
> The default remains to be the traditional TLS model, but can be configured
> with --with_tls={trad,desc}. The choice can be revisited once toolchain
> and libc support ships.
> 
> [1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373.
> 
> gcc/Changelog:
>      * config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
>      * config.gcc: Add --with_tls configuration option to change the default
>      TLS flavor.
>      * config/riscv/riscv.h: Add TARGET_TLSDESC determined from
>      -mtls-dialect and with_tls defaults.
>      * config/riscv/riscv-opts.h: Define enum riscv_tls_type for the two TLS
>      flavors.
>      * config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
>      * config/riscv/riscv.md: Add instruction sequence for TLSDESC.
>      * config/riscv/riscv.cc (riscv_symbol_insns): Add instruction sequence
>      length data for TLSDESC.
>      (riscv_legitimize_tls_address): Add lowering of TLSDESC.
> ---

> @@ -4694,6 +4696,17 @@ case "${target}" in
>   				;;
>   			esac
>   		fi
> +		# Handle --with-tls.
> +		case "$with_tls" in
> +        "" \
> +        | trad | desc)
> +            # OK
> +            ;;
> +        *)
> +            echo "Unknown TLS method used in --with-tls=$with_tls" 1>&2
> +            exit 1
> +            ;;
> +        esac
Is there a reason why this isn't formatted like the other cases?




> @@ -1869,6 +1870,24 @@
>     [(set_attr "got" "load")
>      (set_attr "mode" "<MODE>")])
>   
> +(define_insn "@tlsdesc<mode>"
> +  [(set (reg:P A0_REGNUM)
> +	    (unspec:P
> +			[(match_operand:P 0 "symbolic_operand" "")
> +			 (match_operand:P 1 "const_int_operand")]
> +			UNSPEC_TLSDESC))
> +   (clobber (reg:SI T0_REGNUM))]
> +  "TARGET_TLSDESC"
> +  {
> +    return ".LT%1: auipc\ta0, %%tlsdesc_hi(%0)\;"
> +           "<load>\tt0,%%tlsdesc_load_lo(.LT%1)(a0)\;"
> +           "addi\ta0,a0,%%tlsdesc_add_lo(.LT%1)\;"
> +           "jalr\tt0,t0,%%tlsdesc_call(.LT%1)";
> +  }
> +  [(set_attr "type" "multi")
> +   (set_attr "length" "16")
> +   (set_attr "mode" "<MODE>")])
Hmm, I would be a bit worried about explicitly using $a0 here.  That's 
generally frowned upon, but probably unavoidable in this case since this 
is a call under the hood.


This needs changes to invoke.texi since it introduces new options.  I 
don't think it has to be anything terribly verbose.  A one liner is 
probably sufficient and I wouldn't be surprised if other ports have 
suitable text we could copy.

So overall if Kito's OK, then I am with the trivial doc change and 
perhaps the formatting fix in config.guess.

jeff
  
Fangrui Song Nov. 16, 2023, 1:17 a.m. UTC | #3
On Mon, Oct 2, 2023 at 7:10 AM Kito Cheng <kito.cheng@gmail.com> wrote:
>
> Just one nit and one more comment for doc:
>
> Could you add some doc something like that? mostly I grab from other
> target, so you can just included in the patch.
>
> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
> index 31f2234640f..39396668da2 100644
> --- a/gcc/doc/install.texi
> +++ b/gcc/doc/install.texi
> @@ -1174,6 +1174,9 @@ Specify the default TLS dialect, for systems
> were there is a choice.
> For ARM targets, possible values for @var{dialect} are @code{gnu} or
> @code{gnu2}, which select between the original GNU dialect and the GNU TLS
> descriptor-based dialect.
> +For RISC-V targets, possible values for @var{dialect} are @code{trad} or
> +@code{desc}, which select between the traditional GNU dialect and the GNU TLS
> +descriptor-based dialect.
>
> @item --enable-multiarch
> Specify whether to enable or disable multiarch support.  The default is
> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
> index 4085fc90907..459e266d426 100644
> --- a/gcc/doc/invoke.texi
> +++ b/gcc/doc/invoke.texi
> @@ -1239,7 +1239,8 @@ See RS/6000 and PowerPC Options.
> -minline-atomics  -mno-inline-atomics
> -minline-strlen  -mno-inline-strlen
> -minline-strcmp  -mno-inline-strcmp
> --minline-strncmp  -mno-inline-strncmp}
> +-minline-strncmp  -mno-inline-strncmp
> +-mtls-dialect=desc  -mtls-dialect=trad}
>
> @emph{RL78 Options}
> @gccoptlist{-msim  -mmul=none  -mmul=g13  -mmul=g14  -mallregs
> @@ -29538,6 +29539,17 @@ which register to use as base register for
> reading the canary,
> and from what offset from that base register. There is no default
> register or offset as this is entirely for use within the Linux
> kernel.
> +
> +@opindex mtls-dialect=desc
> +@item -mtls-dialect=desc
> +Use TLS descriptors as the thread-local storage mechanism for dynamic accesses
> +of TLS variables.  This is the default.
> +
> +@opindex mtls-dialect=trad
> +@item -mtls-dialect=traditional

-mtls-dialect=trad.
aarch64-linux-gnu-gcc doesn't support -mtls-dialect=traditional

> +Use traditional TLS as the thread-local storage mechanism for dynamic accesses
> +of TLS variables.
> +
> @end table

This is the default :)

I am happy that we change the default like AArch64, but probably not
now when linker support is not widely available yet.

I cannot comment on the code side as I am not familiar with GCC internals.

> @node RL78 Options
>
>
>
>
> > +(define_insn "@tlsdesc<mode>"
> > +  [(set (reg:P A0_REGNUM)
> > +           (unspec:P
> > +                       [(match_operand:P 0 "symbolic_operand" "")
> > +                        (match_operand:P 1 "const_int_operand")]
> > +                       UNSPEC_TLSDESC))
> > +   (clobber (reg:SI T0_REGNUM))]
>
> P rather than SI here.
>
> > +  "TARGET_TLSDESC"
> > +  {
> > +    return ".LT%1: auipc\ta0, %%tlsdesc_hi(%0)\;"
> > +           "<load>\tt0,%%tlsdesc_load_lo(.LT%1)(a0)\;"
> > +           "addi\ta0,a0,%%tlsdesc_add_lo(.LT%1)\;"
> > +           "jalr\tt0,t0,%%tlsdesc_call(.LT%1)";
> > +  }
> > +  [(set_attr "type" "multi")
> > +   (set_attr "length" "16")
> > +   (set_attr "mode" "<MODE>")])
> > +
> >  (define_insn "auipc<mode>"
> >    [(set (match_operand:P           0 "register_operand" "=r")
> >         (unspec:P

It seems that x86-64 supports non-adjacent code sequence. Writing the
pattern this way does not allow interleaving, but I assume
interleaving doesn't enable much.
https://reviews.llvm.org/D114416
  
Tatsuyuki Ishi Nov. 16, 2023, 1:39 a.m. UTC | #4
> On Nov 16, 2023, at 10:17, Fangrui Song <maskray@google.com> wrote:
> 
> On Mon, Oct 2, 2023 at 7:10 AM Kito Cheng <kito.cheng@gmail.com <mailto:kito.cheng@gmail.com>> wrote:
>> 
>> Just one nit and one more comment for doc:
>> 
>> Could you add some doc something like that? mostly I grab from other
>> target, so you can just included in the patch.
>> 
>> diff --git a/gcc/doc/install.texi b/gcc/doc/install.texi
>> index 31f2234640f..39396668da2 100644
>> --- a/gcc/doc/install.texi
>> +++ b/gcc/doc/install.texi
>> @@ -1174,6 +1174,9 @@ Specify the default TLS dialect, for systems
>> were there is a choice.
>> For ARM targets, possible values for @var{dialect} are @code{gnu} or
>> @code{gnu2}, which select between the original GNU dialect and the GNU TLS
>> descriptor-based dialect.
>> +For RISC-V targets, possible values for @var{dialect} are @code{trad} or
>> +@code{desc}, which select between the traditional GNU dialect and the GNU TLS
>> +descriptor-based dialect.
>> 
>> @item --enable-multiarch
>> Specify whether to enable or disable multiarch support.  The default is
>> diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
>> index 4085fc90907..459e266d426 100644
>> --- a/gcc/doc/invoke.texi
>> +++ b/gcc/doc/invoke.texi
>> @@ -1239,7 +1239,8 @@ See RS/6000 and PowerPC Options.
>> -minline-atomics  -mno-inline-atomics
>> -minline-strlen  -mno-inline-strlen
>> -minline-strcmp  -mno-inline-strcmp
>> --minline-strncmp  -mno-inline-strncmp}
>> +-minline-strncmp  -mno-inline-strncmp
>> +-mtls-dialect=desc  -mtls-dialect=trad}
>> 
>> @emph{RL78 Options}
>> @gccoptlist{-msim  -mmul=none  -mmul=g13  -mmul=g14  -mallregs
>> @@ -29538,6 +29539,17 @@ which register to use as base register for
>> reading the canary,
>> and from what offset from that base register. There is no default
>> register or offset as this is entirely for use within the Linux
>> kernel.
>> +
>> +@opindex mtls-dialect=desc
>> +@item -mtls-dialect=desc
>> +Use TLS descriptors as the thread-local storage mechanism for dynamic accesses
>> +of TLS variables.  This is the default.
>> +
>> +@opindex mtls-dialect=trad
>> +@item -mtls-dialect=traditional
> 
> -mtls-dialect=trad.
> aarch64-linux-gnu-gcc doesn't support -mtls-dialect=traditional
> 
>> +Use traditional TLS as the thread-local storage mechanism for dynamic accesses
>> +of TLS variables.
>> +
>> @end table
> 
> This is the default :)
> 
> I am happy that we change the default like AArch64, but probably not
> now when linker support is not widely available yet.
> 
> I cannot comment on the code side as I am not familiar with GCC internals.
> 
>> @node RL78 Options
>> 
>> 
>> 
>> 
>>> +(define_insn "@tlsdesc<mode>"
>>> +  [(set (reg:P A0_REGNUM)
>>> +           (unspec:P
>>> +                       [(match_operand:P 0 "symbolic_operand" "")
>>> +                        (match_operand:P 1 "const_int_operand")]
>>> +                       UNSPEC_TLSDESC))
>>> +   (clobber (reg:SI T0_REGNUM))]
>> 
>> P rather than SI here.
>> 
>>> +  "TARGET_TLSDESC"
>>> +  {
>>> +    return ".LT%1: auipc\ta0, %%tlsdesc_hi(%0)\;"
>>> +           "<load>\tt0,%%tlsdesc_load_lo(.LT%1)(a0)\;"
>>> +           "addi\ta0,a0,%%tlsdesc_add_lo(.LT%1)\;"
>>> +           "jalr\tt0,t0,%%tlsdesc_call(.LT%1)";
>>> +  }
>>> +  [(set_attr "type" "multi")
>>> +   (set_attr "length" "16")
>>> +   (set_attr "mode" "<MODE>")])
>>> +
>>> (define_insn "auipc<mode>"
>>>   [(set (match_operand:P           0 "register_operand" "=r")
>>>        (unspec:P
> 
> It seems that x86-64 supports non-adjacent code sequence. Writing the
> pattern this way does not allow interleaving, but I assume
> interleaving doesn't enable much.
> https://reviews.llvm.org/D114416

As mentioned in the commit message, the use of relaxation-only labels does not seem well supported in current GCC. Creating a label seems to force a basic block and I’m not sure how we can avoid it.

If there’s a better way to implement this I’m happy to adopt.

Tatsuyuki.

> 
> -- 
> 宋方睿
  
Tatsuyuki Ishi Nov. 16, 2023, 1:51 a.m. UTC | #5
> On Nov 16, 2023, at 10:07, Jeff Law <jeffreyalaw@gmail.com> wrote:
> 
> 
> 
> On 9/8/23 04:49, Tatsuyuki Ishi via Gcc-patches wrote:
>> This implements TLS Descriptors (TLSDESC) as specified in [1].
>> In TLSDESC instruction sequence, the first instruction relocates against
>> the target TLS variable, while subsequent instructions relocates against
>> the address of the first. Such usage of labels are not well-supported
>> within GCC. Due to this, the 4-instruction sequence is implemented as a
>> single RTX insn.
>> The default remains to be the traditional TLS model, but can be configured
>> with --with_tls={trad,desc}. The choice can be revisited once toolchain
>> and libc support ships.
>> [1]: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/373.
>> gcc/Changelog:
>>     * config/riscv/riscv.opt: Add -mtls-dialect to configure TLS flavor.
>>     * config.gcc: Add --with_tls configuration option to change the default
>>     TLS flavor.
>>     * config/riscv/riscv.h: Add TARGET_TLSDESC determined from
>>     -mtls-dialect and with_tls defaults.
>>     * config/riscv/riscv-opts.h: Define enum riscv_tls_type for the two TLS
>>     flavors.
>>     * config/riscv/riscv-protos.h: Define SYMBOL_TLSDESC symbol type.
>>     * config/riscv/riscv.md: Add instruction sequence for TLSDESC.
>>     * config/riscv/riscv.cc (riscv_symbol_insns): Add instruction sequence
>>     length data for TLSDESC.
>>     (riscv_legitimize_tls_address): Add lowering of TLSDESC.
>> ---
> 
>> @@ -4694,6 +4696,17 @@ case "${target}" in
>>  				;;
>>  			esac
>>  		fi
>> +		# Handle --with-tls.
>> +		case "$with_tls" in
>> +        "" \
>> +        | trad | desc)
>> +            # OK
>> +            ;;
>> +        *)
>> +            echo "Unknown TLS method used in --with-tls=$with_tls" 1>&2
>> +            exit 1
>> +            ;;
>> +        esac
> Is there a reason why this isn't formatted like the other cases?

Sorry, this was an oversight. I’ll fix it in the next version.

> 
>> @@ -1869,6 +1870,24 @@
>>    [(set_attr "got" "load")
>>     (set_attr "mode" "<MODE>")])
>>  +(define_insn "@tlsdesc<mode>"
>> +  [(set (reg:P A0_REGNUM)
>> +	    (unspec:P
>> +			[(match_operand:P 0 "symbolic_operand" "")
>> +			 (match_operand:P 1 "const_int_operand")]
>> +			UNSPEC_TLSDESC))
>> +   (clobber (reg:SI T0_REGNUM))]
>> +  "TARGET_TLSDESC"
>> +  {
>> +    return ".LT%1: auipc\ta0, %%tlsdesc_hi(%0)\;"
>> +           "<load>\tt0,%%tlsdesc_load_lo(.LT%1)(a0)\;"
>> +           "addi\ta0,a0,%%tlsdesc_add_lo(.LT%1)\;"
>> +           "jalr\tt0,t0,%%tlsdesc_call(.LT%1)";
>> +  }
>> +  [(set_attr "type" "multi")
>> +   (set_attr "length" "16")
>> +   (set_attr "mode" "<MODE>")])
> Hmm, I would be a bit worried about explicitly using $a0 here.  That's generally frowned upon, but probably unavoidable in this case since this is a call under the hood.

Based on what I have read in the AArch64 backend, there are two ways to do this: introduce a custom calling convention, or put in a RTX insn that covers the whole sequence. Ideally we should do the first, but then there’s the label issue and it’s quite a bit more complicated. So I’m sticking with this for now.

> This needs changes to invoke.texi since it introduces new options.  I don't think it has to be anything terribly verbose.  A one liner is probably sufficient and I wouldn't be surprised if other ports have suitable text we could copy.

Ack.

> So overall if Kito's OK, then I am with the trivial doc change and perhaps the formatting fix in config.guess.

Sorry for all the delay on this. My progress has been (and still) blocked on supporting relaxation of TLSDESC in binutils (turns out you can’t run static binaries without relaxing it first). But that doesn’t seem exactly easy to do either, because relaxation that involves GOT elimination isn’t something we have in the RISC-V backend.

I’ll try to send a new version of this patch and get this unblocked on GCC side first.
Presumably this still needs the associated gas and ld support in place, so let me know if you want to merge this soon. I will ask on binutils for whether they could accept the basic part of the implementation without relaxation first.

Thanks,
Tatsuyuki. 

> jeff
  
Jeff Law Nov. 16, 2023, 5:18 a.m. UTC | #6
On 11/15/23 18:17, Fangrui Song wrote:

> 
> It seems that x86-64 supports non-adjacent code sequence. Writing the
> pattern this way does not allow interleaving, but I assume
> interleaving doesn't enable much.
It's of marginal benefit.  We could always split them before scheduling 
if it turned out to matter -- but given nobody's using this stuff yet, 
my inclination is to wait on that kind of micro-optimization.

jeff
  
Jeff Law Nov. 16, 2023, 5:21 a.m. UTC | #7
On 11/15/23 18:39, Tatsuyuki Ishi wrote:

> 
> As mentioned in the commit message, the use of relaxation-only labels 
> does not seem well supported in current GCC. Creating a label seems to 
> force a basic block and I’m not sure how we can avoid it.
> 
> If there’s a better way to implement this I’m happy to adopt.
In general, yes creating a label in the IL is going to create a new 
block.  But you could emit the label as part of the auipc so that it 
doesn't really show up in the IL.  Then you have to make sure the other 
insns reference the right label, which is certainly do-able.   THere's 
also linker relaxing to worry about....

jeff
  
Jeff Law Nov. 16, 2023, 5:23 a.m. UTC | #8
On 11/15/23 18:51, Tatsuyuki Ishi wrote:
>> On Nov 16, 2023, at 10:07, Jeff Law <jeffreyalaw@gmail.com> wrote:

> 
> Based on what I have read in the AArch64 backend, there are two ways to 
> do this: introduce a custom calling convention, or put in a RTX insn 
> that covers the whole sequence. Ideally we should do the first, but then 
> there’s the label issue and it’s quite a bit more complicated. So I’m 
> sticking with this for now.
As I said, I think we're OK here.  We can always revamp as we get 
experience with the implementation -- I don't think any of the stuff 
we're talking about is an ABI change, they're just implementation details.

> 
> Sorry for all the delay on this. My progress has been (and still) 
> blocked on supporting relaxation of TLSDESC in binutils (turns out you 
> can’t run static binaries without relaxing it first). But that doesn’t 
> seem exactly easy to do either, because relaxation that involves GOT 
> elimination isn’t something we have in the RISC-V backend.
Note that binutils is due for another release in the next month or two. 
It'd certainly be helpful to have any issues there resolved in time for 
that release.

> 
> I’ll try to send a new version of this patch and get this unblocked on 
> GCC side first.
Sounds good.  We can always guard its use behind a feature test for GAS 
support.

Jeff
  
Fangrui Song Nov. 16, 2023, 5:33 a.m. UTC | #9
On Wed, Nov 15, 2023 at 9:23 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 11/15/23 18:51, Tatsuyuki Ishi wrote:
> >> On Nov 16, 2023, at 10:07, Jeff Law <jeffreyalaw@gmail.com> wrote:
>
> >
> > Based on what I have read in the AArch64 backend, there are two ways to
> > do this: introduce a custom calling convention, or put in a RTX insn
> > that covers the whole sequence. Ideally we should do the first, but then
> > there’s the label issue and it’s quite a bit more complicated. So I’m
> > sticking with this for now.
> As I said, I think we're OK here.  We can always revamp as we get
> experience with the implementation -- I don't think any of the stuff
> we're talking about is an ABI change, they're just implementation details.
>
> >
> > Sorry for all the delay on this. My progress has been (and still)
> > blocked on supporting relaxation of TLSDESC in binutils (turns out you
> > can’t run static binaries without relaxing it first). But that doesn’t
> > seem exactly easy to do either, because relaxation that involves GOT
> > elimination isn’t something we have in the RISC-V backend.
> Note that binutils is due for another release in the next month or two.
> It'd certainly be helpful to have any issues there resolved in time for
> that release.
>
> >
> > I’ll try to send a new version of this patch and get this unblocked on
> > GCC side first.
> Sounds good.  We can always guard its use behind a feature test for GAS
> support.
>
> Jeff

Agreed.


Tatsuyuki, could you also add some tests? For example

// end of https://maskray.me/blog/2021-02-14-all-about-thread-local-storage
__thread int tls0;
extern __thread int tls1;
int foo() { return ++tls0 + ++tls1; }
static __thread int tls2, tls3;
int bar() { return ++tls2 + ++tls3; }

I have used this to check rtld and linker behavior. I think we need
some `scan-assembler`.
To make it a runnable test, some assembler feature check may be
needed. Perhaps Jeff can make some suggestion or contribute code!
  
Jeff Law Nov. 16, 2023, 5:36 a.m. UTC | #10
On 11/15/23 22:33, Fangrui Song wrote:

> 
> I have used this to check rtld and linker behavior. I think we need
> some `scan-assembler`.
> To make it a runnable test, some assembler feature check may be
> needed. Perhaps Jeff can make some suggestion or contribute code!
TLS isn't really on my radar yet.  I've got about a million things to do 
on the scalar and vector optimization fronts.  The TLS stuff is just one 
of the lingering items from our Tuesday patchwork sync meetings -- 
getting it wrapped up gets it off the list of things to look at every 
week ;-)


jeff
  
Tatsuyuki Ishi Nov. 16, 2023, 5:37 a.m. UTC | #11
> On Nov 16, 2023, at 14:33, Fangrui Song <maskray@google.com> wrote:
> 
> On Wed, Nov 15, 2023 at 9:23 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>> 
>> 
>> 
>> On 11/15/23 18:51, Tatsuyuki Ishi wrote:
>>>> On Nov 16, 2023, at 10:07, Jeff Law <jeffreyalaw@gmail.com> wrote:
>> 
>>> 
>>> Based on what I have read in the AArch64 backend, there are two ways to
>>> do this: introduce a custom calling convention, or put in a RTX insn
>>> that covers the whole sequence. Ideally we should do the first, but then
>>> there’s the label issue and it’s quite a bit more complicated. So I’m
>>> sticking with this for now.
>> As I said, I think we're OK here.  We can always revamp as we get
>> experience with the implementation -- I don't think any of the stuff
>> we're talking about is an ABI change, they're just implementation details.
>> 
>>> 
>>> Sorry for all the delay on this. My progress has been (and still)
>>> blocked on supporting relaxation of TLSDESC in binutils (turns out you
>>> can’t run static binaries without relaxing it first). But that doesn’t
>>> seem exactly easy to do either, because relaxation that involves GOT
>>> elimination isn’t something we have in the RISC-V backend.
>> Note that binutils is due for another release in the next month or two.
>> It'd certainly be helpful to have any issues there resolved in time for
>> that release.
>> 
>>> 
>>> I’ll try to send a new version of this patch and get this unblocked on
>>> GCC side first.
>> Sounds good.  We can always guard its use behind a feature test for GAS
>> support.
>> 
>> Jeff
> 
> Agreed.
> 
> 
> Tatsuyuki, could you also add some tests? For example
> 
> // end of https://maskray.me/blog/2021-02-14-all-about-thread-local-storage
> __thread int tls0;
> extern __thread int tls1;
> int foo() { return ++tls0 + ++tls1; }
> static __thread int tls2, tls3;
> int bar() { return ++tls2 + ++tls3; }
> 
> I have used this to check rtld and linker behavior. I think we need
> some `scan-assembler`.
> To make it a runnable test, some assembler feature check may be
> needed. Perhaps Jeff can make some suggestion or contribute code!
> 

I believe there’s existing platform-generic TLS coverage in gcc/testsuite/gcc.dg/torture/tls. GCC's test suite seems pretty sparse, but a lot more testing is done by glibc’s testsuite (which is also where I found the static TLS relaxation issue).

Tatsuyuki.

> 
> -- 
> 宋方睿
  

Patch

diff --git a/gcc/config.gcc b/gcc/config.gcc
index 415e0e1ebc5..86df7b89016 100644
--- a/gcc/config.gcc
+++ b/gcc/config.gcc
@@ -2434,6 +2434,7 @@  riscv*-*-linux*)
 	# Force .init_array support.  The configure script cannot always
 	# automatically detect that GAS supports it, yet we require it.
 	gcc_cv_initfini_array=yes
+	with_tls=${with_tls:-trad}
 	;;
 riscv*-*-elf* | riscv*-*-rtems*)
 	tm_file="elfos.h newlib-stdint.h ${tm_file} riscv/elf.h"
@@ -2476,6 +2477,7 @@  riscv*-*-freebsd*)
 	# Force .init_array support.  The configure script cannot always
 	# automatically detect that GAS supports it, yet we require it.
 	gcc_cv_initfini_array=yes
+	with_tls=${with_tls:-trad}
 	;;
 
 loongarch*-*-linux*)
@@ -4566,7 +4568,7 @@  case "${target}" in
 		;;
 
 	riscv*-*-*)
-		supported_defaults="abi arch tune riscv_attribute isa_spec"
+		supported_defaults="abi arch tune riscv_attribute isa_spec tls"
 
 		case "${target}" in
 		riscv-* | riscv32*) xlen=32 ;;
@@ -4694,6 +4696,17 @@  case "${target}" in
 				;;
 			esac
 		fi
+		# Handle --with-tls.
+		case "$with_tls" in
+        "" \
+        | trad | desc)
+            # OK
+            ;;
+        *)
+            echo "Unknown TLS method used in --with-tls=$with_tls" 1>&2
+            exit 1
+            ;;
+        esac
 
 		# Handle --with-multilib-list.
 		if test "x${with_multilib_list}" != xdefault; then
diff --git a/gcc/config/riscv/riscv-opts.h b/gcc/config/riscv/riscv-opts.h
index 378a17699cd..db03f35430a 100644
--- a/gcc/config/riscv/riscv-opts.h
+++ b/gcc/config/riscv/riscv-opts.h
@@ -319,4 +319,10 @@  enum riscv_entity
 #define TARGET_VECTOR_VLS                                                      \
   (TARGET_VECTOR && riscv_autovec_preference == RVV_SCALABLE)
 
+/* TLS types.  */
+enum riscv_tls_type {
+  TLS_TRADITIONAL,
+  TLS_DESCRIPTORS
+};
+
 #endif /* ! GCC_RISCV_OPTS_H */
diff --git a/gcc/config/riscv/riscv-protos.h b/gcc/config/riscv/riscv-protos.h
index 472c00dc439..9b7471f7591 100644
--- a/gcc/config/riscv/riscv-protos.h
+++ b/gcc/config/riscv/riscv-protos.h
@@ -33,9 +33,10 @@  enum riscv_symbol_type {
   SYMBOL_TLS,
   SYMBOL_TLS_LE,
   SYMBOL_TLS_IE,
-  SYMBOL_TLS_GD
+  SYMBOL_TLS_GD,
+  SYMBOL_TLSDESC,
 };
-#define NUM_SYMBOL_TYPES (SYMBOL_TLS_GD + 1)
+#define NUM_SYMBOL_TYPES (SYMBOL_TLSDESC + 1)
 
 /* Classifies an address.
 
diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 49062bef9fc..c158e224aaa 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -799,6 +799,7 @@  static int riscv_symbol_insns (enum riscv_symbol_type type)
     case SYMBOL_ABSOLUTE: return 2; /* LUI + the reference.  */
     case SYMBOL_PCREL: return 2; /* AUIPC + the reference.  */
     case SYMBOL_TLS_LE: return 3; /* LUI + ADD TP + the reference.  */
+    case SYMBOL_TLSDESC: return 6; /* 4-instruction call + ADD TP + the reference.  */
     case SYMBOL_GOT_DISP: return 3; /* AUIPC + LD GOT + the reference.  */
     default: gcc_unreachable ();
     }
@@ -1734,7 +1735,7 @@  riscv_call_tls_get_addr (rtx sym, rtx result)
 static rtx
 riscv_legitimize_tls_address (rtx loc)
 {
-  rtx dest, tp, tmp;
+  rtx dest, tp, tmp, a0;
   enum tls_model model = SYMBOL_REF_TLS_MODEL (loc);
 
 #if 0
@@ -1750,9 +1751,24 @@  riscv_legitimize_tls_address (rtx loc)
       /* Rely on section anchors for the optimization that LDM TLS
 	 provides.  The anchor's address is loaded with GD TLS. */
     case TLS_MODEL_GLOBAL_DYNAMIC:
-      tmp = gen_rtx_REG (Pmode, GP_RETURN);
-      dest = gen_reg_rtx (Pmode);
-      emit_libcall_block (riscv_call_tls_get_addr (loc, tmp), dest, tmp, loc);
+      if (TARGET_TLSDESC)
+	{
+	  static unsigned seqno;
+	  tp = gen_rtx_REG (Pmode, THREAD_POINTER_REGNUM);
+	  a0 = gen_rtx_REG (Pmode, GP_ARG_FIRST);
+	  dest = gen_reg_rtx (Pmode);
+
+	  emit_insn (gen_tlsdesc (Pmode, loc, GEN_INT (seqno)));
+	  emit_insn (gen_add3_insn (dest, a0, tp));
+	  seqno++;
+	}
+      else
+	{
+	  tmp = gen_rtx_REG (Pmode, GP_RETURN);
+	  dest = gen_reg_rtx (Pmode);
+	  emit_libcall_block (riscv_call_tls_get_addr (loc, tmp), dest, tmp,
+			      loc);
+	}
       break;
 
     case TLS_MODEL_INITIAL_EXEC:
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index e18a0081297..faea78f5f4c 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -59,6 +59,7 @@  extern const char *riscv_multi_lib_check (int argc, const char **argv);
    --with-abi is ignored if -mabi is specified.
    --with-tune is ignored if -mtune or -mcpu is specified.
    --with-isa-spec is ignored if -misa-spec is specified.
+   --with-tls is ignored if -mtls-dialect is specified.
 
    But using default -march/-mtune value if -mcpu don't have valid option.  */
 #define OPTION_DEFAULT_SPECS \
@@ -68,8 +69,9 @@  extern const char *riscv_multi_lib_check (int argc, const char **argv);
   {"arch", "%{!march=*:"						\
 	   "  %{!mcpu=*:-march=%(VALUE)}"				\
 	   "  %{mcpu=*:%:riscv_expand_arch_from_cpu(%* %(VALUE))}}" },	\
-  {"abi", "%{!mabi=*:-mabi=%(VALUE)}" }, \
-  {"isa_spec", "%{!misa-spec=*:-misa-spec=%(VALUE)}" }, \
+  {"abi", "%{!mabi=*:-mabi=%(VALUE)}" },				\
+  {"isa_spec", "%{!misa-spec=*:-misa-spec=%(VALUE)}" },			\
+  {"tls", "%{!mtls-dialect=*:-mtls-dialect=%(VALUE)}"},         	\
 
 #ifdef IN_LIBGCC2
 #undef TARGET_64BIT
@@ -1122,4 +1124,7 @@  extern void riscv_remove_unneeded_save_restore_calls (void);
 #define OPTIMIZE_MODE_SWITCHING(ENTITY) (TARGET_VECTOR)
 #define NUM_MODES_FOR_MODE_SWITCHING {VXRM_MODE_NONE, riscv_vector::FRM_NONE}
 
+/* Check TLS Descriptors mechanism is selected.  */
+#define TARGET_TLSDESC (riscv_tls_dialect == TLS_DESCRIPTORS)
+
 #endif /* ! GCC_RISCV_H */
diff --git a/gcc/config/riscv/riscv.md b/gcc/config/riscv/riscv.md
index b456fa6abb3..aaef9cd429b 100644
--- a/gcc/config/riscv/riscv.md
+++ b/gcc/config/riscv/riscv.md
@@ -47,7 +47,7 @@ 
   UNSPEC_TLS_LE
   UNSPEC_TLS_IE
   UNSPEC_TLS_GD
-
+  UNSPEC_TLSDESC
   ;; High part of PC-relative address.
   UNSPEC_AUIPC
 
@@ -121,6 +121,7 @@ 
    (T1_REGNUM			6)
    (S0_REGNUM			8)
    (S1_REGNUM			9)
+   (A0_REGNUM			10)
    (S2_REGNUM			18)
    (S3_REGNUM			19)
    (S4_REGNUM			20)
@@ -1869,6 +1870,24 @@ 
   [(set_attr "got" "load")
    (set_attr "mode" "<MODE>")])
 
+(define_insn "@tlsdesc<mode>"
+  [(set (reg:P A0_REGNUM)
+	    (unspec:P
+			[(match_operand:P 0 "symbolic_operand" "")
+			 (match_operand:P 1 "const_int_operand")]
+			UNSPEC_TLSDESC))
+   (clobber (reg:SI T0_REGNUM))]
+  "TARGET_TLSDESC"
+  {
+    return ".LT%1: auipc\ta0, %%tlsdesc_hi(%0)\;"
+           "<load>\tt0,%%tlsdesc_load_lo(.LT%1)(a0)\;"
+           "addi\ta0,a0,%%tlsdesc_add_lo(.LT%1)\;"
+           "jalr\tt0,t0,%%tlsdesc_call(.LT%1)";
+  }
+  [(set_attr "type" "multi")
+   (set_attr "length" "16")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "auipc<mode>"
   [(set (match_operand:P           0 "register_operand" "=r")
 	(unspec:P
diff --git a/gcc/config/riscv/riscv.opt b/gcc/config/riscv/riscv.opt
index 6304efebfd5..9ba690f8497 100644
--- a/gcc/config/riscv/riscv.opt
+++ b/gcc/config/riscv/riscv.opt
@@ -311,3 +311,17 @@  Enum(riscv_autovec_lmul) String(m8) Value(RVV_M8)
 -param=riscv-autovec-lmul=
 Target RejectNegative Joined Enum(riscv_autovec_lmul) Var(riscv_autovec_lmul) Init(RVV_M1)
 -param=riscv-autovec-lmul=<string>	Set the RVV LMUL of auto-vectorization in the RISC-V port.
+
+Enum
+Name(tls_type) Type(enum riscv_tls_type)
+The possible TLS dialects:
+
+EnumValue
+Enum(tls_type) String(trad) Value(TLS_TRADITIONAL)
+
+EnumValue
+Enum(tls_type) String(desc) Value(TLS_DESCRIPTORS)
+
+mtls-dialect=
+Target RejectNegative Joined Enum(tls_type) Var(riscv_tls_dialect) Init(TLS_TRADITIONAL) Save
+Specify TLS dialect.