Message ID | 20231201090424.854662-1-cailulu@loongson.cn |
---|---|
Headers |
From: Lulu Cai <cailulu@loongson.cn> To: binutils@sourceware.org Cc: xuchenghua@loongson.cn, chenglulu@loongson.cn, liuzhensong@loongson.cn, mengqinggang@loongson.cn, xry111@xry111.site, i.swmail@xen0n.name, maskray@google.com, luweining@loongson.cn, wanglei@loongson.cn, hejinyang@loongson.cn, Lulu Cai <cailulu@loongson.cn> Subject: [PATCH v1 0/4] LoongArch: Add support for TLS Descriptors (TLSDESC) Date: Fri, 1 Dec 2023 17:04:20 +0800 Message-Id: <20231201090424.854662-1-cailulu@loongson.cn> List-Id: Binutils mailing list <binutils.sourceware.org> |
Series |
LoongArch: Add support for TLS Descriptors (TLSDESC)
|
|
Message
Lulu Cai
Dec. 1, 2023, 9:04 a.m. UTC
The LoongArch TLS Descriptors implementation contains several points:

1. The instruction sequence is:
     pcalau12i  $a0,%desc_pc_hi20(var)          #R_LARCH_TLS_DESC_PC_HI20
     ld.d       $a1,$a0,%desc_ld_pc_lo12(var)   #R_LARCH_TLS_DESC_LD_PC_LO12
     addi.d     $a0,$a0,%desc_add_pc_lo12(var)  #R_LARCH_TLS_DESC_ADD_PC_LO12
     jirl       $ra,$a1,%desc_call(var)         #R_LARCH_TLS_DESC_CALL

   For each DESC, the linker generates an R_LARCH_TLS_DESC64 dynamic relocation,
   which is placed in .rela.dyn.
   Each TLSDESC always takes two GOT slots and one dynamic relocation entry.

2. When the same TLS variable is accessed in multiple ways, a maximum of 5 GOT
   slots are used. For example, when GD, TLSDESC, and IE are all used to access
   the same TLS variable, GD always uses the first two of the five GOT slots,
   TLSDESC uses the third and fourth, and IE uses the last.

3. TLSDESC always requires a dynamic relocation because LoongArch does not yet
   have a TLS type transition. However, statically linked programs cannot resolve
   TLSDESC's dynamic relocation, so we added a type transition for this case.
   DESC -> LE:
     pcalau12i  $a0,%desc_pc_hi20(var)          => lu12i.w $a0,%le_hi20(var)
     ld.d       $a1,$a0,%desc_ld_pc_lo12(var)   => ori $a0,$a0,%le_lo12(var)
     addi.d     $a0,$a0,%desc_add_pc_lo12(var)  => NOP
     jirl       $ra,$a1,%desc_call(var)         => NOP

4. The current code passes the gas, ld, and glibc tests.

Lulu Cai (4):
  LoongArch: Add new relocs and macro for TLSDESC.
  LoongArch: Add support for TLSDESC in ld.
  LoongArch: Add transition support for DESC to LE.
  LoongArch: Add testsuits for TLSDESC in gas and ld.
 bfd/bfd-in2.h                                 |  12 +
 bfd/elfnn-loongarch.c                         | 276 ++++++++++++++++--
 bfd/elfxx-loongarch.c                         | 209 ++++++++++++-
 bfd/libbfd.h                                  |  12 +
 bfd/reloc.c                                   |  29 ++
 gas/config/tc-loongarch.c                     |  14 +-
 gas/testsuite/gas/loongarch/tlsdesc_32.d      |  26 ++
 gas/testsuite/gas/loongarch/tlsdesc_32.s      |  12 +
 gas/testsuite/gas/loongarch/tlsdesc_64.d      |  26 ++
 gas/testsuite/gas/loongarch/tlsdesc_64.s      |  12 +
 .../gas/loongarch/tlsdesc_large_abs.d         |  21 ++
 .../gas/loongarch/tlsdesc_large_abs.s         |   9 +
 .../gas/loongarch/tlsdesc_large_pc.d          |  34 +++
 .../gas/loongarch/tlsdesc_large_pc.s          |  16 +
 include/elf/loongarch.h                       |  22 +-
 include/opcode/loongarch.h                    |   3 +
 .../ld-loongarch-elf/ld-loongarch-elf.exp     |  16 +
 ld/testsuite/ld-loongarch-elf/tls-desc.dd     |  74 +++++
 ld/testsuite/ld-loongarch-elf/tls-desc.rd     |  79 +++++
 ld/testsuite/ld-loongarch-elf/tls-desc.s      | 102 +++++++
 .../ld-loongarch-elf/tls-relax-desc-le.d      |  15 +
 .../ld-loongarch-elf/tls-relax-desc-le.s      |   8 +
 opcodes/loongarch-opc.c                       |  54 ++++
 23 files changed, 1054 insertions(+), 27 deletions(-)
 create mode 100644 gas/testsuite/gas/loongarch/tlsdesc_32.d
 create mode 100644 gas/testsuite/gas/loongarch/tlsdesc_32.s
 create mode 100644 gas/testsuite/gas/loongarch/tlsdesc_64.d
 create mode 100644 gas/testsuite/gas/loongarch/tlsdesc_64.s
 create mode 100644 gas/testsuite/gas/loongarch/tlsdesc_large_abs.d
 create mode 100644 gas/testsuite/gas/loongarch/tlsdesc_large_abs.s
 create mode 100644 gas/testsuite/gas/loongarch/tlsdesc_large_pc.d
 create mode 100644 gas/testsuite/gas/loongarch/tlsdesc_large_pc.s
 create mode 100644 ld/testsuite/ld-loongarch-elf/tls-desc.dd
 create mode 100644 ld/testsuite/ld-loongarch-elf/tls-desc.rd
 create mode 100644 ld/testsuite/ld-loongarch-elf/tls-desc.s
 create mode 100644 ld/testsuite/ld-loongarch-elf/tls-relax-desc-le.d
 create mode 100644 ld/testsuite/ld-loongarch-elf/tls-relax-desc-le.s
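
To make the DESC -> LE transition in point 3 of the cover letter more concrete, here is a minimal Python sketch of how a TP-relative offset could be split across the `lu12i.w` and `ori` immediates. It is not taken from the patch. It assumes the usual LoongArch semantics (`lu12i.w` writes a sign-extended 20-bit immediate shifted left by 12, `ori` ORs in a zero-extended 12-bit immediate) and a non-negative offset below 2^31. All function names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def split_le_offset(tp_offset):
    """Split a TP-relative offset into (hi20, lo12) fields.

    Hypothetical helper: models what %le_hi20 / %le_lo12 would have to
    carry for the lu12i.w + ori pair. Only non-negative offsets below
    2**31 are handled in this sketch.
    """
    if not 0 <= tp_offset < (1 << 31):
        raise ValueError("offset outside the range this sketch handles")
    hi20 = (tp_offset >> 12) & 0xFFFFF  # immediate for lu12i.w
    lo12 = tp_offset & 0xFFF            # immediate for ori
    return hi20, lo12


def lu12i_w(si20):
    """Model lu12i.w: sign-extend the 32-bit value {si20, 12'b0} to 64 bits."""
    value = (si20 & 0xFFFFF) << 12
    if value & (1 << 31):
        value -= 1 << 32
    return value & MASK64


def ori(rj, ui12):
    """Model ori: OR a register value with a zero-extended 12-bit immediate."""
    return (rj | (ui12 & 0xFFF)) & MASK64


def materialize(tp_offset):
    """Value left in $a0 after the relaxed two-instruction sequence."""
    hi20, lo12 = split_le_offset(tp_offset)
    return ori(lu12i_w(hi20), lo12)
```

Under these assumptions `materialize(off)` returns `off` unchanged, which is the offset the caller would then add to `$tp`. Because `ori` never borrows from the upper part, no rounding of the high immediate is needed in this variant.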
Comments
Hello,

On Dec 1, 2023, Lulu Cai <cailulu@loongson.cn> wrote:

> The LoongArch TLS Descriptors implementation contains several points:

I'm excited to see another platform gain TLS Descriptors support.

I'm not deeply acquainted with LoongArch, but I'll dare chime in.

> 1. The instruction sequence is:
>   pcalau12i  $a0,%desc_pc_hi20(var)          #R_LARCH_TLS_DESC_PC_HI20
>   ld.d       $a1,$a0,%desc_ld_pc_lo12(var)   #R_LARCH_TLS_DESC_LD_PC_LO12
>   addi.d     $a0,$a0,%desc_add_pc_lo12(var)  #R_LARCH_TLS_DESC_ADD_PC_LO12
>   jirl       $ra,$a1,%desc_call(var)         #R_LARCH_TLS_DESC_CALL

Are these instructions fixed, and supposed to appear in this sequence,
or can different registers be used, and the instructions intermixed with
other unrelated ones? The ability to intermix them for better
scheduling and register allocation was one of the guiding design
principles of TLS Descriptors, so the canonical sequence and the design
of relaxations should ideally take flexibility into account, and choose
relaxations with similar scheduling profiles.

Say, would compiler-generated or hand-coded asm still work if one used:

    pcalau12i  $a2,%desc_pc_hi20(var)          #R_LARCH_TLS_DESC_PC_HI20
    ld.d       $a3,$a2,%desc_ld_pc_lo12(var)   #R_LARCH_TLS_DESC_LD_PC_LO12
    addi.d     $a0,$a2,%desc_add_pc_lo12(var)  #R_LARCH_TLS_DESC_ADD_PC_LO12
    jirl       $ra,$a3,%desc_call(var)         #R_LARCH_TLS_DESC_CALL

or even

    pcalau12i  $a2,%desc_pc_hi20(var)          #R_LARCH_TLS_DESC_PC_HI20
    or         $a5,$a2,$r0
    or         $a6,$a2,$r0
    addi.d     $a4,$a5,%desc_add_pc_lo12(var)  #R_LARCH_TLS_DESC_ADD_PC_LO12
    ld.d       $a3,$a6,%desc_ld_pc_lo12(var)   #R_LARCH_TLS_DESC_LD_PC_LO12
    or         $a0,$a4,$r0
    jirl       $ra,$a3,%desc_call(var)         #R_LARCH_TLS_DESC_CALL

?

(I realize you seem to have not planned/implemented relaxations, aside
from the LE one for static linking, but planning for them ahead of time
helps make sure they're doable)

E.g., for IE, I'd suggest turning the latter sequence into (I'm making
up relocation names):

    pcalau12i  $a2,%gotpc_tlsoff_hi20(var)
    or         $a5,$a2,$r0  #not necessary, but not marked, so unchanged
    or         $a6,$a2,$r0
    nop
    ld.d       $a3,$a6,%gotpc_tlsoff_lo12(var)
    or         $a0,$a4,$r0  #not necessary, but not marked, so unchanged
    or         $a0,$a3,$r0

and for LE, I'd suggest:

    pcalau12i  $a2,%tlsoff_hi20(var)
    or         $a5,$a2,$r0
    or         $a6,$a2,$r0  #not necessary, but not marked, so unchanged
    addi.d     $a4,$a5,%tlsoff_lo12(var)
    nop
    or         $a0,$a4,$r0  #not necessary, but not marked, so unchanged
    nop

This addi.d is what I suggest instead of the 'ori' in the LE relaxation.
The main difference in my suggestion is that it takes the same position
as the original addi instruction, thus the very same scheduling profile,
and more importantly participates the same way in the data flow, as the
extra moves help see.

I realize that addi rather than ori may require offsetting the base
address to account for the signed rather than unsigned (I suppose)
immediate, so maybe it's not worth it. I am not sure, however, whether
you can even separate the pcalau12i hi20 instruction from the subsequent
lo12 one (ISTM that it would be challenging to match them if so,
especially if a single hi20 is reused by multiple lo12 loads), so maybe
there is less flexibility to be exploited than I'm making out.

Anyway, I hope this makes sense and that it helps,
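
The base-address adjustment mentioned in the last paragraph of this message (a signed `addi.d` immediate in place of the unsigned `ori` one) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: a 20-bit upper part shifted left by 12, a sign-extended 12-bit `addi.d` immediate, and small non-negative offsets. The helper names are hypothetical and do not come from the patch or from binutils.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def split_for_ori(offset):
    """Unsigned low part: the high part is a plain right shift."""
    return (offset >> 12) & 0xFFFFF, offset & 0xFFF


def split_for_addi(offset):
    """Signed low part: the high part absorbs the borrow.

    When bit 11 of the offset is set, the addi.d immediate is negative,
    so the upper part must be one larger than for the ori split.
    """
    lo12 = sign_extend(offset & 0xFFF, 12)
    hi20 = ((offset - lo12) >> 12) & 0xFFFFF
    return hi20, lo12


def rebuild(hi20, lo12):
    """Value produced by 'load upper' followed by an add or OR of lo12."""
    return (hi20 << 12) + lo12
```

For an offset such as `0x12FFF`, the `ori` split gives `(0x12, 0xFFF)` while the `addi.d` split gives `(0x13, -1)`. Both rebuild to the same value, but the upper immediates differ, which is why a relocation intended for `addi.d` cannot simply reuse the encoding rule used for `ori`.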
Thank you very much for your suggestions.

On 2023/12/2 12:14 AM, Alexandre Oliva wrote:
> Hello,
>
> On Dec 1, 2023, Lulu Cai <cailulu@loongson.cn> wrote:
>
>> The LoongArch TLS Descriptors implementation contains several points:
> I'm excited to see another platform gain TLS Descriptors support.
>
> I'm not deeply acquainted with LoongArch, but I'll dare chime in.
>
>> 1. The instruction sequence is:
>>   pcalau12i  $a0,%desc_pc_hi20(var)          #R_LARCH_TLS_DESC_PC_HI20
>>   ld.d       $a1,$a0,%desc_ld_pc_lo12(var)   #R_LARCH_TLS_DESC_LD_PC_LO12
>>   addi.d     $a0,$a0,%desc_add_pc_lo12(var)  #R_LARCH_TLS_DESC_ADD_PC_LO12
>>   jirl       $ra,$a1,%desc_call(var)         #R_LARCH_TLS_DESC_CALL
> Are these instructions fixed, and supposed to appear in this sequence,
> or can different registers be used, and the instructions intermixed with
> other unrelated ones? The ability to intermix them for better
> scheduling and register allocation was one of the guiding design
> principles of TLS Descriptors, so the canonical sequence and the design
> of relaxations should ideally take flexibility into account, and choose
> relaxations with similar scheduling profiles.
>
> Say, would compiler-generated or hand-coded asm still work if one used:
>
>     pcalau12i  $a2,%desc_pc_hi20(var)          #R_LARCH_TLS_DESC_PC_HI20
>     ld.d       $a3,$a2,%desc_ld_pc_lo12(var)   #R_LARCH_TLS_DESC_LD_PC_LO12
>     addi.d     $a0,$a2,%desc_add_pc_lo12(var)  #R_LARCH_TLS_DESC_ADD_PC_LO12
>     jirl       $ra,$a3,%desc_call(var)         #R_LARCH_TLS_DESC_CALL
>
> or even
>
>     pcalau12i  $a2,%desc_pc_hi20(var)          #R_LARCH_TLS_DESC_PC_HI20
>     or         $a5,$a2,$r0
>     or         $a6,$a2,$r0
>     addi.d     $a4,$a5,%desc_add_pc_lo12(var)  #R_LARCH_TLS_DESC_ADD_PC_LO12
>     ld.d       $a3,$a6,%desc_ld_pc_lo12(var)   #R_LARCH_TLS_DESC_LD_PC_LO12
>     or         $a0,$a4,$r0
>     jirl       $ra,$a3,%desc_call(var)         #R_LARCH_TLS_DESC_CALL
>
> ?

I ran a test, and both sequences still work. In this version of the patch,
however, the TLS descriptor instruction sequence is produced by expanding
la.tls.desc, so fixed registers and instructions are used.

> (I realize you seem to have not planned/implemented relaxations, aside
> from the LE one for static linking, but planning for them ahead of time
> helps make sure they're doable)

We will support relaxation to IE in the future. Because glibc can only
resolve R_XXX_IRELATIVE relocations in statically linked programs, we relax
DESC to LE to avoid generating R_LARCH_TLS_DESC relocations.

> E.g., for IE, I'd suggest turning the latter sequence into (I'm making
> up relocation names):
>
>     pcalau12i  $a2,%gotpc_tlsoff_hi20(var)
>     or         $a5,$a2,$r0  #not necessary, but not marked, so unchanged
>     or         $a6,$a2,$r0
>     nop
>     ld.d       $a3,$a6,%gotpc_tlsoff_lo12(var)
>     or         $a0,$a4,$r0  #not necessary, but not marked, so unchanged
>     or         $a0,$a3,$r0
>
> and for LE, I'd suggest:
>
>     pcalau12i  $a2,%tlsoff_hi20(var)
>     or         $a5,$a2,$r0
>     or         $a6,$a2,$r0  #not necessary, but not marked, so unchanged
>     addi.d     $a4,$a5,%tlsoff_lo12(var)
>     nop
>     or         $a0,$a4,$r0  #not necessary, but not marked, so unchanged
>     nop
>
> This addi.d is what I suggest instead of the 'ori' in the LE relaxation.
> The main difference in my suggestion is that it takes the same position
> as the original addi instruction, thus the very same scheduling profile,
> and more importantly participates the same way in the data flow, as the
> extra moves help see.

We will add a new relocation for addi.d; the related patch is here:
https://sourceware.org/pipermail/binutils/2023-December/130921.html

>
> I realize that addi rather than ori may require offsetting the base
> address to account for the signed rather than unsigned (I suppose)
> immediate, so maybe it's not worth it. I am not sure, however, whether
> you can even separate the pcalau12i hi20 instruction from the subsequent
> lo12 one (ISTM that it would be challenging to match them if so,
> especially if a single hi20 is reused by multiple lo12 loads), so maybe
> there is less flexibility to be exploited than I'm making out.
>
> Anyway, I hope this makes sense and that it helps,
>
On 2023-12-01 17:04, Lulu Cai wrote:
> The LoongArch TLS Descriptors implementation contains several points:
>
> 1. The instruction sequence is:
>   pcalau12i  $a0,%desc_pc_hi20(var)          #R_LARCH_TLS_DESC_PC_HI20
>   ld.d       $a1,$a0,%desc_ld_pc_lo12(var)   #R_LARCH_TLS_DESC_LD_PC_LO12
>   addi.d     $a0,$a0,%desc_add_pc_lo12(var)  #R_LARCH_TLS_DESC_ADD_PC_LO12
>   jirl       $ra,$a1,%desc_call(var)         #R_LARCH_TLS_DESC_CALL
>
> For each DESC, the linker generates an R_LARCH_TLS_DESC64 dynamic relocation,
> which is placed in .rela.dyn.
> Each TLSDESC always takes two GOT slots and one dynamic relocation entry.

Hi, all,

There is a new idea for the la.tls.desc insn sequence. The sequence is:

    pcalau12i  $a0,%desc_pc_hi20(var)      #R_LARCH_TLS_DESC_PC_HI20
                                           #R_LARCH_RELAX if needed
    addi.d     $a0,$a0,%desc_pc_lo12(var)  #R_LARCH_TLS_DESC_PC_LO12
                                           #R_LARCH_RELAX if needed
    ld.d       $ra,$a0,%desc_ld(var)       #R_LARCH_TLS_DESC_LD
                                           #R_LARCH_RELAX if needed
    jirl       $ra,$ra,%desc_call(var)     #R_LARCH_TLS_DESC_CALL
                                           #R_LARCH_RELAX if needed

It first loads the address of the TLSDESC GOT entry, then accesses the entry
and jumps. The pcalau12i + addi.d pair should be adjacent.

For the DESC to LE type transition:

    pcalau12i  $a0,%desc_pc_hi20(var)      => lu12i.w $a0,%le_hi20(var)
    addi.d     $a0,$a0,%desc_pc_lo12(var)  => ori $a0,$a0,%le_lo12(var)
    ld.d       $ra,$a0,%desc_ld(var)       => NOP, delete if with RELAX
    jirl       $ra,$ra,%desc_call(var)     => NOP, delete if with RELAX

For the DESC to IE type transition:

    pcalau12i  $a0,%desc_pc_hi20(var)      => pcalau12i $a0,%ie_hi20(var)
    addi.d     $a0,$a0,%desc_pc_lo12(var)  => ld.d $a0,$a0,%ie_lo12(var)
    ld.d       $ra,$a0,%desc_ld(var)       => NOP, delete if with RELAX
    jirl       $ra,$ra,%desc_call(var)     => NOP, delete if with RELAX

For DESC relaxation, which is done only if the DESC to LE/IE transition is
not possible:

    pcalau12i + addi.d -> pcaddi $a0, %???(var)
        (pseudo reloc type maybe "R_LARCH_TLS_DESC_PCREL20_S2")
    ld.d       $ra,$a0,%desc_ld(var)
    jirl       $ra,$ra,%desc_call(var)

And for la.tls.gd or la.tls.ld, we can also relax the load of the GOT entry
address:

    pcalau12i + addi.d -> pcaddi $a0, %???(var)
        (pseudo reloc type maybe "R_LARCH_TLS_GD/LD_PCREL20_S2")

Some of the relevant information is available in
loongarch_elf_relax_section(), e.g. sec_addr (got), got_off, desc_off, so in
theory this can be relaxed. It seems that R_LARCH_PCREL20_S2 cannot be reused
and that other relocation types are needed.

All suggestions and ideas are welcome. Thanks in advance.

Jinyang

>
> 2. When the same TLS variable is accessed in multiple ways, a maximum of 5 GOT
> slots are used. For example, when GD, TLSDESC, and IE are all used to access
> the same TLS variable, GD always uses the first two of the five GOT slots,
> TLSDESC uses the third and fourth, and IE uses the last.
>
> 3. TLSDESC always requires a dynamic relocation because LoongArch does not yet
> have a TLS type transition. However, statically linked programs cannot resolve
> TLSDESC's dynamic relocation, so we added a type transition for this case.
> DESC -> LE:
>   pcalau12i  $a0,%desc_pc_hi20(var)          => lu12i.w $a0,%le_hi20(var)
>   ld.d       $a1,$a0,%desc_ld_pc_lo12(var)   => ori $a0,$a0,%le_lo12(var)
>   addi.d     $a0,$a0,%desc_add_pc_lo12(var)  => NOP
>   jirl       $ra,$a1,%desc_call(var)         => NOP
>
> 4. The current code passes the gas, ld, and glibc tests.
>
> Lulu Cai (4):
>   LoongArch: Add new relocs and macro for TLSDESC.
>   LoongArch: Add support for TLSDESC in ld.
>   LoongArch: Add transition support for DESC to LE.
>   LoongArch: Add testsuits for TLSDESC in gas and ld.
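
Whether the `pcalau12i` + `addi.d` -> `pcaddi` relaxation proposed in the last reply can apply depends on the GOT entry being within reach of a single `pcaddi`. The Python sketch below illustrates that check. It assumes the usual `pcaddi` semantics (PC plus a sign-extended 20-bit immediate shifted left by 2, i.e. a 4-byte-aligned displacement in the range -2^21 to 2^21 - 4). The function and parameter names are hypothetical and are not taken from loongarch_elf_relax_section().

```python
def pcaddi_imm(pc, target):
    """Return the 20-bit immediate field if `target` is reachable, else None.

    `pc` is the address of the would-be pcaddi instruction and `target`
    the address of the GOT entry (both hypothetical inputs).
    """
    delta = target - pc
    if delta % 4 != 0:
        return None  # the immediate is scaled by 4
    if not -(1 << 21) <= delta < (1 << 21):
        return None  # outside the signed 20-bit << 2 range
    return (delta >> 2) & 0xFFFFF


def pcaddi(pc, si20):
    """Model pcaddi: PC + SignExtend({si20, 2'b0})."""
    imm = si20 & 0xFFFFF
    if imm & (1 << 19):
        imm -= 1 << 20
    return pc + (imm << 2)
```

A relaxation pass along these lines would keep the two-instruction form whenever `pcaddi_imm` returns `None`. How the displacement is actually derived from sec_addr, got_off and desc_off is left open here, as it is in the message above.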