Message ID | 20231202065334.25904-1-changjiachen@stu.xupt.edu.cn |
---|---|
Headers |
From: changjiachen <changjiachen@stu.xupt.edu.cn> To: binutils@sourceware.org Cc: xuchenghua@loongson.cn, chenglulu@loongson.cn, liuzhensong@loongson.cn, xry111@xry111.site, i.swmail@xen0n.name, maskray@google.com, cailulu@loongson.cn, luweining@loongson.cn, wanglei@loongson.cn, hejinyang@loongson.cn, Lazy_Linux@126.com, mengqinggang@loongson.cn, changjiachen <changjiachen@stu.xupt.edu.cn> Subject: [PATCH v2 0/5] LoongArch tls le model linker relaxation support. Date: Sat, 2 Dec 2023 14:53:29 +0800 Message-Id: <20231202065334.25904-1-changjiachen@stu.xupt.edu.cn> X-Mailer: git-send-email 2.25.1 |
Series |
LoongArch tls le model linker relaxation support.
|
|
Message
changjiachen
Dec. 2, 2023, 6:53 a.m. UTC
This is the v2 version of patches to support loongarch linker tls le model relax.

Changes from v1:

* Modified v1-0000-cover-letter.patch part of the explanatory content.

Before Modify:

example: __thread int a = 1;

old insn sequence:

    lu12i.w $r12,%le_hi20_r(a)
    ori $r12,$r12,%le_lo12_r(a)
    add.d $r12,$r12,$r2,%le_add_r(a)
    li.w $r13,$r0,1
    stptr.w $r13,$r12,0

new insn sequence:

    lu12i.w $r12,%le_hi20_r(a)
    add.d $r12,$r12,$r2,%le_add_r(a)
    li.w $r13,$r0,1
    st.w $r13,$r12,%le_lo12_r(a)

After Modify:

example: __thread int a = 1;

old insn sequence(at the O0 optimization level):

    lu12i.w $r12,%le_hi20(a)
    ori $r12,$r12,%le_lo12(a)
    add.d $r12,$r12,$r2
    addi.w $r13,$r0,1
    stptr.w $r13,$r12,0

new insn sequence(at the O0 optimization level):

    lu12i.w $r12,%le_hi20_r(a)
    add.d $r12,$r12,$r2,%le_add_r(a)
    addi.w $r13,$r0,1
    st.w $r13,$r12,%le_lo12_r(a)

changjiachen (5):
  LoongArch: bfd: Add support for tls le relax.
  LoongArch: include: Add support for tls le relax.
  LoongArch: opcodes: Add support for tls le relax.
  LoongArch: gas: Add support for tls le relax.
  LoongArch: ld: Add support for tls le relax.

 bfd/bfd-in2.h                                 |   4 +
 bfd/elfnn-loongarch.c                         |  74 +++++++++
 bfd/elfxx-loongarch.c                         |  50 ++++++
 bfd/libbfd.h                                  |   3 +
 bfd/reloc.c                                   |   6 +
 gas/config/tc-loongarch.c                     |  12 +-
 gas/testsuite/gas/loongarch/reloc.d           |  18 +++
 gas/testsuite/gas/loongarch/reloc.s           |  11 ++
 include/elf/loongarch.h                       |  13 ++
 ld/testsuite/ld-loongarch-elf/old-tls-le.s    |  19 +++
 .../relax-bound-check-tls-le.s                |  48 ++++++
 .../ld-loongarch-elf/relax-check-tls-le.s     |  43 ++++++
 ld/testsuite/ld-loongarch-elf/relax-tls-le.s  |  17 ++
 ld/testsuite/ld-loongarch-elf/relax.exp       | 146 +++++++++++++++++-
 .../tls-relax-compatible-check-old.s          |  39 +++++
 opcodes/loongarch-opc.c                       |   1 +
 16 files changed, 501 insertions(+), 3 deletions(-)
 create mode 100644 ld/testsuite/ld-loongarch-elf/old-tls-le.s
 create mode 100644 ld/testsuite/ld-loongarch-elf/relax-bound-check-tls-le.s
 create mode 100644 ld/testsuite/ld-loongarch-elf/relax-check-tls-le.s
 create mode 100644 ld/testsuite/ld-loongarch-elf/relax-tls-le.s
 create mode 100644 ld/testsuite/ld-loongarch-elf/tls-relax-compatible-check-old.s
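A note on the `_r` operators in the new sequence: the low 12 bits now land in the signed 12-bit immediate of `st.w` instead of the zero-extended immediate of `ori`, so the high part presumably has to be rounded to compensate. The following is a minimal sketch in C of that arithmetic, assuming the usual hi20/lo12 convention of adding 0x800 before shifting. The function names are made up for illustration and are not taken from the patch or from binutils.

```c
#include <stdint.h>

/* Illustrative helpers only; the names are hypothetical, not binutils APIs.
   `off` is the offset of the TLS variable from $tp, assumed to be a
   non-negative value below 2^31 - 0x800. */

/* Old sequence (lu12i.w + ori): ori zero-extends its 12-bit immediate,
   so the split is a plain shift and mask. */
uint32_t le_hi20(uint32_t off) { return off >> 12; }
uint32_t le_lo12(uint32_t off) { return off & 0xfff; }

/* New sequence (lu12i.w + st.w): st.w sign-extends its 12-bit immediate,
   so the high part is rounded up whenever bit 11 of the offset is set. */
uint32_t le_hi20_r(uint32_t off) { return (off + 0x800) >> 12; }

int32_t le_lo12_r(uint32_t off)
{
  int32_t lo = (int32_t)(off & 0xfff);
  return lo >= 0x800 ? lo - 0x1000 : lo;   /* sign-extend 12 bits */
}

/* Offset from $tp that each sequence ends up addressing. */
int64_t addr_old(uint32_t off)
{
  return ((int64_t)le_hi20(off) << 12) | le_lo12(off);
}

int64_t addr_new(uint32_t off)
{
  return ((int64_t)le_hi20_r(off) << 12) + le_lo12_r(off);
}
```

For `off = 0x1800` the old pair gives hi = 1, lo = 0x800, while the `_r` pair gives hi = 2, lo = -0x800; both reach `$tp + 0x1800`.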
Comments
On 2023-12-02 14:53, changjiachen wrote:
> This is the v2 version of patches to support loongarch linker tls le model relax.
>
> After Modify:
>
> example: __thread int a = 1;
>
> old insn sequence(at the O0 optimization level):

If the sequence appears only at -O0, is it worth optimizing by relaxation?

>
> lu12i.w $r12,%le_hi20(a)
> ori $r12,$r12,%le_lo12(a)
> add.d $r12,$r12,$r2
> addi.w $r13,$r0,1
> stptr.w $r13,$r12,0
>
> new insn sequence(at the O0 optimization level):
>
> lu12i.w $r12,%le_hi20_r(a)
> add.d $r12,$r12,$r2,%le_add_r(a)

And here, if the sequence appears at other optimization levels, will the register value ($r12) being different between the old sequence and the new sequence cause other problems, e.g. a worse sequence? Have you tried this relaxation at other optimization levels?

Thanks.

> addi.w $r13,$r0,1
> st.w $r13,$r12,%le_lo12_r(a)
The example above only illustrates the sequence at the O0 optimization level; the relaxation also works with O2 and O3 turned on.

example:
test.c:
    __thread int count1;
    int main(){
        count1 = 1;
    }

(Enable O2 option and no relax)
0000000120000480 <main>:
    120000480: 1400000c  lu12i.w $t0, 0
    120000484: 0280040d  li.w    $t1, 1
    120000488: 0010898c  add.d   $t0, $t0, $tp
    12000048c: 00150004  move    $a0, $zero
    120000490: 2980018d  st.w    $t1, $t0, 0
    120000494: 4c000020  ret

(Enable O2 option and relax)
0000000120000480 <main>:
    120000480: 0280040d  li.w    $t1, 1
    120000484: 00150004  move    $a0, $zero
    120000488: 2980004d  st.w    $t1, $tp, 0
    12000048c: 4c000020  ret

As you can see, with the O2 option turned on, the order of instructions changes, but the relax optimization is still not affected, and the address calculation of the tls variable count1 is correct before and after optimization. The situation with O3 enabled is similar to that with O2.

From: Jinyang He <hejinyang@loongson.cn>
Date: 2023-12-04 10:25:13
To: changjiachen <changjiachen@stu.xupt.edu.cn>,binutils@sourceware.org
Cc: xuchenghua@loongson.cn,chenglulu@loongson.cn,liuzhensong@loongson.cn,xry111@xry111.site,i.swmail@xen0n.name,maskray@google.com,cailulu@loongson.cn,luweining@loongson.cn,wanglei@loongson.cn,Lazy_Linux@126.com,mengqinggang@loongson.cn
Subject: Re: [PATCH v2 0/5] LoongArch tls le model linker relaxation support.

>If the sequence appears only at -O0, is it worth optimizing by relaxation?
>
>And here, if the sequence appears at other optimization levels, will the
>register value ($r12) being different between the old sequence and
>the new sequence cause other problems, e.g. a worse sequence? Have you
>tried this relaxation at other optimization levels?
>
>Thanks.
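The two dumps of `main` in the message above suggest when the linker can drop the `lu12i.w`/`add.d` pair: the offset of `count1` from `$tp` has to fit in the signed 12-bit immediate of `st.w`. Here is a rough sketch in C, under the assumption that the check is simply `offset < 0x800`. The helper names and the exact threshold are illustrative and are not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not the binutils implementation. */

/* Assumed condition: a non-negative offset from $tp that fits the signed
   12-bit immediate of st.w/ld.w can be addressed off $tp directly. */
bool tls_le_can_relax(uint64_t tp_offset)
{
  return tp_offset < 0x800;
}

/* Instructions needed to form the address and store to the variable:
   lu12i.w + add.d + st.w without relaxation, a lone st.w with it. */
int tls_le_store_insns(uint64_t tp_offset, bool relax_enabled)
{
  return (relax_enabled && tls_le_can_relax(tp_offset)) ? 1 : 3;
}

/* Size in bytes of the `main` from the example: li.w, move and ret are
   always present, and every LoongArch instruction is 4 bytes long. */
int example_main_size(uint64_t tp_offset, bool relax_enabled)
{
  return (3 + tls_le_store_insns(tp_offset, relax_enabled)) * 4;
}
```

With `tp_offset = 0` this gives 24 bytes without relaxation and 16 bytes with it, matching the address ranges 0x120000480..0x120000494 and 0x120000480..0x12000048c in the dumps.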
On 2023-12-04 11:39, 常佳琛 wrote:
> As you can see, with the O2 option turned on, the order of
> instructions changes,
> but the relax optimization is still not affected, and the address
> calculation of the
> tls variable count1 is correct before and after optimization. The
> situation of enabling
> O3 is similar to that of enabling O2.

How can I get your gcc (or patches)? I tried to compare access to a non-thread var with old gcc.

Condition:
    __thread int a;
    int b;
    extern int foo(int *);

Compare in old gcc:

    a = 1;                          b = 1;

    lu12i.w $r12,%le_hi20(a)        pcalau12i $r12,%pc_hi20(b)
    ori $r12,$r12,%le_lo12(a)
    addi.w $r13,$r0,1               addi.w $r13,$r0,1
    stx.w $r13,$r12,$r2             st.w $r13,$r12,%pc_lo12(b)


    a = 1; return foo(&a);          b = 1; return foo(&b);

    lu12i.w $r12,%le_hi20(a)        pcalau12i $r4,%pc_hi20(b)
    ori $r12,$r12,%le_lo12(a)       addi.d $r4,$r4,%pc_lo12(b)
    addi.w $r13,$r0,1               addi.w $r12,$r0,1
    add.d $r4,$r12,$r2
    stx.w $r13,$r12,$r2             stptr.w $r12,$r4,0
    b %plt(foo)                     b %plt(foo)

I worry about the case where we need the address of the thread-var after accessing it, which may cause a worse sequence with your gcc. For the non-thread-var, the address is loaded into a register first and the variable is then accessed through that register. How does your gcc handle this case?
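For anyone who wants to reproduce the comparison, the scenario in the message above can be written as a small standalone C file. This is only a sketch: `foo` is given a trivial body here (in the mail it is `extern`), the wrapper function names are invented, and `noinline` is added so the call survives optimization.

```c
/* Reproducer sketch for the "store, then pass the address" case.
   Wrapper names are hypothetical; foo's body is a stand-in. */

__thread int a;   /* thread-local variable (GCC __thread extension) */
int b;            /* ordinary global for comparison */

/* Stand-in for the external foo(): just reads back through the pointer. */
__attribute__((noinline)) int foo(int *p)
{
  return *p;
}

/* TLS case: the address of `a` is needed again after the store. */
int set_and_pass_tls(void)
{
  a = 1;
  return foo(&a);
}

/* Non-TLS case: the address of `b` is formed once and reused. */
int set_and_pass_global(void)
{
  b = 1;
  return foo(&b);
}
```

Compiling this with `gcc -O2 -S` on a LoongArch toolchain and comparing the two functions should show whether the TLS variant needs extra instructions to rebuild the address for the call.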
From: Jinyang He <hejinyang@loongson.cn>
Date: 2023-12-04 16:57:55
To: "常佳琛" <changjiachen@stu.xupt.edu.cn>,binutils@sourceware.org
Cc: xuchenghua@loongson.cn,chenglulu@loongson.cn,liuzhensong@loongson.cn,xry111@xry111.site,i.swmail@xen0n.name,maskray@google.com,cailulu@loongson.cn,luweining@loongson.cn,wanglei@loongson.cn,Lazy_Linux@126.com,mengqinggang@loongson.cn
Subject: Re: [PATCH v2 0/5] LoongArch tls le model linker relaxation support.

>How can I get your gcc (or patches)? I tried to compare access to
>non-thread var with old gcc.

Reply:
There are still some issues with gcc that need to be worked out. As for the gcc patch, it will be posted on Tuesday or Wednesday of this week, so you may have to wait a while.

changjiachen
On Mon, 2023-12-04 at 17:25 +0800, 常佳琛 wrote:
> > How can I get your gcc (or patches)? I tried to compare access to
> > non-thread var with old gcc.
> Reply :
> There are still some issues with gcc that need to be worked out.
> As for gcc patch, it will be shipped on Tuesday or Wednesday of this
> week, you may have to wait for a while.

Let's not add huge chunks of new features to GCC at the moment, because we are in stage 3 (general bugfixing) of GCC 14 development. So we should concentrate on fixing bugs and avoid potentially introducing new ones.

You may still post the GCC patch for review though.
On Mon, 2023-12-04 at 17:37 +0800, Xi Ruoyao wrote:
> Let's not add huge chunks of new features to GCC at the moment because
> we are in stage 3 (general bugfixing) of GCC 14 development. So we
> should concentrate on fixing bugs and avoid potentially introducing
> new bugs.
>
> You may still post the GCC patch for a review though.

FWIW if you want this for GCC 14, IMO you can just remove SYMBOL_TLS_LE from loongarch_explicit_relocs_p in GCC (so GCC will always generate la.tls.le instead of the real instruction sequence to load the address of a LE TLS symbol, unless -mexplicit-relocs=always) and expand la.tls.le as you wish in GAS. This will be a one-line change in GCC, and it's more acceptable than a huge diff in stage 3.