Message ID | 20231116131836.504699-2-xry111@xry111.site |
---|---|
Headers |
From: Xi Ruoyao <xry111@xry111.site> To: gcc-patches@gcc.gnu.org Cc: chenglulu <chenglulu@loongson.cn>, i@xen0n.name, xuchenghua@loongson.cn, Xi Ruoyao <xry111@xry111.site> Subject: [PATCH 0/5] LoongArch: Initial LA664 support Date: Thu, 16 Nov 2023 21:18:32 +0800 Message-ID: <20231116131836.504699-2-xry111@xry111.site> X-Mailer: git-send-email 2.42.1 |
Series |
LoongArch: Initial LA664 support
|
|
Message
Xi Ruoyao
Nov. 16, 2023, 1:18 p.m. UTC
Loongson 3A6000 processors will ship to general users this month, and
they feature 4 cores with the new LA664 microarchitecture. Here are
some changes from LA464:

1. The 32-bit division instruction now ignores the high 32 bits of the
   input registers. This is enumerated via CPUCFG word 0x2, bit 26.
2. The microarchitecture now guarantees that two loads on the same
   memory address won't be reordered with each other. dbar 0x700 is
   turned into a nop.
3. The architecture now supports approximate reciprocal and reciprocal
   square root instructions (FRECIPE and VRSQRTE) on 32-bit or 64-bit
   floating-point values and the vectors of these values.
4. The architecture now supports the SC.Q instruction for 128-bit CAS.
5. The architecture now supports the LL.ACQ and SC.REL instructions
   (well, I don't really know what they are for).
6. The architecture now supports CAS instructions for 64, 32, 16, or
   8-bit values.
7. The architecture now supports atomic add and atomic swap
   instructions for 16 or 8-bit values.
8. Some non-zero hint values of the DBAR instruction are added.

These features are documented in LoongArch v1.1. Implementations can
implement any subset of them and enumerate the implemented features via
CPUCFG. LA664 implements them all.

(8) is already implemented in previous patches because it's completely
backward-compatible. This series implements (1) and (2) with the
switches -mdiv32 and -mld-seq-sa (these names are derived from the names
of the corresponding CPUCFG bits documented in the LoongArch v1.1
specification).

The other features require Binutils support and we are close to the end
of GCC 14 stage 1, so I'm posting this series first.

With -march=la664, these two options are implicitly enabled, but they
can be turned off with -mno-div32 or -mno-ld-seq-sa.

With -march=native, the current CPU is probed via CPUCFG and these
options are implicitly enabled if the CPU supports the corresponding
feature. They can be turned off with an explicit -mno-div32 or
-mno-ld-seq-sa as well.

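The option resolution described above for -march=native can be sketched in portable C. This is only an illustration under stated assumptions, not code from the series: the helper names, the `opt_state` enum, and the idea of passing in a pre-read table of CPUCFG words are all hypothetical, and treating an explicit -mdiv32 as overriding the probe is an assumption. Only the word and bit numbers for div32 (CPUCFG word 0x2, bit 26) come from the cover letter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* From the cover letter: div32 is enumerated via CPUCFG word 0x2, bit 26. */
#define CPUCFG_WORD_DIV32 0x2u
#define CPUCFG_BIT_DIV32 26u

/* Hypothetical tri-state for an option pair such as -mdiv32 / -mno-div32. */
enum opt_state { OPT_UNSET = -1, OPT_OFF = 0, OPT_ON = 1 };

/* Hypothetical helper: test one bit in a table of CPUCFG words that the
   caller has already read from the CPU. */
bool cpucfg_has_bit(const uint32_t *words, size_t n_words,
                    unsigned word, unsigned bit)
{
    assert(word < n_words && bit < 32);
    return ((words[word] >> bit) & 1u) != 0;
}

/* Hypothetical model of -march=native: an explicit option wins (an
   assumption for the "on" case), otherwise the probed bit decides. */
bool div32_enabled_native(const uint32_t *words, size_t n_words,
                          enum opt_state explicit_opt)
{
    if (explicit_opt != OPT_UNSET)
        return explicit_opt == OPT_ON;
    return cpucfg_has_bit(words, n_words,
                          CPUCFG_WORD_DIV32, CPUCFG_BIT_DIV32);
}
```

With a word table whose word 2 has bit 26 set, `div32_enabled_native(words, 4, OPT_UNSET)` returns true, and passing `OPT_OFF` (modelling -mno-div32) returns false regardless of the probed bit.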
-mtune=la664 is implemented as a copy of -mtune=la464 and we can adjust
it with benchmark results later.

Bootstrapped and regtested on an LA664 with BOOT_CFLAGS="-march=la664
-O2" and on an LA464 with BOOT_CFLAGS="-march=native -O2". I also
manually verified -march=native probing on LA664 and LA464.

Xi Ruoyao (5):
  LoongArch: Switch loongarch-def to C++
  LoongArch: genopts: Add infrastructure to generate code for new
    features in ISA evolution
  LoongArch: Take the advantage of -mdiv32 if it's enabled
  LoongArch: Don't emit dbar 0x700 if -mld-seq-sa
  LoongArch: Add -march=la664 and -mtune=la664

 gcc/config/loongarch/genopts/genstr.sh        |  78 ++++++-
 gcc/config/loongarch/genopts/isa-evolution.in |   2 +
 .../loongarch/genopts/loongarch-strings       |   1 +
 gcc/config/loongarch/genopts/loongarch.opt.in |  10 +
 gcc/config/loongarch/loongarch-cpu.cc         |  37 ++--
 gcc/config/loongarch/loongarch-cpucfg-map.h   |  36 +++
 gcc/config/loongarch/loongarch-def-array.h    |  40 ++++
 gcc/config/loongarch/loongarch-def.c          | 205 ------------------
 gcc/config/loongarch/loongarch-def.cc         | 193 +++++++++++++++++
 gcc/config/loongarch/loongarch-def.h          |  67 ++++--
 gcc/config/loongarch/loongarch-opts.h         |   9 +-
 gcc/config/loongarch/loongarch-str.h          |   8 +-
 gcc/config/loongarch/loongarch-tune.h         | 123 ++++++++++-
 gcc/config/loongarch/loongarch.cc             |   6 +-
 gcc/config/loongarch/loongarch.md             |  31 ++-
 gcc/config/loongarch/loongarch.opt            |  23 +-
 gcc/config/loongarch/t-loongarch              |  25 ++-
 .../gcc.target/loongarch/div-div32.c          |  31 +++
 .../gcc.target/loongarch/div-no-div32.c       |  11 +
 19 files changed, 664 insertions(+), 272 deletions(-)
 create mode 100644 gcc/config/loongarch/genopts/isa-evolution.in
 create mode 100644 gcc/config/loongarch/loongarch-cpucfg-map.h
 create mode 100644 gcc/config/loongarch/loongarch-def-array.h
 delete mode 100644 gcc/config/loongarch/loongarch-def.c
 create mode 100644 gcc/config/loongarch/loongarch-def.cc
 create mode 100644 gcc/testsuite/gcc.target/loongarch/div-div32.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/div-no-div32.c
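Item (1) of the cover letter, the 32-bit division ignoring the high 32 bits of its input registers, can be modelled in portable C as follows. This is a hedged sketch, not the contents of div-div32.c or of the machine description: the function names are hypothetical, and a 64-bit integer stands in for a register image. What the compiler does with this guarantee (for example, whether it can skip extending the operands first) is left to the patches themselves.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the low 32 bits of a 64-bit register image as a signed
   32-bit value, discarding whatever the high 32 bits hold. */
int32_t low32_signed(uint64_t reg)
{
    uint32_t low = (uint32_t)reg;
    int32_t value;
    memcpy(&value, &low, sizeof value);
    return value;
}

/* Hypothetical model of the 32-bit signed division from item (1): only
   the low halves of both inputs contribute to the result.  The caller
   must avoid a zero divisor and INT32_MIN / -1. */
int32_t div_w_model(uint64_t rj, uint64_t rk)
{
    int32_t dividend = low32_signed(rj);
    int32_t divisor = low32_signed(rk);
    assert(divisor != 0);
    assert(!(dividend == INT32_MIN && divisor == -1));
    return dividend / divisor;
}
```

For example, `div_w_model(0xDEADBEEF00000064, 0x1234567800000007)` yields 14, the same as 100 / 7, because the garbage in the high halves is never read.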
Comments
Hi,

Thank you very much for the modification, but I think we need to support
la664 with the configuration items of configure.

I also defined ISA_BASE_LA64V110 to represent the LoongArch1.1
instruction set, what do you think?

On 2023/11/16 at 9:18 PM, Xi Ruoyao wrote:
> Loongson 3A6000 processor will be shipped to general users in this month
> and it features 4 cores with the new LA664 micro architecture. Here is
> some changes from LA464:
> [...]
On Fri, 2023-11-17 at 10:41 +0800, chenglulu wrote:
> Hi,
>
> Thank you very much for the modification, but I think we need to support
> la664 with the configuration items of configure.

I'll add it.

> I also defined ISA_BASE_LA64V110 to represent the LoongArch1.1
> instruction set, what do you think?

I'll add it too. I had misread section 1.5 paragraph 1 of the spec so I
didn't consider this a good idea, but after reading it again I think it
should be added.

> On 2023/11/16 at 9:18 PM, Xi Ruoyao wrote:
> > Loongson 3A6000 processor will be shipped to general users in this month
> > and it features 4 cores with the new LA664 micro architecture. Here is
> > some changes from LA464:
> > [...]
On 2023/11/17 at 12:55 PM, Xi Ruoyao wrote:
> On Fri, 2023-11-17 at 10:41 +0800, chenglulu wrote:
>> Hi,
>>
>> Thank you very much for the modification, but I think we need to support
>> la664 with the configuration items of configure.
> I'll add it.
>
>> I also defined ISA_BASE_LA64V110 to represent the LoongArch1.1
>> instruction set, what do you think?
> I'll add it too. I had misread section 1.5 paragraph 1 of the spec so I
> didn't consider this a good idea, but after reading it again I think it
> should be added.

I have already added these two, but not on the basis of your patch. So...

>> On 2023/11/16 at 9:18 PM, Xi Ruoyao wrote:
>>> Loongson 3A6000 processor will be shipped to general users in this month
>>> and it features 4 cores with the new LA664 micro architecture. Here is
>>> some changes from LA464:
>>> [...]