From patchwork Mon Nov 13 23:18:02 2023
X-Patchwork-Submitter: Xi Ruoyao
X-Patchwork-Id: 164674
From: Xi Ruoyao
To: gcc-patches@gcc.gnu.org
Cc: chenglulu, i@xen0n.name, xuchenghua@loongson.cn, Xi Ruoyao
Subject: [PATCH] LoongArch: Use finer-grained DBAR hints
Date: Tue, 14 Nov 2023 07:18:02 +0800
Message-ID: <20231113231837.369907-1-xry111@xry111.site>

LA664 defines DBAR hints 0x1 - 0x1f (except 0xf and 0x1f) as follows [1-2]:

- Bit 4: kind of constraint (0: completion, 1: ordering)
- Bit 3: barrier for previous read (0: true, 1: false)
- Bit 2: barrier for previous write (0: true, 1: false)
- Bit 1: barrier for succeeding read (0: true, 1: false)
- Bit 0: barrier for succeeding write (0: true, 1: false)

LLVM has already utilized them for different memory orders [3]:

- Bit 4 is always set to one because it's only intended to be zero for
  things like MMIO devices, which are out of the scope of memory orders.
- An acquire barrier is used to implement acquire loads like

    ld.d $a1, $t0, 0
    dbar acquire_hint

  where the load operation (ld.d) should not be reordered with any load
  or store operation after the acquire load. To satisfy this constraint,
  we need to prevent the load operation from being reordered after the
  barrier, and also prevent any following load/store operation from
  being reordered before the barrier. Thus bits 0, 1, and 3 must be
  zero, and bit 2 can be one, so acquire_hint should be 0b10100.

- A release barrier is used to implement release stores like

    dbar release_hint
    st.d $a1, $t0, 0

  where the store operation (st.d) should not be reordered with any load
  or store operation before the release store. So we need to prevent the
  store operation from being reordered before the barrier, and also
  prevent any preceding load/store operation from being reordered after
  the barrier. Thus bits 0, 2, and 3 must be zero, and bit 1 can be one,
  so release_hint should be 0b10010.

A similar mapping has been utilized for RISC-V GCC [4], LoongArch Linux
kernel [1], and LoongArch LLVM [3]. So the mapping should be correct.
And I've also bootstrapped & regtested GCC on a LA664 with this patch.

The LoongArch CPUs should treat "unknown" hints as dbar 0, so we can
unconditionally emit the new hints without a compiler switch.

[1]: https://git.kernel.org/torvalds/c/e031a5f3f1ed
[2]: https://github.com/loongson-community/docs/pull/12
[3]: https://github.com/llvm/llvm-project/pull/68787
[4]: https://gcc.gnu.org/r14-406

gcc/ChangeLog:

        * config/loongarch/sync.md (mem_thread_fence): Remove redundant
        check.
        (mem_thread_fence_1): Emit finer-grained DBAR hints for different
        memory models, instead of 0.
---
Bootstrapped and regtested on loongarch64-linux-gnu (running on a
LA664). Ok for trunk?
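For readers who want to check the hint values by hand, the bit layout described in the commit message can be sketched in C as follows. This is only an illustration and not part of the patch: `dbar_ordering_hint` and `DBAR_ORDERING` are hypothetical names, and the sketch assumes the encoding quoted from [1], where a field bit of 0 means the barrier applies to that class of access.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical illustration, not code from the patch.
   Bit 4 selects the kind of constraint: 1 = ordering, 0 = completion.  */
#define DBAR_ORDERING (1u << 4)

/* Build an ordering hint from the accesses the barrier must cover.
   A field bit of 0 means "barrier applies", so a bit is set only when
   the corresponding class of access does NOT need to be ordered.  */
static unsigned
dbar_ordering_hint (bool prev_read, bool prev_write,
                    bool succ_read, bool succ_write)
{
  unsigned hint = DBAR_ORDERING;

  if (!prev_read)
    hint |= 1u << 3;
  if (!prev_write)
    hint |= 1u << 2;
  if (!succ_read)
    hint |= 1u << 1;
  if (!succ_write)
    hint |= 1u << 0;

  /* 0x1f is excluded from the hint range described above.  */
  assert (hint != 0x1f);
  return hint;
}
```

With this helper, `dbar_ordering_hint (true, false, true, true)` gives 0b10100 (acquire), `dbar_ordering_hint (true, true, false, true)` gives 0b10010 (release), and passing `true` for all four arguments gives 0b10000.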
 gcc/config/loongarch/sync.md | 49 +++++++++++++++++++++++++++++-------
 1 file changed, 40 insertions(+), 9 deletions(-)

diff --git a/gcc/config/loongarch/sync.md b/gcc/config/loongarch/sync.md
index db3a21690b8..511aba5ffa6 100644
--- a/gcc/config/loongarch/sync.md
+++ b/gcc/config/loongarch/sync.md
@@ -50,23 +50,54 @@ (define_expand "mem_thread_fence"
   [(match_operand:SI 0 "const_int_operand" "")] ;; model
   ""
 {
-  if (INTVAL (operands[0]) != MEMMODEL_RELAXED)
-    {
-      rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
-      MEM_VOLATILE_P (mem) = 1;
-      emit_insn (gen_mem_thread_fence_1 (mem, operands[0]));
-    }
+  rtx mem = gen_rtx_MEM (BLKmode, gen_rtx_SCRATCH (Pmode));
+  MEM_VOLATILE_P (mem) = 1;
+  emit_insn (gen_mem_thread_fence_1 (mem, operands[0]));
+
   DONE;
 })
 
-;; Until the LoongArch memory model (hence its mapping from C++) is finalized,
-;; conservatively emit a full FENCE.
+;; DBAR hint encoding for LA664 and later micro-architectures, paraphrased from
+;; the Linux patch revealing it [1]:
+;;
+;; - Bit 4: kind of constraint (0: completion, 1: ordering)
+;; - Bit 3: barrier for previous read (0: true, 1: false)
+;; - Bit 2: barrier for previous write (0: true, 1: false)
+;; - Bit 1: barrier for succeeding read (0: true, 1: false)
+;; - Bit 0: barrier for succeeding write (0: true, 1: false)
+;;
+;; [1]: https://git.kernel.org/torvalds/c/e031a5f3f1ed
+;;
+;; Implementations without support for the finer-granularity hints simply treat
+;; all as the full barrier (DBAR 0), so we can unconditionally start emitting the
+;; more precise hints right away.
 (define_insn "mem_thread_fence_1"
   [(set (match_operand:BLK 0 "" "")
         (unspec:BLK [(match_dup 0)] UNSPEC_MEMORY_BARRIER))
    (match_operand:SI 1 "const_int_operand" "")] ;; model
   ""
-  "dbar\t0")
+  {
+    enum memmodel model = memmodel_base (INTVAL (operands[1]));
+
+    switch (model)
+      {
+      case MEMMODEL_ACQUIRE:
+      case MEMMODEL_CONSUME:
+        /* Consume is implemented using the stronger acquire memory order
+           because of a deficiency in C++11's semantics.  */
+        return "dbar\t0b10100";
+      case MEMMODEL_RELEASE:
+        return "dbar\t0b10010";
+      case MEMMODEL_ACQ_REL:
+      case MEMMODEL_SEQ_CST:
+        return "dbar\t0b10000";
+      default:
+        /* GCC internal: "For the '__ATOMIC_RELAXED' model no instructions
+           need to be issued and this expansion is not invoked."
+           Other values should not be returned by memmodel_base.  */
+        gcc_unreachable ();
+      }
+  })
 
 ;; Atomic memory operations.
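As a source-level illustration of what reaches this pattern, the sketch below uses C11 fences in a simple publish/consume pair. It is not part of the patch: `publish`, `try_consume`, `payload`, and `ready` are hypothetical names, and the instruction noted in each comment restates the mapping in the switch above, on the assumption that the compiler targets LoongArch with this patch applied. It is not verified compiler output.

```c
#include <stdatomic.h>

/* Hypothetical example data: a plain payload guarded by an atomic flag.  */
static int payload;
static atomic_int ready;

/* Writer: make the payload visible before raising the flag.  */
void
publish (int value)
{
  payload = value;
  /* Release fence; per the mapping above this would become
     "dbar 0b10010" on LoongArch.  */
  atomic_thread_fence (memory_order_release);
  atomic_store_explicit (&ready, 1, memory_order_relaxed);
}

/* Reader: return 1 and fill *out if the payload has been published,
   otherwise return 0 and leave *out untouched.  */
int
try_consume (int *out)
{
  if (!atomic_load_explicit (&ready, memory_order_relaxed))
    return 0;
  /* Acquire fence; per the mapping above this would become
     "dbar 0b10100" on LoongArch.  */
  atomic_thread_fence (memory_order_acquire);
  *out = payload;
  return 1;
}
```

A fence with `memory_order_acq_rel` or `memory_order_seq_cst` would take the `dbar 0b10000` branch. A fence with `memory_order_relaxed` is not expected to reach this pattern at all, per the GCC internals text quoted in the default case.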