[v2,0/5] Add support for approximate instructions and optimize divf/sqrtf/rsqrt operations.
Message ID | 20231205070147.53352-1-xujiahao@loongson.cn |
---|---|
Headers |
From: Jiahao Xu <xujiahao@loongson.cn>
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn, Jiahao Xu <xujiahao@loongson.cn>
Subject: [PATCH v2 0/5] Add support for approximate instructions and optimize divf/sqrtf/rsqrt operations.
Date: Tue, 5 Dec 2023 15:01:42 +0800
Message-Id: <20231205070147.53352-1-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.20.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/> |
Series |
Add support for approximate instructions and optimize divf/sqrtf/rsqrt operations.
|
|
Message
Jiahao Xu
Dec. 5, 2023, 7:01 a.m. UTC
LoongArch V1.1 adds support for approximate instructions. Combined with
additional Newton-Raphson steps, these instructions are used to implement
single-precision floating-point division, square root and reciprocal square
root operations with better throughput.

This series is based on the following patch:
https://gcc.gnu.org/pipermail/gcc-patches/2023-December/639243.html

Jiahao Xu (5):
  LoongArch: Add support for LoongArch V1.1 approximate instructions.
  LoongArch: Use standard pattern name for xvfrsqrt/vfrsqrt instructions.
  LoongArch: Redefine pattern for xvfrecip/vfrecip instructions.
  LoongArch: New options -mrecip and -mrecip= with ffast-math.
  LoongArch: Vectorized loop unrolling is disable for divf/sqrtf/rsqrtf when -mrecip is enabled.

 gcc/config/loongarch/genopts/isa-evolution.in | 1 +
 gcc/config/loongarch/genopts/loongarch.opt.in | 11 +
 gcc/config/loongarch/larchintrin.h | 38 +++
 gcc/config/loongarch/lasx.md | 89 ++++++-
 gcc/config/loongarch/lasxintrin.h | 34 +++
 gcc/config/loongarch/loongarch-builtins.cc | 66 +++++
 gcc/config/loongarch/loongarch-cpucfg-map.h | 1 +
 gcc/config/loongarch/loongarch-protos.h | 2 +
 gcc/config/loongarch/loongarch-str.h | 1 +
 gcc/config/loongarch/loongarch.cc | 252 +++++++++++++++++-
 gcc/config/loongarch/loongarch.h | 18 ++
 gcc/config/loongarch/loongarch.md | 104 ++++++--
 gcc/config/loongarch/loongarch.opt | 15 ++
 gcc/config/loongarch/lsx.md | 89 ++++++-
 gcc/config/loongarch/lsxintrin.h | 34 +++
 gcc/config/loongarch/predicates.md | 8 +
 gcc/doc/extend.texi | 18 ++
 gcc/doc/invoke.texi | 54 ++++
 gcc/testsuite/gcc.target/loongarch/divf.c | 10 +
 .../loongarch/larch-frecipe-builtin.c | 28 ++
 .../gcc.target/loongarch/recip-divf.c | 9 +
 .../gcc.target/loongarch/recip-sqrtf.c | 23 ++
 gcc/testsuite/gcc.target/loongarch/sqrtf.c | 24 ++
 .../loongarch/vector/lasx/lasx-divf.c | 13 +
 .../vector/lasx/lasx-frecipe-builtin.c | 30 +++
 .../loongarch/vector/lasx/lasx-recip-divf.c | 12 +
 .../loongarch/vector/lasx/lasx-recip-sqrtf.c | 28 ++
 .../loongarch/vector/lasx/lasx-recip.c | 24 ++
 .../loongarch/vector/lasx/lasx-rsqrt.c | 26 ++
 .../loongarch/vector/lasx/lasx-sqrtf.c | 29 ++
 .../loongarch/vector/lsx/lsx-divf.c | 13 +
 .../vector/lsx/lsx-frecipe-builtin.c | 30 +++
 .../loongarch/vector/lsx/lsx-recip-divf.c | 12 +
 .../loongarch/vector/lsx/lsx-recip-sqrtf.c | 28 ++
 .../loongarch/vector/lsx/lsx-recip.c | 24 ++
 .../loongarch/vector/lsx/lsx-rsqrt.c | 26 ++
 .../loongarch/vector/lsx/lsx-sqrtf.c | 29 ++
 37 files changed, 1212 insertions(+), 41 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/larch-frecipe-builtin.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/recip-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/recip-sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-frecipe-builtin.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-rsqrt.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-frecipe-builtin.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-divf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-sqrtf.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-rsqrt.c
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-sqrtf.c