From patchwork Wed Dec  6 07:04:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiahao Xu <xujiahao@loongson.cn>
X-Patchwork-Id: 174335
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3929225vqy;
        Tue, 5 Dec 2023 23:05:36 -0800 (PST)
X-Google-Smtp-Source: 
 AGHT+IFtKs5JxnxV602ubnauhoN/lmkTowe9qdr2H7K0t7/UNiWn5SHLgilyJcMi1VVrmbGy9myz
X-Received: by 2002:a05:620a:8224:b0:77e:fba3:93ab with SMTP id
 ow36-20020a05620a822400b0077efba393abmr457167qkn.141.1701846336575;
        Tue, 05 Dec 2023 23:05:36 -0800 (PST)
ARC-Seal: i=2; a=rsa-sha256; t=1701846336; cv=pass;
        d=google.com; s=arc-20160816;
        b=EOS3tdpmYuep+dKR24AK8pBzM7YjPQ2brvK6JX3CyeBNiRyJvonHC6jPM8P9790xIG
         n+ZkAgTjUVATkEtVxF9el2LQA6okeusmX5VUeyexQmzbqAacPzK9N1N8BxCrLLE9pTsq
         /IM2rDoZL+CJPdyugndMaRGI3xYPG4CGv2rpj3/zyJfZDDaqzc9XOz5ag8voplriCDEt
         CZciK4s5sLJn6nGkP1tlHtDmEmSbhDqf9YvFjd4N1vyIraD2fLS0yDcu19JHm89mEIi+
         jb6+Os3Vk0U1DfA5UOFir3yuU7FOwvgOynNK3VJlz6/Or8O1usoxuN9B18rPVzaRPw/8
         6dDg==
ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=errors-to:list-subscribe:list-help:list-post:list-archive
         :list-unsubscribe:list-id:precedence:content-transfer-encoding
         :mime-version:references:in-reply-to:message-id:date:subject:cc:to
         :from:arc-filter:dmarc-filter:delivered-to;
        bh=WKpr78abrEMQ6qcOSChxGeGvjJ/gw0ZJ3I1tpT51Uyw=;
        fh=w+xsGLuzpTj56E0bzPMCc39RWHkBXt2f2AGs+4pGimo=;
        b=NUH4Yx5/9aOdfBcvJcoP+PcrYZUPFKSiuaZOhrfWpiLP0a6ddk2kIKh8zYC25HGLwy
         KdjjwKRobyzTYel7nVpgO0q4whavG8cuiC8ZwHPozt5J3pPhIxBDkcBzICim2RHZoS6z
         i3LgpT3gDxdMgqd5urfNRz7S3pnuD1O7L0kuVRIFKo3pg06opkXru/yYCBXWYph8V7v6
         P8jsSBWQNweovjEeQwx1YKfuGP0fARUh58rFg/iHJ887AcbcBFJbiCroYrafLtfU8N3m
         HwQ774CQQfcgI9/GSm6MbdZXYmGsRysbH8ocxfJrDQ+WsRP2qFjQ44kgLypE8GvwzQzg
         wmog==
ARC-Authentication-Results: i=2; mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as
 permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97])
        by mx.google.com with ESMTPS id
 e8-20020a37ac08000000b0077f2d713eddsi558524qkm.114.2023.12.05.23.05.36
        for <ouuuleilei@gmail.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 05 Dec 2023 23:05:36 -0800 (PST)
Received-SPF: pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as
 permitted sender) client-ip=8.43.85.97;
Authentication-Results: mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as
 permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 5579D386D636
	for <ouuuleilei@gmail.com>; Wed,  6 Dec 2023 07:05:35 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail.loongson.cn (mail.loongson.cn [114.242.206.163])
 by sourceware.org (Postfix) with ESMTP id C72D0386C5A5
 for <gcc-patches@gcc.gnu.org>; Wed,  6 Dec 2023 07:05:07 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org C72D0386C5A5
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=loongson.cn
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=loongson.cn
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org C72D0386C5A5
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=114.242.206.163
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701846312; cv=none;
 b=QRulH0vUHGHLiyfIMO+D/duhS4ehds7V97j28+lw/OuJzok3RlrTcEEBxSfJB28EgezG4BL1qwvrDuGnNkidyHC0pi9P4T1Q0kLh0SWV4PYtTq/ohg9qpc1v50Cpy1iu9WwoLvei48IpL/v+PgHMYl5yPSmBr+2GLKaJShXFCPk=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1701846312; c=relaxed/simple;
 bh=8g8MbX5LagJGwRTPWwMhyvVL+MeJ+0CciaeqAruD5+c=;
 h=From:To:Subject:Date:Message-Id:MIME-Version;
 b=mxb9z29rQ+TWoiGCpMmdS5d/RgVRPA55uhvfJjPLhhHBiRaJU1NX5KvppA2tHo0gkAi1P2ay+NkesJ/pPui8QjPYMzi5ocCMBnwpNaJWmBw53fLK7vk2sIiEruJuVSAB31ndz0qD+y4plD8yH9PHZNPm+0j332/Kxk85m6fNXqA=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from loongson.cn (unknown [10.10.130.252])
 by gateway (Coremail) with SMTP id _____8AxZ+gcHXBlQzs_AA--.25131S3;
 Wed, 06 Dec 2023 15:05:00 +0800 (CST)
Received: from slurm-master.loongson.cn (unknown [10.10.130.252])
 by localhost.localdomain (Coremail) with SMTP id
 AQAAf8Dxvi8XHXBlp0BWAA--.59594S5;
 Wed, 06 Dec 2023 15:04:57 +0800 (CST)
From: Jiahao Xu <xujiahao@loongson.cn>
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn,
 xuchenghua@loongson.cn, Jiahao Xu <xujiahao@loongson.cn>
Subject: [PATCH v3 1/5] LoongArch: Add support for LoongArch V1.1 approximate
 instructions.
Date: Wed,  6 Dec 2023 15:04:49 +0800
Message-Id: <20231206070453.3252-2-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20231206070453.3252-1-xujiahao@loongson.cn>
References: <20231206070453.3252-1-xujiahao@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8Dxvi8XHXBlp0BWAA--.59594S5
X-CM-SenderInfo: 50xmxthkdrqz5rrqw2lrqou0/
X-Coremail-Antispam: 1Uk129KBj9fXoWfur1UXFW5Wr4fAF1ktF4kZrc_yoW5WF4DGo
 WrCF4DJa1xGryIyrW5KrnxXrWjvayFyF4DAay3Zws5Ca1xJr90k347W3WFy342qF1kWrn8
 C3s5W3sxXa4xJFs5l-sFpf9Il3svdjkaLaAFLSUrUUUUjb8apTn2vfkv8UJUUUU8wcxFpf
 9Il3svdxBIdaVrn0xqx4xG64xvF2IEw4CE5I8CrVC2j2Jv73VFW2AGmfu7bjvjm3AaLaJ3
 UjIYCTnIWjp_UUUY17kC6x804xWl14x267AKxVWUJVW8JwAFc2x0x2IEx4CE42xK8VAvwI
 8IcIk0rVWrJVCq3wAFIxvE14AKwVWUGVWUXwA2ocxC64kIII0Yj41l84x0c7CEw4AK67xG
 Y2AK021l84ACjcxK6xIIjxv20xvE14v26r1I6r4UM28EF7xvwVC0I7IYx2IY6xkF7I0E14
 v26r4j6F4UM28EF7xvwVC2z280aVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIEc7CjxVAF
 wI0_Gr1j6F4UJwAS0I0E0xvYzxvE52x082IY62kv0487Mc804VCY07AIYIkI8VC2zVCFFI
 0UMc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUAVWUtwAv7VC2z280
 aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JMxAIw28Icx
 kI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2Iq
 xVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUAVWUtwCIc40Y0x0EwIxGrwCI42
 IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWUJVW8JwCI42IY
 6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aV
 CY1x0267AKxVWUJVW8JbIYCTnIWIevJa73UjIFyTuYvjxU7MmhUUUUU
X-Spam-Status: No, score=-13.1 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, KAM_SHORT, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1784515224271564893
X-GMAIL-MSGID: 1784515224271564893

This patch adds define_insn/builtins/intrinsics for these instructions, and add option
-mfrecipe to control instruction generation.

gcc/ChangeLog:

	* config/loongarch/genopts/isa-evolution.in (fecipe): Add.
	* config/loongarch/larchintrin.h (__frecipe_s): New intrinsic.
	(__frecipe_d): Ditto.
	(__frsqrte_s): Ditto.
	(__frsqrte_d): Ditto.
	* config/loongarch/lasx.md (lasx_xvfrecipe_<flasxfmt>): New insn pattern.
	(lasx_xvfrsqrte_<flasxfmt>): Ditto.
	* config/loongarch/lasxintrin.h (__lasx_xvfrecipe_s): New intrinsic.
	(__lasx_xvfrecipe_d): Ditto.
	(__lasx_xvfrsqrte_s): Ditto.
	(__lasx_xvfrsqrte_d): Ditto.
	* config/loongarch/loongarch-builtins.cc (AVAIL_ALL): Add predicates.
	(LSX_EXT_BUILTIN): New macro.
	(LASX_EXT_BUILTIN): Ditto.
	* config/loongarch/loongarch-cpucfg-map.h: Regenerate.
	* config/loongarch/loongarch-c.cc: Add builtin macro "__loongarch_frecipe".
	* config/loongarch/loongarch-def.cc: Regenerate.
	* config/loongarch/loongarch-str.h (OPTSTR_FRECIPE): Regenerate.
	* config/loongarch/loongarch.cc (loongarch_asm_code_end): Dump status for TARGET_FRECIPE.
	* config/loongarch/loongarch.md (loongarch_frecipe_<fmt>): New insn pattern.
	(loongarch_frsqrte_<fmt>): Ditto.
	* config/loongarch/loongarch.opt: Regenerate.
	* config/loongarch/lsx.md (lsx_vfrecipe_<flsxfmt>): New insn pattern.
	(lsx_vfrsqrte_<flsxfmt>): Ditto.
	* config/loongarch/lsxintrin.h (__lsx_vfrecipe_s): New intrinsic.
	(__lsx_vfrecipe_d): Ditto.
	(__lsx_vfrsqrte_s): Ditto.
	(__lsx_vfrsqrte_d): Ditto.
	* doc/extend.texi: Add documentation for LoongArch new builtins and intrinsics.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/larch-frecipe-builtin.c: New test.
	* gcc.target/loongarch/vector/lasx/lasx-frecipe-builtin.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-frecipe-builtin.c: New test.

diff --git a/gcc/config/loongarch/genopts/isa-evolution.in b/gcc/config/loongarch/genopts/isa-evolution.in
index a6bc3f87f20..11a198b649f 100644
--- a/gcc/config/loongarch/genopts/isa-evolution.in
+++ b/gcc/config/loongarch/genopts/isa-evolution.in
@@ -1,3 +1,4 @@
+2	25	frecipe		Support frecipe.{s/d} and frsqrte.{s/d} instructions.
 2	26	div32		Support div.w[u] and mod.w[u] instructions with inputs not sign-extended.
 2	27	lam-bh		Support am{swap/add}[_db].{b/h} instructions.
 2	28	lamcas		Support amcas[_db].{b/h/w/d} instructions.
diff --git a/gcc/config/loongarch/larchintrin.h b/gcc/config/loongarch/larchintrin.h
index e571ed27b37..bb1cda831eb 100644
--- a/gcc/config/loongarch/larchintrin.h
+++ b/gcc/config/loongarch/larchintrin.h
@@ -333,6 +333,44 @@ __iocsrwr_d (unsigned long int _1, unsigned int _2)
 }
 #endif
 
+#ifdef __loongarch_frecipe
+/* Assembly instruction format: fd, fj.  */
+/* Data types in instruction templates:  SF, SF.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__frecipe_s (float _1)
+{
+  __builtin_loongarch_frecipe_s ((float) _1);
+}
+
+/* Assembly instruction format: fd, fj.  */
+/* Data types in instruction templates:  DF, DF.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__frecipe_d (double _1)
+{
+  __builtin_loongarch_frecipe_d ((double) _1);
+}
+
+/* Assembly instruction format: fd, fj.  */
+/* Data types in instruction templates:  SF, SF.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__frsqrte_s (float _1)
+{
+  __builtin_loongarch_frsqrte_s ((float) _1);
+}
+
+/* Assembly instruction format: fd, fj.  */
+/* Data types in instruction templates:  DF, DF.  */
+extern __inline void
+__attribute__ ((__gnu_inline__, __always_inline__, __artificial__))
+__frsqrte_d (double _1)
+{
+  __builtin_loongarch_frsqrte_d ((double) _1);
+}
+#endif
+
 /* Assembly instruction format:	ui15.  */
 /* Data types in instruction templates:  USI.  */
 #define __dbar(/*ui15*/ _1) __builtin_loongarch_dbar ((_1))
diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index 116b30c0774..f6e5208a6f1 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -40,8 +40,10 @@ (define_c_enum "unspec" [
   UNSPEC_LASX_XVFCVTL
   UNSPEC_LASX_XVFLOGB
   UNSPEC_LASX_XVFRECIP
+  UNSPEC_LASX_XVFRECIPE
   UNSPEC_LASX_XVFRINT
   UNSPEC_LASX_XVFRSQRT
+  UNSPEC_LASX_XVFRSQRTE
   UNSPEC_LASX_XVFCMP_SAF
   UNSPEC_LASX_XVFCMP_SEQ
   UNSPEC_LASX_XVFCMP_SLE
@@ -1633,6 +1635,17 @@ (define_insn "lasx_xvfrecip_<flasxfmt>"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
+;; Approximate Reciprocal Instructions.
+
+(define_insn "lasx_xvfrecipe_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+    (unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		  UNSPEC_LASX_XVFRECIPE))]
+  "ISA_HAS_LASX && TARGET_FRECIPE"
+  "xvfrecipe.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "lasx_xvfrsqrt_<flasxfmt>"
   [(set (match_operand:FLASX 0 "register_operand" "=f")
 	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
@@ -1642,6 +1655,17 @@ (define_insn "lasx_xvfrsqrt_<flasxfmt>"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
+;; Approximate Reciprocal Square Root Instructions.
+
+(define_insn "lasx_xvfrsqrte_<flasxfmt>"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+    (unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		  UNSPEC_LASX_XVFRSQRTE))]
+  "ISA_HAS_LASX && TARGET_FRECIPE"
+  "xvfrsqrte.<flasxfmt>\t%u0,%u1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "lasx_xvftint_u_<ilasxfmt_u>_<flasxfmt>"
   [(set (match_operand:<VIMODE256> 0 "register_operand" "=f")
 	(unspec:<VIMODE256> [(match_operand:FLASX 1 "register_operand" "f")]
diff --git a/gcc/config/loongarch/lasxintrin.h b/gcc/config/loongarch/lasxintrin.h
index 7bce2c757f1..5e65e76e74c 100644
--- a/gcc/config/loongarch/lasxintrin.h
+++ b/gcc/config/loongarch/lasxintrin.h
@@ -2399,6 +2399,40 @@ __m256d __lasx_xvfrecip_d (__m256d _1)
   return (__m256d)__builtin_lasx_xvfrecip_d ((v4f64)_1);
 }
 
+#if defined(__loongarch_frecipe)
+/* Assembly instruction format: xd, xj.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrecipe_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrecipe_s ((v8f32)_1);
+}
+
+/* Assembly instruction format: xd, xj.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrecipe_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrecipe_d ((v4f64)_1);
+}
+
+/* Assembly instruction format: xd, xj.  */
+/* Data types in instruction templates:  V8SF, V8SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256 __lasx_xvfrsqrte_s (__m256 _1)
+{
+  return (__m256)__builtin_lasx_xvfrsqrte_s ((v8f32)_1);
+}
+
+/* Assembly instruction format: xd, xj.  */
+/* Data types in instruction templates:  V4DF, V4DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m256d __lasx_xvfrsqrte_d (__m256d _1)
+{
+  return (__m256d)__builtin_lasx_xvfrsqrte_d ((v4f64)_1);
+}
+#endif
+
 /* Assembly instruction format:	xd, xj.  */
 /* Data types in instruction templates:  V8SF, V8SF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index 5d037ab7f10..507fc953c72 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -120,6 +120,9 @@ struct loongarch_builtin_description
 AVAIL_ALL (hard_float, TARGET_HARD_FLOAT_ABI)
 AVAIL_ALL (lsx, ISA_HAS_LSX)
 AVAIL_ALL (lasx, ISA_HAS_LASX)
+AVAIL_ALL (frecipe, TARGET_FRECIPE && TARGET_HARD_FLOAT_ABI)
+AVAIL_ALL (lsx_frecipe, ISA_HAS_LSX && TARGET_FRECIPE)
+AVAIL_ALL (lasx_frecipe, ISA_HAS_LASX && TARGET_FRECIPE)
 
 /* Construct a loongarch_builtin_description from the given arguments.
 
@@ -164,6 +167,15 @@ AVAIL_ALL (lasx, ISA_HAS_LASX)
     "__builtin_lsx_" #INSN,  LARCH_BUILTIN_DIRECT,			\
     FUNCTION_TYPE, loongarch_builtin_avail_lsx }
 
+ /* Define an LSX LARCH_BUILTIN_DIRECT function __builtin_lsx_<INSN>
+    for instruction CODE_FOR_lsx_<INSN>.  FUNCTION_TYPE is a builtin_description
+    field. AVAIL is the name of the availability predicate, without the leading
+    loongarch_builtin_avail_.  */
+#define LSX_EXT_BUILTIN(INSN, FUNCTION_TYPE, AVAIL)                     \
+  { CODE_FOR_lsx_ ## INSN,                                              \
+    "__builtin_lsx_" #INSN,  LARCH_BUILTIN_DIRECT,                      \
+    FUNCTION_TYPE, loongarch_builtin_avail_##AVAIL }
+
 
 /* Define an LSX LARCH_BUILTIN_LSX_TEST_BRANCH function __builtin_lsx_<INSN>
    for instruction CODE_FOR_lsx_<INSN>.  FUNCTION_TYPE is a builtin_description
@@ -189,6 +201,15 @@ AVAIL_ALL (lasx, ISA_HAS_LASX)
     "__builtin_lasx_" #INSN,  LARCH_BUILTIN_LASX,			\
     FUNCTION_TYPE, loongarch_builtin_avail_lasx }
 
+/* Define an LASX LARCH_BUILTIN_DIRECT function __builtin_lasx_<INSN>
+   for instruction CODE_FOR_lasx_<INSN>.  FUNCTION_TYPE is a builtin_description
+   field. AVAIL is the name of the availability predicate, without the leading
+   loongarch_builtin_avail_.  */
+#define LASX_EXT_BUILTIN(INSN, FUNCTION_TYPE, AVAIL)                    \
+  { CODE_FOR_lasx_ ## INSN,                                             \
+    "__builtin_lasx_" #INSN,  LARCH_BUILTIN_LASX,                       \
+    FUNCTION_TYPE, loongarch_builtin_avail_##AVAIL }
+
 /* Define an LASX LARCH_BUILTIN_DIRECT_NO_TARGET function __builtin_lasx_<INSN>
    for instruction CODE_FOR_lasx_<INSN>.  FUNCTION_TYPE is a builtin_description
    field.  */
@@ -804,6 +825,27 @@ static const struct loongarch_builtin_description loongarch_builtins[] = {
   DIRECT_NO_TARGET_BUILTIN (syscall, LARCH_VOID_FTYPE_USI, default),
   DIRECT_NO_TARGET_BUILTIN (break, LARCH_VOID_FTYPE_USI, default),
 
+  /* Built-in functions for frecipe.{s/d} and frsqrte.{s/d}.  */
+
+  DIRECT_BUILTIN (frecipe_s, LARCH_SF_FTYPE_SF, frecipe),
+  DIRECT_BUILTIN (frecipe_d, LARCH_DF_FTYPE_DF, frecipe),
+  DIRECT_BUILTIN (frsqrte_s, LARCH_SF_FTYPE_SF, frecipe),
+  DIRECT_BUILTIN (frsqrte_d, LARCH_DF_FTYPE_DF, frecipe),
+
+  /* Built-in functions for new LSX instructions.  */
+
+  LSX_EXT_BUILTIN (vfrecipe_s, LARCH_V4SF_FTYPE_V4SF, lsx_frecipe),
+  LSX_EXT_BUILTIN (vfrecipe_d, LARCH_V2DF_FTYPE_V2DF, lsx_frecipe),
+  LSX_EXT_BUILTIN (vfrsqrte_s, LARCH_V4SF_FTYPE_V4SF, lsx_frecipe),
+  LSX_EXT_BUILTIN (vfrsqrte_d, LARCH_V2DF_FTYPE_V2DF, lsx_frecipe),
+
+  /* Built-in functions for new LASX instructions.  */
+
+  LASX_EXT_BUILTIN (xvfrecipe_s, LARCH_V8SF_FTYPE_V8SF, lasx_frecipe),
+  LASX_EXT_BUILTIN (xvfrecipe_d, LARCH_V4DF_FTYPE_V4DF, lasx_frecipe),
+  LASX_EXT_BUILTIN (xvfrsqrte_s, LARCH_V8SF_FTYPE_V8SF, lasx_frecipe),
+  LASX_EXT_BUILTIN (xvfrsqrte_d, LARCH_V4DF_FTYPE_V4DF, lasx_frecipe),
+
   /* Built-in functions for LSX.  */
   LSX_BUILTIN (vsll_b, LARCH_V16QI_FTYPE_V16QI_V16QI),
   LSX_BUILTIN (vsll_h, LARCH_V8HI_FTYPE_V8HI_V8HI),
diff --git a/gcc/config/loongarch/loongarch-c.cc b/gcc/config/loongarch/loongarch-c.cc
index fbc33a10351..44f52245c78 100644
--- a/gcc/config/loongarch/loongarch-c.cc
+++ b/gcc/config/loongarch/loongarch-c.cc
@@ -102,6 +102,9 @@ loongarch_cpu_cpp_builtins (cpp_reader *pfile)
   else
     builtin_define ("__loongarch_frlen=0");
 
+  if (TARGET_HARD_FLOAT && TARGET_FRECIPE)
+    builtin_define ("__loongarch_frecipe");
+
   if (ISA_HAS_LSX)
     {
       builtin_define ("__loongarch_simd");
diff --git a/gcc/config/loongarch/loongarch-cpucfg-map.h b/gcc/config/loongarch/loongarch-cpucfg-map.h
index 02ff1671255..148333c249c 100644
--- a/gcc/config/loongarch/loongarch-cpucfg-map.h
+++ b/gcc/config/loongarch/loongarch-cpucfg-map.h
@@ -29,6 +29,7 @@ static constexpr struct {
   unsigned int cpucfg_bit;
   HOST_WIDE_INT isa_evolution_bit;
 } cpucfg_map[] = {
+  { 2, 1u << 25, OPTION_MASK_ISA_FRECIPE },
   { 2, 1u << 26, OPTION_MASK_ISA_DIV32 },
   { 2, 1u << 27, OPTION_MASK_ISA_LAM_BH },
   { 2, 1u << 28, OPTION_MASK_ISA_LAMCAS },
diff --git a/gcc/config/loongarch/loongarch-def.cc b/gcc/config/loongarch/loongarch-def.cc
index bc6997e45b5..c41804a180e 100644
--- a/gcc/config/loongarch/loongarch-def.cc
+++ b/gcc/config/loongarch/loongarch-def.cc
@@ -60,7 +60,8 @@ array_arch<loongarch_isa> loongarch_cpu_default_isa =
 	    .fpu_ (ISA_EXT_FPU64)
 	    .simd_ (ISA_EXT_SIMD_LASX)
 	    .evolution_ (OPTION_MASK_ISA_DIV32 | OPTION_MASK_ISA_LD_SEQ_SA
-		    | OPTION_MASK_ISA_LAM_BH | OPTION_MASK_ISA_LAMCAS));
+			 | OPTION_MASK_ISA_LAM_BH | OPTION_MASK_ISA_LAMCAS
+			 | OPTION_MASK_ISA_FRECIPE));
 
 static inline loongarch_cache la464_cache ()
 {
diff --git a/gcc/config/loongarch/loongarch-str.h b/gcc/config/loongarch/loongarch-str.h
index 7c78d1443d5..4d1bfd675e8 100644
--- a/gcc/config/loongarch/loongarch-str.h
+++ b/gcc/config/loongarch/loongarch-str.h
@@ -68,6 +68,7 @@ along with GCC; see the file COPYING3.  If not see
 #define STR_EXPLICIT_RELOCS_NONE "none"
 #define STR_EXPLICIT_RELOCS_ALWAYS "always"
 
+#define OPTSTR_FRECIPE	"frecipe"
 #define OPTSTR_DIV32	"div32"
 #define OPTSTR_LAM_BH	"lam-bh"
 #define OPTSTR_LAMCAS	"lamcas"
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 3545e66a10e..57a20bec8a4 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -11503,6 +11503,7 @@ loongarch_asm_code_end (void)
 	       loongarch_cpu_strings [la_target.cpu_tune]);
       fprintf (asm_out_file, "%s Base ISA: %s\n", ASM_COMMENT_START,
 	       loongarch_isa_base_strings [la_target.isa.base]);
+      DUMP_FEATURE (TARGET_FRECIPE);
       DUMP_FEATURE (TARGET_DIV32);
       DUMP_FEATURE (TARGET_LAM_BH);
       DUMP_FEATURE (TARGET_LAMCAS);
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 7a101dd64b7..07beede8892 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -59,6 +59,12 @@ (define_c_enum "unspec" [
   ;; Stack tie
   UNSPEC_TIE
 
+  ;; RSQRT
+  UNSPEC_RSQRTE
+
+  ;; RECIP
+  UNSPEC_RECIPE
+
   ;; CRC
   UNSPEC_CRC
   UNSPEC_CRCC
@@ -220,6 +226,7 @@ (define_attr "qword_mode" "no,yes"
 ;; fmadd	floating point multiply-add
 ;; fdiv		floating point divide
 ;; frdiv	floating point reciprocal divide
+;; frecipe      floating point approximate reciprocal
 ;; fabs		floating point absolute value
 ;; flogb	floating point exponent extract
 ;; fneg		floating point negation
@@ -229,6 +236,7 @@ (define_attr "qword_mode" "no,yes"
 ;; fscaleb	floating point scale
 ;; fsqrt	floating point square root
 ;; frsqrt       floating point reciprocal square root
+;; frsqrte      floating point approximate reciprocal square root
 ;; multi	multiword sequence (or user asm statements)
 ;; atomic	atomic memory update instruction
 ;; syncloop	memory atomic operation implemented as a sync loop
@@ -238,8 +246,8 @@ (define_attr "type"
   "unknown,branch,jump,call,load,fpload,fpidxload,store,fpstore,fpidxstore,
    prefetch,prefetchx,condmove,mgtf,mftg,const,arith,logical,
    shift,slt,signext,clz,trap,imul,idiv,move,
-   fmove,fadd,fmul,fmadd,fdiv,frdiv,fabs,flogb,fneg,fcmp,fcopysign,fcvt,
-   fscaleb,fsqrt,frsqrt,accext,accmod,multi,atomic,syncloop,nop,ghost,
+   fmove,fadd,fmul,fmadd,fdiv,frdiv,frecipe,fabs,flogb,fneg,fcmp,fcopysign,fcvt,
+   fscaleb,fsqrt,frsqrt,frsqrte,accext,accmod,multi,atomic,syncloop,nop,ghost,
    simd_div,simd_fclass,simd_flog2,simd_fadd,simd_fcvt,simd_fmul,simd_fmadd,
    simd_fdiv,simd_bitins,simd_bitmov,simd_insert,simd_sld,simd_mul,simd_fcmp,
    simd_fexp2,simd_int_arith,simd_bit,simd_shift,simd_splat,simd_fill,
@@ -908,6 +916,18 @@ (define_insn "*recip<mode>3"
   [(set_attr "type" "frdiv")
    (set_attr "mode" "<UNITMODE>")])
 
+;; Approximate Reciprocal Instructions.
+
+(define_insn "loongarch_frecipe_<fmt>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+    (unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
+	     UNSPEC_RECIPE))]
+  "TARGET_FRECIPE"
+  "frecipe.<fmt>\t%0,%1"
+  [(set_attr "type" "frecipe")
+   (set_attr "mode" "<UNITMODE>")
+   (set_attr "insn_count" "1")])
+
 ;; Integer division and modulus.
 (define_expand "<optab><mode>3"
   [(set (match_operand:GPR 0 "register_operand")
@@ -1133,6 +1153,17 @@ (define_insn "*rsqrt<mode>b"
   [(set_attr "type" "frsqrt")
    (set_attr "mode" "<UNITMODE>")
    (set_attr "insn_count" "1")])
+
+;; Approximate Reciprocal Square Root Instructions.
+
+(define_insn "loongarch_frsqrte_<fmt>"
+  [(set (match_operand:ANYF 0 "register_operand" "=f")
+    (unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
+		 UNSPEC_RSQRTE))]
+  "TARGET_FRECIPE"
+  "frsqrte.<fmt>\t%0,%1"
+  [(set_attr "type" "frsqrte")
+   (set_attr "mode" "<UNITMODE>")])
 
 ;;
 ;;  ....................
diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt
index 41e6424e861..cdd59ae4fcf 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -260,6 +260,10 @@ default value is 4.
 Variable
 HOST_WIDE_INT isa_evolution = 0
 
+mfrecipe
+Target Mask(ISA_FRECIPE) Var(isa_evolution)
+Support frecipe.{s/d} and frsqrte.{s/d} instructions.
+
 mdiv32
 Target Mask(ISA_DIV32) Var(isa_evolution)
 Support div.w[u] and mod.w[u] instructions with inputs not sign-extended.
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index 23239993404..e2393aed139 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -42,8 +42,10 @@ (define_c_enum "unspec" [
   UNSPEC_LSX_VFCVTL
   UNSPEC_LSX_VFLOGB
   UNSPEC_LSX_VFRECIP
+  UNSPEC_LSX_VFRECIPE
   UNSPEC_LSX_VFRINT
   UNSPEC_LSX_VFRSQRT
+  UNSPEC_LSX_VFRSQRTE
   UNSPEC_LSX_VFCMP_SAF
   UNSPEC_LSX_VFCMP_SEQ
   UNSPEC_LSX_VFCMP_SLE
@@ -1546,6 +1548,17 @@ (define_insn "lsx_vfrecip_<flsxfmt>"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
+;; Approximate Reciprocal Instructions.
+
+(define_insn "lsx_vfrecipe_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+    (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		 UNSPEC_LSX_VFRECIPE))]
+  "ISA_HAS_LSX && TARGET_FRECIPE"
+  "vfrecipe.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "lsx_vfrsqrt_<flsxfmt>"
   [(set (match_operand:FLSX 0 "register_operand" "=f")
 	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
@@ -1555,6 +1568,17 @@ (define_insn "lsx_vfrsqrt_<flsxfmt>"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
+;; Approximate Reciprocal Square Root Instructions.
+
+(define_insn "lsx_vfrsqrte_<flsxfmt>"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+    (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		 UNSPEC_LSX_VFRSQRTE))]
+  "ISA_HAS_LSX && TARGET_FRECIPE"
+  "vfrsqrte.<flsxfmt>\t%w0,%w1"
+  [(set_attr "type" "simd_fdiv")
+   (set_attr "mode" "<MODE>")])
+
 (define_insn "lsx_vftint_u_<ilsxfmt_u>_<flsxfmt>"
   [(set (match_operand:<VIMODE> 0 "register_operand" "=f")
 	(unspec:<VIMODE> [(match_operand:FLSX 1 "register_operand" "f")]
diff --git a/gcc/config/loongarch/lsxintrin.h b/gcc/config/loongarch/lsxintrin.h
index 29553c093fa..57a6fc40a8f 100644
--- a/gcc/config/loongarch/lsxintrin.h
+++ b/gcc/config/loongarch/lsxintrin.h
@@ -2480,6 +2480,40 @@ __m128d __lsx_vfrecip_d (__m128d _1)
   return (__m128d)__builtin_lsx_vfrecip_d ((v2f64)_1);
 }
 
+#if defined(__loongarch_frecipe)
+/* Assembly instruction format: vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrecipe_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrecipe_s ((v4f32)_1);
+}
+
+/* Assembly instruction format: vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrecipe_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrecipe_d ((v2f64)_1);
+}
+
+/* Assembly instruction format: vd, vj.  */
+/* Data types in instruction templates:  V4SF, V4SF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128 __lsx_vfrsqrte_s (__m128 _1)
+{
+  return (__m128)__builtin_lsx_vfrsqrte_s ((v4f32)_1);
+}
+
+/* Assembly instruction format: vd, vj.  */
+/* Data types in instruction templates:  V2DF, V2DF.  */
+extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
+__m128d __lsx_vfrsqrte_d (__m128d _1)
+{
+  return (__m128d)__builtin_lsx_vfrsqrte_d ((v2f64)_1);
+}
+#endif
+
 /* Assembly instruction format:	vd, vj.  */
 /* Data types in instruction templates:  V4SF, V4SF.  */
 extern __inline __attribute__((__gnu_inline__, __always_inline__, __artificial__))
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index 32ae15e1d5b..98c6d320fbe 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -17027,6 +17027,14 @@ The intrinsics provided are listed below:
     void __builtin_loongarch_break (imm0_32767)
 @end smallexample
 
+These instrisic functions are available by using @option{-mfrecipe}.
+@smallexample
+    float __builtin_loongarch_frecipe_s (float);
+    double  __builtin_loongarch_frecipe_d (double);
+    float __builtin_loongarch_frsqrte_s (float);
+    double  __builtin_loongarch_frsqrte_d (double);
+@end smallexample
+
 @emph{Note:}Since the control register is divided into 32-bit and 64-bit,
 but the access instruction is not distinguished. So GCC renames the control
 instructions when implementing intrinsics.
@@ -17099,6 +17107,15 @@ function you need to include @code{larchintrin.h}.
     void __break (imm0_32767)
 @end smallexample
 
+These instrisic functions are available by including @code{larchintrin.h} and
+using @option{-mfrecipe}.
+@smallexample
+    float __frecipe_s (float);
+    double __frecipe_d (double);
+    float __frsqrte_s (float);
+    double __frsqrte_d (double);
+@end smallexample
+
 Additional built-in functions are available for LoongArch family
 processors to efficiently use 128-bit floating-point (__float128)
 values.
@@ -17939,6 +17956,15 @@ __m128i __lsx_vxori_b (__m128i, imm0_255);
 __m128i __lsx_vxor_v (__m128i, __m128i);
 @end smallexample
 
+These instrisic functions are available by including @code{lsxintrin.h} and
+using @option{-mfrecipe} and @option{-mlsx}.
+@smallexample
+__m128d __lsx_vfrecipe_d (__m128d);
+__m128 __lsx_vfrecipe_s (__m128);
+__m128d __lsx_vfrsqrte_d (__m128d);
+__m128 __lsx_vfrsqrte_s (__m128);
+@end smallexample
+
 @node LoongArch ASX Vector Intrinsics
 @subsection LoongArch ASX Vector Intrinsics
 
@@ -18778,6 +18804,15 @@ __m256i __lasx_xvxori_b (__m256i, imm0_255);
 __m256i __lasx_xvxor_v (__m256i, __m256i);
 @end smallexample
 
+These instrisic functions are available by including @code{lasxintrin.h} and
+using @option{-mfrecipe} and @option{-mlasx}.
+@smallexample
+__m256d __lasx_xvfrecipe_d (__m256d);
+__m256 __lasx_xvfrecipe_s (__m256);
+__m256d __lasx_xvfrsqrte_d (__m256d);
+__m256 __lasx_xvfrsqrte_s (__m256);
+@end smallexample
+
 @node MIPS DSP Built-in Functions
 @subsection MIPS DSP Built-in Functions
 
diff --git a/gcc/testsuite/gcc.target/loongarch/larch-frecipe-builtin.c b/gcc/testsuite/gcc.target/loongarch/larch-frecipe-builtin.c
new file mode 100644
index 00000000000..b9329f34676
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/larch-frecipe-builtin.c
@@ -0,0 +1,28 @@
+/* Test builtins for frecipe.{s/d} and frsqrte.{s/d} instructions */
+/* { dg-do compile } */
+/* { dg-options "-mfrecipe" } */
+/* { dg-final { scan-assembler-times "test_frecipe_s:.*frecipe\\.s.*test_frecipe_s" 1 } } */
+/* { dg-final { scan-assembler-times "test_frecipe_d:.*frecipe\\.d.*test_frecipe_d" 1 } } */
+/* { dg-final { scan-assembler-times "test_frsqrte_s:.*frsqrte\\.s.*test_frsqrte_s" 1 } } */
+/* { dg-final { scan-assembler-times "test_frsqrte_d:.*frsqrte\\.d.*test_frsqrte_d" 1 } } */
+
+float
+test_frecipe_s (float _1)
+{
+  return __builtin_loongarch_frecipe_s (_1);
+}
+double
+test_frecipe_d (double _1)
+{
+  return __builtin_loongarch_frecipe_d (_1);
+}
+float
+test_frsqrte_s (float _1)
+{
+  return __builtin_loongarch_frsqrte_s (_1);
+}
+double
+test_frsqrte_d (double _1)
+{
+  return __builtin_loongarch_frsqrte_d (_1);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-frecipe-builtin.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-frecipe-builtin.c
new file mode 100644
index 00000000000..522535b45a3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-frecipe-builtin.c
@@ -0,0 +1,30 @@
+/* Test builtins for xvfrecipe.{s/d} and xvfrsqrte.{s/d} instructions */
+/* { dg-do compile } */
+/* { dg-options "-mlasx -mfrecipe" } */
+/* { dg-final { scan-assembler-times "lasx_xvfrecipe_s:.*xvfrecipe\\.s.*lasx_xvfrecipe_s" 1 } } */
+/* { dg-final { scan-assembler-times "lasx_xvfrecipe_d:.*xvfrecipe\\.d.*lasx_xvfrecipe_d" 1 } } */
+/* { dg-final { scan-assembler-times "lasx_xvfrsqrte_s:.*xvfrsqrte\\.s.*lasx_xvfrsqrte_s" 1 } } */
+/* { dg-final { scan-assembler-times "lasx_xvfrsqrte_d:.*xvfrsqrte\\.d.*lasx_xvfrsqrte_d" 1 } } */
+
+#include <lasxintrin.h>
+
+v8f32
+__lasx_xvfrecipe_s (v8f32 _1)
+{
+  return __builtin_lasx_xvfrecipe_s (_1);
+}
+v4f64
+__lasx_xvfrecipe_d (v4f64 _1)
+{
+  return __builtin_lasx_xvfrecipe_d (_1);
+}
+v8f32
+__lasx_xvfrsqrte_s (v8f32 _1)
+{
+  return __builtin_lasx_xvfrsqrte_s (_1);
+}
+v4f64
+__lasx_xvfrsqrte_d (v4f64 _1)
+{
+  return __builtin_lasx_xvfrsqrte_d (_1);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-frecipe-builtin.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-frecipe-builtin.c
new file mode 100644
index 00000000000..4ad0cb0ffd6
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-frecipe-builtin.c
@@ -0,0 +1,30 @@
+/* Test builtins for vfrecipe.{s/d} and vfrsqrte.{s/d} instructions */
+/* { dg-do compile } */
+/* { dg-options "-mlsx -mfrecipe" } */
+/* { dg-final { scan-assembler-times "lsx_vfrecipe_s:.*vfrecipe\\.s.*lsx_vfrecipe_s" 1 } } */
+/* { dg-final { scan-assembler-times "lsx_vfrecipe_d:.*vfrecipe\\.d.*lsx_vfrecipe_d" 1 } } */
+/* { dg-final { scan-assembler-times "lsx_vfrsqrte_s:.*vfrsqrte\\.s.*lsx_vfrsqrte_s" 1 } } */
+/* { dg-final { scan-assembler-times "lsx_vfrsqrte_d:.*vfrsqrte\\.d.*lsx_vfrsqrte_d" 1 } } */
+
+#include <lsxintrin.h>
+
+v4f32
+__lsx_vfrecipe_s (v4f32 _1)
+{
+  return __builtin_lsx_vfrecipe_s (_1);
+}
+v2f64
+__lsx_vfrecipe_d (v2f64 _1)
+{
+  return __builtin_lsx_vfrecipe_d (_1);
+}
+v4f32
+__lsx_vfrsqrte_s (v4f32 _1)
+{
+  return __builtin_lsx_vfrsqrte_s (_1);
+}
+v2f64
+__lsx_vfrsqrte_d (v2f64 _1)
+{
+  return __builtin_lsx_vfrsqrte_d (_1);
+}

From patchwork Wed Dec  6 07:04:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiahao Xu <xujiahao@loongson.cn>
X-Patchwork-Id: 174338
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3929531vqy;
        Tue, 5 Dec 2023 23:06:09 -0800 (PST)
X-Google-Smtp-Source: 
 AGHT+IHSpBJifWsebcxvL/bifZysj5NnQhxbsl9hzyvPR7Y/Q7ocQN+E4U1r6x4KGZIbDM8o9JXo
X-Received: by 2002:ad4:42cd:0:b0:67a:9797:d88e with SMTP id
 f13-20020ad442cd000000b0067a9797d88emr455596qvr.54.1701846369182;
        Tue, 05 Dec 2023 23:06:09 -0800 (PST)
ARC-Seal: i=2; a=rsa-sha256; t=1701846369; cv=pass;
        d=google.com; s=arc-20160816;
        b=g8stz+81EkIgI5+E5R79NM7eshCigy04Y6vFtj5ZHIn2rMy4wm5SkRnyCflK3WZP5t
         8uz0c3n2VElFXRbW9misnyxRxZbwTZMtUzqP2BY4nAHevl+uFG+aStTKtZS44ViYB/s+
         DKq4QSRoMzr/GsG0GA2hmEIMIu9KoAdHixxuQVVTrpDLwEVk3oQ+NEs5Yb7+ThXrjUwx
         OLE/wIIOQpdVH/U6i3zy3gs9JKLPJYwoKvlIUMVZ7XoDObYBtn+SKVJbJpxxzkuGSXxi
         lBHZVggRxjYX9v4S4gXUnY4AI15z/I3vJ4aojW6u0G5KgaUEdfSLziXk9cPFYMcQMrcy
         fKBA==
ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=errors-to:list-subscribe:list-help:list-post:list-archive
         :list-unsubscribe:list-id:precedence:content-transfer-encoding
         :mime-version:references:in-reply-to:message-id:date:subject:cc:to
         :from:arc-filter:dmarc-filter:delivered-to;
        bh=+d997pnMMBTgUlV3vKL3cLAEb1yYGMt0Z8yiMyAcmWE=;
        fh=w+xsGLuzpTj56E0bzPMCc39RWHkBXt2f2AGs+4pGimo=;
        b=sovUt4P4zDe0gnly169gn86STErqaW4p8tK37MTyQolaLkvKUEqh6yFSMIyPMY8d+3
         x8XTzsdAhv6iyq3xs5BoGn92llFsASmgfovkCrTQIzgECFvDqlVUJL7i26nGdP7jppE1
         5GaLLlTzJedJsoR+ZsajINhcLOVk0VdiuuZurPjC/iIV001iVKl7s+Qv+X5C0P/QzLtO
         OcODkp0meKw2dUmiLZu5OD8LXa65IJ4THSgiSseM2bOTbBDoRaTKoluBKvY2PaRhZUKm
         OoLhleM0muxcwCFnwA/LjGqP1NcDnO+XESqjVUuCFhjlZ9W9v4lh+Va2EyhZxSXI4Wtm
         7jNw==
ARC-Authentication-Results: i=2; mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (server2.sourceware.org.
 [2620:52:3:1:0:246e:9693:128c])
        by mx.google.com with ESMTPS id
 ov12-20020a05620a628c00b0077d6c574808si13969207qkn.491.2023.12.05.23.06.09
        for <ouuuleilei@gmail.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 05 Dec 2023 23:06:09 -0800 (PST)
Received-SPF: pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 client-ip=2620:52:3:1:0:246e:9693:128c;
Authentication-Results: mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 813E438618BA
	for <ouuuleilei@gmail.com>; Wed,  6 Dec 2023 07:06:05 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10])
 by sourceware.org (Postfix) with ESMTPS id 80E113865C21
 for <gcc-patches@gcc.gnu.org>; Wed,  6 Dec 2023 07:05:18 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 80E113865C21
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=loongson.cn
Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=loongson.cn
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 80E113865C21
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=2001:470:142:3::10
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701846324; cv=none;
 b=FZjwEbF3JUEvW6+7J0Iozq2J2UtJy4FgeWHnOg3kaURMfDTx9R5E8rN8rAcW+sRPNI9tWWoBkYT/Z/6UBjOo1XbRz/VQOSrCxZpypWAjVLgRzI6A5mN88ZyNyonS8RFxUkwaRr0r21/FFS0QdkCULl0QQd9RLznjh1DJ92gT7uM=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1701846324; c=relaxed/simple;
 bh=SYLL0MPKyX3ubCnXPY2Rc8+Yoz12WrTvhRZj6ryyT7Y=;
 h=From:To:Subject:Date:Message-Id:MIME-Version;
 b=KjvszpjguK24MQrzNV+yWpZQW9KIcWOAfqJ+YeG10cCkpsbJ8YQT2M3b41Tb2UohDQ43xZ4uyZvYPHtTIQUnZPsSb/JPrAc01cCJW6HsYjRRGKuU2SCdenTcbOunsCyv7AIq5IEhOVSeGWpVT0zFMGJcsxD2gJ90b2N8uXV1mp8=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from mail.loongson.cn ([114.242.206.163])
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <xujiahao@loongson.cn>) id 1rAly6-0000no-FG
 for gcc-patches@gcc.gnu.org; Wed, 06 Dec 2023 02:05:17 -0500
Received: from loongson.cn (unknown [10.10.130.252])
 by gateway (Coremail) with SMTP id _____8AxlPAfHXBlRjs_AA--.60693S3;
 Wed, 06 Dec 2023 15:05:03 +0800 (CST)
Received: from slurm-master.loongson.cn (unknown [10.10.130.252])
 by localhost.localdomain (Coremail) with SMTP id
 AQAAf8Dxvi8XHXBlp0BWAA--.59594S6;
 Wed, 06 Dec 2023 15:05:01 +0800 (CST)
From: Jiahao Xu <xujiahao@loongson.cn>
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn,
 xuchenghua@loongson.cn, Jiahao Xu <xujiahao@loongson.cn>
Subject: [PATCH v3 2/5] LoongArch: Use standard pattern name for
 xvfrsqrt/vfrsqrt instructions.
Date: Wed,  6 Dec 2023 15:04:50 +0800
Message-Id: <20231206070453.3252-3-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20231206070453.3252-1-xujiahao@loongson.cn>
References: <20231206070453.3252-1-xujiahao@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8Dxvi8XHXBlp0BWAA--.59594S6
X-CM-SenderInfo: 50xmxthkdrqz5rrqw2lrqou0/
X-Coremail-Antispam: 1Uk129KBj93XoW3WF4xCFyUWr4kWw1xWw4Dtrc_yoW3uw18p3
 9rCw1vyrW8JFs7Kr1kt3y5Xr45tr9rGF129a9I93y2kan0q3WDZF1vkFZFqFyjqw4rGryI
 vw4rW3WjvFWUC3cCm3ZEXasCq-sJn29KB7ZKAUJUUUU5529EdanIXcx71UUUUU7KY7ZEXa
 sCq-sGcSsGvfJ3Ic02F40EFcxC0VAKzVAqx4xG6I80ebIjqfuFe4nvWSU5nxnvy29KBjDU
 0xBIdaVrnRJUUUkFb4IE77IF4wAFF20E14v26r1j6r4UM7CY07I20VC2zVCF04k26cxKx2
 IYs7xG6rWj6s0DM7CIcVAFz4kK6r126r13M28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48v
 e4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI
 0_Gr0_Cr1l84ACjcxK6I8E87Iv67AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVCY1x0267AK
 xVW8Jr0_Cr1UM2AIxVAIcxkEcVAq07x20xvEncxIr21l57IF6xkI12xvs2x26I8E6xACxx
 1l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20xvE14v26r126r1DMcIj6I8E87Iv
 67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41l42xK82IYc2
 Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s02
 6x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r126r1DMIIYrxkI7VAKI48JMIIF0x
 vE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF7I0E14v26r1j6r4UMIIF0xvE
 42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6x
 kF7I0E14v26r1j6r4UYxBIdaVFxhVjvjDU0xZFpf9x07jjpB-UUUUU=
Received-SPF: pass client-ip=114.242.206.163;
 envelope-from=xujiahao@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, KAM_SHORT, SPF_FAIL, SPF_HELO_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1784515258233077068
X-GMAIL-MSGID: 1784515258233077068

Rename lasx_xvfrsqrt*/lsx_vfrsqrt* to rsqrt<mode>2 to align with standard
pattern name. Define function use_rsqrt_p to decide when to use rsqrt optab.

gcc/ChangeLog:

	* config/loongarch/lasx.md (lasx_xvfrsqrt_<flasxfmt>): Renamed to ..
	(rsqrt<mode>2): .. this.
	* config/loongarch/loongarch-builtins.cc
	(CODE_FOR_lsx_vfrsqrt_d): Redefine to standard pattern name.
	(CODE_FOR_lsx_vfrsqrt_s): Ditto.
	(CODE_FOR_lasx_xvfrsqrt_d): Ditto.
	(CODE_FOR_lasx_xvfrsqrt_s): Ditto.
	* config/loongarch/loongarch.cc (use_rsqrt_p): New function.
	(loongarch_optab_supported_p): Ditto.
	(TARGET_OPTAB_SUPPORTED_P): New hook.
	* config/loongarch/loongarch.md (*rsqrt<mode>a): Remove.
	(*rsqrt<mode>2): New insn pattern.
	(*rsqrt<mode>b): Remove.
	* config/loongarch/lsx.md (lsx_vfrsqrt_<flsxfmt>): Renamed to ..
	(rsqrt<mode>2): .. this.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vector/lasx/lasx-rsqrt.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-rsqrt.c: New test.

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index f6e5208a6f1..c8edc1bfd76 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -1646,10 +1646,10 @@ (define_insn "lasx_xvfrecipe_<flasxfmt>"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "lasx_xvfrsqrt_<flasxfmt>"
+(define_insn "rsqrt<mode>2"
   [(set (match_operand:FLASX 0 "register_operand" "=f")
-	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
-		      UNSPEC_LASX_XVFRSQRT))]
+    (unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+		  UNSPEC_LASX_XVFRSQRT))]
   "ISA_HAS_LASX"
   "xvfrsqrt.<flasxfmt>\t%u0,%u1"
   [(set_attr "type" "simd_fdiv")
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index 507fc953c72..ba8686d4ceb 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -500,6 +500,8 @@ AVAIL_ALL (lasx_frecipe, ISA_HAS_LASX && TARGET_FRECIPE)
 #define CODE_FOR_lsx_vssrlrn_bu_h CODE_FOR_lsx_vssrlrn_u_bu_h
 #define CODE_FOR_lsx_vssrlrn_hu_w CODE_FOR_lsx_vssrlrn_u_hu_w
 #define CODE_FOR_lsx_vssrlrn_wu_d CODE_FOR_lsx_vssrlrn_u_wu_d
+#define CODE_FOR_lsx_vfrsqrt_d CODE_FOR_rsqrtv2df2
+#define CODE_FOR_lsx_vfrsqrt_s CODE_FOR_rsqrtv4sf2
 
 /* LoongArch ASX define CODE_FOR_lasx_mxxx */
 #define CODE_FOR_lasx_xvsadd_b CODE_FOR_ssaddv32qi3
@@ -776,6 +778,8 @@ AVAIL_ALL (lasx_frecipe, ISA_HAS_LASX && TARGET_FRECIPE)
 #define CODE_FOR_lasx_xvsat_hu CODE_FOR_lasx_xvsat_u_hu
 #define CODE_FOR_lasx_xvsat_wu CODE_FOR_lasx_xvsat_u_wu
 #define CODE_FOR_lasx_xvsat_du CODE_FOR_lasx_xvsat_u_du
+#define CODE_FOR_lasx_xvfrsqrt_d CODE_FOR_rsqrtv4df2
+#define CODE_FOR_lasx_xvfrsqrt_s CODE_FOR_rsqrtv8sf2
 
 static const struct loongarch_builtin_description loongarch_builtins[] = {
 #define LARCH_MOVFCSR2GR 0
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 57a20bec8a4..96a4b846f2d 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -11487,6 +11487,30 @@ loongarch_builtin_support_vector_misalignment (machine_mode mode,
 						      is_packed);
 }
 
+static bool
+use_rsqrt_p (void)
+{
+  return (flag_finite_math_only
+	  && !flag_trapping_math
+	  && flag_unsafe_math_optimizations);
+}
+
+/* Implement the TARGET_OPTAB_SUPPORTED_P hook.  */
+
+static bool
+loongarch_optab_supported_p (int op, machine_mode, machine_mode,
+			     optimization_type opt_type)
+{
+  switch (op)
+    {
+    case rsqrt_optab:
+      return opt_type == OPTIMIZE_FOR_SPEED && use_rsqrt_p ();
+
+    default:
+      return true;
+    }
+}
+
 /* If -fverbose-asm, dump some info for debugging.  */
 static void
 loongarch_asm_code_end (void)
@@ -11625,6 +11649,9 @@ loongarch_asm_code_end (void)
 #undef TARGET_FUNCTION_ARG_BOUNDARY
 #define TARGET_FUNCTION_ARG_BOUNDARY loongarch_function_arg_boundary
 
+#undef TARGET_OPTAB_SUPPORTED_P
+#define TARGET_OPTAB_SUPPORTED_P loongarch_optab_supported_p
+
 #undef TARGET_VECTOR_MODE_SUPPORTED_P
 #define TARGET_VECTOR_MODE_SUPPORTED_P loongarch_vector_mode_supported_p
 
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index 07beede8892..fd154b02e48 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -60,6 +60,7 @@ (define_c_enum "unspec" [
   UNSPEC_TIE
 
   ;; RSQRT
+  UNSPEC_RSQRT
   UNSPEC_RSQRTE
 
   ;; RECIP
@@ -1134,25 +1135,14 @@ (define_insn "sqrt<mode>2"
    (set_attr "mode" "<UNITMODE>")
    (set_attr "insn_count" "1")])
 
-(define_insn "*rsqrt<mode>a"
+(define_insn "*rsqrt<mode>2"
   [(set (match_operand:ANYF 0 "register_operand" "=f")
-	(div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
-		  (sqrt:ANYF (match_operand:ANYF 2 "register_operand" "f"))))]
-  "flag_unsafe_math_optimizations"
-  "frsqrt.<fmt>\t%0,%2"
-  [(set_attr "type" "frsqrt")
-   (set_attr "mode" "<UNITMODE>")
-   (set_attr "insn_count" "1")])
-
-(define_insn "*rsqrt<mode>b"
-  [(set (match_operand:ANYF 0 "register_operand" "=f")
-	(sqrt:ANYF (div:ANYF (match_operand:ANYF 1 "const_1_operand" "")
-			     (match_operand:ANYF 2 "register_operand" "f"))))]
-  "flag_unsafe_math_optimizations"
-  "frsqrt.<fmt>\t%0,%2"
+    (unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
+	     UNSPEC_RSQRT))]
+  "TARGET_HARD_FLOAT"
+  "frsqrt.<fmt>\t%0,%1"
   [(set_attr "type" "frsqrt")
-   (set_attr "mode" "<UNITMODE>")
-   (set_attr "insn_count" "1")])
+   (set_attr "mode" "<UNITMODE>")])
 
 ;; Approximate Reciprocal Square Root Instructions.
 
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index e2393aed139..aeae1b1a622 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -1559,10 +1559,10 @@ (define_insn "lsx_vfrecipe_<flsxfmt>"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "lsx_vfrsqrt_<flsxfmt>"
+(define_insn "rsqrt<mode>2"
   [(set (match_operand:FLSX 0 "register_operand" "=f")
-	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
-		     UNSPEC_LSX_VFRSQRT))]
+    (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+		 UNSPEC_LSX_VFRSQRT))]
   "ISA_HAS_LSX"
   "vfrsqrt.<flsxfmt>\t%w0,%w1"
   [(set_attr "type" "simd_fdiv")
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-rsqrt.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-rsqrt.c
new file mode 100644
index 00000000000..24316944d4e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-rsqrt.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlasx -ffast-math" } */
+/* { dg-final { scan-assembler "xvfrsqrt.s" } } */
+/* { dg-final { scan-assembler "xvfrsqrt.d" } } */
+
+extern float sqrtf (float);
+
+float a[8], b[8];
+
+void
+foo1(void)
+{
+  for (int i = 0; i < 8; i++)
+    a[i] = 1 / sqrtf (b[i]);
+}
+
+extern double sqrt (double);
+
+double da[4], db[4];
+
+void
+foo2(void)
+{
+  for (int i = 0; i < 4; i++)
+    da[i] = 1 / sqrt (db[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-rsqrt.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-rsqrt.c
new file mode 100644
index 00000000000..519cc47644c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-rsqrt.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlsx -ffast-math" } */
+/* { dg-final { scan-assembler "vfrsqrt.s" } } */
+/* { dg-final { scan-assembler "vfrsqrt.d" } } */
+
+extern float sqrtf (float);
+
+float a[4], b[4];
+
+void
+foo1(void)
+{
+  for (int i = 0; i < 4; i++)
+    a[i] = 1 / sqrtf (b[i]);
+}
+
+extern double sqrt (double);
+
+double da[2], db[2];
+
+void
+foo2(void)
+{
+  for (int i = 0; i < 2; i++)
+    da[i] = 1 / sqrt (db[i]);
+}

From patchwork Wed Dec  6 07:04:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiahao Xu <xujiahao@loongson.cn>
X-Patchwork-Id: 174336
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3929352vqy;
        Tue, 5 Dec 2023 23:05:51 -0800 (PST)
X-Google-Smtp-Source: 
 AGHT+IHoiiSEMkVoOgfUqfoybyAgBUNhNVHV+WF/j19dlSZpjDGnxxOMiPC6M8p5OHEnqh3qz/+6
X-Received: by 2002:a05:620a:2988:b0:77e:fba3:a1f8 with SMTP id
 r8-20020a05620a298800b0077efba3a1f8mr590366qkp.82.1701846350916;
        Tue, 05 Dec 2023 23:05:50 -0800 (PST)
ARC-Seal: i=2; a=rsa-sha256; t=1701846350; cv=pass;
        d=google.com; s=arc-20160816;
        b=vqxU5co9l/APJIYKG9lbiAd/C04gABiDWOz+VNUjtxfFoIhUZmZO43KBBE5ZNNOvq7
         xWnithZI3xFWlJaHEwwDQ0TGplZRTQ7yuhr6E0lB5AzJdKMtljAYuh8E4N40GGlBxxYo
         bn1nbhlNb2tDvLNmpEornHP1Oe2eHR1gkdlrTeoZoPMVDMdoWTSjy2B5vPZNCqDkFYG7
         xvcrET8vP25w8M1+W/Db1O9fdTBaAkYoAYlxi7O6dU7KmqYIM8MfP0rm4G/xDCGADoGE
         w7wMbRo0yYxSpPdf6iNNp9geZwpK128RB8kUKvNO69EjzZ5rjHq0QWB37IkfOHbgDUWc
         86vQ==
ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=errors-to:list-subscribe:list-help:list-post:list-archive
         :list-unsubscribe:list-id:precedence:content-transfer-encoding
         :mime-version:references:in-reply-to:message-id:date:subject:cc:to
         :from:arc-filter:dmarc-filter:delivered-to;
        bh=Q5/EiY7+gMeSmkLU1S8/XABKWNllq5To4Lsl8vA6C48=;
        fh=w+xsGLuzpTj56E0bzPMCc39RWHkBXt2f2AGs+4pGimo=;
        b=GcFwjyIj1et/Vz/9qBiN1kdwefmWAKwxqWieYKRA2IZFr90RBqd9pRnlo3M9H8+NI0
         7IjDpSM96kGFnyxZ37jUygnDQwe4npYWNGALRl7N0aRvCYiGah6Qak3X+XR2NINUVtr+
         NRcSL2M62mqYfF9h4fkUreRsKmDeCOxBAO2oxwaPGMFI6e2O6LTraUilrZhDI892djSs
         g/7GbNuIX8MSc1Yhr8FejUg2kZNee5CthS/2JxrWYTqXvbd2JxbLbhUMlP4ih38N5CXz
         ChVcIfhciL/1rlsmCj3l3PYkz9LB4Z96cGc+N5g5pn1p+b6Z3+n8AiajdexfhurflyaY
         cz4g==
ARC-Authentication-Results: i=2; mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (server2.sourceware.org.
 [2620:52:3:1:0:246e:9693:128c])
        by mx.google.com with ESMTPS id
 pw4-20020a05620a63c400b0077d714e1421si14055647qkn.486.2023.12.05.23.05.50
        for <ouuuleilei@gmail.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 05 Dec 2023 23:05:50 -0800 (PST)
Received-SPF: pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 client-ip=2620:52:3:1:0:246e:9693:128c;
Authentication-Results: mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 955B438708A7
	for <ouuuleilei@gmail.com>; Wed,  6 Dec 2023 07:05:47 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10])
 by sourceware.org (Postfix) with ESMTPS id 8324D3845BDE
 for <gcc-patches@gcc.gnu.org>; Wed,  6 Dec 2023 07:05:18 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 8324D3845BDE
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=loongson.cn
Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=loongson.cn
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 8324D3845BDE
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=2001:470:142:3::10
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701846324; cv=none;
 b=CDBqtL4S3cF6HuyyY/p8R1DQlWtDcomo7K2Ds0YmTO2eTjP+0NfuXY889WPrj45DgMKDxYZTQ9QcTkas2YOhUSf4asOEwbEvR77MxZSogr0IN3x0+VgtCcYYj7NN9p+ilq+irz8N4LK5EADpvE3XeykVuIT8bMntUITCafULmAY=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1701846324; c=relaxed/simple;
 bh=EntkarUqWmzN5aay2Ju73qLYajU1MEfGIwSUv3kUBCA=;
 h=From:To:Subject:Date:Message-Id:MIME-Version;
 b=dGQay5qJJkXn0F0+HVmmhP05471+/5m8doxpqX4zkQkq/qzZU4yyeRrbs+v1z9+qmxlvhYlrpPwaKwxXbbGmtaZ4/ymTMRI+FGtiCx6ojrnImgPo+A2hmJtA7jbn7oATVXR8WjN+4kcXC9UihCsNa0abAwOxHNdhOY/J7048w1I=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from mail.loongson.cn ([114.242.206.163])
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <xujiahao@loongson.cn>) id 1rAly8-0000og-1L
 for gcc-patches@gcc.gnu.org; Wed, 06 Dec 2023 02:05:17 -0500
Received: from loongson.cn (unknown [10.10.130.252])
 by gateway (Coremail) with SMTP id _____8Bxd+giHXBlSTs_AA--.24548S3;
 Wed, 06 Dec 2023 15:05:06 +0800 (CST)
Received: from slurm-master.loongson.cn (unknown [10.10.130.252])
 by localhost.localdomain (Coremail) with SMTP id
 AQAAf8Dxvi8XHXBlp0BWAA--.59594S7;
 Wed, 06 Dec 2023 15:05:04 +0800 (CST)
From: Jiahao Xu <xujiahao@loongson.cn>
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn,
 xuchenghua@loongson.cn, Jiahao Xu <xujiahao@loongson.cn>
Subject: [PATCH v3 3/5] LoongArch: Redefine pattern for xvfrecip/vfrecip
 instructions.
Date: Wed,  6 Dec 2023 15:04:51 +0800
Message-Id: <20231206070453.3252-4-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20231206070453.3252-1-xujiahao@loongson.cn>
References: <20231206070453.3252-1-xujiahao@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8Dxvi8XHXBlp0BWAA--.59594S7
X-CM-SenderInfo: 50xmxthkdrqz5rrqw2lrqou0/
X-Coremail-Antispam: 1Uk129KBj93XoW3WF43Gr15Jr1UtFWUur1rGrX_yoW7AryDpr
 ZrC3ZFyrWrJFsIgw1ktay5Xr15Kr9rKF429FW3Z39xAa1jqw1vyF1FkFZIqF17Xw4rKr1I
 va1Fga1YvFWDC3gCm3ZEXasCq-sJn29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7KY7ZEXa
 sCq-sGcSsGvfJ3Ic02F40EFcxC0VAKzVAqx4xG6I80ebIjqfuFe4nvWSU5nxnvy29KBjDU
 0xBIdaVrnRJUUUkFb4IE77IF4wAFF20E14v26r1j6r4UM7CY07I20VC2zVCF04k26cxKx2
 IYs7xG6rWj6s0DM7CIcVAFz4kK6r1Y6r17M28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48v
 e4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI
 0_Gr0_Cr1l84ACjcxK6I8E87Iv67AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVCY1x0267AK
 xVW8Jr0_Cr1UM2AIxVAIcxkEcVAq07x20xvEncxIr21l57IF6xkI12xvs2x26I8E6xACxx
 1l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20xvE14v26r1q6rW5McIj6I8E87Iv
 67AKxVW8JVWxJwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41l42xK82IYc2
 Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s02
 6x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r126r1DMIIYrxkI7VAKI48JMIIF0x
 vE2Ix0cI8IcVAFwI0_JFI_Gr1lIxAIcVC0I7IYx2IY6xkF7I0E14v26r1j6r4UMIIF0xvE
 42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6x
 kF7I0E14v26r1j6r4UYxBIdaVFxhVjvjDU0xZFpf9x07josjUUUUUU=
Received-SPF: pass client-ip=114.242.206.163;
 envelope-from=xujiahao@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Status: No, score=-13.6 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, SPF_FAIL, SPF_HELO_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1784515239389801944
X-GMAIL-MSGID: 1784515239389801944

Redefine pattern for [x]vfrecip instructions use rtx code instead of unspec, and enable
[x]vfrecip instructions to be generated during auto-vectorization.

gcc/ChangeLog:

	* config/loongarch/lasx.md (lasx_xvfrecip_<flasxfmt>): Renamed to ..
	(recip<mode>3): .. this.
	* config/loongarch/loongarch-builtins.cc (CODE_FOR_lsx_vfrecip_d): Redefine
	to new pattern name.
	(CODE_FOR_lsx_vfrecip_s): Ditto.
	(CODE_FOR_lasx_xvfrecip_d): Ditto.
	(CODE_FOR_lasx_xvfrecip_s): Ditto.
	(loongarch_expand_builtin_direct): For the vector recip instructions, construct a
	temporary parameter const1_vector.
	* config/loongarch/lsx.md (lsx_vfrecip_<flsxfmt>): Renamed to ..
	(recip<mode>3): .. this.
	* config/loongarch/predicates.md (const_vector_1_operand): New predicate.

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index c8edc1bfd76..e4310c4523d 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -1626,12 +1626,12 @@ (define_insn "lasx_xvfmina_<flasxfmt>"
   [(set_attr "type" "simd_fminmax")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "lasx_xvfrecip_<flasxfmt>"
+(define_insn "recip<mode>3"
   [(set (match_operand:FLASX 0 "register_operand" "=f")
-	(unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
-		      UNSPEC_LASX_XVFRECIP))]
+       (div:FLASX (match_operand:FLASX 1 "const_vector_1_operand" "")
+		  (match_operand:FLASX 2 "register_operand" "f")))]
   "ISA_HAS_LASX"
-  "xvfrecip.<flasxfmt>\t%u0,%u1"
+  "xvfrecip.<flasxfmt>\t%u0,%u2"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
diff --git a/gcc/config/loongarch/loongarch-builtins.cc b/gcc/config/loongarch/loongarch-builtins.cc
index ba8686d4ceb..c77394176db 100644
--- a/gcc/config/loongarch/loongarch-builtins.cc
+++ b/gcc/config/loongarch/loongarch-builtins.cc
@@ -502,6 +502,8 @@ AVAIL_ALL (lasx_frecipe, ISA_HAS_LASX && TARGET_FRECIPE)
 #define CODE_FOR_lsx_vssrlrn_wu_d CODE_FOR_lsx_vssrlrn_u_wu_d
 #define CODE_FOR_lsx_vfrsqrt_d CODE_FOR_rsqrtv2df2
 #define CODE_FOR_lsx_vfrsqrt_s CODE_FOR_rsqrtv4sf2
+#define CODE_FOR_lsx_vfrecip_d CODE_FOR_recipv2df3
+#define CODE_FOR_lsx_vfrecip_s CODE_FOR_recipv4sf3
 
 /* LoongArch ASX define CODE_FOR_lasx_mxxx */
 #define CODE_FOR_lasx_xvsadd_b CODE_FOR_ssaddv32qi3
@@ -780,6 +782,8 @@ AVAIL_ALL (lasx_frecipe, ISA_HAS_LASX && TARGET_FRECIPE)
 #define CODE_FOR_lasx_xvsat_du CODE_FOR_lasx_xvsat_u_du
 #define CODE_FOR_lasx_xvfrsqrt_d CODE_FOR_rsqrtv4df2
 #define CODE_FOR_lasx_xvfrsqrt_s CODE_FOR_rsqrtv8sf2
+#define CODE_FOR_lasx_xvfrecip_d CODE_FOR_recipv4df3
+#define CODE_FOR_lasx_xvfrecip_s CODE_FOR_recipv8sf3
 
 static const struct loongarch_builtin_description loongarch_builtins[] = {
 #define LARCH_MOVFCSR2GR 0
@@ -3024,6 +3028,22 @@ loongarch_expand_builtin_direct (enum insn_code icode, rtx target, tree exp,
   if (has_target_p)
     create_output_operand (&ops[opno++], target, TYPE_MODE (TREE_TYPE (exp)));
 
+  /* For the vector reciprocal instructions, we need to construct a temporary
+     parameter const1_vector.  */
+  switch (icode)
+    {
+    case CODE_FOR_recipv8sf3:
+    case CODE_FOR_recipv4df3:
+    case CODE_FOR_recipv4sf3:
+    case CODE_FOR_recipv2df3:
+      loongarch_prepare_builtin_arg (&ops[2], exp, 0);
+      create_input_operand (&ops[1], CONST1_RTX (ops[0].mode), ops[0].mode);
+      return loongarch_expand_builtin_insn (icode, 3, ops, has_target_p);
+
+    default:
+      break;
+    }
+
   /* Map the arguments to the other operands.  */
   gcc_assert (opno + call_expr_nargs (exp)
 	      == insn_data[icode].n_generator_args);
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index aeae1b1a622..06402e3b353 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -1539,12 +1539,12 @@ (define_insn "lsx_vfmina_<flsxfmt>"
   [(set_attr "type" "simd_fminmax")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "lsx_vfrecip_<flsxfmt>"
+(define_insn "recip<mode>3"
   [(set (match_operand:FLSX 0 "register_operand" "=f")
-	(unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
-		     UNSPEC_LSX_VFRECIP))]
+       (div:FLSX (match_operand:FLSX 1 "const_vector_1_operand" "")
+		 (match_operand:FLSX 2 "register_operand" "f")))]
   "ISA_HAS_LSX"
-  "vfrecip.<flsxfmt>\t%w0,%w1"
+  "vfrecip.<flsxfmt>\t%w0,%w2"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md
index d02e846cb12..f7796da10b2 100644
--- a/gcc/config/loongarch/predicates.md
+++ b/gcc/config/loongarch/predicates.md
@@ -227,6 +227,10 @@ (define_predicate "const_1_operand"
   (and (match_code "const_int,const_wide_int,const_double,const_vector")
        (match_test "op == CONST1_RTX (GET_MODE (op))")))
 
+(define_predicate "const_vector_1_operand"
+  (and (match_code "const_vector")
+       (match_test "op == CONST1_RTX (GET_MODE (op))")))
+
 (define_predicate "reg_or_1_operand"
   (ior (match_operand 0 "const_1_operand")
        (match_operand 0 "register_operand")))

From patchwork Wed Dec  6 07:04:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiahao Xu <xujiahao@loongson.cn>
X-Patchwork-Id: 174339
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3929701vqy;
        Tue, 5 Dec 2023 23:06:31 -0800 (PST)
X-Google-Smtp-Source: 
 AGHT+IFZiEUPqCQzDc3ziL6KXqHpfaxvAMfjxeP9fm9SSsRA4ScZlbB+1q0CMoARpk/nEgksna06
X-Received: by 2002:a05:6808:2225:b0:3b8:412a:e75b with SMTP id
 bd37-20020a056808222500b003b8412ae75bmr617574oib.10.1701846391430;
        Tue, 05 Dec 2023 23:06:31 -0800 (PST)
ARC-Seal: i=2; a=rsa-sha256; t=1701846391; cv=pass;
        d=google.com; s=arc-20160816;
        b=sVqrc73STDb6ZSlqW8uTryJ/6hoAe1RjQxjTr2xwqB8RUA5WnA/srquzzpOHT6yWfz
         n7MGjZFXlH32s7ogA5m2hoTUZp1OkcUgeP0MVC0Xur3MHAKV7gjmDPp4DooQM1o85nwD
         QGt+9U8fBK1XAHzIunsac4s10Hqnpkko1fjOGhdPkhGh2r/7nfODK1mk7Sai56rHmX8M
         dp8fbowVvo5xCcBz4DYvt4M3kB5SRuxDoB8TVHc1qNqqpz0bX3YyVip6r7MSPgD3Kj75
         e06GtJ10MWNs2fB0Defk2gC6owllxCAXFC+ghTwikO1qqiSCO0nkfoK2NSeLQn6Gifqh
         1J6Q==
ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=errors-to:list-subscribe:list-help:list-post:list-archive
         :list-unsubscribe:list-id:precedence:content-transfer-encoding
         :mime-version:references:in-reply-to:message-id:date:subject:cc:to
         :from:arc-filter:dmarc-filter:delivered-to;
        bh=L3h3IINSXtVD90hK0UsMDP7Nxg2IIvGyafe6x6cuwaw=;
        fh=w+xsGLuzpTj56E0bzPMCc39RWHkBXt2f2AGs+4pGimo=;
        b=g9nEzn8GPpyHbz+3VwfZ84Or+7JW6/YuIyHC5iTuh3ZWO/TzKeddQeh8dxi2U9x6MW
         9RcLVw1AxIUTQ1yoKxlPSo5+HTTlzaTVsQGPUMbU+YEyE+wFowMMrqzNTqcpusnHtz3s
         4VtT7r/o2DctSgROivvTH8ChRHbERmD/+EXUVhDXbgZIw+vrzNXB3E6klb/4b2NsqJPn
         IXcefbZ2VrQUV6/ZY5IK6/VHn72dFNHxuARZVN8MhVCN7MSJFITpooiraSnGlsHkhoTw
         FXUDKE6WPhfn2uPV2TQ7NQLhs2WwnqkfSurmYBpZqN/SS8fR3D3aYkK5PBzDr9Urj3jy
         MvGg==
ARC-Authentication-Results: i=2; mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as
 permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (server2.sourceware.org. [8.43.85.97])
        by mx.google.com with ESMTPS id
 cm9-20020a05622a250900b00421c3b9dbe9si14274992qtb.208.2023.12.05.23.06.31
        for <ouuuleilei@gmail.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 05 Dec 2023 23:06:31 -0800 (PST)
Received-SPF: pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as
 permitted sender) client-ip=8.43.85.97;
Authentication-Results: mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as
 permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 311A9384576A
	for <ouuuleilei@gmail.com>; Wed,  6 Dec 2023 07:06:31 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10])
 by sourceware.org (Postfix) with ESMTPS id 15DDD386D62E
 for <gcc-patches@gcc.gnu.org>; Wed,  6 Dec 2023 07:05:20 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 15DDD386D62E
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=loongson.cn
Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=loongson.cn
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 15DDD386D62E
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=2001:470:142:3::10
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701846324; cv=none;
 b=HBfA1hRMdHSgCyZwweqfk1XtoLmlIpu87oMY7fBFORu53mLQMvyfPk360TqA8NCcuQzeIXUXfON+CSJWZPcLy8kd6K22c+qR2hJNP2yfy6HxQ3ZTiwP6uEYyLm+GFpOQqe4JnVgwv+n0fydzjK73BmnyGVf+Z56Gaz+6QxL1P7Y=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1701846324; c=relaxed/simple;
 bh=hv/ia8d5dXQGYle6Wo/OdxjbVwZ2FwFheRNeSA5lT9I=;
 h=From:To:Subject:Date:Message-Id:MIME-Version;
 b=ib8cePTGaR5BM9vcwYSKKAPnpLY649FSurRMdSINnzkyqfRGBLpKLwuKGygnZ1KbR8xJGvRjRxksMFMEJlQWmXEGArHLkN/GvB0f+EvdKkAg3OYzi2Bb5qh5jYXysccMBdY/vf8oW799ev6jFE9rAZ6KoE84rFGil9E2uhucZaA=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from mail.loongson.cn ([114.242.206.163])
 by eggs.gnu.org with esmtp (Exim 4.90_1)
 (envelope-from <xujiahao@loongson.cn>) id 1rAly8-0000px-1R
 for gcc-patches@gcc.gnu.org; Wed, 06 Dec 2023 02:05:19 -0500
Received: from loongson.cn (unknown [10.10.130.252])
 by gateway (Coremail) with SMTP id _____8BxHOslHXBlTDs_AA--.56420S3;
 Wed, 06 Dec 2023 15:05:10 +0800 (CST)
Received: from slurm-master.loongson.cn (unknown [10.10.130.252])
 by localhost.localdomain (Coremail) with SMTP id
 AQAAf8Dxvi8XHXBlp0BWAA--.59594S8;
 Wed, 06 Dec 2023 15:05:07 +0800 (CST)
From: Jiahao Xu <xujiahao@loongson.cn>
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn,
 xuchenghua@loongson.cn, Jiahao Xu <xujiahao@loongson.cn>
Subject: [PATCH v3 4/5] LoongArch: New options -mrecip and -mrecip= with
 ffast-math.
Date: Wed,  6 Dec 2023 15:04:52 +0800
Message-Id: <20231206070453.3252-5-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20231206070453.3252-1-xujiahao@loongson.cn>
References: <20231206070453.3252-1-xujiahao@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8Dxvi8XHXBlp0BWAA--.59594S8
X-CM-SenderInfo: 50xmxthkdrqz5rrqw2lrqou0/
X-Coremail-Antispam: 1Uk129KBj9fXoWfCr47Xr48tF15KrW5Wr45Jwc_yoW5uF4rGo
 WrAF4DGw18GrySkw4DKrsxZry8Xw1jyr4xAa9I9wn5CFs7Xr15t3sFka1Yv343CrnxXry5
 C3s7uFZ8Z347Za1kl-sFpf9Il3svdjkaLaAFLSUrUUUUjb8apTn2vfkv8UJUUUU8wcxFpf
 9Il3svdxBIdaVrn0xqx4xG64xvF2IEw4CE5I8CrVC2j2Jv73VFW2AGmfu7bjvjm3AaLaJ3
 UjIYCTnIWjp_UUUY87kC6x804xWl14x267AKxVWUJVW8JwAFc2x0x2IEx4CE42xK8VAvwI
 8IcIk0rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2ocxC64kIII0Yj41l84x0c7CEw4AK67xG
 Y2AK021l84ACjcxK6xIIjxv20xvE14v26r4j6ryUM28EF7xvwVC0I7IYx2IY6xkF7I0E14
 v26r4j6F4UM28EF7xvwVC2z280aVAFwI0_Gr1j6F4UJwA2z4x0Y4vEx4A2jsIEc7CjxVAF
 wI0_Gr1j6F4UJwAS0I0E0xvYzxvE52x082IY62kv0487Mc804VCY07AIYIkI8VC2zVCFFI
 0UMc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUtVWrXwAv7VC2z280
 aVAFwI0_Gr0_Cr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JMxAIw28Icx
 kI7VAKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2Iq
 xVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUAVWUtwCIc40Y0x0EwIxGrwCI42
 IY6xIIjxv20xvE14v26r4j6ryUMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwCI42IY
 6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aV
 CY1x0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7IU84xRDUUUUU==
Received-SPF: pass client-ip=114.242.206.163;
 envelope-from=xujiahao@loongson.cn; helo=mail.loongson.cn
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001,
 SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-Spam-Status: No, score=-13.3 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, KAM_SHORT, SPF_FAIL, SPF_HELO_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1784515281835403886
X-GMAIL-MSGID: 1784515281835403886

When both the -mrecip and -mfrecipe options are enabled, use approximate reciprocal
instructions and approximate reciprocal square root instructions with additional
Newton-Raphson steps to implement single precision floating-point division, square
root and reciprocal square root operations, for a better performance.

gcc/ChangeLog:

        * config/loongarch/genopts/loongarch.opt.in (recip_mask): New variable.
        (-mrecip, -mrecip): New options.
        * config/loongarch/lasx.md (div<mode>3): New expander.
        (*div<mode>3): Rename.
        (sqrt<mode>2): New expander.
        (*sqrt<mode>2): Rename.
        (rsqrt<mode>2): New expander.
        * config/loongarch/loongarch-protos.h (loongarch_emit_swrsqrtsf): New prototype.
        (loongarch_emit_swdivsf): Ditto.
        * config/loongarch/loongarch.cc (loongarch_option_override_internal): Set
        recip_mask for -mrecip and -mrecip= options.
        (loongarch_emit_swrsqrtsf): New function.
        (loongarch_emit_swdivsf): Ditto.
        * config/loongarch/loongarch.h (RECIP_MASK_NONE, RECIP_MASK_DIV, RECIP_MASK_SQRT
        RECIP_MASK_RSQRT, RECIP_MASK_VEC_DIV, RECIP_MASK_VEC_SQRT, RECIP_MASK_VEC_RSQRT
        RECIP_MASK_ALL): New bitmasks.
        (TARGET_RECIP_DIV, TARGET_RECIP_SQRT, TARGET_RECIP_RSQRT, TARGET_RECIP_VEC_DIV
        TARGET_RECIP_VEC_SQRT, TARGET_RECIP_VEC_RSQRT): New tests.
        * config/loongarch/loongarch.md (sqrt<mode>2): New expander.
        (*sqrt<mode>2): Rename.
        (rsqrt<mode>2): New expander.
        * config/loongarch/loongarch.opt (recip_mask): New variable.
        (-mrecip, -mrecip): New options.
        * config/loongarch/lsx.md (div<mode>3): New expander.
        (*div<mode>3): Rename.
        (sqrt<mode>2): New expander.
        (*sqrt<mode>2): Rename.
        (rsqrt<mode>2): New expander.
        * config/loongarch/predicates.md (reg_or_vecotr_1_operand): New predicate.
        * doc/invoke.texi (LoongArch Options): Document new options.

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/divf.c: New test.
	* gcc.target/loongarch/recip-divf.c: New test.
	* gcc.target/loongarch/recip-sqrtf.c: New test.
	* gcc.target/loongarch/sqrtf.c: New test.
	* gcc.target/loongarch/vector/lasx/lasx-divf.c: New test.
	* gcc.target/loongarch/vector/lasx/lasx-recip-divf.c: New test.
	* gcc.target/loongarch/vector/lasx/lasx-recip-sqrtf.c: New test.
	* gcc.target/loongarch/vector/lasx/lasx-recip.c: New test.
	* gcc.target/loongarch/vector/lasx/lasx-sqrtf.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-divf.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-recip-divf.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-recip-sqrtf.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-recip.c: New test.
	* gcc.target/loongarch/vector/lsx/lsx-sqrtf.c: New test.

diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in
index 483b185b059..c3848d02fd3 100644
--- a/gcc/config/loongarch/genopts/loongarch.opt.in
+++ b/gcc/config/loongarch/genopts/loongarch.opt.in
@@ -23,6 +23,9 @@ config/loongarch/loongarch-opts.h
 HeaderInclude
 config/loongarch/loongarch-str.h
 
+TargetVariable
+unsigned int recip_mask = 0
+
 ; ISA related options
 ;; Base ISA
 Enum
@@ -194,6 +197,14 @@ mexplicit-relocs
 Target Var(la_opt_explicit_relocs_backward) Init(M_OPT_UNSET)
 Use %reloc() assembly operators (for backward compatibility).
 
+mrecip
+Target RejectNegative Var(loongarch_recip)
+Generate approximate reciprocal divide and square root for better throughput.
+
+mrecip=
+Target RejectNegative Joined Var(loongarch_recip_name)
+Control generation of reciprocal estimates.
+
 ; The code model option names for -mcmodel.
 Enum
 Name(cmodel) Type(int)
diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index e4310c4523d..f6f2feedbb3 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -1194,7 +1194,25 @@ (define_insn "mul<mode>3"
   [(set_attr "type" "simd_fmul")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "div<mode>3"
+(define_expand "div<mode>3"
+  [(set (match_operand:FLASX 0 "register_operand")
+    (div:FLASX (match_operand:FLASX 1 "reg_or_vecotr_1_operand")
+	       (match_operand:FLASX 2 "register_operand")))]
+  "ISA_HAS_LASX"
+{
+  if (<MODE>mode == V8SFmode
+    && TARGET_RECIP_VEC_DIV
+    && optimize_insn_for_speed_p ()
+    && flag_finite_math_only && !flag_trapping_math
+    && flag_unsafe_math_optimizations)
+  {
+    loongarch_emit_swdivsf (operands[0], operands[1],
+	operands[2], V8SFmode);
+    DONE;
+  }
+})
+
+(define_insn "*div<mode>3"
   [(set (match_operand:FLASX 0 "register_operand" "=f")
 	(div:FLASX (match_operand:FLASX 1 "register_operand" "f")
 		   (match_operand:FLASX 2 "register_operand" "f")))]
@@ -1223,7 +1241,23 @@ (define_insn "fnma<mode>4"
   [(set_attr "type" "simd_fmadd")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "sqrt<mode>2"
+(define_expand "sqrt<mode>2"
+  [(set (match_operand:FLASX 0 "register_operand")
+    (sqrt:FLASX (match_operand:FLASX 1 "register_operand")))]
+  "ISA_HAS_LASX"
+{
+  if (<MODE>mode == V8SFmode
+      && TARGET_RECIP_VEC_SQRT
+      && flag_unsafe_math_optimizations
+      && optimize_insn_for_speed_p ()
+      && flag_finite_math_only && !flag_trapping_math)
+    {
+      loongarch_emit_swrsqrtsf (operands[0], operands[1], V8SFmode, 0);
+      DONE;
+    }
+})
+
+(define_insn "*sqrt<mode>2"
   [(set (match_operand:FLASX 0 "register_operand" "=f")
 	(sqrt:FLASX (match_operand:FLASX 1 "register_operand" "f")))]
   "ISA_HAS_LASX"
@@ -1646,7 +1680,20 @@ (define_insn "lasx_xvfrecipe_<flasxfmt>"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "rsqrt<mode>2"
+(define_expand "rsqrt<mode>2"
+  [(set (match_operand:FLASX 0 "register_operand" "=f")
+    (unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
+	     UNSPEC_LASX_XVFRSQRT))]
+  "ISA_HAS_LASX"
+ {
+   if (<MODE>mode == V8SFmode && TARGET_RECIP_VEC_RSQRT)
+     {
+       loongarch_emit_swrsqrtsf (operands[0], operands[1], V8SFmode, 1);
+       DONE;
+     }
+})
+
+(define_insn "*rsqrt<mode>2"
   [(set (match_operand:FLASX 0 "register_operand" "=f")
     (unspec:FLASX [(match_operand:FLASX 1 "register_operand" "f")]
 		  UNSPEC_LASX_XVFRSQRT))]
diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h
index cb8fc36b086..f2ff93b5e10 100644
--- a/gcc/config/loongarch/loongarch-protos.h
+++ b/gcc/config/loongarch/loongarch-protos.h
@@ -220,5 +220,7 @@ extern rtx loongarch_gen_const_int_vector_shuffle (machine_mode, int);
 extern tree loongarch_build_builtin_va_list (void);
 
 extern rtx loongarch_build_signbit_mask (machine_mode, bool, bool);
+extern void loongarch_emit_swrsqrtsf (rtx, rtx, machine_mode, bool);
+extern void loongarch_emit_swdivsf (rtx, rtx, rtx, machine_mode);
 extern bool loongarch_explicit_relocs_p (enum loongarch_symbol_type);
 #endif /* ! GCC_LOONGARCH_PROTOS_H */
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 96a4b846f2d..2c06edcff92 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -7547,6 +7547,71 @@ loongarch_option_override_internal (struct gcc_options *opts,
 
   /* Function to allocate machine-dependent function status.  */
   init_machine_status = &loongarch_init_machine_status;
+
+  /* -mrecip options.  */
+  static struct
+    {
+      const char *string;	    /* option name.  */
+      unsigned int mask;	    /* mask bits to set.  */
+    }
+  const recip_options[] = {
+	{ "all",       RECIP_MASK_ALL },
+	{ "none",      RECIP_MASK_NONE },
+	{ "div",       RECIP_MASK_DIV },
+	{ "sqrt",      RECIP_MASK_SQRT },
+	{ "rsqrt",     RECIP_MASK_RSQRT },
+	{ "vec-div",   RECIP_MASK_VEC_DIV },
+	{ "vec-sqrt",  RECIP_MASK_VEC_SQRT },
+	{ "vec-rsqrt", RECIP_MASK_VEC_RSQRT },
+  };
+
+  if (loongarch_recip_name)
+    {
+      char *p = ASTRDUP (loongarch_recip_name);
+      char *q;
+      unsigned int mask, i;
+      bool invert;
+
+      while ((q = strtok (p, ",")) != NULL)
+	{
+	  p = NULL;
+	  if (*q == '!')
+	    {
+	      invert = true;
+	      q++;
+	    }
+	  else
+	    invert = false;
+
+	  if (!strcmp (q, "default"))
+	    mask = RECIP_MASK_ALL;
+	  else
+	    {
+	      for (i = 0; i < ARRAY_SIZE (recip_options); i++)
+		if (!strcmp (q, recip_options[i].string))
+		  {
+		    mask = recip_options[i].mask;
+		    break;
+		  }
+
+	      if (i == ARRAY_SIZE (recip_options))
+		{
+		  error ("unknown option for %<-mrecip=%s%>", q);
+		  invert = false;
+		  mask = RECIP_MASK_NONE;
+		}
+	    }
+
+	  if (invert)
+	    recip_mask &= ~mask;
+	  else
+	    recip_mask |= mask;
+	}
+    }
+  if (loongarch_recip)
+    recip_mask |= RECIP_MASK_ALL;
+  if (!TARGET_FRECIPE)
+    recip_mask = RECIP_MASK_NONE;
 }
 
 
@@ -11470,6 +11535,126 @@ loongarch_build_signbit_mask (machine_mode mode, bool vect, bool invert)
   return force_reg (vec_mode, v);
 }
 
+/* Use rsqrte instruction and Newton-Rhapson to compute the approximation of
+   a single precision floating point [reciprocal] square root.  */
+
+void loongarch_emit_swrsqrtsf (rtx res, rtx a, machine_mode mode, bool recip)
+{
+  rtx x0, e0, e1, e2, mhalf, monehalf;
+  REAL_VALUE_TYPE r;
+  int unspec;
+
+  x0 = gen_reg_rtx (mode);
+  e0 = gen_reg_rtx (mode);
+  e1 = gen_reg_rtx (mode);
+  e2 = gen_reg_rtx (mode);
+
+  real_arithmetic (&r, ABS_EXPR, &dconsthalf, NULL);
+  mhalf = const_double_from_real_value (r, SFmode);
+
+  real_arithmetic (&r, PLUS_EXPR, &dconsthalf, &dconst1);
+  monehalf = const_double_from_real_value (r, SFmode);
+  unspec = UNSPEC_RSQRTE;
+
+  if (VECTOR_MODE_P (mode))
+    {
+      mhalf = loongarch_build_const_vector (mode, true, mhalf);
+      monehalf = loongarch_build_const_vector (mode, true, monehalf);
+      unspec = GET_MODE_SIZE (mode) == 32 ? UNSPEC_LASX_XVFRSQRTE
+					  : UNSPEC_LSX_VFRSQRTE;
+    }
+
+  /* rsqrt(a) =  rsqrte(a) * (1.5 - 0.5 * a * rsqrte(a) * rsqrte(a))
+     sqrt(a)  =  a * rsqrte(a) * (1.5 - 0.5 * a * rsqrte(a) * rsqrte(a))  */
+
+  a = force_reg (mode, a);
+
+  /* x0 = rsqrt(a) estimate.  */
+  emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, a),
+					      unspec)));
+
+  /* If (a == 0.0) Filter out infinity to prevent NaN for sqrt(0.0).  */
+  if (!recip)
+    {
+      rtx zero = force_reg (mode, CONST0_RTX (mode));
+
+      if (VECTOR_MODE_P (mode))
+	{
+	  machine_mode imode = related_int_vector_mode (mode).require ();
+	  rtx mask = gen_reg_rtx (imode);
+	  emit_insn (gen_rtx_SET (mask, gen_rtx_NE (imode, a, zero)));
+	  emit_insn (gen_rtx_SET (x0, gen_rtx_AND (mode, x0,
+						   gen_lowpart (mode, mask))));
+	}
+      else
+	{
+	  rtx target = emit_conditional_move (x0, { GT, a, zero, mode },
+					      x0, zero, mode, 0);
+	  if (target != x0)
+	    emit_move_insn (x0, target);
+	}
+    }
+
+  /* e0 = x0 * a  */
+  emit_insn (gen_rtx_SET (e0, gen_rtx_MULT (mode, x0, a)));
+  /* e1 = e0 * x0  */
+  emit_insn (gen_rtx_SET (e1, gen_rtx_MULT (mode, e0, x0)));
+
+  /* e2 = 1.5 - e1 * 0.5  */
+  mhalf = force_reg (mode, mhalf);
+  monehalf = force_reg (mode, monehalf);
+  emit_insn (gen_rtx_SET (e2, gen_rtx_FMA (mode,
+					   gen_rtx_NEG (mode, e1),
+							mhalf, monehalf)));
+
+  if (recip)
+    /* res = e2 * x0  */
+    emit_insn (gen_rtx_SET (res, gen_rtx_MULT (mode, x0, e2)));
+  else
+    /* res = e2 * e0  */
+    emit_insn (gen_rtx_SET (res, gen_rtx_MULT (mode, e2, e0)));
+}
+
+/* Use recipe instruction and Newton-Rhapson to compute the approximation of
+   a single precision floating point divide.  */
+
+void loongarch_emit_swdivsf (rtx res, rtx a, rtx b, machine_mode mode)
+{
+  rtx x0, e0, mtwo;
+  REAL_VALUE_TYPE r;
+  x0 = gen_reg_rtx (mode);
+  e0 = gen_reg_rtx (mode);
+  int unspec = UNSPEC_RECIPE;
+
+  real_arithmetic (&r, ABS_EXPR, &dconst2, NULL);
+  mtwo = const_double_from_real_value (r, SFmode);
+
+  if (VECTOR_MODE_P (mode))
+    {
+      mtwo = loongarch_build_const_vector (mode, true, mtwo);
+      unspec = GET_MODE_SIZE (mode) == 32 ? UNSPEC_LASX_XVFRECIPE
+					  : UNSPEC_LSX_VFRECIPE;
+    }
+
+  mtwo = force_reg (mode, mtwo);
+
+  /* a / b = a * recipe(b) * (2.0 - b * recipe(b))  */
+
+  /* x0 = 1./b estimate.  */
+  emit_insn (gen_rtx_SET (x0, gen_rtx_UNSPEC (mode, gen_rtvec (1, b),
+					      unspec)));
+  /* 2.0 - b * x0  */
+  emit_insn (gen_rtx_SET (e0, gen_rtx_FMA (mode,
+					   gen_rtx_NEG (mode, b), x0, mtwo)));
+
+  /* x0 = a * x0  */
+  if (a != CONST1_RTX (mode))
+    emit_insn (gen_rtx_SET (x0, gen_rtx_MULT (mode, a, x0)));
+
+  /* res = e0 * x0  */
+  emit_insn (gen_rtx_SET (res, gen_rtx_MULT (mode, e0, x0)));
+}
+
 static bool
 loongarch_builtin_support_vector_misalignment (machine_mode mode,
 					       const_tree type,
@@ -11665,6 +11850,9 @@ loongarch_asm_code_end (void)
 #define TARGET_VECTORIZE_AUTOVECTORIZE_VECTOR_MODES \
   loongarch_autovectorize_vector_modes
 
+#undef TARGET_OPTAB_SUPPORTED_P
+#define TARGET_OPTAB_SUPPORTED_P loongarch_optab_supported_p
+
 #undef TARGET_INIT_BUILTINS
 #define TARGET_INIT_BUILTINS loongarch_init_builtins
 #undef TARGET_BUILTIN_DECL
diff --git a/gcc/config/loongarch/loongarch.h b/gcc/config/loongarch/loongarch.h
index fa8a3f5582f..f1350b6048f 100644
--- a/gcc/config/loongarch/loongarch.h
+++ b/gcc/config/loongarch/loongarch.h
@@ -702,6 +702,24 @@ enum reg_class
    && (GET_MODE_CLASS (MODE) == MODE_VECTOR_INT		\
        || GET_MODE_CLASS (MODE) == MODE_VECTOR_FLOAT))
 
+#define RECIP_MASK_NONE         0x00
+#define RECIP_MASK_DIV          0x01
+#define RECIP_MASK_SQRT         0x02
+#define RECIP_MASK_RSQRT        0x04
+#define RECIP_MASK_VEC_DIV      0x08
+#define RECIP_MASK_VEC_SQRT     0x10
+#define RECIP_MASK_VEC_RSQRT    0x20
+#define RECIP_MASK_ALL (RECIP_MASK_DIV | RECIP_MASK_SQRT \
+			| RECIP_MASK_RSQRT | RECIP_MASK_VEC_SQRT \
+			| RECIP_MASK_VEC_DIV | RECIP_MASK_VEC_RSQRT)
+
+#define TARGET_RECIP_DIV        ((recip_mask & RECIP_MASK_DIV) != 0 || TARGET_uARCH_LA664)
+#define TARGET_RECIP_SQRT       ((recip_mask & RECIP_MASK_SQRT) != 0 || TARGET_uARCH_LA664)
+#define TARGET_RECIP_RSQRT      ((recip_mask & RECIP_MASK_RSQRT) != 0 || TARGET_uARCH_LA664)
+#define TARGET_RECIP_VEC_DIV    ((recip_mask & RECIP_MASK_VEC_DIV) != 0 || TARGET_uARCH_LA664)
+#define TARGET_RECIP_VEC_SQRT   ((recip_mask & RECIP_MASK_VEC_SQRT) != 0 || TARGET_uARCH_LA664)
+#define TARGET_RECIP_VEC_RSQRT  ((recip_mask & RECIP_MASK_VEC_RSQRT) != 0 || TARGET_uARCH_LA664)
+
 /* 1 if N is a possible register number for function argument passing.
    We have no FP argument registers when soft-float.  */
 
diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md
index fd154b02e48..1a10b809e3c 100644
--- a/gcc/config/loongarch/loongarch.md
+++ b/gcc/config/loongarch/loongarch.md
@@ -893,9 +893,21 @@ (define_peephole
 ;; Float division and modulus.
 (define_expand "div<mode>3"
   [(set (match_operand:ANYF 0 "register_operand")
-	(div:ANYF (match_operand:ANYF 1 "reg_or_1_operand")
-		  (match_operand:ANYF 2 "register_operand")))]
-  "")
+    (div:ANYF (match_operand:ANYF 1 "reg_or_1_operand")
+	      (match_operand:ANYF 2 "register_operand")))]
+  ""
+{
+  if (<MODE>mode == SFmode
+    && TARGET_RECIP_DIV
+    && optimize_insn_for_speed_p ()
+    && flag_finite_math_only && !flag_trapping_math
+    && flag_unsafe_math_optimizations)
+  {
+    loongarch_emit_swdivsf (operands[0], operands[1],
+	operands[2], SFmode);
+    DONE;
+  }
+})
 
 (define_insn "*div<mode>3"
   [(set (match_operand:ANYF 0 "register_operand" "=f")
@@ -1126,7 +1138,23 @@ (define_insn "*fnma<mode>4"
 ;;
 ;;  ....................
 
-(define_insn "sqrt<mode>2"
+(define_expand "sqrt<mode>2"
+  [(set (match_operand:ANYF 0 "register_operand")
+    (sqrt:ANYF (match_operand:ANYF 1 "register_operand")))]
+  ""
+ {
+  if (<MODE>mode == SFmode
+      && TARGET_RECIP_SQRT
+      && flag_unsafe_math_optimizations
+      && !optimize_insn_for_size_p ()
+      && flag_finite_math_only && !flag_trapping_math)
+    {
+      loongarch_emit_swrsqrtsf (operands[0], operands[1], SFmode, 0);
+      DONE;
+    }
+ })
+
+(define_insn "*sqrt<mode>2"
   [(set (match_operand:ANYF 0 "register_operand" "=f")
 	(sqrt:ANYF (match_operand:ANYF 1 "register_operand" "f")))]
   ""
@@ -1135,6 +1163,19 @@ (define_insn "sqrt<mode>2"
    (set_attr "mode" "<UNITMODE>")
    (set_attr "insn_count" "1")])
 
+(define_expand "rsqrt<mode>2"
+  [(set (match_operand:ANYF 0 "register_operand")
+    (unspec:ANYF [(match_operand:ANYF 1 "register_operand")]
+	   UNSPEC_RSQRT))]
+  "TARGET_HARD_FLOAT"
+{
+   if (<MODE>mode == SFmode && TARGET_RECIP_RSQRT)
+     {
+       loongarch_emit_swrsqrtsf (operands[0], operands[1], SFmode, 1);
+       DONE;
+     }
+})
+
 (define_insn "*rsqrt<mode>2"
   [(set (match_operand:ANYF 0 "register_operand" "=f")
     (unspec:ANYF [(match_operand:ANYF 1 "register_operand" "f")]
diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt
index cdd59ae4fcf..61d25130ea9 100644
--- a/gcc/config/loongarch/loongarch.opt
+++ b/gcc/config/loongarch/loongarch.opt
@@ -31,6 +31,9 @@ config/loongarch/loongarch-opts.h
 HeaderInclude
 config/loongarch/loongarch-str.h
 
+TargetVariable
+unsigned int recip_mask = 0
+
 ; ISA related options
 ;; Base ISA
 Enum
@@ -202,6 +205,14 @@ mexplicit-relocs
 Target Var(la_opt_explicit_relocs_backward) Init(M_OPT_UNSET)
 Use %reloc() assembly operators (for backward compatibility).
 
+mrecip
+Target RejectNegative Var(loongarch_recip)
+Generate approximate reciprocal divide and square root for better throughput.
+
+mrecip=
+Target RejectNegative Joined Var(loongarch_recip_name)
+Control generation of reciprocal estimates.
+
 ; The code model option names for -mcmodel.
 Enum
 Name(cmodel) Type(int)
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index 06402e3b353..55810041d39 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -1083,7 +1083,25 @@ (define_insn "mul<mode>3"
   [(set_attr "type" "simd_fmul")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "div<mode>3"
+(define_expand "div<mode>3"
+  [(set (match_operand:FLSX 0 "register_operand")
+    (div:FLSX (match_operand:FLSX 1 "reg_or_vecotr_1_operand")
+	      (match_operand:FLSX 2 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  if (<MODE>mode == V4SFmode
+    && TARGET_RECIP_VEC_DIV
+    && optimize_insn_for_speed_p ()
+    && flag_finite_math_only && !flag_trapping_math
+    && flag_unsafe_math_optimizations)
+  {
+    loongarch_emit_swdivsf (operands[0], operands[1],
+	operands[2], V4SFmode);
+    DONE;
+  }
+})
+
+(define_insn "*div<mode>3"
   [(set (match_operand:FLSX 0 "register_operand" "=f")
 	(div:FLSX (match_operand:FLSX 1 "register_operand" "f")
 		  (match_operand:FLSX 2 "register_operand" "f")))]
@@ -1112,7 +1130,23 @@ (define_insn "fnma<mode>4"
   [(set_attr "type" "simd_fmadd")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "sqrt<mode>2"
+(define_expand "sqrt<mode>2"
+  [(set (match_operand:FLSX 0 "register_operand")
+    (sqrt:FLSX (match_operand:FLSX 1 "register_operand")))]
+  "ISA_HAS_LSX"
+{
+  if (<MODE>mode == V4SFmode
+      && TARGET_RECIP_VEC_SQRT
+      && flag_unsafe_math_optimizations
+      && optimize_insn_for_speed_p ()
+      && flag_finite_math_only && !flag_trapping_math)
+    {
+      loongarch_emit_swrsqrtsf (operands[0], operands[1], V4SFmode, 0);
+      DONE;
+    }
+})
+
+(define_insn "*sqrt<mode>2"
   [(set (match_operand:FLSX 0 "register_operand" "=f")
 	(sqrt:FLSX (match_operand:FLSX 1 "register_operand" "f")))]
   "ISA_HAS_LSX"
@@ -1559,7 +1593,20 @@ (define_insn "lsx_vfrecipe_<flsxfmt>"
   [(set_attr "type" "simd_fdiv")
    (set_attr "mode" "<MODE>")])
 
-(define_insn "rsqrt<mode>2"
+(define_expand "rsqrt<mode>2"
+  [(set (match_operand:FLSX 0 "register_operand" "=f")
+    (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
+	     UNSPEC_LSX_VFRSQRT))]
+ "ISA_HAS_LSX"
+{
+ if (<MODE>mode == V4SFmode && TARGET_RECIP_VEC_RSQRT)
+   {
+     loongarch_emit_swrsqrtsf (operands[0], operands[1], V4SFmode, 1);
+     DONE;
+   }
+})
+
+(define_insn "*rsqrt<mode>2"
   [(set (match_operand:FLSX 0 "register_operand" "=f")
     (unspec:FLSX [(match_operand:FLSX 1 "register_operand" "f")]
 		 UNSPEC_LSX_VFRSQRT))]
diff --git a/gcc/config/loongarch/predicates.md b/gcc/config/loongarch/predicates.md
index f7796da10b2..9e9ce58cb53 100644
--- a/gcc/config/loongarch/predicates.md
+++ b/gcc/config/loongarch/predicates.md
@@ -235,6 +235,10 @@ (define_predicate "reg_or_1_operand"
   (ior (match_operand 0 "const_1_operand")
        (match_operand 0 "register_operand")))
 
+(define_predicate "reg_or_vecotr_1_operand"
+  (ior (match_operand 0 "const_vector_1_operand")
+       (match_operand 0 "register_operand")))
+
 ;; These are used in vec_merge, hence accept bitmask as const_int.
 (define_predicate "const_exp_2_operand"
   (and (match_code "const_int")
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 6fe63b5f999..bb83edbcb8f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -1051,6 +1051,7 @@ Objective-C and Objective-C++ Dialects}.
 -mexplicit-relocs=@var{style} -mexplicit-relocs -mno-explicit-relocs
 -mdirect-extern-access -mno-direct-extern-access
 -mcmodel=@var{code-model} -mrelax -mpass-mrelax-to-as}
+-mrecip  -mrecip=@var{opt}
 
 @emph{M32R/D Options}
 @gccoptlist{-m32r2  -m32rx  -m32r
@@ -26598,6 +26599,59 @@ detecting corresponding assembler support:
 This option is mostly useful for debugging, or interoperation with
 assemblers different from the build-time one.
 
+@opindex mrecip
+@item -mrecip
+This option enables use of the reciprocal estimate and reciprocal square
+root estimate instructions with additional Newton-Raphson steps to increase
+precision instead of doing a divide or square root and divide for
+floating-point arguments.
+These instructions are generated only when @option{-funsafe-math-optimizations}
+is enabled together with @option{-ffinite-math-only} and
+@option{-fno-trapping-math}.
+This option is off by default. Before you can use this option, you must sure the
+target CPU supports frecipe and frsqrte instructions.
+Note that while the throughput of the sequence is higher than the throughput of
+the non-reciprocal instruction, the precision of the sequence can be decreased
+by up to 2 ulp (i.e. the inverse of 1.0 equals 0.99999994).
+
+@opindex mrecip=opt
+@item -mrecip=@var{opt}
+This option controls which reciprocal estimate instructions
+may be used.  @var{opt} is a comma-separated list of options, which may
+be preceded by a @samp{!} to invert the option:
+
+@table @samp
+@item all
+Enable all estimate instructions.
+
+@item default
+Enable the default instructions, equivalent to @option{-mrecip}.
+
+@item none
+Disable all estimate instructions, equivalent to @option{-mno-recip}.
+
+@item div
+Enable the approximation for scalar division.
+
+@item vec-div
+Enable the approximation for vectorized division.
+
+@item sqrt
+Enable the approximation for scalar square root.
+
+@item vec-sqrt
+Enable the approximation for vectorized square root.
+
+@item rsqrt
+Enable the approximation for scalar reciprocal square root.
+
+@item vec-rsqrt
+Enable the approximation for vectorized reciprocal square root.
+@end table
+
+So, for example, @option{-mrecip=all,!sqrt} enables
+all of the reciprocal approximations, except for scalar square root.
+
 @item loongarch-vect-unroll-limit
 The vectorizer will use available tuning information to determine whether it
 would be beneficial to unroll the main vectorized loop and by how much.  This
diff --git a/gcc/testsuite/gcc.target/loongarch/divf.c b/gcc/testsuite/gcc.target/loongarch/divf.c
new file mode 100644
index 00000000000..6c831817c9e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/divf.c
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mfrecipe -fno-unsafe-math-optimizations" } */
+/* { dg-final { scan-assembler "fdiv.s" } } */
+/* { dg-final { scan-assembler-not "frecipe.s" } } */
+
+float
+foo(float a, float b)
+{
+  return a / b;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/recip-divf.c b/gcc/testsuite/gcc.target/loongarch/recip-divf.c
new file mode 100644
index 00000000000..db5e3e48888
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/recip-divf.c
@@ -0,0 +1,9 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mfrecipe" } */
+/* { dg-final { scan-assembler "frecipe.s" } } */
+
+float
+foo(float a, float b)
+{
+  return a / b;
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/recip-sqrtf.c b/gcc/testsuite/gcc.target/loongarch/recip-sqrtf.c
new file mode 100644
index 00000000000..7f45db6cdea
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/recip-sqrtf.c
@@ -0,0 +1,23 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mfrecipe" } */
+/* { dg-final { scan-assembler-times "frsqrte.s" 3 } } */
+
+extern float sqrtf (float);
+
+float
+foo1 (float a, float b)
+{
+  return a/sqrtf(b);
+}
+
+float
+foo2 (float a, float b)
+{
+  return sqrtf(a/b);
+}
+
+float
+foo3 (float a)
+{
+  return sqrtf(a);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/sqrtf.c b/gcc/testsuite/gcc.target/loongarch/sqrtf.c
new file mode 100644
index 00000000000..c2720faac7b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/sqrtf.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mfrecipe -fno-unsafe-math-optimizations" } */
+/* { dg-final { scan-assembler-times "fsqrt.s" 3 } } */
+/* { dg-final { scan-assembler-not "frsqrte.s" } } */
+
+extern float sqrtf (float);
+
+float
+foo1 (float a, float b)
+{
+  return a/sqrtf(b);
+}
+
+float
+foo2 (float a, float b)
+{
+  return sqrtf(a/b);
+}
+
+float
+foo3 (float a)
+{
+  return sqrtf(a);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-divf.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-divf.c
new file mode 100644
index 00000000000..748a82200d9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-divf.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mrecip -mlasx -mfrecipe -fno-unsafe-math-optimizations" } */
+/* { dg-final { scan-assembler "xvfdiv.s" } } */
+/* { dg-final { scan-assembler-not "xvfrecipe.s" } } */
+
+float a[8],b[8],c[8];
+
+void 
+foo ()
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = a[i] / b[i];
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-divf.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-divf.c
new file mode 100644
index 00000000000..6532756f07d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-divf.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mlasx -mfrecipe" } */
+/* { dg-final { scan-assembler "xvfrecipe.s" } } */
+
+float a[8],b[8],c[8];
+
+void
+foo ()
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = a[i] / b[i];
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-sqrtf.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-sqrtf.c
new file mode 100644
index 00000000000..a623dff8f27
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip-sqrtf.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mlasx -mfrecipe" } */
+/* { dg-final { scan-assembler-times "xvfrsqrte.s" 3 } } */
+
+float a[8], b[8], c[8];
+
+extern float sqrtf (float);
+
+void
+foo1 (void)
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = a[i] / sqrtf (b[i]);
+}
+
+void
+foo2 (void)
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = sqrtf (a[i] / b[i]);
+}
+
+void
+foo3 (void)
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = sqrtf (a[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip.c
new file mode 100644
index 00000000000..083c868406b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-recip.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlasx -fno-vect-cost-model" } */
+/* { dg-final { scan-assembler "xvfrecip.s" } } */
+/* { dg-final { scan-assembler "xvfrecip.d" } } */
+/* { dg-final { scan-assembler-not "xvfdiv.s" } } */
+/* { dg-final { scan-assembler-not "xvfdiv.d" } } */
+
+float a[8], b[8];
+
+void 
+foo1(void)
+{
+  for (int i = 0; i < 8; i++)
+    a[i] = 1 / (b[i]);
+}
+
+double da[4], db[4];
+
+void
+foo2(void)
+{
+  for (int i = 0; i < 4; i++)
+    da[i] = 1 / (db[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-sqrtf.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-sqrtf.c
new file mode 100644
index 00000000000..a005a38865d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-sqrtf.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -fno-unsafe-math-optimizations  -mrecip -mlasx -mfrecipe" } */
+/* { dg-final { scan-assembler-times "xvfsqrt.s" 3 } } */
+/* { dg-final { scan-assembler-not "xvfrsqrte.s" } } */
+
+float a[8], b[8], c[8];
+
+extern float sqrtf (float);
+
+void
+foo1 (void)
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = a[i] / sqrtf (b[i]);
+}
+
+void
+foo2 (void)
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = sqrtf (a[i] / b[i]);
+}
+
+void
+foo3 (void)
+{
+  for (int i = 0; i < 8; i++)
+    c[i] = sqrtf (a[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-divf.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-divf.c
new file mode 100644
index 00000000000..1219b1ef842
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-divf.c
@@ -0,0 +1,13 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mlsx -mfrecipe -fno-unsafe-math-optimizations" } */
+/* { dg-final { scan-assembler "vfdiv.s" } } */
+/* { dg-final { scan-assembler-not "vfrecipe.s" } } */
+
+float a[4],b[4],c[4];
+
+void
+foo ()
+{
+  for (int i = 0; i < 4; i++)
+    c[i] = a[i] / b[i];
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-divf.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-divf.c
new file mode 100644
index 00000000000..edbe8d9098f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-divf.c
@@ -0,0 +1,12 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mlsx -mfrecipe" } */
+/* { dg-final { scan-assembler "vfrecipe.s" } } */
+
+float a[4],b[4],c[4];
+
+void
+foo ()
+{
+  for (int i = 0; i < 4; i++)
+    c[i] = a[i] / b[i];
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-sqrtf.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-sqrtf.c
new file mode 100644
index 00000000000..d356f915eb5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip-sqrtf.c
@@ -0,0 +1,28 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mlsx -mfrecipe" } */
+/* { dg-final { scan-assembler-times "vfrsqrte.s" 3 } } */
+
+float a[4], b[4], c[4];
+
+extern float sqrtf (float);
+
+void
+foo1 (void)
+{
+  for (int i = 0; i < 4; i++)
+    c[i] = a[i] / sqrtf (b[i]);
+}
+
+void
+foo2 (void)
+{
+  for (int i = 0; i < 4; i++)
+    c[i] = sqrtf (a[i] / b[i]);
+}
+
+void
+foo3 (void)
+{
+  for (int i = 0; i < 4; i++)
+    c[i] = sqrtf (a[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip.c
new file mode 100644
index 00000000000..c4d6af4db93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-recip.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mlsx -fno-vect-cost-model" } */
+/* { dg-final { scan-assembler "vfrecip.s" } } */
+/* { dg-final { scan-assembler "vfrecip.d" } } */
+/* { dg-final { scan-assembler-not "vfdiv.s" } } */
+/* { dg-final { scan-assembler-not "vfdiv.d" } } */
+
+float a[4], b[4];
+
+void
+foo1(void)
+{
+  for (int i = 0; i < 4; i++)
+    a[i] = 1 / (b[i]);
+}
+
+double da[2], db[2];
+
+void
+foo2(void)
+{
+  for (int i = 0; i < 2; i++)
+    da[i] = 1 / (db[i]);
+}
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-sqrtf.c b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-sqrtf.c
new file mode 100644
index 00000000000..3ff6570a67a
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lsx/lsx-sqrtf.c
@@ -0,0 +1,29 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -ffast-math -mrecip -mlsx -mfrecipe -fno-unsafe-math-optimizations" } */
+/* { dg-final { scan-assembler-times "vfsqrt.s" 3 } } */
+/* { dg-final { scan-assembler-not "vfrsqrte.s" } } */
+
+float a[4], b[4], c[4];
+
+extern float sqrtf (float);
+
+void
+foo1 (void)
+{
+  for (int i = 0; i < 4; i++)
+    c[i] = a[i] / sqrtf (b[i]);
+}
+
+void
+foo2 (void)
+{
+  for (int i = 0; i < 4; i++)
+    c[i] = sqrtf (a[i] / b[i]);
+}
+
+void
+foo3 (void)
+{
+  for (int i = 0; i < 4; i++)
+    c[i] = sqrtf (a[i]);
+}

From patchwork Wed Dec  6 07:04:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiahao Xu <xujiahao@loongson.cn>
X-Patchwork-Id: 174337
Return-Path: <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:bcd1:0:b0:403:3b70:6f57 with SMTP id r17csp3929502vqy;
        Tue, 5 Dec 2023 23:06:06 -0800 (PST)
X-Google-Smtp-Source: 
 AGHT+IFSEr6M8NAYzcgMM7y2dcuu83aQJxHbwr8ldwS+drstr6NweBOj1kIdjoGzfW7E9tSNtSwA
X-Received: by 2002:ac8:5702:0:b0:425:4043:50f9 with SMTP id
 2-20020ac85702000000b00425404350f9mr550712qtw.136.1701846366023;
        Tue, 05 Dec 2023 23:06:06 -0800 (PST)
ARC-Seal: i=2; a=rsa-sha256; t=1701846366; cv=pass;
        d=google.com; s=arc-20160816;
        b=CpD4m7srXsk8C8lQcbiL+JnQz6pnZA2li0wUj3ymjLyuMxj+Jyp5+zhDTww9mPoudH
         Fd6UCiG6IOQK7VC5138hNkRBJSVHUaSKKjy5KHBcl4aVTsXf7waROckNU7e3i/z/hsfO
         mrihL9FLdUPyrlrMelVgHDeTAe5MR89EVotzLnFv6xrNZ7d2Xq9jJDQKixte1EUppN+u
         OFgJ61shGmrgoh0MD1ty02xv1oe3tO751/toX8NlpbX3RGkFDcJN50SVQU4mZKkRZYCl
         q8wlfrPFm8OI2RoKM7TkTpu56r07UrfnNPkEVmV86z22b+jphX4sY3mchjzS7Ajbr86g
         YKHA==
ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com;
 s=arc-20160816;
        h=errors-to:list-subscribe:list-help:list-post:list-archive
         :list-unsubscribe:list-id:precedence:content-transfer-encoding
         :mime-version:references:in-reply-to:message-id:date:subject:cc:to
         :from:arc-filter:dmarc-filter:delivered-to;
        bh=KLiUBS5SYD2FaZTiM2B/7ycJng9ZJM9mVL9Gp/K+DfY=;
        fh=w+xsGLuzpTj56E0bzPMCc39RWHkBXt2f2AGs+4pGimo=;
        b=ANEeK5+/o7dC21trYwaf2nkWMJSE708su2Ekg8147hkStb7leF1fB2dM3hn6NQUEeN
         3/CZSVAFs3bQ+j6l+rVAGs8Vy8rwxRzqYZfJxEqQ2K5LMqidOksZJsJP2YgqiKEwCY2+
         uT/rzx5gH+aad8ZrWkK+R3oKHGTU36q0rlhn0q6Tf1C7h0i+OsOZ6mSscF9x86V8yDTO
         haVWsUeDw91jKpAvyPgQPReQX7wpvhfXznqUZ3CJu7yb3HqCLGwKbkfUpCehN4KXvHoW
         vVM42RrM8M1tvPr69AsC/Fuz5jQJl3tOrfAhWWJKpfPbSd9tOtrQWXkaTkono5YYg3QA
         /ZQQ==
ARC-Authentication-Results: i=2; mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (server2.sourceware.org.
 [2620:52:3:1:0:246e:9693:128c])
        by mx.google.com with ESMTPS id
 l11-20020ac8078b000000b0042540184032si9137794qth.336.2023.12.05.23.06.05
        for <ouuuleilei@gmail.com>
        (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
        Tue, 05 Dec 2023 23:06:06 -0800 (PST)
Received-SPF: pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 client-ip=2620:52:3:1:0:246e:9693:128c;
Authentication-Results: mx.google.com;
       arc=pass (i=1);
       spf=pass (google.com: domain of
 gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates
 2620:52:3:1:0:246e:9693:128c as permitted sender)
 smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org"
Received: from server2.sourceware.org (localhost [IPv6:::1])
	by sourceware.org (Postfix) with ESMTP id 6A582384577C
	for <ouuuleilei@gmail.com>; Wed,  6 Dec 2023 07:06:01 +0000 (GMT)
X-Original-To: gcc-patches@gcc.gnu.org
Delivered-To: gcc-patches@gcc.gnu.org
Received: from mail.loongson.cn (mail.loongson.cn [114.242.206.163])
 by sourceware.org (Postfix) with ESMTP id D6DE338449F5
 for <gcc-patches@gcc.gnu.org>; Wed,  6 Dec 2023 07:05:13 +0000 (GMT)
DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org D6DE338449F5
Authentication-Results: sourceware.org;
 dmarc=none (p=none dis=none) header.from=loongson.cn
Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=loongson.cn
ARC-Filter: OpenARC Filter v1.0.0 sourceware.org D6DE338449F5
Authentication-Results: server2.sourceware.org;
 arc=none smtp.remote-ip=114.242.206.163
ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1701846316; cv=none;
 b=grI5V88W/9hi2dyksf1KupjxdfscchPN15mviU2AxMId3SEgRF3pDHVLecpbepLa7fMk5dWlYqWbAFYotPRQg9sj7p3PA9RPXh+e1snx7LeoLBBcBNzJQFIpSsyu3JLWKdcJXe/NrBxQOAkWTGngWOOzMGsy4eiVLB9HAmt26Hw=
ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key;
 t=1701846316; c=relaxed/simple;
 bh=gtvKEAUbYRjuSptFnR0xIoa60twLkfCjw3OJJ2hx9L4=;
 h=From:To:Subject:Date:Message-Id:MIME-Version;
 b=ro6WydDUyYvEU/2M+xnjJn4RonPKRg/M1KBEYmuLtCGkwHiJuFFl+Gut46FMm4SFg6DZJjyTZ6d5YGBrhUJbyWObTvilhF1s7suepowvV8RZ3xaEG99xq/2HT+xuUv1QuSzDWgiCjLmuNBMjYmzbZdv905HldrdIhM87P+uOa4Y=
ARC-Authentication-Results: i=1; server2.sourceware.org
Received: from loongson.cn (unknown [10.10.130.252])
 by gateway (Coremail) with SMTP id _____8Cxc_AoHXBlUDs_AA--.60467S3;
 Wed, 06 Dec 2023 15:05:12 +0800 (CST)
Received: from slurm-master.loongson.cn (unknown [10.10.130.252])
 by localhost.localdomain (Coremail) with SMTP id
 AQAAf8Dxvi8XHXBlp0BWAA--.59594S9;
 Wed, 06 Dec 2023 15:05:11 +0800 (CST)
From: Jiahao Xu <xujiahao@loongson.cn>
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn,
 xuchenghua@loongson.cn, Jiahao Xu <xujiahao@loongson.cn>
Subject: [PATCH v3 5/5] LoongArch: Vectorized loop unrolling is disable for
 divf/sqrtf/rsqrtf when -mrecip is enabled.
Date: Wed,  6 Dec 2023 15:04:53 +0800
Message-Id: <20231206070453.3252-6-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20231206070453.3252-1-xujiahao@loongson.cn>
References: <20231206070453.3252-1-xujiahao@loongson.cn>
MIME-Version: 1.0
X-CM-TRANSID: AQAAf8Dxvi8XHXBlp0BWAA--.59594S9
X-CM-SenderInfo: 50xmxthkdrqz5rrqw2lrqou0/
X-Coremail-Antispam: 1Uk129KBj93XoW7KrWUAFW7trWfZryrCry7twc_yoW8tFyUpr
 ZIyr13tw4DJr47WrsrJ3yxWw1ayr9xGF42qa13ta4fCa17Kr1Fq3WkKr1qvFZrX3y5WryI
 vr1IqFs8Za45CwbCm3ZEXasCq-sJn29KB7ZKAUJUUUU5529EdanIXcx71UUUUU7KY7ZEXa
 sCq-sGcSsGvfJ3Ic02F40EFcxC0VAKzVAqx4xG6I80ebIjqfuFe4nvWSU5nxnvy29KBjDU
 0xBIdaVrnRJUUUk2b4IE77IF4wAFF20E14v26r1j6r4UM7CY07I20VC2zVCF04k26cxKx2
 IYs7xG6rWj6s0DM7CIcVAFz4kK6r106r15M28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48v
 e4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Xr0_Ar1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI
 0_Gr0_Cr1l84ACjcxK6I8E87Iv67AKxVW8Jr0_Cr1UM28EF7xvwVC2z280aVCY1x0267AK
 xVW8Jr0_Cr1UM2AIxVAIcxkEcVAq07x20xvEncxIr21l57IF6xkI12xvs2x26I8E6xACxx
 1l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20xvE14v26r1q6rW5McIj6I8E87Iv
 67AKxVW8JVWxJwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xvr2IYc2Ij64vIr41l42xK82IYc2
 Ij64vIr41l4I8I3I0E4IkC6x0Yz7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s02
 6x8GjcxK67AKxVWUGVWUWwC2zVAF1VAY17CE14v26r126r1DMIIYrxkI7VAKI48JMIIF0x
 vE2Ix0cI8IcVAFwI0_Gr0_Xr1lIxAIcVC0I7IYx2IY6xkF7I0E14v26r4j6F4UMIIF0xvE
 42xK8VAvwI8IcIk0rVWUJVWUCwCI42IY6I8E87Iv67AKxVW8JVWxJwCI42IY6I8E87Iv6x
 kF7I0E14v26r4j6r4UJbIYCTnIWIevJa73UjIFyTuYvjxUcHUqUUUUU
X-Spam-Status: No, score=-13.1 required=5.0 tests=BAYES_00, GIT_PATCH_0,
 KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP,
 T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 server2.sourceware.org
X-BeenThere: gcc-patches@gcc.gnu.org
X-Mailman-Version: 2.1.30
Precedence: list
List-Id: Gcc-patches mailing list <gcc-patches.gcc.gnu.org>
List-Unsubscribe: <https://gcc.gnu.org/mailman/options/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=unsubscribe>
List-Archive: <https://gcc.gnu.org/pipermail/gcc-patches/>
List-Post: <mailto:gcc-patches@gcc.gnu.org>
List-Help: <mailto:gcc-patches-request@gcc.gnu.org?subject=help>
List-Subscribe: <https://gcc.gnu.org/mailman/listinfo/gcc-patches>,
 <mailto:gcc-patches-request@gcc.gnu.org?subject=subscribe>
Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1784515255274542829
X-GMAIL-MSGID: 1784515255274542829

Using -mrecip generates a sequence of instructions to replace divf, sqrtf and rsqrtf. The number
of generated instructions is close to or exceeds the maximum issue instructions per cycle of the
LoongArch, so vectorized loop unrolling is not performed on them.

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_vector_costs::determine_suggested_unroll_factor):
	If m_has_recip is true, uf return 1.
	(loongarch_vector_costs::add_stmt_cost): Detect the use of approximate instruction sequence.

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 2c06edcff92..0ca60e15ced 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -3974,7 +3974,9 @@ protected:
   /* Reduction factor for suggesting unroll factor.  */
   unsigned m_reduc_factor = 0;
   /* True if the loop contains an average operation. */
-  bool m_has_avg =false;
+  bool m_has_avg = false;
+  /* True if the loop uses approximation instruction sequence.  */
+  bool m_has_recip = false;
 };
 
 /* Implement TARGET_VECTORIZE_CREATE_COSTS.  */
@@ -4021,7 +4023,7 @@ loongarch_vector_costs::determine_suggested_unroll_factor (loop_vec_info loop_vi
 {
   class loop *loop = LOOP_VINFO_LOOP (loop_vinfo);
 
-  if (m_has_avg)
+  if (m_has_avg || m_has_recip)
     return 1;
 
   /* Don't unroll if it's specified explicitly not to be unrolled.  */
@@ -4081,6 +4083,36 @@ loongarch_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
 	}
     }
 
+  combined_fn cfn;
+  if (kind == vector_stmt
+      && stmt_info
+      && stmt_info->stmt)
+    {
+      /* Detect the use of approximate instruction sequence.  */
+      if ((TARGET_RECIP_VEC_SQRT || TARGET_RECIP_VEC_RSQRT)
+	  && (cfn = gimple_call_combined_fn (stmt_info->stmt)) != CFN_LAST)
+	switch (cfn)
+	  {
+	  case CFN_BUILT_IN_SQRTF:
+	    m_has_recip = true;
+	  default:
+	    break;
+	  }
+      else if (TARGET_RECIP_VEC_DIV
+	       && gimple_code (stmt_info->stmt) == GIMPLE_ASSIGN)
+	{
+	  machine_mode mode = TYPE_MODE (vectype);
+	  switch (gimple_assign_rhs_code (stmt_info->stmt))
+	    {
+	    case RDIV_EXPR:
+	      if (GET_MODE_INNER (mode) == SFmode)
+		m_has_recip = true;
+	    default:
+	      break;
+	    }
+	}
+    }
+
   return retval;
 }