From patchwork Mon Oct 16 02:00:12 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiahao Xu X-Patchwork-Id: 153135 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp3189113vqb; Sun, 15 Oct 2023 19:01:48 -0700 (PDT) X-Google-Smtp-Source: AGHT+IECXvz90qaNfncSbczXKJrIgGd0TaHjMgV6krhbY2GmivwqOU7wQG005GJQVl2kX0gBks0c X-Received: by 2002:a05:620a:40c1:b0:774:2dc0:649b with SMTP id g1-20020a05620a40c100b007742dc0649bmr8376418qko.18.1697421708407; Sun, 15 Oct 2023 19:01:48 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1697421708; cv=pass; d=google.com; s=arc-20160816; b=AfTz1Vm45xFuSUc3KvSa2nXtyzwpvhk0F4RWrZn5pPOm3M7fPf/uNsIVF8nX9/glEk Hki81JbCn8MguS3WnzM2ku+bKLV5Yhc5wrS5/OCwy5CC+bs+nUWUwiucTDlSxO3UP1+E gKlOc3R4oepeJeujRl/iVbwKYKZUP1GsPmcQXHeVVsE+l8u0XCpE64t98I5IUHQsRSd1 A0tzmuA6Xp2H47i7V8ldA/dOkpeTkf3DjD0Iu1Y5J7AJLt0A3mxOn/uu7ztBUYT93RV+ zm/T5BDdY6AdsigZGMUmq995Dmh554MC27teeIPOjj1gkkD3C4g5D6VcetnIIoK+jJXd baAg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:arc-filter:dmarc-filter:delivered-to; bh=geoCEOCCSNdxOA/zLJ5MUcOW1w4e4HjvvWptNvD7zU4=; fh=w+xsGLuzpTj56E0bzPMCc39RWHkBXt2f2AGs+4pGimo=; b=nkPu6M+gjEXr8PME7c7baovuaAtboAjkxpNiqTFb2KlSLl5j8s8qL4n7Hau7FU9wUz hoHZP3ytWkOnvANreZKBB9Rf8N0R5q1QtDLn42R/bgpWsEnBSmuzcsSCVeMENl+a4Kjq 3piPCbe5KMFAentCbnxkrNipO8HxWQCmhIbfhIiJ5fEqJAjay7r2Z1Qo2CkCPq3D6+78 xgmuXTmwDxz5dlfb0G+067tafCgILz0f9TbQg46lDYIfYF/ctiwyx8I8bHyK8HGmRJ/3 O7+e8+kxcehVI70DvGiW3I1R2zT6zjjpAXXvR09qDlipIWAGbiuKlTCuQTb7UEibz/eh lzvg== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 
2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (server2.sourceware.org. [2620:52:3:1:0:246e:9693:128c]) by mx.google.com with ESMTPS id m2-20020a05620a24c200b007743658a2besi4924892qkn.499.2023.10.15.19.01.48 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 15 Oct 2023 19:01:48 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) client-ip=2620:52:3:1:0:246e:9693:128c; Authentication-Results: mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 2620:52:3:1:0:246e:9693:128c as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2E93F3858423 for ; Mon, 16 Oct 2023 02:01:48 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id BBC323858415 for ; Mon, 16 Oct 2023 02:01:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BBC323858415 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=loongson.cn ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BBC323858415 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697421673; cv=none; b=A5AojZ0CxkPf25d7/FjADc8XUWF7p1TobuqGPGY3yeR/NPHg+5eDZbVGMKYFuPUnR3a7svYOXqhXNhrgulWc0KhzMjYQyjGQSrsCoNZ+0CXewdm4PdczDv64jHUUBKwjhuc7k4hpqiHSByzO7uxONUDysAXceedgc7QD+4EhJcw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; 
t=1697421673; c=relaxed/simple; bh=R3hZGIDSbaH5pcawj+/5WIcu2WoMrjLqpvgJ4BORogo=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=hkIaI6Nmhbu6KJU8qeCjQFuCCpwXTvs1Un8t7Bb4vVjLbpPSEOaIbKEwNpUZMrVLamqyBl3TF4ijbkA26XqwvXsd5NnupSZlCcgRLnIs65PI4KgSHu6LodaV951Z7T+lQLLnzRoN0u6epDLznW0L+PXROc1kJthVOI5DyjVFr7I= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsCuI-0005W7-E7 for gcc-patches@gcc.gnu.org; Sun, 15 Oct 2023 22:01:01 -0400 Received: from loongson.cn (unknown [10.10.130.252]) by gateway (Coremail) with SMTP id _____8Cx5_E1mSxl4zkyAA--.31106S3; Mon, 16 Oct 2023 10:00:21 +0800 (CST) Received: from slurm-master.loongson.cn (unknown [10.10.130.252]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxbNwvmSxlz_4lAA--.14027S5; Mon, 16 Oct 2023 10:00:20 +0800 (CST) From: Jiahao Xu To: gcc-patches@gcc.gnu.org Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn, Jiahao Xu Subject: [PATCH 1/3] LoongArch:Implement avg and sad standard names. 
Date: Mon, 16 Oct 2023 10:00:12 +0800 Message-Id: <20231016020014.41979-2-xujiahao@loongson.cn> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20231016020014.41979-1-xujiahao@loongson.cn> References: <20231016020014.41979-1-xujiahao@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxbNwvmSxlz_4lAA--.14027S5 X-CM-SenderInfo: 50xmxthkdrqz5rrqw2lrqou0/ X-Coremail-Antispam: 1Uk129KBj93XoWfJw13Cr15WrWxWr4rXFyUCFX_yoWkGw1xp3 97Gw18tr48JFs7Kw1vgFy5Jr47GFsrGF47ZasxGrZFkry7tr92q340yFZIqFyYyw4Yvr17 Xan3Ca12qryxKwcCm3ZEXasCq-sJn29KB7ZKAUJUUUUU529EdanIXcx71UUUUU7KY7ZEXa sCq-sGcSsGvfJ3Ic02F40EFcxC0VAKzVAqx4xG6I80ebIjqfuFe4nvWSU5nxnvy29KBjDU 0xBIdaVrnRJUUUk0b4IE77IF4wAFF20E14v26r1j6r4UM7CY07I20VC2zVCF04k26cxKx2 IYs7xG6rWj6s0DM7CIcVAFz4kK6r1Y6r17M28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48v e4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI 0_Gr0_Cr1l84ACjcxK6I8E87Iv67AKxVWxJVW8Jr1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6r4j6r4UJwAS0I0E0xvYzxvE52x082IY62kv0487Mc804VCY07AIYIkI8VC2zVCFFI0UMc 02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUAVWUtwAv7VC2z280aVAF wI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JMxAIw28IcxkI7V AKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCj r7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUAVWUtwCIc40Y0x0EwIxGrwCI42IY6x IIjxv20xvE14v26r1j6r1xMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWUJVW8JwCI42IY6xAI w20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x 0267AKxVWUJVW8JbIYCTnIWIevJa73UjIFyTuYvjxU2nYFDUUUU Received-SPF: pass client-ip=114.242.206.163; envelope-from=xujiahao@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-14.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, SPF_FAIL, SPF_HELO_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 
(2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1779875665194864813 X-GMAIL-MSGID: 1779875665194864813 gcc/ChangeLog: * config/loongarch/lasx.md (avg3_floor, uavg3_floor, avg3_ceil, uavg3_ceil, ssadv16qi, usadv16qi): New patterns. * config/loongarch/lsx.md (avg3_floor, uavg3_floor, avg3_ceil, uavg3_ceil, ssadv16qi, usadv16qi): New patterns. gcc/testsuite/ChangeLog: * gcc.target/loongarch/avg-ceil-lasx.c: New test. * gcc.target/loongarch/avg-ceil-lsx.c: New test. * gcc.target/loongarch/avg-floor-lasx.c: New test. * gcc.target/loongarch/avg-floor-lsx.c: New test. * gcc.target/loongarch/sad-lasx.c.c: New test. * gcc.target/loongarch/sad-lsx.c: New test. diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md index 2bc5d47ed4a..483d78bb210 100644 --- a/gcc/config/loongarch/lasx.md +++ b/gcc/config/loongarch/lasx.md @@ -5171,3 +5171,77 @@ const0_rtx)); DONE; }) + +(define_expand "avg3_ceil" + [(match_operand:ILASX_WHB 0 "register_operand") + (match_operand:ILASX_WHB 1 "register_operand") + (match_operand:ILASX_WHB 2 "register_operand")] + "ISA_HAS_LASX" +{ + emit_insn (gen_lasx_xvavgr_s_ (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "uavg3_ceil" + [(match_operand:ILASX_WHB 0 "register_operand") + (match_operand:ILASX_WHB 1 "register_operand") + (match_operand:ILASX_WHB 2 "register_operand")] + "ISA_HAS_LASX" +{ + emit_insn (gen_lasx_xvavgr_u_ (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "avg3_floor" + [(match_operand:ILASX_WHB 0 "register_operand") + (match_operand:ILASX_WHB 1 "register_operand") + (match_operand:ILASX_WHB 2 "register_operand")] + "ISA_HAS_LASX" +{ + emit_insn (gen_lasx_xvavg_s_ (operands[0], 
operands[1], operands[2])); + DONE; +}) + +(define_expand "uavg3_floor" + [(match_operand:ILASX_WHB 0 "register_operand") + (match_operand:ILASX_WHB 1 "register_operand") + (match_operand:ILASX_WHB 2 "register_operand")] + "ISA_HAS_LASX" +{ + emit_insn (gen_lasx_xvavg_u_ (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "usadv32qi" + [(match_operand:V8SI 0 "register_operand") + (match_operand:V32QI 1 "register_operand") + (match_operand:V32QI 2 "register_operand") + (match_operand:V8SI 3 "register_operand")] + "ISA_HAS_LASX" +{ + rtx t1 = gen_reg_rtx (V32QImode); + rtx t2 = gen_reg_rtx (V16HImode); + rtx t3 = gen_reg_rtx (V8SImode); + emit_insn (gen_lasx_xvabsd_u_bu (t1, operands[1], operands[2])); + emit_insn (gen_lasx_xvhaddw_h_b (t2, t1, t1)); + emit_insn (gen_lasx_xvhaddw_w_h (t3, t2, t2)); + emit_insn (gen_addv8si3 (operands[0], t3, operands[3])); + DONE; +}) + +(define_expand "ssadv32qi" + [(match_operand:V8SI 0 "register_operand") + (match_operand:V32QI 1 "register_operand") + (match_operand:V32QI 2 "register_operand") + (match_operand:V8SI 3 "register_operand")] + "ISA_HAS_LASX" +{ + rtx t1 = gen_reg_rtx (V32QImode); + rtx t2 = gen_reg_rtx (V16HImode); + rtx t3 = gen_reg_rtx (V8SImode); + emit_insn (gen_lasx_xvabsd_s_b (t1, operands[1], operands[2])); + emit_insn (gen_lasx_xvhaddw_h_b (t2, t1, t1)); + emit_insn (gen_lasx_xvhaddw_w_h (t3, t2, t2)); + emit_insn (gen_addv8si3 (operands[0], t3, operands[3])); + DONE; +}) diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md index 075f6ba569d..b63c6ff4dee 100644 --- a/gcc/config/loongarch/lsx.md +++ b/gcc/config/loongarch/lsx.md @@ -3581,6 +3581,80 @@ DONE; }) +(define_expand "avg3_ceil" + [(match_operand:ILSX_WHB 0 "register_operand") + (match_operand:ILSX_WHB 1 "register_operand") + (match_operand:ILSX_WHB 2 "register_operand")] + "ISA_HAS_LSX" +{ + emit_insn (gen_lsx_vavgr_s_ (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "uavg3_ceil" + 
[(match_operand:ILSX_WHB 0 "register_operand") + (match_operand:ILSX_WHB 1 "register_operand") + (match_operand:ILSX_WHB 2 "register_operand")] + "ISA_HAS_LSX" +{ + emit_insn (gen_lsx_vavgr_u_ (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "avg3_floor" + [(match_operand:ILSX_WHB 0 "register_operand") + (match_operand:ILSX_WHB 1 "register_operand") + (match_operand:ILSX_WHB 2 "register_operand")] + "ISA_HAS_LSX" +{ + emit_insn (gen_lsx_vavg_s_ (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "uavg3_floor" + [(match_operand:ILSX_WHB 0 "register_operand") + (match_operand:ILSX_WHB 1 "register_operand") + (match_operand:ILSX_WHB 2 "register_operand")] + "ISA_HAS_LSX" +{ + emit_insn (gen_lsx_vavg_u_ (operands[0], operands[1], operands[2])); + DONE; +}) + +(define_expand "usadv16qi" + [(match_operand:V4SI 0 "register_operand") + (match_operand:V16QI 1 "register_operand") + (match_operand:V16QI 2 "register_operand") + (match_operand:V4SI 3 "register_operand")] + "ISA_HAS_LSX" +{ + rtx t1 = gen_reg_rtx (V16QImode); + rtx t2 = gen_reg_rtx (V8HImode); + rtx t3 = gen_reg_rtx (V4SImode); + emit_insn (gen_lsx_vabsd_u_bu (t1, operands[1], operands[2])); + emit_insn (gen_lsx_vhaddw_h_b (t2, t1, t1)); + emit_insn (gen_lsx_vhaddw_w_h (t3, t2, t2)); + emit_insn (gen_addv4si3 (operands[0], t3, operands[3])); + DONE; +}) + +(define_expand "ssadv16qi" + [(match_operand:V4SI 0 "register_operand") + (match_operand:V16QI 1 "register_operand") + (match_operand:V16QI 2 "register_operand") + (match_operand:V4SI 3 "register_operand")] + "ISA_HAS_LSX" +{ + rtx t1 = gen_reg_rtx (V16QImode); + rtx t2 = gen_reg_rtx (V8HImode); + rtx t3 = gen_reg_rtx (V4SImode); + emit_insn (gen_lsx_vabsd_s_b (t1, operands[1], operands[2])); + emit_insn (gen_lsx_vhaddw_h_b (t2, t1, t1)); + emit_insn (gen_lsx_vhaddw_w_h (t3, t2, t2)); + emit_insn (gen_addv4si3 (operands[0], t3, operands[3])); + DONE; +}) + (define_insn "lsx_vwev_d_w" [(set (match_operand:V2DI 0 
"register_operand" "=f") (addsubmul:V2DI diff --git a/gcc/testsuite/gcc.target/loongarch/avg-ceil-lasx.c b/gcc/testsuite/gcc.target/loongarch/avg-ceil-lasx.c new file mode 100644 index 00000000000..a4fc7a63f97 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/avg-ceil-lasx.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mlasx" } */ +/* { dg-final { scan-assembler "xvavgr.b" } } */ +/* { dg-final { scan-assembler "xvavgr.bu" } } */ +/* { dg-final { scan-assembler "xvavgr.hu" } } */ +/* { dg-final { scan-assembler "xvavgr.h" } } */ + +#define N 1024 + +#define TEST(TYPE, NAME) \ + TYPE a_##NAME[N], b_##NAME[N], c_##NAME[N]; \ + void f_##NAME (void) \ + { \ + int i; \ + for (i = 0; i < N; i++) \ + a_##NAME[i] = (b_##NAME[i] + c_##NAME[i] + 1) >> 1; \ + } + +TEST(char, 1); +TEST(short, 2); +TEST(unsigned char, 3); +TEST(unsigned short, 4); diff --git a/gcc/testsuite/gcc.target/loongarch/avg-ceil-lsx.c b/gcc/testsuite/gcc.target/loongarch/avg-ceil-lsx.c new file mode 100644 index 00000000000..7aae01600d7 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/avg-ceil-lsx.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mlsx" } */ +/* { dg-final { scan-assembler "vavgr.b" } } */ +/* { dg-final { scan-assembler "vavgr.bu" } } */ +/* { dg-final { scan-assembler "vavgr.hu" } } */ +/* { dg-final { scan-assembler "vavgr.h" } } */ + +#define N 1024 + +#define TEST(TYPE, NAME) \ + TYPE a_##NAME[N], b_##NAME[N], c_##NAME[N]; \ + void f_##NAME (void) \ + { \ + int i; \ + for (i = 0; i < N; i++) \ + a_##NAME[i] = (b_##NAME[i] + c_##NAME[i] + 1) >> 1; \ + } + +TEST(char, 1); +TEST(short, 2); +TEST(unsigned char, 3); +TEST(unsigned short, 4); diff --git a/gcc/testsuite/gcc.target/loongarch/avg-floor-lasx.c b/gcc/testsuite/gcc.target/loongarch/avg-floor-lasx.c new file mode 100644 index 00000000000..da6956f6f91 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/avg-floor-lasx.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options 
"-O3 -mlasx" } */ +/* { dg-final { scan-assembler "xvavg.b" } } */ +/* { dg-final { scan-assembler "xvavg.bu" } } */ +/* { dg-final { scan-assembler "xvavg.hu" } } */ +/* { dg-final { scan-assembler "xvavg.h" } } */ + +#define N 1024 + +#define TEST(TYPE, NAME) \ + TYPE a_##NAME[N], b_##NAME[N], c_##NAME[N]; \ + void f_##NAME (void) \ + { \ + int i; \ + for (i = 0; i < N; i++) \ + a_##NAME[i] = (b_##NAME[i] + c_##NAME[i]) >> 1; \ + } + +TEST(char, 1); +TEST(short, 2); +TEST(unsigned char, 3); +TEST(unsigned short, 4); diff --git a/gcc/testsuite/gcc.target/loongarch/avg-floor-lsx.c b/gcc/testsuite/gcc.target/loongarch/avg-floor-lsx.c new file mode 100644 index 00000000000..d16c23ac0cc --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/avg-floor-lsx.c @@ -0,0 +1,22 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mlsx" } */ +/* { dg-final { scan-assembler "vavg.b" } } */ +/* { dg-final { scan-assembler "vavg.bu" } } */ +/* { dg-final { scan-assembler "vavg.hu" } } */ +/* { dg-final { scan-assembler "vavg.h" } } */ + +#define N 1024 + +#define TEST(TYPE, NAME) \ + TYPE a_##NAME[N], b_##NAME[N], c_##NAME[N]; \ + void f_##NAME (void) \ + { \ + int i; \ + for (i = 0; i < N; i++) \ + a_##NAME[i] = (b_##NAME[i] + c_##NAME[i]) >> 1; \ + } + +TEST(char, 1); +TEST(short, 2); +TEST(unsigned char, 3); +TEST(unsigned short, 4); diff --git a/gcc/testsuite/gcc.target/loongarch/sad-lasx.c b/gcc/testsuite/gcc.target/loongarch/sad-lasx.c new file mode 100644 index 00000000000..47ca4039489 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/sad-lasx.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mlasx" } */ + +#define N 1024 + +#define TEST(SIGN) \ + SIGN char a_##SIGN[N], b_##SIGN[N]; \ + int f_##SIGN (void) \ + { \ + int i, sum = 0; \ + for (i = 0; i < N; i++) \ + sum += __builtin_abs (a_##SIGN[i] - b_##SIGN[i]);; \ + return sum; \ + } + +TEST(signed); +TEST(unsigned); + +/* { dg-final { scan-assembler {\txvabsd.bu\t} } } */ +/* { dg-final { 
scan-assembler {\txvabsd.b\t} } } */ diff --git a/gcc/testsuite/gcc.target/loongarch/sad-lsx.c b/gcc/testsuite/gcc.target/loongarch/sad-lsx.c new file mode 100644 index 00000000000..2aadf3d9309 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/sad-lsx.c @@ -0,0 +1,20 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mlsx" } */ + +#define N 1024 + +#define TEST(SIGN) \ + SIGN char a_##SIGN[N], b_##SIGN[N]; \ + int f_##SIGN (void) \ + { \ + int i, sum = 0; \ + for (i = 0; i < N; i++) \ + sum += __builtin_abs (a_##SIGN[i] - b_##SIGN[i]);; \ + return sum; \ + } + +TEST(signed); +TEST(unsigned); + +/* { dg-final { scan-assembler {\tvabsd.bu\t} } } */ +/* { dg-final { scan-assembler {\tvabsd.b\t} } } */ From patchwork Mon Oct 16 02:00:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiahao Xu X-Patchwork-Id: 153136 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp3189118vqb; Sun, 15 Oct 2023 19:01:50 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHFLnYsCYgy+jvr5DxaycHNy5YGP8xKe2uLdTV9pVJE6Jno5DoeU8JwYcfEpqOmGB52nYqC X-Received: by 2002:a05:622a:14cf:b0:417:d8e0:5024 with SMTP id u15-20020a05622a14cf00b00417d8e05024mr42551355qtx.52.1697421710413; Sun, 15 Oct 2023 19:01:50 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1697421710; cv=pass; d=google.com; s=arc-20160816; b=fCF0vW+ZQqRMUA6p5rpEZU6ZTA8v+I7y0mMMeQ7sLP04Uma1VVc8GTDtMU1jQ6Hg2d d3jKGgodnR0yQ5k20eVfIpc1+CS5OL5Qy0t6p1cc2kkDF5Af89Cx/Q9Z72pkaR6UEh8s AQqeesDOtzq9WZySHX5XKx9MvywViUsOLYyuHVd4EB1dcAH1AnAKaH0tfFMxyJqWYk// a5i0KNbTCCTpHZwZFM22o5rd6KwdqjZjdldN0cPEvtkxmYHDl9lqPcRom9P1yFbNtRBi 15Bg7M59kME9+whc3mEM4yjll9eLknhNOzN3mF/ZDaG2n1fn/58v+ouA/qHcDM2PaBzE /N7w== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding 
:mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:arc-filter:dmarc-filter:delivered-to; bh=4X+JKhJi+sQP14xpigofaV4SE/v76h+RKzXo//TVVn8=; fh=w+xsGLuzpTj56E0bzPMCc39RWHkBXt2f2AGs+4pGimo=; b=fou9v5YZIddaF31Otd0GB74ThSuuLBG65PCAqNXlra+JtR6RM9RkHLdcpJOZtmqHJG 9rBzbWe0QkBmKhinvkVz2LE57L9a3B7QoF6+pUC6p+EpxQ/gQqsMtb40qXdtg9+dVkKh 2S7lO53n6vIDide5c0q45M7ftNwiELC1Qj94Aj9Hk8hjuQt7X62/cFp4Nmu3tpU2xk0G 2isHBKAy2srU0v/sK+0s5bAmqSEGEED3nESc2gJccUTjOMsD6s+qpPe9MO8U9aTSFaDx c02JLGpkTFYaWNIF8qdTRbOeayjFnueORfySpG4OKDfmAlat96Ux71asDWy0L5IvzgZR 7NJw== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. [8.43.85.97]) by mx.google.com with ESMTPS id d10-20020a05622a05ca00b004181bf4f326si4657383qtb.604.2023.10.15.19.01.50 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 15 Oct 2023 19:01:50 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id 2FEAA38582AC for ; Mon, 16 Oct 2023 02:01:50 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from eggs.gnu.org (eggs.gnu.org [IPv6:2001:470:142:3::10]) by sourceware.org (Postfix) with ESMTPS id BBBFF3858C3A for ; Mon, 16 Oct 2023 02:01:07 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org BBBFF3858C3A 
Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: sourceware.org; spf=fail smtp.mailfrom=loongson.cn ARC-Filter: OpenARC Filter v1.0.0 sourceware.org BBBFF3858C3A Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=2001:470:142:3::10 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697421673; cv=none; b=oiAJDhYzOV2bP6rnuP4Cs1BNq7VLEMdDXUsQ1uXaxtPBGWOqariQzffhmC8c8RuMYIgWJdZGRppnrtZdLSHSnP+LGY+jdApIoha4Vs2f9lOf6/kLCK0I6eHKcPoOOhCNYDfIuVfzaeL9T1DUjuO2AO4LaSxjm7eO7flKkT+o0vw= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697421673; c=relaxed/simple; bh=qqYbncI1J0T6lj508Z3DA0UCMdezXNJfWle4U8K15VY=; h=From:To:Subject:Date:Message-Id:MIME-Version; b=wUwFgjMv1uuOImkU5H/GxXSzVUi7DZWynRoFrCvs5i+VvMSkFiMFS7/eIqWqWmwKSlZBtz/9QqjTO1yH3R/Zw6RGXTs/01iNqI70kDkyY75gNpq2CokDlNQ4QIcW17z+hs9J/4cN8GTgu51XLngfqj060UwSMnD8fgXzY9Ax36Y= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from mail.loongson.cn ([114.242.206.163]) by eggs.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qsCuI-0005WK-Dz for gcc-patches@gcc.gnu.org; Sun, 15 Oct 2023 22:01:00 -0400 Received: from loongson.cn (unknown [10.10.130.252]) by gateway (Coremail) with SMTP id _____8Axjus2mSxl5jkyAA--.28528S3; Mon, 16 Oct 2023 10:00:22 +0800 (CST) Received: from slurm-master.loongson.cn (unknown [10.10.130.252]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxbNwvmSxlz_4lAA--.14027S6; Mon, 16 Oct 2023 10:00:22 +0800 (CST) From: Jiahao Xu To: gcc-patches@gcc.gnu.org Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn, Jiahao Xu Subject: [PATCH 2/3] LoongArch:Implement vec_widen standard names. 
Date: Mon, 16 Oct 2023 10:00:13 +0800 Message-Id: <20231016020014.41979-3-xujiahao@loongson.cn> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20231016020014.41979-1-xujiahao@loongson.cn> References: <20231016020014.41979-1-xujiahao@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxbNwvmSxlz_4lAA--.14027S6 X-CM-SenderInfo: 50xmxthkdrqz5rrqw2lrqou0/ X-Coremail-Antispam: 1Uk129KBj93XoW3CFWUKFW5GryfuF13Jr4DWrX_yoWkZw48pr WxCw1YyF48X3Z7Gw1kGa43AwsxKrsrWrnruFnxCrZakr13Kryjgw4IyF9aqFyDJw4Fqr12 9rs5ua1Uu3WUK3gCm3ZEXasCq-sJn29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7KY7ZEXa sCq-sGcSsGvfJ3Ic02F40EFcxC0VAKzVAqx4xG6I80ebIjqfuFe4nvWSU5nxnvy29KBjDU 0xBIdaVrnRJUUUk0b4IE77IF4wAFF20E14v26r1j6r4UM7CY07I20VC2zVCF04k26cxKx2 IYs7xG6rWj6s0DM7CIcVAFz4kK6r126r13M28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48v e4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI 0_Gr0_Cr1l84ACjcxK6I8E87Iv67AKxVWxJVW8Jr1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6r4j6r4UJwAS0I0E0xvYzxvE52x082IY62kv0487Mc804VCY07AIYIkI8VC2zVCFFI0UMc 02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUtVWrXwAv7VC2z280aVAF wI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JMxAIw28IcxkI7V AKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCj r7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUAVWUtwCIc40Y0x0EwIxGrwCI42IY6x IIjxv20xvE14v26r1I6r4UMIIF0xvE2Ix0cI8IcVCY1x0267AKxVWUJVW8JwCI42IY6xAI w20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x 0267AKxVWUJVW8JbIYCTnIWIevJa73UjIFyTuYvjxU7JKsUUUUU Received-SPF: pass client-ip=114.242.206.163; envelope-from=xujiahao@loongson.cn; helo=mail.loongson.cn X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Spam-Status: No, score=-14.2 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, KAM_SHORT, SPF_FAIL, SPF_HELO_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 
(2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1779875667489713270 X-GMAIL-MSGID: 1779875667489713270 Add support for vec_widen lo/hi patterns. These do not directly match on Loongarch lasx instructions but can be emulated with even/odd + vector merge. gcc/ChangeLog: * config/loongarch/lasx.md (vec_widen_add_hi_, vec_widen_add_lo_, vec_widen_sub_hi_, vec_widen_sub_lo_, vec_widen_mult_hi_, vec_widen_mult_lo_): New patterns. * config/loongarch/loongarch.cc (loongarch_expand_vec_widen_hilo):New function. gcc/testsuite/ChangeLog: * gcc.target/loongarch/vect-widen-add.c: New test. * gcc.target/loongarch/vect-widen-sub.c: New test. * gcc.target/loongarch/vect-widen-mul.c: New test. diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md index 483d78bb210..02c6019e1dd 100644 --- a/gcc/config/loongarch/lasx.md +++ b/gcc/config/loongarch/lasx.md @@ -5048,23 +5048,71 @@ [(set_attr "type" "simd_store") (set_attr "mode" "DI")]) -(define_insn "vec_widen_mult_even_v8si" - [(set (match_operand:V4DI 0 "register_operand" "=f") - (mult:V4DI - (any_extend:V4DI - (vec_select:V4SI - (match_operand:V8SI 1 "register_operand" "%f") - (parallel [(const_int 0) (const_int 2) - (const_int 4) (const_int 6)]))) - (any_extend:V4DI - (vec_select:V4SI - (match_operand:V8SI 2 "register_operand" "f") - (parallel [(const_int 0) (const_int 2) - (const_int 4) (const_int 6)])))))] - "ISA_HAS_LASX" - "xvmulwev.d.w\t%u0,%u1,%u2" - [(set_attr "type" "simd_int_arith") - (set_attr "mode" "V4DI")]) +(define_expand "vec_widen_add_hi_" + [(match_operand: 0 "register_operand") + (any_extend: (match_operand:ILASX_HB 1 "register_operand")) + (any_extend: (match_operand:ILASX_HB 2 "register_operand"))] + 
"ISA_HAS_LASX" +{ + loongarch_expand_vec_widen_hilo (operands[0], operands[1], operands[2], + , true, "add"); + DONE; +}) + +(define_expand "vec_widen_add_lo_" + [(match_operand: 0 "register_operand") + (any_extend: (match_operand:ILASX_HB 1 "register_operand")) + (any_extend: (match_operand:ILASX_HB 2 "register_operand"))] + "ISA_HAS_LASX" +{ + loongarch_expand_vec_widen_hilo (operands[0], operands[1], operands[2], + , false, "add"); + DONE; +}) + +(define_expand "vec_widen_sub_hi_" + [(match_operand: 0 "register_operand") + (any_extend: (match_operand:ILASX_HB 1 "register_operand")) + (any_extend: (match_operand:ILASX_HB 2 "register_operand"))] + "ISA_HAS_LASX" +{ + loongarch_expand_vec_widen_hilo (operands[0], operands[1], operands[2], + , true, "sub"); + DONE; +}) + +(define_expand "vec_widen_sub_lo_" + [(match_operand: 0 "register_operand") + (any_extend: (match_operand:ILASX_HB 1 "register_operand")) + (any_extend: (match_operand:ILASX_HB 2 "register_operand"))] + "ISA_HAS_LASX" +{ + loongarch_expand_vec_widen_hilo (operands[0], operands[1], operands[2], + , false, "sub"); + DONE; +}) + +(define_expand "vec_widen_mult_hi_" + [(match_operand: 0 "register_operand") + (any_extend: (match_operand:ILASX_HB 1 "register_operand")) + (any_extend: (match_operand:ILASX_HB 2 "register_operand"))] + "ISA_HAS_LASX" +{ + loongarch_expand_vec_widen_hilo (operands[0], operands[1], operands[2], + , true, "mult"); + DONE; +}) + +(define_expand "vec_widen_mult_lo_" + [(match_operand: 0 "register_operand") + (any_extend: (match_operand:ILASX_HB 1 "register_operand")) + (any_extend: (match_operand:ILASX_HB 2 "register_operand"))] + "ISA_HAS_LASX" +{ + loongarch_expand_vec_widen_hilo (operands[0], operands[1], operands[2], + , false, "mult"); + DONE; +}) ;; Vector reduction operation (define_expand "reduc_plus_scal_v4di" diff --git a/gcc/config/loongarch/loongarch-protos.h b/gcc/config/loongarch/loongarch-protos.h index 251011c5414..72ae9918b09 100644 --- 
a/gcc/config/loongarch/loongarch-protos.h +++ b/gcc/config/loongarch/loongarch-protos.h @@ -205,6 +205,7 @@ extern void loongarch_register_frame_header_opt (void); extern void loongarch_expand_vec_cond_expr (machine_mode, machine_mode, rtx *); extern void loongarch_expand_vec_cond_mask_expr (machine_mode, machine_mode, rtx *); +extern void loongarch_expand_vec_widen_hilo (rtx, rtx, rtx, bool, bool, const char *); /* Routines implemented in loongarch-c.c. */ void loongarch_cpu_cpp_builtins (cpp_reader *); diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 9e1b0d0cfa8..472f8fd37c9 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -8032,6 +8032,142 @@ loongarch_expand_vec_perm_even_odd (struct expand_vec_perm_d *d) return loongarch_expand_vec_perm_even_odd_1 (d, odd); } +static void +loongarch_expand_vec_interleave (rtx target, rtx op0, rtx op1, bool high_p) +{ + struct expand_vec_perm_d d; + unsigned i, nelt, base; + bool ok; + + d.target = target; + d.op0 = op0; + d.op1 = op1; + d.vmode = GET_MODE (target); + d.nelt = nelt = GET_MODE_NUNITS (d.vmode); + d.one_vector_p = false; + d.testing_p = false; + + base = high_p ? nelt / 2 : 0; + for (i = 0; i < nelt / 2; ++i) + { + d.perm[i * 2] = i + base; + d.perm[i * 2 + 1] = i + base + nelt; + } + + ok = loongarch_expand_vec_perm_interleave (&d); + gcc_assert (ok); +} + +/* The loongarch lasx instructions xvmulwev and xvmulwod return the even or odd parts of the + double sized result elements in the corresponding elements of the target register. That's + NOT what the vec_widen_umult_lo/hi patterns are expected to do. We emulate the widening + lo/hi multiplies with the even/odd versions followed by a vector merge. 
*/ + +void +loongarch_expand_vec_widen_hilo (rtx dest, rtx op1, rtx op2, + bool uns_p, bool high_p, const char *optab) +{ + machine_mode wmode = GET_MODE (dest); + machine_mode mode = GET_MODE (op1); + rtx t1, t2, t3; + + t1 = gen_reg_rtx (wmode); + t2 = gen_reg_rtx (wmode); + t3 = gen_reg_rtx (wmode); + switch (mode) + { + case V16HImode: + if (!strcmp (optab, "add")) + { + if (!uns_p) + { + emit_insn (gen_lasx_xvaddwev_w_h (t1, op1, op2)); + emit_insn (gen_lasx_xvaddwod_w_h (t2, op1, op2)); + } + else + { + emit_insn (gen_lasx_xvaddwev_w_hu (t1, op1, op2)); + emit_insn (gen_lasx_xvaddwod_w_hu (t2, op1, op2)); + } + } + else if (!strcmp (optab, "mult")) + { + if (!uns_p) + { + emit_insn (gen_lasx_xvmulwev_w_h (t1, op1, op2)); + emit_insn (gen_lasx_xvmulwod_w_h (t2, op1, op2)); + } + else + { + emit_insn (gen_lasx_xvmulwev_w_hu (t1, op1, op2)); + emit_insn (gen_lasx_xvmulwod_w_hu (t2, op1, op2)); + } + } + else if (!strcmp (optab, "sub")) + { + if (!uns_p) + { + emit_insn (gen_lasx_xvsubwev_w_h (t1, op1, op2)); + emit_insn (gen_lasx_xvsubwod_w_h (t2, op1, op2)); + } + else + { + emit_insn (gen_lasx_xvsubwev_w_hu (t1, op1, op2)); + emit_insn (gen_lasx_xvsubwod_w_hu (t2, op1, op2)); + } + } + break; + + case V32QImode: + if (!strcmp (optab, "add")) + { + if (!uns_p) + { + emit_insn (gen_lasx_xvaddwev_h_b (t1, op1, op2)); + emit_insn (gen_lasx_xvaddwod_h_b (t2, op1, op2)); + } + else + { + emit_insn (gen_lasx_xvaddwev_h_bu (t1, op1, op2)); + emit_insn (gen_lasx_xvaddwod_h_bu (t2, op1, op2)); + } + } + else if (!strcmp (optab, "mult")) + { + if (!uns_p) + { + emit_insn (gen_lasx_xvmulwev_h_b (t1, op1, op2)); + emit_insn (gen_lasx_xvmulwod_h_b (t2, op1, op2)); + } + else + { + emit_insn (gen_lasx_xvmulwev_h_bu (t1, op1, op2)); + emit_insn (gen_lasx_xvmulwod_h_bu (t2, op1, op2)); + } + } + else if (!strcmp (optab, "sub")) + { + if (!uns_p) + { + emit_insn (gen_lasx_xvsubwev_h_b (t1, op1, op2)); + emit_insn (gen_lasx_xvsubwod_h_b (t2, op1, op2)); + } + else + { + 
emit_insn (gen_lasx_xvsubwev_h_bu (t1, op1, op2)); + emit_insn (gen_lasx_xvsubwod_h_bu (t2, op1, op2)); + } + } + break; + + default: + gcc_unreachable (); + } + + loongarch_expand_vec_interleave (t3, t1, t2, high_p); + emit_move_insn (dest, gen_lowpart (wmode, t3)); +} + /* Expand a variable vector permutation for LASX. */ void diff --git a/gcc/config/loongarch/loongarch.md b/gcc/config/loongarch/loongarch.md index 3286b0c56ae..a76bf2c6c9f 100644 --- a/gcc/config/loongarch/loongarch.md +++ b/gcc/config/loongarch/loongarch.md @@ -509,6 +509,8 @@ ;; <su> is like <u>, but the signed form expands to "s" rather than "". (define_code_attr su [(sign_extend "s") (zero_extend "u")]) +(define_code_attr u_bool [(sign_extend "false") (zero_extend "true")]) + ;; <optab> expands to the name of the optab for a particular code. (define_code_attr optab [(ashift "ashl") (ashiftrt "ashr") diff --git a/gcc/testsuite/gcc.target/loongarch/vect-widen-add.c b/gcc/testsuite/gcc.target/loongarch/vect-widen-add.c new file mode 100644 index 00000000000..2d273adaf92 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/vect-widen-add.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mlasx" } */ +/* { dg-final { scan-assembler "xvaddwev.w.h" } } */ +/* { dg-final { scan-assembler "xvaddwod.w.h" } } */ +/* { dg-final { scan-assembler "xvaddwev.w.hu" } } */ +/* { dg-final { scan-assembler "xvaddwod.w.hu" } } */ + +#include <stdint.h> + +#define SIZE 1024 + +void wide_uadd (uint32_t *foo, uint16_t *a, uint16_t *b) +{ + for ( int i = 0; i < SIZE; i++) + { + foo[i] = a[i] + b[i]; + } +} + +void wide_sadd (int32_t *foo, int16_t *a, int16_t *b) +{ + for ( int i = 0; i < SIZE; i++) + { + foo[i] = a[i] + b[i]; + } +} diff --git a/gcc/testsuite/gcc.target/loongarch/vect-widen-mul.c b/gcc/testsuite/gcc.target/loongarch/vect-widen-mul.c new file mode 100644 index 00000000000..282a168369e --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/vect-widen-mul.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* {
dg-options "-O3 -mlasx" } */ +/* { dg-final { scan-assembler "xvmulwev.w.h" } } */ +/* { dg-final { scan-assembler "xvmulwod.w.h" } } */ +/* { dg-final { scan-assembler "xvmulwev.w.hu" } } */ +/* { dg-final { scan-assembler "xvmulwod.w.hu" } } */ + +#include <stdint.h> + +#define SIZE 1024 + +void wide_umul (uint32_t *foo, uint16_t *a, uint16_t *b) +{ + for ( int i = 0; i < SIZE; i++) + { + foo[i] = a[i] * b[i]; + } +} + +void wide_smul (int32_t *foo, int16_t *a, int16_t *b) +{ + for ( int i = 0; i < SIZE; i++) + { + foo[i] = a[i] * b[i]; + } +} diff --git a/gcc/testsuite/gcc.target/loongarch/vect-widen-sub.c b/gcc/testsuite/gcc.target/loongarch/vect-widen-sub.c new file mode 100644 index 00000000000..30cc2206b81 --- /dev/null +++ b/gcc/testsuite/gcc.target/loongarch/vect-widen-sub.c @@ -0,0 +1,26 @@ +/* { dg-do compile } */ +/* { dg-options "-O3 -mlasx" } */ +/* { dg-final { scan-assembler "xvsubwev.w.h" } } */ +/* { dg-final { scan-assembler "xvsubwod.w.h" } } */ +/* { dg-final { scan-assembler "xvsubwev.w.hu" } } */ +/* { dg-final { scan-assembler "xvsubwod.w.hu" } } */ + +#include <stdint.h> + +#define SIZE 1024 + +void wide_usub (uint32_t *foo, uint16_t *a, uint16_t *b) +{ + for ( int i = 0; i < SIZE; i++) + { + foo[i] = a[i] - b[i]; + } +} + +void wide_ssub (int32_t *foo, int16_t *a, int16_t *b) +{ + for ( int i = 0; i < SIZE; i++) + { + foo[i] = a[i] - b[i]; + } +}

From patchwork Mon Oct 16 02:00:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiahao Xu X-Patchwork-Id: 153134 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp3188782vqb; Sun, 15 Oct 2023 19:00:57 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHE9HNAVgnSc3m6op2iFY7CW/uEYtcZksDl0J6kgHT0oh968Lfalyx1gcr+AMtgb+0QVYPq X-Received: by 2002:a05:6214:5585:b0:626:f3d:9e46 with SMTP id mi5-20020a056214558500b006260f3d9e46mr33704572qvb.18.1697421657193; Sun, 15 Oct 2023 19:00:57 -0700 (PDT)
ARC-Seal: i=2; a=rsa-sha256; t=1697421657; cv=pass; d=google.com; s=arc-20160816; b=NMgZGrFHPvJ5XXxLV2gITR+r8qRpEecrmoE3D9G0gSfRBBxwinMNGyLE7X74j1L524 XMIOcoL7VKYwDFvnb3m8KznqBIsyGl9BoV6wFpNusTi5YASF8dW0ho7Mm+3ew3InDBaF TB0ILtgURRZH4S2PkB3ZEaWDfCmjgUW84idGa5GrnNNZgMdMZcfcV4xcER/WXJrjdcPr nPTIM7/duYZ+TKQ1TMezs/NGtTw2SfPk1nPNeBGcAg8g68PF3JsxoWIqYvtm7iuqV92n doN8dUP8mf1QW2W9MXNm5rqVF7QZ8PzB+/qzqEGP7vrnCvfd1IB5aVDpSrKfjMiL0mD/ QQ7g== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=errors-to:list-subscribe:list-help:list-post:list-archive :list-unsubscribe:list-id:precedence:content-transfer-encoding :mime-version:references:in-reply-to:message-id:date:subject:cc:to :from:arc-filter:dmarc-filter:delivered-to; bh=OKO6wbZm/+ay/a5B+JdEg93OuIy7PYi9rXgc4h0N7+0=; fh=w+xsGLuzpTj56E0bzPMCc39RWHkBXt2f2AGs+4pGimo=; b=EVS+1rZDrRVgIgIO4rQ1ZwMJeh5EV9hrVrrofl2PquxSoVqGG+GN8bWCCjBpAID5xq FEYQ2ao10kgMoANit/kxf5wqckMMGlMWduIfqqILHJtXG2VzsfAD1c8EptN2le0yig7o ylYBDs3gLZTM5ed2F/FhFLDt5wVOVK26o3uPDQPVWU0tBkGiO7MONTaWWOASNmvkA0hA yjlM+5mCbTWlfanV1z1lWVQV9jc9eCFO/Pra4E9k43TgNvA0zTFTm7DWwPwNo0kEKqYs 9xPX6nh4/wdNb8dwp7HkiVGT65JajHbJ223wecGglC1zFsz5z8xLEJF5bEvFLMlO6A5D vGTA== ARC-Authentication-Results: i=2; mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (ip-8-43-85-97.sourceware.org. 
[8.43.85.97]) by mx.google.com with ESMTPS id q7-20020a0cfa07000000b0064188f9b3d0si4647857qvn.188.2023.10.15.19.00.57 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 15 Oct 2023 19:00:57 -0700 (PDT) Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97; Authentication-Results: mx.google.com; arc=pass (i=1); spf=pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) smtp.mailfrom="gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org" Received: from server2.sourceware.org (localhost [IPv6:::1]) by sourceware.org (Postfix) with ESMTP id EEF9C3858428 for ; Mon, 16 Oct 2023 02:00:56 +0000 (GMT) X-Original-To: gcc-patches@gcc.gnu.org Delivered-To: gcc-patches@gcc.gnu.org Received: from mail.loongson.cn (mail.loongson.cn [114.242.206.163]) by sourceware.org (Postfix) with ESMTP id 5D3E33858D33 for ; Mon, 16 Oct 2023 02:00:27 +0000 (GMT) DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 5D3E33858D33 Authentication-Results: sourceware.org; dmarc=none (p=none dis=none) header.from=loongson.cn Authentication-Results: sourceware.org; spf=pass smtp.mailfrom=loongson.cn ARC-Filter: OpenARC Filter v1.0.0 sourceware.org 5D3E33858D33 Authentication-Results: server2.sourceware.org; arc=none smtp.remote-ip=114.242.206.163 ARC-Seal: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697421630; cv=none; b=VxtLSuTxGU0gZlLFSAv3CIxBJLW2WMxtjklA53FCwGtS3kaKM0xPDyQ/2cG/CmTvTfDKO0uD9Kt+VHyvZFlh2ka+OCYkeUfvs16b45/8u8T/SgNTiMturLzCWik3v5dzGqnkruvQryIZ1Z32GxiSA62ip8C8Lwqc2Bnf7EOpmYU= ARC-Message-Signature: i=1; a=rsa-sha256; d=sourceware.org; s=key; t=1697421630; c=relaxed/simple; bh=6Bej7qZQenPht5gamPmrYOu4dpO5PXuKr3R4dL/fyHg=; h=From:To:Subject:Date:Message-Id:MIME-Version; 
b=t32dnM4VLiT2Cd18IvBxn3RUgOz29AJAn/uQ+OhhHWpAeV/SHQiH8cq9E5CYNggwV+cAYMS4c4Oe4wef2/CQ+gl0ka1GoXxfs+d0joQ5Tq+S/8Cbs50EcD8xuwDFtxeERs8KBcDyL+3w9RU9bjacaU5ywuUOu8bTkiH2MELJ4tU= ARC-Authentication-Results: i=1; server2.sourceware.org Received: from loongson.cn (unknown [10.10.130.252]) by gateway (Coremail) with SMTP id _____8BxuOg5mSxl7TkyAA--.61081S3; Mon, 16 Oct 2023 10:00:25 +0800 (CST) Received: from slurm-master.loongson.cn (unknown [10.10.130.252]) by localhost.localdomain (Coremail) with SMTP id AQAAf8BxbNwvmSxlz_4lAA--.14027S7; Mon, 16 Oct 2023 10:00:24 +0800 (CST) From: Jiahao Xu To: gcc-patches@gcc.gnu.org Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn, Jiahao Xu Subject: [PATCH 3/3] LoongArch:Implement the new vector cost model framework. Date: Mon, 16 Oct 2023 10:00:14 +0800 Message-Id: <20231016020014.41979-4-xujiahao@loongson.cn> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20231016020014.41979-1-xujiahao@loongson.cn> References: <20231016020014.41979-1-xujiahao@loongson.cn> MIME-Version: 1.0 X-CM-TRANSID: AQAAf8BxbNwvmSxlz_4lAA--.14027S7 X-CM-SenderInfo: 50xmxthkdrqz5rrqw2lrqou0/ X-Coremail-Antispam: 1Uk129KBj93XoW3ZryUWFyrZF4UWw1fCFWkKrX_yoWkZr1rpr W2kry3Jw48twnxXF1kJ39aqrs0yrZrGF42gF43t34fCr45KrnaqF1vkryqvFy7Ga4rCr1I qr1rX3Z8Z3Z8AacCm3ZEXasCq-sJn29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7KY7ZEXa sCq-sGcSsGvfJ3Ic02F40EFcxC0VAKzVAqx4xG6I80ebIjqfuFe4nvWSU5nxnvy29KBjDU 0xBIdaVrnRJUUUkYb4IE77IF4wAFF20E14v26r1j6r4UM7CY07I20VC2zVCF04k26cxKx2 IYs7xG6rWj6s0DM7CIcVAFz4kK6r1Y6r17M28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48v e4kI8wA2z4x0Y4vE2Ix0cI8IcVAFwI0_Gr0_Xr1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI 0_Gr0_Cr1l84ACjcxK6I8E87Iv67AKxVWxJVW8Jr1l84ACjcxK6I8E87Iv6xkF7I0E14v2 6r4j6r4UJwAS0I0E0xvYzxvE52x082IY62kv0487Mc804VCY07AIYIkI8VC2zVCFFI0UMc 02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2IY67AKxVWUtVWrXwAv7VC2z280aVAF wI0_Gr0_Cr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0Y48IcxkI7VAKI48JMxAIw28IcxkI7V AKI48JMxC20s026xCaFVCjc4AY6r1j6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCj 
r7xvwVAFwI0_JrI_JrWlx4CE17CEb7AF67AKxVWUAVWUtwCIc40Y0x0EwIxGrwCI42IY6x IIjxv20xvE14v26r4j6ryUMIIF0xvE2Ix0cI8IcVCY1x0267AKxVW8JVWxJwCI42IY6xAI w20EY4v20xvaj40_Jr0_JF4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x 0267AKxVW8JVW8JrUvcSsGvfC2KfnxnUUI43ZEXa7IU8l38UUUUUU== X-Spam-Status: No, score=-11.7 required=5.0 tests=BAYES_00, GIT_PATCH_0, KAM_DMARC_STATUS, SPF_HELO_NONE, SPF_PASS, TXREP autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on server2.sourceware.org X-BeenThere: gcc-patches@gcc.gnu.org X-Mailman-Version: 2.1.30 Precedence: list List-Id: Gcc-patches mailing list List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1779875611542372276 X-GMAIL-MSGID: 1779875611542372276

This patch makes LoongArch use the new vector hooks and implements the
costing function determine_suggested_unroll_factor, so that the port can
suggest an unroll factor for a loop being vectorized, based on the vec_ops
analysis done during vector costing and on the available issue information.
The implementation follows the aarch64 and rs6000 ports.

The patch also reduces the cost of unaligned stores, making it equal to the
cost of aligned ones in order to avoid odd alignment peeling.

gcc/ChangeLog:

	* config/loongarch/loongarch.cc (loongarch_vector_costs): Inherit from
	vector_costs.  Add a constructor.
	(loongarch_vector_costs::add_stmt_cost): Use adjust_cost_for_freq to
	adjust the cost for inner loops.
	(loongarch_vector_costs::count_operations): New function.
	(loongarch_vector_costs::determine_suggested_unroll_factor): Ditto.
	(loongarch_vector_costs::finish_cost): Ditto.
	(loongarch_builtin_vectorization_cost): Adjust.
	* config/loongarch/loongarch.opt (loongarch-vect-unroll-limit): New
	parameter.
	(loongarch-vect-issue-info): Ditto.
	(mmemvec-cost): Delete.
	* doc/invoke.texi (loongarch-vect-unroll-limit): Document new option.

diff --git a/gcc/config/loongarch/genopts/loongarch.opt.in b/gcc/config/loongarch/genopts/loongarch.opt.in index 9f98f2d845a..4a2d7438f1b 100644 --- a/gcc/config/loongarch/genopts/loongarch.opt.in +++ b/gcc/config/loongarch/genopts/loongarch.opt.in @@ -146,10 +146,6 @@ mbranch-cost= Target RejectNegative Joined UInteger Var(loongarch_branch_cost) -mbranch-cost=COST Set the cost of branches to roughly COST instructions. -mmemvec-cost= -Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) IntegerRange(1, 5) -mmemvec-cost=COST Set the cost of vector memory access instructions. - mcheck-zero-division Target Mask(CHECK_ZERO_DIV) Trap on integer divide by zero. @@ -213,3 +209,14 @@ mrelax Target Var(loongarch_mrelax) Init(HAVE_AS_MRELAX_OPTION) Take advantage of linker relaxations to reduce the number of instructions required to materialize symbol addresses. + +-param=loongarch-vect-unroll-limit= +Target Joined UInteger Var(loongarch_vect_unroll_limit) Init(6) IntegerRange(1, 64) Param +Used to limit unroll factor which indicates how much the autovectorizer may +unroll a loop. The default value is 6. + +-param=loongarch-vect-issue-info= +Target Undocumented Joined UInteger Var(loongarch_vect_issue_info) Init(4) IntegerRange(1, 64) Param +Indicate how many non memory access vector instructions can be issued per +cycle, it's used in unroll factor determination for autovectorizer. The +default value is 4. diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc index 472f8fd37c9..cfd35a63ff1 100644 --- a/gcc/config/loongarch/loongarch.cc +++ b/gcc/config/loongarch/loongarch.cc @@ -65,6 +65,8 @@ along with GCC; see the file COPYING3. If not see #include "rtl-iter.h" #include "opts.h" #include "function-abi.h" +#include "cfgloop.h" +#include "tree-vectorizer.h" /* This file should be included last. 
*/ #include "target-def.h" @@ -3845,8 +3847,6 @@ loongarch_rtx_costs (rtx x, machine_mode mode, int outer_code, } } -/* Vectorizer cost model implementation. */ - /* Implement targetm.vectorize.builtin_vectorization_cost. */ static int @@ -3865,36 +3865,182 @@ loongarch_builtin_vectorization_cost (enum vect_cost_for_stmt type_of_cost, case vector_load: case vec_to_scalar: case scalar_to_vec: - case cond_branch_not_taken: - case vec_promote_demote: case scalar_store: case vector_store: return 1; + case vec_promote_demote: case vec_perm: return LASX_SUPPORTED_MODE_P (mode) && !LSX_SUPPORTED_MODE_P (mode) ? 2 : 1; case unaligned_load: - case vector_gather_load: - return 2; - case unaligned_store: - case vector_scatter_store: - return 10; + return 2; case cond_branch_taken: - return 3; + return 4; + + case cond_branch_not_taken: + return 2; case vec_construct: elements = TYPE_VECTOR_SUBPARTS (vectype); - return elements / 2 + 1; + if (ISA_HAS_LASX) + return elements + 1; + else + return elements; default: gcc_unreachable (); } } +class loongarch_vector_costs : public vector_costs +{ +public: + using vector_costs::vector_costs; + + unsigned int add_stmt_cost (int count, vect_cost_for_stmt kind, + stmt_vec_info stmt_info, slp_tree, tree vectype, + int misalign, + vect_cost_model_location where) override; + void finish_cost (const vector_costs *) override; + +protected: + void count_operations (vect_cost_for_stmt, stmt_vec_info, + vect_cost_model_location, unsigned int); + unsigned int determine_suggested_unroll_factor (loop_vec_info); + /* The number of vectorized stmts in loop. */ + unsigned m_stmts = 0; + /* The number of load and store operations in loop. */ + unsigned m_loads = 0; + unsigned m_stores = 0; + /* Reduction factor for suggesting unroll factor. */ + unsigned m_reduc_factor = 0; + /* True if the loop contains an average operation. */ + bool m_has_avg =false; +}; + +/* Implement TARGET_VECTORIZE_CREATE_COSTS. 
+static vector_costs * +loongarch_vectorize_create_costs (vec_info *vinfo, bool costing_for_scalar) +{ + return new loongarch_vector_costs (vinfo, costing_for_scalar); +} + +void +loongarch_vector_costs::count_operations (vect_cost_for_stmt kind, + stmt_vec_info stmt_info, + vect_cost_model_location where, + unsigned int count) +{ + if (!m_costing_for_scalar + && is_a<loop_vec_info> (m_vinfo) + && where == vect_body) + { + m_stmts += count; + + if (kind == scalar_load + || kind == vector_load + || kind == unaligned_load) + m_loads += count; + else if (kind == scalar_store + || kind == vector_store + || kind == unaligned_store) + m_stores += count; + else if ((kind == scalar_stmt + || kind == vector_stmt + || kind == vec_to_scalar) + && stmt_info && vect_is_reduction (stmt_info)) + { + tree lhs = gimple_get_lhs (stmt_info->stmt); + unsigned int base = FLOAT_TYPE_P (TREE_TYPE (lhs)) ? 2 : 1; + m_reduc_factor = MAX (base * count, m_reduc_factor); + } + } +} + +unsigned int +loongarch_vector_costs::determine_suggested_unroll_factor (loop_vec_info loop_vinfo) +{ + class loop *loop = LOOP_VINFO_LOOP (loop_vinfo); + + if (m_has_avg) + return 1; + + /* Don't unroll if it's specified explicitly not to be unrolled. */ + if (loop->unroll == 1 + || (OPTION_SET_P (flag_unroll_loops) && !flag_unroll_loops) + || (OPTION_SET_P (flag_unroll_all_loops) && !flag_unroll_all_loops)) + return 1; + + unsigned int nstmts_nonldst = m_stmts - m_loads - m_stores; + /* Don't unroll if there are no vector instructions other than memory accesses. */ + if (nstmts_nonldst == 0) + return 1; + + /* Use a simple hardware resource model: how many non-vld/vst + vector instructions can be issued per cycle. */ + unsigned int issue_info = loongarch_vect_issue_info; + unsigned int reduc_factor = m_reduc_factor > 1 ? 
m_reduc_factor : 1; + unsigned int uf = CEIL (reduc_factor * issue_info, nstmts_nonldst); + uf = MIN ((unsigned int) loongarch_vect_unroll_limit, uf); + + return 1 << ceil_log2 (uf); +} + +unsigned +loongarch_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind, + stmt_vec_info stmt_info, slp_tree, + tree vectype, int misalign, + vect_cost_model_location where) +{ + unsigned retval = 0; + + if (flag_vect_cost_model) + { + int stmt_cost = loongarch_builtin_vectorization_cost (kind, vectype, + misalign); + retval = adjust_cost_for_freq (stmt_info, where, count * stmt_cost); + m_costs[where] += retval; + + count_operations (kind, stmt_info, where, count); + } + + if (stmt_info) + { + /* Detect the use of an averaging operation. */ + gimple *stmt = stmt_info->stmt; + if (is_gimple_call (stmt) + && gimple_call_internal_p (stmt)) + { + switch (gimple_call_internal_fn (stmt)) + { + case IFN_AVG_FLOOR: + case IFN_AVG_CEIL: + m_has_avg = true; + default: + break; + } + } + } + + return retval; +} + +void +loongarch_vector_costs::finish_cost (const vector_costs *scalar_costs) +{ + loop_vec_info loop_vinfo = dyn_cast<loop_vec_info> (m_vinfo); + if (loop_vinfo) + { + m_suggested_unroll_factor = determine_suggested_unroll_factor (loop_vinfo); + } + + vector_costs::finish_cost (scalar_costs); +} + /* Implement TARGET_ADDRESS_COST. 
*/ static int @@ -7265,9 +7411,6 @@ loongarch_option_override_internal (struct gcc_options *opts, if (TARGET_DIRECT_EXTERN_ACCESS && flag_shlib) error ("%qs cannot be used for compiling a shared library", "-mdirect-extern-access"); - if (loongarch_vector_access_cost == 0) - loongarch_vector_access_cost = 5; - switch (la_target.cmodel) { @@ -11279,6 +11422,8 @@ loongarch_builtin_support_vector_misalignment (machine_mode mode, #undef TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST #define TARGET_VECTORIZE_BUILTIN_VECTORIZATION_COST \ loongarch_builtin_vectorization_cost +#undef TARGET_VECTORIZE_CREATE_COSTS +#define TARGET_VECTORIZE_CREATE_COSTS loongarch_vectorize_create_costs #undef TARGET_IN_SMALL_DATA_P diff --git a/gcc/config/loongarch/loongarch.opt b/gcc/config/loongarch/loongarch.opt index e1b085ae87c..6215abcac04 100644 --- a/gcc/config/loongarch/loongarch.opt +++ b/gcc/config/loongarch/loongarch.opt @@ -153,10 +153,6 @@ mbranch-cost= Target RejectNegative Joined UInteger Var(loongarch_branch_cost) -mbranch-cost=COST Set the cost of branches to roughly COST instructions. -mmemvec-cost= -Target RejectNegative Joined UInteger Var(loongarch_vector_access_cost) IntegerRange(1, 5) -mmemvec-cost=COST Set the cost of vector memory access instructions. - mcheck-zero-division Target Mask(CHECK_ZERO_DIV) Trap on integer divide by zero. @@ -220,3 +216,14 @@ mrelax Target Var(loongarch_mrelax) Init(HAVE_AS_MRELAX_OPTION) Take advantage of linker relaxations to reduce the number of instructions required to materialize symbol addresses. + +-param=loongarch-vect-unroll-limit= +Target Joined UInteger Var(loongarch_vect_unroll_limit) Init(6) IntegerRange(1, 64) Param +Used to limit unroll factor which indicates how much the autovectorizer may +unroll a loop. The default value is 6. 
+ +-param=loongarch-vect-issue-info= +Target Undocumented Joined UInteger Var(loongarch_vect_issue_info) Init(4) IntegerRange(1, 64) Param +Indicate how many non memory access vector instructions can be issued per +cycle, it's used in unroll factor determination for autovectorizer. The +default value is 4. diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi index fee659462ff..733723e29d7 100644 --- a/gcc/doc/invoke.texi +++ b/gcc/doc/invoke.texi @@ -26205,6 +26205,13 @@ environments where no dynamic link is performed, like firmwares, OS kernels, executables linked with @option{-static} or @option{-static-pie}. @option{-mdirect-extern-access} is not compatible with @option{-fPIC} or @option{-fpic}. + +@item loongarch-vect-unroll-limit +The vectorizer will use available tuning information to determine whether it +would be beneficial to unroll the main vectorized loop and by how much. This +parameter sets the upper bound of how much the vectorizer will unroll the main +loop. The default value is six. + @end table
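
The unroll-factor arithmetic in determine_suggested_unroll_factor can be
tried outside the compiler. The following is only a minimal sketch under
stated assumptions, not the code in the patch: suggested_unroll_factor and
ceil_log2_u are hypothetical names, the arguments stand in for the m_stmts,
m_loads, m_stores and m_reduc_factor counters and for the two --param values,
and the early exits for averaging operations and for explicitly disabled
unrolling are left out.

```c
#include <assert.h>

/* Smallest e with (1u << e) >= x.  Hypothetical stand-in for GCC's
   ceil_log2, good enough for the small values used here.  */
static unsigned ceil_log2_u (unsigned x)
{
  unsigned e = 0;
  while ((1u << e) < x)
    ++e;
  return e;
}

/* Hypothetical standalone version of the unroll-factor arithmetic.
   stmts, loads, stores and reduc_factor mirror the m_* counters;
   issue_info and unroll_limit mirror loongarch-vect-issue-info and
   loongarch-vect-unroll-limit (both documented as 1..64).  */
unsigned suggested_unroll_factor (unsigned stmts, unsigned loads,
                                  unsigned stores, unsigned reduc_factor,
                                  unsigned issue_info, unsigned unroll_limit)
{
  assert (stmts >= loads + stores);
  assert (issue_info >= 1 && issue_info <= 64);
  assert (unroll_limit >= 1 && unroll_limit <= 64);

  /* Vector statements that are not loads or stores.  */
  unsigned nonldst = stmts - loads - stores;
  if (nonldst == 0)
    return 1;                   /* Only memory accesses: do not unroll.  */

  if (reduc_factor < 1)
    reduc_factor = 1;

  /* Round-up division, like GCC's CEIL macro.  */
  unsigned uf = (reduc_factor * issue_info + nonldst - 1) / nonldst;

  /* Clamp to the limit first, then round up to a power of two.  */
  if (uf > unroll_limit)
    uf = unroll_limit;
  return 1u << ceil_log2_u (uf);
}
```

With the default values (issue info 4, limit 6), a loop body with two
non-memory vector statements and no reduction gets a factor of 2 in this
sketch. Because the rounding to a power of two happens after the clamp, a
clamped value of 6 is returned as 8, so the limit appears to act as a bound
on the value before rounding rather than on the final factor.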