From patchwork Fri Jan 5 07:38:25 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiahao Xu
X-Patchwork-Id: 185253
From: Jiahao Xu
To: gcc-patches@gcc.gnu.org
Cc: xry111@xry111.site, i@xen0n.name, chenglulu@loongson.cn, xuchenghua@loongson.cn, Jiahao Xu
Subject: [PATCH] LoongArch: Implenment vec_init where N is a LSX vector mode
Date: Fri, 5 Jan 2024 15:38:25 +0800
Message-Id: <20240105073825.1806927-1-xujiahao@loongson.cn>
X-Mailer: git-send-email 2.39.3

This patch implements more vec_init optabs that can handle two LSX
vectors producing a LASX vector by concatenating them.  When an LSX
vector is concatenated with an LSX const_vector of zeroes, the
vec_concatz pattern can be used effectively.
For example as below

typedef short v8hi __attribute__ ((vector_size (16)));
typedef short v16hi __attribute__ ((vector_size (32)));

v8hi a, b;

v16hi
vec_initv16hiv8hi ()
{
  return __builtin_shufflevector (a, b, 0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15);
}

Before this patch:

vec_initv16hiv8hi:
    addi.d $r3,$r3,-64
    .cfi_def_cfa_offset 64
    xvrepli.h $xr0,0
    la.local $r12,.LANCHOR0
    xvst $xr0,$r3,0
    xvst $xr0,$r3,32
    vld $vr0,$r12,0
    vst $vr0,$r3,0
    vld $vr0,$r12,16
    vst $vr0,$r3,32
    xvld $xr1,$r3,32
    xvld $xr2,$r3,32
    xvld $xr0,$r3,0
    xvilvh.h $xr0,$xr1,$xr0
    xvld $xr1,$r3,0
    xvilvl.h $xr1,$xr2,$xr1
    addi.d $r3,$r3,64
    .cfi_def_cfa_offset 0
    xvpermi.q $xr0,$xr1,32
    jr $r1

After this patch:

vec_initv16hiv8hi:
    la.local $r12,.LANCHOR0
    vld $vr0,$r12,32
    vld $vr2,$r12,48
    xvilvh.h $xr1,$xr2,$xr0
    xvilvl.h $xr0,$xr2,$xr0
    xvpermi.q $xr1,$xr0,32
    xvst $xr1,$r4,0
    jr $r1

gcc/ChangeLog:

    * config/loongarch/lasx.md (vec_initv32qiv16qi): Rename to ..
    (vec_init): .. this, and extend to mode.
    (@vec_concatz): New insn pattern.
    * config/loongarch/loongarch.cc (loongarch_expand_vector_group_init):
    Handle VALS containing two vectors.

gcc/testsuite/ChangeLog:

    * gcc.target/loongarch/vector/lasx/lasx-vec-init-2.c: New test.

diff --git a/gcc/config/loongarch/lasx.md b/gcc/config/loongarch/lasx.md
index e196613ffe4..36dc3d95eac 100644
--- a/gcc/config/loongarch/lasx.md
+++ b/gcc/config/loongarch/lasx.md
@@ -465,6 +465,11 @@
                        (V16HI "w")
                        (V32QI "w")])
 
+;; Half modes of all LASX vector modes, in lower-case.
+(define_mode_attr lasxhalf [(V32QI "v16qi") (V16HI "v8hi")
+                            (V8SI "v4si") (V4DI "v2di")
+                            (V8SF "v4sf") (V4DF "v2df")])
+
 (define_expand "vec_init"
   [(match_operand:LASX 0 "register_operand")
    (match_operand:LASX 1 "")]
@@ -474,9 +479,9 @@
   DONE;
 })
 
-(define_expand "vec_initv32qiv16qi"
-  [(match_operand:V32QI 0 "register_operand")
-   (match_operand:V16QI 1 "")]
+(define_expand "vec_init"
+  [(match_operand:LASX 0 "register_operand")
+   (match_operand: 1 "")]
   "ISA_HAS_LASX"
 {
   loongarch_expand_vector_group_init (operands[0], operands[1]);
@@ -577,6 +582,21 @@
   [(set_attr "type" "simd_insert")
    (set_attr "mode" "")])
 
+(define_insn "@vec_concatz"
+  [(set (match_operand:LASX 0 "register_operand" "=f")
+    (vec_concat:LASX
+      (match_operand: 1 "nonimmediate_operand")
+      (match_operand: 2 "const_0_operand")))]
+  "ISA_HAS_LASX"
+{
+  if (MEM_P (operands[1]))
+    return "vld\t%w0,%1";
+  else
+    return "vori.b\t%w0,%w1,0";
+}
+  [(set_attr "type" "simd_splat")
+   (set_attr "mode" "")])
+
 (define_insn "vec_concat"
   [(set (match_operand:LASX 0 "register_operand" "=f")
         (vec_concat:LASX
diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index 28d64135c54..b2a296a1dd9 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -9858,10 +9858,46 @@ loongarch_gen_const_int_vector_shuffle (machine_mode mode, int val)
 void
 loongarch_expand_vector_group_init (rtx target, rtx vals)
 {
-  rtx ops[2] = { force_reg (E_V16QImode, XVECEXP (vals, 0, 0)),
-                 force_reg (E_V16QImode, XVECEXP (vals, 0, 1)) };
-  emit_insn (gen_rtx_SET (target, gen_rtx_VEC_CONCAT (E_V32QImode, ops[0],
-                                                      ops[1])));
+  machine_mode vmode = GET_MODE (target);
+  machine_mode half_mode = VOIDmode;
+  rtx low = XVECEXP (vals, 0, 0);
+  rtx high = XVECEXP (vals, 0, 1);
+
+  switch (vmode)
+    {
+    case E_V32QImode:
+      half_mode = V16QImode;
+      break;
+    case E_V16HImode:
+      half_mode = V8HImode;
+      break;
+    case E_V8SImode:
+      half_mode = V4SImode;
+      break;
+    case E_V4DImode:
+      half_mode = V2DImode;
+      break;
+    case E_V8SFmode:
+      half_mode = V4SFmode;
+      break;
+    case E_V4DFmode:
+      half_mode = V2DFmode;
+      break;
+    default:
+      gcc_unreachable ();
+    }
+
+  if (high == CONST0_RTX (half_mode))
+    emit_insn (gen_vec_concatz (vmode, target, low, high));
+  else
+    {
+      if (!register_operand (low, half_mode))
+        low = force_reg (half_mode, low);
+      if (!register_operand (high, half_mode))
+        high = force_reg (half_mode, high);
+      emit_insn (gen_rtx_SET (target,
+                              gen_rtx_VEC_CONCAT (vmode, low, high)));
+    }
 }
 
 /* Expand initialization of a vector which has all same elements.  */
diff --git a/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-init-2.c b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-init-2.c
new file mode 100644
index 00000000000..7592198c448
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vector/lasx/lasx-vec-init-2.c
@@ -0,0 +1,65 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -fno-vect-cost-model -mlasx" } */
+/* { dg-final { scan-assembler-times "vld" 12 } } */
+
+
+typedef char v16qi __attribute__ ((vector_size (16)));
+typedef char v32qi __attribute__ ((vector_size (32)));
+
+typedef short v8hi __attribute__ ((vector_size (16)));
+typedef short v16hi __attribute__ ((vector_size (32)));
+
+typedef int v4si __attribute__ ((vector_size (16)));
+typedef int v8si __attribute__ ((vector_size (32)));
+
+typedef long v2di __attribute__ ((vector_size (16)));
+typedef long v4di __attribute__ ((vector_size (32)));
+
+typedef float v4sf __attribute__ ((vector_size (16)));
+typedef float v8sf __attribute__ ((vector_size (32)));
+
+typedef double v2df __attribute__ ((vector_size (16)));
+typedef double v4df __attribute__ ((vector_size (32)));
+
+v16qi a_qi, b_qi;
+v8hi a_hi, b_hi;
+v4si a_si, b_si;
+v2di a_di, b_di;
+v4sf a_sf, b_sf;
+v2df a_df, b_df;
+
+v32qi
+foo_v32qi ()
+{
+  return __builtin_shufflevector (a_qi, b_qi, 0, 16, 1, 17, 2, 18, 3, 19, 4, 20, 5, 21, 6, 22, 7, 23, 8, 24, 9, 25, 10, 26, 11, 27, 12, 28, 13, 29, 14, 30, 15, 31);
+}
+
+v16hi
+foo_v16qi ()
+{
+  return __builtin_shufflevector (a_hi, b_hi, 0, 8, 1, 9, 2, 10, 3, 11, 4, 12, 5, 13, 6, 14, 7, 15);
+}
+
+v8si
+foo_v8si ()
+{
+  return __builtin_shufflevector (a_si, b_si, 0, 4, 1, 5, 2, 6, 3, 7);
+}
+
+v4di
+foo_v4di ()
+{
+  return __builtin_shufflevector (a_di, b_di, 0, 2, 1, 3);
+}
+
+v8sf
+foo_v8sf ()
+{
+  return __builtin_shufflevector (a_sf, b_sf, 0, 4, 1, 5, 2, 6, 3, 7);
+}
+
+v4df
+foo_v4df ()
+{
+  return __builtin_shufflevector (a_df, b_df, 0, 2, 1, 3);
+}
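
As a reading aid for the commit message above, here is a rough model, in plain C with GCC vector extensions, of what "concatenating two half-width vectors" means at the value level: the first operand fills the low half of the result and the second fills the high half, with an all-zero high half as the special case the message mentions. The helper names concat_v4si and concat_zero_v4si are hypothetical and not part of the patch. The sketch only models the semantics; it makes no claim about which instructions any compiler emits for it.

```c
#include <assert.h>

typedef int v4si __attribute__ ((vector_size (16)));
typedef int v8si __attribute__ ((vector_size (32)));

/* Hypothetical helper (not from the patch): write to *OUT a 256-bit
   vector whose elements 0-3 come from *LOW and 4-7 from *HIGH.
   Pointers are used so no wide vector is passed by value.  */
void
concat_v4si (v8si *out, const v4si *low, const v4si *high)
{
  v8si r = { (*low)[0], (*low)[1], (*low)[2], (*low)[3],
             (*high)[0], (*high)[1], (*high)[2], (*high)[3] };
  *out = r;
}

/* Hypothetical helper: same concatenation with an all-zero high half,
   i.e. the case the commit message associates with vec_concatz.  */
void
concat_zero_v4si (v8si *out, const v4si *low)
{
  v4si zero = { 0, 0, 0, 0 };
  concat_v4si (out, low, &zero);
}
```

A caller would fill two v4si values, call concat_v4si (&r, &lo, &hi), and read the eight elements back with r[i].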