From patchwork Sun Nov 19 07:01:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Xi Ruoyao
X-Patchwork-Id: 166717
From: Xi Ruoyao
To: gcc-patches@gcc.gnu.org
Cc: chenglulu, i@xen0n.name, xuchenghua@loongson.cn, Xi Ruoyao
Subject: [PATCH] LoongArch: Optimize LSX vector shuffle on floating-point vector
Date: Sun, 19 Nov 2023 15:01:03 +0800
Message-ID: <20231119070102.3053-2-xry111@xry111.site>
X-Mailer: git-send-email 2.42.1
List-Id: Gcc-patches mailing list

The vec_perm expander was wrongly defined.  The GCC Internals manual says:

    Operand 3 is the “selector”.  It is an integral mode vector of the
    same width and number of elements as mode M.

Because of this mistake, the generic code has to work around the
mismatch, and it ends up generating some very nasty code for a simple
__builtin_shuffle (a, b, c), where a and b are V4SF and c is V4SI:

    la.local    $r12,.LANCHOR0
    la.local    $r13,.LANCHOR1
    vld         $vr1,$r12,48
    vslli.w     $vr1,$vr1,2
    vld         $vr2,$r12,16
    vld         $vr0,$r13,0
    vld         $vr3,$r13,16
    vshuf.b     $vr0,$vr1,$vr1,$vr0
    vld         $vr1,$r12,32
    vadd.b      $vr0,$vr0,$vr3
    vandi.b     $vr0,$vr0,31
    vshuf.b     $vr0,$vr1,$vr2,$vr0
    vst         $vr0,$r12,0
    jr          $r1

This is obviously stupid.  Fix the expander definition and adjust
loongarch_expand_vec_perm to handle it correctly.

gcc/ChangeLog:

	* config/loongarch/lsx.md (vec_perm): Make the selector VIMODE.
	* config/loongarch/loongarch.cc (loongarch_expand_vec_perm): Use
	the mode of the selector (instead of the shuffled vector) for
	truncating it.  Operate on subregs in the selector mode if the
	shuffled vector has a different mode (i.e. it's a floating-point
	vector).

gcc/testsuite/ChangeLog:

	* gcc.target/loongarch/vect-shuf-fp.c: New test.
---

Bootstrapped & regtested on loongarch64-linux-gnu.  Ok for trunk?

 gcc/config/loongarch/loongarch.cc              | 18 ++++++++++--------
 gcc/config/loongarch/lsx.md                    |  2 +-
 .../gcc.target/loongarch/vect-shuf-fp.c        | 16 ++++++++++++++++
 3 files changed, 27 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/loongarch/vect-shuf-fp.c

diff --git a/gcc/config/loongarch/loongarch.cc b/gcc/config/loongarch/loongarch.cc
index ce601a331f7..33357c670e1 100644
--- a/gcc/config/loongarch/loongarch.cc
+++ b/gcc/config/loongarch/loongarch.cc
@@ -8607,8 +8607,9 @@ void
 loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
 {
   machine_mode vmode = GET_MODE (target);
+  machine_mode vimode = GET_MODE (sel);
   auto nelt = GET_MODE_NUNITS (vmode);
-  auto round_reg = gen_reg_rtx (vmode);
+  auto round_reg = gen_reg_rtx (vimode);
   rtx round_data[MAX_VECT_LEN];
 
   for (int i = 0; i < nelt; i += 1)
@@ -8616,9 +8617,16 @@ loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
       round_data[i] = GEN_INT (0x1f);
     }
 
-  rtx round_data_rtx = gen_rtx_CONST_VECTOR (vmode, gen_rtvec_v (nelt, round_data));
+  rtx round_data_rtx = gen_rtx_CONST_VECTOR (vimode, gen_rtvec_v (nelt, round_data));
   emit_move_insn (round_reg, round_data_rtx);
 
+  if (vmode != vimode)
+    {
+      target = lowpart_subreg (vimode, target, vmode);
+      op0 = lowpart_subreg (vimode, op0, vmode);
+      op1 = lowpart_subreg (vimode, op1, vmode);
+    }
+
   switch (vmode)
     {
     case E_V16QImode:
@@ -8626,17 +8634,11 @@ loongarch_expand_vec_perm (rtx target, rtx op0, rtx op1, rtx sel)
       emit_insn (gen_lsx_vshuf_b (target, op1, op0, sel));
       break;
     case E_V2DFmode:
-      emit_insn (gen_andv2di3 (sel, sel, round_reg));
-      emit_insn (gen_lsx_vshuf_d_f (target, sel, op1, op0));
-      break;
     case E_V2DImode:
       emit_insn (gen_andv2di3 (sel, sel, round_reg));
       emit_insn (gen_lsx_vshuf_d (target, sel, op1, op0));
       break;
     case E_V4SFmode:
-      emit_insn (gen_andv4si3 (sel, sel, round_reg));
-      emit_insn (gen_lsx_vshuf_w_f (target, sel, op1, op0));
-      break;
     case E_V4SImode:
       emit_insn (gen_andv4si3 (sel, sel, round_reg));
       emit_insn (gen_lsx_vshuf_w (target, sel, op1, op0));
diff --git a/gcc/config/loongarch/lsx.md b/gcc/config/loongarch/lsx.md
index 8ea41c85b01..5e8d8d74b43 100644
--- a/gcc/config/loongarch/lsx.md
+++ b/gcc/config/loongarch/lsx.md
@@ -837,7 +837,7 @@ (define_expand "vec_perm"
  [(match_operand:LSX 0 "register_operand")
   (match_operand:LSX 1 "register_operand")
   (match_operand:LSX 2 "register_operand")
-  (match_operand:LSX 3 "register_operand")]
+  (match_operand:<VIMODE> 3 "register_operand")]
   "ISA_HAS_LSX"
 {
   loongarch_expand_vec_perm (operands[0], operands[1],
diff --git a/gcc/testsuite/gcc.target/loongarch/vect-shuf-fp.c b/gcc/testsuite/gcc.target/loongarch/vect-shuf-fp.c
new file mode 100644
index 00000000000..7acc2113afe
--- /dev/null
+++ b/gcc/testsuite/gcc.target/loongarch/vect-shuf-fp.c
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-mlasx -O3" } */
+/* { dg-final { scan-assembler "vshuf\.w" } } */
+
+#define V __attribute__ ((vector_size (16)))
+
+int a V;
+float b V;
+float c V;
+float d V;
+
+void
+test (void)
+{
+  d = __builtin_shuffle (b, c, a);
+}