From patchwork Wed May 10 20:45:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 92291
Date: Wed, 10 May 2023 22:45:01 +0200
Subject: [PATCH] i386: Add missing vector extend patterns [PR92658]
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list
From: Uros Bizjak
Reply-To: Uros Bizjak

Add missing insn pattern for v2qi -> v2si vector extend and named expanders
to activate generation of vector extends to 8-byte and 4-byte vectors.

gcc/ChangeLog:

    PR target/92658
    * config/i386/mmx.md (sse4_1_v2qiv2si2): New insn pattern.
    (v4qiv4hi2): New expander.
    (v2hiv2si2): Ditto.
    (v2qiv2si2): Ditto.
    (v2qiv2hi2): Ditto.

gcc/testsuite/ChangeLog:

    PR target/92658
    * gcc.target/i386/pr92658-sse4-4b.c: New test.
    * gcc.target/i386/pr92658-sse4-8b.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 6dd203f4fa8..e7ca921dd2b 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -3543,6 +3543,18 @@ (define_insn "sse4_1_v4qiv4hi2"
    (set_attr "prefix" "orig,orig,maybe_evex")
    (set_attr "mode" "TI")])
 
+(define_expand "v4qiv4hi2"
+  [(set (match_operand:V4HI 0 "register_operand")
+        (any_extend:V4HI
+          (match_operand:V4QI 1 "register_operand")))]
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+{
+  rtx op1 = force_reg (V4QImode, operands[1]);
+  op1 = lowpart_subreg (V8QImode, op1, V4QImode);
+  emit_insn (gen_sse4_1_v4qiv4hi2 (operands[0], op1));
+  DONE;
+})
+
 (define_insn "sse4_1_v2hiv2si2"
   [(set (match_operand:V2SI 0 "register_operand" "=Yr,*x,v")
         (any_extend:V2SI
@@ -3557,6 +3569,44 @@ (define_insn "sse4_1_v2hiv2si2"
    (set_attr "prefix" "orig,orig,maybe_evex")
    (set_attr "mode" "TI")])
 
+(define_expand "v2hiv2si2"
+  [(set (match_operand:V2SI 0 "register_operand")
+        (any_extend:V2SI
+          (match_operand:V2HI 1 "register_operand")))]
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+{
+  rtx op1 = force_reg (V2HImode, operands[1]);
+  op1 = lowpart_subreg (V4HImode, op1, V2HImode);
+  emit_insn (gen_sse4_1_v2hiv2si2 (operands[0], op1));
+  DONE;
+})
+
+(define_insn "sse4_1_v2qiv2si2"
+  [(set (match_operand:V2SI 0 "register_operand" "=Yr,*x,v")
+        (any_extend:V2SI
+          (vec_select:V2QI
+            (match_operand:V4QI 1 "register_operand" "Yr,*x,v")
+            (parallel [(const_int 0) (const_int 1)]))))]
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+  "%vpmovbd\t{%1, %0|%0, %1}"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "ssemov")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "orig,orig,maybe_evex")
+   (set_attr "mode" "TI")])
+
+(define_expand "v2qiv2si2"
+  [(set (match_operand:V2SI 0 "register_operand")
+        (any_extend:V2SI
+          (match_operand:V2QI 1 "register_operand")))]
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+{
+  rtx op1 = force_reg (V2QImode, operands[1]);
+  op1 = lowpart_subreg (V4QImode, op1, V2QImode);
+  emit_insn (gen_sse4_1_v2qiv2si2 (operands[0], op1));
+  DONE;
+})
+
 (define_insn "sse4_1_v2qiv2hi2"
   [(set (match_operand:V2HI 0 "register_operand" "=Yr,*x,Yw")
         (any_extend:V2HI
@@ -3571,6 +3621,18 @@ (define_insn "sse4_1_v2qiv2hi2"
    (set_attr "prefix" "orig,orig,maybe_evex")
    (set_attr "mode" "TI")])
 
+(define_expand "v2qiv2hi2"
+  [(set (match_operand:V2HI 0 "register_operand")
+        (any_extend:V2HI
+          (match_operand:V2QI 1 "register_operand")))]
+  "TARGET_SSE4_1"
+{
+  rtx op1 = force_reg (V2QImode, operands[1]);
+  op1 = lowpart_subreg (V4QImode, op1, V2QImode);
+  emit_insn (gen_sse4_1_v2qiv2hi2 (operands[0], op1));
+  DONE;
+})
+
 ;; Pack/unpack vector modes
 (define_mode_attr mmxpackmode
   [(V4HI "V8QI") (V2SI "V4HI")])
diff --git a/gcc/testsuite/gcc.target/i386/pr92658-sse4-4b.c b/gcc/testsuite/gcc.target/i386/pr92658-sse4-4b.c
new file mode 100644
index 00000000000..f0264a3cbe1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr92658-sse4-4b.c
@@ -0,0 +1,26 @@
+/* PR target/92658 */
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -msse4.1" } */
+
+typedef unsigned char v4qi __attribute__((vector_size (4)));
+typedef unsigned short v2hi __attribute__((vector_size (4)));
+
+void
+foo_u8_u16 (v2hi * dst, v4qi * __restrict src)
+{
+  unsigned short tem[2];
+  tem[0] = (*src)[0];
+  tem[1] = (*src)[1];
+  dst[0] = *(v2hi *) tem;
+}
+
+void
+bar_u8_u16 (v2hi * dst, v4qi src)
+{
+  unsigned short tem[4];
+  tem[0] = src[0];
+  tem[1] = src[1];
+  dst[0] = *(v2hi *) tem;
+}
+
+/* { dg-final { scan-assembler-times "pmovzxbw" 2 } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr92658-sse4-8b.c b/gcc/testsuite/gcc.target/i386/pr92658-sse4-8b.c
new file mode 100644
index 00000000000..5c815f51ee3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr92658-sse4-8b.c
@@ -0,0 +1,71 @@
+/* PR target/92658 */
+/* { dg-do compile { target { ! ia32 } } } */
+/* { dg-options "-O2 -mtune=icelake-server -ftree-vectorize -msse4.1" } */
+
+typedef unsigned char v8qi __attribute__((vector_size (8)));
+typedef unsigned short v4hi __attribute__((vector_size (8)));
+typedef unsigned int v2si __attribute__((vector_size (8)));
+
+void
+foo_u8_u16 (v4hi * dst, v8qi * __restrict src)
+{
+  unsigned short tem[4];
+  tem[0] = (*src)[0];
+  tem[1] = (*src)[1];
+  tem[2] = (*src)[2];
+  tem[3] = (*src)[3];
+  dst[0] = *(v4hi *) tem;
+}
+
+void
+bar_u8_u16 (v4hi * dst, v8qi src)
+{
+  unsigned short tem[4];
+  tem[0] = src[0];
+  tem[1] = src[1];
+  tem[2] = src[2];
+  tem[3] = src[3];
+  dst[0] = *(v4hi *) tem;
+}
+
+/* { dg-final { scan-assembler-times "pmovzxbw" 2 } } */
+
+void
+foo_u8_u32 (v2si * dst, v8qi * __restrict src)
+{
+  unsigned int tem[2];
+  tem[0] = (*src)[0];
+  tem[1] = (*src)[1];
+  dst[0] = *(v2si *) tem;
+}
+
+void
+bar_u8_u32 (v2si * dst, v8qi src)
+{
+  unsigned int tem[2];
+  tem[0] = src[0];
+  tem[1] = src[1];
+  dst[0] = *(v2si *) tem;
+}
+
+/* { dg-final { scan-assembler-times "pmovzxbd" 2 } } */
+
+void
+foo_u16_u32 (v2si * dst, v4hi * __restrict src)
+{
+  unsigned int tem[2];
+  tem[0] = (*src)[0];
+  tem[1] = (*src)[1];
+  dst[0] = *(v2si *) tem;
+}
+
+void
+bar_u16_u32 (v2si * dst, v4hi src)
+{
+  unsigned int tem[2];
+  tem[0] = src[0];
+  tem[1] = src[1];
+  dst[0] = *(v2si *) tem;
+}
+
+/* { dg-final { scan-assembler-times "pmovzxwd" 2 } } */
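
For readers who want to see the v2qi -> v2si zero extension outside the testsuite, here is a minimal sketch. It is not part of the patch. The typedefs and the function names widen_u8_u32 and check_widen_u8_u32 are hypothetical, and the code relies only on GCC's generic vector extensions and __builtin_convertvector, not on anything in mmx.md. With -O2 -msse4.1 on x86_64, a compiler that contains the new expander may select pmovzxbd for the conversion, but that is an expectation and the sketch does not check the generated assembly.

```c
#include <assert.h>

/* Hypothetical typedefs for this sketch: a 2-byte vector of unsigned
   chars and an 8-byte vector of unsigned ints.  */
typedef unsigned char v2qi __attribute__((vector_size (2)));
typedef unsigned int v2si __attribute__((vector_size (8)));

/* Zero-extend both byte lanes of *SRC into the 32-bit lanes of *DST.
   __builtin_convertvector converts element by element, so unsigned
   sources are zero-extended.  */
void
widen_u8_u32 (v2si *dst, const v2qi *src)
{
  *dst = __builtin_convertvector (*src, v2si);
}

/* Cross-check the vector conversion against plain scalar widening.  */
void
check_widen_u8_u32 (unsigned char a, unsigned char b)
{
  v2qi src = { a, b };
  v2si dst;

  widen_u8_u32 (&dst, &src);
  assert (dst[0] == (unsigned int) a);
  assert (dst[1] == (unsigned int) b);
}
```

Calling check_widen_u8_u32 (200, 255) from a small driver exercises inputs with the top bit set, which is where zero and sign extension give different results.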