From patchwork Fri May 5 12:13:57 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 90426
Date: Fri, 5 May 2023 14:13:57 +0200
Subject: [PATCH] i386: Introduce mulv2si3 instruction
To: "gcc-patches@gcc.gnu.org"
From: Uros Bizjak
List-Id: Gcc-patches mailing list

For SSE2 targets the expander unpacks the input elements into the correct
positions in a V4SI vector and emits a PMULUDQ instruction.  The output
elements are then shuffled back to their positions in the V2SI vector.

For SSE4.1 targets a PMULLD instruction is emitted directly.

gcc/ChangeLog:

    * config/i386/mmx.md (mulv2si3): New expander.
    (*mulv2si3): New insn pattern.

gcc/testsuite/ChangeLog:

    * gcc.target/i386/sse2-mmx-mult-vec.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Pushed to master.

Uros.

diff --git a/gcc/config/i386/mmx.md b/gcc/config/i386/mmx.md
index 872ddbc55f2..6dd203f4fa8 100644
--- a/gcc/config/i386/mmx.md
+++ b/gcc/config/i386/mmx.md
@@ -2092,6 +2092,55 @@ (define_insn "*3"
    (set_attr "type" "sseadd")
    (set_attr "mode" "TI")])
 
+(define_expand "mulv2si3"
+  [(set (match_operand:V2SI 0 "register_operand")
+        (mult:V2SI
+          (match_operand:V2SI 1 "register_operand")
+          (match_operand:V2SI 2 "register_operand")))]
+  "TARGET_MMX_WITH_SSE"
+{
+  if (!TARGET_SSE4_1)
+    {
+      rtx op1 = lowpart_subreg (V4SImode, force_reg (V2SImode, operands[1]),
+                                V2SImode);
+      rtx op2 = lowpart_subreg (V4SImode, force_reg (V2SImode, operands[2]),
+                                V2SImode);
+
+      rtx tmp1 = gen_reg_rtx (V4SImode);
+      emit_insn (gen_vec_interleave_lowv4si (tmp1, op1, op1));
+      rtx tmp2 = gen_reg_rtx (V4SImode);
+      emit_insn (gen_vec_interleave_lowv4si (tmp2, op2, op2));
+
+      rtx res = gen_reg_rtx (V2DImode);
+      emit_insn (gen_vec_widen_umult_even_v4si (res, tmp1, tmp2));
+
+      rtx op0 = gen_reg_rtx (V4SImode);
+      emit_insn (gen_sse2_pshufd_1 (op0, gen_lowpart (V4SImode, res),
+                                    const0_rtx, const2_rtx,
+                                    const0_rtx, const2_rtx));
+
+      emit_move_insn (operands[0], lowpart_subreg (V2SImode, op0, V4SImode));
+      DONE;
+    }
+})
+
+(define_insn "*mulv2si3"
+  [(set (match_operand:V2SI 0 "register_operand" "=Yr,*x,v")
+        (mult:V2SI
+          (match_operand:V2SI 1 "register_operand" "%0,0,v")
+          (match_operand:V2SI 2 "register_operand" "Yr,*x,v")))]
+  "TARGET_SSE4_1 && TARGET_MMX_WITH_SSE"
+  "@
+   pmulld\t{%2, %0|%0, %2}
+   pmulld\t{%2, %0|%0, %2}
+   vpmulld\t{%2, %1, %0|%0, %1, %2}"
+  [(set_attr "isa" "noavx,noavx,avx")
+   (set_attr "type" "sseimul")
+   (set_attr "prefix_extra" "1")
+   (set_attr "prefix" "orig,orig,vex")
+   (set_attr "btver2_decode" "vector")
+   (set_attr "mode" "TI")])
+
 (define_expand "mmx_mulv4hi3"
   [(set (match_operand:V4HI 0 "register_operand")
         (mult:V4HI (match_operand:V4HI 1 "register_mmxmem_operand")
diff --git a/gcc/testsuite/gcc.target/i386/sse2-mmx-mult-vec.c b/gcc/testsuite/gcc.target/i386/sse2-mmx-mult-vec.c
new file mode 100644
index 00000000000..cdc9a7bb8bf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/sse2-mmx-mult-vec.c
@@ -0,0 +1,27 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -ftree-vectorize -msse2" } */
+/* { dg-require-effective-target sse2 } */
+
+#include "sse2-check.h"
+
+#define N 2
+
+int a[N] = {-287807, 604344};
+int b[N] = {474362, 874120};
+int r[N];
+
+int rc[N] = {914249338, -11800128};
+
+static void
+sse2_test (void)
+{
+  int i;
+
+  for (i = 0; i < N; i++)
+    r[i] = a[i] * b[i];
+
+  /* check results:  */
+  for (i = 0; i < N; i++)
+    if (r[i] != rc[i])
+      abort ();
+}
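
For readers following the SSE2 fallback in the expander, the sequence (interleave low, PMULUDQ on the even lanes, PSHUFD with selectors 0, 2, 0, 2) can be sketched lane by lane in portable C. This is only a scalar model and not part of the patch: the model_* helper names are hypothetical, the upper two input lanes (don't-care in the real expander) are assumed to be zero, and the conversion back to int32_t assumes two's complement wraparound. The inputs and expected values in model_check are the ones from sse2-mmx-mult-vec.c.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative scalar model only; none of these helpers exist in GCC.  */

/* Models vec_interleave_lowv4si with both inputs equal (punpckldq x, x):
   {x0, x1, -, -} -> {x0, x0, x1, x1}.  */
static void
model_interleave_low (const uint32_t in[4], uint32_t out[4])
{
  out[0] = in[0];
  out[1] = in[0];
  out[2] = in[1];
  out[3] = in[1];
}

/* Models pmuludq: unsigned 32x32->64 multiply of the even lanes 0 and 2.  */
static void
model_widen_umult_even (const uint32_t a[4], const uint32_t b[4],
                        uint64_t out[2])
{
  out[0] = (uint64_t) a[0] * b[0];
  out[1] = (uint64_t) a[2] * b[2];
}

/* Models the whole fallback sequence for one V2SI multiply.  */
static void
model_mulv2si3 (const int32_t a[2], const int32_t b[2], int32_t r[2])
{
  /* Upper two lanes are don't-care in the real expander; zero here.  */
  uint32_t op1[4] = { (uint32_t) a[0], (uint32_t) a[1], 0, 0 };
  uint32_t op2[4] = { (uint32_t) b[0], (uint32_t) b[1], 0, 0 };
  uint32_t tmp1[4], tmp2[4];
  uint64_t res[2];

  model_interleave_low (op1, tmp1);
  model_interleave_low (op2, tmp2);
  model_widen_umult_even (tmp1, tmp2, res);

  /* View the two 64-bit products as four 32-bit lanes (little endian).  */
  uint32_t v[4] = { (uint32_t) res[0], (uint32_t) (res[0] >> 32),
                    (uint32_t) res[1], (uint32_t) (res[1] >> 32) };

  /* pshufd with selectors {0, 2, 0, 2}: the low halves of both products
     land in lanes 0 and 1, which form the V2SI result.  */
  uint32_t shuf[4] = { v[0], v[2], v[0], v[2] };

  /* Assumes two's complement wraparound when converting to int32_t.  */
  r[0] = (int32_t) shuf[0];
  r[1] = (int32_t) shuf[1];
}

/* Checks the model against the inputs and expected values used in
   sse2-mmx-mult-vec.c.  */
static void
model_check (void)
{
  const int32_t a[2] = { -287807, 604344 };
  const int32_t b[2] = { 474362, 874120 };
  int32_t r[2];

  model_mulv2si3 (a, b, r);
  assert (r[0] == 914249338);
  assert (r[1] == -11800128);
}
```

The unsigned PMULUDQ is sufficient for a signed multiply here because only the low 32 bits of each 64-bit product are kept, and those bits do not depend on the signedness of the operands.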