From patchwork Fri Nov 17 00:09:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 165965
From: liuhongt
To: gcc-patches@gcc.gnu.org
Cc: crazylht@gmail.com, hjl.tools@gmail.com
Subject: [PATCH 1/2] Support reduc_{plus, xor, and, ior}_scal_m for vector integer mode.
Date: Fri, 17 Nov 2023 08:09:33 +0800
Message-Id: <20231117000934.2301995-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.31.1

The BB vectorizer relies on backend support for
.REDUC_{PLUS,IOR,XOR,AND} to vectorize reductions.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ready to push to trunk.

gcc/ChangeLog:

	PR target/112325
	* config/i386/sse.md (reduc_<code>_scal_<mode>): New expander.
	(REDUC_ANY_LOGIC_MODE): New iterator.
	(REDUC_PLUS_MODE): Extend to VxHI/SI/DImode.
	(REDUC_SSE_PLUS_MODE): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr112325-1.c: New test.
	* gcc.target/i386/pr112325-2.c: New test.
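Note for reviewers less familiar with the reduc_{plus,xor,and,ior}_scal_m
optabs: the following is a scalar C model of what an IOR reduction of this
kind computes. It folds the upper half of the lanes onto the lower half
until one lane is left, then reads lane 0. This is only a sketch under
assumptions: `model_reduc_ior` and its fixed 16-lane scratch buffer are
hypothetical and chosen for illustration, and the real expanders emit RTL
rather than loops.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: models a log2(n)-step
   OR reduction by repeatedly folding the upper half of the lanes onto
   the lower half, then returning lane 0.  */
int32_t
model_reduc_ior (const int32_t *lanes, size_t n)
{
  int32_t v[16];

  /* n must be a power of two that fits the scratch buffer.  */
  assert (n > 0 && n <= 16 && (n & (n - 1)) == 0);

  for (size_t i = 0; i < n; i++)
    v[i] = lanes[i];

  /* Each pass halves the live width: v[i] |= v[i + half].  */
  for (size_t half = n / 2; half > 0; half /= 2)
    for (size_t i = 0; i < half; i++)
      v[i] |= v[i + half];

  /* Corresponds to extracting element 0 of the reduced vector.  */
  return v[0];
}
```

The halving order does not affect the result because OR, AND, XOR and
wrapping integer addition are associative and commutative, which is what
allows the lanes to be combined pairwise instead of left to right.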
---
 gcc/config/i386/sse.md                     |  48 ++++++++-
 gcc/testsuite/gcc.target/i386/pr112325-1.c | 116 +++++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr112325-2.c |  38 +++++++
 3 files changed, 199 insertions(+), 3 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr112325-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr112325-2.c

diff --git a/gcc/config/i386/sse.md b/gcc/config/i386/sse.md
index d250a6cb802..f94a77d0b6d 100644
--- a/gcc/config/i386/sse.md
+++ b/gcc/config/i386/sse.md
@@ -3417,7 +3417,9 @@ (define_insn "sse3_hv4sf3"
 
 (define_mode_iterator REDUC_SSE_PLUS_MODE
  [(V2DF "TARGET_SSE") (V4SF "TARGET_SSE")
-  (V8HF "TARGET_AVX512FP16 && TARGET_AVX512VL")])
+  (V8HF "TARGET_AVX512FP16 && TARGET_AVX512VL")
+  (V8HI "TARGET_SSE2") (V4SI "TARGET_SSE2")
+  (V2DI "TARGET_SSE2")])
 
 (define_expand "reduc_plus_scal_<mode>"
  [(plus:REDUC_SSE_PLUS_MODE
@@ -3458,8 +3460,12 @@ (define_mode_iterator REDUC_PLUS_MODE
   (V8DF "TARGET_AVX512F && TARGET_EVEX512")
   (V16SF "TARGET_AVX512F && TARGET_EVEX512")
   (V32HF "TARGET_AVX512FP16 && TARGET_AVX512VL && TARGET_EVEX512")
-  (V32QI "TARGET_AVX")
-  (V64QI "TARGET_AVX512F && TARGET_EVEX512")])
+  (V32QI "TARGET_AVX") (V16HI "TARGET_AVX")
+  (V8SI "TARGET_AVX") (V4DI "TARGET_AVX")
+  (V64QI "TARGET_AVX512F && TARGET_EVEX512")
+  (V32HI "TARGET_AVX512F && TARGET_EVEX512")
+  (V16SI "TARGET_AVX512F && TARGET_EVEX512")
+  (V8DI "TARGET_AVX512F && TARGET_EVEX512")])
 
 (define_expand "reduc_plus_scal_<mode>"
  [(plus:REDUC_PLUS_MODE
@@ -3597,6 +3603,42 @@ (define_insn "reduces"
    (set_attr "prefix" "evex")
    (set_attr "mode" "<MODE>")])
 
+(define_expand "reduc_<code>_scal_<mode>"
+ [(any_logic:VI_128
+    (match_operand:<ssescalarmode> 0 "register_operand")
+    (match_operand:VI_128 1 "register_operand"))]
+  "TARGET_SSE2"
+{
+  rtx tmp = gen_reg_rtx (<MODE>mode);
+  ix86_expand_reduc (gen_<code><mode>3, tmp, operands[1]);
+  emit_insn (gen_vec_extract<mode><ssescalarmodelower> (operands[0],
+                                                       tmp, const0_rtx));
+  DONE;
+})
+
+(define_mode_iterator REDUC_ANY_LOGIC_MODE
+ [(V32QI "TARGET_AVX") (V16HI "TARGET_AVX")
+  (V8SI "TARGET_AVX") (V4DI "TARGET_AVX")
+  (V64QI "TARGET_AVX512F && TARGET_EVEX512")
+  (V32HI "TARGET_AVX512F && TARGET_EVEX512")
+  (V16SI "TARGET_AVX512F && TARGET_EVEX512")
+  (V8DI "TARGET_AVX512F && TARGET_EVEX512")])
+
+(define_expand "reduc_<code>_scal_<mode>"
+ [(any_logic:REDUC_ANY_LOGIC_MODE
+    (match_operand:<ssescalarmode> 0 "register_operand")
+    (match_operand:REDUC_ANY_LOGIC_MODE 1 "register_operand"))]
+ ""
+{
+  rtx tmp = gen_reg_rtx (<ssehalfvecmode>mode);
+  emit_insn (gen_vec_extract_hi_<mode> (tmp, operands[1]));
+  rtx tmp2 = gen_reg_rtx (<ssehalfvecmode>mode);
+  rtx tmp3 = gen_lowpart (<ssehalfvecmode>mode, operands[1]);
+  emit_insn (gen_<code><ssehalfvecmodelower>3 (tmp2, tmp, tmp3));
+  emit_insn (gen_reduc_<code>_scal_<ssehalfvecmodelower> (operands[0], tmp2));
+  DONE;
+})
+
 ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
 ;;
 ;; Parallel floating point comparisons
diff --git a/gcc/testsuite/gcc.target/i386/pr112325-1.c b/gcc/testsuite/gcc.target/i386/pr112325-1.c
new file mode 100644
index 00000000000..56e20c156f1
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr112325-1.c
@@ -0,0 +1,116 @@
+/* { dg-do compile } */
+/* { dg-options "-mavx512vl -mavx512bw -O2 -mtune=generic -mprefer-vector-width=512 -fdump-tree-slp2" } */
+/* { dg-final { scan-tree-dump-times ".REDUC_PLUS" 3 "slp2" } } */
+/* { dg-final { scan-tree-dump-times ".REDUC_IOR" 4 "slp2" } } */
+
+int
+__attribute__((noipa))
+plus_v4si (int* a)
+{
+  int sum = 0;
+  sum += a[0];
+  sum += a[1];
+  sum += a[2];
+  sum += a[3];
+  return sum;
+}
+
+short
+__attribute__((noipa))
+plus_v8hi (short* a)
+{
+  short sum = 0;
+  sum += a[0];
+  sum += a[1];
+  sum += a[2];
+  sum += a[3];
+  sum += a[4];
+  sum += a[5];
+  sum += a[6];
+  sum += a[7];
+  return sum;
+}
+
+long long
+__attribute__((noipa))
+plus_v8di (long long* a)
+{
+  long long sum = 0;
+  sum += a[0];
+  sum += a[1];
+  sum += a[2];
+  sum += a[3];
+  sum += a[4];
+  sum += a[5];
+  sum += a[6];
+  sum += a[7];
+  return sum;
+}
+
+int
+__attribute__((noipa))
+ior_v4si (int* a)
+{
+  int sum = 0;
+  sum |= a[0];
+  sum |= a[1];
+  sum |= a[2];
+  sum |= a[3];
+  return sum;
+}
+
+short
+__attribute__((noipa))
+ior_v8hi (short* a)
+{
+  short sum = 0;
+  sum |= a[0];
+  sum |= a[1];
+  sum |= a[2];
+  sum |= a[3];
+  sum |= a[4];
+  sum |= a[5];
+  sum |= a[6];
+  sum |= a[7];
+  return sum;
+}
+
+long long
+__attribute__((noipa))
+ior_v8di (long long* a)
+{
+  long long sum = 0;
+  sum |= a[0];
+  sum |= a[1];
+  sum |= a[2];
+  sum |= a[3];
+  sum |= a[4];
+  sum |= a[5];
+  sum |= a[6];
+  sum |= a[7];
+  return sum;
+}
+
+char
+__attribute__((noipa))
+ior_v16qi (char* a)
+{
+  char sum = 0;
+  sum |= a[0];
+  sum |= a[1];
+  sum |= a[2];
+  sum |= a[3];
+  sum |= a[4];
+  sum |= a[5];
+  sum |= a[6];
+  sum |= a[7];
+  sum |= a[8];
+  sum |= a[9];
+  sum |= a[10];
+  sum |= a[11];
+  sum |= a[12];
+  sum |= a[13];
+  sum |= a[14];
+  sum |= a[15];
+  return sum;
+}
diff --git a/gcc/testsuite/gcc.target/i386/pr112325-2.c b/gcc/testsuite/gcc.target/i386/pr112325-2.c
new file mode 100644
index 00000000000..650006b0bd9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr112325-2.c
@@ -0,0 +1,38 @@
+/* { dg-do run } */
+/* { dg-options "-O2 -msse2" } */
+/* { dg-require-effective-target sse2 } */
+
+#include "sse2-check.h"
+#include "pr112325-1.c"
+
+static void
+sse2_test (void)
+{
+  int d[4] = { 3, 11, 22, 89};
+  short w[8] = { 3, 11, 22, 89, 4, 9, 13, 7};
+  char b[16] = { 3, 11, 22, 89, 4, 9, 13, 7, 2, 6, 5, 111, 163, 88, 11, 235};
+  long long q[8] = { 3, 11, 22, 89, 4, 9, 13, 7};
+
+  /* if (plus_v4si (d) != 125) */
+  /*   __builtin_abort (); */
+
+  /* if (plus_v8hi (w) != 158) */
+  /*   __builtin_abort (); */
+
+  /* if (plus_v8di (q) != 158) */
+  /*   __builtin_abort (); */
+
+  /* if (ior_v4si (d) != 95) */
+  /*   __builtin_abort (); */
+
+  /* if (ior_v8hi (w) != 95) */
+  /*   __builtin_abort (); */
+
+  /* if (ior_v16qi (b) != (char)255) */
+  /*   __builtin_abort (); */
+
+  if (ior_v8di (q) != 95)
+    __builtin_abort ();
+
+  return;
+}
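
Note on the expected constants in pr112325-2.c: the values 125, 158 and 95
follow directly from the test arrays, and a plain scalar reference makes
that easy to confirm independently of any vectorization. The helpers below
are a minimal sketch, not part of the patch; `ref_plus` and `ref_ior` are
hypothetical names introduced only for this check.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper, not part of the patch: scalar sum of
   the first n elements.  */
long long
ref_plus (const long long *a, size_t n)
{
  assert (a != NULL);
  long long sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += a[i];
  return sum;
}

/* Hypothetical reference helper, not part of the patch: scalar
   bitwise OR of the first n elements.  */
long long
ref_ior (const long long *a, size_t n)
{
  assert (a != NULL);
  long long acc = 0;
  for (size_t i = 0; i < n; i++)
    acc |= a[i];
  return acc;
}
```

With the array { 3, 11, 22, 89, 4, 9, 13, 7 } from the test, the first four
elements sum to 125 and all eight sum to 158, while the OR is 95 over either
the first four or all eight elements, matching the constants used in
sse2_test.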