From patchwork Tue Jan 9 10:46:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: liuhongt
X-Patchwork-Id: 186311
From:
liuhongt
To: gcc-patches@gcc.gnu.org
Cc: richard.guenther@gmail.com
Subject: [PATCH] Optimize A < B ? A : B to MIN_EXPR.
Date: Tue, 9 Jan 2024 18:46:48 +0800
Message-Id: <20240109104648.675293-1-hongtao.liu@intel.com>
X-Mailer: git-send-email 2.31.1

> I wonder if you can amend the existing patterns instead by iterating
> over cond/vec_cond.  There are quite some (look for uses of
> minmax_from_comparison) that could be adapted to vectors.
>
> The ones matching the simple form you match are
>
> #if GIMPLE
> /* A >= B ? A : B -> max (A, B) and friends.  The code is still
>    in fold_cond_expr_with_comparison for GENERIC folding with
>    some extra constraints.  */
> (for cmp (eq ne le lt unle unlt ge gt unge ungt uneq ltgt)
>  (simplify
>   (cond (cmp:c (nop_convert1?@c0 @0) (nop_convert2?@c1 @1))
>         (convert3? @0) (convert4? @1))
>   (if (!HONOR_SIGNED_ZEROS (type)
> ...

This pattern is a conditional operation that treats the vector as a
whole unit; it is more like cbranchm, which is different from
vec_cond_expr.  So I added my patterns after it.

>
> I think.  Consider at least placing the new patterns next to that.

Bootstrapped and regtested on x86_64-pc-linux-gnu{-m32,}.
Ok for trunk?

Similar for A < B ? B : A to MAX_EXPR.
There is code in the frontend to optimize such patterns, but it fails
to handle the testcase in the PR, since the pattern is only exposed at
the GIMPLE level when folding backend builtins.

pr95906 can now be optimized to MAX_EXPR, as noted in the comment in
the testcase.

// FIXME: this should further optimize to a MAX_EXPR
typedef signed char v16i8 __attribute__((vector_size(16)));
v16i8 f(v16i8 a, v16i8 b)

gcc/ChangeLog:

	PR target/104401
	* match.pd (VEC_COND_EXPR: A < B ? A : B -> MIN_EXPR): New
	pattern match.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr104401.c: New test.
	* gcc.dg/tree-ssa/pr95906.c: Adjust testcase.
---
 gcc/match.pd                             | 21 ++++++++++++++++++
 gcc/testsuite/gcc.dg/tree-ssa/pr95906.c  |  3 +--
 gcc/testsuite/gcc.target/i386/pr104401.c | 27 ++++++++++++++++++++++++
 3 files changed, 49 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr104401.c

diff --git a/gcc/match.pd b/gcc/match.pd
index 7b4b15acc41..d8e2009a83f 100644
--- a/gcc/match.pd
+++ b/gcc/match.pd
@@ -5672,6 +5672,27 @@ DEFINE_INT_AND_FLOAT_ROUND_FN (RINT)
      (if (VECTOR_TYPE_P (type))
       (view_convert @c0)
       (convert @c0))))))))
+
+/* This is for VEC_COND_EXPR
+   Optimize A < B ? A : B to MIN (A, B)
+	    A > B ? A : B to MAX (A, B).  */
+(for cmp (lt le ungt unge gt ge unlt unle)
+     minmax (min min min min max max max max)
+     MINMAX (MIN_EXPR MIN_EXPR MIN_EXPR MIN_EXPR MAX_EXPR MAX_EXPR MAX_EXPR MAX_EXPR)
+ (simplify
+  (vec_cond (cmp @0 @1) @0 @1)
+   (if (VECTOR_INTEGER_TYPE_P (type)
+	&& target_supports_op_p (type, MINMAX, optab_vector))
+    (minmax @0 @1))))
+
+(for cmp (lt le ungt unge gt ge unlt unle)
+     minmax (max max max max min min min min)
+     MINMAX (MAX_EXPR MAX_EXPR MAX_EXPR MAX_EXPR MIN_EXPR MIN_EXPR MIN_EXPR MIN_EXPR)
+ (simplify
+  (vec_cond (cmp @0 @1) @1 @0)
+   (if (VECTOR_INTEGER_TYPE_P (type)
+	&& target_supports_op_p (type, MINMAX, optab_vector))
+    (minmax @0 @1))))
 #endif
 
 (for cnd (cond vec_cond)
diff --git a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
index 3d820a58e93..d15670f3e9e 100644
--- a/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
+++ b/gcc/testsuite/gcc.dg/tree-ssa/pr95906.c
@@ -1,7 +1,6 @@
 /* { dg-do compile } */
 /* { dg-options "-O2 -fdump-tree-forwprop3-raw -w -Wno-psabi" } */
 
-// FIXME: this should further optimize to a MAX_EXPR
 typedef signed char v16i8 __attribute__((vector_size(16)));
 v16i8 f(v16i8 a, v16i8 b)
 {
@@ -10,4 +9,4 @@ v16i8 f(v16i8 a, v16i8 b)
 }
 
 /* { dg-final { scan-tree-dump-not "bit_(and|ior)_expr" "forwprop3" } } */
-/* { dg-final { scan-tree-dump-times "vec_cond_expr" 1 "forwprop3" } } */
+/* { dg-final { scan-tree-dump-times "max_expr" 1 "forwprop3" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr104401.c b/gcc/testsuite/gcc.target/i386/pr104401.c
new file mode 100644
index 00000000000..8ce7ff88d9e
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr104401.c
@@ -0,0 +1,27 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -msse4.1" } */
+/* { dg-final { scan-assembler-times "pminsd" 2 } } */
+/* { dg-final { scan-assembler-times "pmaxsd" 2 } } */
+
+#include 
+
+__m128i min32(__m128i value, __m128i input)
+{
+  return _mm_blendv_epi8(input, value, _mm_cmplt_epi32(value, input));
+}
+
+__m128i max32(__m128i value, __m128i input)
+{
+  return _mm_blendv_epi8(input, value, _mm_cmpgt_epi32(value, input));
+}
+
+__m128i min32_1(__m128i value, __m128i input)
+{
+  return _mm_blendv_epi8(input, value, _mm_cmpgt_epi32(input, value));
+}
+
+__m128i max32_1(__m128i value, __m128i input)
+{
+  return _mm_blendv_epi8(input, value, _mm_cmplt_epi32(input, value));
+}
+
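
For readers who want to convince themselves of the identity the new
patterns rely on, here is a minimal sketch in plain C.  It is not part of
the patch.  It emulates a four-lane signed 32-bit compare-and-select with
ordinary arrays, so it needs no target flags, and every helper name in it
(lanes_cmplt, lanes_select, lanes_min, lanes_max, fold_agrees, self_check)
is hypothetical.  It only covers integer lanes, matching the
VECTOR_INTEGER_TYPE_P restriction in the patterns.

```c
#include <assert.h>
#include <stdint.h>

#define LANES 4  /* assumption: four 32-bit lanes, as in a __m128i */

/* Hypothetical helper: lane-wise a < b, giving an all-ones (-1) or
   all-zeros mask per lane, in the style of _mm_cmplt_epi32.  */
static void
lanes_cmplt (const int32_t *a, const int32_t *b, int32_t *mask)
{
  for (int i = 0; i < LANES; i++)
    mask[i] = a[i] < b[i] ? -1 : 0;
}

/* Hypothetical helper: lane-wise mask ? x : y, which is the operation a
   VEC_COND_EXPR expresses.  */
static void
lanes_select (const int32_t *mask, const int32_t *x, const int32_t *y,
              int32_t *out)
{
  for (int i = 0; i < LANES; i++)
    out[i] = mask[i] ? x[i] : y[i];
}

/* Lane-wise minimum, standing in for MIN_EXPR on an integer vector.  */
static void
lanes_min (const int32_t *a, const int32_t *b, int32_t *out)
{
  for (int i = 0; i < LANES; i++)
    out[i] = a[i] < b[i] ? a[i] : b[i];
}

/* Lane-wise maximum, standing in for MAX_EXPR on an integer vector.  */
static void
lanes_max (const int32_t *a, const int32_t *b, int32_t *out)
{
  for (int i = 0; i < LANES; i++)
    out[i] = a[i] > b[i] ? a[i] : b[i];
}

/* Return 1 if, in every lane, a < b ? a : b equals min (a, b) and
   a < b ? b : a equals max (a, b); return 0 otherwise.  */
int
fold_agrees (const int32_t *a, const int32_t *b)
{
  int32_t lt[LANES], sel_ab[LANES], sel_ba[LANES], mn[LANES], mx[LANES];

  lanes_cmplt (a, b, lt);
  lanes_select (lt, a, b, sel_ab);  /* a < b ? a : b */
  lanes_select (lt, b, a, sel_ba);  /* a < b ? b : a */
  lanes_min (a, b, mn);
  lanes_max (a, b, mx);

  for (int i = 0; i < LANES; i++)
    if (sel_ab[i] != mn[i] || sel_ba[i] != mx[i])
      return 0;
  return 1;
}

/* Usage example: a few signed edge cases, including a tie in lane 2.  */
void
self_check (void)
{
  const int32_t a[LANES] = { 1, -5, 7, INT32_MAX };
  const int32_t b[LANES] = { 2, -9, 7, INT32_MIN };

  assert (fold_agrees (a, b));
  assert (fold_agrees (b, a));
}
```

In a tied lane both arms of the select hold the same value, so the result
is the same whichever arm is chosen.  That is why, for integer lanes, the
strict and non-strict comparisons can share one replacement.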