From patchwork Mon May 22 20:39:01 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 97603
Date: Mon, 22 May 2023 22:39:01 +0200
Subject: [COMMITTED] i386: Adjust emulated integer vector mode shift costs
From: Uros Bizjak
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list

The integer vector mode costs that ix86_shift_rotate_cost returns for
emulated instructions are wrong and do not reflect the generated
instruction sequences.  Rewrite the handling of different integer vector
modes and different target ABIs to return real instruction counts, in
order to calculate better costs for the various emulated modes.  Also add
the cost of a memory read when an instruction in the sequence reads
memory.

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_shift_rotate_cost): Correct
	calculation of integer vector mode costs to reflect generated
	instruction sequences of different integer vector modes and
	different target ABIs.  Remove "speed" function argument.
	(ix86_rtx_costs): Update call for removed function argument.
	(ix86_vector_costs::add_stmt_cost): Ditto.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/sse2-shiftqihi-constant-1.c: Remove XFAILs.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index a36e625342d..38125ce284a 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20565,20 +20565,23 @@ ix86_shift_rotate_cost (const struct processor_costs *cost,
 			enum rtx_code code,
 			enum machine_mode mode, bool constant_op1,
 			HOST_WIDE_INT op1_val,
-			bool speed,
 			bool and_in_op1,
 			bool shift_and_truncate,
 			bool *skip_op0, bool *skip_op1)
 {
   if (skip_op0)
     *skip_op0 = *skip_op1 = false;
+
   if (GET_MODE_CLASS (mode) == MODE_VECTOR_INT)
     {
-      /* V*QImode is emulated with 1-11 insns.  */
-      if (mode == V16QImode || mode == V32QImode)
+      int count;
+      /* Cost of reading the memory.  */
+      int extra;
+
+      switch (mode)
 	{
-	  int count = 11;
-	  if (TARGET_XOP && mode == V16QImode)
+	case V16QImode:
+	  if (TARGET_XOP)
 	    {
 	      /* For XOP we use vpshab, which requires a broadcast of the
 		 value to the variable shift insn.  For constants this
@@ -20586,37 +20589,65 @@ ix86_shift_rotate_cost (const struct processor_costs *cost,
 		 shift with one insn set the cost to prefer paddb.  */
 	      if (constant_op1)
 		{
-		  if (skip_op1)
-		    *skip_op1 = true;
-		  return ix86_vec_cost (mode,
-					cost->sse_op
-					+ (speed
-					   ? 2
-					   : COSTS_N_BYTES
-					       (GET_MODE_UNIT_SIZE (mode))));
+		  extra = cost->sse_load[2];
+		  return ix86_vec_cost (mode, cost->sse_op) + extra;
+		}
+	      else
+		{
+		  count = (code == ASHIFT) ? 2 : 3;
+		  return ix86_vec_cost (mode, cost->sse_op * count);
+		}
+	    }
+	  /* FALLTHRU */
+	case V32QImode:
+	  extra = (mode == V16QImode) ? cost->sse_load[2] : cost->sse_load[3];
+	  if (constant_op1)
+	    {
+	      if (code == ASHIFTRT)
+		{
+		  count = 4;
+		  extra *= 2;
+		}
+	      else
+		count = 2;
+	    }
+	  else if (TARGET_SSE4_1)
+	    count = 8;
+	  else if (code == ASHIFTRT)
+	    count = 9;
+	  else
+	    count = 8;
+	  return ix86_vec_cost (mode, cost->sse_op * count) + extra;
+
+	case V2DImode:
+	case V4DImode:
+	  /* V*DImode arithmetic right shift is emulated.  */
+	  if (code == ASHIFTRT && !TARGET_AVX512VL)
+	    {
+	      if (constant_op1)
+		{
+		  if (op1_val == 63)
+		    count = TARGET_SSE4_2 ? 1 : 2;
+		  else if (TARGET_XOP)
+		    count = 2;
+		  else
+		    count = 4;
 		}
-	      count = 3;
+	      else if (TARGET_XOP)
+		count = 3;
+	      else if (TARGET_SSE4_2)
+		count = 4;
+	      else
+		count = 5;
+
+	      return ix86_vec_cost (mode, cost->sse_op * count);
 	    }
-	  else if (TARGET_SSSE3)
-	    count = 7;
-	  return ix86_vec_cost (mode, cost->sse_op * count);
-	}
-      /* V*DImode arithmetic right shift is emulated.  */
-      else if (code == ASHIFTRT
-	       && (mode == V2DImode || mode == V4DImode)
-	       && !TARGET_XOP
-	       && !TARGET_AVX512VL)
-	{
-	  int count = 4;
-	  if (constant_op1 && op1_val == 63 && TARGET_SSE4_2)
-	    count = 2;
-	  else if (constant_op1)
-	    count = 3;
-	  return ix86_vec_cost (mode, cost->sse_op * count);
+	  /* FALLTHRU */
+	default:
+	  return ix86_vec_cost (mode, cost->sse_op);
 	}
-      else
-	return ix86_vec_cost (mode, cost->sse_op);
     }
+
   if (GET_MODE_SIZE (mode) > UNITS_PER_WORD)
     {
       if (constant_op1)
@@ -20786,7 +20817,6 @@ ix86_rtx_costs (rtx x, machine_mode mode, int outer_code_i, int opno,
 					 CONSTANT_P (XEXP (x, 1)),
 					 CONST_INT_P (XEXP (x, 1))
 					   ? INTVAL (XEXP (x, 1)) : -1,
-					 speed,
 					 GET_CODE (XEXP (x, 1)) == AND,
 					 SUBREG_P (XEXP (x, 1))
 					 && GET_CODE (XEXP (XEXP (x, 1),
@@ -23558,7 +23588,7 @@ ix86_vector_costs::add_stmt_cost (int count, vect_cost_for_stmt kind,
 			    TREE_CODE (op2) == INTEGER_CST,
 			    cst_and_fits_in_hwi (op2)
 			    ? int_cst_value (op2) : -1,
-			    true, false, false, NULL, NULL);
+			    false, false, NULL, NULL);
 	  }
 	  break;
 	case NOP_EXPR:
diff --git a/gcc/testsuite/gcc.target/i386/sse2-shiftqihi-constant-1.c b/gcc/testsuite/gcc.target/i386/sse2-shiftqihi-constant-1.c
index 015450f8219..8a79afcdaf7 100644
--- a/gcc/testsuite/gcc.target/i386/sse2-shiftqihi-constant-1.c
+++ b/gcc/testsuite/gcc.target/i386/sse2-shiftqihi-constant-1.c
@@ -1,7 +1,7 @@
 /* PR target/95524 */
 /* { dg-do compile } */
 /* { dg-options "-O2 -msse2 -mno-avx" } */
-/* { dg-final { scan-assembler-times "pand\[^\n\]*%xmm" 3 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times "pand\[^\n\]*%xmm" 3 } } */
 
 typedef char v16qi __attribute__ ((vector_size (16)));
 typedef unsigned char v16uqi __attribute__ ((vector_size (16)));
@@ -20,7 +20,7 @@ foo_ashift_128 (v16qi a)
   return a << 7;
 }
 
-/* { dg-final { scan-assembler-times "psllw\[^\n\]*%xmm" 1 { xfail *-*-* } } } */
+/* { dg-final { scan-assembler-times "psllw\[^\n\]*%xmm" 1 } } */
 
 __attribute__((noipa)) v16uqi
 foo_lshiftrt_128 (v16uqi a)
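
For readers who want to see the new V*QImode cost rule in isolation, here is a minimal standalone sketch of the V16QImode (non-XOP) / V32QImode branch.  It is not part of the patch.  The names sketch_code, sketch_costs and sketch_qi_shift_cost are hypothetical, the have_sse4_1 flag stands in for TARGET_SSE4_1, the single sse_load field stands in for the cost->sse_load[] entry selected by mode, and whatever adjustment ix86_vec_cost applies is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the rtx codes used by the patch.  */
enum sketch_code { SKETCH_ASHIFT, SKETCH_LSHIFTRT, SKETCH_ASHIFTRT };

/* Hypothetical, flattened view of the two cost fields this branch
   consults: the cost of one SSE op and the cost of one vector load
   of the mode being shifted.  */
struct sketch_costs
{
  int sse_op;
  int sse_load;
};

/* Mirror of the V16QImode (non-XOP) / V32QImode branch: pick the
   number of emitted instructions, then add the memory-read cost.
   Assumption: the adjustment done by ix86_vec_cost is ignored.  */
static int
sketch_qi_shift_cost (const struct sketch_costs *cost,
		      enum sketch_code code,
		      bool constant_op1, bool have_sse4_1)
{
  int count;
  /* Cost of reading the memory.  */
  int extra;

  assert (cost);
  extra = cost->sse_load;

  if (constant_op1)
    {
      if (code == SKETCH_ASHIFTRT)
	{
	  /* Constant arithmetic right shift: four insns, and the
	     memory-read cost is counted twice.  */
	  count = 4;
	  extra *= 2;
	}
      else
	count = 2;
    }
  else if (have_sse4_1)
    count = 8;
  else if (code == SKETCH_ASHIFTRT)
    count = 9;
  else
    count = 8;

  return cost->sse_op * count + extra;
}
```

With sse_op = 1 and sse_load = 10, a constant arithmetic right shift comes out as 4 + 2 * 10 = 24, while a constant left shift comes out as 2 + 10 = 12.  The memory-read term is what separates this from a plain instruction count.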