From patchwork Mon May 22 14:36:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 97486
Date: Mon, 22 May 2023 16:36:26 +0200
Subject: [COMMITTED] i386: Account for the memory read in V*QImode multiplication sequences
To: "gcc-patches@gcc.gnu.org"
From: Uros Bizjak
Reply-To: Uros Bizjak
List-Id: Gcc-patches mailing list

Add the cost of a memory read to the cost of V*QImode vector mult sequences.

gcc/ChangeLog:

	* config/i386/i386.cc (ix86_multiplication_cost): Add
	the cost of a memory read to the cost of V?QImode sequences.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index 6a4b3326219..a36e625342d 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -20463,27 +20463,42 @@ ix86_multiplication_cost (const struct processor_costs *cost,
       {
       case V4QImode:
       case V8QImode:
-        /* Partial V*QImode is emulated with 4-5 insns.  */
-        if ((TARGET_AVX512BW && TARGET_AVX512VL) || TARGET_XOP)
+        /* Partial V*QImode is emulated with 4-6 insns.  */
+        if (TARGET_AVX512BW && TARGET_AVX512VL)
           return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3);
+        else if (TARGET_AVX2)
+          return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 5);
+        else if (TARGET_XOP)
+          return (ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3)
+                  + cost->sse_load[2]);
         else
-          return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 4);
+          return (ix86_vec_cost (mode, cost->mulss + cost->sse_op * 4)
+                  + cost->sse_load[2]);

       case V16QImode:
         /* V*QImode is emulated with 4-11 insns.  */
         if (TARGET_AVX512BW && TARGET_AVX512VL)
           return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3);
+        else if (TARGET_AVX2)
+          return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 8);
         else if (TARGET_XOP)
-          return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 5);
-        /* FALLTHRU */
+          return (ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 5)
+                  + cost->sse_load[2]);
+        else
+          return (ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 7)
+                  + cost->sse_load[2]);
+
       case V32QImode:
-        if (TARGET_AVX512BW && mode == V32QImode)
+        if (TARGET_AVX512BW)
           return ix86_vec_cost (mode, cost->mulss + cost->sse_op * 3);
         else
-          return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 7);
+          return (ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 7)
+                  + cost->sse_load[3] * 2);

       case V64QImode:
-        return ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 9);
+        return (ix86_vec_cost (mode, cost->mulss * 2 + cost->sse_op * 9)
+                + cost->sse_load[3] * 2
+                + cost->sse_load[4] * 2);

       case V4SImode:
         /* pmulld is used in this case. No emulation is needed.  */
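
For anyone following the arithmetic, here is a minimal standalone sketch of the V16QImode arm as it reads after this patch. It is not GCC code: processor_costs_sketch, target_flags_sketch, vec_cost and the numbers in example_costs are all hypothetical stand-ins. In particular, the real ix86_vec_cost may scale its argument depending on mode and tuning, and the real cost tables differ per processor. The sketch also assumes that sse_load[2] is the cost of a 128-bit load.

```cpp
// Illustrative model only. The field names mirror the patch, but the types,
// the cost values and the identity vec_cost() below are assumptions.
struct processor_costs_sketch {
  int mulss;        // cost of one multiply
  int sse_op;       // cost of one simple SSE operation
  int sse_load[5];  // load costs; assumed order: 32/64/128/256/512-bit
};

// Hypothetical replacement for the TARGET_* macros tested in the patch.
struct target_flags_sketch {
  bool avx512bw;
  bool avx512vl;
  bool avx2;
  bool xop;
};

// Stand-in for ix86_vec_cost: returns its argument unchanged (assumption).
constexpr int vec_cost (int cost) { return cost; }

// Mirrors the order of the tests in the patched V16QImode case:
// AVX512BW+VL first, then AVX2, then XOP, then the plain SSE2 fallback.
// Only the XOP and fallback arms add the cost of a memory read.
constexpr int v16qi_mult_cost (processor_costs_sketch c, target_flags_sketch t)
{
  return (t.avx512bw && t.avx512vl) ? vec_cost (c.mulss + c.sse_op * 3)
         : t.avx2 ? vec_cost (c.mulss * 2 + c.sse_op * 8)
         : t.xop ? vec_cost (c.mulss * 2 + c.sse_op * 5) + c.sse_load[2]
         : vec_cost (c.mulss * 2 + c.sse_op * 7) + c.sse_load[2];
}

// Made-up cost table: mulss = 4, sse_op = 1, 128-bit load = 6.
constexpr processor_costs_sketch example_costs = {4, 1, {2, 2, 6, 10, 20}};

// Compile-time checks of the four arms with the made-up table.
static_assert (v16qi_mult_cost (example_costs, {true, true, false, false}) == 7,
               "AVX512BW+VL: 4 + 3");
static_assert (v16qi_mult_cost (example_costs, {false, false, true, false}) == 16,
               "AVX2: 8 + 8");
static_assert (v16qi_mult_cost (example_costs, {false, false, false, true}) == 19,
               "XOP: 8 + 5 + load 6");
static_assert (v16qi_mult_cost (example_costs, {false, false, false, false}) == 21,
               "fallback: 8 + 7 + load 6");
```

With these made-up numbers the fallback sequence costs 21, of which 6 is the newly added memory read. The AVX512 and AVX2 arms carry no load term in the patch, so their costs do not depend on sse_load.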