From patchwork Mon Oct 23 14:47:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Roger Sayle
X-Patchwork-Id: 156927
X-Original-To: gcc-patches@gcc.gnu.org
From: "Roger Sayle"
Cc: "'Uros Bizjak'"
Subject: [x86 PATCH] Fine tune STV register conversion costs for -Os.
Date: Mon, 23 Oct 2023 15:47:43 +0100
Message-ID: <008701da05bf$e2196b20$a64c4160$@nextmovesoftware.com>
List-Id: Gcc-patches mailing list

The eagle-eyed may have spotted that my recent testcases for DImode
shifts on x86_64 included -mno-stv in the dg-options.  This is because
the Scalar-To-Vector (STV) pass currently transforms these shifts to
use SSE vector operations, producing larger code even with -Os.  The
issue is that compute_convert_gain currently underestimates the size
of the instructions required for inter-unit moves, which the patch
below corrects.

For the simple test case:

unsigned long long shl1(unsigned long long x) { return x << 1; }

without this patch, GCC -m32 -Os -mavx2 currently generates:

shl1:   push   %ebp                  // 1 byte
        mov    %esp,%ebp             // 2 bytes
        vmovq  0x8(%ebp),%xmm0       // 5 bytes
        pop    %ebp                  // 1 byte
        vpaddq %xmm0,%xmm0,%xmm0     // 4 bytes
        vmovd  %xmm0,%eax            // 4 bytes
        vpextrd $0x1,%xmm0,%edx      // 6 bytes
        ret                          // 1 byte  = 24 bytes total

with this patch, we now generate the shorter

shl1:   push   %ebp                  // 1 byte
        mov    %esp,%ebp             // 2 bytes
        mov    0x8(%ebp),%eax        // 3 bytes
        mov    0xc(%ebp),%edx        // 3 bytes
        pop    %ebp                  // 1 byte
        add    %eax,%eax             // 2 bytes
        adc    %edx,%edx             // 2 bytes
        ret                          // 1 byte  = 15 bytes total

Benchmarking using CSiBE shows that this patch saves 1361 bytes
when compiling with -m32 -Os, and saves 172 bytes when compiling
with -Os.

This patch has been tested on x86_64-pc-linux-gnu with make bootstrap
and make -k check, both with and without --target_board=unix{-m32}
with no new failures.  Ok for mainline?


2023-10-23  Roger Sayle

gcc/ChangeLog
        * config/i386/i386-features.cc (compute_convert_gain): Provide
        more accurate values (sizes) for inter-unit moves with -Os.


Thanks in advance,
Roger

diff --git a/gcc/config/i386/i386-features.cc b/gcc/config/i386/i386-features.cc
index cead397..6fac67e 100644
--- a/gcc/config/i386/i386-features.cc
+++ b/gcc/config/i386/i386-features.cc
@@ -752,11 +752,33 @@ general_scalar_chain::compute_convert_gain ()
     fprintf (dump_file, " Instruction conversion gain: %d\n", gain);
 
   /* Cost the integer to sse and sse to integer moves. */
-  cost += n_sse_to_integer * ix86_cost->sse_to_integer;
-  /* ??? integer_to_sse but we only have that in the RA cost table.
-     Assume sse_to_integer/integer_to_sse are the same which they
-     are at the moment. */
-  cost += n_integer_to_sse * ix86_cost->sse_to_integer;
+  if (!optimize_function_for_size_p (cfun))
+    {
+      cost += n_sse_to_integer * ix86_cost->sse_to_integer;
+      /* ??? integer_to_sse but we only have that in the RA cost table.
+         Assume sse_to_integer/integer_to_sse are the same which they
+         are at the moment. */
+      cost += n_integer_to_sse * ix86_cost->sse_to_integer;
+    }
+  else if (TARGET_64BIT || smode == SImode)
+    {
+      cost += n_sse_to_integer * COSTS_N_BYTES (4);
+      cost += n_integer_to_sse * COSTS_N_BYTES (4);
+    }
+  else if (TARGET_SSE4_1)
+    {
+      /* vmovd (4 bytes) + vpextrd (6 bytes). */
+      cost += n_sse_to_integer * COSTS_N_BYTES (10);
+      /* vmovd (4 bytes) + vpinsrd (6 bytes). */
+      cost += n_integer_to_sse * COSTS_N_BYTES (10);
+    }
+  else
+    {
+      /* movd (4 bytes) + psrlq (5 bytes) + movd (4 bytes). */
+      cost += n_sse_to_integer * COSTS_N_BYTES (13);
+      /* movd (4 bytes) + movd (4 bytes) + unpckldq (4 bytes). */
+      cost += n_integer_to_sse * COSTS_N_BYTES (12);
+    }
 
   if (dump_file)
     fprintf (dump_file, " Registers conversion cost: %d\n", cost);
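
As a rough cross-check of the size-based branches in the patch above,
the following standalone C sketch mirrors the same selection in plain
bytes.  It is not GCC code: struct stv_target and the three functions
are hypothetical names introduced only for illustration, the booleans
stand in for TARGET_64BIT, TARGET_SSE4_1 and smode == SImode, and the
COSTS_N_BYTES scaling used by the real code is deliberately left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the target state the patch tests. */
struct stv_target {
  bool is_64bit;   /* models TARGET_64BIT */
  bool has_sse4_1; /* models TARGET_SSE4_1 */
  bool is_simode;  /* models smode == SImode */
};

/* Bytes assumed for one SSE -> integer move when optimizing for size. */
int sse_to_integer_bytes(const struct stv_target *t) {
  if (t->is_64bit || t->is_simode)
    return 4;   /* a single 4-byte move */
  if (t->has_sse4_1)
    return 10;  /* vmovd (4 bytes) + vpextrd (6 bytes) */
  return 13;    /* movd (4) + psrlq (5) + movd (4) */
}

/* Bytes assumed for one integer -> SSE move when optimizing for size. */
int integer_to_sse_bytes(const struct stv_target *t) {
  if (t->is_64bit || t->is_simode)
    return 4;   /* a single 4-byte move */
  if (t->has_sse4_1)
    return 10;  /* vmovd (4 bytes) + vpinsrd (6 bytes) */
  return 12;    /* movd (4) + movd (4) + unpckldq (4) */
}

/* Total inter-unit move bytes for a chain, in the spirit of the
   "Registers conversion cost" computed by compute_convert_gain. */
int conversion_bytes(const struct stv_target *t,
                     int n_sse_to_integer, int n_integer_to_sse) {
  return n_sse_to_integer * sse_to_integer_bytes(t)
         + n_integer_to_sse * integer_to_sse_bytes(t);
}
```

For the shl1 example with -m32 -mavx2, the result leaves the vector
unit through one SSE to integer move, and the sketch charges that as
10 bytes, matching the vmovd (4 bytes) plus vpextrd (6 bytes) pair in
the listing above.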