From patchwork Wed Oct 25 14:30:12 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Uros Bizjak
X-Patchwork-Id: 158133
From: Uros Bizjak
Date: Wed, 25 Oct 2023 16:30:12 +0200
Subject: [committed] i386: Narrow test instructions with immediate operands [PR111698]
To: "gcc-patches@gcc.gnu.org"
List-Id: Gcc-patches mailing list

i386: Narrow test instructions with immediate operands [PR111698]

Narrow test instructions with immediate operands that test memory
locations for zero.  E.g. testl $0x00aa0000, mem can be converted to
testb $0xaa, mem+2.  Reject targets where reading a (possibly unaligned)
part of a memory location after a large write to the same address causes
a store-to-load forwarding stall.

	PR target/111698

gcc/ChangeLog:

	* config/i386/x86-tune.def (X86_TUNE_PARTIAL_MEMORY_READ_STALL):
	New tune.
	* config/i386/i386.h (TARGET_PARTIAL_MEMORY_READ_STALL): New macro.
	* config/i386/i386.md: New peephole pattern to narrow test
	instructions with immediate operands that test memory locations
	for zero.

gcc/testsuite/ChangeLog:

	* gcc.target/i386/pr111698.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Uros.

diff --git a/gcc/config/i386/i386.h b/gcc/config/i386/i386.h
index e4c1fc6eef0..4426b27f4fe 100644
--- a/gcc/config/i386/i386.h
+++ b/gcc/config/i386/i386.h
@@ -311,6 +311,8 @@ extern unsigned char ix86_tune_features[X86_TUNE_LAST];
 #define TARGET_USE_SAHF ix86_tune_features[X86_TUNE_USE_SAHF]
 #define TARGET_MOVX ix86_tune_features[X86_TUNE_MOVX]
 #define TARGET_PARTIAL_REG_STALL ix86_tune_features[X86_TUNE_PARTIAL_REG_STALL]
+#define TARGET_PARTIAL_MEMORY_READ_STALL \
+  ix86_tune_features[X86_TUNE_PARTIAL_MEMORY_READ_STALL]
 #define TARGET_PARTIAL_FLAG_REG_STALL \
   ix86_tune_features[X86_TUNE_PARTIAL_FLAG_REG_STALL]
 #define TARGET_LCP_STALL \
diff --git a/gcc/config/i386/i386.md b/gcc/config/i386/i386.md
index f90cf1ca734..5d8d5b2eae6 100644
--- a/gcc/config/i386/i386.md
+++ b/gcc/config/i386/i386.md
@@ -11100,6 +11100,57 @@ (define_split
   operands[3] = gen_int_mode (INTVAL (operands[3]), QImode);
 })
 
+;; Narrow test instructions with immediate operands that test
+;; memory locations for zero.  E.g. testl $0x00aa0000, mem can be
+;; converted to testb $0xaa, mem+2.  Reject volatile locations and
+;; targets where reading (possibly unaligned) part of memory
+;; location after a large write to the same address causes
+;; store-to-load forwarding stall.
+(define_peephole2
+  [(set (reg:CCZ FLAGS_REG)
+        (compare:CCZ
+          (and:SWI248 (match_operand:SWI248 0 "memory_operand")
+                      (match_operand 1 "const_int_operand"))
+          (const_int 0)))]
+  "!TARGET_PARTIAL_MEMORY_READ_STALL && !MEM_VOLATILE_P (operands[0])"
+  [(set (reg:CCZ FLAGS_REG)
+        (compare:CCZ (match_dup 2) (const_int 0)))]
+{
+  unsigned HOST_WIDE_INT ival = UINTVAL (operands[1]);
+  int first_nonzero_byte, bitsize;
+  rtx new_addr, new_const;
+  machine_mode new_mode;
+
+  if (ival == 0)
+    FAIL;
+
+  /* Clear bits outside mode width.  */
+  ival &= GET_MODE_MASK (<MODE>mode);
+
+  first_nonzero_byte = ctz_hwi (ival) / BITS_PER_UNIT;
+
+  ival >>= first_nonzero_byte * BITS_PER_UNIT;
+
+  bitsize = sizeof (ival) * BITS_PER_UNIT - clz_hwi (ival);
+
+  if (bitsize <= GET_MODE_BITSIZE (QImode))
+    new_mode = QImode;
+  else if (bitsize <= GET_MODE_BITSIZE (HImode))
+    new_mode = HImode;
+  else if (bitsize <= GET_MODE_BITSIZE (SImode))
+    new_mode = SImode;
+  else
+    new_mode = DImode;
+
+  if (GET_MODE_SIZE (new_mode) >= GET_MODE_SIZE (<MODE>mode))
+    FAIL;
+
+  new_addr = adjust_address (operands[0], new_mode, first_nonzero_byte);
+  new_const = gen_int_mode (ival, new_mode);
+
+  operands[2] = gen_rtx_AND (new_mode, new_addr, new_const);
+})
+
 ;; %%% This used to optimize known byte-wide and operations to memory,
 ;; and sometimes to QImode registers.  If this is considered useful,
 ;; it should be done with splitters.
diff --git a/gcc/config/i386/x86-tune.def b/gcc/config/i386/x86-tune.def
index 3636a4a95d8..9d0699ff9b9 100644
--- a/gcc/config/i386/x86-tune.def
+++ b/gcc/config/i386/x86-tune.def
@@ -658,6 +658,14 @@ DEF_TUNE (X86_TUNE_NOT_UNPAIRABLE, "not_unpairable", m_PENT | m_LAKEMONT)
    and can happen in caller/callee saving sequences.  */
 DEF_TUNE (X86_TUNE_PARTIAL_REG_STALL, "partial_reg_stall", m_PPRO)
 
+/* X86_TUNE_PARTIAL_MEMORY_READ_STALL: Reading (possibly unaligned) part of
+   memory location after a large write to the same address causes
+   store-to-load forwarding stall.  */
+DEF_TUNE (X86_TUNE_PARTIAL_MEMORY_READ_STALL, "partial_memory_read_stall",
+          m_386 | m_486 | m_PENT | m_LAKEMONT | m_PPRO | m_P4_NOCONA | m_CORE2
+          | m_SILVERMONT | m_GOLDMONT | m_GOLDMONT_PLUS | m_TREMONT
+          | m_K6_GEODE | m_ATHLON_K8 | m_AMDFAM10)
+
 /* X86_TUNE_PROMOTE_QIMODE: When it is cheap, turn 8bit arithmetic to
    corresponding 32bit arithmetic.  */
 DEF_TUNE (X86_TUNE_PROMOTE_QIMODE, "promote_qimode",
diff --git a/gcc/testsuite/gcc.target/i386/pr111698.c b/gcc/testsuite/gcc.target/i386/pr111698.c
new file mode 100644
index 00000000000..2da6be531a2
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr111698.c
@@ -0,0 +1,19 @@
+/* PR target/111698 */
+/* { dg-options "-O2 -masm=att" } */
+/* { dg-final { scan-assembler-not "testl" } } */
+
+int m;
+
+_Bool foo (void)
+{
+  return m & 0x0a0000;
+}
+
+/* { dg-final { scan-assembler-times "testb" 1 } } */
+
+_Bool bar (void)
+{
+  return m & 0xa0a000;
+}
+
+/* { dg-final { scan-assembler-times "testw" 1 } } */
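
For readers who want to follow the arithmetic of the new peephole without
building GCC, here is a minimal standalone sketch in C, restricted to the
32-bit (testl) case.  It is not part of the commit: struct narrow_test,
narrow_test32, narrowed_result and check_same are hypothetical names made up
for illustration, and the plain loops stand in for ctz_hwi and clz_hwi.  The
narrow load is emulated with shifts, assuming x86's little-endian byte order.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical illustration type; not a GCC data structure.  */
struct narrow_test
{
  unsigned offset;	/* Byte offset added to the memory address.  */
  unsigned size;	/* Width of the narrowed access in bytes: 1 or 2.  */
  uint32_t mask;	/* Immediate operand of the narrowed test.  */
};

/* Mirror the peephole's arithmetic for a 32-bit "testl $imm, mem".
   Return 1 and fill *OUT when a narrower test exists, otherwise 0.  */
static int
narrow_test32 (uint32_t imm, struct narrow_test *out)
{
  unsigned first_nonzero_byte = 0, bitsize = 0;
  uint32_t v, t;

  if (imm == 0)
    return 0;

  /* Skip low-order zero bytes; the patch computes this as
     ctz_hwi (ival) / BITS_PER_UNIT.  */
  while (((imm >> (first_nonzero_byte * 8)) & 0xff) == 0)
    first_nonzero_byte++;
  v = imm >> (first_nonzero_byte * 8);

  /* Number of bits needed to hold the shifted constant.  */
  for (t = v; t != 0; t >>= 1)
    bitsize++;

  /* More than 16 bits would need a 32-bit access again: no gain.  */
  if (bitsize > 16)
    return 0;

  out->offset = first_nonzero_byte;
  out->size = bitsize <= 8 ? 1 : 2;
  out->mask = v;
  return 1;
}

/* Emulate the narrowed little-endian load with shifts, so the result
   does not depend on the byte order of the host running this sketch.  */
static int
narrowed_result (uint32_t mem, const struct narrow_test *n)
{
  uint32_t part = (mem >> (n->offset * 8)) & (n->size == 1 ? 0xffu : 0xffffu);
  return (part & n->mask) != 0;
}

/* Assert that the narrowed test agrees with the full-width test.  */
static void
check_same (uint32_t mem, uint32_t imm)
{
  struct narrow_test n;

  if (narrow_test32 (imm, &n))
    assert (narrowed_result (mem, &n) == ((mem & imm) != 0));
}
```

With these helpers, 0x00aa0000 narrows to a one-byte test of 0xaa at offset 2,
and the 0xa0a000 constant from pr111698.c narrows to a two-byte test of 0xa0a0
at offset 1, which is consistent with the testb and testw that the test case
scans for.  A constant such as 0x00ff00ff spans three bytes and is left alone.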