From patchwork Mon Jul 24 20:35:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 125187
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v3] x86: Properly find the maximum stack slot alignment
Date: Mon, 24 Jul 2023 13:35:36 -0700
Message-ID: <20230724203536.40091-1-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.41.0
From: "H.J. Lu"
Reply-To: "H.J. Lu"

Don't assume that stack slots can only be accessed by stack or frame
registers.  We first find all registers defined by stack or frame
registers.  Then check memory accesses by such registers, including
stack and frame registers.

gcc/

	PR target/109780
	* config/i386/i386.cc (ix86_update_stack_alignment): New.
	(ix86_find_all_reg_use): Likewise.
	(ix86_find_max_used_stack_alignment): Also check memory accesses
	from registers defined by stack or frame registers.

gcc/testsuite/

	PR target/109780
	* g++.target/i386/pr109780-1.C: New test.
	* gcc.target/i386/pr109780-1.c: Likewise.
	* gcc.target/i386/pr109780-2.c: Likewise.
---
 gcc/config/i386/i386.cc                    | 128 +++++++++++++++++----
 gcc/testsuite/g++.target/i386/pr109780-1.C |  72 ++++++++++++
 gcc/testsuite/gcc.target/i386/pr109780-1.c |  14 +++
 gcc/testsuite/gcc.target/i386/pr109780-2.c |  21 ++++
 4 files changed, 214 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/g++.target/i386/pr109780-1.C
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109780-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr109780-2.c

diff --git a/gcc/config/i386/i386.cc b/gcc/config/i386/i386.cc
index caca74d6dec..b71fd9401ef 100644
--- a/gcc/config/i386/i386.cc
+++ b/gcc/config/i386/i386.cc
@@ -8084,6 +8084,65 @@ output_probe_stack_range (rtx reg, rtx end)
   return "";
 }
 
+/* Update the maximum stack slot alignment from memory alignment in
+   PAT.  */
+
+static void
+ix86_update_stack_alignment (rtx, const_rtx pat, void *data)
+{
+  /* This insn may reference stack slot.  Update the maximum stack slot
+     alignment.  */
+  subrtx_iterator::array_type array;
+  FOR_EACH_SUBRTX (iter, array, pat, ALL)
+    if (MEM_P (*iter))
+      {
+        unsigned int alignment = MEM_ALIGN (*iter);
+        unsigned int *stack_alignment
+          = (unsigned int *) data;
+        if (alignment > *stack_alignment)
+          *stack_alignment = alignment;
+        break;
+      }
+}
+
+/* Find all registers defined with REG.  */
+
+static void
+ix86_find_all_reg_use (HARD_REG_SET &stack_slot_access,
+                       unsigned int reg, auto_bitmap &worklist)
+{
+  for (df_ref ref = DF_REG_USE_CHAIN (reg);
+       ref != NULL;
+       ref = DF_REF_NEXT_REG (ref))
+    {
+      if (DF_REF_IS_ARTIFICIAL (ref))
+        continue;
+
+      rtx_insn *insn = DF_REF_INSN (ref);
+      if (!NONDEBUG_INSN_P (insn))
+        continue;
+
+      rtx set = single_set (insn);
+      if (!set)
+        continue;
+
+      rtx src = SET_SRC (set);
+      if (MEM_P (src))
+        continue;
+
+      rtx dest = SET_DEST (set);
+      if (!REG_P (dest))
+        continue;
+
+      if (TEST_HARD_REG_BIT (stack_slot_access, REGNO (dest)))
+        continue;
+
+      /* Add this register to stack_slot_access.  */
+      add_to_hard_reg_set (&stack_slot_access, Pmode, REGNO (dest));
+      bitmap_set_bit (worklist, REGNO (dest));
+    }
+}
+
 /* Set stack_frame_required to false if stack frame isn't required.
    Update STACK_ALIGNMENT to the largest alignment, in bits, of stack
    slot used if stack frame is required and CHECK_STACK_SLOT is true.  */
@@ -8102,10 +8161,6 @@ ix86_find_max_used_stack_alignment (unsigned int &stack_alignment,
   add_to_hard_reg_set (&set_up_by_prologue, Pmode,
                        HARD_FRAME_POINTER_REGNUM);
 
-  /* The preferred stack alignment is the minimum stack alignment.  */
-  if (stack_alignment > crtl->preferred_stack_boundary)
-    stack_alignment = crtl->preferred_stack_boundary;
-
   bool require_stack_frame = false;
 
   FOR_EACH_BB_FN (bb, cfun)
@@ -8117,27 +8172,58 @@ ix86_find_max_used_stack_alignment (unsigned int &stack_alignment,
                                        set_up_by_prologue))
           {
             require_stack_frame = true;
-
-            if (check_stack_slot)
-              {
-                /* Find the maximum stack alignment.  */
-                subrtx_iterator::array_type array;
-                FOR_EACH_SUBRTX (iter, array, PATTERN (insn), ALL)
-                  if (MEM_P (*iter)
-                      && (reg_mentioned_p (stack_pointer_rtx,
-                                           *iter)
-                          || reg_mentioned_p (frame_pointer_rtx,
-                                              *iter)))
-                    {
-                      unsigned int alignment = MEM_ALIGN (*iter);
-                      if (alignment > stack_alignment)
-                        stack_alignment = alignment;
-                    }
-              }
+            break;
           }
     }
 
   cfun->machine->stack_frame_required = require_stack_frame;
+
+  /* Stop if we don't need to check stack slot.  */
+  if (!check_stack_slot)
+    return;
+
+  /* The preferred stack alignment is the minimum stack alignment.  */
+  if (stack_alignment > crtl->preferred_stack_boundary)
+    stack_alignment = crtl->preferred_stack_boundary;
+
+  HARD_REG_SET stack_slot_access;
+  CLEAR_HARD_REG_SET (stack_slot_access);
+
+  /* Stack slot can be accessed by stack pointer, frame pointer or
+     registers defined by stack pointer or frame pointer.  */
+  auto_bitmap worklist;
+  add_to_hard_reg_set (&stack_slot_access, Pmode,
+                       STACK_POINTER_REGNUM);
+  bitmap_set_bit (worklist, STACK_POINTER_REGNUM);
+  if (frame_pointer_needed)
+    {
+      add_to_hard_reg_set (&stack_slot_access, Pmode,
+                           HARD_FRAME_POINTER_REGNUM);
+      bitmap_set_bit (worklist, HARD_FRAME_POINTER_REGNUM);
+    }
+  unsigned int reg;
+  do
+    {
+      reg = bitmap_clear_first_set_bit (worklist);
+      ix86_find_all_reg_use (stack_slot_access, reg, worklist);
+    }
+  while (!bitmap_empty_p (worklist));
+
+  hard_reg_set_iterator hrsi;
+  EXECUTE_IF_SET_IN_HARD_REG_SET (stack_slot_access, 0, reg, hrsi)
+    for (df_ref ref = DF_REG_USE_CHAIN (reg);
+         ref != NULL;
+         ref = DF_REF_NEXT_REG (ref))
+      {
+        if (DF_REF_IS_ARTIFICIAL (ref))
+          continue;
+
+        rtx_insn *insn = DF_REF_INSN (ref);
+        if (!NONDEBUG_INSN_P (insn))
+          continue;
+        note_stores (insn, ix86_update_stack_alignment,
+                     &stack_alignment);
+      }
 }
 
 /* Finalize stack_realign_needed and frame_pointer_needed flags, which
diff --git a/gcc/testsuite/g++.target/i386/pr109780-1.C b/gcc/testsuite/g++.target/i386/pr109780-1.C
new file mode 100644
index 00000000000..7e3eabdec94
--- /dev/null
+++ b/gcc/testsuite/g++.target/i386/pr109780-1.C
@@ -0,0 +1,72 @@
+/* { dg-do compile } */
+/* { dg-require-effective-target c++17 } */
+/* { dg-options "-O2 -mavx2 -mtune=haswell" } */
+
+template struct remove_reference {
+  using type = __remove_reference(_Tp);
+};
+template struct MaybeStorageBase {
+  T val;
+  struct Union {
+    ~Union();
+  } mStorage;
+};
+template struct MaybeStorage : MaybeStorageBase {
+  char mIsSome;
+};
+template ::type>
+constexpr MaybeStorage Some(T &&);
+template constexpr MaybeStorage Some(T &&aValue) {
+  return {aValue};
+}
+template struct Span {
+  int operator[](long idx) {
+    int *__trans_tmp_4;
+    if (__builtin_expect(idx, 0))
+      *(int *)__null = false;
+    __trans_tmp_4 = storage_.data();
+    return __trans_tmp_4[idx];
+  }
+  struct {
+    int *data() { return data_; }
+    int *data_;
+  } storage_;
+};
+struct Variant {
+  template Variant(RefT) {}
+};
+long from_i, from___trans_tmp_9;
+namespace js::intl {
+struct DecimalNumber {
+  Variant string_;
+  unsigned long significandStart_;
+  unsigned long significandEnd_;
+  bool zero_ = false;
+  bool negative_;
+  template DecimalNumber(CharT string) : string_(string) {}
+  template
+  static MaybeStorage from(Span);
+  void from();
+};
+} // namespace js::intl
+void js::intl::DecimalNumber::from() {
+  Span __trans_tmp_3;
+  from(__trans_tmp_3);
+}
+template
+MaybeStorage
+js::intl::DecimalNumber::from(Span chars) {
+  DecimalNumber number(chars);
+  if (auto ch = chars[from_i]) {
+    from_i++;
+    number.negative_ = ch == '-';
+  }
+  while (from___trans_tmp_9 && chars[from_i])
+    ;
+  if (chars[from_i])
+    while (chars[from_i - 1])
+      number.zero_ = true;
+  return Some(number);
+}
+
+/* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-32,\[^\\n\]*sp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr109780-1.c b/gcc/testsuite/gcc.target/i386/pr109780-1.c
new file mode 100644
index 00000000000..6b06947f2a5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr109780-1.c
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=skylake" } */
+
+char perm[64];
+
+void
+__attribute__((noipa))
+foo (int n)
+{
+  for (int i = 0; i < n; ++i)
+    perm[i] = i;
+}
+
+/* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-32,\[^\\n\]*sp" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr109780-2.c b/gcc/testsuite/gcc.target/i386/pr109780-2.c
new file mode 100644
index 00000000000..152da06c6ad
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr109780-2.c
@@ -0,0 +1,21 @@
+/* { dg-do compile } */
+/* { dg-options "-O3 -march=skylake" } */
+
+#define N 9
+
+void
+f (double x, double y, double *res)
+{
+  y = -y;
+  for (int i = 0; i < N; ++i)
+    {
+      double tmp = y;
+      y = x;
+      x = tmp;
+      res[i] = i;
+    }
+  res[N] = y * y;
+  res[N + 1] = x;
+}
+
+/* { dg-final { scan-assembler-not "and\[lq\]?\[^\\n\]*-32,\[^\\n\]*sp" } } */
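
The two-step search described in the commit message (first collect every register defined from the stack or frame pointer, then inspect memory accesses through those registers) can be illustrated outside of GCC with a toy model. This is only a sketch under stated assumptions: `ToyInsn`, `kStackPointer`, `derived_from`, `max_stack_alignment` and `example_alignment` are hypothetical names, not GCC APIs. The model also ignores details the patch handles, such as skipping definitions whose source is a memory load.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical toy instruction: either "dest is computed from src" or
// "memory is accessed through base register src with a given alignment".
struct ToyInsn {
  enum Kind { DefReg, MemAccess } kind;
  int src;        // register used by the instruction
  int dest;       // register defined (DefReg only)
  unsigned align; // alignment in bits (MemAccess only)
};

constexpr int kStackPointer = 7; // hypothetical register number

// Step 1: worklist propagation. Start from the seed registers and keep
// adding every register defined from an already-reached register.
std::set<int> derived_from(const std::vector<ToyInsn> &insns,
                           std::set<int> seeds) {
  std::vector<int> worklist(seeds.begin(), seeds.end());
  std::set<int> reached = seeds;
  while (!worklist.empty()) {
    int reg = worklist.back();
    worklist.pop_back();
    for (const ToyInsn &insn : insns)
      if (insn.kind == ToyInsn::DefReg && insn.src == reg
          && reached.insert(insn.dest).second)
        worklist.push_back(insn.dest); // newly reached, visit its uses too
  }
  return reached;
}

// Step 2: take the largest alignment among memory accesses whose base
// register was reached in step 1, starting from an initial alignment.
unsigned max_stack_alignment(const std::vector<ToyInsn> &insns,
                             unsigned initial) {
  std::set<int> regs = derived_from(insns, {kStackPointer});
  unsigned result = initial;
  for (const ToyInsn &insn : insns)
    if (insn.kind == ToyInsn::MemAccess && regs.count(insn.src)
        && insn.align > result)
      result = insn.align;
  return result;
}

// Small scenario: r1 is derived from the stack pointer, r2 from r1, and a
// 256-bit aligned access goes through r2. Register 3 is unrelated.
unsigned example_alignment() {
  std::vector<ToyInsn> insns = {
    {ToyInsn::DefReg, kStackPointer, 1, 0},
    {ToyInsn::DefReg, 1, 2, 0},
    {ToyInsn::MemAccess, 2, 0, 256},
    {ToyInsn::MemAccess, 3, 0, 512},
    {ToyInsn::MemAccess, kStackPointer, 0, 64},
  };
  unsigned result = max_stack_alignment(insns, 128);
  assert(result >= 128); // never drops below the initial alignment
  return result;
}
```

In this scenario a check that only looks for the stack pointer inside the memory address would miss the access through `r2`, while the worklist version accounts for it and ignores the unrelated access through register 3.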