From patchwork Mon Jan 22 15:45:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 19265
From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, jh@suse.cz
Subject: [PATCH v2 0/2] x86: Don't save callee-saved registers if not needed
Date: Mon, 22 Jan 2024 07:45:38 -0800
Message-ID: <20240122154540.65652-1-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.43.0

Changes in v2:

1. Rebase against commit f9df00340e3
2. Don't add redundant clobbered_registers check in ix86_expand_call.

In some cases, there is no need to save callee-saved registers:

1. If a noreturn function neither throws nor supports exceptions, it can
   skip saving callee-saved registers.

2. When an interrupt handler is implemented by an assembly stub which does:

     1. Save all registers.
     2. Call a C function.
     3. Restore all registers.
     4. Return from interrupt.

   it is completely unnecessary to save and restore any registers in the
   C function called by the assembly stub, even if they would normally be
   callee-saved.

This patch set adds the no_callee_saved_registers function attribute, which
is complementary to the no_caller_saved_registers function attribute, and
classifies x86 backend call-saved register handling into three types:

1. Default call-saved registers.
2. No caller-saved registers with no_caller_saved_registers attribute.
3. No callee-saved registers with no_callee_saved_registers attribute.

Functions of the no callee-saved registers type won't save callee-saved
registers.  If a noreturn function neither throws nor supports exceptions,
it is classified as the no callee-saved registers type.
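As an illustration of the two cases above, here is a minimal sketch in C.
The names handle_irq, pending_mask, fatal and the NO_CALLEE_SAVED macro are
hypothetical and do not come from the patches; the assembly stub is assumed
to exist elsewhere and is not shown.  The attribute is applied only when the
compiler reports support for it through __has_attribute, so the file should
still build (with default register handling) on compilers without this
series.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Use the attribute only where the compiler reports support for it;
   otherwise the macro expands to nothing.  */
#if defined(__has_attribute)
# if __has_attribute(no_callee_saved_registers)
#  define NO_CALLEE_SAVED __attribute__ ((no_callee_saved_registers))
# endif
#endif
#ifndef NO_CALLEE_SAVED
# define NO_CALLEE_SAVED
#endif

/* Hypothetical state updated by the handler body.  */
static volatile unsigned long pending_events;

/* Case 2: C body of an interrupt handler.  The assumed assembly stub
   saves and restores all registers around the call, so this function
   does not need to preserve callee-saved registers itself.  */
NO_CALLEE_SAVED void
handle_irq (unsigned long vector)
{
  pending_events |= 1UL << (vector % 32);
}

/* Plain accessor with default register handling.  */
unsigned long
pending_mask (void)
{
  return pending_events;
}

/* Case 1: a noreturn function that does not throw.  No new attribute is
   needed here; whether the prologue pushes are dropped depends on the
   compiler version and options.  */
__attribute__ ((noreturn, nothrow)) void
fatal (const char *msg)
{
  assert (msg != NULL);
  fputs (msg, stderr);
  exit (EXIT_FAILURE);
}
```

Comparing the output of "gcc -O2 -S" for handle_irq with and without the
attribute is one way to check whether the push/pop pairs for callee-saved
registers disappear on a given compiler.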
With these changes, __libc_start_main in glibc 2.39, which is a noreturn
function, is changed from

__libc_start_main:
        endbr64
        push   %r15
        push   %r14
        mov    %rcx,%r14
        push   %r13
        push   %r12
        push   %rbp
        mov    %esi,%ebp
        push   %rbx
        mov    %rdx,%rbx
        sub    $0x28,%rsp
        mov    %rdi,(%rsp)
        mov    %fs:0x28,%rax
        mov    %rax,0x18(%rsp)
        xor    %eax,%eax
        test   %r9,%r9

to

__libc_start_main:
        endbr64
        sub    $0x28,%rsp
        mov    %esi,%ebp
        mov    %rdx,%rbx
        mov    %rcx,%r14
        mov    %rdi,(%rsp)
        mov    %fs:0x28,%rax
        mov    %rax,0x18(%rsp)
        xor    %eax,%eax
        test   %r9,%r9

In Linux kernel 6.7.0 on x86-64, do_exit is changed from

do_exit:
        endbr64
        call
        push   %r15
        push   %r14
        push   %r13
        push   %r12
        mov    %rdi,%r12
        push   %rbp
        push   %rbx
        mov    %gs:0x0,%rbx
        sub    $0x28,%rsp
        mov    %gs:0x28,%rax
        mov    %rax,0x20(%rsp)
        xor    %eax,%eax
        call   *0x0(%rip)        #
        test   $0x2,%ah
        je

to

do_exit:
        endbr64
        call
        sub    $0x28,%rsp
        mov    %rdi,%r12
        mov    %gs:0x28,%rax
        mov    %rax,0x20(%rsp)
        xor    %eax,%eax
        mov    %gs:0x0,%rbx
        call   *0x0(%rip)        #
        test   $0x2,%ah
        je

I compared GCC master branch bootstrap and test times on a slow machine
with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
with the backported patch.  The performance data isn't precise since the
measurements were done on different days with different GCC sources under
different 6.6 kernel versions.

GCC master branch build time in seconds:

before                after                 improvement
30043.75user          30013.16user          0%
1274.85system         1243.72system         2.4%

GCC master branch test time in seconds (new tests added):

before                after                 improvement
216035.90user         216547.51user         0
27365.51system        26658.54system        2.6%

Backported to GCC 13 to rebuild system glibc and kernel on Fedora 39.
Systems perform normally.

H.J. Lu (2):
  x86: Add no_callee_saved_registers function attribute
  x86: Don't save callee-saved registers in noreturn functions

 gcc/config/i386/i386-expand.cc                | 58 +++++++++++++--
 gcc/config/i386/i386-options.cc               | 61 ++++++++++++----
 gcc/config/i386/i386.cc                       | 70 +++++++++++++++----
 gcc/config/i386/i386.h                        | 20 +++++-
 gcc/doc/extend.texi                           |  8 +++
 .../gcc.dg/torture/no-callee-saved-run-1a.c   | 23 ++++++
 .../gcc.dg/torture/no-callee-saved-run-1b.c   | 59 ++++++++++++++++
 .../gcc.target/i386/no-callee-saved-1.c       | 30 ++++++++
 .../gcc.target/i386/no-callee-saved-10.c      | 46 ++++++++++++
 .../gcc.target/i386/no-callee-saved-11.c      | 11 +++
 .../gcc.target/i386/no-callee-saved-12.c      | 10 +++
 .../gcc.target/i386/no-callee-saved-13.c      | 16 +++++
 .../gcc.target/i386/no-callee-saved-14.c      | 16 +++++
 .../gcc.target/i386/no-callee-saved-15.c      | 17 +++++
 .../gcc.target/i386/no-callee-saved-16.c      | 16 +++++
 .../gcc.target/i386/no-callee-saved-17.c      | 16 +++++
 .../gcc.target/i386/no-callee-saved-18.c      | 51 ++++++++++++++
 .../gcc.target/i386/no-callee-saved-2.c       | 30 ++++++++
 .../gcc.target/i386/no-callee-saved-3.c       |  8 +++
 .../gcc.target/i386/no-callee-saved-4.c       |  8 +++
 .../gcc.target/i386/no-callee-saved-5.c       | 11 +++
 .../gcc.target/i386/no-callee-saved-6.c       | 12 ++++
 .../gcc.target/i386/no-callee-saved-7.c       | 49 +++++++++++++
 .../gcc.target/i386/no-callee-saved-8.c       | 50 +++++++++++++
 .../gcc.target/i386/no-callee-saved-9.c       | 49 +++++++++++++
 gcc/testsuite/gcc.target/i386/pr38534-1.c     | 26 +++++++
 gcc/testsuite/gcc.target/i386/pr38534-2.c     | 18 +++++
 gcc/testsuite/gcc.target/i386/pr38534-3.c     | 19 +++++
 gcc/testsuite/gcc.target/i386/pr38534-4.c     | 18 +++++
 .../gcc.target/i386/stack-check-17.c          | 19 ++---
 30 files changed, 797 insertions(+), 48 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1a.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-14.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-15.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-16.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-17.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-18.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-9.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-4.c