Message ID | 20240123145951.2092315-1-hjl.tools@gmail.com |
---|---|
Headers |
From: "H.J. Lu" <hjl.tools@gmail.com>; To: gcc-patches@gcc.gnu.org; Cc: ubizjak@gmail.com, hongtao.liu@intel.com, jh@suse.cz; Subject: [PATCH v3 0/2] x86: Don't save callee-saved registers if not needed; Date: Tue, 23 Jan 2024 06:59:49 -0800; Message-ID: <20240123145951.2092315-1-hjl.tools@gmail.com>; X-Mailer: git-send-email 2.43.0 |
Series |
x86: Don't save callee-saved registers if not needed
|
|
Message
H.J. Lu
Jan. 23, 2024, 2:59 p.m. UTC
Changes in v3:

1. Rebase against commit 02e68389494
2. Don't add call_no_callee_saved_registers to machine_function since
   all callee-saved registers are properly clobbered by callee with
   no_callee_saved_registers attribute.

Changes in v2:

1. Rebase against commit f9df00340e3
2. Don't add redundant clobbered_registers check in ix86_expand_call.

In some cases, there is no need to save callee-saved registers:

1. If a noreturn function neither throws nor supports exceptions, it can
   skip saving callee-saved registers.

2. When an interrupt handler is implemented by an assembly stub which does:

   1. Save all registers.
   2. Call a C function.
   3. Restore all registers.
   4. Return from interrupt.

   it is completely unnecessary to save and restore any registers in the C
   function called by the assembly stub, even if they would normally be
   callee-saved.

This patch set adds the no_callee_saved_registers function attribute, which
is complementary to the no_caller_saved_registers function attribute, to
classify the x86 backend call-saved register handling type as one of:

1. Default call-saved registers.
2. No caller-saved registers with no_caller_saved_registers attribute.
3. No callee-saved registers with no_callee_saved_registers attribute.

Functions of the no callee-saved registers type won't save callee-saved
registers. If a noreturn function neither throws nor supports exceptions,
it is classified as the no callee-saved registers type.
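As a rough illustration of the two cases described above, here is a minimal C sketch. The names `NO_CALLEE_SAVED`, `handle_irq` and `fatal` are hypothetical and not part of the patch set. The macro is an assumption made for portability: it expands to the new attribute only when the compiler reports support for it, and to nothing otherwise. Whether the compiler actually omits the register saves depends on its version and options.

```c
#include <stdio.h>
#include <stdlib.h>

/* Use the attribute only where the compiler knows it (an x86 GCC that
   contains this series); elsewhere the macro expands to nothing.
   NO_CALLEE_SAVED is a hypothetical helper name.  */
#if defined(__has_attribute)
# if __has_attribute(no_callee_saved_registers)
#  define NO_CALLEE_SAVED __attribute__((no_callee_saved_registers))
# endif
#endif
#ifndef NO_CALLEE_SAVED
# define NO_CALLEE_SAVED
#endif

/* Hypothetical C body of an interrupt handler.  In the scenario from the
   cover letter, an assembly stub saves all registers, calls this
   function, restores all registers and returns from the interrupt, so
   the function itself has no need to preserve callee-saved registers.
   Returns how often IRQ has been seen, or -1 if IRQ is out of range.  */
NO_CALLEE_SAVED int
handle_irq (int irq)
{
  static int pending[16];

  if (irq < 0 || irq >= 16)
    return -1;
  return ++pending[irq];
}

/* Hypothetical noreturn function.  It never returns to its caller, so
   the caller's values in callee-saved registers are never needed again.
   Compiled as C without exception support, it matches the first case.  */
__attribute__((noreturn)) void
fatal (const char *msg)
{
  fprintf (stderr, "fatal: %s\n", msg);
  exit (EXIT_FAILURE);
}
```

Comparing the prologues of `handle_irq` and `fatal` in the output of `gcc -O2 -S`, with and without the series applied, is one way to see whether the `push` instructions disappear.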
With these changes, __libc_start_main in glibc 2.39, which is a noreturn
function, is changed from

    __libc_start_main:
        endbr64
        push   %r15
        push   %r14
        mov    %rcx,%r14
        push   %r13
        push   %r12
        push   %rbp
        mov    %esi,%ebp
        push   %rbx
        mov    %rdx,%rbx
        sub    $0x28,%rsp
        mov    %rdi,(%rsp)
        mov    %fs:0x28,%rax
        mov    %rax,0x18(%rsp)
        xor    %eax,%eax
        test   %r9,%r9

to

    __libc_start_main:
        endbr64
        sub    $0x28,%rsp
        mov    %esi,%ebp
        mov    %rdx,%rbx
        mov    %rcx,%r14
        mov    %rdi,(%rsp)
        mov    %fs:0x28,%rax
        mov    %rax,0x18(%rsp)
        xor    %eax,%eax
        test   %r9,%r9

In Linux kernel 6.7.0 on x86-64, do_exit is changed from

    do_exit:
        endbr64
        call   <do_exit+0x9>
        push   %r15
        push   %r14
        push   %r13
        push   %r12
        mov    %rdi,%r12
        push   %rbp
        push   %rbx
        mov    %gs:0x0,%rbx
        sub    $0x28,%rsp
        mov    %gs:0x28,%rax
        mov    %rax,0x20(%rsp)
        xor    %eax,%eax
        call   *0x0(%rip)        # <do_exit+0x39>
        test   $0x2,%ah
        je     <do_exit+0x8d3>

to

    do_exit:
        endbr64
        call   <do_exit+0x9>
        sub    $0x28,%rsp
        mov    %rdi,%r12
        mov    %gs:0x28,%rax
        mov    %rax,0x20(%rsp)
        xor    %eax,%eax
        mov    %gs:0x0,%rbx
        call   *0x0(%rip)        # <do_exit+0x2f>
        test   $0x2,%ah
        je     <do_exit+0x8c9>

I compared GCC master branch bootstrap and test times on a slow machine
with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
with the backported patch. The performance data isn't precise since the
measurements were done on different days with different GCC sources under
different 6.6 kernel versions.

GCC master branch build time in seconds:

    before            after             improvement
    30043.75user      30013.16user      0%
    1274.85system     1243.72system     2.4%

GCC master branch test time in seconds (new tests added):

    before            after             improvement
    216035.90user     216547.51user     0
    27365.51system    26658.54system    2.6%

Backported to GCC 13 to rebuild system glibc and kernel on Fedora 39.
Systems perform normally.

H.J. Lu (2):
  x86: Add no_callee_saved_registers function attribute
  x86: Don't save callee-saved registers in noreturn functions

 gcc/config/i386/i386-expand.cc                | 52 +++++++++++++---
 gcc/config/i386/i386-options.cc               | 61 +++++++++++++++----
 gcc/config/i386/i386.cc                       | 57 +++++++++++++----
 gcc/config/i386/i386.h                        | 16 ++++-
 gcc/doc/extend.texi                           |  8 +++
 .../gcc.dg/torture/no-callee-saved-run-1a.c   | 23 +++++++
 .../gcc.dg/torture/no-callee-saved-run-1b.c   | 59 ++++++++++++++++++
 .../gcc.target/i386/no-callee-saved-1.c       | 30 +++++++++
 .../gcc.target/i386/no-callee-saved-10.c      | 46 ++++++++++++++
 .../gcc.target/i386/no-callee-saved-11.c      | 11 ++++
 .../gcc.target/i386/no-callee-saved-12.c      | 10 +++
 .../gcc.target/i386/no-callee-saved-13.c      | 16 +++++
 .../gcc.target/i386/no-callee-saved-14.c      | 16 +++++
 .../gcc.target/i386/no-callee-saved-15.c      | 17 ++++++
 .../gcc.target/i386/no-callee-saved-16.c      | 16 +++++
 .../gcc.target/i386/no-callee-saved-17.c      | 16 +++++
 .../gcc.target/i386/no-callee-saved-18.c      | 51 ++++++++++++++++
 .../gcc.target/i386/no-callee-saved-2.c       | 30 +++++++++
 .../gcc.target/i386/no-callee-saved-3.c       |  8 +++
 .../gcc.target/i386/no-callee-saved-4.c       |  8 +++
 .../gcc.target/i386/no-callee-saved-5.c       | 11 ++++
 .../gcc.target/i386/no-callee-saved-6.c       | 12 ++++
 .../gcc.target/i386/no-callee-saved-7.c       | 49 +++++++++++++++
 .../gcc.target/i386/no-callee-saved-8.c       | 50 +++++++++++++++
 .../gcc.target/i386/no-callee-saved-9.c       | 49 +++++++++++++++
 gcc/testsuite/gcc.target/i386/pr38534-1.c     | 26 ++++++++
 gcc/testsuite/gcc.target/i386/pr38534-2.c     | 18 ++++++
 gcc/testsuite/gcc.target/i386/pr38534-3.c     | 19 ++++++
 gcc/testsuite/gcc.target/i386/pr38534-4.c     | 18 ++++++
 .../gcc.target/i386/stack-check-17.c          | 19 +++---
 30 files changed, 775 insertions(+), 47 deletions(-)
 create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1a.c
 create mode 100644 gcc/testsuite/gcc.dg/torture/no-callee-saved-run-1b.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-10.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-11.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-12.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-13.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-14.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-15.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-16.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-17.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-18.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-4.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-5.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-6.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-7.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-8.c
 create mode 100644 gcc/testsuite/gcc.target/i386/no-callee-saved-9.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-4.c
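The "improvement" column in the timing tables of the cover letter appears to be the relative reduction from "before" to "after". The sketch below recomputes it under that assumption; the helper names `improvement_percent` and `print_row` are hypothetical. With the figures quoted in the message it gives roughly 0.1% and 2.4% for build user and system time, and roughly -0.2% and 2.6% for test user and system time, so the test user time rose slightly where the table shows 0.

```c
#include <stdio.h>

/* Relative improvement in percent, assuming the table's column means
   100 * (before - after) / before.  Positive when "after" is smaller.  */
double
improvement_percent (double before, double after)
{
  return 100.0 * (before - after) / before;
}

/* Print one table row: label, both timings and the computed percentage.  */
void
print_row (const char *label, double before, double after)
{
  printf ("%-14s %12.2f %12.2f %+7.2f%%\n",
          label, before, after, improvement_percent (before, after));
}
```

For example, `print_row ("build system", 1274.85, 1243.72)` prints the row for the build system time quoted in the message.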
Comments
On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu <hjl.tools@gmail.com> wrote:
>
> Changes in v3:
>
> 1. Rebase against commit 02e68389494
> 2. Don't add call_no_callee_saved_registers to machine_function since
>    all callee-saved registers are properly clobbered by callee with
>    no_callee_saved_registers attribute.
>
The patch LGTM. It should be low risk since there is already a
no_caller_saved_registers attribute; the patch just extends the same
approach to no_callee_saved_registers.
If there are no objections or concerns in the next couple of days,
I'm OK with the patch going into GCC 14 and being backported.
On Wed, Jan 24, 2024 at 7:36 PM Hongtao Liu <crazylht@gmail.com> wrote:
>
> On Tue, Jan 23, 2024 at 11:00 PM H.J. Lu <hjl.tools@gmail.com> wrote:
> >
> > Changes in v3:
> >
> > 1. Rebase against commit 02e68389494
> > 2. Don't add call_no_callee_saved_registers to machine_function since
> >    all callee-saved registers are properly clobbered by callee with
> >    no_callee_saved_registers attribute.
> >
> The patch LGTM. It should be low risk since there is already a
> no_caller_saved_registers attribute; the patch just extends the same
> approach to no_callee_saved_registers.
> If there are no objections or concerns in the next couple of days,
> I'm OK with the patch going into GCC 14 and being backported.

I am checking it in.

Thanks.

H.J.