From patchwork Mon Jan 22 15:45:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "H.J. Lu"
X-Patchwork-Id: 190211
From: "H.J. Lu"
To: gcc-patches@gcc.gnu.org
Cc: ubizjak@gmail.com, hongtao.liu@intel.com, jh@suse.cz
Subject: [PATCH v2 2/2] x86: Don't save callee-saved registers in noreturn functions
Date: Mon, 22 Jan 2024 07:45:40 -0800
Message-ID: <20240122154540.65652-3-hjl.tools@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240122154540.65652-1-hjl.tools@gmail.com>
References: <20240122154540.65652-1-hjl.tools@gmail.com>
List-Id: Gcc-patches mailing list

There is no need to save callee-saved registers in noreturn functions
if they neither throw nor support exceptions.  We can treat them the
same as functions with the no_callee_saved_registers attribute.

Adjust stack-check-17.c for a noreturn function, which no longer saves
any registers.

With this change, __libc_start_main in glibc 2.39, which is a noreturn
function, is changed from

__libc_start_main:
        endbr64
        push   %r15
        push   %r14
        mov    %rcx,%r14
        push   %r13
        push   %r12
        push   %rbp
        mov    %esi,%ebp
        push   %rbx
        mov    %rdx,%rbx
        sub    $0x28,%rsp
        mov    %rdi,(%rsp)
        mov    %fs:0x28,%rax
        mov    %rax,0x18(%rsp)
        xor    %eax,%eax
        test   %r9,%r9

to

__libc_start_main:
        endbr64
        sub    $0x28,%rsp
        mov    %esi,%ebp
        mov    %rdx,%rbx
        mov    %rcx,%r14
        mov    %rdi,(%rsp)
        mov    %fs:0x28,%rax
        mov    %rax,0x18(%rsp)
        xor    %eax,%eax
        test   %r9,%r9

In Linux kernel 6.7.0 on x86-64, do_exit is changed from

do_exit:
        endbr64
        call
        push   %r15
        push   %r14
        push   %r13
        push   %r12
        mov    %rdi,%r12
        push   %rbp
        push   %rbx
        mov    %gs:0x0,%rbx
        sub    $0x28,%rsp
        mov    %gs:0x28,%rax
        mov    %rax,0x20(%rsp)
        xor    %eax,%eax
        call   *0x0(%rip)        #
        test   $0x2,%ah
        je

to

do_exit:
        endbr64
        call
        sub    $0x28,%rsp
        mov    %rdi,%r12
        mov    %gs:0x28,%rax
        mov    %rax,0x20(%rsp)
        xor    %eax,%eax
        mov    %gs:0x0,%rbx
        call   *0x0(%rip)        #
        test   $0x2,%ah
        je

I compared GCC master branch bootstrap and test times on a slow machine
with 6.6 Linux kernels compiled with the original GCC 13 and the GCC 13
with the backported patch.  The performance data isn't precise since the
measurements were done on different days with different GCC sources under
different 6.6 kernel versions.

GCC master branch build time in seconds:

before                after                 improvement
30043.75user          30013.16user          0%
1274.85system         1243.72system         2.4%

GCC master branch test time in seconds (new tests added):

before                after                 improvement
216035.90user         216547.51user         0%
27365.51system        26658.54system        2.6%

gcc/

        PR target/38534
        * config/i386/i386-options.cc (ix86_set_func_type): Don't
        save and restore callee saved registers for a noreturn function
        with nothrow or compiled with -fno-exceptions.

gcc/testsuite/

        PR target/38534
        * gcc.target/i386/pr38534-1.c: New file.
        * gcc.target/i386/pr38534-2.c: Likewise.
        * gcc.target/i386/pr38534-3.c: Likewise.
        * gcc.target/i386/pr38534-4.c: Likewise.
        * gcc.target/i386/stack-check-17.c: Updated.
---
 gcc/config/i386/i386-options.cc               | 16 ++++++++++--
 gcc/testsuite/gcc.target/i386/pr38534-1.c     | 26 +++++++++++++++++++
 gcc/testsuite/gcc.target/i386/pr38534-2.c     | 18 +++++++++++++
 gcc/testsuite/gcc.target/i386/pr38534-3.c     | 19 ++++++++++++++
 gcc/testsuite/gcc.target/i386/pr38534-4.c     | 18 +++++++++++++
 .../gcc.target/i386/stack-check-17.c          | 19 +++++---------
 6 files changed, 102 insertions(+), 14 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-1.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-2.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-3.c
 create mode 100644 gcc/testsuite/gcc.target/i386/pr38534-4.c

diff --git a/gcc/config/i386/i386-options.cc b/gcc/config/i386/i386-options.cc
index 0cdea30599e..f965568947c 100644
--- a/gcc/config/i386/i386-options.cc
+++ b/gcc/config/i386/i386-options.cc
@@ -3371,9 +3371,21 @@ ix86_simd_clone_adjust (struct cgraph_node *node)
 static void
 ix86_set_func_type (tree fndecl)
 {
+  /* No need to save and restore callee-saved registers for a noreturn
+     function with nothrow or compiled with -fno-exceptions.
+
+     NB: Don't use TREE_THIS_VOLATILE to check if this is a noreturn
+     function.  The local-pure-const pass turns an interrupt function
+     into a noreturn function by setting TREE_THIS_VOLATILE.  Normally
+     the local-pure-const pass is run after ix86_set_func_type is called.
+     When the local-pure-const pass is enabled for LTO, the interrupt
+     function is marked as noreturn in the IR output, which leads to the
+     incompatible attribute error in LTO1.  */
   bool has_no_callee_saved_registers
-    = lookup_attribute ("no_callee_saved_registers",
-                        TYPE_ATTRIBUTES (TREE_TYPE (fndecl)));
+    = (((TREE_NOTHROW (fndecl) || !flag_exceptions)
+        && lookup_attribute ("noreturn", DECL_ATTRIBUTES (fndecl)))
+       || lookup_attribute ("no_callee_saved_registers",
+                            TYPE_ATTRIBUTES (TREE_TYPE (fndecl))));
 
   if (cfun->machine->func_type == TYPE_UNKNOWN)
     {
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-1.c b/gcc/testsuite/gcc.target/i386/pr38534-1.c
new file mode 100644
index 00000000000..9297959e759
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr38534-1.c
@@ -0,0 +1,26 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+
+#define ARRAY_SIZE 256
+
+extern int array[ARRAY_SIZE][ARRAY_SIZE][ARRAY_SIZE];
+extern int value (int, int, int)
+#ifndef __x86_64__
+__attribute__ ((regparm(3)))
+#endif
+;
+
+void
+__attribute__((noreturn))
+no_return_to_caller (void)
+{
+  unsigned i, j, k;
+  for (i = ARRAY_SIZE; i > 0; --i)
+    for (j = ARRAY_SIZE; j > 0; --j)
+      for (k = ARRAY_SIZE; k > 0; --k)
+        array[i - 1][j - 1][k - 1] = value (i, j, k);
+  while (1);
+}
+
+/* { dg-final { scan-assembler-not "push" } } */
+/* { dg-final { scan-assembler-not "pop" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-2.c b/gcc/testsuite/gcc.target/i386/pr38534-2.c
new file mode 100644
index 00000000000..1fb01363273
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr38534-2.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+
+extern void bar (void) __attribute__ ((no_callee_saved_registers));
+extern void fn (void) __attribute__ ((noreturn));
+
+__attribute__ ((noreturn))
+void
+foo (void)
+{
+  bar ();
+  fn ();
+}
+
+/* { dg-final { scan-assembler-not "push" } } */
+/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "jmp\[\\t \]+_?bar" } } */
+/* { dg-final { scan-assembler "call\[\\t \]+_?bar" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-3.c b/gcc/testsuite/gcc.target/i386/pr38534-3.c
new file mode 100644
index 00000000000..87fc35f3fe9
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr38534-3.c
@@ -0,0 +1,19 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+
+typedef void (*fn_t) (void) __attribute__ ((no_callee_saved_registers));
+extern fn_t bar;
+extern void fn (void) __attribute__ ((noreturn));
+
+__attribute__ ((noreturn))
+void
+foo (void)
+{
+  bar ();
+  fn ();
+}
+
+/* { dg-final { scan-assembler-not "push" } } */
+/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "jmp" } } */
+/* { dg-final { scan-assembler "call\[\\t \]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/pr38534-4.c b/gcc/testsuite/gcc.target/i386/pr38534-4.c
new file mode 100644
index 00000000000..561ebeef194
--- /dev/null
+++ b/gcc/testsuite/gcc.target/i386/pr38534-4.c
@@ -0,0 +1,18 @@
+/* { dg-do compile } */
+/* { dg-options "-O2 -mtune-ctrl=^prologue_using_move,^epilogue_using_move" } */
+
+typedef void (*fn_t) (void) __attribute__ ((no_callee_saved_registers));
+extern void fn (void) __attribute__ ((noreturn));
+
+__attribute__ ((noreturn))
+void
+foo (fn_t bar)
+{
+  bar ();
+  fn ();
+}
+
+/* { dg-final { scan-assembler-not "push" } } */
+/* { dg-final { scan-assembler-not "pop" } } */
+/* { dg-final { scan-assembler-not "jmp" } } */
+/* { dg-final { scan-assembler "call\[\\t \]+" } } */
diff --git a/gcc/testsuite/gcc.target/i386/stack-check-17.c b/gcc/testsuite/gcc.target/i386/stack-check-17.c
index b3e41cb3d25..061484e1319 100644
--- a/gcc/testsuite/gcc.target/i386/stack-check-17.c
+++ b/gcc/testsuite/gcc.target/i386/stack-check-17.c
@@ -23,19 +23,14 @@ f3 (void)
 /* Verify no explicit probes.  */
 /* { dg-final { scan-assembler-not "or\[ql\]" } } */
 
-/* We also want to verify we did not use a push/pop sequence
-   to probe *sp as the callee register saves are sufficient
-   to probe *sp.
-
-   y0/y1 are live across the call and thus must be allocated
+/* y0/y1 are live across the call and thus must be allocated
    into either a stack slot or callee saved register.  The former
    would be rather dumb.  So assume it does not happen.
 
-   So search for two/four pushes for the callee register saves/argument pushes
-   (plus one for the PIC register if needed on ia32) and no pops (since the
-   function has no reachable epilogue).  */
-/* { dg-final { scan-assembler-times "push\[ql\]" 2 { target { ! ia32 } } } } */
-/* { dg-final { scan-assembler-times "push\[ql\]" 4 { target { ia32 && nonpic } } } } */
-/* { dg-final { scan-assembler-times "push\[ql\]" 5 { target { ia32 && { ! nonpic } } } } } */
-/* { dg-final { scan-assembler-not "pop" } } */
+   So search for a push/pop sequence for stack probe and 2 argument
+   pushes on ia32.  There is no need to save and restore the PIC
+   register on ia32 for a noreturn function.  */
+/* { dg-final { scan-assembler-times "push\[ql\]" 1 { target { ! ia32 } } } } */
+/* { dg-final { scan-assembler-times "push\[ql\]" 3 { target ia32 } } } */
+/* { dg-final { scan-assembler-times "pop" 1 } } */
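
Illustration only, not part of the patch: the condition that the
i386-options.cc hunk computes for has_no_callee_saved_registers can be
modeled outside of GCC roughly as follows.  This is a sketch under stated
assumptions.  struct fn_model, its fields, and both functions are
hypothetical names invented here; they only stand in for TREE_NOTHROW,
flag_exceptions and the two lookup_attribute calls, and are not GCC
interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the properties the patch queries on FNDECL.
   None of these names exist in GCC.  */
struct fn_model
{
  bool nothrow;                        /* models TREE_NOTHROW (fndecl) */
  bool noreturn_attr;                  /* models the "noreturn" lookup */
  bool no_callee_saved_registers_attr; /* models the type attribute lookup */
};

/* Return true if saving callee-saved registers may be skipped.
   EXCEPTIONS_ENABLED models flag_exceptions.  */
static bool
model_has_no_callee_saved_registers (const struct fn_model *fn,
                                     bool exceptions_enabled)
{
  return (((fn->nothrow || !exceptions_enabled) && fn->noreturn_attr)
          || fn->no_callee_saved_registers_attr);
}

/* A few cases worked through by hand against the expression above.  */
static void
model_examples (void)
{
  struct fn_model noreturn_nothrow = { true, true, false };
  struct fn_model noreturn_may_throw = { false, true, false };
  struct fn_model plain = { true, false, false };
  struct fn_model explicit_attr = { false, false, true };

  /* noreturn and nothrow: no saves needed.  */
  assert (model_has_no_callee_saved_registers (&noreturn_nothrow, true));
  /* noreturn but may throw, with exceptions enabled: keep the saves.  */
  assert (!model_has_no_callee_saved_registers (&noreturn_may_throw, true));
  /* The same function compiled as if with -fno-exceptions.  */
  assert (model_has_no_callee_saved_registers (&noreturn_may_throw, false));
  /* An ordinary returning function is unaffected.  */
  assert (!model_has_no_callee_saved_registers (&plain, true));
  /* The explicit attribute keeps working as before.  */
  assert (model_has_no_callee_saved_registers (&explicit_attr, true));
}
```

The model keeps the shape of the patched expression: the noreturn test is
qualified by the nothrow or -fno-exceptions test, while the existing
no_callee_saved_registers attribute remains an independent alternative.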