From patchwork Thu Jan 4 14:56:58 2024
X-Patchwork-Submitter: Jan Hubicka
X-Patchwork-Id: 185030
Date: Thu, 4 Jan 2024 15:56:58 +0100
From: Jan Hubicka
To: gcc-patches@gcc.gnu.org, rguenther@suse.de
Subject: Add -falign-all-functions

Hi,
this patch adds a new option, -falign-all-functions, which works like
-falign-functions but applies to all functions, including those in cold
regions.  As discussed in the PR log, this is needed for atomically
patching function entries in the kernel.

One alternative would be to make -falign-functions mandatory, but I think
that is not a good idea, since the original purpose of -falign-functions
is optimization of instruction decode and cache size.  Having
-falign-all-functions is backwards compatible.  Richi also suggested
extending the syntax of the -falign-functions parameters (which is already
non-trivial), but it seems to me that having a separate flag is more
readable.

Bootstrapped/regtested x86_64-linux, OK for master and later backports to
release branches?

gcc/ChangeLog:

	PR middle-end/88345
	* common.opt: Add -falign-all-functions.
	* doc/invoke.texi: Add -falign-all-functions.
	(-falign-functions, -falign-labels, -falign-loops): Document
	that alignment is ignored in cold code.
	* flags.h (align_loops): Reindent.
	(align_jumps): Reindent.
	(align_labels): Reindent.
	(align_functions): Reindent.
	(align_all_functions): New macro.
	* opts.cc (common_handle_option): Handle -falign-all-functions.
	* toplev.cc (parse_alignment_opts): Likewise.
	* varasm.cc (assemble_start_function): Likewise.

diff --git a/gcc/common.opt b/gcc/common.opt
index d263a959df3..fea2c855fcf 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -1033,6 +1033,13 @@ faggressive-loop-optimizations
 Common Var(flag_aggressive_loop_optimizations) Optimization Init(1)
 Aggressively optimize loops using language constraints.
 
+falign-all-functions
+Common Var(flag_align_all_functions) Optimization
+Align the start of functions.
+
+falign-all-functions=
+Common RejectNegative Joined Var(str_align_all_functions) Optimization
+
 falign-functions
 Common Var(flag_align_functions) Optimization
 Align the start of functions.
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index d272b9228dd..ad3d75d310c 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -543,6 +543,7 @@ Objective-C and Objective-C++ Dialects}.
 @xref{Optimize Options,,Options that Control Optimization}.
 @gccoptlist{-faggressive-loop-optimizations
 -falign-functions[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
+-falign-all-functions=[@var{n}]
 -falign-jumps[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
 -falign-labels[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
 -falign-loops[=@var{n}[:@var{m}:[@var{n2}[:@var{m2}]]]]
@@ -14177,6 +14178,9 @@ Align the start of functions to the next power-of-two greater than or equal to
 @var{n}, skipping up to @var{m}-1 bytes.  This ensures that at least the
 first @var{m} bytes of the function can be fetched by the CPU without crossing
 an @var{n}-byte alignment boundary.
+This is an optimization of code performance and alignment is ignored for
+functions considered cold.  If alignment is required for all functions,
+use @option{-falign-all-functions}.
 
 If @var{m} is not specified, it defaults to @var{n}.
 
@@ -14210,6 +14214,12 @@ overaligning functions.  It attempts to instruct the assembler to align
 by the amount specified by @option{-falign-functions}, but not to
 skip more bytes than the size of the function.
 
+@opindex falign-all-functions=@var{n}
+@item -falign-all-functions
+Specify minimal alignment for function entry.  Unlike @option{-falign-functions},
+this alignment is applied to all functions (even those considered cold).
+The alignment is also not affected by @option{-flimit-function-alignment}.
+
 @opindex falign-labels
 @item -falign-labels
 @itemx -falign-labels=@var{n}
@@ -14240,6 +14250,8 @@ Enabled at levels @option{-O2}, @option{-O3}.
 Align loops to a power-of-two boundary.  If the loops are executed
 many times, this makes up for any execution of the dummy padding
 instructions.
+This is an optimization of code performance and alignment is ignored for
+loops considered cold.
 
 If @option{-falign-labels} is greater than this value, then its value
 is used instead.
@@ -14262,6 +14274,8 @@ Enabled at levels @option{-O2}, @option{-O3}.
 Align branch targets to a power-of-two boundary, for branch targets
 where the targets can only be reached by jumping.  In this case,
 no dummy operations need be executed.
+This is an optimization of code performance and alignment is ignored for
+jumps considered cold.
 
 If @option{-falign-labels} is greater than this value, then its value
 is used instead.
@@ -14371,7 +14385,7 @@ To use the link-time optimizer, @option{-flto} and optimization
 options should be specified at compile time and during the final link.
 It is recommended that you compile all the files participating in the
 same link with the same options and also specify those options at
-link time.  
+link time.
 For example:
 
 @smallexample
diff --git a/gcc/flags.h b/gcc/flags.h
index e4bafa310d6..ecf4fb9e846 100644
--- a/gcc/flags.h
+++ b/gcc/flags.h
@@ -89,6 +89,7 @@ public:
   align_flags x_align_jumps;
   align_flags x_align_labels;
   align_flags x_align_functions;
+  align_flags x_align_all_functions;
 };
 
 extern class target_flag_state default_target_flag_state;
@@ -98,10 +99,11 @@ extern class target_flag_state *this_target_flag_state;
 #define this_target_flag_state (&default_target_flag_state)
 #endif
 
-#define align_loops      (this_target_flag_state->x_align_loops)
-#define align_jumps      (this_target_flag_state->x_align_jumps)
-#define align_labels     (this_target_flag_state->x_align_labels)
-#define align_functions  (this_target_flag_state->x_align_functions)
+#define align_loops          (this_target_flag_state->x_align_loops)
+#define align_jumps          (this_target_flag_state->x_align_jumps)
+#define align_labels         (this_target_flag_state->x_align_labels)
+#define align_functions      (this_target_flag_state->x_align_functions)
+#define align_all_functions  (this_target_flag_state->x_align_all_functions)
 
 /* Returns TRUE if generated code should match ABI version N or
    greater is in use.  */
diff --git a/gcc/opts.cc b/gcc/opts.cc
index 7a3830caaa3..3fa521501ff 100644
--- a/gcc/opts.cc
+++ b/gcc/opts.cc
@@ -3342,6 +3342,12 @@ common_handle_option (struct gcc_options *opts,
                                 &opts->x_str_align_functions);
       break;
 
+    case OPT_falign_all_functions_:
+      check_alignment_argument (loc, arg, "all-functions",
+                                &opts->x_flag_align_all_functions,
+                                &opts->x_str_align_all_functions);
+      break;
+
     case OPT_ftabstop_:
       /* It is documented that we silently ignore silly values.  */
       if (value >= 1 && value <= 100)
diff --git a/gcc/toplev.cc b/gcc/toplev.cc
index 85450d97a1a..3dd6f4e1ef7 100644
--- a/gcc/toplev.cc
+++ b/gcc/toplev.cc
@@ -1219,6 +1219,7 @@ parse_alignment_opts (void)
   parse_N_M (str_align_jumps, align_jumps);
   parse_N_M (str_align_labels, align_labels);
   parse_N_M (str_align_functions, align_functions);
+  parse_N_M (str_align_all_functions, align_all_functions);
 }
 
 /* Process the options that have been parsed.  */
diff --git a/gcc/varasm.cc b/gcc/varasm.cc
index 69f8f8ee018..ddb8536a337 100644
--- a/gcc/varasm.cc
+++ b/gcc/varasm.cc
@@ -1919,6 +1919,37 @@ assemble_start_function (tree decl, const char *fnname)
       ASM_OUTPUT_ALIGN (asm_out_file, align);
     }
 
+  /* Handle forced alignment.  This really ought to apply to all functions,
+     since it is used by patchable entries.  */
+  if (align_all_functions.levels[0].log > align)
+    {
+#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
+      int align_log = align_all_functions.levels[0].log;
+#endif
+      int max_skip = align_all_functions.levels[0].maxskip;
+      if (flag_limit_function_alignment && crtl->max_insn_address > 0
+          && max_skip >= crtl->max_insn_address)
+        max_skip = crtl->max_insn_address - 1;
+
+#ifdef ASM_OUTPUT_MAX_SKIP_ALIGN
+      ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file, align_log, max_skip);
+      if (max_skip >= (1 << align_log) - 1)
+        align = align_all_functions.levels[0].log;
+      if (max_skip == align_all_functions.levels[0].maxskip)
+        {
+          ASM_OUTPUT_MAX_SKIP_ALIGN (asm_out_file,
+                                     align_all_functions.levels[1].log,
+                                     align_all_functions.levels[1].maxskip);
+          if (align_all_functions.levels[1].maxskip
+              >= (1 << align_all_functions.levels[1].log) - 1)
+            align = align_all_functions.levels[1].log;
+        }
+#else
+      ASM_OUTPUT_ALIGN (asm_out_file, align_all_functions.levels[0].log);
+      align = align_all_functions.levels[0].log;
+#endif
+    }
+
   /* Handle a user-specified function alignment.
      Note that we still need to align to DECL_ALIGN, as above,
      because ASM_OUTPUT_MAX_SKIP_ALIGN might not do any alignment at all.  */