[v4] Introduce strub: machine-independent stack scrubbing

Message ID or5y31de5m.fsf_-_@lxoliva.fsfla.org
State Unresolved
Headers
Series [v4] Introduce strub: machine-independent stack scrubbing |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Alexandre Oliva Oct. 20, 2023, 6:03 a.m. UTC
  This is a refreshed and improved version of the version posted back in
June.  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621936.html

Compared with the previous version, this patch initializes probabilities
of newly-created EH edges and execution counts of EH newly-created
blocks; drops obsolete preprocessor conditionals; improve GCing of
identifiers, decls and types introduced by strub; enable gengtype to
cope with token pasting within multi-line macros; gimplify wrapped
va-list loading (needed on ppc64le); adjusted more testcases to tolerate
async stack overwrites.

Regstrapped on x86_64-linux-gnu and ppc64le-linux-gnu, also boostrapped
with -fstrub=all.  Ok to install?

----

This patch adds the strub attribute for function and variable types,
command-line options, passes and adjustments to implement it,
documentation, and tests.

Stack scrubbing is implemented in a machine-independent way: functions
with strub enabled are modified so that they take an extra stack
watermark argument, that they update with their stack use, and the
caller can then zero it out once it regains control, whether by return
or exception.  There are two ways to go about it: at-calls, that
modifies the visible interface (signature) of the function, and
internal, in which the body is moved to a clone, the clone undergoes
the interface change, and the function becomes a wrapper, preserving
its original interface, that calls the clone and then clears the stack
used by it.

Variables can also be annotated with the strub attribute, so that
functions that read from them get stack scrubbing enabled implicitly,
whether at-calls, for functions only usable within a translation unit,
or internal, for functions whose interfaces must not be modified.

There is a strict mode, in which functions that have their stack
scrubbed can only call other functions with stack-scrubbing
interfaces, or those explicitly marked as callable from strub
contexts, so that an entire call chain gets scrubbing, at once or
piecemeal depending on optimization levels.  In the default mode,
relaxed, this requirement is not enforced by the compiler.

The implementation adds two IPA passes, one that assigns strub modes
early on, another that modifies interfaces and adds calls to the
builtins that jointly implement stack scrubbing.  Another builtin,
that obtains the stack pointer, is added for use in the implementation
of the builtins, whether expanded inline or called in libgcc.

There are new command-line options to change operation modes and to
force the feature disabled; it is enabled by default, but it has no
effect and is implicitly disabled if the strub attribute is never
used.  There are also options meant to use for testing the feature,
enabling different strubbing modes for all (viable) functions.


for  gcc/ChangeLog

	* Makefile.in (OBJS): Add ipa-strub.o.
	(GTFILES): Add ipa-strub.cc.
	* builtins.def (BUILT_IN_STACK_ADDRESS): New.
	(BUILT_IN___STRUB_ENTER): New.
	(BUILT_IN___STRUB_UPDATE): New.
	(BUILT_IN___STRUB_LEAVE): New.
	* builtins.cc: Include ipa-strub.h.
	(STACK_STOPS, STACK_UNSIGNED): Define.
	(expand_builtin_stack_address): New.
	(expand_builtin_strub_enter): New.
	(expand_builtin_strub_update): New.
	(expand_builtin_strub_leave): New.
	(expand_builtin): Call them.
	* common.opt (fstrub=*): New options.
	* doc/extend.texi (strub): New type attribute.
	(__builtin_stack_address): New function.
	(Stack Scrubbing): New section.
	* doc/invoke.texi (-fstrub=*): New options.
	(-fdump-ipa-*): New passes.
	* gengtype-lex.l: Ignore multi-line pp-directives.
	* ipa-inline.cc: Include ipa-strub.h.
	(can_inline_edge_p): Test strub_inlinable_to_p.
	* ipa-split.cc: Include ipa-strub.h.
	(execute_split_functions): Test strub_splittable_p.
	* ipa-strub.cc, ipa-strub.h: New.
	* passes.def: Add strub_mode and strub passes.
	* tree-cfg.cc (gimple_verify_flow_info): Note on debug stmts.
	* tree-pass.h (make_pass_ipa_strub_mode): Declare.
	(make_pass_ipa_strub): Declare.
	(make_pass_ipa_function_and_variable_visibility): Fix
	formatting.
	* tree-ssa-ccp.cc (optimize_stack_restore): Keep restores
	before strub leave.
	* multiple_target.cc (pass_target_clone::gate): Test seen_error.
	* attribs.cc: Include ipa-strub.h.
	(decl_attributes): Support applying attributes to function
	type, rather than pointer type, at handler's request.
	(comp_type_attributes): Combine strub_comptypes and target
	comp_type results.
	* doc/tm.texi.in (TARGET_STRUB_USE_DYNAMIC_ARRAY): New.
	(TARGET_STRUB_MAY_USE_MEMSET): New.
	* doc/tm.texi: Rebuilt.
	* cgraph.h (symtab_node::reset): Add preserve_comdat_group
	param, with a default.
	* cgraphunit.cc (symtab_node::reset): Use it.

for  gcc/c-family/ChangeLog

	* c-attribs.cc: Include ipa-strub.h.
	(handle_strub_attribute): New.
	(c_common_attribute_table): Add strub.

for  gcc/ada/ChangeLog

	* gcc-interface/trans.cc: Include ipa-strub.h.
	(gigi): Make internal decls for targets of compiler-generated
	calls strub-callable too.
	(build_raise_check): Likewise.
	* gcc-interface/utils.cc: Include ipa-strub.h.
	(handle_strub_attribute): New.
	(gnat_internal_attribute_table): Add strub.

for  gcc/testsuite/ChangeLog

	* c-c++-common/strub-O0.c: New.
	* c-c++-common/strub-O1.c: New.
	* c-c++-common/strub-O2.c: New.
	* c-c++-common/strub-O2fni.c: New.
	* c-c++-common/strub-O3.c: New.
	* c-c++-common/strub-O3fni.c: New.
	* c-c++-common/strub-Og.c: New.
	* c-c++-common/strub-Os.c: New.
	* c-c++-common/strub-all1.c: New.
	* c-c++-common/strub-all2.c: New.
	* c-c++-common/strub-apply1.c: New.
	* c-c++-common/strub-apply2.c: New.
	* c-c++-common/strub-apply3.c: New.
	* c-c++-common/strub-apply4.c: New.
	* c-c++-common/strub-at-calls1.c: New.
	* c-c++-common/strub-at-calls2.c: New.
	* c-c++-common/strub-defer-O1.c: New.
	* c-c++-common/strub-defer-O2.c: New.
	* c-c++-common/strub-defer-O3.c: New.
	* c-c++-common/strub-defer-Os.c: New.
	* c-c++-common/strub-internal1.c: New.
	* c-c++-common/strub-internal2.c: New.
	* c-c++-common/strub-parms1.c: New.
	* c-c++-common/strub-parms2.c: New.
	* c-c++-common/strub-parms3.c: New.
	* c-c++-common/strub-relaxed1.c: New.
	* c-c++-common/strub-relaxed2.c: New.
	* c-c++-common/strub-short-O0-exc.c: New.
	* c-c++-common/strub-short-O0.c: New.
	* c-c++-common/strub-short-O1.c: New.
	* c-c++-common/strub-short-O2.c: New.
	* c-c++-common/strub-short-O3.c: New.
	* c-c++-common/strub-short-Os.c: New.
	* c-c++-common/strub-strict1.c: New.
	* c-c++-common/strub-strict2.c: New.
	* c-c++-common/strub-tail-O1.c: New.
	* c-c++-common/strub-tail-O2.c: New.
	* c-c++-common/torture/strub-callable1.c: New.
	* c-c++-common/torture/strub-callable2.c: New.
	* c-c++-common/torture/strub-const1.c: New.
	* c-c++-common/torture/strub-const2.c: New.
	* c-c++-common/torture/strub-const3.c: New.
	* c-c++-common/torture/strub-const4.c: New.
	* c-c++-common/torture/strub-data1.c: New.
	* c-c++-common/torture/strub-data2.c: New.
	* c-c++-common/torture/strub-data3.c: New.
	* c-c++-common/torture/strub-data4.c: New.
	* c-c++-common/torture/strub-data5.c: New.
	* c-c++-common/torture/strub-indcall1.c: New.
	* c-c++-common/torture/strub-indcall2.c: New.
	* c-c++-common/torture/strub-indcall3.c: New.
	* c-c++-common/torture/strub-inlinable1.c: New.
	* c-c++-common/torture/strub-inlinable2.c: New.
	* c-c++-common/torture/strub-ptrfn1.c: New.
	* c-c++-common/torture/strub-ptrfn2.c: New.
	* c-c++-common/torture/strub-ptrfn3.c: New.
	* c-c++-common/torture/strub-ptrfn4.c: New.
	* c-c++-common/torture/strub-pure1.c: New.
	* c-c++-common/torture/strub-pure2.c: New.
	* c-c++-common/torture/strub-pure3.c: New.
	* c-c++-common/torture/strub-pure4.c: New.
	* c-c++-common/torture/strub-run1.c: New.
	* c-c++-common/torture/strub-run2.c: New.
	* c-c++-common/torture/strub-run3.c: New.
	* c-c++-common/torture/strub-run4.c: New.
	* c-c++-common/torture/strub-run4c.c: New.
	* c-c++-common/torture/strub-run4d.c: New.
	* c-c++-common/torture/strub-run4i.c: New.
	* g++.dg/strub-run1.C: New.
	* g++.dg/torture/strub-init1.C: New.
	* g++.dg/torture/strub-init2.C: New.
	* g++.dg/torture/strub-init3.C: New.
	* gnat.dg/strub_attr.adb, gnat.dg/strub_attr.ads: New.
	* gnat.dg/strub_ind.adb, gnat.dg/strub_ind.ads: New.

for  libgcc/ChangeLog

	* Makefile.in (LIB2ADD): Add strub.c.
	* libgcc2.h (__strub_enter, __strub_update, __strub_leave):
	Declare.
	* strub.c: New.
---
 gcc/Makefile.in                                    |    2 
 gcc/ada/gcc-interface/trans.cc                     |   18 
 gcc/ada/gcc-interface/utils.cc                     |   73 
 gcc/attribs.cc                                     |   37 
 gcc/builtins.cc                                    |  269 ++
 gcc/builtins.def                                   |    4 
 gcc/c-family/c-attribs.cc                          |   82 
 gcc/cgraph.h                                       |    2 
 gcc/cgraphunit.cc                                  |    5 
 gcc/common.opt                                     |   29 
 gcc/doc/extend.texi                                |  307 ++
 gcc/doc/invoke.texi                                |   60 
 gcc/doc/tm.texi                                    |   19 
 gcc/doc/tm.texi.in                                 |   19 
 gcc/gengtype-lex.l                                 |    3 
 gcc/ipa-inline.cc                                  |    6 
 gcc/ipa-split.cc                                   |    7 
 gcc/ipa-strub.cc                                   | 3454 ++++++++++++++++++++
 gcc/ipa-strub.h                                    |   45 
 gcc/passes.def                                     |    2 
 gcc/testsuite/c-c++-common/strub-O0.c              |   14 
 gcc/testsuite/c-c++-common/strub-O1.c              |   15 
 gcc/testsuite/c-c++-common/strub-O2.c              |   16 
 gcc/testsuite/c-c++-common/strub-O2fni.c           |   15 
 gcc/testsuite/c-c++-common/strub-O3.c              |   12 
 gcc/testsuite/c-c++-common/strub-O3fni.c           |   15 
 gcc/testsuite/c-c++-common/strub-Og.c              |   16 
 gcc/testsuite/c-c++-common/strub-Os.c              |   18 
 gcc/testsuite/c-c++-common/strub-all1.c            |   32 
 gcc/testsuite/c-c++-common/strub-all2.c            |   24 
 gcc/testsuite/c-c++-common/strub-apply1.c          |   15 
 gcc/testsuite/c-c++-common/strub-apply2.c          |   12 
 gcc/testsuite/c-c++-common/strub-apply3.c          |    8 
 gcc/testsuite/c-c++-common/strub-apply4.c          |   21 
 gcc/testsuite/c-c++-common/strub-at-calls1.c       |   30 
 gcc/testsuite/c-c++-common/strub-at-calls2.c       |   23 
 gcc/testsuite/c-c++-common/strub-defer-O1.c        |    7 
 gcc/testsuite/c-c++-common/strub-defer-O2.c        |    8 
 gcc/testsuite/c-c++-common/strub-defer-O3.c        |  110 +
 gcc/testsuite/c-c++-common/strub-defer-Os.c        |    7 
 gcc/testsuite/c-c++-common/strub-internal1.c       |   31 
 gcc/testsuite/c-c++-common/strub-internal2.c       |   21 
 gcc/testsuite/c-c++-common/strub-parms1.c          |   48 
 gcc/testsuite/c-c++-common/strub-parms2.c          |   36 
 gcc/testsuite/c-c++-common/strub-parms3.c          |   58 
 gcc/testsuite/c-c++-common/strub-relaxed1.c        |   18 
 gcc/testsuite/c-c++-common/strub-relaxed2.c        |   14 
 gcc/testsuite/c-c++-common/strub-short-O0-exc.c    |   10 
 gcc/testsuite/c-c++-common/strub-short-O0.c        |   10 
 gcc/testsuite/c-c++-common/strub-short-O1.c        |   10 
 gcc/testsuite/c-c++-common/strub-short-O2.c        |   10 
 gcc/testsuite/c-c++-common/strub-short-O3.c        |   12 
 gcc/testsuite/c-c++-common/strub-short-Os.c        |   12 
 gcc/testsuite/c-c++-common/strub-strict1.c         |   36 
 gcc/testsuite/c-c++-common/strub-strict2.c         |   25 
 gcc/testsuite/c-c++-common/strub-tail-O1.c         |    8 
 gcc/testsuite/c-c++-common/strub-tail-O2.c         |   14 
 gcc/testsuite/c-c++-common/strub-var1.c            |   24 
 .../c-c++-common/torture/strub-callable1.c         |    9 
 .../c-c++-common/torture/strub-callable2.c         |  264 ++
 gcc/testsuite/c-c++-common/torture/strub-const1.c  |   18 
 gcc/testsuite/c-c++-common/torture/strub-const2.c  |   22 
 gcc/testsuite/c-c++-common/torture/strub-const3.c  |   13 
 gcc/testsuite/c-c++-common/torture/strub-const4.c  |   17 
 gcc/testsuite/c-c++-common/torture/strub-data1.c   |   13 
 gcc/testsuite/c-c++-common/torture/strub-data2.c   |   14 
 gcc/testsuite/c-c++-common/torture/strub-data3.c   |   14 
 gcc/testsuite/c-c++-common/torture/strub-data4.c   |   14 
 gcc/testsuite/c-c++-common/torture/strub-data5.c   |   15 
 .../c-c++-common/torture/strub-indcall1.c          |   14 
 .../c-c++-common/torture/strub-indcall2.c          |   14 
 .../c-c++-common/torture/strub-indcall3.c          |   14 
 .../c-c++-common/torture/strub-inlinable1.c        |   16 
 .../c-c++-common/torture/strub-inlinable2.c        |    7 
 gcc/testsuite/c-c++-common/torture/strub-ptrfn1.c  |   10 
 gcc/testsuite/c-c++-common/torture/strub-ptrfn2.c  |   55 
 gcc/testsuite/c-c++-common/torture/strub-ptrfn3.c  |   50 
 gcc/testsuite/c-c++-common/torture/strub-ptrfn4.c  |   43 
 gcc/testsuite/c-c++-common/torture/strub-pure1.c   |   18 
 gcc/testsuite/c-c++-common/torture/strub-pure2.c   |   22 
 gcc/testsuite/c-c++-common/torture/strub-pure3.c   |   13 
 gcc/testsuite/c-c++-common/torture/strub-pure4.c   |   17 
 gcc/testsuite/c-c++-common/torture/strub-run1.c    |   95 +
 gcc/testsuite/c-c++-common/torture/strub-run2.c    |   84 
 gcc/testsuite/c-c++-common/torture/strub-run3.c    |   80 
 gcc/testsuite/c-c++-common/torture/strub-run4.c    |  106 +
 gcc/testsuite/c-c++-common/torture/strub-run4c.c   |    5 
 gcc/testsuite/c-c++-common/torture/strub-run4d.c   |    7 
 gcc/testsuite/c-c++-common/torture/strub-run4i.c   |    5 
 gcc/testsuite/g++.dg/strub-run1.C                  |   19 
 gcc/testsuite/g++.dg/torture/strub-init1.C         |   13 
 gcc/testsuite/g++.dg/torture/strub-init2.C         |   14 
 gcc/testsuite/g++.dg/torture/strub-init3.C         |   13 
 gcc/testsuite/gnat.dg/strub_access.adb             |   21 
 gcc/testsuite/gnat.dg/strub_access1.adb            |   16 
 gcc/testsuite/gnat.dg/strub_attr.adb               |   37 
 gcc/testsuite/gnat.dg/strub_attr.ads               |   12 
 gcc/testsuite/gnat.dg/strub_disp.adb               |   64 
 gcc/testsuite/gnat.dg/strub_disp1.adb              |   79 
 gcc/testsuite/gnat.dg/strub_ind.adb                |   33 
 gcc/testsuite/gnat.dg/strub_ind.ads                |   17 
 gcc/testsuite/gnat.dg/strub_ind1.adb               |   41 
 gcc/testsuite/gnat.dg/strub_ind1.ads               |   17 
 gcc/testsuite/gnat.dg/strub_ind2.adb               |   34 
 gcc/testsuite/gnat.dg/strub_ind2.ads               |   17 
 gcc/testsuite/gnat.dg/strub_intf.adb               |   93 +
 gcc/testsuite/gnat.dg/strub_intf1.adb              |   86 
 gcc/testsuite/gnat.dg/strub_intf2.adb              |   55 
 gcc/testsuite/gnat.dg/strub_renm.adb               |   21 
 gcc/testsuite/gnat.dg/strub_renm1.adb              |   32 
 gcc/testsuite/gnat.dg/strub_renm2.adb              |   32 
 gcc/testsuite/gnat.dg/strub_var.adb                |   16 
 gcc/testsuite/gnat.dg/strub_var1.adb               |   20 
 gcc/tree-cfg.cc                                    |    1 
 gcc/tree-pass.h                                    |    5 
 gcc/tree-ssa-ccp.cc                                |    4 
 libgcc/Makefile.in                                 |    3 
 libgcc/libgcc2.h                                   |    4 
 libgcc/strub.c                                     |  149 +
 119 files changed, 7302 insertions(+), 12 deletions(-)
 create mode 100644 gcc/ipa-strub.cc
 create mode 100644 gcc/ipa-strub.h
 create mode 100644 gcc/testsuite/c-c++-common/strub-O0.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-O1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-O2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-O2fni.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-O3.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-O3fni.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-Og.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-Os.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-all1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-all2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-apply1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-apply2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-apply3.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-apply4.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-at-calls1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-at-calls2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-defer-O1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-defer-O2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-defer-O3.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-defer-Os.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-internal1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-internal2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-parms1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-parms2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-parms3.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-relaxed1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-relaxed2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-short-O0-exc.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-short-O0.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-short-O1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-short-O2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-short-O3.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-short-Os.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-strict1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-strict2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-tail-O1.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-tail-O2.c
 create mode 100644 gcc/testsuite/c-c++-common/strub-var1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-callable1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-callable2.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-const1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-const2.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-const3.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-const4.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-data1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-data2.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-data3.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-data4.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-data5.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-indcall1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-indcall2.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-indcall3.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-inlinable1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-inlinable2.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-ptrfn1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-ptrfn2.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-ptrfn3.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-ptrfn4.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-pure1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-pure2.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-pure3.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-pure4.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-run1.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-run2.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-run3.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-run4.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-run4c.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-run4d.c
 create mode 100644 gcc/testsuite/c-c++-common/torture/strub-run4i.c
 create mode 100644 gcc/testsuite/g++.dg/strub-run1.C
 create mode 100644 gcc/testsuite/g++.dg/torture/strub-init1.C
 create mode 100644 gcc/testsuite/g++.dg/torture/strub-init2.C
 create mode 100644 gcc/testsuite/g++.dg/torture/strub-init3.C
 create mode 100644 gcc/testsuite/gnat.dg/strub_access.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_access1.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_attr.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_attr.ads
 create mode 100644 gcc/testsuite/gnat.dg/strub_disp.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_disp1.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_ind.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_ind.ads
 create mode 100644 gcc/testsuite/gnat.dg/strub_ind1.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_ind1.ads
 create mode 100644 gcc/testsuite/gnat.dg/strub_ind2.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_ind2.ads
 create mode 100644 gcc/testsuite/gnat.dg/strub_intf.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_intf1.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_intf2.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_renm.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_renm1.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_renm2.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_var.adb
 create mode 100644 gcc/testsuite/gnat.dg/strub_var1.adb
 create mode 100644 libgcc/strub.c
  

Comments

Alexandre Oliva Oct. 26, 2023, 6:15 a.m. UTC | #1
Ping? https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633675.html

I'm combining the gcc/ipa-strub.cc bits from
https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633526.html
  
Alexandre Oliva Nov. 20, 2023, 12:40 p.m. UTC | #2
On Oct 26, 2023, Alexandre Oliva <oliva@adacore.com> wrote:

>> This is a refreshed and improved version of the version posted back in
>> June.  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621936.html

> Ping? https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633675.html
> I'm combining the gcc/ipa-strub.cc bits from
> https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633526.html

Ping?
Retested on x86_64-linux-gnu, with and without -fstrub=all.

>> This patch adds the strub attribute for function and variable types,
>> command-line options, passes and adjustments to implement it,
>> documentation, and tests.

>> Stack scrubbing is implemented in a machine-independent way: functions
...
  
Richard Biener Nov. 22, 2023, 2:14 p.m. UTC | #3
On Mon, Nov 20, 2023 at 1:40 PM Alexandre Oliva <oliva@adacore.com> wrote:
>
> On Oct 26, 2023, Alexandre Oliva <oliva@adacore.com> wrote:
>
> >> This is a refreshed and improved version of the version posted back in
> >> June.  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621936.html
>
> > Ping? https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633675.html
> > I'm combining the gcc/ipa-strub.cc bits from
> > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633526.html
>
> Ping?
> Retested on x86_64-linux-gnu, with and without -fstrub=all.

@@ -898,7 +899,24 @@ decl_attributes (tree *node, tree attributes, int flags,
       TYPE_NAME (tt) = *node;
     }

-  *anode = cur_and_last_decl[0];
+  if (*anode != cur_and_last_decl[0])
+    {
+      /* Even if !spec->function_type_required, allow the attribute
+ handler to request the attribute to be applied to the function
+ type, rather than to the function pointer type, by setting
+ cur_and_last_decl[0] to the function type.  */
+      if (!fn_ptr_tmp
+  && POINTER_TYPE_P (*anode)
+  && TREE_TYPE (*anode) == cur_and_last_decl[0]
+  && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode)))
+ {
+  fn_ptr_tmp = TREE_TYPE (*anode);
+  fn_ptr_quals = TYPE_QUALS (*anode);
+  anode = &fn_ptr_tmp;
+ }
+      *anode = cur_and_last_decl[0];
+    }
+

what is this a workaround for?  Isn't there a suitable parsing position
for placing the attribute?

+#ifndef STACK_GROWS_DOWNWARD
+# define STACK_TOPS GT
+#else
+# define STACK_TOPS LT
+#endif

according to docs this is defined to 0 or 1 so the above looks wrong
(it's always defined).

+  if (optimize < 2 || optimize_size || flag_no_inline)
+    return NULL_RTX;

I'm wondering about these checks in the expansions of the builtins,
I think this is about inline expanding or emitting a libcall, right?
I wonder if you should use optimize_function_for_speed (cfun) instead?
Usually -fno-inline shouldn't affect such calls, but -fno-builtin-FOO would.
I have no strong opinion here though.

The new builtins seem undocumented - usually those are documented
within extend.texi - I guess placing __builtin___strub_enter calls in
the code manually will break in interesting ways - if that's not supposed
to happen the trick is to embed a space in the name of the built-in.
__builtin_stack_address looks like something users will pick up though
(and thus should be documented)?

-symtab_node::reset (void)
+symtab_node::reset (bool preserve_comdat_group)

not sure what for, I'll leave Honza to comment.

+/* Create a distinct copy of the type of NODE's function, and change
+   the fntype of all calls to it with the same main type to the new
+   type.  */
+
+static void
+distinctify_node_type (cgraph_node *node)
+{
+  tree old_type = TREE_TYPE (node->decl);
+  tree new_type = build_distinct_type_copy (old_type);
+  tree new_ptr_type = NULL_TREE;
+
+  /* Remap any calls to node->decl that use old_type, or a variant
+     thereof, to new_type as well.  We don't look for aliases, their
+     declarations will have their types changed independently, and
+     we'll adjust their fntypes then.  */
+  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
+    {
+      if (!e->call_stmt)
+ continue;
+      tree fnaddr = gimple_call_fn (e->call_stmt);
+      gcc_checking_assert (TREE_CODE (fnaddr) == ADDR_EXPR
+   && TREE_OPERAND (fnaddr, 0) == node->decl);
+      if (strub_call_fntype_override_p (e->call_stmt))
+ continue;
+      if (!new_ptr_type)
+ new_ptr_type = build_pointer_type (new_type);
+      TREE_TYPE (fnaddr) = new_ptr_type;
+      gimple_call_set_fntype (e->call_stmt, new_type);
+    }
+
+  TREE_TYPE (node->decl) = new_type;

it does feel like there's IPA mechanisms to deal with what you are trying to do
here (or in the caller(s)).


+unsigned int
+pass_ipa_strub_mode::execute (function *)
+{
+  last_cgraph_order = 0;
+  ipa_strub_set_mode_for_new_functions ();
+
+  /* Verify before any inlining or other transformations.  */
+  verify_strub ();

if  (flag_checking) verify_strub ();

please.  I guess we talked about this last year - what's the reason to have both
an IPA pass and a simple IPA pass?  IIRC the simple IPA pass is a simple
one because it wants to see inlined bodies and "fixes" those up?  Some toplevel
comments explaining both passes in the ipa-strub.cc pass would be nice to
have.  I guess I also asked before - did you try it with -flto?

+    /* Decide which of the wrapped function's parms we want to turn into
+       references to the argument passed to the wrapper.  In general,
we want to
+       copy small arguments, and avoid copying large ones.
Variable-sized array
+       lengths given by other arguments, as in 20020210-1.c, would lead to
+       problems if passed by value, after resetting the original function and
+       dropping the length computation; passing them by reference works.
+       DECL_BY_REFERENCE is *not* a substitute for this: it involves copying
+       anyway, but performed at the caller.  */
+    indirect_parms_t indirect_nparms (3, false);
...

IMHO it's a bad idea, at least initially, to mess with the ABI in a heuristical
way this way.  I see you need all sorts of "fixup", including re-gimplification
of stmts, etc., double-ugh.  Ideally IPA-SRA can already do parts of those
transforms which would mean ipa-param-manipulation has generic code
to encode them?  I realize you can't tail-call, but I wonder if some clever
machine-specific tricks would work, I guess it would need the wrapper
to be target generated of course.

As it's only for optimization, how much of the code would go away when
you avoid changing the parameters of the wrapped function?  Could
one easily disable the optimization via a --param maybe?

There's still much commented code and FIXMEs in, a lot of cgraph
related code is used which I can't review very well.

I fear that -fstrub is going to be actually used by people - are you up
to the task fixing things?

+void ATTRIBUTE_STRUB_CALLABLE
+__strub_enter (void **watermark)
+{
+  *watermark = __builtin_frame_address (0);

in your experience, do these (or the inline version) work reliably
when optimizations like omitting the frame pointer or shrink-wrapping
happen from a security point of view?

You are adding symbols to libgcc but I don't see you amending
libgcc/libgcc-std.ver.in?

Neither the documentation in extend.texi nor the one in invoke.texi
mentions the target audience of -fstrub - I suppose it is to ensure
secrets are erased from the stack in case they are stored into locals
or spilled and thus it complements -fzero-call-used-regs.  The patches
mention the red zone, documenting the guarantees and caveats
would be nice to have.  I think the documentation belongs to the
attribute documentation.

I hope Honza can have a quick look over the IPA passes.  As said,
eliding the parmeter optimization from this patch, possibly
implementing missing pieces in ipa-param-manipulation might be
a better way than taking what is percieved as a huge ugly part
of the patch.

Thanks,
Richard.

> >> This patch adds the strub attribute for function and variable types,
> >> command-line options, passes and adjustments to implement it,
> >> documentation, and tests.
>
> >> Stack scrubbing is implemented in a machine-independent way: functions
> ...
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive
  
Alexandre Oliva Nov. 23, 2023, 10:56 a.m. UTC | #4
Hello, Richi,

Thanks for the extensive review!

On Nov 22, 2023, Richard Biener <richard.guenther@gmail.com> wrote:

> On Mon, Nov 20, 2023 at 1:40 PM Alexandre Oliva <oliva@adacore.com> wrote:
>> 
>> On Oct 26, 2023, Alexandre Oliva <oliva@adacore.com> wrote:
>> 
>> >> This is a refreshed and improved version of the version posted back in
>> >> June.  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621936.html
>> 
>> > Ping? https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633675.html
>> > I'm combining the gcc/ipa-strub.cc bits from
>> > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633526.html
>> 
>> Ping?
>> Retested on x86_64-linux-gnu, with and without -fstrub=all.

> @@ -898,7 +899,24 @@ decl_attributes (tree *node, tree attributes, int flags,
>        TYPE_NAME (tt) = *node;
>      }

> -  *anode = cur_and_last_decl[0];
> +  if (*anode != cur_and_last_decl[0])
> +    {
> +      /* Even if !spec->function_type_required, allow the attribute
> + handler to request the attribute to be applied to the function
> + type, rather than to the function pointer type, by setting
> + cur_and_last_decl[0] to the function type.  */
> +      if (!fn_ptr_tmp
> +  && POINTER_TYPE_P (*anode)
> +  && TREE_TYPE (*anode) == cur_and_last_decl[0]
> +  && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode)))
> + {
> +  fn_ptr_tmp = TREE_TYPE (*anode);
> +  fn_ptr_quals = TYPE_QUALS (*anode);
> +  anode = &fn_ptr_tmp;
> + }
> +      *anode = cur_and_last_decl[0];
> +    }
> +

> what is this a workaround for?

For the fact that the strub attribute attaches to types, whether data or
function types, so we can't have fn_type_req, but when it's a function
or pointer-to-function type, we want to affect the function type, rather
than the pointer type, when the attribute has an argument.  The argument
names the strub mode for a function; that only applies to function
types, never to data types.

The hunk above introduces the means for the attribute handler to choose
what to attach the attribute t.

> Isn't there a suitable parsing position for placing the attribute?

It's been a while, but IIRC the need for this first came up in Ada,
where attributes can't just go anywhere, and it was further complicated
by the fact that Ada doesn't have first-class function or procedure
types, only access-to-them, but we needed some means for the attributes
to apply to the function type.

> +#ifndef STACK_GROWS_DOWNWARD
> +# define STACK_TOPS GT
> +#else
> +# define STACK_TOPS LT
> +#endif

> according to docs this is defined to 0 or 1 so the above looks wrong
> (it's always defined).

Ugh.  Thanks, will fix.  (I'm pretty sure I had notes somewhere stating
that stack-grows-upwards hadn't been tested, and that was for the sheer
lack of platforms making that choice, but I hoped it wasn't that broken
:-(

> +  if (optimize < 2 || optimize_size || flag_no_inline)
> +    return NULL_RTX;

> I'm wondering about these checks in the expansions of the builtins,
> I think this is about inline expanding or emitting a libcall, right?

Yeah.

> I wonder if you should use optimize_function_for_speed (cfun) instead?
> Usually -fno-inline shouldn't affect such calls, but -fno-builtin-FOO would.
> I have no strong opinion here though.

I've occasionally wondered whether builtins were the best framework for
these semi-internal calls.

> The new builtins seem undocumented - usually those are documented
> within extend.texi

Erhm...  Weird.  I had documentation for them.

(checks)

No, it's there, in extend.texi, right after __builtin_stack_address.
It's admittedly a big patch :-/

> I guess placing __builtin___strub_enter calls in the code manually
> will break in interesting ways - if that's not supposed to happen the
> trick is to embed a space in the name of the built-in.

Yeah, I was a little torn between the choices here.  On the one hand, I
needed visible symbols for the out-of-line implementations, so I figured
that trying to hide the builtins wouldn't bring any advantage.

However, I've also designed the builtins with interfaces that would
avoid disruption even with explicit calls.  __strub_enter and
__strub_update only initialize or adjust a pointer handed to them.
__strub_leave will erase things from the top of the stack to the
pointer, so if the watermark is "active stack", nothing happens, and
things only get cleared if it points to "unused stack space".  There's
potential for disruption if one passes a statically-allocated pointer to
it, but nothing much different from memsetting that memory range, core
wars-style.

> -symtab_node::reset (void)
> +symtab_node::reset (bool preserve_comdat_group)

> not sure what for, I'll leave Honza to comment.

This restores the possibility of getting the pre-PR107897 behavior, that
the strub wrapper/wrapped splitting relied on.  Conceptually, the
original function becomes the wrapped one, and the wrapper that calls it
is kind of an implementation detail to preserve the exposed API/ABI
while introducing strubbing around the body, so preserving the comdat
group makes sense.  ISTR getting strub regressions when the patch for
PR107897 was put in, but my notes don't detail what broke, alas.

> +/* Create a distinct copy of the type of NODE's function, and change
> +   the fntype of all calls to it with the same main type to the new
> +   type.  */
> +
> +static void
> +distinctify_node_type (cgraph_node *node)
> +{
> +  tree old_type = TREE_TYPE (node->decl);
> +  tree new_type = build_distinct_type_copy (old_type);
> +  tree new_ptr_type = NULL_TREE;
> +
> +  /* Remap any calls to node->decl that use old_type, or a variant
> +     thereof, to new_type as well.  We don't look for aliases, their
> +     declarations will have their types changed independently, and
> +     we'll adjust their fntypes then.  */
> +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
> +    {
> +      if (!e->call_stmt)
> + continue;
> +      tree fnaddr = gimple_call_fn (e->call_stmt);
> +      gcc_checking_assert (TREE_CODE (fnaddr) == ADDR_EXPR
> +   && TREE_OPERAND (fnaddr, 0) == node->decl);
> +      if (strub_call_fntype_override_p (e->call_stmt))
> + continue;
> +      if (!new_ptr_type)
> + new_ptr_type = build_pointer_type (new_type);
> +      TREE_TYPE (fnaddr) = new_ptr_type;
> +      gimple_call_set_fntype (e->call_stmt, new_type);
> +    }
> +
> +  TREE_TYPE (node->decl) = new_type;

> it does feel like there's IPA mechanisms to deal with what you are trying to do
> here (or in the caller(s)).

If there is, I couldn't find it.  There are some vaguely similar
operations, but nothing that will modify the ABI of exported functions
like what strub-at-calls purports to do.  A synthetic argument gets
added to the function's interface, but since that type could conceivably
have been associated/unified with non-strub types, or with similar types
with different strub modes that don't undergo such transformations, we
have to build a new type for the function, and adjust the type in the
call graph.  IPA can do this as part of replacing calls with specialized
clones of a function, but here the attribute calls for an adjustment to
the function type, vaguely resembling the introduction of the implicit
'this' in non-static member functions in C++, but without much help from
the front-end, and without exposing the formal to the front-end.  It
doesn't seem to me that IPA is willing to help with that, and even if it
were, it would require strub to become an actual IPA pass, which would
be a huge undertaking without clear benefit.

> +unsigned int
> +pass_ipa_strub_mode::execute (function *)
> +{
> +  last_cgraph_order = 0;
> +  ipa_strub_set_mode_for_new_functions ();
> +
> +  /* Verify before any inlining or other transformations.  */
> +  verify_strub ();

> if  (flag_checking) verify_strub ();

> please.

No, no, verify_strub is not an internal consistency check, it's part of
the implementation that verifies that strictness requirements have been
met WRT calls requested by the user.  In strict mode, a strub function
can only call other strub functions or functions explicitly marked as
callable from strub contexts.  This security requirement is to be
enforced, and diagnostics are to be issued in case of violation,
regardless of the self-checking mode of the compiler.

> I guess we talked about this last year - what's the reason to have both
> an IPA pass and a simple IPA pass?

We did.  I didn't recall all the reasons on the spot when you asked, but
I followed up by email:
https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603255.html

The earlier pass, that must be placed before any inlining, is the simple
IPA one, and it only assigns modes.

It must be that early because strub-enabled functions must never be
inlined into strub-disabled ones.  So we have to determine modes before
any sort of inlining.

But we can and want to inline strub-enabled functions into other
strub-enabled functions.  But we don't want to inline a wrapped function
back into its wrapper, so we want to (early) inline functions before
they're split, and we also want to inline wrappers after they're split
if possible.

So the assignment pass has to be before any inlining, and the
wrapping/splitting (for internal strub) must be after some inlining but
before the last attempt at inlining.

> IIRC the simple IPA pass is a simple one because it wants to see
> inlined bodies and "fixes" those up?  Some toplevel comments
> explaining both passes in the ipa-strub.cc pass would be nice to have.

ACK, will do.

> I guess I also asked before - did you try it with -flto?

Yeah, I've tested it extensively with -flto during development.  I've
run the GCC testsuite with a patch equivalent to -fstrub=all enabled by
default during development, and I still do so occasionally.  I didn't
before the ping, but prompted by you, I did again.  Torture tests and
specific lto tests don't seem to face any trouble.  And there's no
reason why they should.  The spots in which the strub passes were
introduced involved exploring where they fit best, and adding calls such
as analyzing nodes.  There are even commented-out vestigial calls of
inline_analyze_function, from a time when the pass was inserted at a
different spot and that call was required.

> +    /* Decide which of the wrapped function's parms we want to turn into
> +       references to the argument passed to the wrapper.  In general,
> we want to
> +       copy small arguments, and avoid copying large ones.
> Variable-sized array
> +       lengths given by other arguments, as in 20020210-1.c, would lead to
> +       problems if passed by value, after resetting the original function and
> +       dropping the length computation; passing them by reference works.
> +       DECL_BY_REFERENCE is *not* a substitute for this: it involves copying
> +       anyway, but performed at the caller.  */
> +    indirect_parms_t indirect_nparms (3, false);
> ...

> IMHO it's a bad idea, at least initially, to mess with the ABI in a heuristical
> way this way.

Do you realize that this is only about the wrapped ABI, used exclusively
by the wrapper, to avoid pointlessly copying large arguments twice?  The
wrapper already needs a custom ABI, because of the added watermark
pointer, and va_list and whatnot when needed, so a little further
customization to avoid a quite significant overhead seemed desirable.

> I see you need all sorts of "fixup", including re-gimplification
> of stmts, etc., double-ugh.

*nod*, I've considered going the nested-function way, enabling the
wrapped function to access wrapper's variables directly, but (i) support
for nested functions isn't available everywhere, and (ii) strub
functions could already be nested functions, and the wrapped body would
need access to both enclosing contexts, so adjustments would have to be
made anyway.

> Ideally IPA-SRA can already do parts of those
> transforms which would mean ipa-param-manipulation has generic code
> to encode them?

You'd think so.  I tried that, run into various issues, still have
pending patches I submitted separately to address them, but...  at some
point I had to stop waiting for feedback and had make it work without
that.  I figured what I was doing wasn't a good fit and moved on.

Now, it would be great if IPA could enable me to split the entire body
of a function *while* adding custom arguments, such as the watermark,
va_list, etc.  It didn't look like it could or even wanted to, but maybe
the presence of a useful feature like stack scrubbing could motivate the
introduction of such infrastructure.

> I realize you can't tail-call, but I wonder if some clever
> machine-specific tricks would work, I guess it would need the wrapper
> to be target generated of course.

I'm not sure what you have in mind here, but one of the goals was to
introduce stack scrubbing without machine-dependent code.

We are considering an improvement for strub(internal) to do away with
wrappers given machine-specific prologue/epilogue support, but that's
not even in the roadmap, and it's not clear how that would interact with
exceptions, which was a primary driver for the currently-proposed
approaches.

> As it's only for optimization, how much of the code would go away when
> you avoid changing the parameters of the wrapped function?

We can't avoid changing them entirely, and IIRC some of the
infrastructure for regimplification was also used for va_list and
apply_args handling, so it wouldn't save all that much.  It is an
important optimization, to avoid doubling the stack requirements for
parameters with -fstrub=internal (because wrapper passes arguments on to
wrapped body), and the optimization is already implemented, so it would
be a pity to take it out, and you wouldn't even get rid of any
significant amount of code.

> Could one easily disable the optimization via a --param maybe?

The param is not implemented, but guarding the loop you commented on by
it would accomplish it.  You probably won't want to bootstrap GCC with
-fstrub=all along with it, though ;-)

> There's still much commented code and FIXMEs in, a lot of cgraph
> related code is used which I can't review very well.

Now that you mentioned it, I realize I hadn't yet dropped some of the
preprocessed-out bits when I posted v4.  I did so afterwards.  I was
trying to avoid posting such a monster patch again, but I guess I will
have to post at least another version :-(

But yeah, there are some wishlist items that remain, and some bits that
could probably go away.

> I fear that -fstrub is going to be actually used by people - are you up
> to the task fixing things?

I, for one, would welcome that, and I am (professionally) on the hook to
fix it, so I'd better be up to it ;-) Building it on top of
infrastructure I'm closely familiar with, rather than e.g. on IPA, that
I'm not so much, is a plus in this regard.  That said, I've fixed bugs
in GCC most everywhere, from front-ends to back-ends, so I'm pretty sure
I've got it covered in terms of knowledge.  As for time, if it gets too
much interest (is there such a thing? :-) I might have trouble keeping
up (there are only so many hours in the day), and it would likely be
wise then for others to become familiar with it.  I'd be happy to share
knowledge, and I know I have support from my "employer" (in quotes
because I'm not formally employed, I'm a consultant) to maintain the
code I contribute to the community on its behalf, even having other work
and non-work commitments.  So, bring them on, I say ;-)

> +void ATTRIBUTE_STRUB_CALLABLE
> +__strub_enter (void **watermark)
> +{
> +  *watermark = __builtin_frame_address (0);

> in your experience, do these (or the inline version) work reliably
> when optimizations like omitting the frame pointer or shrink-wrapping
> happen from a security point of view?

Yeah.  We're talking about the frame address for the current function
here, and the compiler knows what it is, even if it optimizes the heck
out of the function.  In the builtin expansion, it's the top of the
stack instead, and the compiler also has to have a good grasp of where
that is.  Red zones are a small complication, but not really much
trouble here: it's a watermark we're talking about, how high the stack
may have gotten, so we know we have to scrub no further than that.  If
it goes too far (as it often does with red zones), we just scrub a
little too much.  What shouldn't happen is not going far enough, but
since we rely on the stack frame address for __strub_update to update
the watermark, and the caller can't really misrepresent its own stack
use without risking the callee to corrupt it, this works really well.
If strub_update gets builtin-expanded inline, it uses the adjusted (for
red zone) stack pointer instead, after any internal stack allocation
(prologue and alloca), but it won't necessarily capture non-accumulated
outgoing args.  The arguments passed to a callee are not regarded as
part of the caller stack frame, so that should be fine.  In order to
cover them, one must have a strub-enabled caller and a strub(at-calls)
callee.

> You are adding symbols to libgcc but I don't see you amending
> libgcc/libgcc-std.ver.in?

That's a good catch, thanks.

hardcfr symbols are missing too.

Will fix.

> Neither the documentation in extend.texi nor the one in invoke.texi
> mentions the target audience of -fstrub

ACK, good point, despite all the detail, that is too implicit.  Will
fix.

> The patches mention the red zone, documenting the guarantees and
> caveats would be nice to have.  I think the documentation belongs to
> the attribute documentation.

It is kind of there, but since the details vary depending on the strub
mode, they're given where the modes are detailed, rather than as if
applying to all strub modes.  (some of the modes are disabled and
callable, that don't do any strubbing whatsoever)

> I hope Honza can have a quick look over the IPA passes.

Likewise.  I shall wait for further feedback before posting a huge v5,
but I'll be happy to refresh the users/aoliva/heads/strub branch in the
git repo as I adjust the patch if that helps.  Currently the patch is in
users/aoliva/heads/testme.

> a better way than taking what is percieved as a huge ugly part
> of the patch.

ISTM you're misperceiving the amount of code involved in that
optimization.  There is a significant amount of wrapper/wrapped
parameter manipulation that is *not* involved with the optimization, and
that was already there before the optimization was implemented.  The
optimization was a tiny addition over it.

Thanks again,
  
Richard Biener Nov. 23, 2023, 12:05 p.m. UTC | #5
On Thu, Nov 23, 2023 at 11:56 AM Alexandre Oliva <oliva@adacore.com> wrote:
>
> Hello, Richi,
>
> Thanks for the extensive review!
>
> On Nov 22, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
>
> > On Mon, Nov 20, 2023 at 1:40 PM Alexandre Oliva <oliva@adacore.com> wrote:
> >>
> >> On Oct 26, 2023, Alexandre Oliva <oliva@adacore.com> wrote:
> >>
> >> >> This is a refreshed and improved version of the version posted back in
> >> >> June.  https://gcc.gnu.org/pipermail/gcc-patches/2023-June/621936.html
> >>
> >> > Ping? https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633675.html
> >> > I'm combining the gcc/ipa-strub.cc bits from
> >> > https://gcc.gnu.org/pipermail/gcc-patches/2023-October/633526.html
> >>
> >> Ping?
> >> Retested on x86_64-linux-gnu, with and without -fstrub=all.
>
> > @@ -898,7 +899,24 @@ decl_attributes (tree *node, tree attributes, int flags,
> >        TYPE_NAME (tt) = *node;
> >      }
>
> > -  *anode = cur_and_last_decl[0];
> > +  if (*anode != cur_and_last_decl[0])
> > +    {
> > +      /* Even if !spec->function_type_required, allow the attribute
> > + handler to request the attribute to be applied to the function
> > + type, rather than to the function pointer type, by setting
> > + cur_and_last_decl[0] to the function type.  */
> > +      if (!fn_ptr_tmp
> > +  && POINTER_TYPE_P (*anode)
> > +  && TREE_TYPE (*anode) == cur_and_last_decl[0]
> > +  && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode)))
> > + {
> > +  fn_ptr_tmp = TREE_TYPE (*anode);
> > +  fn_ptr_quals = TYPE_QUALS (*anode);
> > +  anode = &fn_ptr_tmp;
> > + }
> > +      *anode = cur_and_last_decl[0];
> > +    }
> > +
>
> > what is this a workaround for?
>
> For the fact that the strub attribute attaches to types, whether data or
> function types, so we can't have fn_type_req, but when it's a function
> or pointer-to-function type, we want to affect the function type, rather
> than the pointer type, when the attribute has an argument.  The argument
> names the strub mode for a function; that only applies to function
> types, never to data types.
>
> The hunk above introduces the means for the attribute handler to choose
> what to attach the attribute t.
>
> > Isn't there a suitable parsing position for placing the attribute?
>
> It's been a while, but IIRC the need for this first came up in Ada,
> where attributes can't just go anywhere, and it was further complicated
> by the fact that Ada doesn't have first-class function or procedure
> types, only access-to-them, but we needed some means for the attributes
> to apply to the function type.
>
> > +#ifndef STACK_GROWS_DOWNWARD
> > +# define STACK_TOPS GT
> > +#else
> > +# define STACK_TOPS LT
> > +#endif
>
> > according to docs this is defined to 0 or 1 so the above looks wrong
> > (it's always defined).
>
> Ugh.  Thanks, will fix.  (I'm pretty sure I had notes somewhere stating
> that stack-grows-upwards hadn't been tested, and that was for the sheer
> lack of platforms making that choice, but I hoped it wasn't that broken
> :-(
>
> > +  if (optimize < 2 || optimize_size || flag_no_inline)
> > +    return NULL_RTX;
>
> > I'm wondering about these checks in the expansions of the builtins,
> > I think this is about inline expanding or emitting a libcall, right?
>
> Yeah.
>
> > I wonder if you should use optimize_function_for_speed (cfun) instead?
> > Usually -fno-inline shouldn't affect such calls, but -fno-builtin-FOO would.
> > I have no strong opinion here though.
>
> I've occasionally wondered whether builtins were the best framework for
> these semi-internal calls.
>
> > The new builtins seem undocumented - usually those are documented
> > within extend.texi
>
> Erhm...  Weird.  I had documentation for them.
>
> (checks)
>
> No, it's there, in extend.texi, right after __builtin_stack_address.
> It's admittedly a big patch :-/
>
> > I guess placing __builtin___strub_enter calls in the code manually
> > will break in interesting ways - if that's not supposed to happen the
> > trick is to embed a space in the name of the built-in.
>
> Yeah, I was a little torn between the choices here.  On the one hand, I
> needed visible symbols for the out-of-line implementations, so I figured
> that trying to hide the builtins wouldn't bring any advantage.
>
> However, I've also designed the builtins with interfaces that would
> avoid disruption even with explicit calls.  __strub_enter and
> __strub_update only initialize or adjust a pointer handed to them.
> __strub_leave will erase things from the top of the stack to the
> pointer, so if the watermark is "active stack", nothing happens, and
> things only get cleared if it points to "unused stack space".  There's
> potential for disruption if one passes a statically-allocated pointer to
> it, but nothing much different from memsetting that memory range, core
> wars-style.
>
> > -symtab_node::reset (void)
> > +symtab_node::reset (bool preserve_comdat_group)
>
> > not sure what for, I'll leave Honza to comment.
>
> This restores the possibility of getting the pre-PR107897 behavior, that
> the strub wrapper/wrapped splitting relied on.  Conceptually, the
> original function becomes the wrapped one, and the wrapper that calls it
> is kind of an implementation detail to preserve the exposed API/ABI
> while introducing strubbing around the body, so preserving the comdat
> group makes sense.  ISTR getting strub regressions when the patch for
> PR107897 was put in, but my notes don't detail what broke, alas.
>
> > +/* Create a distinct copy of the type of NODE's function, and change
> > +   the fntype of all calls to it with the same main type to the new
> > +   type.  */
> > +
> > +static void
> > +distinctify_node_type (cgraph_node *node)
> > +{
> > +  tree old_type = TREE_TYPE (node->decl);
> > +  tree new_type = build_distinct_type_copy (old_type);
> > +  tree new_ptr_type = NULL_TREE;
> > +
> > +  /* Remap any calls to node->decl that use old_type, or a variant
> > +     thereof, to new_type as well.  We don't look for aliases, their
> > +     declarations will have their types changed independently, and
> > +     we'll adjust their fntypes then.  */
> > +  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
> > +    {
> > +      if (!e->call_stmt)
> > + continue;
> > +      tree fnaddr = gimple_call_fn (e->call_stmt);
> > +      gcc_checking_assert (TREE_CODE (fnaddr) == ADDR_EXPR
> > +   && TREE_OPERAND (fnaddr, 0) == node->decl);
> > +      if (strub_call_fntype_override_p (e->call_stmt))
> > + continue;
> > +      if (!new_ptr_type)
> > + new_ptr_type = build_pointer_type (new_type);
> > +      TREE_TYPE (fnaddr) = new_ptr_type;
> > +      gimple_call_set_fntype (e->call_stmt, new_type);
> > +    }
> > +
> > +  TREE_TYPE (node->decl) = new_type;
>
> > it does feel like there's IPA mechanisms to deal with what you are trying to do
> > here (or in the caller(s)).
>
> If there is, I couldn't find it.  There are some vaguely similar
> operations, but nothing that will modify the ABI of exported functions
> like what strub-at-calls purports to do.  A synthetic argument gets
> added to the function's interface, but since that type could conceivably
> have been associated/unified with non-strub types, or with similar types
> with different strub modes that don't undergo such transformations, we
> have to build a new type for the function, and adjust the type in the
> call graph.  IPA can do this as part of replacing calls with specialized
> clones of a function, but here the attribute calls for an adjustment to
> the function type, vaguely resembling the introduction of the implicit
> 'this' in non-static member functions in C++, but without much help from
> the front-end, and without exposing the formal to the front-end.  It
> doesn't seem to me that IPA is willing to help with that, and even if it
> were, it would require strub to become an actual IPA pass, which would
> be a huge undertaking without clear benefit.

Conceptually it shouldn't be much different from what IPA-SRA does
which is cloning a function but with different arguments, the function
signature transform described in terms of ipa-param-manipulation bits.
I've talked with Martin and at least there's currently no
by-value-to-by-reference
"transform", but IPA-SRA can pass two registers instead of one aggregate
for example.  There's IPA_PARAM_OP_NEW already to add a new param.
In principle the whole function rewriting (apart of recovering
from inlining) should be doable within this framework.

> > +unsigned int
> > +pass_ipa_strub_mode::execute (function *)
> > +{
> > +  last_cgraph_order = 0;
> > +  ipa_strub_set_mode_for_new_functions ();
> > +
> > +  /* Verify before any inlining or other transformations.  */
> > +  verify_strub ();
>
> > if  (flag_checking) verify_strub ();
>
> > please.
>
> No, no, verify_strub is not an internal consistency check, it's part of
> the implementation that verifies that strictness requirements have been
> met WRT calls requested by the user.  In strict mode, a strub function
> can only call other strub functions or functions explicitly marked as
> callable from strub contexts.  This security requirement is to be
> enforced, and diagnostics are to be issued in case of violation,
> regardless of the self-checking mode of the compiler.
>
> > I guess we talked about this last year - what's the reason to have both
> > an IPA pass and a simple IPA pass?
>
> We did.  I didn't recall all the reasons on the spot when you asked, but
> I followed up by email:
> https://gcc.gnu.org/pipermail/gcc-patches/2022-October/603255.html
>
> The earlier pass, that must be placed before any inlining, is the simple
> IPA one, and it only assigns modes.
>
> It must be that early because strub-enabled functions must never be
> inlined into strub-disabled ones.  So we have to determine modes before
> any sort of inlining.
>
> But we can and want to inline strub-enabled functions into other
> strub-enabled functions.  But we don't want to inline a wrapped function
> back into its wrapper, so we want to (early) inline functions before
> they're split, and we also want to inline wrappers after they're split
> if possible.
>
> So the assignment pass has to be before any inlining, and the
> wrapping/splitting (for internal strub) must be after some inlining but
> before the last attempt at inlining.
>
> > IIRC the simple IPA pass is a simple one because it wants to see
> > inlined bodies and "fixes" those up?  Some toplevel comments
> > explaining both passes in the ipa-strub.cc pass would be nice to have.
>
> ACK, will do.
>
> > I guess I also asked before - did you try it with -flto?
>
> Yeah, I've tested it extensively with -flto during development.  I've
> run the GCC testsuite with a patch equivalent to -fstrub=all enabled by
> default during development, and I still do so occasionally.  I didn't
> before the ping, but prompted by you, I did again.  Torture tests and
> specific lto tests don't seem to face any trouble.  And there's no
> reason why they should.  The spots in which the strub passes were
> introduced involved exploring where they fit best, and adding calls such
> as analyzing nodes.  There are even commented-out vestigial calls of
> inline_analyze_function, from a time when the pass was inserted at a
> different spot and that call was required.
>
> > +    /* Decide which of the wrapped function's parms we want to turn into
> > +       references to the argument passed to the wrapper.  In general,
> > we want to
> > +       copy small arguments, and avoid copying large ones.
> > Variable-sized array
> > +       lengths given by other arguments, as in 20020210-1.c, would lead to
> > +       problems if passed by value, after resetting the original function and
> > +       dropping the length computation; passing them by reference works.
> > +       DECL_BY_REFERENCE is *not* a substitute for this: it involves copying
> > +       anyway, but performed at the caller.  */
> > +    indirect_parms_t indirect_nparms (3, false);
> > ...
>
> > IMHO it's a bad idea, at least initially, to mess with the ABI in a heuristical
> > way this way.
>
> Do you realize that this is only about the wrapped ABI, used exclusively
> by the wrapper, to avoid pointlessly copying large arguments twice?  The
> wrapper already needs a custom ABI, because of the added watermark
> pointer, and va_list and whatnot when needed, so a little further
> customization to avoid a quite significant overhead seemed desirable.

I understood the purpose of this, but I also saw it's only ever needed for
non-strict operation and I think that if you care about security you'd never
want to operate in a way that you don't absolutely know the function
you call isn't going to scrub your secrets ...

But yes, I also wondered why you even run into re-gimplification issues.
If you make sure to never pass things as reference that are registers
in the callee context then replacing the PARM_DECL there with
a MEM_REF (new-PARM_DECL) should work without re-gimplification.
OK, existing MEM_REFs need folding to be canonical again, but that
should be it.

Note I didn't look into the re-gimplification code much, I just saw
it and wondered, thinking this would be the reason for it?

> > I see you need all sorts of "fixup", including re-gimplification
> > of stmts, etc., double-ugh.
>
> *nod*, I've considered going the nested-function way, enabling the
> wrapped function to access wrapper's variables directly, but (i) support
> for nested functions isn't available everywhere, and (ii) strub
> functions could already be nested functions, and the wrapped body would
> need access to both enclosing contexts, so adjustments would have to be
> made anyway.
>
> > Ideally IPA-SRA can already do parts of those
> > transforms which would mean ipa-param-manipulation has generic code
> > to encode them?
>
> You'd think so.  I tried that, run into various issues, still have
> pending patches I submitted separately to address them, but...  at some
> point I had to stop waiting for feedback and had make it work without
> that.  I figured what I was doing wasn't a good fit and moved on.
>
> Now, it would be great if IPA could enable me to split the entire body
> of a function *while* adding custom arguments, such as the watermark,
> va_list, etc.  It didn't look like it could or even wanted to, but maybe
> the presence of a useful feature like stack scrubbing could motivate the
> introduction of such infrastructure.
>
> > I realize you can't tail-call, but I wonder if some clever
> > machine-specific tricks would work, I guess it would need the wrapper
> > to be target generated of course.
>
> I'm not sure what you have in mind here, but one of the goals was to
> introduce stack scrubbing without machine-dependent code.
>
> We are considering an improvement for strub(internal) to do away with
> wrappers given machine-specific prologue/epilogue support, but that's
> not even in the roadmap, and it's not clear how that would interact with
> exceptions, which was a primary driver for the currently-proposed
> approaches.

Understood.  But I didn't remember seeing that you do sth like wrapping
each "strubbed call" inside a try { call(); } finally { do-strub } to
ensure this.
Guess it was well hidden in the large patch ;)

> > As it's only for optimization, how much of the code would go away when
> > you avoid changing the parameters of the wrapped function?
>
> We can't avoid changing them entirely, and IIRC some of the
> infrastructure for regimplification was also used for va_list and
> apply_args handling, so it wouldn't save all that much.

Ah, var-args ... indeed for simple forwarding it shouldn't be too bad.
I wonder if this were a way to remove the restriction on function
splitting of var-args functions - there's a recent bugreport about that
(before any va_arg () has been called, of course).

>  It is an
> important optimization, to avoid doubling the stack requirements for
> parameters with -fstrub=internal (because wrapper passes arguments on to
> wrapped body), and the optimization is already implemented, so it would
> be a pity to take it out, and you wouldn't even get rid of any
> significant amount of code.
>
> > Could one easily disable the optimization via a --param maybe?
>
> The param is not implemented, but guarding the loop you commented on by
> it would accomplish it.  You probably won't want to bootstrap GCC with
> -fstrub=all along with it, though ;-)
>
> > There's still much commented code and FIXMEs in, a lot of cgraph
> > related code is used which I can't review very well.
>
> Now that you mentioned it, I realize I hadn't yet dropped some of the
> preprocessed-out bits when I posted v4.  I did so afterwards.  I was
> trying to avoid posting such a monster patch again, but I guess I will
> have to post at least another version :-(
>
> But yeah, there are some wishlist items that remain, and some bits that
> could probably go away.
>
> > I fear that -fstrub is going to be actually used by people - are you up
> > to the task fixing things?
>
> I, for one, would welcome that, and I am (professionally) on the hook to
> fix it, so I'd better be up to it ;-) Building it on top of
> infrastructure I'm closely familiar with, rather than e.g. on IPA, that
> I'm not so much, is a plus in this regard.  That said, I've fixed bugs
> in GCC most everywhere, from front-ends to back-ends, so I'm pretty sure
> I've got it covered in terms of knowledge.  As for time, if it gets too
> much interest (is there such a thing? :-) I might have trouble keeping
> up (there are only so many hours in the day), and it would likely be
> wise then for others to become familiar with it.  I'd be happy to share
> knowledge, and I know I have support from my "employer" (in quotes
> because I'm not formally employed, I'm a consultant) to maintain the
> code I contribute to the community on its behalf, even having other work
> and non-work commitments.  So, bring them on, I say ;-)
>
> > +void ATTRIBUTE_STRUB_CALLABLE
> > +__strub_enter (void **watermark)
> > +{
> > +  *watermark = __builtin_frame_address (0);
>
> > in your experience, do these (or the inline version) work reliably
> > when optimizations like omitting the frame pointer or shrink-wrapping
> > happen from a security point of view?
>
> Yeah.  We're talking about the frame address for the current function
> here, and the compiler knows what it is, even if it optimizes the heck
> out of the function.  In the builtin expansion, it's the top of the
> stack instead, and the compiler also has to have a good grasp of where
> that is.  Red zones are a small complication, but not really much
> trouble here: it's a watermark we're talking about, how high the stack
> may have gotten, so we know we have to scrub no further than that.  If
> it goes too far (as it often does with red zones), we just scrub a
> little too much.  What shouldn't happen is not going far enough, but
> since we rely on the stack frame address for __strub_update to update
> the watermark, and the caller can't really misrepresent its own stack
> use without risking the callee to corrupt it, this works really well.
> If strub_update gets builtin-expanded inline, it uses the adjusted (for
> red zone) stack pointer instead, after any internal stack allocation
> (prologue and alloca), but it won't necessarily capture non-accumulated
> outgoing args.  The arguments passed to a callee are not regarded as
> part of the caller stack frame, so that should be fine.  In order to
> cover them, one must have a strub-enabled caller and a strub(at-calls)
> callee.
>
> > You are adding symbols to libgcc but I don't see you amending
> > libgcc/libgcc-std.ver.in?
>
> That's a good catch, thanks.
>
> hardcfr symbols are missing too.
>
> Will fix.
>
> > Neither the documentation in extend.texi nor the one in invoke.texi
> > mentions the target audience of -fstrub
>
> ACK, good point, despite all the detail, that is too implicit.  Will
> fix.
>
> > The patches mention the red zone, documenting the guarantees and
> > caveats would be nice to have.  I think the documentation belongs to
> > the attribute documentation.
>
> It is kind of there, but since the details vary depending on the strub
> mode, they're given where the modes are detailed, rather than as if
> applying to all strub modes.  (some of the modes are disabled and
> callable, that don't do any strubbing whatsoever)
>
> > I hope Honza can have a quick look over the IPA passes.
>
> Likewise.  I shall wait for further feedback before posting a huge v5,
> but I'll be happy to refresh the users/aoliva/heads/strub branch in the
> git repo as I adjust the patch if that helps.  Currently the patch is in
> users/aoliva/heads/testme.
>
> > a better way than taking what is percieved as a huge ugly part
> > of the patch.
>
> ISTM you're misperceiving the amount of code involved in that
> optimization.  There is a significant amount of wrapper/wrapped
> parameter manipulation that is *not* involved with the optimization, and
> that was already there before the optimization was implemented.  The
> optimization was a tiny addition over it.

I see.

Thanks,
Richard.

> Thanks again,
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive
  
Alexandre Oliva Nov. 29, 2023, 8:53 a.m. UTC | #6
On Nov 23, 2023, Richard Biener <richard.guenther@gmail.com> wrote:

> Conceptually it shouldn't be much different from what IPA-SRA does
> which is cloning a function but with different arguments, the function
> signature transform described in terms of ipa-param-manipulation bits.
> I've talked with Martin and at least there's currently no
> by-value-to-by-reference
> "transform", but IPA-SRA can pass two registers instead of one aggregate
> for example.  There's IPA_PARAM_OP_NEW already to add a new param.
> In principle the whole function rewriting (apart of recovering
> from inlining) should be doable within this framework.

I agree, but my attempts to do so have been unfruitful, and hit various
very obscure issues that made it unworkable.  Maybe someone smarter than
I am, or with more time than I had, can make the conversion.  The uses
of IPA_PARAM_OP_NEW for the cloning are there, and the more conservative
choices I made got it to work reliably, unlike the more adventurous
alternatives I tried.  One of the obscure issues I recall hitting was
this one.  I can probably still locate the early strub patch that hit
the described issue back then, if there's interest.
https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575044.html

>> The
>> wrapper already needs a custom ABI, because of the added watermark
>> pointer, and va_list and whatnot when needed, so a little further
>> customization to avoid a quite significant overhead seemed desirable.

> I understood the purpose of this, but I also saw it's only ever needed for
> non-strict operation

Erhm...  I suppose you're not talking about -fstrub=strict, because I
can't see how this relates with the by-reference argument passing from
wrapper to wrapped in internal strubbing.

> and I think that if you care about security you'd never
> want to operate in a way that you don't absolutely know the function
> you call isn't going to scrub your secrets ...

The ABI between wrapper and wrapped function splitting in internal strub
doesn't change this.

> But yes, I also wondered why you even run into re-gimplification issues.

Because &arg_#(D)[n_#] is good gimple, but &(*byref_arg_#(D))[n_#] isn't.

Maybe instead of going through regimplify, we could explicitly create an
SSA name for the indirection load, so that the purpose is made more
explicit.

> replacing the PARM_DECL there with a MEM_REF (new-PARM_DECL) should
> work without re-gimplification.

FWIW, I thought so as well ;-)

> I didn't remember seeing that you do sth like wrapping
> each "strubbed call" inside a try { call(); } finally { do-strub } to
> ensure this.
> Guess it was well hidden in the large patch ;)

Indeed, gsi_insert_finally_seq_after_call is where this is taken care of.

>> > As it's only for optimization, how much of the code would go away when
>> > you avoid changing the parameters of the wrapped function?
>> 
>> We can't avoid changing them entirely, and IIRC some of the
>> infrastructure for regimplification was also used for va_list and
>> apply_args handling, so it wouldn't save all that much.

> Ah, var-args ... indeed for simple forwarding it shouldn't be too bad.
> I wonder if this were a way to remove the restriction on function
> splitting of var-args functions - there's a recent bugreport about that
> (before any va_arg () has been called, of course).

If running va_start unconditionally in varargs functions is generally
acceptable, then the method I've used for wrapping varargs functions
should work generally, yeah.  The trick was to turn:

foo (...)
{
  va_list ap;

  ...
  va_start (&ap, <ignored>);
  ...
}

into

wrapped_foo (va_list &wrap)
{
  va_list ap;

  ...
  va_copy (ap, wrap);
  ...
}

foo (...)
{
  va_list wrap;

  va_start (wrap, <also ignored>);
  wrapped_foo (wrap);
  va_end (wrap);
}


Here's the opening comment I added to ipa-strub.cc:

/* This file introduces two passes that, together, implement
   machine-independent stack scrubbing, strub for short.  It arranges
   for stack frames that have strub enabled to be zeroed-out after
   relinquishing control to a caller, whether by returning or by
   propagating an exception.  This admittedly unusual design decision
   was driven by exception support (one needs a stack frame to be
   active to propagate exceptions out of it), and it enabled an
   implementation that is entirely machine-independent (no custom
   epilogue code is required).

   Strub modes can be selected for stack frames by attaching attribute
   strub to functions or to variables (to their types, actually).
   Different strub modes, with different implementation details, are
   available, and they can be selected by an argument to the strub
   attribute.  When enabled by strub-enabled variables, whether by
   accessing (as in reading from) statically-allocated ones, or by
   introducing (as in declaring) automatically-allocated ones, a
   suitable mode is selected automatically.

   At-calls mode modifies the interface of a function, adding a stack
   watermark argument, that callers use to clean up the stack frame of
   the called function.  Because of the interface change, it can only
   be used when explicitly selected, or when a function is internal to
   a translation unit.  Strub-at-calls function types are distinct
   from their original types (they're not modified in-place), and they
   are not interchangeable with other function types.

   Internal mode, in turn, does not modify the type or the interface
   of a function.  It is currently implemented by turning the function
   into a wrapper, moving the function body to a separate wrapped
   function, and scrubbing the wrapped body's stack in the wrapper.
   Internal-strub function types are mostly interface-compatible with
   other strub modes, namely callable (from strub functions, though
   not strub-enabled) and disabled (not callable from strub
   functions).

   Always_inline functions can be strub functions, but they can only
   be called from other strub functions, because strub functions must
   never be inlined into non-strub functions.  Internal and at-calls
   modes are indistinguishable when it comes to always_inline
   functions: they will necessarily be inlined into another strub
   function, and will thus be integrated into the caller's stack
   frame, whatever the mode.  (Contrast with non-always_inline strub
   functions: an at-calls function can be called from other strub
   functions, ensuring no discontinuity in stack erasing, whereas an
   internal-strub function can only be called from other strub
   functions if it happens to be inlined, or if -fstrub=relaxed mode
   is in effect (that's the default).  In -fstrub=strict mode,
   internal-strub functions are not callable from strub functions,
   because the wrapper itself is not strubbed.

   The implementation involves two simple-IPA passes.  The earliest
   one, strub-mode, assigns strub modes to functions.  It needs to run
   before any inlining, so that we can prevent inlining of strub
   functions into non-strub functions.  It notes explicit strub mode
   requests, enables strub in response to strub variables and testing
   options, and flags unsatisfiable requests.

   Three possibilities of unsatisfiable requests come to mind: (a)
   when a strub mode is explicitly selected, but the function uses
   features that make it ineligible for that mode (e.g. at-calls rules
   out calling __builtin_apply_args, because of the interface changes,
   and internal mode rules out noclone or otherwise non-versionable
   functions, non-default varargs, non-local or forced labels, and
   functions with far too many arguments); (b) when some strub mode
   must be enabled because of a strub variable, but the function is
   not eligible or not viable for any mode; and (c) when
   -fstrub=strict is enabled, and calls are found in strub functions
   to functions that are not callable from strub contexts.
   compute_strub_mode implements (a) and (b), and verify_strub
   implements (c).

   The second IPA pass modifies interfaces of at-calls-strub functions
   and types, introduces strub calls in and around them. and splits
   internal-strub functions.  It is placed after early inlining, so
   that even internal-strub functions get a chance of being inlined
   into other strub functions, but before non-early inlining, so that
   internal-strub wrapper functions still get a chance of inlining
   after splitting.

   Wrappers avoid duplicating the copying of large arguments again by
   passing them by reference to the wrapped bodies.  This involves
   occasional SSA rewriting of address computations, because of the
   additional indirection.  Besides these changes, and the
   introduction of the stack watermark parameter, wrappers and wrapped
   functions cooperate to handle variable argument lists (performing
   va_start in the wrapper, passing the list as an argument, and
   replacing va_start calls in the wrapped body with va_copy), and
   __builtin_apply_args (also called in the wrapper and passed to the
   wrapped body as an argument).

   Strub bodies (both internal-mode wrapped bodies, and at-calls
   functions) always start by adjusting the watermark parameter, by
   calling __builtin___strub_update.  The compiler inserts them in the
   main strub pass.  Allocations of additional stack space for the
   frame (__builtin_alloca) are also followed by watermark updates.
   Stack space temporarily allocated to pass arguments to other
   functions, released right after the call, is not regarded as part
   of the frame.  Around calls to them, i.e., in internal-mode
   wrappers and at-calls callers (even calls through pointers), calls
   to __builtin___strub_enter and __builtin___strub_leave are
   inserted, the latter as a __finally block, so that it runs at
   regular and exceptional exit paths.  strub_enter only initializes
   the stack watermark, and strub_leave is where the scrubbing takes
   place, overwriting with zeros the stack space from the top of the
   stack to the watermark.

   These calls can be optimized in various cases.  In
   pass_ipa_strub::adjust_at_calls_call, for example, we enable
   tail-calling and other optimized calls from one strub body to
   another by passing on the watermark parameter.  The builtins
   themselves may undergo inline substitution during expansion,
   dependign on optimization levels.  This involves dealing with stack
   red zones (when the builtins are called out-of-line, the red zone
   cannot be used) and other ugly details related with inlining strub
   bodies into other strub bodies (see expand_builtin_strub_update).
   expand_builtin_strub_leave may even perform partial inline
   substitution.  */

The type attribute description now starts like this:

@cindex @code{strub} type attribute
@item strub
This attribute defines stack-scrubbing properties of functions and
variables, so that functions that access sensitive data can have their
stack frames zeroed-out upon returning or propagating exceptions.  This
may be enabled explicitly, by selecting certain @code{strub} modes for
specific functions, or implicitly, by means of @code{strub} variables.

And I've fixed the symbol versioning for strub and hardcfr functions.

I've just pushed the base strub patch and the recent incremental changes
(and some commits used for testing) to refs/users/aoliva/heads/strub.


Here are changes.html entries for this and for the other newly-added
features:

new AdaCore-contributed hardening features in gcc 13 and 14

Mention hardening of conditionals (added in gcc 13), control flow
redundancy, hardened booleans, and stack scrubbing.

Also cover forced inlining of string operations while at that.
---
 htdocs/gcc-13/changes.html |    6 ++++++
 htdocs/gcc-14/changes.html |   29 +++++++++++++++++++++++++++++
 2 files changed, 35 insertions(+)

diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
index 8ef3d63952daf..c1dea18caf59b 100644
--- a/htdocs/gcc-13/changes.html
+++ b/htdocs/gcc-13/changes.html
@@ -168,6 +168,12 @@ You may also want to check out our
     been added, see also
     <a href="https://gcc.gnu.org/onlinedocs/gcc/Freestanding-Environments.html">Profiling and Test Coverage in Freestanding Environments</a>.
   </li>
+  <li>
+    New options <code>-fharden-compares</code>
+    and <code>-fharden-conditional-branches</code> to verify compares
+    and conditional branches, to detect some power-deprivation
+    hardware attacks, using reversed conditions.
+  </li>
 </ul>
 
 
diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
index 2088ee91a34e1..cb92f8d8095c3 100644
--- a/htdocs/gcc-14/changes.html
+++ b/htdocs/gcc-14/changes.html
@@ -109,6 +109,35 @@ a work-in-progress.</p>
     of hardening flags.  The options it enables can be displayed using the
     <code>--help=hardened</code> option.
   </li>
+  <li>
+    New option <code>-fharden-control-flow-redundancy</code>, to
+    verify, at the end of functions, that the visited basic blocks
+    correspond to a legitimate execution path, so as to detect and
+    prevent attacks that transfer control into the middle of
+    functions.
+  </li>
+  <li>
+    New type attribute <code>hardbool</code>, for C and Ada.  Hardened
+    booleans take user-specified representations for <code>true</code>
+    and <code>false</code>, presumably with higher hamming distance
+    than standard booleans, and get verified at every use, detecting
+    memory corruption and some malicious attacks.
+  </li>
+  <li>
+    New type attribute <code>strub</code> to control stack scrubbing
+    properties of functions and variables.  The stack frame used by
+    functions marked with the attribute gets zeroed-out upon returning
+    or exception escaping.  Scalar variables marked with the attribute
+    cause functions contaning or accessing them to get stack scrubbing
+    enabled implicitly.
+  </li>
+  <li>
+    New option <code>-finline-stringops</code>, to force inline
+    expansion of <code>memcmp</code>, <code>memcpy</code>,
+    <code>memmove</code> and <code>memset</code>, even when that is
+    not an optimization, to avoid relying on library
+    implementations.
+  </li>
 </ul>
 <!-- .................................................................. -->
 <h2 id="languages">New Languages and Language specific improvements</h2>
  
Richard Biener Nov. 29, 2023, 12:48 p.m. UTC | #7
On Wed, Nov 29, 2023 at 9:53 AM Alexandre Oliva <oliva@adacore.com> wrote:
>
> On Nov 23, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
>
> > Conceptually it shouldn't be much different from what IPA-SRA does
> > which is cloning a function but with different arguments, the function
> > signature transform described in terms of ipa-param-manipulation bits.
> > I've talked with Martin and at least there's currently no
> > by-value-to-by-reference
> > "transform", but IPA-SRA can pass two registers instead of one aggregate
> > for example.  There's IPA_PARAM_OP_NEW already to add a new param.
> > In principle the whole function rewriting (apart of recovering
> > from inlining) should be doable within this framework.
>
> I agree, but my attempts to do so have been unfruitful, and hit various
> very obscure issues that made it unworkable.  Maybe someone smarter than
> I am, or with more time than I had, can make the conversion.  The uses
> of IPA_PARAM_OP_NEW for the cloning are there, and the more conservative
> choices I made got it to work reliably, unlike the more adventurous
> alternatives I tried.  One of the obscure issues I recall hitting was
> this one.  I can probably still locate the early strub patch that hit
> the described issue back then, if there's interest.
> https://gcc.gnu.org/pipermail/gcc-patches/2021-July/575044.html
>
> >> The
> >> wrapper already needs a custom ABI, because of the added watermark
> >> pointer, and va_list and whatnot when needed, so a little further
> >> customization to avoid a quite significant overhead seemed desirable.
>
> > I understood the purpose of this, but I also saw it's only ever needed for
> > non-strict operation
>
> Erhm...  I suppose you're not talking about -fstrub=strict, because I
> can't see how this relates with the by-reference argument passing from
> wrapper to wrapped in internal strubbing.
>
> > and I think that if you care about security you'd never
> > want to operate in a way that you don't absolutely know the function
> > you call isn't going to scrub your secrets ...
>
> The ABI between wrapper and wrapped function splitting in internal strub
> doesn't change this.
>
> > But yes, I also wondered why you even run into re-gimplification issues.
>
> Because &arg_#(D)[n_#] is good gimple, but &(*byref_arg_#(D))[n_#] isn't.

'arg_#(D)' looks like a SSA name, and no, taking the address doesn't work,
so I assume it was &MEM[arg_(D)][n_#] which is indeed OK.  But you
shouldn't need to change a pointer argument to be passed by reference,
do you?  As said, you want to restrict by-reference passing to arguments
that are !is_gimple_reg_type ().  Everywhere where a plain PARM_DECL
was valid a *ptr indirection is as well.

> Maybe instead of going through regimplify, we could explicitly create an
> SSA name for the indirection load, so that the purpose is made more
> explicit.
>
> > replacing the PARM_DECL there with a MEM_REF (new-PARM_DECL) should
> > work without re-gimplification.
>
> FWIW, I thought so as well ;-)
>
> > I didn't remember seeing that you do sth like wrapping
> > each "strubbed call" inside a try { call(); } finally { do-strub } to
> > ensure this.
> > Guess it was well hidden in the large patch ;)
>
> Indeed, gsi_insert_finally_seq_after_call is where this is taken care of.
>
> >> > As it's only for optimization, how much of the code would go away when
> >> > you avoid changing the parameters of the wrapped function?
> >>
> >> We can't avoid changing them entirely, and IIRC some of the
> >> infrastructure for regimplification was also used for va_list and
> >> apply_args handling, so it wouldn't save all that much.
>
> > Ah, var-args ... indeed for simple forwarding it shouldn't be too bad.
> > I wonder if this were a way to remove the restriction on function
> > splitting of var-args functions - there's a recent bugreport about that
> > (before any va_arg () has been called, of course).
>
> If running va_start unconditionally in varargs functions is generally
> acceptable, then the method I've used for wrapping varargs functions
> should work generally, yeah.  The trick was to turn:
>
> foo (...)
> {
>   va_list ap;
>
>   ...
>   va_start (&ap, <ignored>);
>   ...
> }
>
> into
>
> wrapped_foo (va_list &wrap)
> {
>   va_list ap;
>
>   ...
>   va_copy (ap, wrap);
>   ...
> }
>
> foo (...)
> {
>   va_list wrap;
>
>   va_start (wrap, <also ignored>);
>   wrapped_foo (wrap);
>   va_end (wrap);
> }

Ah, nice trick indeed.  I think that should work, it might not
be optimal (I don't think we will optimize the va_copy).

>
> Here's the opening comment I added to ipa-strub.cc:
>
> /* This file introduces two passes that, together, implement
>    machine-independent stack scrubbing, strub for short.  It arranges
>    for stack frames that have strub enabled to be zeroed-out after
>    relinquishing control to a caller, whether by returning or by
>    propagating an exception.  This admittedly unusual design decision
>    was driven by exception support (one needs a stack frame to be
>    active to propagate exceptions out of it), and it enabled an
>    implementation that is entirely machine-independent (no custom
>    epilogue code is required).
>
>    Strub modes can be selected for stack frames by attaching attribute
>    strub to functions or to variables (to their types, actually).
>    Different strub modes, with different implementation details, are
>    available, and they can be selected by an argument to the strub
>    attribute.  When enabled by strub-enabled variables, whether by
>    accessing (as in reading from) statically-allocated ones, or by
>    introducing (as in declaring) automatically-allocated ones, a
>    suitable mode is selected automatically.
>
>    At-calls mode modifies the interface of a function, adding a stack
>    watermark argument, that callers use to clean up the stack frame of
>    the called function.  Because of the interface change, it can only
>    be used when explicitly selected, or when a function is internal to
>    a translation unit.  Strub-at-calls function types are distinct
>    from their original types (they're not modified in-place), and they
>    are not interchangeable with other function types.
>
>    Internal mode, in turn, does not modify the type or the interface
>    of a function.  It is currently implemented by turning the function
>    into a wrapper, moving the function body to a separate wrapped
>    function, and scrubbing the wrapped body's stack in the wrapper.
>    Internal-strub function types are mostly interface-compatible with
>    other strub modes, namely callable (from strub functions, though
>    not strub-enabled) and disabled (not callable from strub
>    functions).
>
>    Always_inline functions can be strub functions, but they can only
>    be called from other strub functions, because strub functions must
>    never be inlined into non-strub functions.  Internal and at-calls
>    modes are indistinguishable when it comes to always_inline
>    functions: they will necessarily be inlined into another strub
>    function, and will thus be integrated into the caller's stack
>    frame, whatever the mode.  (Contrast with non-always_inline strub
>    functions: an at-calls function can be called from other strub
>    functions, ensuring no discontinuity in stack erasing, whereas an
>    internal-strub function can only be called from other strub
>    functions if it happens to be inlined, or if -fstrub=relaxed mode
>    is in effect (that's the default).  In -fstrub=strict mode,
>    internal-strub functions are not callable from strub functions,
>    because the wrapper itself is not strubbed.
>
>    The implementation involves two simple-IPA passes.  The earliest
>    one, strub-mode, assigns strub modes to functions.  It needs to run
>    before any inlining, so that we can prevent inlining of strub
>    functions into non-strub functions.  It notes explicit strub mode
>    requests, enables strub in response to strub variables and testing
>    options, and flags unsatisfiable requests.
>
>    Three possibilities of unsatisfiable requests come to mind: (a)
>    when a strub mode is explicitly selected, but the function uses
>    features that make it ineligible for that mode (e.g. at-calls rules
>    out calling __builtin_apply_args, because of the interface changes,
>    and internal mode rules out noclone or otherwise non-versionable
>    functions, non-default varargs, non-local or forced labels, and
>    functions with far too many arguments); (b) when some strub mode
>    must be enabled because of a strub variable, but the function is
>    not eligible or not viable for any mode; and (c) when
>    -fstrub=strict is enabled, and calls are found in strub functions
>    to functions that are not callable from strub contexts.
>    compute_strub_mode implements (a) and (b), and verify_strub
>    implements (c).
>
>    The second IPA pass modifies interfaces of at-calls-strub functions
>    and types, introduces strub calls in and around them. and splits
>    internal-strub functions.  It is placed after early inlining, so
>    that even internal-strub functions get a chance of being inlined
>    into other strub functions, but before non-early inlining, so that
>    internal-strub wrapper functions still get a chance of inlining
>    after splitting.
>
>    Wrappers avoid duplicating the copying of large arguments again by
>    passing them by reference to the wrapped bodies.  This involves
>    occasional SSA rewriting of address computations, because of the
>    additional indirection.  Besides these changes, and the
>    introduction of the stack watermark parameter, wrappers and wrapped
>    functions cooperate to handle variable argument lists (performing
>    va_start in the wrapper, passing the list as an argument, and
>    replacing va_start calls in the wrapped body with va_copy), and
>    __builtin_apply_args (also called in the wrapper and passed to the
>    wrapped body as an argument).
>
>    Strub bodies (both internal-mode wrapped bodies, and at-calls
>    functions) always start by adjusting the watermark parameter, by
>    calling __builtin___strub_update.  The compiler inserts them in the
>    main strub pass.  Allocations of additional stack space for the
>    frame (__builtin_alloca) are also followed by watermark updates.
>    Stack space temporarily allocated to pass arguments to other
>    functions, released right after the call, is not regarded as part
>    of the frame.  Around calls to them, i.e., in internal-mode
>    wrappers and at-calls callers (even calls through pointers), calls
>    to __builtin___strub_enter and __builtin___strub_leave are
>    inserted, the latter as a __finally block, so that it runs at
>    regular and exceptional exit paths.  strub_enter only initializes
>    the stack watermark, and strub_leave is where the scrubbing takes
>    place, overwriting with zeros the stack space from the top of the
>    stack to the watermark.
>
>    These calls can be optimized in various cases.  In
>    pass_ipa_strub::adjust_at_calls_call, for example, we enable
>    tail-calling and other optimized calls from one strub body to
>    another by passing on the watermark parameter.  The builtins
>    themselves may undergo inline substitution during expansion,
>    dependign on optimization levels.  This involves dealing with stack
>    red zones (when the builtins are called out-of-line, the red zone
>    cannot be used) and other ugly details related with inlining strub
>    bodies into other strub bodies (see expand_builtin_strub_update).
>    expand_builtin_strub_leave may even perform partial inline
>    substitution.  */

Thanks.

> The type attribute description now starts like this:
>
> @cindex @code{strub} type attribute
> @item strub
> This attribute defines stack-scrubbing properties of functions and
> variables, so that functions that access sensitive data can have their
> stack frames zeroed-out upon returning or propagating exceptions.  This
> may be enabled explicitly, by selecting certain @code{strub} modes for
> specific functions, or implicitly, by means of @code{strub} variables.
>
> And I've fixed the symbol versioning for strub and hardcfr functions.
>
> I've just pushed the base strub patch and the recent incremental changes
> (and some commits used for testing) to refs/users/aoliva/heads/strub.
>
>
> Here are changes.html entries for this and for the other newly-added
> features:
>
> new AdaCore-contributed hardening features in gcc 13 and 14
>
> Mention hardening of conditionals (added in gcc 13), control flow
> redundancy, hardened booleans, and stack scrubbing.
>
> Also cover forced inlining of string operations while at that.
> ---
>  htdocs/gcc-13/changes.html |    6 ++++++
>  htdocs/gcc-14/changes.html |   29 +++++++++++++++++++++++++++++
>  2 files changed, 35 insertions(+)
>
> diff --git a/htdocs/gcc-13/changes.html b/htdocs/gcc-13/changes.html
> index 8ef3d63952daf..c1dea18caf59b 100644
> --- a/htdocs/gcc-13/changes.html
> +++ b/htdocs/gcc-13/changes.html
> @@ -168,6 +168,12 @@ You may also want to check out our
>      been added, see also
>      <a href="https://gcc.gnu.org/onlinedocs/gcc/Freestanding-Environments.html">Profiling and Test Coverage in Freestanding Environments</a>.
>    </li>
> +  <li>
> +    New options <code>-fharden-compares</code>
> +    and <code>-fharden-conditional-branches</code> to verify compares
> +    and conditional branches, to detect some power-deprivation
> +    hardware attacks, using reversed conditions.
> +  </li>
>  </ul>
>
>
> diff --git a/htdocs/gcc-14/changes.html b/htdocs/gcc-14/changes.html
> index 2088ee91a34e1..cb92f8d8095c3 100644
> --- a/htdocs/gcc-14/changes.html
> +++ b/htdocs/gcc-14/changes.html
> @@ -109,6 +109,35 @@ a work-in-progress.</p>
>      of hardening flags.  The options it enables can be displayed using the
>      <code>--help=hardened</code> option.
>    </li>
> +  <li>
> +    New option <code>-fharden-control-flow-redundancy</code>, to
> +    verify, at the end of functions, that the visited basic blocks
> +    correspond to a legitimate execution path, so as to detect and
> +    prevent attacks that transfer control into the middle of
> +    functions.
> +  </li>
> +  <li>
> +    New type attribute <code>hardbool</code>, for C and Ada.  Hardened
> +    booleans take user-specified representations for <code>true</code>
> +    and <code>false</code>, presumably with higher hamming distance
> +    than standard booleans, and get verified at every use, detecting
> +    memory corruption and some malicious attacks.
> +  </li>
> +  <li>
> +    New type attribute <code>strub</code> to control stack scrubbing
> +    properties of functions and variables.  The stack frame used by
> +    functions marked with the attribute gets zeroed-out upon returning
> +    or exception escaping.  Scalar variables marked with the attribute
> +    cause functions contaning or accessing them to get stack scrubbing
> +    enabled implicitly.
> +  </li>
> +  <li>
> +    New option <code>-finline-stringops</code>, to force inline
> +    expansion of <code>memcmp</code>, <code>memcpy</code>,
> +    <code>memmove</code> and <code>memset</code>, even when that is
> +    not an optimization, to avoid relying on library
> +    implementations.
> +  </li>
>  </ul>
>  <!-- .................................................................. -->
>  <h2 id="languages">New Languages and Language specific improvements</h2>

LGTM.

Can you check on the pass-by-reference thing again please?

Let's see if Honza or Martin have any comments on the IPA bits, I just
mentioned what I think should be doable ...

Thanks,
Richard.

>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive
  
Alexandre Oliva Nov. 30, 2023, 4:13 a.m. UTC | #8
On Nov 29, 2023, Richard Biener <richard.guenther@gmail.com> wrote:

>> Because &arg_#(D)[n_#] is good gimple, but &(*byref_arg_#(D))[n_#] isn't.

> 'arg_#(D)' looks like a SSA name, and no, taking the address doesn't work,
> so I assume it was &MEM[arg_(D)][n_#] which is indeed OK.

Yeah.

> But you shouldn't need to change a pointer argument to be passed by
> reference, do you?

True, my attempt to simplify the example moved it past the breaking point.

IIRC the actual situations I hit involved computing address of members
of compound objects, such as struct members, even array elements
thereof.  They became problematic after replacing the object with a
dereference in gimple stmts.  The (effectively) offsetting operation is
well-formed gimple, but IIRC adding dereferencing to it made it
malformed gimple.  I don't immediately see why this should be the case,
since it's still offsetting, so perhaps I misremember.

> As said, you want to restrict by-reference passing to arguments
> that are !is_gimple_reg_type ()

*nod*, it was already there:

      if (!(0 /* DECL_BY_REFERENCE (narg) */
	    || is_gimple_reg_type (TREE_TYPE (nparm))
            ...
	{
	  indirect_nparms.add (nparm);

>> Here are changes.html entries for this and for the other newly-added
>> features:

> LGTM.

Was that an ok to install, once the relevant pieces are in?

> Can you check on the pass-by-reference thing again please?

Sure.  I'll get back to you shortly.

If argument indirection becomes the only blocking issue, I'd be happy to
disable it, or even split out the patch that introduces it, so that the
bulk of the feature can go in while we sort out these details.
Disabling it is as simple as:

diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
index 293bec132b885..90770202fb851 100644
--- a/gcc/ipa-strub.cc
+++ b/gcc/ipa-strub.cc
@@ -2831,6 +2831,7 @@ pass_ipa_strub::execute (function *)
 	   parm = DECL_CHAIN (parm),
 	   nparm = DECL_CHAIN (nparm),
 	   nparmt = nparmt ? TREE_CHAIN (nparmt) : NULL_TREE)
+      if (true) ; else // ??? Disable parm indirection for now.
       if (!(0 /* DECL_BY_REFERENCE (narg) */
 	    || is_gimple_reg_type (TREE_TYPE (nparm))
 	    || VECTOR_TYPE_P (TREE_TYPE (nparm))


> Let's see if Honza or Martin have any comments on the IPA bits, I just
> mentioned what I think should be doable ...

I'm curious as to what you're hoping for.  I mean, I am using
create_version_clone_with_body, adding the new params and copying the
preexisting ones, and modifying some argument types for indirection
after cloning.  The problems I faced were as I tried to replace params
with their indirected versions.  According to my notes and my
recollection, that's where I hit most of the trouble.  But what would
this really buy us?  Do you envision a possibility of actually splitting
out the original function body, so that IPA takes care of the whole
wrapping?  AFAICT that would require a lot more infrastructure to deal
with new and modified parameters, though the details of what I learned
about this API back then, and that made it clear I wouldn't be able to
use it, seem to have faded away from my memory.
  
Alexandre Oliva Nov. 30, 2023, 5:04 a.m. UTC | #9
On Nov 29, 2023, Richard Biener <richard.guenther@gmail.com> wrote:

> On Wed, Nov 29, 2023 at 9:53 AM Alexandre Oliva <oliva@adacore.com> wrote:

>> Because &arg_#(D)[n_#] is good gimple, but &(*byref_arg_#(D))[n_#] isn't.

> 'arg_#(D)' looks like a SSA name, and no, taking the address doesn't work,
> so I assume it was &MEM[arg_(D)][n_#] which is indeed OK.  But you
> shouldn't need to change a pointer argument to be passed by reference,
> do you?  As said, you want to restrict by-reference passing to arguments
> that are !is_gimple_reg_type ().  Everywhere where a plain PARM_DECL
> was valid a *ptr indirection is as well.

> Can you check on the pass-by-reference thing again please?

Applying the following patchlet on top of refs/users/aoliva/heads/strub
(that has a -fstrub=all patchlet in it) I get a build error in libgo
building golang.org/x/mod/sumdb.o:

In function ‘golang_0org_1x_1mod_1sumdb.Client.checkTrees.strub.0’:
go1: error: invalid argument to gimple call
&older_195(D)->Hash
# VUSE <.MEM_55>
_16 = __builtin_memcmp (&h, &older_195(D)->Hash, 32);
during IPA pass: strub

golang.org/x/mod/sumdb.go.057i.remove_symbols:  _5 = __builtin_memcmp (&h, &older.Hash, 32);

within golang_0org_1x_1mod_1sumdb.Client.checkTrees becomes, in wrapped
version thereof:

golang.org/x/mod/sumdb.go.058i.strub:  _16 = __builtin_memcmp (&h, &older_195(D)->Hash, 32);

It's not even the case that Hash is at offset 0 into older's type, but I
suspect the reason why the former is well-formed while the latter is not
the offset, but the SSA_NAME.


diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
index 293bec132b8..515ab9a2ee5 100644
--- a/gcc/ipa-strub.cc
+++ b/gcc/ipa-strub.cc
@@ -1950,7 +1950,7 @@ walk_regimplify_addr_expr (tree *op, int *rec, void *arg)
   if (!*op || TREE_CODE (*op) != ADDR_EXPR)
     return NULL_TREE;
 
-  if (!is_gimple_val (*op))
+  if (0 && !is_gimple_val (*op))
     {
       tree ret = force_gimple_operand_gsi (&gsi, *op, true,
                                           NULL_TREE, true, GSI_SAME_STMT);
  
Richard Biener Nov. 30, 2023, 11:56 a.m. UTC | #10
On Thu, Nov 30, 2023 at 6:04 AM Alexandre Oliva <oliva@adacore.com> wrote:
>
> On Nov 29, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
>
> > On Wed, Nov 29, 2023 at 9:53 AM Alexandre Oliva <oliva@adacore.com> wrote:
>
> >> Because &arg_#(D)[n_#] is good gimple, but &(*byref_arg_#(D))[n_#] isn't.
>
> > 'arg_#(D)' looks like a SSA name, and no, taking the address doesn't work,
> > so I assume it was &MEM[arg_(D)][n_#] which is indeed OK.  But you
> > shouldn't need to change a pointer argument to be passed by reference,
> > do you?  As said, you want to restrict by-reference passing to arguments
> > that are !is_gimple_reg_type ().  Everywhere where a plain PARM_DECL
> > was valid a *ptr indirection is as well.
>
> > Can you check on the pass-by-reference thing again please?
>
> Applying the following patchlet on top of refs/users/aoliva/heads/strub
> (that has a -fstrub=all patchlet in it) I get a build error in libgo
> building golang.org/x/mod/sumdb.o:
>
> In function ‘golang_0org_1x_1mod_1sumdb.Client.checkTrees.strub.0’:
> go1: error: invalid argument to gimple call
> &older_195(D)->Hash
> # VUSE <.MEM_55>
> _16 = __builtin_memcmp (&h, &older_195(D)->Hash, 32);
> during IPA pass: strub
>
> golang.org/x/mod/sumdb.go.057i.remove_symbols:  _5 = __builtin_memcmp (&h, &older.Hash, 32);

Ah, yeah - &older.Hash is considered a constant while &older_195(D)->Hash isn't.

So indeed you are correct - you'll need adjustments, sorry for
misleading you ...

Richard.

> within golang_0org_1x_1mod_1sumdb.Client.checkTrees becomes, in wrapped
> version thereof:
>
> golang.org/x/mod/sumdb.go.058i.strub:  _16 = __builtin_memcmp (&h, &older_195(D)->Hash, 32);
>
> It's not even the case that Hash is at offset 0 into older's type, but I
> suspect the reason why the former is well-formed while the latter is not
> the offset, but the SSA_NAME.
>
>
> diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
> index 293bec132b8..515ab9a2ee5 100644
> --- a/gcc/ipa-strub.cc
> +++ b/gcc/ipa-strub.cc
> @@ -1950,7 +1950,7 @@ walk_regimplify_addr_expr (tree *op, int *rec, void *arg)
>    if (!*op || TREE_CODE (*op) != ADDR_EXPR)
>      return NULL_TREE;
>
> -  if (!is_gimple_val (*op))
> +  if (0 && !is_gimple_val (*op))
>      {
>        tree ret = force_gimple_operand_gsi (&gsi, *op, true,
>                                            NULL_TREE, true, GSI_SAME_STMT);
>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive
  
Richard Biener Nov. 30, 2023, noon UTC | #11
On Thu, Nov 30, 2023 at 5:13 AM Alexandre Oliva <oliva@adacore.com> wrote:
>
> On Nov 29, 2023, Richard Biener <richard.guenther@gmail.com> wrote:
>
> >> Because &arg_#(D)[n_#] is good gimple, but &(*byref_arg_#(D))[n_#] isn't.
>
> > 'arg_#(D)' looks like a SSA name, and no, taking the address doesn't work,
> > so I assume it was &MEM[arg_(D)][n_#] which is indeed OK.
>
> Yeah.
>
> > But you shouldn't need to change a pointer argument to be passed by
> > reference, do you?
>
> True, my attempt to simplify the example moved it past the breaking point.
>
> IIRC the actual situations I hit involved computing address of members
> of compound objects, such as struct members, even array elements
> thereof.  They became problematic after replacing the object with a
> dereference in gimple stmts.  The (effectively) offsetting operation is
> well-formed gimple, but IIRC adding dereferencing to it made it
> malformed gimple.  I don't immediately see why this should be the case,
> since it's still offsetting, so perhaps I misremember.
>
> > As said, you want to restrict by-reference passing to arguments
> > that are !is_gimple_reg_type ()
>
> *nod*, it was already there:
>
>       if (!(0 /* DECL_BY_REFERENCE (narg) */
>             || is_gimple_reg_type (TREE_TYPE (nparm))
>             ...
>         {
>           indirect_nparms.add (nparm);
>
> >> Here are changes.html entries for this and for the other newly-added
> >> features:
>
> > LGTM.
>
> Was that an ok to install, once the relevant pieces are in?

See below.

> > Can you check on the pass-by-reference thing again please?
>
> Sure.  I'll get back to you shortly.
>
> If argument indirection becomes the only blocking issue, I'd be happy to
> disable it, or even split out the patch that introduces it, so that the
> bulk of the feature can go in while we sort out these details.
> Disabling it is as simple as:
>
> diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
> index 293bec132b885..90770202fb851 100644
> --- a/gcc/ipa-strub.cc
> +++ b/gcc/ipa-strub.cc
> @@ -2831,6 +2831,7 @@ pass_ipa_strub::execute (function *)
>            parm = DECL_CHAIN (parm),
>            nparm = DECL_CHAIN (nparm),
>            nparmt = nparmt ? TREE_CHAIN (nparmt) : NULL_TREE)
> +      if (true) ; else // ??? Disable parm indirection for now.
>        if (!(0 /* DECL_BY_REFERENCE (narg) */
>             || is_gimple_reg_type (TREE_TYPE (nparm))
>             || VECTOR_TYPE_P (TREE_TYPE (nparm))
>
>
> > Let's see if Honza or Martin have any comments on the IPA bits, I just
> > mentioned what I think should be doable ...
>
> I'm curious as to what you're hoping for.  I mean, I am using
> create_version_clone_with_body, adding the new params and copying the
> preexisting ones, and modifying some argument types for indirection
> after cloning.  The problems I faced were as I tried to replace params
> with their indirected versions.  According to my notes and my
> recollection, that's where I hit most of the trouble.  But what would
> this really buy us?  Do you envision a possibility of actually splitting
> out the original function body, so that IPA takes care of the whole
> wrapping?  AFAICT that would require a lot more infrastructure to deal
> with new and modified parameters, though the details of what I learned
> about this API back then, and that made it clear I wouldn't be able to
> use it, seem to have faded away from my memory.

In the end I was hoping for general comments on the cgraph usage
and for the specifics indeed being able to use IPA mechanisms
to perform the stmt re-writing (and re-gimplification) for the value to
by-reference replacement.  The core copy_body mechanism
might support it by simply registering *new_param as replacement
for 'old_param' in the copy_body_data decl_map.

Iff the IPA folks (Honza or Martin) don't have any further comments
the patch is OK to install by next Monday.

Thanks,
Richard.

>
> --
> Alexandre Oliva, happy hacker            https://FSFLA.org/blogs/lxo/
>    Free Software Activist                   GNU Toolchain Engineer
> More tolerance and less prejudice are key for inclusion and diversity
> Excluding neuro-others for not behaving ""normal"" is *not* inclusive
  

Patch

diff --git a/gcc/Makefile.in b/gcc/Makefile.in
index a25a1e32fbc5f..53e953e224344 100644
--- a/gcc/Makefile.in
+++ b/gcc/Makefile.in
@@ -1542,6 +1542,7 @@  OBJS = \
 	ipa-reference.o \
 	ipa-ref.o \
 	ipa-utils.o \
+	ipa-strub.o \
 	ipa.o \
 	ira.o \
 	ira-build.o \
@@ -2846,6 +2847,7 @@  GTFILES = $(CPPLIB_H) $(srcdir)/input.h $(srcdir)/coretypes.h \
   $(srcdir)/sanopt.cc \
   $(srcdir)/sancov.cc \
   $(srcdir)/ipa-devirt.cc \
+  $(srcdir)/ipa-strub.cc \
   $(srcdir)/internal-fn.h \
   $(srcdir)/calls.cc \
   $(srcdir)/omp-general.h \
diff --git a/gcc/ada/gcc-interface/trans.cc b/gcc/ada/gcc-interface/trans.cc
index 89f0a07c824e3..49046bf51fb38 100644
--- a/gcc/ada/gcc-interface/trans.cc
+++ b/gcc/ada/gcc-interface/trans.cc
@@ -69,6 +69,21 @@ 
 #include "ada-tree.h"
 #include "gigi.h"
 
+/* The following #include is for strub_make_callable.
+
+   This function marks a function as safe to call from strub contexts.  We mark
+   Ada subprograms that may be called implicitly by the compiler, and that won't
+   leave on the stack caller data passed to them.  This stops implicit calls
+   introduced in subprograms that have their stack scrubbed from being flagged
+   as unsafe, even in -fstrub=strict mode.
+
+   These subprograms are also marked with the strub(callable) attribute in Ada
+   sources, but their declarations aren't necessarily imported by GNAT, or made
+   visible to gigi, in units that end up relying on them.  So when gigi
+   introduces their declarations on its own, it must also add the attribute, by
+   calling strub_make_callable.  */
+#include "ipa-strub.h"
+
 /* We should avoid allocating more than ALLOCA_THRESHOLD bytes via alloca,
    for fear of running out of stack space.  If we need more, we use xmalloc
    instead.  */
@@ -454,6 +469,7 @@  gigi (Node_Id gnat_root,
 						     int64_type, NULL_TREE),
 			   NULL_TREE, is_default, true, true, true, false,
 			   false, NULL, Empty);
+  strub_make_callable (mulv64_decl);
 
   if (Enable_128bit_Types)
     {
@@ -466,6 +482,7 @@  gigi (Node_Id gnat_root,
 							 NULL_TREE),
 			       NULL_TREE, is_default, true, true, true, false,
 			       false, NULL, Empty);
+      strub_make_callable (mulv128_decl);
     }
 
   /* Name of the _Parent field in tagged record types.  */
@@ -722,6 +739,7 @@  build_raise_check (int check, enum exception_info_kind kind)
     = create_subprog_decl (get_identifier (Name_Buffer), NULL_TREE, ftype,
 			   NULL_TREE, is_default, true, true, true, false,
 			   false, NULL, Empty);
+  strub_make_callable (result);
   set_call_expr_flags (result, ECF_NORETURN | ECF_XTHROW);
 
   return result;
diff --git a/gcc/ada/gcc-interface/utils.cc b/gcc/ada/gcc-interface/utils.cc
index 4e2ed173fbe97..5b4d16d7904ae 100644
--- a/gcc/ada/gcc-interface/utils.cc
+++ b/gcc/ada/gcc-interface/utils.cc
@@ -39,6 +39,7 @@ 
 #include "varasm.h"
 #include "toplev.h"
 #include "opts.h"
+#include "ipa-strub.h"
 #include "output.h"
 #include "debug.h"
 #include "convert.h"
@@ -6727,9 +6728,77 @@  handle_no_stack_protector_attribute (tree *node, tree name, tree, int,
    struct attribute_spec.handler.  */
 
 static tree
-handle_strub_attribute (tree *, tree, tree, int, bool *no_add_attrs)
+handle_strub_attribute (tree *node, tree name,
+			tree args,
+			int ARG_UNUSED (flags), bool *no_add_attrs)
 {
-  *no_add_attrs = true;
+  bool enable = true;
+
+  if (args && FUNCTION_POINTER_TYPE_P (*node))
+    *node = TREE_TYPE (*node);
+
+  if (args && FUNC_OR_METHOD_TYPE_P (*node))
+    {
+      switch (strub_validate_fn_attr_parm (TREE_VALUE (args)))
+	{
+	case 1:
+	case 2:
+	  enable = true;
+	  break;
+
+	case 0:
+	  warning (OPT_Wattributes,
+		   "%qE attribute ignored because of argument %qE",
+		   name, TREE_VALUE (args));
+	  *no_add_attrs = true;
+	  enable = false;
+	  break;
+
+	case -1:
+	case -2:
+	  enable = false;
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	}
+
+      args = TREE_CHAIN (args);
+    }
+
+  if (args)
+    {
+      warning (OPT_Wattributes,
+	       "ignoring attribute %qE because of excess arguments"
+	       " starting at %qE",
+	       name, TREE_VALUE (args));
+      *no_add_attrs = true;
+      enable = false;
+    }
+
+  /* Warn about unmet expectations that the strub attribute works like a
+     qualifier.  ??? Could/should we extend it to the element/field types
+     here?  */
+  if (TREE_CODE (*node) == ARRAY_TYPE
+      || VECTOR_TYPE_P (*node)
+      || TREE_CODE (*node) == COMPLEX_TYPE)
+    warning (OPT_Wattributes,
+	     "attribute %qE does not apply to elements"
+	     " of non-scalar type %qT",
+	     name, *node);
+  else if (RECORD_OR_UNION_TYPE_P (*node))
+    warning (OPT_Wattributes,
+	     "attribute %qE does not apply to fields"
+	     " of aggregate type %qT",
+	     name, *node);
+
+  /* If we see a strub-enabling attribute, and we're at the default setting,
+     implicitly or explicitly, note that the attribute was seen, so that we can
+     reduce the compile-time overhead to nearly zero when the strub feature is
+     not used.  */
+  if (enable && flag_strub < -2)
+    flag_strub += 2;
+
   return NULL_TREE;
 }
 
diff --git a/gcc/attribs.cc b/gcc/attribs.cc
index 229d8b32c1ee0..caa157d0a4e24 100644
--- a/gcc/attribs.cc
+++ b/gcc/attribs.cc
@@ -27,6 +27,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "diagnostic-core.h"
 #include "attribs.h"
 #include "fold-const.h"
+#include "ipa-strub.h"
 #include "stor-layout.h"
 #include "langhooks.h"
 #include "plugin.h"
@@ -785,8 +786,8 @@  decl_attributes (tree *node, tree attributes, int flags,
 	  flags &= ~(int) ATTR_FLAG_TYPE_IN_PLACE;
 	}
 
-      if (spec->function_type_required && TREE_CODE (*anode) != FUNCTION_TYPE
-	  && TREE_CODE (*anode) != METHOD_TYPE)
+      if (spec->function_type_required
+	  && !FUNC_OR_METHOD_TYPE_P (*anode))
 	{
 	  if (TREE_CODE (*anode) == POINTER_TYPE
 	      && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode)))
@@ -898,7 +899,24 @@  decl_attributes (tree *node, tree attributes, int flags,
 	      TYPE_NAME (tt) = *node;
 	    }
 
-	  *anode = cur_and_last_decl[0];
+	  if (*anode != cur_and_last_decl[0])
+	    {
+	      /* Even if !spec->function_type_required, allow the attribute
+		 handler to request the attribute to be applied to the function
+		 type, rather than to the function pointer type, by setting
+		 cur_and_last_decl[0] to the function type.  */
+	      if (!fn_ptr_tmp
+		  && POINTER_TYPE_P (*anode)
+		  && TREE_TYPE (*anode) == cur_and_last_decl[0]
+		  && FUNC_OR_METHOD_TYPE_P (TREE_TYPE (*anode)))
+		{
+		  fn_ptr_tmp = TREE_TYPE (*anode);
+		  fn_ptr_quals = TYPE_QUALS (*anode);
+		  anode = &fn_ptr_tmp;
+		}
+	      *anode = cur_and_last_decl[0];
+	    }
+
 	  if (ret == error_mark_node)
 	    {
 	      warning (OPT_Wattributes, "%qE attribute ignored", name);
@@ -1501,9 +1519,20 @@  comp_type_attributes (const_tree type1, const_tree type2)
   if ((lookup_attribute ("nocf_check", TYPE_ATTRIBUTES (type1)) != NULL)
       ^ (lookup_attribute ("nocf_check", TYPE_ATTRIBUTES (type2)) != NULL))
     return 0;
+  int strub_ret = strub_comptypes (CONST_CAST_TREE (type1),
+				   CONST_CAST_TREE (type2));
+  if (strub_ret == 0)
+    return strub_ret;
   /* As some type combinations - like default calling-convention - might
      be compatible, we have to call the target hook to get the final result.  */
-  return targetm.comp_type_attributes (type1, type2);
+  int target_ret = targetm.comp_type_attributes (type1, type2);
+  if (target_ret == 0)
+    return target_ret;
+  if (strub_ret == 2 || target_ret == 2)
+    return 2;
+  if (strub_ret == 1 && target_ret == 1)
+    return 1;
+  gcc_unreachable ();
 }
 
 /* PREDICATE acts as a function of type:
diff --git a/gcc/builtins.cc b/gcc/builtins.cc
index cb90bd03b3ea3..b89c7a9376a95 100644
--- a/gcc/builtins.cc
+++ b/gcc/builtins.cc
@@ -71,6 +71,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "gimple-fold.h"
 #include "intl.h"
 #include "file-prefix-map.h" /* remap_macro_filename()  */
+#include "ipa-strub.h" /* strub_watermark_parm()  */
 #include "gomp-constants.h"
 #include "omp-general.h"
 #include "tree-dfa.h"
@@ -151,6 +152,7 @@  static rtx expand_builtin_strnlen (tree, rtx, machine_mode);
 static rtx expand_builtin_alloca (tree);
 static rtx expand_builtin_unop (machine_mode, tree, rtx, rtx, optab);
 static rtx expand_builtin_frame_address (tree, tree);
+static rtx expand_builtin_stack_address ();
 static tree stabilize_va_list_loc (location_t, tree, int);
 static rtx expand_builtin_expect (tree, rtx);
 static rtx expand_builtin_expect_with_probability (tree, rtx);
@@ -5258,6 +5260,252 @@  expand_builtin_frame_address (tree fndecl, tree exp)
     }
 }
 
+#ifndef STACK_GROWS_DOWNWARD
+# define STACK_TOPS GT
+#else
+# define STACK_TOPS LT
+#endif
+
+#ifdef POINTERS_EXTEND_UNSIGNED
+# define STACK_UNSIGNED POINTERS_EXTEND_UNSIGNED
+#else
+# define STACK_UNSIGNED true
+#endif
+
+/* Expand a call to builtin function __builtin_stack_address.  */
+
+static rtx
+expand_builtin_stack_address ()
+{
+  return convert_to_mode (ptr_mode, copy_to_reg (stack_pointer_rtx),
+			  STACK_UNSIGNED);
+}
+
+/* Expand a call to builtin function __builtin_strub_enter.  */
+
+static rtx
+expand_builtin_strub_enter (tree exp)
+{
+  if (!validate_arglist (exp, POINTER_TYPE, VOID_TYPE))
+    return NULL_RTX;
+
+  if (optimize < 1 || flag_no_inline)
+    return NULL_RTX;
+
+  rtx stktop = expand_builtin_stack_address ();
+
+  tree wmptr = CALL_EXPR_ARG (exp, 0);
+  tree wmtype = TREE_TYPE (TREE_TYPE (wmptr));
+  tree wmtree = fold_build2 (MEM_REF, wmtype, wmptr,
+			     build_int_cst (TREE_TYPE (wmptr), 0));
+  rtx wmark = expand_expr (wmtree, NULL_RTX, ptr_mode, EXPAND_MEMORY);
+
+  emit_move_insn (wmark, stktop);
+
+  return const0_rtx;
+}
+
+/* Expand a call to builtin function __builtin_strub_update.  */
+
+static rtx
+expand_builtin_strub_update (tree exp)
+{
+  if (!validate_arglist (exp, POINTER_TYPE, VOID_TYPE))
+    return NULL_RTX;
+
+  if (optimize < 2 || flag_no_inline)
+    return NULL_RTX;
+
+  rtx stktop = expand_builtin_stack_address ();
+
+#ifdef RED_ZONE_SIZE
+  /* Here's how the strub enter, update and leave functions deal with red zones.
+
+     If it weren't for red zones, update, called from within a strub context,
+     would bump the watermark to the top of the stack.  Enter and leave, running
+     in the caller, would use the caller's top of stack address both to
+     initialize the watermark passed to the callee, and to start strubbing the
+     stack afterwards.
+
+     Ideally, we'd update the watermark so as to cover the used amount of red
+     zone, and strub starting at the caller's other end of the (presumably
+     unused) red zone.  Normally, only leaf functions use the red zone, but at
+     this point we can't tell whether a function is a leaf, nor can we tell how
+     much of the red zone it uses.  Furthermore, some strub contexts may have
+     been inlined so that update and leave are called from the same stack frame,
+     and the strub builtins may all have been inlined, turning a strub function
+     into a leaf.
+
+     So cleaning the range from the caller's stack pointer (one end of the red
+     zone) to the (potentially inlined) callee's (other end of the) red zone
+     could scribble over the caller's own red zone.
+
+     We avoid this possibility by arranging for callers that are strub contexts
+     to use their own watermark as the strub starting point.  So, if A calls B,
+     and B calls C, B will tell A to strub up to the end of B's red zone, and
+     will strub itself only the part of C's stack frame and red zone that
+     doesn't overlap with B's.  With that, we don't need to know who's leaf and
+     who isn't: inlined calls will shrink their strub window to zero, each
+     remaining call will strub some portion of the stack, and eventually the
+     strub context will return to a caller that isn't a strub context itself,
+     that will therefore use its own stack pointer as the strub starting point.
+     It's not a leaf, because strub contexts can't be inlined into non-strub
+     contexts, so it doesn't use the red zone, and it will therefore correctly
+     strub up the callee's stack frame up to the end of the callee's red zone.
+     Neat!  */
+  if (true /* (flags_from_decl_or_type (current_function_decl) & ECF_LEAF) */)
+    {
+      poly_int64 red_zone_size = RED_ZONE_SIZE;
+#if STACK_GROWS_DOWNWARD
+      red_zone_size = -red_zone_size;
+#endif
+      stktop = plus_constant (ptr_mode, stktop, red_zone_size);
+      stktop = force_reg (ptr_mode, stktop);
+    }
+#endif
+
+  tree wmptr = CALL_EXPR_ARG (exp, 0);
+  tree wmtype = TREE_TYPE (TREE_TYPE (wmptr));
+  tree wmtree = fold_build2 (MEM_REF, wmtype, wmptr,
+			     build_int_cst (TREE_TYPE (wmptr), 0));
+  rtx wmark = expand_expr (wmtree, NULL_RTX, ptr_mode, EXPAND_MEMORY);
+
+  rtx wmarkr = force_reg (ptr_mode, wmark);
+
+  rtx_code_label *lab = gen_label_rtx ();
+  do_compare_rtx_and_jump (stktop, wmarkr, STACK_TOPS, STACK_UNSIGNED,
+			   ptr_mode, NULL_RTX, lab, NULL,
+			   profile_probability::very_likely ());
+  emit_move_insn (wmark, stktop);
+
+  /* If this is an inlined strub function, also bump the watermark for the
+     enclosing function.  This avoids a problem with the following scenario: A
+     calls B and B calls C, and both B and C get inlined into A.  B allocates
+     temporary stack space before calling C.  If we don't update A's watermark,
+     we may use an outdated baseline for the post-C strub_leave, erasing B's
+     temporary stack allocation.  We only need this if we're fully expanding
+     strub_leave inline.  */
+  tree xwmptr = (optimize > 2
+		 ? strub_watermark_parm (current_function_decl)
+		 : wmptr);
+  if (wmptr != xwmptr)
+    {
+      wmptr = xwmptr;
+      wmtype = TREE_TYPE (TREE_TYPE (wmptr));
+      wmtree = fold_build2 (MEM_REF, wmtype, wmptr,
+			    build_int_cst (TREE_TYPE (wmptr), 0));
+      wmark = expand_expr (wmtree, NULL_RTX, ptr_mode, EXPAND_MEMORY);
+      wmarkr = force_reg (ptr_mode, wmark);
+
+      do_compare_rtx_and_jump (stktop, wmarkr, STACK_TOPS, STACK_UNSIGNED,
+			       ptr_mode, NULL_RTX, lab, NULL,
+			       profile_probability::very_likely ());
+      emit_move_insn (wmark, stktop);
+    }
+
+  emit_label (lab);
+
+  return const0_rtx;
+}
+
+
+/* Expand a call to builtin function __builtin_strub_leave.  */
+
+static rtx
+expand_builtin_strub_leave (tree exp)
+{
+  if (!validate_arglist (exp, POINTER_TYPE, VOID_TYPE))
+    return NULL_RTX;
+
+  if (optimize < 2 || optimize_size || flag_no_inline)
+    return NULL_RTX;
+
+  rtx stktop = NULL_RTX;
+
+  if (tree wmptr = (optimize
+		    ? strub_watermark_parm (current_function_decl)
+		    : NULL_TREE))
+    {
+      tree wmtype = TREE_TYPE (TREE_TYPE (wmptr));
+      tree wmtree = fold_build2 (MEM_REF, wmtype, wmptr,
+				 build_int_cst (TREE_TYPE (wmptr), 0));
+      rtx wmark = expand_expr (wmtree, NULL_RTX, ptr_mode, EXPAND_MEMORY);
+      stktop = force_reg (ptr_mode, wmark);
+    }
+
+  if (!stktop)
+    stktop = expand_builtin_stack_address ();
+
+  tree wmptr = CALL_EXPR_ARG (exp, 0);
+  tree wmtype = TREE_TYPE (TREE_TYPE (wmptr));
+  tree wmtree = fold_build2 (MEM_REF, wmtype, wmptr,
+			     build_int_cst (TREE_TYPE (wmptr), 0));
+  rtx wmark = expand_expr (wmtree, NULL_RTX, ptr_mode, EXPAND_MEMORY);
+
+  rtx wmarkr = force_reg (ptr_mode, wmark);
+
+#ifndef STACK_GROWS_DOWNWARD
+  rtx base = stktop;
+  rtx end = wmarkr;
+#else
+  rtx base = wmarkr;
+  rtx end = stktop;
+#endif
+
+  /* We're going to modify it, so make sure it's not e.g. the stack pointer.  */
+  base = copy_to_reg (base);
+
+  rtx_code_label *done = gen_label_rtx ();
+  do_compare_rtx_and_jump (base, end, LT, STACK_UNSIGNED,
+			   ptr_mode, NULL_RTX, done, NULL,
+			   profile_probability::very_likely ());
+
+  if (optimize < 3)
+    expand_call (exp, NULL_RTX, true);
+  else
+    {
+      /* Ok, now we've determined we want to copy the block, so convert the
+	 addresses to Pmode, as needed to dereference them to access ptr_mode
+	 memory locations, so that we don't have to convert anything within the
+	 loop.  */
+      base = memory_address (ptr_mode, base);
+      end = memory_address (ptr_mode, end);
+
+      rtx zero = force_operand (const0_rtx, NULL_RTX);
+      int ulen = GET_MODE_SIZE (ptr_mode);
+
+      /* ??? It would be nice to use setmem or similar patterns here,
+	 but they do not necessarily obey the stack growth direction,
+	 which has security implications.  We also have to avoid calls
+	 (memset, bzero or any machine-specific ones), which are
+	 likely unsafe here (see TARGET_STRUB_MAY_USE_MEMSET).  */
+#ifndef STACK_GROWS_DOWNWARD
+      rtx incr = plus_constant (Pmode, base, ulen);
+      rtx dstm = gen_rtx_MEM (ptr_mode, base);
+
+      rtx_code_label *loop = gen_label_rtx ();
+      emit_label (loop);
+      emit_move_insn (dstm, zero);
+      emit_move_insn (base, force_operand (incr, NULL_RTX));
+#else
+      rtx decr = plus_constant (Pmode, end, -ulen);
+      rtx dstm = gen_rtx_MEM (ptr_mode, end);
+
+      rtx_code_label *loop = gen_label_rtx ();
+      emit_label (loop);
+      emit_move_insn (end, force_operand (decr, NULL_RTX));
+      emit_move_insn (dstm, zero);
+#endif
+      do_compare_rtx_and_jump (base, end, LT, STACK_UNSIGNED,
+			       Pmode, NULL_RTX, NULL, loop,
+			       profile_probability::very_likely ());
+    }
+
+  emit_label (done);
+
+  return const0_rtx;
+}
+
 /* Expand EXP, a call to the alloca builtin.  Return NULL_RTX if we
    failed and the caller should emit a normal call.  */
 
@@ -7585,6 +7833,27 @@  expand_builtin (tree exp, rtx target, rtx subtarget, machine_mode mode,
     case BUILT_IN_RETURN_ADDRESS:
       return expand_builtin_frame_address (fndecl, exp);
 
+    case BUILT_IN_STACK_ADDRESS:
+      return expand_builtin_stack_address ();
+
+    case BUILT_IN___STRUB_ENTER:
+      target = expand_builtin_strub_enter (exp);
+      if (target)
+	return target;
+      break;
+
+    case BUILT_IN___STRUB_UPDATE:
+      target = expand_builtin_strub_update (exp);
+      if (target)
+	return target;
+      break;
+
+    case BUILT_IN___STRUB_LEAVE:
+      target = expand_builtin_strub_leave (exp);
+      if (target)
+	return target;
+      break;
+
     /* Returns the address of the area where the structure is returned.
        0 otherwise.  */
     case BUILT_IN_AGGREGATE_INCOMING_ADDRESS:
diff --git a/gcc/builtins.def b/gcc/builtins.def
index eb6f4ec2034cd..068a60e1e6dce 100644
--- a/gcc/builtins.def
+++ b/gcc/builtins.def
@@ -995,6 +995,10 @@  DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSL, "ffsl", BT_FN_INT_LONG, ATTR_CONST_NOTHRO
 DEF_EXT_LIB_BUILTIN    (BUILT_IN_FFSLL, "ffsll", BT_FN_INT_LONGLONG, ATTR_CONST_NOTHROW_LEAF_LIST)
 DEF_EXT_LIB_BUILTIN        (BUILT_IN_FORK, "fork", BT_FN_PID, ATTR_NOTHROW_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_FRAME_ADDRESS, "frame_address", BT_FN_PTR_UINT, ATTR_NULL)
+DEF_GCC_BUILTIN        (BUILT_IN_STACK_ADDRESS, "stack_address", BT_FN_PTR, ATTR_NULL)
+DEF_BUILTIN_STUB       (BUILT_IN___STRUB_ENTER, "__builtin___strub_enter")
+DEF_BUILTIN_STUB       (BUILT_IN___STRUB_UPDATE, "__builtin___strub_update")
+DEF_BUILTIN_STUB       (BUILT_IN___STRUB_LEAVE, "__builtin___strub_leave")
 /* [trans-mem]: Adjust BUILT_IN_TM_FREE if BUILT_IN_FREE is changed.  */
 DEF_LIB_BUILTIN        (BUILT_IN_FREE, "free", BT_FN_VOID_PTR, ATTR_NOTHROW_LEAF_LIST)
 DEF_GCC_BUILTIN        (BUILT_IN_FROB_RETURN_ADDR, "frob_return_addr", BT_FN_PTR_PTR, ATTR_NULL)
diff --git a/gcc/c-family/c-attribs.cc b/gcc/c-family/c-attribs.cc
index 232365d46e237..241e50bbd81ce 100644
--- a/gcc/c-family/c-attribs.cc
+++ b/gcc/c-family/c-attribs.cc
@@ -41,6 +41,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "common/common-target.h"
 #include "langhooks.h"
 #include "tree-inline.h"
+#include "ipa-strub.h"
 #include "toplev.h"
 #include "tree-iterator.h"
 #include "opts.h"
@@ -69,6 +70,7 @@  static tree handle_asan_odr_indicator_attribute (tree *, tree, tree, int,
 static tree handle_stack_protect_attribute (tree *, tree, tree, int, bool *);
 static tree handle_no_stack_protector_function_attribute (tree *, tree,
 							tree, int, bool *);
+static tree handle_strub_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noinline_attribute (tree *, tree, tree, int, bool *);
 static tree handle_noclone_attribute (tree *, tree, tree, int, bool *);
 static tree handle_nocf_check_attribute (tree *, tree, tree, int, bool *);
@@ -321,6 +323,8 @@  const struct attribute_spec c_common_attribute_table[] =
   { "no_stack_protector",     0, 0, true, false, false, false,
 			      handle_no_stack_protector_function_attribute,
 			      attr_stack_protect_exclusions },
+  { "strub",		      0, 1, false, true, false, true,
+			      handle_strub_attribute, NULL },
   { "noinline",               0, 0, true,  false, false, false,
 			      handle_noinline_attribute,
 	                      attr_noinline_exclusions },
@@ -1476,6 +1480,84 @@  handle_noipa_attribute (tree *node, tree name, tree, int, bool *no_add_attrs)
   return NULL_TREE;
 }
 
+/* Handle a "strub" attribute; arguments as in
+   struct attribute_spec.handler.  */
+
+static tree
+handle_strub_attribute (tree *node, tree name,
+			tree args,
+			int ARG_UNUSED (flags), bool *no_add_attrs)
+{
+  bool enable = true;
+
+  if (args && FUNCTION_POINTER_TYPE_P (*node))
+    *node = TREE_TYPE (*node);
+
+  if (args && FUNC_OR_METHOD_TYPE_P (*node))
+    {
+      switch (strub_validate_fn_attr_parm (TREE_VALUE (args)))
+	{
+	case 1:
+	case 2:
+	  enable = true;
+	  break;
+
+	case 0:
+	  warning (OPT_Wattributes,
+		   "%qE attribute ignored because of argument %qE",
+		   name, TREE_VALUE (args));
+	  *no_add_attrs = true;
+	  enable = false;
+	  break;
+
+	case -1:
+	case -2:
+	  enable = false;
+	  break;
+
+	default:
+	  gcc_unreachable ();
+	}
+
+      args = TREE_CHAIN (args);
+    }
+
+  if (args)
+    {
+      warning (OPT_Wattributes,
+	       "ignoring attribute %qE because of excess arguments"
+	       " starting at %qE",
+	       name, TREE_VALUE (args));
+      *no_add_attrs = true;
+      enable = false;
+    }
+
+  /* Warn about unmet expectations that the strub attribute works like a
+     qualifier.  ??? Could/should we extend it to the element/field types
+     here?  */
+  if (TREE_CODE (*node) == ARRAY_TYPE
+      || VECTOR_TYPE_P (*node)
+      || TREE_CODE (*node) == COMPLEX_TYPE)
+    warning (OPT_Wattributes,
+	     "attribute %qE does not apply to elements"
+	     " of non-scalar type %qT",
+	     name, *node);
+  else if (RECORD_OR_UNION_TYPE_P (*node))
+    warning (OPT_Wattributes,
+	     "attribute %qE does not apply to fields"
+	     " of aggregate type %qT",
+	     name, *node);
+
+  /* If we see a strub-enabling attribute, and we're at the default setting,
+     implicitly or explicitly, note that the attribute was seen, so that we can
+     reduce the compile-time overhead to nearly zero when the strub feature is
+     not used.  */
+  if (enable && flag_strub < -2)
+    flag_strub += 2;
+
+  return NULL_TREE;
+}
+
 /* Handle a "noinline" attribute; arguments as in
    struct attribute_spec.handler.  */
 
diff --git a/gcc/cgraph.h b/gcc/cgraph.h
index cedaaac3a45b7..177bd9dc5b7ac 100644
--- a/gcc/cgraph.h
+++ b/gcc/cgraph.h
@@ -153,7 +153,7 @@  public:
   void remove (void);
 
   /* Undo any definition or use of the symbol.  */
-  void reset (void);
+  void reset (bool preserve_comdat_group = false);
 
   /* Dump symtab node to F.  */
   void dump (FILE *f);
diff --git a/gcc/cgraphunit.cc b/gcc/cgraphunit.cc
index bccd2f2abb5a3..9a550a5cce645 100644
--- a/gcc/cgraphunit.cc
+++ b/gcc/cgraphunit.cc
@@ -384,7 +384,7 @@  symbol_table::process_new_functions (void)
    functions or variables.  */
 
 void
-symtab_node::reset (void)
+symtab_node::reset (bool preserve_comdat_group)
 {
   /* Reset our data structures so we can analyze the function again.  */
   analyzed = false;
@@ -395,7 +395,8 @@  symtab_node::reset (void)
   cpp_implicit_alias = false;
 
   remove_all_references ();
-  remove_from_same_comdat_group ();
+  if (!preserve_comdat_group)
+    remove_from_same_comdat_group ();
 
   if (cgraph_node *cn = dyn_cast <cgraph_node *> (this))
     {
diff --git a/gcc/common.opt b/gcc/common.opt
index ce34075561f9f..c35b6a3de552a 100644
--- a/gcc/common.opt
+++ b/gcc/common.opt
@@ -2875,6 +2875,35 @@  fstrict-overflow
 Common
 Treat signed overflow as undefined.  Negated as -fwrapv -fwrapv-pointer.
 
+fstrub=disable
+Common RejectNegative Var(flag_strub, 0)
+Disable stack scrub entirely, disregarding strub attributes.
+
+fstrub=strict
+Common RejectNegative Var(flag_strub, -4)
+Enable stack scrub as per attributes, with strict call checking.
+
+; If any strub-enabling attribute is seen when the default or strict
+; initializer values are in effect, flag_strub is bumped up by 2.  The
+; scrub mode gate function will then bump these initializer values to
+; 0 if no strub-enabling attribute is seen.  This minimizes the strub
+; overhead.
+fstrub=relaxed
+Common RejectNegative Var(flag_strub, -3) Init(-3)
+Restore default strub mode: as per attributes, with relaxed checking.
+
+fstrub=all
+Common RejectNegative Var(flag_strub, 3)
+Enable stack scrubbing for all viable functions.
+
+fstrub=at-calls
+Common RejectNegative Var(flag_strub, 1)
+Enable at-calls stack scrubbing for all viable functions.
+
+fstrub=internal
+Common RejectNegative Var(flag_strub, 2)
+Enable internal stack scrubbing for all viable functions.
+
 fsync-libcalls
 Common Var(flag_sync_libcalls) Init(1)
 Implement __atomic operations via libcalls to legacy __sync functions.
diff --git a/gcc/doc/extend.texi b/gcc/doc/extend.texi
index e1295898fc25c..0cdbbc208c059 100644
--- a/gcc/doc/extend.texi
+++ b/gcc/doc/extend.texi
@@ -77,6 +77,7 @@  extensions, accepted by GCC in C90 mode and in C++.
 * Function Names::      Printable strings which are the name of the current
                         function.
 * Return Address::      Getting the return or frame address of a function.
+* Stack Scrubbing::     Stack scrubbing internal interfaces.
 * Vector Extensions::   Using vector instructions through built-in functions.
 * Offsetof::            Special syntax for implementing @code{offsetof}.
 * __sync Builtins::     Legacy built-in functions for atomic memory access.
@@ -9096,6 +9097,263 @@  pid_t wait (wait_status_ptr_t p)
 @}
 @end smallexample
 
+@cindex @code{strub} type attribute
+@item strub
+This attribute defines stack-scrubbing properties of functions and
+variables.  Being a type attribute, it attaches to types, even when
+specified in function and variable declarations.  When applied to
+function types, it takes an optional string argument.  When applied to a
+pointer-to-function type, if the optional argument is given, it gets
+propagated to the function type.
+
+@smallexample
+/* A strub variable.  */
+int __attribute__ ((strub)) var;
+/* A strub variable that happens to be a pointer.  */
+__attribute__ ((strub)) int *strub_ptr_to_int;
+/* A pointer type that may point to a strub variable.  */
+typedef int __attribute__ ((strub)) *ptr_to_strub_int_type;
+
+/* A declaration of a strub function.  */
+extern int __attribute__ ((strub)) foo (void);
+/* A pointer to that strub function.  */
+int __attribute__ ((strub ("at-calls"))) (*ptr_to_strub_fn)(void) = foo;
+@end smallexample
+
+A function associated with @code{at-calls} @code{strub} mode
+(@code{strub("at-calls")}, or just @code{strub}) undergoes interface
+changes.  Its callers are adjusted to match the changes, and to scrub
+(overwrite with zeros) the stack space used by the called function after
+it returns.  The interface change makes the function type incompatible
+with an unadorned but otherwise equivalent type, so @emph{every}
+declaration and every type that may be used to call the function must be
+associated with this strub mode.
+
+A function associated with @code{internal} @code{strub} mode
+(@code{strub("internal")}) retains an unmodified, type-compatible
+interface, but it may be turned into a wrapper that calls the wrapped
+body using a custom interface.  The wrapper then scrubs the stack space
+used by the wrapped body.  Though the wrapped body has its stack space
+scrubbed, the wrapper does not, so arguments and return values may
+remain unscrubbed even when such a function is called by another
+function that enables @code{strub}.  This is why, when compiling with
+@option{-fstrub=strict}, a @code{strub} context is not allowed to call
+@code{internal} @code{strub} functions.
+
+@smallexample
+/* A declaration of an internal-strub function.  */
+extern int __attribute__ ((strub ("internal"))) bar (void);
+
+int __attribute__ ((strub))
+baz (void)
+@{
+  /* Ok, foo was declared above as an at-calls strub function.  */
+  foo ();
+  /* Not allowed in strict mode, otherwise allowed.  */
+  bar ();
+@}
+@end smallexample
+
+An automatically-allocated variable associated with the @code{strub}
+attribute causes the (immediately) enclosing function to have
+@code{strub} enabled.
+
+A statically-allocated variable associated with the @code{strub}
+attribute causes functions that @emph{read} it, through its @code{strub}
+data type, to have @code{strub} enabled.  Reading data by dereferencing
+a pointer to a @code{strub} data type has the same effect.  Note: The
+attribute does not carry over from a composite type to the types of its
+components, so the intended effect may not be obtained with non-scalar
+types.
+
+When selecting a @code{strub}-enabled mode for a function that is not
+explicitly associated with one, because of @code{strub} variables or
+data pointers, the function must satisfy @code{internal} mode viability
+requirements (see below), even when @code{at-calls} mode is also viable
+and, being more efficient, ends up selected as an optimization.
+
+@smallexample
+/* zapme is implicitly strub-enabled because of strub variables.
+   Optimization may change its strub mode, but not the requirements.  */
+static int
+zapme (int i)
+@{
+  /* A local strub variable enables strub.  */
+  int __attribute__ ((strub)) lvar;
+  /* Reading strub data through a pointer-to-strub enables strub.  */
+  lvar = * (ptr_to_strub_int_type) &i;
+  /* Writing to a global strub variable does not enable strub.  */
+  var = lvar;
+  /* Reading from a global strub variable enables strub.  */
+  return var;
+@}
+@end smallexample
+
+A @code{strub} context is the body (as opposed to the interface) of a
+function that has @code{strub} enabled, be it explicitly, by
+@code{at-calls} or @code{internal} mode, or implicitly, due to
+@code{strub} variables or command-line options.
+
+A function of a type associated with the @code{disabled} @code{strub}
+mode (@code{strub("disabled")} will not have its own stack space
+scrubbed.  Such functions @emph{cannot} be called from within
+@code{strub} contexts.
+
+In order to enable a function to be called from within @code{strub}
+contexts without having its stack space scrubbed, associate it with the
+@code{callable} @code{strub} mode (@code{strub("callable")}).
+
+When a function is not assigned a @code{strub} mode, explicitly or
+implicitly, the mode defaults to @code{callable}, except when compiling
+with @option{-fstrub=strict}, that causes @code{strub} mode to default
+to @code{disabled}.
+
+@example
+extern int __attribute__ ((strub ("callable"))) bac (void);
+extern int __attribute__ ((strub ("disabled"))) bad (void);
+ /* Implicitly disabled with -fstrub=strict, otherwise callable.  */
+extern int bah (void);
+
+int __attribute__ ((strub))
+bal (void)
+@{
+  /* Not allowed, bad is not strub-callable.  */
+  bad ();
+  /* Ok, bac is strub-callable.  */
+  bac ();
+  /* Not allowed with -fstrub=strict, otherwise allowed.  */
+  bah ();
+@}
+@end example
+
+Function types marked @code{callable} and @code{disabled} are not
+mutually compatible types, but the underlying interfaces are compatible,
+so it is safe to convert pointers between them, and to use such pointers
+or alternate declarations to call them.  Interfaces are also
+interchangeable between them and @code{internal} (but not
+@code{at-calls}!), but adding @code{internal} to a pointer type will not
+cause the pointed-to function to perform stack scrubbing.
+
+@example
+void __attribute__ ((strub))
+bap (void)
+@{
+  /* Assign a callable function to pointer-to-disabled.
+     Flagged as not quite compatible with -Wpedantic.  */
+  int __attribute__ ((strub ("disabled"))) (*d_p) (void) = bac;
+  /* Not allowed: calls disabled type in a strub context.  */
+  d_p ();
+
+  /* Assign a disabled function to pointer-to-callable.
+     Flagged as not quite compatible with -Wpedantic.  */
+  int __attribute__ ((strub ("callable"))) (*c_p) (void) = bad;
+  /* Ok, safe.  */
+  c_p ();
+
+  /* Assign an internal function to pointer-to-callable.
+     Flagged as not quite compatible with -Wpedantic.  */
+  c_p = bar;
+  /* Ok, safe.  */
+  c_p ();
+
+  /* Assign an at-calls function to pointer-to-callable.
+     Flaggged as incompatible.  */
+  c_p = bal;
+  /* The call through an interface-incompatible type will not use the
+     modified interface expected by the at-calls function, so it is
+     likely to misbehave at runtime.  */
+  c_p ();
+@}
+@end example
+
+@code{Strub} contexts are never inlined into non-@code{strub} contexts.
+When an @code{internal}-strub function is split up, the wrapper can
+often be inlined, but the wrapped body @emph{never} is.  A function
+marked as @code{always_inline}, even if explicitly assigned
+@code{internal} strub mode, will not undergo wrapping, so its body gets
+inlined as required.
+
+@example
+inline int __attribute__ ((strub ("at-calls")))
+inl_atc (void)
+@{
+  /* This body may get inlined into strub contexts.  */
+@}
+
+inline int __attribute__ ((strub ("internal")))
+inl_int (void)
+@{
+  /* This body NEVER gets inlined, though its wrapper may.  */
+@}
+
+inline int __attribute__ ((strub ("internal"), always_inline))
+inl_int_ali (void)
+@{
+  /* No internal wrapper, so this body ALWAYS gets inlined,
+     but it cannot be called from non-strub contexts.  */
+@}
+
+void __attribute__ ((strub ("disabled")))
+bat (void)
+@{
+  /* Not allowed, cannot inline into a non-strub context.  */
+  inl_int_ali ();
+@}
+@end example
+
+@cindex strub eligibility and viability
+Some @option{-fstrub=*} command line options enable @code{strub} modes
+implicitly where viable.  A @code{strub} mode is only viable for a
+function if the function is eligible for that mode, and if other
+conditions, detailed below, are satisfied.  If it's not eligible for a
+mode, attempts to explicitly associate it with that mode are rejected
+with an error message.  If it is eligible, that mode may be assigned
+explicitly through this attribute, but implicit assignment through
+command-line options may involve additional viability requirements.
+
+A function is ineligible for @code{at-calls} @code{strub} mode if a
+different @code{strub} mode is explicitly requested, if attribute
+@code{noipa} is present, or if it calls @code{__builtin_apply_args}.
+@code{At-calls} @code{strub} mode, if not requested through the function
+type, is only viable for an eligible function if the function is not
+visible to other translation units, if it doesn't have its address
+taken, and if it is never called with a function type overrider.
+
+@smallexample
+/* bar is eligible for at-calls strub mode,
+   but not viable for that mode because it is visible to other units.
+   It is eligible and viable for internal strub mode.  */
+void bav () @{@}
+
+/* setp is eligible for at-calls strub mode,
+   but not viable for that mode because its address is taken.
+   It is eligible and viable for internal strub mode.  */
+void setp (void) @{ static void (*p)(void); = setp; @}
+@end smallexample
+
+A function is ineligible for @code{internal} @code{strub} mode if a
+different @code{strub} mode is explicitly requested, or if attribute
+@code{noipa} is present.  For an @code{always_inline} function, meeting
+these requirements is enough to make it eligible.  Any function that has
+attribute @code{noclone}, that uses such extensions as non-local labels,
+computed gotos, alternate variable argument passing interfaces,
+@code{__builtin_next_arg}, or @code{__builtin_return_address}, or that
+takes too many (about 64Ki) arguments is ineligible, unless it is
+@code{always_inline}.  For @code{internal} @code{strub} mode, all
+eligible functions are viable.
+
+@smallexample
+/* flop is not eligible, thus not viable, for at-calls strub mode.
+   Likewise for internal strub mode.  */
+__attribute__ ((noipa)) void flop (void) @{@}
+
+/* flip is eligible and viable for at-calls strub mode.
+   It would be ineligible for internal strub mode, because of noclone,
+   if it weren't for always_inline.  With always_inline, noclone is not
+   an obstacle, so it is also eligible and viable for internal strub mode.  */
+inline __attribute__ ((noclone, always_inline)) void flip (void) @{@}
+@end smallexample
+
 @cindex @code{unused} type attribute
 @item unused
 When attached to a type (including a @code{union} or a @code{struct}),
@@ -12241,6 +12499,55 @@  option is in effect.  Such calls should only be made in debugging
 situations.
 @enddefbuiltin
 
+@deftypefn {Built-in Function} {void *} __builtin_stack_address ()
+This function returns the value of the stack pointer register.
+@end deftypefn
+
+@node Stack Scrubbing
+@section Stack scrubbing internal interfaces
+
+Stack scrubbing involves cooperation between a @code{strub} context,
+i.e., a function whose stack frame is to be zeroed-out, and its callers.
+The caller initializes a stack watermark, the @code{strub} context
+updates the watermark according to its stack use, and the caller zeroes
+it out once it regains control, whether by the callee's returning or by
+an exception.
+
+Each of these steps is performed by a different builtin function call.
+Calls to these builtins are introduced automatically, in response to
+@code{strub} attributes and command-line options; they are not expected
+to be explicitly called by source code.
+
+The functions that implement the builtins are available in libgcc but,
+depending on optimization levels, they are expanded internally, adjusted
+to account for inlining, and sometimes combined/deferred (e.g. passing
+the caller-supplied watermark on to callees, refraining from erasing
+stack areas that the caller will) to enable tail calls and to optimize
+for code size.
+
+@deftypefn {Built-in Function} {void} __builtin___strub_enter (void **@var{wmptr})
+This function initializes a stack @var{watermark} variable with the
+current top of the stack.  A call to this builtin function is introduced
+before entering a @code{strub} context.  It remains as a function call
+if optimization is not enabled.
+@end deftypefn
+
+@deftypefn {Built-in Function} {void} __builtin___strub_update (void **@var{wmptr})
+This function updates a stack @var{watermark} variable with the current
+top of the stack, if it tops the previous watermark.  A call to this
+builtin function is inserted within @code{strub} contexts, whenever
+additional stack space may have been used.  It remains as a function
+call at optimization levels lower than 2.
+@end deftypefn
+
+@deftypefn {Built-in Function} {void} __builtin___strub_leave (void **@var{wmptr})
+This function overwrites the memory area between the current top of the
+stack, and the @var{watermark}ed address.  A call to this builtin
+function is inserted after leaving a @code{strub} context.  It remains
+as a function call at optimization levels lower than 3, and it is guarded by
+a condition at level 2.
+@end deftypefn
+
 @node Vector Extensions
 @section Using Vector Instructions through Built-in Functions
 
diff --git a/gcc/doc/invoke.texi b/gcc/doc/invoke.texi
index 1a762bdcc480f..f18c31fca1f2f 100644
--- a/gcc/doc/invoke.texi
+++ b/gcc/doc/invoke.texi
@@ -649,6 +649,8 @@  Objective-C and Objective-C++ Dialects}.
 -fstack-protector-explicit  -fstack-check
 -fstack-limit-register=@var{reg}  -fstack-limit-symbol=@var{sym}
 -fno-stack-limit  -fsplit-stack
+-fstrub=disable  -fstrub=strict  -fstrub=relaxed
+-fstrub=all  -fstrub=at-calls  -fstrub=internal
 -fvtable-verify=@r{[}std@r{|}preinit@r{|}none@r{]}
 -fvtv-counts  -fvtv-debug
 -finstrument-functions  -finstrument-functions-once
@@ -17689,6 +17691,56 @@  without @option{-fsplit-stack} always has a large stack.  Support for
 this is implemented in the gold linker in GNU binutils release 2.21
 and later.
 
+@opindex -fstrub=disable
+@item -fstrub=disable
+Disable stack scrubbing entirely, ignoring any @code{strub} attributes.
+See @xref{Common Type Attributes}.
+
+@opindex fstrub=strict
+@item -fstrub=strict
+Functions default to @code{strub} mode @code{disabled}, and apply
+@option{strict}ly the restriction that only functions associated with
+@code{strub}-@code{callable} modes (@code{at-calls}, @code{callable} and
+@code{always_inline} @code{internal}) are @code{callable} by functions
+with @code{strub}-enabled modes (@code{at-calls} and @code{internal}).
+
+@opindex fstrub=relaxed
+@item -fstrub=relaxed
+Restore the default stack scrub (@code{strub}) setting, namely,
+@code{strub} is only enabled as required by @code{strub} attributes
+associated with function and data types.  @code{Relaxed} means that
+strub contexts are only prevented from calling functions explicitly
+associated with @code{strub} mode @code{disabled}.  This option is only
+useful to override other @option{-fstrub=*} options that precede it in
+the command line.
+
+@opindex fstrub=at-calls
+@item -fstrub=at-calls
+Enable @code{at-calls} @code{strub} mode where viable.  The primary use
+of this option is for testing.  It exercises the @code{strub} machinery
+in scenarios strictly local to a translation unit.  This @code{strub}
+mode modifies function interfaces, so any function that is visible to
+other translation units, or that has its address taken, will @emph{not}
+be affected by this option.  Optimization options may also affect
+viability.  See the @code{strub} attribute documentation for details on
+viability and eligibility requirements.
+
+@opindex fstrub=internal
+@item -fstrub=internal
+Enable @code{internal} @code{strub} mode where viable.  The primary use
+of this option is for testing.  This option is intended to exercise
+thoroughly parts of the @code{strub} machinery that implement the less
+efficient, but interface-preserving @code{strub} mode.  Functions that
+would not be affected by this option are quite uncommon.
+
+@opindex fstrub=all
+@item -fstrub=all
+Enable some @code{strub} mode where viable.  When both strub modes are
+viable, @code{at-calls} is preferred.  @option{-fdump-ipa-strubm} adds
+function attributes that tell which mode was selected for each function.
+The primary use of this option is for testing, to exercise thoroughly
+the @code{strub} machinery.
+
 @opindex fvtable-verify
 @item -fvtable-verify=@r{[}std@r{|}preinit@r{|}none@r{]}
 This option is only available when compiling C++ code.
@@ -19594,6 +19646,14 @@  and inlining decisions.
 @item inline
 Dump after function inlining.
 
+@item strubm
+Dump after selecting @code{strub} modes, and recording the selections as
+function attributes.
+
+@item strub
+Dump @code{strub} transformations: interface changes, function wrapping,
+and insertion of builtin calls for stack scrubbing and watermarking.
+
 @end table
 
 Additionally, the options @option{-optimized}, @option{-missed},
diff --git a/gcc/doc/tm.texi b/gcc/doc/tm.texi
index f7ac806ff157c..3c6fc506cc1c1 100644
--- a/gcc/doc/tm.texi
+++ b/gcc/doc/tm.texi
@@ -3449,6 +3449,25 @@  in DWARF 2 debug information.  The default is zero.  A different value
 may reduce the size of debug information on some ports.
 @end defmac
 
+@defmac TARGET_STRUB_USE_DYNAMIC_ARRAY
+If defined to nonzero, @code{__strub_leave} will allocate a dynamic
+array covering the stack range that needs scrubbing before clearing it.
+Allocating the array tends to make scrubbing slower, but it enables the
+scrubbing to be safely implemented with a @code{memset} call, which
+could make up for the difference.
+@end defmac
+
+@defmac TARGET_STRUB_MAY_USE_MEMSET
+If defined to nonzero, enable @code{__strub_leave} to be optimized so as
+to call @code{memset} for stack scrubbing.  This is only enabled by
+default if @code{TARGET_STRUB_USE_DYNAMIC_ARRAY} is enabled; it's not
+advisable to enable it otherwise, since @code{memset} would then likely
+overwrite its own stack frame, but it might work if the target ABI
+enables @code{memset} to not use the stack at all, not even for
+arguments or its return address, and its implementation is trivial
+enough that it doesn't use a stack frame.
+@end defmac
+
 @node Exception Handling
 @subsection Exception Handling Support
 @cindex exception handling
diff --git a/gcc/doc/tm.texi.in b/gcc/doc/tm.texi.in
index 141027e0bb464..d513de365e93e 100644
--- a/gcc/doc/tm.texi.in
+++ b/gcc/doc/tm.texi.in
@@ -2685,6 +2685,25 @@  in DWARF 2 debug information.  The default is zero.  A different value
 may reduce the size of debug information on some ports.
 @end defmac
 
+@defmac TARGET_STRUB_USE_DYNAMIC_ARRAY
+If defined to nonzero, @code{__strub_leave} will allocate a dynamic
+array covering the stack range that needs scrubbing before clearing it.
+Allocating the array tends to make scrubbing slower, but it enables the
+scrubbing to be safely implemented with a @code{memset} call, which
+could make up for the difference.
+@end defmac
+
+@defmac TARGET_STRUB_MAY_USE_MEMSET
+If defined to nonzero, enable @code{__strub_leave} to be optimized so as
+to call @code{memset} for stack scrubbing.  This is only enabled by
+default if @code{TARGET_STRUB_USE_DYNAMIC_ARRAY} is enabled; it's not
+advisable to enable it otherwise, since @code{memset} would then likely
+overwrite its own stack frame, but it might work if the target ABI
+enables @code{memset} to not use the stack at all, not even for
+arguments or its return address, and its implementation is trivial
+enough that it doesn't use a stack frame.
+@end defmac
+
 @node Exception Handling
 @subsection Exception Handling Support
 @cindex exception handling
diff --git a/gcc/gengtype-lex.l b/gcc/gengtype-lex.l
index 34837d9dc9a8f..a7bb44cf2b9ad 100644
--- a/gcc/gengtype-lex.l
+++ b/gcc/gengtype-lex.l
@@ -165,6 +165,9 @@  CXX_KEYWORD inline|public:|private:|protected:|template|operator|friend|static|m
 [(){},*:<>;=%/|+\!\?\.-]	{ return yytext[0]; }
 
    /* ignore pp-directives */
+^{HWS}"#"{HWS}[a-z_]+([^\n]*"\\"\n)+[^\n]*\n   {
+  update_lineno (yytext, yyleng);
+}
 ^{HWS}"#"{HWS}[a-z_]+[^\n]*\n   {lexer_line.line++;}
 
 .				{
diff --git a/gcc/ipa-inline.cc b/gcc/ipa-inline.cc
index dc120e6da5af6..dbc3c7e8fdc88 100644
--- a/gcc/ipa-inline.cc
+++ b/gcc/ipa-inline.cc
@@ -119,6 +119,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "stringpool.h"
 #include "attribs.h"
 #include "asan.h"
+#include "ipa-strub.h"
 
 /* Inliner uses greedy algorithm to inline calls in a priority order.
    Badness is used as the key in a Fibonacci heap which roughly corresponds
@@ -443,6 +444,11 @@  can_inline_edge_p (struct cgraph_edge *e, bool report,
       inlinable = false;
     }
 
+  if (inlinable && !strub_inlinable_to_p (callee, caller))
+    {
+      e->inline_failed = CIF_UNSPECIFIED;
+      inlinable = false;
+    }
   if (!inlinable && report)
     report_inline_failed_reason (e);
   return inlinable;
diff --git a/gcc/ipa-split.cc b/gcc/ipa-split.cc
index 6730f4f9d0e31..1a7285ff5dcf8 100644
--- a/gcc/ipa-split.cc
+++ b/gcc/ipa-split.cc
@@ -104,6 +104,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "ipa-fnsummary.h"
 #include "cfgloop.h"
 #include "attribs.h"
+#include "ipa-strub.h"
 
 /* Per basic block info.  */
 
@@ -1811,6 +1812,12 @@  execute_split_functions (void)
 		 "section.\n");
       return 0;
     }
+  if (!strub_splittable_p (node))
+    {
+      if (dump_file)
+	fprintf (dump_file, "Not splitting: function is a strub context.\n");
+      return 0;
+    }
 
   /* We enforce splitting after loop headers when profile info is not
      available.  */
diff --git a/gcc/ipa-strub.cc b/gcc/ipa-strub.cc
new file mode 100644
index 0000000000000..612ece9f18552
--- /dev/null
+++ b/gcc/ipa-strub.cc
@@ -0,0 +1,3454 @@ 
+/* strub (stack scrubbing) support.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Alexandre Oliva <oliva@adacore.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "tree.h"
+#include "gimple.h"
+#include "gimplify.h"
+#include "tree-pass.h"
+#include "ssa.h"
+#include "gimple-iterator.h"
+#include "gimplify-me.h"
+#include "tree-into-ssa.h"
+#include "tree-ssa.h"
+#include "tree-cfg.h"
+#include "cfghooks.h"
+#include "cfgloop.h"
+#include "cfgcleanup.h"
+#include "tree-eh.h"
+#include "except.h"
+#include "builtins.h"
+#include "attribs.h"
+#include "tree-inline.h"
+#include "cgraph.h"
+#include "alloc-pool.h"
+#include "symbol-summary.h"
+#include "ipa-prop.h"
+#include "ipa-fnsummary.h"
+#include "gimple-fold.h"
+#include "fold-const.h"
+#include "gimple-walk.h"
+#include "tree-dfa.h"
+#include "langhooks.h"
+#include "calls.h"
+#include "vec.h"
+#include "stor-layout.h"
+#include "varasm.h"
+#include "alias.h"
+#include "diagnostic.h"
+#include "intl.h"
+#include "ipa-strub.h"
+#include "symtab-thunks.h"
+#include "attr-fnspec.h"
+
+/* Const and pure functions that gain a watermark parameter for strub purposes
+   are still regarded as such, which may cause the inline expansions of the
+   __strub builtins to malfunction.  Ideally, attribute "fn spec" would enable
+   us to inform the backend about requirements and side effects of the call, but
+   call_fusage building in calls.c:expand_call does not even look at
+   attr_fnspec, so we resort to asm loads and updates to attain an equivalent
+   effect.  Once expand_call gains the ability to issue extra memory uses and
+   clobbers based on pure/const function's fnspec, we can define this to 1.  */
+#define ATTR_FNSPEC_DECONST_WATERMARK 0
+
+enum strub_mode {
+  /* This mode denotes a regular function, that does not require stack
+     scrubbing (strubbing).  It may call any other functions, but if
+     it calls AT_CALLS (or WRAPPED) ones, strubbing logic is
+     automatically introduced around those calls (the latter, by
+     inlining INTERNAL wrappers).  */
+  STRUB_DISABLED = 0,
+
+  /* This denotes a function whose signature is (to be) modified to
+     take an extra parameter, for stack use annotation, and its
+     callers must initialize and pass that argument, and perform the
+     strubbing.  Functions that are explicitly marked with attribute
+     strub must have the mark visible wherever the function is,
+     including aliases, and overriders and overriding methods.
+     Functions that are implicitly marked for strubbing, for accessing
+     variables explicitly marked as such, will only select this
+     strubbing method if they are internal to a translation unit.  It
+     can only be inlined into other strubbing functions, i.e.,
+     STRUB_AT_CALLS or STRUB_WRAPPED.  */
+  STRUB_AT_CALLS = 1,
+
+  /* This denotes a function that is to perform strubbing internally,
+     without any changes to its interface (the function is turned into
+     a strubbing wrapper, and its original body is moved to a separate
+     STRUB_WRAPPED function, with a modified interface).  Functions
+     may be explicitly marked with attribute strub(2), and the
+     attribute must be visible at the point of definition.  Functions
+     that are explicitly marked for strubbing, for accessing variables
+     explicitly marked as such, may select this strubbing mode if
+     their interface cannot change, e.g. because its interface is
+     visible to other translation units, directly, by indirection
+     (having its address taken), inheritance, etc.  Functions that use
+     this method must not have the noclone attribute, nor the noipa
+     one.  Functions marked as always_inline may select this mode, but
+     they are NOT wrapped, they remain unchanged, and are only inlined
+     into strubbed contexts.  Once non-always_inline functions are
+     wrapped, the wrapper becomes STRUB_WRAPPER, and the wrapped becomes
+     STRUB_WRAPPED.  */
+  STRUB_INTERNAL = 2,
+
+  /* This denotes a function whose stack is not strubbed, but that is
+     nevertheless explicitly or implicitly marked as callable from strubbing
+     functions.  Normally, only STRUB_AT_CALLS (and STRUB_INTERNAL ->
+     STRUB_WRAPPED) functions can be called from strubbing contexts (bodies of
+     STRUB_AT_CALLS, STRUB_INTERNAL and STRUB_WRAPPED functions), but attribute
+     strub(3) enables other functions to be (indirectly) called from these
+     contexts.  Some builtins and internal functions may be implicitly marked as
+     STRUB_CALLABLE.  */
+  STRUB_CALLABLE = 3,
+
+  /* This denotes the function that took over the body of a
+     STRUB_INTERNAL function.  At first, it's only called by its
+     wrapper, but the wrapper may be inlined.  The wrapped function,
+     in turn, can only be inlined into other functions whose stack
+     frames are strubbed, i.e., that are STRUB_WRAPPED or
+     STRUB_AT_CALLS.  */
+  STRUB_WRAPPED = -1,
+
+  /* This denotes the wrapper function that replaced the STRUB_INTERNAL
+     function.  This mode overrides the STRUB_INTERNAL mode at the time the
+     internal to-be-wrapped function becomes a wrapper, so that inlining logic
+     can tell one from the other.  */
+  STRUB_WRAPPER = -2,
+
+  /* This denotes an always_inline function that requires strubbing.  It can
+     only be called from, and inlined into, other strubbing contexts.  */
+  STRUB_INLINABLE = -3,
+
+  /* This denotes a function that accesses strub variables, so it would call for
+     internal strubbing (whether or not it's eligible for that), but since
+     at-calls strubbing is viable, that's selected as an optimization.  This
+     mode addresses the inconvenience that such functions may have different
+     modes selected depending on optimization flags, and get a different
+     callable status depending on that choice: if we assigned them
+     STRUB_AT_CALLS mode, they would be callable when optimizing, whereas
+     STRUB_INTERNAL would not be callable.  */
+  STRUB_AT_CALLS_OPT = -4,
+
+};
+
+/* Look up a strub attribute in TYPE, and return it.  */
+
+static tree
+get_strub_attr_from_type (tree type)
+{
+  return lookup_attribute ("strub", TYPE_ATTRIBUTES (type));
+}
+
+/* Look up a strub attribute in DECL or in its type, and return it.  */
+
+static tree
+get_strub_attr_from_decl (tree decl)
+{
+  tree ret = lookup_attribute ("strub", DECL_ATTRIBUTES (decl));
+  if (ret)
+    return ret;
+  return get_strub_attr_from_type (TREE_TYPE (decl));
+}
+
+#define STRUB_ID_COUNT		8
+#define STRUB_IDENT_COUNT	3
+#define STRUB_TYPE_COUNT	5
+
+#define STRUB_ID_BASE		0
+#define STRUB_IDENT_BASE	(STRUB_ID_BASE + STRUB_ID_COUNT)
+#define STRUB_TYPE_BASE		(STRUB_IDENT_BASE + STRUB_IDENT_COUNT)
+#define STRUB_CACHE_SIZE	(STRUB_TYPE_BASE + STRUB_TYPE_COUNT)
+
+/* Keep the strub mode and temp identifiers and types from being GC'd.  */
+static GTY((deletable)) tree strub_cache[STRUB_CACHE_SIZE];
+
+/* Define a function to cache identifier ID, to be used as a strub attribute
+   parameter for a strub mode named after NAME.  */
+#define DEF_STRUB_IDS(IDX, NAME, ID)				\
+static inline tree get_strub_mode_id_ ## NAME () {		\
+  int idx = STRUB_ID_BASE + IDX;				\
+  tree identifier = strub_cache[idx];				\
+  if (!identifier)						\
+    strub_cache[idx] = identifier = get_identifier (ID);	\
+  return identifier;						\
+}
+/* Same as DEF_STRUB_IDS, but use the string expansion of NAME as ID.  */
+#define DEF_STRUB_ID(IDX, NAME)			\
+  DEF_STRUB_IDS (IDX, NAME, #NAME)
+
+/* Define functions for each of the strub mode identifiers.
+   Expose dashes rather than underscores.  */
+DEF_STRUB_ID (0, disabled)
+DEF_STRUB_IDS (1, at_calls, "at-calls")
+DEF_STRUB_ID (2, internal)
+DEF_STRUB_ID (3, callable)
+DEF_STRUB_ID (4, wrapped)
+DEF_STRUB_ID (5, wrapper)
+DEF_STRUB_ID (6, inlinable)
+DEF_STRUB_IDS (7, at_calls_opt, "at-calls-opt")
+
+/* Release the temporary macro names.  */
+#undef DEF_STRUB_IDS
+#undef DEF_STRUB_ID
+
+/* Return the identifier corresponding to strub MODE.  */
+
+static tree
+get_strub_mode_attr_parm (enum strub_mode mode)
+{
+  switch (mode)
+    {
+    case STRUB_DISABLED:
+      return get_strub_mode_id_disabled ();
+
+    case STRUB_AT_CALLS:
+      return get_strub_mode_id_at_calls ();
+
+    case STRUB_INTERNAL:
+      return get_strub_mode_id_internal ();
+
+    case STRUB_CALLABLE:
+      return get_strub_mode_id_callable ();
+
+    case STRUB_WRAPPED:
+      return get_strub_mode_id_wrapped ();
+
+    case STRUB_WRAPPER:
+      return get_strub_mode_id_wrapper ();
+
+    case STRUB_INLINABLE:
+      return get_strub_mode_id_inlinable ();
+
+    case STRUB_AT_CALLS_OPT:
+      return get_strub_mode_id_at_calls_opt ();
+
+    default:
+      gcc_unreachable ();
+    }
+}
+
+/* Return the parmeters (TREE_VALUE) for a strub attribute of MODE.
+   We know we use a single parameter, so we bypass the creation of a
+   tree list.  */
+
+static tree
+get_strub_mode_attr_value (enum strub_mode mode)
+{
+  return get_strub_mode_attr_parm (mode);
+}
+
+/* Determine whether ID is a well-formed strub mode-specifying attribute
+   parameter for a function (type).  Only user-visible modes are accepted, and
+   ID must be non-NULL.
+
+   For unacceptable parms, return 0, otherwise a nonzero value as below.
+
+   If the parm enables strub, return positive, otherwise negative.
+
+   If the affected type must be a distinct, incompatible type,return an integer
+   of absolute value 2, otherwise 1.  */
+
+int
+strub_validate_fn_attr_parm (tree id)
+{
+  int ret;
+  const char *s = NULL;
+  size_t len = 0;
+
+  /* do NOT test for NULL.  This is only to be called with non-NULL arguments.
+     We assume that the strub parameter applies to a function, because only
+     functions accept an explicit argument.  If we accepted NULL, and we
+     happened to be called to verify the argument for a variable, our return
+     values would be wrong.  */
+  if (TREE_CODE (id) == STRING_CST)
+    {
+      s = TREE_STRING_POINTER (id);
+      len = TREE_STRING_LENGTH (id) - 1;
+    }
+  else if (TREE_CODE (id) == IDENTIFIER_NODE)
+    {
+      s = IDENTIFIER_POINTER (id);
+      len = IDENTIFIER_LENGTH (id);
+    }
+  else
+    return 0;
+
+  enum strub_mode mode;
+
+  if (len != 8)
+    return 0;
+
+  switch (s[0])
+    {
+    case 'd':
+      mode = STRUB_DISABLED;
+      ret = -1;
+      break;
+
+    case 'a':
+      mode = STRUB_AT_CALLS;
+      ret = 2;
+      break;
+
+    case 'i':
+      mode = STRUB_INTERNAL;
+      ret = 1;
+      break;
+
+    case 'c':
+      mode = STRUB_CALLABLE;
+      ret = -2;
+      break;
+
+    default:
+      /* Other parms are for internal use only.  */
+      return 0;
+    }
+
+  tree mode_id = get_strub_mode_attr_parm (mode);
+
+  if (TREE_CODE (id) == IDENTIFIER_NODE
+      ? id != mode_id
+      : strncmp (s, IDENTIFIER_POINTER (mode_id), len) != 0)
+    return 0;
+
+  return ret;
+}
+
+/* Return the strub mode from STRUB_ATTR.  VAR_P should be TRUE if the attribute
+   is taken from a variable, rather than from a function, or a type thereof.  */
+
+static enum strub_mode
+get_strub_mode_from_attr (tree strub_attr, bool var_p = false)
+{
+  enum strub_mode mode = STRUB_DISABLED;
+
+  if (strub_attr)
+    {
+      if (!TREE_VALUE (strub_attr))
+	mode = !var_p ? STRUB_AT_CALLS : STRUB_INTERNAL;
+      else
+	{
+	  gcc_checking_assert (!var_p);
+	  tree id = TREE_VALUE (strub_attr);
+	  if (TREE_CODE (id) == TREE_LIST)
+	    id = TREE_VALUE (id);
+	  const char *s = (TREE_CODE (id) == STRING_CST
+			   ? TREE_STRING_POINTER (id)
+			   : IDENTIFIER_POINTER (id));
+	  size_t len = (TREE_CODE (id) == STRING_CST
+			? TREE_STRING_LENGTH (id) - 1
+			: IDENTIFIER_LENGTH (id));
+
+	  switch (len)
+	    {
+	    case 7:
+	      switch (s[6])
+		{
+		case 'r':
+		  mode = STRUB_WRAPPER;
+		  break;
+
+		case 'd':
+		  mode = STRUB_WRAPPED;
+		  break;
+
+		default:
+		  gcc_unreachable ();
+		}
+	      break;
+
+	    case 8:
+	      switch (s[0])
+		{
+		case 'd':
+		  mode = STRUB_DISABLED;
+		  break;
+
+		case 'a':
+		  mode = STRUB_AT_CALLS;
+		  break;
+
+		case 'i':
+		  mode = STRUB_INTERNAL;
+		  break;
+
+		case 'c':
+		  mode = STRUB_CALLABLE;
+		  break;
+
+		default:
+		  gcc_unreachable ();
+		}
+	      break;
+
+	    case 9:
+	      mode = STRUB_INLINABLE;
+	      break;
+
+	    case 12:
+	      mode = STRUB_AT_CALLS_OPT;
+	      break;
+
+	    default:
+	      gcc_unreachable ();
+	    }
+
+	  gcc_checking_assert (TREE_CODE (id) == IDENTIFIER_NODE
+			       ? id == get_strub_mode_attr_parm (mode)
+			       : strncmp (IDENTIFIER_POINTER
+					  (get_strub_mode_attr_parm (mode)),
+					  s, len) == 0);
+	}
+    }
+
+  return mode;
+}
+
+/* Look up, decode and return the strub mode associated with FNDECL.  */
+
+static enum strub_mode
+get_strub_mode_from_fndecl (tree fndecl)
+{
+  return get_strub_mode_from_attr (get_strub_attr_from_decl (fndecl));
+}
+
+/* Look up, decode and return the strub mode associated with NODE.  */
+
+static enum strub_mode
+get_strub_mode (cgraph_node *node)
+{
+  return get_strub_mode_from_fndecl (node->decl);
+}
+
+/* Look up, decode and return the strub mode associated with TYPE.  */
+
+static enum strub_mode
+get_strub_mode_from_type (tree type)
+{
+  bool var_p = !FUNC_OR_METHOD_TYPE_P (type);
+  tree attr = get_strub_attr_from_type (type);
+
+  if (attr)
+    return get_strub_mode_from_attr (attr, var_p);
+
+  if (flag_strub >= -1 && !var_p)
+    return STRUB_CALLABLE;
+
+  return STRUB_DISABLED;
+}
+
+
+/* Return TRUE iff NODE calls builtin va_start.  */
+
+static bool
+calls_builtin_va_start_p (cgraph_node *node)
+{
+  bool result = false;
+
+  for (cgraph_edge *e = node->callees; e; e = e->next_callee)
+    {
+      tree cdecl = e->callee->decl;
+      if (fndecl_built_in_p (cdecl, BUILT_IN_VA_START))
+	return true;
+    }
+
+  return result;
+}
+
+/* Return TRUE iff NODE calls builtin apply_args, and optionally REPORT it.  */
+
+static bool
+calls_builtin_apply_args_p (cgraph_node *node, bool report = false)
+{
+  bool result = false;
+
+  for (cgraph_edge *e = node->callees; e; e = e->next_callee)
+    {
+      tree cdecl = e->callee->decl;
+      if (!fndecl_built_in_p (cdecl, BUILT_IN_APPLY_ARGS))
+	continue;
+
+      result = true;
+
+      if (!report)
+	break;
+
+      sorry_at (e->call_stmt
+		? gimple_location (e->call_stmt)
+		: DECL_SOURCE_LOCATION (node->decl),
+		"at-calls %<strub%> does not support call to %qD",
+		cdecl);
+    }
+
+  return result;
+}
+
+/* Return TRUE iff NODE carries the always_inline attribute.  */
+
+static inline bool
+strub_always_inline_p (cgraph_node *node)
+{
+  return lookup_attribute ("always_inline", DECL_ATTRIBUTES (node->decl));
+}
+
+/* Return TRUE iff NODE is potentially eligible for any strub-enabled mode, and
+   optionally REPORT the reasons for ineligibility.  */
+
+static inline bool
+can_strub_p (cgraph_node *node, bool report = false)
+{
+  bool result = true;
+
+  if (!report && strub_always_inline_p (node))
+    return result;
+
+  if (lookup_attribute ("noipa", DECL_ATTRIBUTES (node->decl)))
+    {
+      result = false;
+
+      if (!report)
+	return result;
+
+      sorry_at (DECL_SOURCE_LOCATION (node->decl),
+		"%qD is not eligible for %<strub%>"
+		" because of attribute %<noipa%>",
+		node->decl);
+    }
+
+  /* We can't, and don't want to vectorize the watermark and other
+     strub-introduced parms.  */
+  if (lookup_attribute ("simd", DECL_ATTRIBUTES (node->decl)))
+    {
+      result = false;
+
+      if (!report)
+	return result;
+
+      sorry_at (DECL_SOURCE_LOCATION (node->decl),
+		"%qD is not eligible for %<strub%>"
+		" because of attribute %<simd%>",
+		node->decl);
+    }
+
+  return result;
+}
+
+/* Return TRUE iff NODE is eligible for at-calls strub, and optionally REPORT
+   the reasons for ineligibility.  Besides general non-eligibility for
+   strub-enabled modes, at-calls rules out calling builtin apply_args.  */
+
+static bool
+can_strub_at_calls_p (cgraph_node *node, bool report = false)
+{
+  bool result = !report || can_strub_p (node, report);
+
+  if (!result && !report)
+    return result;
+
+  return !calls_builtin_apply_args_p (node, report);
+}
+
+/* Return TRUE iff the called function (pointer or, if available,
+   decl) undergoes a significant type conversion for the call.  Strub
+   mode changes between function types, and other non-useless type
+   conversions, are regarded as significant.  When the function type
+   is overridden, the effective strub mode for the call is that of the
+   call fntype, rather than that of the pointer or of the decl.
+   Functions called with type overrides cannot undergo type changes;
+   it's as if their address was taken, so they're considered
+   non-viable for implicit at-calls strub mode.  */
+
+static inline bool
+strub_call_fntype_override_p (const gcall *gs)
+{
+  if (gimple_call_internal_p (gs))
+    return false;
+  tree fn_type = TREE_TYPE (TREE_TYPE (gimple_call_fn (gs)));
+  if (tree decl = gimple_call_fndecl (gs))
+    fn_type = TREE_TYPE (decl);
+
+  /* We do NOT want to take the mode from the decl here.  This
+     function is used to tell whether we can change the strub mode of
+     a function, and whether the effective mode for the call is to be
+     taken from the decl or from an overrider type.  When the strub
+     mode is explicitly declared, or overridden with a type cast, the
+     difference will be noticed in function types.  However, if the
+     strub mode is implicit due to e.g. strub variables or -fstrub=*
+     command-line flags, we will adjust call types along with function
+     types.  In either case, the presence of type or strub mode
+     overriders in calls will prevent a function from having its strub
+     modes changed in ways that would imply type changes, but taking
+     strub modes from decls would defeat this, since we set strub
+     modes and then call this function to tell whether the original
+     type was overridden to decide whether to adjust the call.  We
+     need the answer to be about the type, not the decl.  */
+  enum strub_mode mode = get_strub_mode_from_type (fn_type);
+  return (get_strub_mode_from_type (gs->u.fntype) != mode
+	  || !useless_type_conversion_p (gs->u.fntype, fn_type));
+}
+
+/* Return TRUE iff NODE is called directly with a type override.  */
+
+static bool
+called_directly_with_type_override_p (cgraph_node *node, void *)
+{
+  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
+    if (e->call_stmt && strub_call_fntype_override_p (e->call_stmt))
+      return true;
+
+  return false;
+}
+
+/* Return TRUE iff NODE or any other nodes aliased to it are called
+   with type overrides.  We can't safely change the type of such
+   functions.  */
+
+static bool
+called_with_type_override_p (cgraph_node *node)
+{
+  return (node->call_for_symbol_thunks_and_aliases
+	  (called_directly_with_type_override_p, NULL, true, true));
+}
+
+/* Symbolic macro for the max number of arguments that internal strub may add to
+   a function.  */
+
+#define STRUB_INTERNAL_MAX_EXTRA_ARGS 3
+
+/* We can't perform internal strubbing if the function body involves certain
+   features:
+
+   - a non-default __builtin_va_start (e.g. x86's __builtin_ms_va_start) is
+   currently unsupported because we can't discover the corresponding va_copy and
+   va_end decls in the wrapper, and we don't convey the alternate variable
+   arguments ABI to the modified wrapped function.  The default
+   __builtin_va_start is supported by calling va_start/va_end at the wrapper,
+   that takes variable arguments, passing a pointer to the va_list object to the
+   wrapped function, that runs va_copy from it where the original function ran
+   va_start.
+
+   __builtin_next_arg is currently unsupported because the wrapped function
+   won't be a variable argument function.  We could process it in the wrapper,
+   that remains a variable argument function, and replace calls in the wrapped
+   body, but we currently don't.
+
+   __builtin_return_address is rejected because it's generally used when the
+   actual caller matters, and introducing a wrapper breaks such uses as those in
+   the unwinder.  */
+
+static bool
+can_strub_internally_p (cgraph_node *node, bool report = false)
+{
+  bool result = !report || can_strub_p (node, report);
+
+  if (!result && !report)
+    return result;
+
+  if (!report && strub_always_inline_p (node))
+    return result;
+
+  /* Since we're not changing the function identity proper, just
+     moving its full implementation, we *could* disable
+     fun->cannot_be_copied_reason and/or temporarily drop a noclone
+     attribute, but we'd have to prevent remapping of the labels.  */
+  if (lookup_attribute ("noclone", DECL_ATTRIBUTES (node->decl)))
+    {
+      result = false;
+
+      if (!report)
+	return result;
+
+      sorry_at (DECL_SOURCE_LOCATION (node->decl),
+		"%qD is not eligible for internal %<strub%>"
+		" because of attribute %<noclone%>",
+		node->decl);
+    }
+
+  if (node->has_gimple_body_p ())
+    {
+      for (cgraph_edge *e = node->callees; e; e = e->next_callee)
+	{
+	  tree cdecl = e->callee->decl;
+	  if (!((fndecl_built_in_p (cdecl, BUILT_IN_VA_START)
+		 && cdecl != builtin_decl_explicit (BUILT_IN_VA_START))
+		|| fndecl_built_in_p (cdecl, BUILT_IN_NEXT_ARG)
+		|| fndecl_built_in_p (cdecl, BUILT_IN_RETURN_ADDRESS)))
+	    continue;
+
+	  result = false;
+
+	  if (!report)
+	    return result;
+
+	  sorry_at (e->call_stmt
+		    ? gimple_location (e->call_stmt)
+		    : DECL_SOURCE_LOCATION (node->decl),
+		    "%qD is not eligible for internal %<strub%> "
+		    "because it calls %qD",
+		    node->decl, cdecl);
+	}
+
+      struct function *fun = DECL_STRUCT_FUNCTION (node->decl);
+      if (fun->has_nonlocal_label)
+	{
+	  result = false;
+
+	  if (!report)
+	    return result;
+
+	  sorry_at (DECL_SOURCE_LOCATION (node->decl),
+		    "%qD is not eligible for internal %<strub%> "
+		    "because it contains a non-local goto target",
+		    node->decl);
+	}
+
+      if (fun->has_forced_label_in_static)
+	{
+	  result = false;
+
+	  if (!report)
+	    return result;
+
+	  sorry_at (DECL_SOURCE_LOCATION (node->decl),
+		    "%qD is not eligible for internal %<strub%> "
+		    "because the address of a local label escapes",
+		    node->decl);
+	}
+
+      /* Catch any other case that would prevent versioning/cloning
+	 so as to also have it covered above.  */
+      gcc_checking_assert (!result /* || !node->has_gimple_body_p () */
+			   || tree_versionable_function_p (node->decl));
+
+
+      /* Label values references are not preserved when copying.  If referenced
+	 in nested functions, as in 920415-1.c and 920721-4.c their decls get
+	 remapped independently.  The exclusion below might be too broad, in
+	 that we might be able to support correctly cases in which the labels
+	 are only used internally in a function, but disconnecting forced labels
+	 from their original declarations is undesirable in general.  */
+      basic_block bb;
+      FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (node->decl))
+	for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	     !gsi_end_p (gsi); gsi_next (&gsi))
+	  {
+	    glabel *label_stmt = dyn_cast <glabel *> (gsi_stmt (gsi));
+	    tree target;
+
+	    if (!label_stmt)
+	      break;
+
+	    target = gimple_label_label (label_stmt);
+
+	    if (!FORCED_LABEL (target))
+	      continue;
+
+	    result = false;
+
+	    if (!report)
+	      return result;
+
+	    sorry_at (gimple_location (label_stmt),
+		      "internal %<strub%> does not support forced labels");
+	  }
+    }
+
+  if (list_length (TYPE_ARG_TYPES (TREE_TYPE (node->decl)))
+      >= (((HOST_WIDE_INT) 1 << IPA_PARAM_MAX_INDEX_BITS)
+	  - STRUB_INTERNAL_MAX_EXTRA_ARGS))
+    {
+      result = false;
+
+      if (!report)
+	return result;
+
+      sorry_at (DECL_SOURCE_LOCATION (node->decl),
+		"%qD has too many arguments for internal %<strub%>",
+		node->decl);
+    }
+
+  return result;
+}
+
+/* Return TRUE iff NODE has any strub-requiring local variable, or accesses (as
+   in reading) any variable through a strub-requiring type.  */
+
+static bool
+strub_from_body_p (cgraph_node *node)
+{
+  if (!node->has_gimple_body_p ())
+    return false;
+
+  /* If any local variable is marked for strub...  */
+  unsigned i;
+  tree var;
+  FOR_EACH_LOCAL_DECL (DECL_STRUCT_FUNCTION (node->decl),
+		       i, var)
+    if (get_strub_mode_from_type (TREE_TYPE (var))
+	!= STRUB_DISABLED)
+      return true;
+
+  /* Now scan the body for loads with strub-requiring types.
+     ??? Compound types don't propagate the strub requirement to
+     component types.  */
+  basic_block bb;
+  FOR_EACH_BB_FN (bb, DECL_STRUCT_FUNCTION (node->decl))
+    for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+	 !gsi_end_p (gsi); gsi_next (&gsi))
+      {
+	gimple *stmt = gsi_stmt (gsi);
+
+	if (!gimple_assign_load_p (stmt))
+	  continue;
+
+	tree rhs = gimple_assign_rhs1 (stmt);
+	if (get_strub_mode_from_type (TREE_TYPE (rhs))
+	    != STRUB_DISABLED)
+	  return true;
+      }
+
+  return false;
+}
+
+/* Return TRUE iff node is associated with a builtin that should be callable
+   from strub contexts.  */
+
+static inline bool
+strub_callable_builtin_p (cgraph_node *node)
+{
+  if (DECL_BUILT_IN_CLASS (node->decl) != BUILT_IN_NORMAL)
+    return false;
+
+  enum built_in_function fcode = DECL_FUNCTION_CODE (node->decl);
+
+  switch (fcode)
+    {
+    case BUILT_IN_NONE:
+      gcc_unreachable ();
+
+      /* This temporarily allocates stack for the call, and we can't reasonably
+	 update the watermark for that.  Besides, we don't check the actual call
+	 target, nor its signature, and it seems to be overkill to as much as
+	 try to do so.  */
+    case BUILT_IN_APPLY:
+      return false;
+
+      /* Conversely, this shouldn't be called from within strub contexts, since
+	 the caller may have had its signature modified.  STRUB_INTERNAL is ok,
+	 the call will remain in the STRUB_WRAPPER, and removed from the
+	 STRUB_WRAPPED clone.  */
+    case BUILT_IN_APPLY_ARGS:
+      return false;
+
+      /* ??? Make all other builtins callable.  We wish to make any builtin call
+	 the compiler might introduce on its own callable.  Anything that is
+	 predictable enough as to be known not to allow stack data that should
+	 be strubbed to unintentionally escape to non-strub contexts can be
+	 allowed, and pretty much every builtin appears to fit this description.
+	 The exceptions to this rule seem to be rare, and only available as
+	 explicit __builtin calls, so let's keep it simple and allow all of
+	 them...  */
+    default:
+      return true;
+    }
+}
+
+/* Compute the strub mode to be used for NODE.  STRUB_ATTR should be the strub
+   attribute,found for NODE, if any.  */
+
+static enum strub_mode
+compute_strub_mode (cgraph_node *node, tree strub_attr)
+{
+  enum strub_mode req_mode = get_strub_mode_from_attr (strub_attr);
+
+  gcc_checking_assert (flag_strub >= -2 && flag_strub <= 3);
+
+  /* Symbolic encodings of the -fstrub-* flags.  */
+  /* Enable strub when explicitly requested through attributes to functions or
+     variables, reporting errors if the requests cannot be satisfied.  */
+  const bool strub_flag_auto = flag_strub < 0;
+  /* strub_flag_auto with strub call verification; without this, functions are
+     implicitly callable.  */
+  const bool strub_flag_strict = flag_strub < -1;
+  /* Disable strub altogether, ignore attributes entirely.  */
+  const bool strub_flag_disabled = flag_strub == 0;
+  /* On top of _auto, also enable strub implicitly for functions that can
+     safely undergo at-calls strubbing.  Internal mode will still be used in
+     functions that request it explicitly with attribute strub(2), or when the
+     function body requires strubbing and at-calls strubbing is not viable.  */
+  const bool strub_flag_at_calls = flag_strub == 1;
+  /* On top of default, also enable strub implicitly for functions that can
+     safely undergo internal strubbing.  At-calls mode will still be used in
+     functions that requiest it explicitly with attribute strub() or strub(1),
+     or when the function body requires strubbing and internal strubbing is not
+     viable.  */
+  const bool strub_flag_internal = flag_strub == 2;
+  /* On top of default, also enable strub implicitly for functions that can
+     safely undergo strubbing in either mode.  When both modes are viable,
+     at-calls is preferred.  */
+  const bool strub_flag_either = flag_strub == 3;
+  /* Besides the default behavior, enable strub implicitly for all viable
+     functions.  */
+  const bool strub_flag_viable = flag_strub > 0;
+
+  /* The consider_* variables should be TRUE if selecting the corresponding
+     strub modes would be consistent with requests from attributes and command
+     line flags.  Attributes associated with functions pretty much mandate a
+     selection, and should report an error if not satisfied; strub_flag_auto
+     implicitly enables some viable strub mode if that's required by references
+     to variables marked for strub; strub_flag_viable enables strub if viable
+     (even when favoring one mode, body-requested strub can still be satisfied
+     by either mode), and falls back to callable, silently unless variables
+     require strubbing.  */
+
+  const bool consider_at_calls
+    = (!strub_flag_disabled
+       && (strub_attr
+	   ? req_mode == STRUB_AT_CALLS
+	   : true));
+  const bool consider_internal
+    = (!strub_flag_disabled
+       && (strub_attr
+	   ? req_mode == STRUB_INTERNAL
+	   : true));
+
+  const bool consider_callable
+    = (!strub_flag_disabled
+       && (strub_attr
+	   ? req_mode == STRUB_CALLABLE
+	   : (!strub_flag_strict
+	      || strub_callable_builtin_p (node))));
+
+  /* This is a shorthand for either strub-enabled mode.  */
+  const bool consider_strub
+    = (consider_at_calls || consider_internal);
+
+  /* We can cope with always_inline functions even with noipa and noclone,
+     because we just leave them alone.  */
+  const bool is_always_inline
+    = strub_always_inline_p (node);
+
+  /* Strubbing in general, and each specific strub mode, may have its own set of
+     requirements.  We require noipa for strubbing, either because of cloning
+     required for internal strub, or because of caller enumeration required for
+     at-calls strub.  We don't consider the at-calls mode eligible if it's not
+     even considered, it has no further requirements.  Internal mode requires
+     cloning and the absence of certain features in the body and, like at-calls,
+     it's not eligible if it's not even under consideration.
+
+     ??? Do we need target hooks for further constraints?  E.g., x86's
+     "interrupt" attribute breaks internal strubbing because the wrapped clone
+     carries the attribute and thus isn't callable; in this case, we could use a
+     target hook to adjust the clone instead.  */
+  const bool strub_eligible
+    = (consider_strub
+       && (is_always_inline || can_strub_p (node)));
+  const bool at_calls_eligible
+    = (consider_at_calls && strub_eligible
+       && can_strub_at_calls_p (node));
+  const bool internal_eligible
+    = (consider_internal && strub_eligible
+       && (is_always_inline
+	   || can_strub_internally_p (node)));
+
+  /* In addition to the strict eligibility requirements, some additional
+     constraints are placed on implicit selection of certain modes.  These do
+     not prevent the selection of a mode if explicitly specified as part of a
+     function interface (the strub attribute), but they may prevent modes from
+     being selected by the command line or by function bodies.  The only actual
+     constraint is on at-calls mode: since we change the function's exposed
+     signature, we won't do it implicitly if the function can possibly be used
+     in ways that do not expect the signature change, e.g., if the function is
+     available to or interposable by other units, if its address is taken,
+     etc.  */
+  const bool at_calls_viable
+    = (at_calls_eligible
+       && (strub_attr
+	   || (node->has_gimple_body_p ()
+	       && (!node->externally_visible
+		   || (node->binds_to_current_def_p ()
+		       && node->can_be_local_p ()))
+	       && node->only_called_directly_p ()
+	       && !called_with_type_override_p (node))));
+  const bool internal_viable
+    = (internal_eligible);
+
+  /* Shorthand.  */
+  const bool strub_viable
+    = (at_calls_viable || internal_viable);
+
+  /* We wish to analyze the body, to look for implicit requests for strub, both
+     to implicitly enable it when the body calls for it, and to report errors if
+     the body calls for it but neither mode is viable (even if that follows from
+     non-eligibility because of the explicit specification of some non-strubbing
+     mode).  We can refrain from scanning the body only in rare circumstances:
+     when strub is enabled by a function attribute (scanning might be redundant
+     in telling us to also enable it), and when we are enabling strub implicitly
+     but there are non-viable modes: we want to know whether strubbing is
+     required, to fallback to another mode, even if we're only enabling a
+     certain mode, or, when either mode would do, to report an error if neither
+     happens to be viable.  */
+  const bool analyze_body
+    = (strub_attr
+       ? !consider_strub
+       : (strub_flag_auto
+	  || (strub_flag_viable && (!at_calls_viable && !internal_viable))
+	  || (strub_flag_either && !strub_viable)));
+
+  /* Cases in which strubbing is enabled or disabled by strub_flag_auto.
+     Unsatisfiable requests ought to be reported.  */
+  const bool strub_required
+    = ((strub_attr && consider_strub)
+       || (analyze_body && strub_from_body_p (node)));
+
+  /* Besides the required cases, we want to abide by the requests to enabling on
+     an if-viable basis.  */
+  const bool strub_enable
+    = (strub_required
+       || (strub_flag_at_calls && at_calls_viable)
+       || (strub_flag_internal && internal_viable)
+       || (strub_flag_either && strub_viable));
+
+  /* And now we're finally ready to select a mode that abides by the viability
+     and eligibility constraints, and that satisfies the strubbing requirements
+     and requests, subject to the constraints.  If both modes are viable and
+     strub is to be enabled, pick STRUB_AT_CALLS unless STRUB_INTERNAL was named
+     as preferred.  */
+  const enum strub_mode mode
+    = ((strub_enable && is_always_inline)
+       ? (strub_required ? STRUB_INLINABLE : STRUB_CALLABLE)
+       : (strub_enable && internal_viable
+	  && (strub_flag_internal || !at_calls_viable))
+       ? STRUB_INTERNAL
+       : (strub_enable && at_calls_viable)
+       ? (strub_required && !strub_attr
+	  ? STRUB_AT_CALLS_OPT
+	  : STRUB_AT_CALLS)
+       : consider_callable
+       ? STRUB_CALLABLE
+       : STRUB_DISABLED);
+
+  switch (mode)
+    {
+    case STRUB_CALLABLE:
+      if (is_always_inline)
+	break;
+      /* Fall through.  */
+
+    case STRUB_DISABLED:
+      if (strub_enable && !strub_attr)
+	{
+	  gcc_checking_assert (analyze_body);
+	  error_at (DECL_SOURCE_LOCATION (node->decl),
+		    "%qD requires %<strub%>,"
+		    " but no viable %<strub%> mode was found",
+		    node->decl);
+	  break;
+	}
+      /* Fall through.  */
+
+    case STRUB_AT_CALLS:
+    case STRUB_INTERNAL:
+    case STRUB_INLINABLE:
+      /* Differences from an mode requested through a function attribute are
+	 reported in set_strub_mode_to.  */
+      break;
+
+    case STRUB_AT_CALLS_OPT:
+      /* Functions that select this mode do so because of references to strub
+	 variables.  Even if we choose at-calls as an optimization, the
+	 requirements for internal strub must still be satisfied.  Optimization
+	 options may render implicit at-calls strub not viable (-O0 sets
+	 force_output for static non-inline functions), and it would not be good
+	 if changing optimization options turned a well-formed into an
+	 ill-formed one.  */
+      if (!internal_viable)
+	can_strub_internally_p (node, true);
+      break;
+
+    case STRUB_WRAPPED:
+    case STRUB_WRAPPER:
+    default:
+      gcc_unreachable ();
+    }
+
+  return mode;
+}
+
+/* Set FNDT's strub mode to MODE; FNDT may be a function decl or
+   function type.  If OVERRIDE, do not check whether a mode is already
+   set.  */
+
+static void
+strub_set_fndt_mode_to (tree fndt, enum strub_mode mode, bool override)
+{
+  gcc_checking_assert (override
+		       || !(DECL_P (fndt)
+			    ? get_strub_attr_from_decl (fndt)
+			    : get_strub_attr_from_type (fndt)));
+
+  tree attr = tree_cons (get_identifier ("strub"),
+			 get_strub_mode_attr_value (mode),
+			 NULL_TREE);
+  tree *attrp = NULL;
+  if (DECL_P (fndt))
+    {
+      gcc_checking_assert (FUNC_OR_METHOD_TYPE_P (TREE_TYPE (fndt)));
+      attrp = &DECL_ATTRIBUTES (fndt);
+    }
+  else if (FUNC_OR_METHOD_TYPE_P (fndt))
+    attrp = &TYPE_ATTRIBUTES (fndt);
+  else
+    gcc_unreachable ();
+
+  TREE_CHAIN (attr) = *attrp;
+  *attrp = attr;
+}
+
+/* Set FNDT's strub mode to callable.
+   FNDT may be a function decl or a function type.  */
+
+void
+strub_make_callable (tree fndt)
+{
+  strub_set_fndt_mode_to (fndt, STRUB_CALLABLE, false);
+}
+
+/* Set NODE to strub MODE.  Report incompatibilities between MODE and the mode
+   requested through explicit attributes, and cases of non-eligibility.  */
+
+static void
+set_strub_mode_to (cgraph_node *node, enum strub_mode mode)
+{
+  tree attr = get_strub_attr_from_decl (node->decl);
+  enum strub_mode req_mode = get_strub_mode_from_attr (attr);
+
+  if (attr)
+    {
+      /* Check for and report incompatible mode changes.  */
+      if (mode != req_mode
+	  && !(req_mode == STRUB_INTERNAL
+	       && (mode == STRUB_WRAPPED
+		   || mode == STRUB_WRAPPER))
+	  && !((req_mode == STRUB_INTERNAL
+		|| req_mode == STRUB_AT_CALLS
+		|| req_mode == STRUB_CALLABLE)
+	       && mode == STRUB_INLINABLE))
+	{
+	  error_at (DECL_SOURCE_LOCATION (node->decl),
+		    "%<strub%> mode %qE selected for %qD, when %qE was requested",
+		    get_strub_mode_attr_parm (mode),
+		    node->decl,
+		    get_strub_mode_attr_parm (req_mode));
+	  if (node->alias)
+	    {
+	      cgraph_node *target = node->ultimate_alias_target ();
+	      if (target != node)
+		error_at (DECL_SOURCE_LOCATION (target->decl),
+			  "the incompatible selection was determined"
+			  " by ultimate alias target %qD",
+			  target->decl);
+	    }
+
+	  /* Report any incompatibilities with explicitly-requested strub.  */
+	  switch (req_mode)
+	    {
+	    case STRUB_AT_CALLS:
+	      can_strub_at_calls_p (node, true);
+	      break;
+
+	    case STRUB_INTERNAL:
+	      can_strub_internally_p (node, true);
+	      break;
+
+	    default:
+	      break;
+	    }
+	}
+
+      /* Drop any incompatible strub attributes leading the decl attribute
+	 chain.  Return if we find one with the mode we need.  */
+      for (;;)
+	{
+	  if (mode == req_mode)
+	    return;
+
+	  if (DECL_ATTRIBUTES (node->decl) != attr)
+	    break;
+
+	  DECL_ATTRIBUTES (node->decl) = TREE_CHAIN (attr);
+	  attr = get_strub_attr_from_decl (node->decl);
+	  if (!attr)
+	    break;
+
+	  req_mode = get_strub_mode_from_attr (attr);
+	}
+    }
+  else if (mode == req_mode)
+    return;
+
+  strub_set_fndt_mode_to (node->decl, mode, attr);
+}
+
+/* Compute and set NODE's strub mode.  */
+
+static void
+set_strub_mode (cgraph_node *node)
+{
+  tree attr = get_strub_attr_from_decl (node->decl);
+
+  if (attr)
+    switch (get_strub_mode_from_attr (attr))
+      {
+	/* These can't have been requested through user attributes, so we must
+	   have already gone through them.  */
+      case STRUB_WRAPPER:
+      case STRUB_WRAPPED:
+      case STRUB_INLINABLE:
+      case STRUB_AT_CALLS_OPT:
+	return;
+
+      case STRUB_DISABLED:
+      case STRUB_AT_CALLS:
+      case STRUB_INTERNAL:
+      case STRUB_CALLABLE:
+	break;
+
+      default:
+	gcc_unreachable ();
+      }
+
+  cgraph_node *xnode = node;
+  if (node->alias)
+    xnode = node->ultimate_alias_target ();
+  /* Weakrefs may remain unresolved (the above will return node) if
+     their targets are not defined, so make sure we compute a strub
+     mode for them, instead of defaulting to STRUB_DISABLED and
+     rendering them uncallable.  */
+  enum strub_mode mode = (xnode != node && !xnode->alias
+			  ? get_strub_mode (xnode)
+			  : compute_strub_mode (node, attr));
+
+  set_strub_mode_to (node, mode);
+}
+
+
+/* Non-strub functions shouldn't be called from within strub contexts,
+   except through callable ones.  Always inline strub functions can
+   only be called from strub functions.  */
+
+static bool
+strub_callable_from_p (strub_mode caller_mode, strub_mode callee_mode)
+{
+  switch (caller_mode)
+    {
+    case STRUB_WRAPPED:
+    case STRUB_AT_CALLS_OPT:
+    case STRUB_AT_CALLS:
+    case STRUB_INTERNAL:
+    case STRUB_INLINABLE:
+      break;
+
+    case STRUB_WRAPPER:
+    case STRUB_DISABLED:
+    case STRUB_CALLABLE:
+      return callee_mode != STRUB_INLINABLE;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  switch (callee_mode)
+    {
+    case STRUB_WRAPPED:
+    case STRUB_AT_CALLS:
+    case STRUB_INLINABLE:
+      break;
+
+    case STRUB_AT_CALLS_OPT:
+    case STRUB_INTERNAL:
+    case STRUB_WRAPPER:
+      return (flag_strub >= -1);
+
+    case STRUB_DISABLED:
+      return false;
+
+    case STRUB_CALLABLE:
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return true;
+}
+
+/* Return TRUE iff CALLEE can be inlined into CALLER.  We wish to avoid inlining
+   WRAPPED functions back into their WRAPPERs.  More generally, we wish to avoid
+   inlining strubbed functions into non-strubbed ones.  CALLER doesn't have to
+   be an immediate caller of CALLEE: the immediate caller may have already been
+   cloned for inlining, and then CALLER may be further up the original call
+   chain.  ???  It would be nice if our own caller would retry inlining callee
+   if caller gets inlined.  */
+
+bool
+strub_inlinable_to_p (cgraph_node *callee, cgraph_node *caller)
+{
+  strub_mode callee_mode = get_strub_mode (callee);
+
+  switch (callee_mode)
+    {
+    case STRUB_WRAPPED:
+    case STRUB_AT_CALLS:
+    case STRUB_INTERNAL:
+    case STRUB_INLINABLE:
+    case STRUB_AT_CALLS_OPT:
+      break;
+
+    case STRUB_WRAPPER:
+    case STRUB_DISABLED:
+    case STRUB_CALLABLE:
+      /* When we consider inlining, we've already verified callability, so we
+	 can even inline callable and then disabled into a strub context.  That
+	 will get strubbed along with the context, so it's hopefully not a
+	 problem.  */
+      return true;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  strub_mode caller_mode = get_strub_mode (caller);
+
+  switch (caller_mode)
+    {
+    case STRUB_WRAPPED:
+    case STRUB_AT_CALLS:
+    case STRUB_INTERNAL:
+    case STRUB_INLINABLE:
+    case STRUB_AT_CALLS_OPT:
+      return true;
+
+    case STRUB_WRAPPER:
+    case STRUB_DISABLED:
+    case STRUB_CALLABLE:
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return false;
+}
+
+/* Check that types T1 and T2 are strub-compatible.  Return 1 if the strub modes
+   are the same, 2 if they are interchangeable, and 0 otherwise.  */
+
+int
+strub_comptypes (tree t1, tree t2)
+{
+  if (TREE_CODE (t1) != TREE_CODE (t2))
+    return 0;
+
+  enum strub_mode m1 = get_strub_mode_from_type (t1);
+  enum strub_mode m2 = get_strub_mode_from_type (t2);
+
+  if (m1 == m2)
+    return 1;
+
+  /* We're dealing with types, so only strub modes that can be selected by
+     attributes in the front end matter.  If either mode is at-calls (for
+     functions) or internal (for variables), the conversion is not
+     compatible.  */
+  bool var_p = !FUNC_OR_METHOD_TYPE_P (t1);
+  enum strub_mode mr = var_p ? STRUB_INTERNAL : STRUB_AT_CALLS;
+  if (m1 == mr || m2 == mr)
+    return 0;
+
+  return 2;
+}
+
+/* Return the effective strub mode used for CALL, and set *TYPEP to
+   the effective type used for the call.  The effective type and mode
+   are those of the callee, unless the call involves a typecast.  */
+
+static enum strub_mode
+effective_strub_mode_for_call (gcall *call, tree *typep)
+{
+  tree type;
+  enum strub_mode mode;
+
+  if (strub_call_fntype_override_p (call))
+    {
+      type = gimple_call_fntype (call);
+      mode = get_strub_mode_from_type (type);
+    }
+  else
+    {
+      type = TREE_TYPE (TREE_TYPE (gimple_call_fn (call)));
+      tree decl = gimple_call_fndecl (call);
+      if (decl)
+	mode = get_strub_mode_from_fndecl (decl);
+      else
+	mode = get_strub_mode_from_type (type);
+    }
+
+  if (typep)
+    *typep = type;
+
+  return mode;
+}
+
+/* Create a distinct copy of the type of NODE's function, and change
+   the fntype of all calls to it with the same main type to the new
+   type.  */
+
+static void
+distinctify_node_type (cgraph_node *node)
+{
+  tree old_type = TREE_TYPE (node->decl);
+  tree new_type = build_distinct_type_copy (old_type);
+  tree new_ptr_type = NULL_TREE;
+
+  /* Remap any calls to node->decl that use old_type, or a variant
+     thereof, to new_type as well.  We don't look for aliases, their
+     declarations will have their types changed independently, and
+     we'll adjust their fntypes then.  */
+  for (cgraph_edge *e = node->callers; e; e = e->next_caller)
+    {
+      if (!e->call_stmt)
+	continue;
+      tree fnaddr = gimple_call_fn (e->call_stmt);
+      gcc_checking_assert (TREE_CODE (fnaddr) == ADDR_EXPR
+			   && TREE_OPERAND (fnaddr, 0) == node->decl);
+      if (strub_call_fntype_override_p (e->call_stmt))
+	continue;
+      if (!new_ptr_type)
+	new_ptr_type = build_pointer_type (new_type);
+      TREE_TYPE (fnaddr) = new_ptr_type;
+      gimple_call_set_fntype (e->call_stmt, new_type);
+    }
+
+  TREE_TYPE (node->decl) = new_type;
+}
+
+/* Return TRUE iff TYPE and any variants have the same strub mode.  */
+
+static bool
+same_strub_mode_in_variants_p (tree type)
+{
+  enum strub_mode mode = get_strub_mode_from_type (type);
+
+  for (tree other = TYPE_MAIN_VARIANT (type);
+       other != NULL_TREE; other = TYPE_NEXT_VARIANT (other))
+    if (type != other && mode != get_strub_mode_from_type (other))
+      return false;
+
+  /* Check that the canonical type, if set, either is in the same
+     variant chain, or has the same strub mode as type.  Also check
+     the variants of the canonical type.  */
+  if (TYPE_CANONICAL (type)
+      && (TYPE_MAIN_VARIANT (TYPE_CANONICAL (type))
+	  != TYPE_MAIN_VARIANT (type)))
+    {
+      if (mode != get_strub_mode_from_type (TYPE_CANONICAL (type)))
+	return false;
+      else
+	return same_strub_mode_in_variants_p (TYPE_CANONICAL (type));
+    }
+
+  return true;
+}
+
+/* Check that strub functions don't call non-strub functions, and that
+   always_inline strub functions are only called by strub
+   functions.  */
+
+static void
+verify_strub ()
+{
+  cgraph_node *node;
+
+  /* It's expected that check strub-wise pointer type compatibility of variables
+     and of functions is already taken care of by front-ends, on account of the
+     attribute's being marked as affecting type identity and of the creation of
+     distinct types.  */
+
+  /* Check that call targets in strub contexts have strub-callable types.  */
+
+  FOR_EACH_FUNCTION_WITH_GIMPLE_BODY (node)
+  {
+    enum strub_mode caller_mode = get_strub_mode (node);
+
+    for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+      {
+	gcc_checking_assert (e->indirect_unknown_callee);
+
+	if (!e->call_stmt)
+	  continue;
+
+	enum strub_mode callee_mode
+	  = effective_strub_mode_for_call (e->call_stmt, NULL);
+
+	if (!strub_callable_from_p (caller_mode, callee_mode))
+	  error_at (gimple_location (e->call_stmt),
+		    "indirect non-%<strub%> call in %<strub%> context %qD",
+		    node->decl);
+      }
+
+    for (cgraph_edge *e = node->callees; e; e = e->next_callee)
+      {
+	gcc_checking_assert (!e->indirect_unknown_callee);
+
+	if (!e->call_stmt)
+	  continue;
+
+	tree callee_fntype;
+	enum strub_mode callee_mode
+	  = effective_strub_mode_for_call (e->call_stmt, &callee_fntype);
+
+	if (!strub_callable_from_p (caller_mode, callee_mode))
+	  {
+	    if (callee_mode == STRUB_INLINABLE)
+	      error_at (gimple_location (e->call_stmt),
+			"calling %<always_inline%> %<strub%> %qD"
+			" in non-%<strub%> context %qD",
+			e->callee->decl, node->decl);
+	    else if (fndecl_built_in_p (e->callee->decl, BUILT_IN_APPLY_ARGS)
+		     && caller_mode == STRUB_INTERNAL)
+	      /* This is ok, it will be kept in the STRUB_WRAPPER, and removed
+		 from the STRUB_WRAPPED's strub context.  */
+	      continue;
+	    else if (!strub_call_fntype_override_p (e->call_stmt))
+	      error_at (gimple_location (e->call_stmt),
+			"calling non-%<strub%> %qD in %<strub%> context %qD",
+			e->callee->decl, node->decl);
+	    else
+	      error_at (gimple_location (e->call_stmt),
+			"calling %qD using non-%<strub%> type %qT"
+			" in %<strub%> context %qD",
+			e->callee->decl, callee_fntype, node->decl);
+	  }
+      }
+  }
+}
+
+namespace {
+
+/* Define a pass to compute strub modes.  */
+const pass_data pass_data_ipa_strub_mode = {
+  SIMPLE_IPA_PASS,
+  "strubm",
+  OPTGROUP_NONE,
+  TV_NONE,
+  PROP_cfg, // properties_required
+  0,	    // properties_provided
+  0,	    // properties_destroyed
+  0,	    // properties_start
+  0,	    // properties_finish
+};
+
+class pass_ipa_strub_mode : public simple_ipa_opt_pass
+{
+public:
+  pass_ipa_strub_mode (gcc::context *ctxt)
+    : simple_ipa_opt_pass (pass_data_ipa_strub_mode, ctxt)
+  {}
+  opt_pass *clone () { return new pass_ipa_strub_mode (m_ctxt); }
+  virtual bool gate (function *) {
+    /* In relaxed (-3) and strict (-4) settings, that only enable strub at a
+       function or variable attribute's request, the attribute handler changes
+       flag_strub to -1 or -2, respectively, if any strub-enabling occurence of
+       the attribute is found.  Therefore, if it remains at -3 or -4, nothing
+       that would enable strub was found, so we can disable it and avoid the
+       overhead.  */
+    if (flag_strub < -2)
+      flag_strub = 0;
+    return flag_strub;
+  }
+  virtual unsigned int execute (function *);
+};
+
+/* Define a pass to introduce strub transformations.  */
+const pass_data pass_data_ipa_strub = {
+  SIMPLE_IPA_PASS,
+  "strub",
+  OPTGROUP_NONE,
+  TV_NONE,
+  PROP_cfg | PROP_ssa, // properties_required
+  0,	    // properties_provided
+  0,	    // properties_destroyed
+  0,	    // properties_start
+  TODO_update_ssa
+  | TODO_cleanup_cfg
+  | TODO_rebuild_cgraph_edges
+  | TODO_verify_il, // properties_finish
+};
+
+class pass_ipa_strub : public simple_ipa_opt_pass
+{
+public:
+  pass_ipa_strub (gcc::context *ctxt)
+    : simple_ipa_opt_pass (pass_data_ipa_strub, ctxt)
+  {}
+  opt_pass *clone () { return new pass_ipa_strub (m_ctxt); }
+  virtual bool gate (function *) { return flag_strub && !seen_error (); }
+  virtual unsigned int execute (function *);
+
+  /* Define on demand and cache some types we use often.  */
+#define DEF_TYPE(IDX, NAME, INIT)		\
+  static inline tree get_ ## NAME () {		\
+    int idx = STRUB_TYPE_BASE + IDX;		\
+    static tree type = strub_cache[idx];	\
+    if (!type)					\
+      strub_cache[idx] = type = (INIT);		\
+    return type;				\
+  }
+
+  /* Use a distinct ptr_type_node to denote the watermark, so that we can
+     recognize it in arg lists and avoid modifying types twice.  */
+  DEF_TYPE (0, wmt, build_variant_type_copy (ptr_type_node))
+
+  DEF_TYPE (1, pwmt, build_reference_type (get_wmt ()))
+
+  DEF_TYPE (2, qpwmt,
+	    build_qualified_type (get_pwmt (),
+				  TYPE_QUAL_RESTRICT
+				  /* | TYPE_QUAL_CONST */))
+
+  DEF_TYPE (3, qptr,
+	    build_qualified_type (ptr_type_node,
+				  TYPE_QUAL_RESTRICT
+				  | TYPE_QUAL_CONST))
+
+  DEF_TYPE (4, qpvalst,
+	    build_qualified_type (build_reference_type
+				  (va_list_type_node),
+				  TYPE_QUAL_RESTRICT
+				  /* | TYPE_QUAL_CONST */))
+
+#undef DEF_TYPE
+
+  /* Define non-strub builtins on demand.  */
+#define DEF_NM_BUILTIN(NAME, CODE, FNTYPELIST)			\
+  static tree get_ ## NAME () {					\
+    tree decl = builtin_decl_explicit (CODE);			\
+    if (!decl)							\
+      {								\
+	tree type = build_function_type_list FNTYPELIST;	\
+	decl = add_builtin_function				\
+	  ("__builtin_" #NAME,					\
+	   type, CODE, BUILT_IN_NORMAL,				\
+	   NULL, NULL);						\
+	TREE_NOTHROW (decl) = true;				\
+	set_builtin_decl ((CODE), decl, true);			\
+      }								\
+    return decl;						\
+  }
+
+  DEF_NM_BUILTIN (stack_address,
+		  BUILT_IN_STACK_ADDRESS,
+		  (ptr_type_node, NULL))
+
+#undef DEF_NM_BUILTIN
+
+  /* Define strub builtins on demand.  */
+#define DEF_SS_BUILTIN(NAME, FNSPEC, CODE, FNTYPELIST)		\
+  static tree get_ ## NAME () {					\
+    tree decl = builtin_decl_explicit (CODE);			\
+    if (!decl)							\
+      {								\
+	tree type = build_function_type_list FNTYPELIST;	\
+	tree attrs = NULL;					\
+	if (FNSPEC)						\
+	  attrs = tree_cons (get_identifier ("fn spec"),	\
+			     build_tree_list			\
+			     (NULL_TREE,			\
+			      build_string (strlen (FNSPEC),	\
+					    (FNSPEC))),		\
+			     attrs);				\
+	decl = add_builtin_function_ext_scope			\
+	  ("__builtin___strub_" #NAME,				\
+	   type, CODE, BUILT_IN_NORMAL,				\
+	   "__strub_" #NAME, attrs);				\
+	TREE_NOTHROW (decl) = true;				\
+	set_builtin_decl ((CODE), decl, true);			\
+      }								\
+    return decl;						\
+  }
+
+  DEF_SS_BUILTIN (enter, ". Ot",
+		  BUILT_IN___STRUB_ENTER,
+		  (void_type_node, get_qpwmt (), NULL))
+  DEF_SS_BUILTIN (update, ". Wt",
+		  BUILT_IN___STRUB_UPDATE,
+		  (void_type_node, get_qpwmt (), NULL))
+  DEF_SS_BUILTIN (leave, ". w ",
+		  BUILT_IN___STRUB_LEAVE,
+		  (void_type_node, get_qpwmt (), NULL))
+
+#undef DEF_SS_BUILTIN
+
+    /* Define strub identifiers on demand.  */
+#define DEF_IDENT(IDX, NAME)						\
+  static inline tree get_ ## NAME () {					\
+    int idx = STRUB_IDENT_BASE + IDX;					\
+    tree identifier = strub_cache[idx];					\
+    if (!identifier)							\
+      strub_cache[idx] = identifier = get_identifier (".strub." #NAME);	\
+    return identifier;							\
+  }
+
+  DEF_IDENT (0, watermark_ptr)
+  DEF_IDENT (1, va_list_ptr)
+  DEF_IDENT (2, apply_args)
+
+#undef DEF_IDENT
+
+  static inline int adjust_at_calls_type (tree);
+  static inline void adjust_at_calls_call (cgraph_edge *, int, tree);
+  static inline void adjust_at_calls_calls (cgraph_node *);
+
+  /* Add to SEQ a call to the strub watermark update builtin, taking NODE's
+     location if given.  Optionally add the corresponding edge from NODE, with
+     execution frequency COUNT.  Return the modified SEQ.  */
+
+  static inline gimple_seq
+  call_update_watermark (tree wmptr, cgraph_node *node, profile_count count,
+			 gimple_seq seq = NULL)
+    {
+      tree uwm = get_update ();
+      gcall *update = gimple_build_call (uwm, 1, wmptr);
+      if (node)
+	gimple_set_location (update, DECL_SOURCE_LOCATION (node->decl));
+      gimple_seq_add_stmt (&seq, update);
+      if (node)
+	node->create_edge (cgraph_node::get_create (uwm), update, count, false);
+      return seq;
+    }
+
+};
+
+} // anon namespace
+
+/* Gather with this type a collection of parameters that we're turning into
+   explicit references.  */
+
+typedef hash_set<tree> indirect_parms_t;
+
+/* Dereference OP's incoming turned-into-reference parm if it's an
+   INDIRECT_PARMS or an ADDR_EXPR thereof.  Set *REC and return according to
+   gimple-walking expectations.  */
+
+static tree
+maybe_make_indirect (indirect_parms_t &indirect_parms, tree op, int *rec)
+{
+  if (DECL_P (op))
+    {
+      *rec = 0;
+      if (indirect_parms.contains (op))
+	{
+	  tree ret = gimple_fold_indirect_ref (op);
+	  if (!ret)
+	    ret = build2 (MEM_REF,
+			  TREE_TYPE (TREE_TYPE (op)),
+			  op,
+			  build_int_cst (TREE_TYPE (op), 0));
+	  return ret;
+	}
+    }
+  else if (TREE_CODE (op) == ADDR_EXPR
+	   && DECL_P (TREE_OPERAND (op, 0)))
+    {
+      *rec = 0;
+      if (indirect_parms.contains (TREE_OPERAND (op, 0)))
+	{
+	  op = TREE_OPERAND (op, 0);
+	  return op;
+	}
+    }
+
+  return NULL_TREE;
+}
+
+/* A gimple-walking function that adds dereferencing to indirect parms.  */
+
+static tree
+walk_make_indirect (tree *op, int *rec, void *arg)
+{
+  walk_stmt_info *wi = (walk_stmt_info *)arg;
+  indirect_parms_t &indirect_parms = *(indirect_parms_t *)wi->info;
+
+  if (!*op || TYPE_P (*op))
+    {
+      *rec = 0;
+      return NULL_TREE;
+    }
+
+  if (tree repl = maybe_make_indirect (indirect_parms, *op, rec))
+    {
+      *op = repl;
+      wi->changed = true;
+    }
+
+  return NULL_TREE;
+}
+
+/* A gimple-walking function that turns any non-gimple-val ADDR_EXPRs into a
+   separate SSA.  Though addresses of e.g. parameters, and of members thereof,
+   are gimple vals, turning parameters into references, with an extra layer of
+   indirection and thus explicit dereferencing, need to be regimplified.  */
+
+static tree
+walk_regimplify_addr_expr (tree *op, int *rec, void *arg)
+{
+  walk_stmt_info *wi = (walk_stmt_info *)arg;
+  gimple_stmt_iterator &gsi = *(gimple_stmt_iterator *)wi->info;
+
+  *rec = 0;
+
+  if (!*op || TREE_CODE (*op) != ADDR_EXPR)
+    return NULL_TREE;
+
+  if (!is_gimple_val (*op))
+    {
+      tree ret = force_gimple_operand_gsi (&gsi, *op, true,
+					   NULL_TREE, true, GSI_SAME_STMT);
+      gcc_assert (ret != *op);
+      *op = ret;
+      wi->changed = true;
+    }
+
+  return NULL_TREE;
+}
+
+/* Turn STMT's PHI arg defs into separate SSA defs if they've become
+   non-gimple_val.  Return TRUE if any edge insertions need to be committed.  */
+
+static bool
+walk_regimplify_phi (gphi *stmt)
+{
+  bool needs_commit = false;
+
+  for (unsigned i = 0, n = gimple_phi_num_args (stmt); i < n; i++)
+    {
+      tree op = gimple_phi_arg_def (stmt, i);
+      if ((TREE_CODE (op) == ADDR_EXPR
+	   && !is_gimple_val (op))
+	  /* ??? A PARM_DECL that was addressable in the original function and
+	     had its address in PHI nodes, but that became a reference in the
+	     wrapped clone would NOT be updated by update_ssa in PHI nodes.
+	     Alas, if we were to create a default def for it now, update_ssa
+	     would complain that the symbol that needed rewriting already has
+	     SSA names associated with it.  OTOH, leaving the PARM_DECL alone,
+	     it eventually causes errors because it remains unchanged in PHI
+	     nodes, but it gets rewritten as expected if it appears in other
+	     stmts.  So we cheat a little here, and force the PARM_DECL out of
+	     the PHI node and into an assignment.  It's a little expensive,
+	     because we insert it at the edge, which introduces a basic block
+	     that's entirely unnecessary, but it works, and the block will be
+	     removed as the default def gets propagated back into the PHI node,
+	     so the final optimized code looks just as expected.  */
+	  || (TREE_CODE (op) == PARM_DECL
+	      && !TREE_ADDRESSABLE (op)))
+	{
+	  tree temp = make_ssa_name (TREE_TYPE (op), stmt);
+	  if (TREE_CODE (op) == PARM_DECL)
+	    SET_SSA_NAME_VAR_OR_IDENTIFIER (temp, DECL_NAME (op));
+	  SET_PHI_ARG_DEF (stmt, i, temp);
+
+	  gimple *assign = gimple_build_assign (temp, op);
+	  if (gimple_phi_arg_has_location (stmt, i))
+	    gimple_set_location (assign, gimple_phi_arg_location (stmt, i));
+	  gsi_insert_on_edge (gimple_phi_arg_edge (stmt, i), assign);
+	  needs_commit = true;
+	}
+    }
+
+  return needs_commit;
+}
+
+/* Create a reference type to use for PARM when turning it into a reference.
+   NONALIASED causes the reference type to gain its own separate alias set, so
+   that accessing the indirectly-passed parm won'will not add aliasing
+   noise.  */
+
+static tree
+build_ref_type_for (tree parm, bool nonaliased = true)
+{
+  gcc_checking_assert (TREE_CODE (parm) == PARM_DECL);
+
+  tree ref_type = build_reference_type (TREE_TYPE (parm));
+
+  if (!nonaliased)
+    return ref_type;
+
+  /* Each PARM turned indirect still points to the distinct memory area at the
+     wrapper, and the reference in unchanging, so we might qualify it, but...
+     const is not really important, since we're only using default defs for the
+     reference parm anyway, and not introducing any defs, and restrict seems to
+     cause trouble.  E.g., libgnat/s-concat3.adb:str_concat_3 has memmoves that,
+     if it's wrapped, the memmoves are deleted in dse1.  Using a distinct alias
+     set seems to not run afoul of this problem, and it hopefully enables the
+     compiler to tell the pointers do point to objects that are not otherwise
+     aliased.  */
+  tree qref_type = build_variant_type_copy (ref_type);
+
+  TYPE_ALIAS_SET (qref_type) = new_alias_set ();
+  record_alias_subset (TYPE_ALIAS_SET (qref_type), get_alias_set (ref_type));
+
+  return qref_type;
+}
+
+/* Add cgraph edges from current_function_decl to callees in SEQ with frequency
+   COUNT, assuming all calls in SEQ are direct.  */
+
+static void
+add_call_edges_for_seq (gimple_seq seq, profile_count count)
+{
+  cgraph_node *node = cgraph_node::get_create (current_function_decl);
+
+  for (gimple_stmt_iterator gsi = gsi_start (seq);
+       !gsi_end_p (gsi); gsi_next (&gsi))
+    {
+      gimple *stmt = gsi_stmt (gsi);
+
+      gcall *call = dyn_cast <gcall *> (stmt);
+      if (!call)
+	continue;
+
+      tree callee = gimple_call_fndecl (call);
+      gcc_checking_assert (callee);
+      node->create_edge (cgraph_node::get_create (callee), call, count, false);
+    }
+}
+
+/* Insert SEQ after the call at GSI, as if the call was in a try block with SEQ
+   as finally, i.e., SEQ will run after the call whether it returns or
+   propagates an exception.  This handles block splitting, EH edge and block
+   creation, noreturn and nothrow optimizations, and even throwing calls without
+   preexisting local handlers.  */
+
+static void
+gsi_insert_finally_seq_after_call (gimple_stmt_iterator gsi, gimple_seq seq)
+{
+  if (!seq)
+    return;
+
+  gimple *stmt = gsi_stmt (gsi);
+
+  if (gimple_has_location (stmt))
+    annotate_all_with_location (seq, gimple_location (stmt));
+
+  gcall *call = dyn_cast <gcall *> (stmt);
+  bool noreturn_p = call && gimple_call_noreturn_p (call);
+  int eh_lp = lookup_stmt_eh_lp (stmt);
+  bool must_not_throw_p = eh_lp < 0;
+  bool nothrow_p = (must_not_throw_p
+		    || (call && gimple_call_nothrow_p (call))
+		    || (eh_lp <= 0
+			&& (TREE_NOTHROW (cfun->decl)
+			    || !flag_exceptions)));
+
+  if (noreturn_p && nothrow_p)
+    return;
+
+  /* Don't expect an EH edge if we're not to throw, or if we're not in an EH
+     region yet.  */
+  bool no_eh_edge_p = (nothrow_p || !eh_lp);
+  bool must_end_bb = stmt_ends_bb_p (stmt);
+
+  edge eft = NULL, eeh = NULL;
+  if (must_end_bb && !(noreturn_p && no_eh_edge_p))
+    {
+      gcc_checking_assert (gsi_one_before_end_p (gsi));
+
+      edge e;
+      edge_iterator ei;
+      FOR_EACH_EDGE (e, ei, gsi_bb (gsi)->succs)
+	{
+	  if ((e->flags & EDGE_EH))
+	    {
+	      gcc_checking_assert (!eeh);
+	      eeh = e;
+#if !CHECKING_P
+	      if (eft || noreturn_p)
+		break;
+#endif
+	    }
+	  if ((e->flags & EDGE_FALLTHRU))
+	    {
+	      gcc_checking_assert (!eft);
+	      eft = e;
+#if !CHECKING_P
+	      if (eeh || no_eh_edge_p)
+		break;
+#endif
+	    }
+	}
+
+      gcc_checking_assert (!(eft && (eft->flags & EDGE_FALLTHRU))
+			   == noreturn_p);
+      gcc_checking_assert (!(eeh && (eeh->flags & EDGE_EH))
+			   == no_eh_edge_p);
+      gcc_checking_assert (eft != eeh);
+    }
+
+  if (!noreturn_p)
+    {
+      gimple_seq nseq = nothrow_p ? seq : gimple_seq_copy (seq);
+
+      if (must_end_bb)
+	{
+	  gcc_checking_assert (gsi_one_before_end_p (gsi));
+	  add_call_edges_for_seq (nseq, eft->count ());
+	  gsi_insert_seq_on_edge_immediate (eft, nseq);
+	}
+      else
+	{
+	  add_call_edges_for_seq (nseq, gsi_bb (gsi)->count);
+	  gsi_insert_seq_after (&gsi, nseq, GSI_SAME_STMT);
+	}
+    }
+
+  if (nothrow_p)
+    return;
+
+  if (eh_lp)
+    {
+      add_call_edges_for_seq (seq, eeh->count ());
+      gsi_insert_seq_on_edge_immediate (eeh, seq);
+      return;
+    }
+
+  /* A throwing call may appear within a basic block in a function that doesn't
+     have any EH regions.  We're going to add a cleanup if so, therefore the
+     block will have to be split.  */
+  basic_block bb = gsi_bb (gsi);
+  if (!gsi_one_before_end_p (gsi))
+    split_block (bb, stmt);
+
+  /* Create a new block for the EH cleanup.  */
+  basic_block bb_eh_cleanup = create_empty_bb (bb);
+  if (dom_info_available_p (CDI_DOMINATORS))
+    set_immediate_dominator (CDI_DOMINATORS, bb_eh_cleanup, bb);
+  if (current_loops)
+    add_bb_to_loop (bb_eh_cleanup, current_loops->tree_root);
+
+  /* Make the new block an EH cleanup for the call.  */
+  eh_region new_r = gen_eh_region_cleanup (NULL);
+  eh_landing_pad lp = gen_eh_landing_pad (new_r);
+  tree label = gimple_block_label (bb_eh_cleanup);
+  lp->post_landing_pad = label;
+  EH_LANDING_PAD_NR (label) = lp->index;
+  add_stmt_to_eh_lp (stmt, lp->index);
+
+  /* Add the cleanup code to the EH cleanup block.  */
+  gsi = gsi_after_labels (bb_eh_cleanup);
+  gsi_insert_seq_before (&gsi, seq, GSI_SAME_STMT);
+
+  /* And then propagate the exception further.  */
+  gresx *resx = gimple_build_resx (new_r->index);
+  if (gimple_has_location (stmt))
+    gimple_set_location (resx, gimple_location (stmt));
+  gsi_insert_before (&gsi, resx, GSI_SAME_STMT);
+
+  /* Finally, wire the EH cleanup block into the CFG.  */
+  edge neeh = make_eh_edges (stmt);
+  neeh->probability = profile_probability::never ();
+  gcc_checking_assert (neeh->dest == bb_eh_cleanup);
+  gcc_checking_assert (!neeh->dest->count.initialized_p ());
+  neeh->dest->count = neeh->count ();
+  add_call_edges_for_seq (seq, neeh->dest->count);
+}
+
+/* Copy the attribute list at *ATTRS, minus any NAME attributes, leaving
+   shareable trailing nodes alone.  */
+
+static inline void
+remove_named_attribute_unsharing (const char *name, tree *attrs)
+{
+  while (tree found = lookup_attribute (name, *attrs))
+    {
+      /* Copy nodes up to the next NAME attribute.  */
+      while (*attrs != found)
+	{
+	  *attrs = tree_cons (TREE_PURPOSE (*attrs),
+			      TREE_VALUE (*attrs),
+			      TREE_CHAIN (*attrs));
+	  attrs = &TREE_CHAIN (*attrs);
+	}
+      /* Then drop it.  */
+      gcc_checking_assert (*attrs == found);
+      *attrs = TREE_CHAIN (*attrs);
+    }
+}
+
+/* Record the order of the last cgraph entry whose mode we've already set, so
+   that we can perform mode setting incrementally without duplication.  */
+static int last_cgraph_order;
+
+/* Set strub modes for functions introduced since the last call.  */
+
+static void
+ipa_strub_set_mode_for_new_functions ()
+{
+  if (symtab->order == last_cgraph_order)
+    return;
+
+  cgraph_node *node;
+
+  /* Go through the functions twice, once over non-aliases, and then over
+     aliases, so that aliases can reuse the mode computation of their ultimate
+     targets.  */
+  for (int aliases = 0; aliases <= 1; aliases++)
+    FOR_EACH_FUNCTION (node)
+    {
+      if (!node->alias != !aliases)
+	continue;
+
+      /*  Already done.  */
+      if (node->order < last_cgraph_order)
+	continue;
+
+      set_strub_mode (node);
+    }
+
+  last_cgraph_order = symtab->order;
+}
+
+/* Return FALSE if NODE is a strub context, and TRUE otherwise.  */
+
+bool
+strub_splittable_p (cgraph_node *node)
+{
+  switch (get_strub_mode (node))
+    {
+    case STRUB_WRAPPED:
+    case STRUB_AT_CALLS:
+    case STRUB_AT_CALLS_OPT:
+    case STRUB_INLINABLE:
+    case STRUB_INTERNAL:
+    case STRUB_WRAPPER:
+      return false;
+
+    case STRUB_CALLABLE:
+    case STRUB_DISABLED:
+      break;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  return true;
+}
+
+/* Return the PARM_DECL of the incoming watermark pointer, if there is one.  */
+
+tree
+strub_watermark_parm (tree fndecl)
+{
+  switch (get_strub_mode_from_fndecl (fndecl))
+    {
+    case STRUB_WRAPPED:
+    case STRUB_AT_CALLS:
+    case STRUB_AT_CALLS_OPT:
+      break;
+
+    case STRUB_INTERNAL:
+    case STRUB_WRAPPER:
+    case STRUB_CALLABLE:
+    case STRUB_DISABLED:
+    case STRUB_INLINABLE:
+      return NULL_TREE;
+
+    default:
+      gcc_unreachable ();
+    }
+
+  for (tree parm = DECL_ARGUMENTS (fndecl); parm; parm = DECL_CHAIN (parm))
+    /* The type (variant) compare finds the parameter even in a just-created
+       clone, before we set its name, but the type-based compare doesn't work
+       during builtin expansion within the lto compiler, because we'll have
+       created a separate variant in that run.  */
+    if (TREE_TYPE (parm) == pass_ipa_strub::get_qpwmt ()
+	|| DECL_NAME (parm) == pass_ipa_strub::get_watermark_ptr ())
+      return parm;
+
+  gcc_unreachable ();
+}
+
+/* Adjust a STRUB_AT_CALLS function TYPE, adding a watermark pointer if it
+   hasn't been added yet.  Return the named argument count.  */
+
+int
+pass_ipa_strub::adjust_at_calls_type (tree type)
+{
+  int named_args = 0;
+
+  gcc_checking_assert (same_strub_mode_in_variants_p (type));
+
+  if (!TYPE_ARG_TYPES (type))
+    return named_args;
+
+  tree *tlist = &TYPE_ARG_TYPES (type);
+  tree qpwmptrt = get_qpwmt ();
+  while (*tlist && TREE_VALUE (*tlist) != void_type_node)
+    {
+      /* The type has already been adjusted.  */
+      if (TREE_VALUE (*tlist) == qpwmptrt)
+	return named_args;
+      named_args++;
+      *tlist = tree_cons (TREE_PURPOSE (*tlist),
+			  TREE_VALUE (*tlist),
+			  TREE_CHAIN (*tlist));
+      tlist = &TREE_CHAIN (*tlist);
+    }
+
+  /* Add the new argument after all named arguments, so as to not mess with
+     attributes that reference parameters.  */
+  *tlist = tree_cons (NULL_TREE, get_qpwmt (), *tlist);
+
+#if ATTR_FNSPEC_DECONST_WATERMARK
+  if (!type_already_adjusted)
+    {
+      int flags = flags_from_decl_or_type (type);
+      tree fnspec = lookup_attribute ("fn spec", type);
+
+      if ((flags & (ECF_CONST | ECF_PURE | ECF_NOVOPS)) || fnspec)
+	{
+	  size_t xargs = 1;
+	  size_t curlen = 0, tgtlen = 2 + 2 * (named_args + xargs);
+	  auto_vec<char> nspecv (tgtlen);
+	  char *nspec = &nspecv[0]; /* It will *not* be NUL-terminated!  */
+	  if (fnspec)
+	    {
+	      tree fnspecstr = TREE_VALUE (TREE_VALUE (fnspec));
+	      curlen = TREE_STRING_LENGTH (fnspecstr);
+	      memcpy (nspec, TREE_STRING_POINTER (fnspecstr), curlen);
+	    }
+	  if (!curlen)
+	    {
+	      nspec[curlen++] = '.';
+	      nspec[curlen++] = ((flags & ECF_CONST)
+				 ? 'c'
+				 : (flags & ECF_PURE)
+				 ? 'p'
+				 : ' ');
+	    }
+	  while (curlen < tgtlen - 2 * xargs)
+	    {
+	      nspec[curlen++] = '.';
+	      nspec[curlen++] = ' ';
+	    }
+	  nspec[curlen++] = 'W';
+	  nspec[curlen++] = 't';
+
+	  /* The type has already been copied, if needed, before adding
+	     parameters.  */
+	  TYPE_ATTRIBUTES (type)
+	    = tree_cons (get_identifier ("fn spec"),
+			 build_tree_list (NULL_TREE,
+					  build_string (tgtlen, nspec)),
+			 TYPE_ATTRIBUTES (type));
+	}
+    }
+#endif
+
+  return named_args;
+}
+
+/* Adjust a call to an at-calls call target.  Create a watermark local variable
+   if needed, initialize it before, pass it to the callee according to the
+   modified at-calls interface, and release the callee's stack space after the
+   call, if not deferred.  If the call is const or pure, arrange for the
+   watermark to not be assumed unused or unchanged.  */
+
+void
+pass_ipa_strub::adjust_at_calls_call (cgraph_edge *e, int named_args,
+				      tree callee_fntype)
+{
+  gcc_checking_assert (e->call_stmt);
+  gcall *ocall = e->call_stmt;
+  gimple_stmt_iterator gsi = gsi_for_stmt (ocall);
+
+  /* Make sure we haven't modified this call yet.  */
+  gcc_checking_assert (!(int (gimple_call_num_args (ocall)) > named_args
+			 && (TREE_TYPE (gimple_call_arg (ocall, named_args))
+			     == get_pwmt ())));
+
+  /* If we're already within a strub context, pass on the incoming watermark
+     pointer, and omit the enter and leave calls around the modified call, as an
+     optimization, or as a means to satisfy a tail-call requirement.  */
+  tree swmp = ((optimize_size || optimize > 2
+		|| gimple_call_must_tail_p (ocall)
+		|| (optimize == 2 && gimple_call_tail_p (ocall)))
+	       ? strub_watermark_parm (e->caller->decl)
+	       : NULL_TREE);
+  bool omit_own_watermark = swmp;
+  tree swm = NULL_TREE;
+  if (!omit_own_watermark)
+    {
+      swm = create_tmp_var (get_wmt (), ".strub.watermark");
+      TREE_ADDRESSABLE (swm) = true;
+      swmp = build1 (ADDR_EXPR, get_pwmt (), swm);
+
+      /* Initialize the watermark before the call.  */
+      tree enter = get_enter ();
+      gcall *stptr = gimple_build_call (enter, 1,
+					unshare_expr (swmp));
+      if (gimple_has_location (ocall))
+	gimple_set_location (stptr, gimple_location (ocall));
+      gsi_insert_before (&gsi, stptr, GSI_SAME_STMT);
+      e->caller->create_edge (cgraph_node::get_create (enter),
+			      stptr, gsi_bb (gsi)->count, false);
+    }
+
+
+  /* Replace the call with one that passes the swmp argument first.  */
+  gcall *wrcall;
+  { gcall *stmt = ocall;
+    // Mostly copied from gimple_call_copy_skip_args.
+    int i = 0;
+    int nargs = gimple_call_num_args (stmt);
+    auto_vec<tree> vargs (MAX (nargs, named_args) + 1);
+    gcall *new_stmt;
+
+    /* pr71109.c calls a prototypeless function, then defines it with
+       additional arguments.  It's ill-formed, but after it's inlined,
+       it somehow works out.  */
+    for (; i < named_args && i < nargs; i++)
+      vargs.quick_push (gimple_call_arg (stmt, i));
+    for (; i < named_args; i++)
+      vargs.quick_push (null_pointer_node);
+
+    vargs.quick_push (unshare_expr (swmp));
+
+    for (; i < nargs; i++)
+      vargs.quick_push (gimple_call_arg (stmt, i));
+
+    if (gimple_call_internal_p (stmt))
+      gcc_unreachable ();
+    else
+      new_stmt = gimple_build_call_vec (gimple_call_fn (stmt), vargs);
+    gimple_call_set_fntype (new_stmt, callee_fntype);
+
+    if (gimple_call_lhs (stmt))
+      gimple_call_set_lhs (new_stmt, gimple_call_lhs (stmt));
+
+    gimple_move_vops (new_stmt, stmt);
+
+    if (gimple_has_location (stmt))
+      gimple_set_location (new_stmt, gimple_location (stmt));
+    gimple_call_copy_flags (new_stmt, stmt);
+    gimple_call_set_chain (new_stmt, gimple_call_chain (stmt));
+
+    gimple_set_modified (new_stmt, true);
+
+    wrcall = new_stmt;
+  }
+
+  update_stmt (wrcall);
+  gsi_replace (&gsi, wrcall, true);
+  cgraph_edge::set_call_stmt (e, wrcall, false);
+
+  /* Insert the strub code after the call.  */
+  gimple_seq seq = NULL;
+
+#if !ATTR_FNSPEC_DECONST_WATERMARK
+  /* If the call will be assumed to not modify or even read the
+     watermark, make it read and modified ourselves.  */
+  if ((gimple_call_flags (wrcall)
+       & (ECF_CONST | ECF_PURE | ECF_NOVOPS)))
+    {
+      if (!swm)
+	swm = build2 (MEM_REF,
+		      TREE_TYPE (TREE_TYPE (swmp)),
+		      swmp,
+		      build_int_cst (TREE_TYPE (swmp), 0));
+
+      vec<tree, va_gc> *inputs = NULL;
+      vec<tree, va_gc> *outputs = NULL;
+      vec_safe_push (outputs,
+		     build_tree_list
+		     (build_tree_list
+		      (NULL_TREE, build_string (2, "=m")),
+		      unshare_expr (swm)));
+      vec_safe_push (inputs,
+		     build_tree_list
+		     (build_tree_list
+		      (NULL_TREE, build_string (1, "m")),
+		      unshare_expr (swm)));
+      gasm *forcemod = gimple_build_asm_vec ("", inputs, outputs,
+					     NULL, NULL);
+      gimple_seq_add_stmt (&seq, forcemod);
+
+      /* If the call will be assumed to not even read the watermark,
+	 make sure it is already in memory before the call.  */
+      if ((gimple_call_flags (wrcall) & ECF_CONST))
+	{
+	  vec<tree, va_gc> *inputs = NULL;
+	  vec_safe_push (inputs,
+			 build_tree_list
+			 (build_tree_list
+			  (NULL_TREE, build_string (1, "m")),
+			  unshare_expr (swm)));
+	  gasm *force_store = gimple_build_asm_vec ("", inputs, NULL,
+						    NULL, NULL);
+	  if (gimple_has_location (wrcall))
+	    gimple_set_location (force_store, gimple_location (wrcall));
+	  gsi_insert_before (&gsi, force_store, GSI_SAME_STMT);
+	}
+    }
+#endif
+
+  if (!omit_own_watermark)
+    {
+      gcall *sleave = gimple_build_call (get_leave (), 1,
+					 unshare_expr (swmp));
+      gimple_seq_add_stmt (&seq, sleave);
+
+      gassign *clobber = gimple_build_assign (swm,
+					      build_clobber
+					      (TREE_TYPE (swm)));
+      gimple_seq_add_stmt (&seq, clobber);
+    }
+
+  gsi_insert_finally_seq_after_call (gsi, seq);
+}
+
+/* Adjust all at-calls calls in NODE. */
+
+void
+pass_ipa_strub::adjust_at_calls_calls (cgraph_node *node)
+{
+  /* Adjust unknown-callee indirect calls with STRUB_AT_CALLS types within
+     onode.  */
+  if (node->indirect_calls)
+    {
+      push_cfun (DECL_STRUCT_FUNCTION (node->decl));
+      for (cgraph_edge *e = node->indirect_calls; e; e = e->next_callee)
+	{
+	  gcc_checking_assert (e->indirect_unknown_callee);
+
+	  if (!e->call_stmt)
+	    continue;
+
+	  tree callee_fntype;
+	  enum strub_mode callee_mode
+	    = effective_strub_mode_for_call (e->call_stmt, &callee_fntype);
+
+	  if (callee_mode != STRUB_AT_CALLS
+	      && callee_mode != STRUB_AT_CALLS_OPT)
+	    continue;
+
+	  int named_args = adjust_at_calls_type (callee_fntype);
+
+	  adjust_at_calls_call (e, named_args, callee_fntype);
+	}
+      pop_cfun ();
+    }
+
+  if (node->callees)
+    {
+      push_cfun (DECL_STRUCT_FUNCTION (node->decl));
+      for (cgraph_edge *e = node->callees; e; e = e->next_callee)
+	{
+	  gcc_checking_assert (!e->indirect_unknown_callee);
+
+	  if (!e->call_stmt)
+	    continue;
+
+	  tree callee_fntype;
+	  enum strub_mode callee_mode
+	    = effective_strub_mode_for_call (e->call_stmt, &callee_fntype);
+
+	  if (callee_mode != STRUB_AT_CALLS
+	      && callee_mode != STRUB_AT_CALLS_OPT)
+	    continue;
+
+	  int named_args = adjust_at_calls_type (callee_fntype);
+
+	  adjust_at_calls_call (e, named_args, callee_fntype);
+	}
+      pop_cfun ();
+    }
+}
+
+/* The strubm (strub mode) pass computes a strub mode for each function in the
+   call graph, and checks, before any inlining, that strub callability
+   requirements in effect are satisfied.  */
+
+unsigned int
+pass_ipa_strub_mode::execute (function *)
+{
+  last_cgraph_order = 0;
+  ipa_strub_set_mode_for_new_functions ();
+
+  /* Verify before any inlining or other transformations.  */
+  verify_strub ();
+
+  return 0;
+}
+
+/* Create a strub mode pass.  */
+
+simple_ipa_opt_pass *
+make_pass_ipa_strub_mode (gcc::context *ctxt)
+{
+  return new pass_ipa_strub_mode (ctxt);
+}
+
+/* The strub pass proper adjusts types, signatures, and at-calls calls, and
+   splits internal-strub functions.  */
+
+unsigned int
+pass_ipa_strub::execute (function *)
+{
+  cgraph_node *onode;
+
+  ipa_strub_set_mode_for_new_functions ();
+
+  /* First, adjust the signature of at-calls functions.  We adjust types of
+     at-calls functions first, so that we don't modify types in place unless
+     strub is explicitly requested.  */
+  FOR_EACH_FUNCTION (onode)
+  {
+    enum strub_mode mode = get_strub_mode (onode);
+
+    if (mode == STRUB_AT_CALLS
+	|| mode == STRUB_AT_CALLS_OPT)
+      {
+	/* Create a type variant if strubbing was not explicitly requested in
+	   the function type.  */
+	if (get_strub_mode_from_type (TREE_TYPE (onode->decl)) != mode)
+	  distinctify_node_type (onode);
+
+	int named_args = adjust_at_calls_type (TREE_TYPE (onode->decl));
+
+	/* An external function explicitly declared with strub won't have a
+	   body.  Even with implicit at-calls strub, a function may have had its
+	   body removed after we selected the mode, and then we have nothing
+	   further to do.  */
+	if (!onode->has_gimple_body_p ())
+	  continue;
+
+	tree *pargs = &DECL_ARGUMENTS (onode->decl);
+
+	/* A noninterposable_alias reuses the same parm decl chain, don't add
+	   the parm twice.  */
+	bool aliased_parms = (onode->alias && *pargs
+			      && DECL_CONTEXT (*pargs) != onode->decl);
+
+	if (aliased_parms)
+	  continue;
+
+	for (int i = 0; i < named_args; i++)
+	  pargs = &DECL_CHAIN (*pargs);
+
+	tree wmptr = build_decl (DECL_SOURCE_LOCATION (onode->decl),
+				 PARM_DECL,
+				 get_watermark_ptr (),
+				 get_qpwmt ());
+	DECL_ARTIFICIAL (wmptr) = 1;
+	DECL_ARG_TYPE (wmptr) = get_qpwmt ();
+	DECL_CONTEXT (wmptr) = onode->decl;
+	TREE_USED (wmptr) = 1;
+	DECL_CHAIN (wmptr) = *pargs;
+	*pargs = wmptr;
+
+	if (onode->alias)
+	  continue;
+
+	cgraph_node *nnode = onode;
+	push_cfun (DECL_STRUCT_FUNCTION (nnode->decl));
+
+	{
+	  edge e = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	  gimple_seq seq = call_update_watermark (wmptr, nnode, e->src->count);
+	  gsi_insert_seq_on_edge_immediate (e, seq);
+	}
+
+	if (DECL_STRUCT_FUNCTION (nnode->decl)->calls_alloca)
+	  {
+	    basic_block bb;
+	    FOR_EACH_BB_FN (bb, cfun)
+	      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+		   !gsi_end_p (gsi); gsi_next (&gsi))
+		{
+		  gimple *stmt = gsi_stmt (gsi);
+
+		  gcall *call = dyn_cast <gcall *> (stmt);
+
+		  if (!call)
+		    continue;
+
+		  if (gimple_alloca_call_p (call))
+		    {
+		      /* Capture stack growth.  */
+		      gimple_seq seq = call_update_watermark (wmptr, NULL,
+							      gsi_bb (gsi)
+							      ->count);
+		      gsi_insert_finally_seq_after_call (gsi, seq);
+		    }
+		}
+	  }
+
+	pop_cfun ();
+      }
+  }
+
+  FOR_EACH_FUNCTION (onode)
+  {
+    if (!onode->has_gimple_body_p ())
+      continue;
+
+    enum strub_mode mode = get_strub_mode (onode);
+
+    if (mode != STRUB_INTERNAL)
+      {
+	adjust_at_calls_calls (onode);
+	continue;
+      }
+
+    bool is_stdarg = calls_builtin_va_start_p (onode);;
+    bool apply_args = calls_builtin_apply_args_p (onode);
+
+    vec<ipa_adjusted_param, va_gc> *nparms = NULL;
+    unsigned j = 0;
+    {
+      // The following loop copied from ipa-split.c:split_function.
+      for (tree parm = DECL_ARGUMENTS (onode->decl);
+	   parm; parm = DECL_CHAIN (parm), j++)
+	{
+	  ipa_adjusted_param adj = {};
+	  adj.op = IPA_PARAM_OP_COPY;
+	  adj.base_index = j;
+	  adj.prev_clone_index = j;
+	  vec_safe_push (nparms, adj);
+	}
+
+      if (apply_args)
+	{
+	  ipa_adjusted_param aaadj = {};
+	  aaadj.op = IPA_PARAM_OP_NEW;
+	  aaadj.type = get_qptr ();
+	  vec_safe_push (nparms, aaadj);
+	}
+
+      if (is_stdarg)
+	{
+	  ipa_adjusted_param vladj = {};
+	  vladj.op = IPA_PARAM_OP_NEW;
+	  vladj.type = get_qpvalst ();
+	  vec_safe_push (nparms, vladj);
+	}
+
+      ipa_adjusted_param wmadj = {};
+      wmadj.op = IPA_PARAM_OP_NEW;
+      wmadj.type = get_qpwmt ();
+      vec_safe_push (nparms, wmadj);
+    }
+    ipa_param_adjustments adj (nparms, -1, false);
+
+    cgraph_node *nnode = onode->create_version_clone_with_body
+      (auto_vec<cgraph_edge *> (0),
+       NULL, &adj, NULL, NULL, "strub", NULL);
+
+    if (!nnode)
+      {
+	error_at (DECL_SOURCE_LOCATION (onode->decl),
+		  "failed to split %qD for %<strub%>",
+		  onode->decl);
+	continue;
+      }
+
+    onode->split_part = true;
+    if (onode->calls_comdat_local)
+      nnode->add_to_same_comdat_group (onode);
+
+    set_strub_mode_to (onode, STRUB_WRAPPER);
+    set_strub_mode_to (nnode, STRUB_WRAPPED);
+
+    adjust_at_calls_calls (nnode);
+
+    /* Decide which of the wrapped function's parms we want to turn into
+       references to the argument passed to the wrapper.  In general, we want to
+       copy small arguments, and avoid copying large ones.  Variable-sized array
+       lengths given by other arguments, as in 20020210-1.c, would lead to
+       problems if passed by value, after resetting the original function and
+       dropping the length computation; passing them by reference works.
+       DECL_BY_REFERENCE is *not* a substitute for this: it involves copying
+       anyway, but performed at the caller.  */
+    indirect_parms_t indirect_nparms (3, false);
+    unsigned adjust_ftype = 0;
+    unsigned named_args = 0;
+    for (tree parm = DECL_ARGUMENTS (onode->decl),
+	   nparm = DECL_ARGUMENTS (nnode->decl),
+	   nparmt = TYPE_ARG_TYPES (TREE_TYPE (nnode->decl));
+	 parm;
+	 named_args++,
+	   parm = DECL_CHAIN (parm),
+	   nparm = DECL_CHAIN (nparm),
+	   nparmt = nparmt ? TREE_CHAIN (nparmt) : NULL_TREE)
+      if (!(0 /* DECL_BY_REFERENCE (narg) */
+	    || is_gimple_reg_type (TREE_TYPE (nparm))
+	    || VECTOR_TYPE_P (TREE_TYPE (nparm))
+	    || TREE_CODE (TREE_TYPE (nparm)) == COMPLEX_TYPE
+	    || (tree_fits_uhwi_p (TYPE_SIZE_UNIT (TREE_TYPE (nparm)))
+		&& (tree_to_uhwi (TYPE_SIZE_UNIT (TREE_TYPE (nparm)))
+		    <= 4 * UNITS_PER_WORD))))
+	{
+	  indirect_nparms.add (nparm);
+
+	  /* ??? Is there any case in which it is not safe to suggest the parms
+	     turned indirect don't alias anything else?  They are distinct,
+	     unaliased memory in the wrapper, and the wrapped can't possibly
+	     take pointers into them because none of the pointers passed to the
+	     wrapper can alias other incoming parameters passed by value, even
+	     if with transparent reference, and the wrapper doesn't take any
+	     extra parms that could point into wrapper's parms.  So we can
+	     probably drop the TREE_ADDRESSABLE and keep the TRUE.  */
+	  tree ref_type = build_ref_type_for (nparm,
+					      true
+					      || !TREE_ADDRESSABLE (parm));
+
+	  DECL_ARG_TYPE (nparm) = TREE_TYPE (nparm) = ref_type;
+	  relayout_decl (nparm);
+	  TREE_ADDRESSABLE (nparm) = 0;
+	  DECL_BY_REFERENCE (nparm) = 0;
+	  DECL_NOT_GIMPLE_REG_P (nparm) = 0;
+	  /* ??? This avoids mismatches in debug info bind stmts in
+	     e.g. a-chahan .  */
+	  DECL_ABSTRACT_ORIGIN (nparm) = NULL;
+
+	  if (nparmt)
+	    adjust_ftype++;
+	}
+
+    /* Also adjust the wrapped function type, if needed.  */
+    if (adjust_ftype)
+      {
+	tree nftype = TREE_TYPE (nnode->decl);
+
+	/* We always add at least one argument at the end of the signature, when
+	   cloning the function, so we don't expect to need to duplicate the
+	   type here.  */
+	gcc_checking_assert (TYPE_ARG_TYPES (nftype)
+			     != TYPE_ARG_TYPES (TREE_TYPE (onode->decl)));
+
+	/* Check that fnspec still works for the modified function signature,
+	   and drop it otherwise.  */
+	bool drop_fnspec = false;
+	tree fnspec = lookup_attribute ("fn spec", TYPE_ATTRIBUTES (nftype));
+	attr_fnspec spec = fnspec ? attr_fnspec (fnspec) : attr_fnspec ("");
+
+	unsigned retcopy;
+	if (!(fnspec && spec.returns_arg (&retcopy)))
+	  retcopy = (unsigned) -1;
+
+	unsigned i = 0;
+	for (tree nparm = DECL_ARGUMENTS (nnode->decl),
+	       nparmt = TYPE_ARG_TYPES (nftype);
+	     adjust_ftype > 0;
+	     i++, nparm = DECL_CHAIN (nparm), nparmt = TREE_CHAIN (nparmt))
+	  if (indirect_nparms.contains (nparm))
+	    {
+	      TREE_VALUE (nparmt) = TREE_TYPE (nparm);
+	      adjust_ftype--;
+
+	      if (fnspec && !drop_fnspec)
+		{
+		  if (i == retcopy)
+		    drop_fnspec = true;
+		  else if (spec.arg_specified_p (i))
+		    {
+		      /* Properties that apply to pointers only must not be
+			 present, because we don't make pointers further
+			 indirect.  */
+		      gcc_checking_assert
+			(!spec.arg_max_access_size_given_by_arg_p (i, NULL));
+		      gcc_checking_assert (!spec.arg_copied_to_arg_p (i, NULL));
+
+		      /* Any claim of direct access only is invalidated by
+			 adding an indirection level.  */
+		      if (spec.arg_direct_p (i))
+			drop_fnspec = true;
+
+		      /* If there's a claim the argument is not read from, the
+			 added indirection invalidates it: if the argument is
+			 used at all, then the pointer will necessarily be
+			 read.  */
+		      if (!spec.arg_maybe_read_p (i)
+			  && spec.arg_used_p (i))
+			drop_fnspec = true;
+		    }
+		}
+	    }
+
+	/* ??? Maybe we could adjust it instead.  */
+	if (drop_fnspec)
+	  remove_named_attribute_unsharing ("fn spec",
+					    &TYPE_ATTRIBUTES (nftype));
+
+	TREE_TYPE (nnode->decl) = nftype;
+      }
+
+#if ATTR_FNSPEC_DECONST_WATERMARK
+    {
+      int flags = flags_from_decl_or_type (nnode->decl);
+      tree fnspec = lookup_attribute ("fn spec", TREE_TYPE (nnode->decl));
+
+      if ((flags & (ECF_CONST | ECF_PURE | ECF_NOVOPS)) || fnspec)
+	{
+	  size_t xargs = 1 + int (is_stdarg) + int (apply_args);
+	  size_t curlen = 0, tgtlen = 2 + 2 * (named_args + xargs);
+	  auto_vec<char> nspecv (tgtlen);
+	  char *nspec = &nspecv[0]; /* It will *not* be NUL-terminated!  */
+	  bool no_writes_p = true;
+	  if (fnspec)
+	    {
+	      tree fnspecstr = TREE_VALUE (TREE_VALUE (fnspec));
+	      curlen = TREE_STRING_LENGTH (fnspecstr);
+	      memcpy (nspec, TREE_STRING_POINTER (fnspecstr), curlen);
+	      if (!(flags & (ECF_CONST | ECF_PURE | ECF_NOVOPS))
+		  && curlen >= 2
+		  && nspec[1] != 'c' && nspec[1] != 'C'
+		  && nspec[1] != 'p' && nspec[1] != 'P')
+		no_writes_p = false;
+	    }
+	  if (!curlen)
+	    {
+	      nspec[curlen++] = '.';
+	      nspec[curlen++] = ((flags & ECF_CONST)
+				 ? 'c'
+				 : (flags & ECF_PURE)
+				 ? 'p'
+				 : ' ');
+	    }
+	  while (curlen < tgtlen - 2 * xargs)
+	    {
+	      nspec[curlen++] = '.';
+	      nspec[curlen++] = ' ';
+	    }
+
+	  /* These extra args are unlikely to be present in const or pure
+	     functions.  It's conceivable that a function that takes variable
+	     arguments, or that passes its arguments on to another function,
+	     could be const or pure, but it would not modify the arguments, and,
+	     being pure or const, it couldn't possibly modify or even access
+	     memory referenced by them.  But it can read from these internal
+	     data structures created by the wrapper, and from any
+	     argument-passing memory referenced by them, so we denote the
+	     possibility of reading from multiple levels of indirection, but
+	     only of reading because const/pure.  */
+	  if (apply_args)
+	    {
+	      nspec[curlen++] = 'r';
+	      nspec[curlen++] = ' ';
+	    }
+	  if (is_stdarg)
+	    {
+	      nspec[curlen++] = (no_writes_p ? 'r' : '.');
+	      nspec[curlen++] = (no_writes_p ? 't' : ' ');
+	    }
+
+	  nspec[curlen++] = 'W';
+	  nspec[curlen++] = 't';
+
+	  /* The type has already been copied before adding parameters.  */
+	  gcc_checking_assert (TYPE_ARG_TYPES (TREE_TYPE (nnode->decl))
+			       != TYPE_ARG_TYPES (TREE_TYPE (onode->decl)));
+	  TYPE_ATTRIBUTES (TREE_TYPE (nnode->decl))
+	    = tree_cons (get_identifier ("fn spec"),
+			 build_tree_list (NULL_TREE,
+					  build_string (tgtlen, nspec)),
+			 TYPE_ATTRIBUTES (TREE_TYPE (nnode->decl)));
+	}
+    }
+#endif
+
+    {
+      tree decl = onode->decl;
+      cgraph_node *target = nnode;
+
+      { // copied from create_wrapper
+
+	/* Preserve DECL_RESULT so we get right by reference flag.  */
+	tree decl_result = DECL_RESULT (decl);
+
+	/* Remove the function's body but keep arguments to be reused
+	   for thunk.  */
+	onode->release_body (true);
+	onode->reset (/* unlike create_wrapper: preserve_comdat_group = */true);
+
+	DECL_UNINLINABLE (decl) = false;
+	DECL_RESULT (decl) = decl_result;
+	DECL_INITIAL (decl) = NULL;
+	allocate_struct_function (decl, false);
+	set_cfun (NULL);
+
+	/* Turn alias into thunk and expand it into GIMPLE representation.  */
+	onode->definition = true;
+
+	thunk_info::get_create (onode);
+	onode->thunk = true;
+	onode->create_edge (target, NULL, onode->count);
+	onode->callees->can_throw_external = !TREE_NOTHROW (target->decl);
+
+	tree arguments = DECL_ARGUMENTS (decl);
+
+	while (arguments)
+	  {
+	    TREE_ADDRESSABLE (arguments) = false;
+	    arguments = TREE_CHAIN (arguments);
+	  }
+
+	{
+	  tree alias = onode->callees->callee->decl;
+	  tree thunk_fndecl = decl;
+	  tree a;
+
+	  int nxargs = 1 + is_stdarg + apply_args;
+
+	  { // Simplified from expand_thunk.
+	    tree restype;
+	    basic_block bb, then_bb, else_bb, return_bb;
+	    gimple_stmt_iterator bsi;
+	    int nargs = 0;
+	    tree arg;
+	    int i;
+	    tree resdecl;
+	    tree restmp = NULL;
+
+	    gcall *call;
+	    greturn *ret;
+	    bool alias_is_noreturn = TREE_THIS_VOLATILE (alias);
+
+	    a = DECL_ARGUMENTS (thunk_fndecl);
+
+	    current_function_decl = thunk_fndecl;
+
+	    /* Ensure thunks are emitted in their correct sections.  */
+	    resolve_unique_section (thunk_fndecl, 0,
+				    flag_function_sections);
+
+	    bitmap_obstack_initialize (NULL);
+
+	    /* Build the return declaration for the function.  */
+	    restype = TREE_TYPE (TREE_TYPE (thunk_fndecl));
+	    if (DECL_RESULT (thunk_fndecl) == NULL_TREE)
+	      {
+		resdecl = build_decl (input_location, RESULT_DECL, 0, restype);
+		DECL_ARTIFICIAL (resdecl) = 1;
+		DECL_IGNORED_P (resdecl) = 1;
+		DECL_CONTEXT (resdecl) = thunk_fndecl;
+		DECL_RESULT (thunk_fndecl) = resdecl;
+	      }
+	    else
+	      resdecl = DECL_RESULT (thunk_fndecl);
+
+	    profile_count cfg_count = onode->count;
+	    if (!cfg_count.initialized_p ())
+	      cfg_count = profile_count::from_gcov_type (BB_FREQ_MAX).guessed_local ();
+
+	    bb = then_bb = else_bb = return_bb
+	      = init_lowered_empty_function (thunk_fndecl, true, cfg_count);
+
+	    bsi = gsi_start_bb (bb);
+
+	    /* Build call to the function being thunked.  */
+	    if (!VOID_TYPE_P (restype)
+		&& (!alias_is_noreturn
+		    || TREE_ADDRESSABLE (restype)
+		    || TREE_CODE (TYPE_SIZE_UNIT (restype)) != INTEGER_CST))
+	      {
+		if (DECL_BY_REFERENCE (resdecl))
+		  {
+		    restmp = gimple_fold_indirect_ref (resdecl);
+		    if (!restmp)
+		      restmp = build2 (MEM_REF,
+				       TREE_TYPE (TREE_TYPE (resdecl)),
+				       resdecl,
+				       build_int_cst (TREE_TYPE (resdecl), 0));
+		  }
+		else if (!is_gimple_reg_type (restype))
+		  {
+		    if (aggregate_value_p (resdecl, TREE_TYPE (thunk_fndecl)))
+		      {
+			restmp = resdecl;
+
+			if (VAR_P (restmp))
+			  {
+			    add_local_decl (cfun, restmp);
+			    BLOCK_VARS (DECL_INITIAL (current_function_decl))
+			      = restmp;
+			  }
+		      }
+		    else
+		      restmp = create_tmp_var (restype, "retval");
+		  }
+		else
+		  restmp = create_tmp_reg (restype, "retval");
+	      }
+
+	    for (arg = a; arg; arg = DECL_CHAIN (arg))
+	      nargs++;
+	    auto_vec<tree> vargs (nargs + nxargs);
+	    i = 0;
+	    arg = a;
+
+	    if (nargs)
+	      for (tree nparm = DECL_ARGUMENTS (nnode->decl);
+		   i < nargs;
+		   i++, arg = DECL_CHAIN (arg), nparm = DECL_CHAIN (nparm))
+		{
+		  tree save_arg = arg;
+		  tree tmp = arg;
+
+		  /* Arrange to pass indirectly the parms, if we decided to do
+		     so, and revert its type in the wrapper.  */
+		  if (indirect_nparms.contains (nparm))
+		    {
+		      tree ref_type = TREE_TYPE (nparm);
+		      TREE_ADDRESSABLE (arg) = true;
+		      tree addr = build1 (ADDR_EXPR, ref_type, arg);
+		      tmp = arg = addr;
+		    }
+		  else
+		    DECL_NOT_GIMPLE_REG_P (arg) = 0;
+
+		  /* Convert the argument back to the type used by the calling
+		     conventions, e.g. a non-prototyped float type is passed as
+		     double, as in 930603-1.c, and needs to be converted back to
+		     double to be passed on unchanged to the wrapped
+		     function.  */
+		  if (TREE_TYPE (nparm) != DECL_ARG_TYPE (nparm))
+		    arg = fold_convert (DECL_ARG_TYPE (nparm), arg);
+
+		  if (!is_gimple_val (arg))
+		    {
+		      tmp = create_tmp_reg (TYPE_MAIN_VARIANT
+					    (TREE_TYPE (arg)), "arg");
+		      gimple *stmt = gimple_build_assign (tmp, arg);
+		      gsi_insert_after (&bsi, stmt, GSI_NEW_STMT);
+		    }
+		  vargs.quick_push (tmp);
+		  arg = save_arg;
+		}
+	    /* These strub arguments are adjusted later.  */
+	    if (apply_args)
+	      vargs.quick_push (null_pointer_node);
+	    if (is_stdarg)
+	      vargs.quick_push (null_pointer_node);
+	    vargs.quick_push (null_pointer_node);
+	    call = gimple_build_call_vec (build_fold_addr_expr_loc (0, alias),
+					  vargs);
+	    onode->callees->call_stmt = call;
+	    // gimple_call_set_from_thunk (call, true);
+	    if (DECL_STATIC_CHAIN (alias))
+	      {
+		tree p = DECL_STRUCT_FUNCTION (alias)->static_chain_decl;
+		tree type = TREE_TYPE (p);
+		tree decl = build_decl (DECL_SOURCE_LOCATION (thunk_fndecl),
+					PARM_DECL, create_tmp_var_name ("CHAIN"),
+					type);
+		DECL_ARTIFICIAL (decl) = 1;
+		DECL_IGNORED_P (decl) = 1;
+		TREE_USED (decl) = 1;
+		DECL_CONTEXT (decl) = thunk_fndecl;
+		DECL_ARG_TYPE (decl) = type;
+		TREE_READONLY (decl) = 1;
+
+		struct function *sf = DECL_STRUCT_FUNCTION (thunk_fndecl);
+		sf->static_chain_decl = decl;
+
+		gimple_call_set_chain (call, decl);
+	      }
+
+	    /* Return slot optimization is always possible and in fact required to
+	       return values with DECL_BY_REFERENCE.  */
+	    if (aggregate_value_p (resdecl, TREE_TYPE (thunk_fndecl))
+		&& (!is_gimple_reg_type (TREE_TYPE (resdecl))
+		    || DECL_BY_REFERENCE (resdecl)))
+	      gimple_call_set_return_slot_opt (call, true);
+
+	    if (restmp)
+	      {
+		gimple_call_set_lhs (call, restmp);
+		gcc_assert (useless_type_conversion_p (TREE_TYPE (restmp),
+						       TREE_TYPE (TREE_TYPE (alias))));
+	      }
+	    gsi_insert_after (&bsi, call, GSI_NEW_STMT);
+	    if (!alias_is_noreturn)
+	      {
+		/* Build return value.  */
+		if (!DECL_BY_REFERENCE (resdecl))
+		  ret = gimple_build_return (restmp);
+		else
+		  ret = gimple_build_return (resdecl);
+
+		gsi_insert_after (&bsi, ret, GSI_NEW_STMT);
+	      }
+	    else
+	      {
+		remove_edge (single_succ_edge (bb));
+	      }
+
+	    cfun->gimple_df->in_ssa_p = true;
+	    update_max_bb_count ();
+	    profile_status_for_fn (cfun)
+	      = cfg_count.initialized_p () && cfg_count.ipa_p ()
+	      ? PROFILE_READ : PROFILE_GUESSED;
+	    /* FIXME: C++ FE should stop setting TREE_ASM_WRITTEN on thunks.  */
+	    // TREE_ASM_WRITTEN (thunk_fndecl) = false;
+	    delete_unreachable_blocks ();
+	    update_ssa (TODO_update_ssa);
+	    checking_verify_flow_info ();
+	    free_dominance_info (CDI_DOMINATORS);
+
+	    /* Since we want to emit the thunk, we explicitly mark its name as
+	       referenced.  */
+	    onode->thunk = false;
+	    onode->lowered = true;
+	    bitmap_obstack_release (NULL);
+	  }
+	  current_function_decl = NULL;
+	  set_cfun (NULL);
+	}
+
+	thunk_info::remove (onode);
+
+	// some more of create_wrapper at the end of the next block.
+      }
+    }
+
+    {
+      tree aaval = NULL_TREE;
+      tree vaptr = NULL_TREE;
+      tree wmptr = NULL_TREE;
+      for (tree arg = DECL_ARGUMENTS (nnode->decl); arg; arg = DECL_CHAIN (arg))
+	{
+	  aaval = vaptr;
+	  vaptr = wmptr;
+	  wmptr = arg;
+	}
+
+      if (!apply_args)
+	aaval = NULL_TREE;
+      /* The trailing args are [apply_args], [va_list_ptr], and
+	 watermark.  If we don't have a va_list_ptr, the penultimate
+	 argument is apply_args.
+       */
+      else if (!is_stdarg)
+	aaval = vaptr;
+
+      if (!is_stdarg)
+	vaptr = NULL_TREE;
+
+      DECL_NAME (wmptr) = get_watermark_ptr ();
+      DECL_ARTIFICIAL (wmptr) = 1;
+      DECL_IGNORED_P (wmptr) = 1;
+      TREE_USED (wmptr) = 1;
+
+      if (is_stdarg)
+	{
+	  DECL_NAME (vaptr) = get_va_list_ptr ();
+	  DECL_ARTIFICIAL (vaptr) = 1;
+	  DECL_IGNORED_P (vaptr) = 1;
+	  TREE_USED (vaptr) = 1;
+	}
+
+      if (apply_args)
+	{
+	  DECL_NAME (aaval) = get_apply_args ();
+	  DECL_ARTIFICIAL (aaval) = 1;
+	  DECL_IGNORED_P (aaval) = 1;
+	  TREE_USED (aaval) = 1;
+	}
+
+      push_cfun (DECL_STRUCT_FUNCTION (nnode->decl));
+
+      {
+	edge e = single_succ_edge (ENTRY_BLOCK_PTR_FOR_FN (cfun));
+	gimple_seq seq = call_update_watermark (wmptr, nnode, e->src->count);
+	gsi_insert_seq_on_edge_immediate (e, seq);
+      }
+
+      bool any_indirect = !indirect_nparms.is_empty ();
+
+      if (any_indirect)
+	{
+	  basic_block bb;
+	  bool needs_commit = false;
+	  FOR_EACH_BB_FN (bb, cfun)
+	    {
+	      for (gphi_iterator gsi = gsi_start_nonvirtual_phis (bb);
+		   !gsi_end_p (gsi);
+		   gsi_next_nonvirtual_phi (&gsi))
+		{
+		  gphi *stmt = gsi.phi ();
+
+		  walk_stmt_info wi = {};
+		  wi.info = &indirect_nparms;
+		  walk_gimple_op (stmt, walk_make_indirect, &wi);
+		  if (wi.changed && !is_gimple_debug (gsi_stmt (gsi)))
+		    if (walk_regimplify_phi (stmt))
+		      needs_commit = true;
+		}
+
+	      for (gimple_stmt_iterator gsi = gsi_start_bb (bb);
+		   !gsi_end_p (gsi); gsi_next (&gsi))
+		{
+		  gimple *stmt = gsi_stmt (gsi);
+
+		  walk_stmt_info wi = {};
+		  wi.info = &indirect_nparms;
+		  walk_gimple_op (stmt, walk_make_indirect, &wi);
+		  if (wi.changed)
+		    {
+		      if (!is_gimple_debug (stmt))
+			{
+			  wi.info = &gsi;
+			  walk_gimple_op (stmt, walk_regimplify_addr_expr,
+					  &wi);
+			}
+		      update_stmt (stmt);
+		    }
+		}
+	    }
+	  if (needs_commit)
+	    gsi_commit_edge_inserts ();
+	}
+
+      if (DECL_STRUCT_FUNCTION (nnode->decl)->calls_alloca
+	  || is_stdarg || apply_args)
+	for (cgraph_edge *e = nnode->callees, *enext; e; e = enext)
+	  {
+	    if (!e->call_stmt)
+	      continue;
+
+	    gcall *call = e->call_stmt;
+	    gimple_stmt_iterator gsi = gsi_for_stmt (call);
+	    tree fndecl = e->callee->decl;
+
+	    enext = e->next_callee;
+
+	    if (gimple_alloca_call_p (call))
+	      {
+		gimple_seq seq = call_update_watermark (wmptr, NULL,
+							gsi_bb (gsi)->count);
+		gsi_insert_finally_seq_after_call (gsi, seq);
+	      }
+	    else if (fndecl && is_stdarg
+		     && fndecl_built_in_p (fndecl, BUILT_IN_VA_START))
+	      {
+		/* Using a non-default stdarg ABI makes the function ineligible
+		   for internal strub.  */
+		gcc_checking_assert (builtin_decl_explicit (BUILT_IN_VA_START)
+				     == fndecl);
+		tree bvacopy = builtin_decl_explicit (BUILT_IN_VA_COPY);
+		gimple_call_set_fndecl (call, bvacopy);
+		tree arg = vaptr;
+		/* The va_copy source must be dereferenced, unless it's an array
+		   type, that would have decayed to a pointer.  */
+		if (TREE_CODE (TREE_TYPE (TREE_TYPE (vaptr))) != ARRAY_TYPE)
+		  {
+		    arg = gimple_fold_indirect_ref (vaptr);
+		    if (!arg)
+		      arg = build2 (MEM_REF,
+				    TREE_TYPE (TREE_TYPE (vaptr)),
+				    vaptr,
+				    build_int_cst (TREE_TYPE (vaptr), 0));
+		    if (!is_gimple_val (arg))
+		      arg = force_gimple_operand_gsi (&gsi, arg, true,
+						      NULL_TREE, true, GSI_SAME_STMT);
+		  }
+		gimple_call_set_arg (call, 1, arg);
+		update_stmt (call);
+		e->redirect_callee (cgraph_node::get_create (bvacopy));
+	      }
+	    else if (fndecl && apply_args
+		     && fndecl_built_in_p (fndecl, BUILT_IN_APPLY_ARGS))
+	      {
+		tree lhs = gimple_call_lhs (call);
+		gimple *assign = (lhs
+				  ? gimple_build_assign (lhs, aaval)
+				  : gimple_build_nop ());
+		gsi_replace (&gsi, assign, true);
+		cgraph_edge::remove (e);
+	      }
+	  }
+
+      { // a little more copied from create_wrapper
+
+	/* Inline summary set-up.  */
+	nnode->analyze ();
+	// inline_analyze_function (nnode);
+      }
+
+      pop_cfun ();
+    }
+
+    {
+      push_cfun (DECL_STRUCT_FUNCTION (onode->decl));
+      gimple_stmt_iterator gsi
+	= gsi_after_labels (single_succ (ENTRY_BLOCK_PTR_FOR_FN (cfun)));
+
+      gcall *wrcall;
+      while (!(wrcall = dyn_cast <gcall *> (gsi_stmt (gsi))))
+	gsi_next (&gsi);
+
+      tree swm = create_tmp_var (get_wmt (), ".strub.watermark");
+      TREE_ADDRESSABLE (swm) = true;
+      tree swmp = build1 (ADDR_EXPR, get_pwmt (), swm);
+
+      tree enter = get_enter ();
+      gcall *stptr = gimple_build_call (enter, 1, unshare_expr (swmp));
+      gimple_set_location (stptr, gimple_location (wrcall));
+      gsi_insert_before (&gsi, stptr, GSI_SAME_STMT);
+      onode->create_edge (cgraph_node::get_create (enter),
+			  stptr, gsi_bb (gsi)->count, false);
+
+      int nargs = gimple_call_num_args (wrcall);
+
+      gimple_seq seq = NULL;
+
+      if (apply_args)
+	{
+	  tree aalst = create_tmp_var (ptr_type_node, ".strub.apply_args");
+	  tree bappargs = builtin_decl_explicit (BUILT_IN_APPLY_ARGS);
+	  gcall *appargs = gimple_build_call (bappargs, 0);
+	  gimple_call_set_lhs (appargs, aalst);
+	  gimple_set_location (appargs, gimple_location (wrcall));
+	  gsi_insert_before (&gsi, appargs, GSI_SAME_STMT);
+	  gimple_call_set_arg (wrcall, nargs - 2 - is_stdarg, aalst);
+	  onode->create_edge (cgraph_node::get_create (bappargs),
+			      appargs, gsi_bb (gsi)->count, false);
+	}
+
+      if (is_stdarg)
+	{
+	  tree valst = create_tmp_var (va_list_type_node, ".strub.va_list");
+	  TREE_ADDRESSABLE (valst) = true;
+	  tree vaptr = build1 (ADDR_EXPR,
+			       build_pointer_type (va_list_type_node),
+			       valst);
+	  gimple_call_set_arg (wrcall, nargs - 2, unshare_expr (vaptr));
+
+	  tree bvastart = builtin_decl_explicit (BUILT_IN_VA_START);
+	  gcall *vastart = gimple_build_call (bvastart, 2,
+					      unshare_expr (vaptr),
+					      integer_zero_node);
+	  gimple_set_location (vastart, gimple_location (wrcall));
+	  gsi_insert_before (&gsi, vastart, GSI_SAME_STMT);
+	  onode->create_edge (cgraph_node::get_create (bvastart),
+			      vastart, gsi_bb (gsi)->count, false);
+
+	  tree bvaend = builtin_decl_explicit (BUILT_IN_VA_END);
+	  gcall *vaend = gimple_build_call (bvaend, 1, unshare_expr (vaptr));
+	  gimple_set_location (vaend, gimple_location (wrcall));
+	  gimple_seq_add_stmt (&seq, vaend);
+	}
+
+      gimple_call_set_arg (wrcall, nargs - 1, unshare_expr (swmp));
+      // gimple_call_set_tail (wrcall, false);
+      update_stmt (wrcall);
+
+      {
+#if !ATTR_FNSPEC_DECONST_WATERMARK
+	/* If the call will be assumed to not modify or even read the
+	   watermark, make it read and modified ourselves.  */
+	if ((gimple_call_flags (wrcall)
+	     & (ECF_CONST | ECF_PURE | ECF_NOVOPS)))
+	  {
+	    vec<tree, va_gc> *inputs = NULL;
+	    vec<tree, va_gc> *outputs = NULL;
+	    vec_safe_push (outputs,
+			   build_tree_list
+			   (build_tree_list
+			    (NULL_TREE, build_string (2, "=m")),
+			    swm));
+	    vec_safe_push (inputs,
+			   build_tree_list
+			   (build_tree_list
+			    (NULL_TREE, build_string (1, "m")),
+			    swm));
+	    gasm *forcemod = gimple_build_asm_vec ("", inputs, outputs,
+						   NULL, NULL);
+	    gimple_seq_add_stmt (&seq, forcemod);
+
+	    /* If the call will be assumed to not even read the watermark,
+	       make sure it is already in memory before the call.  */
+	    if ((gimple_call_flags (wrcall) & ECF_CONST))
+	      {
+		vec<tree, va_gc> *inputs = NULL;
+		vec_safe_push (inputs,
+			       build_tree_list
+			       (build_tree_list
+				(NULL_TREE, build_string (1, "m")),
+				swm));
+		gasm *force_store = gimple_build_asm_vec ("", inputs, NULL,
+							  NULL, NULL);
+		gimple_set_location (force_store, gimple_location (wrcall));
+		gsi_insert_before (&gsi, force_store, GSI_SAME_STMT);
+	      }
+	  }
+#endif
+
+	gcall *sleave = gimple_build_call (get_leave (), 1,
+					   unshare_expr (swmp));
+	gimple_seq_add_stmt (&seq, sleave);
+
+	gassign *clobber = gimple_build_assign (swm,
+						build_clobber
+						(TREE_TYPE (swm)));
+	gimple_seq_add_stmt (&seq, clobber);
+      }
+
+      gsi_insert_finally_seq_after_call (gsi, seq);
+
+      /* For nnode, we don't rebuild edges because we wish to retain
+	 any redirections copied to it from earlier passes, so we add
+	 call graph edges explicitly there, but for onode, we create a
+	 fresh function, so we may as well just issue the calls and
+	 then rebuild all cgraph edges.  */
+      // cgraph_edge::rebuild_edges ();
+      onode->analyze ();
+      // inline_analyze_function (onode);
+
+      pop_cfun ();
+    }
+  }
+
+  return 0;
+}
+
+simple_ipa_opt_pass *
+make_pass_ipa_strub (gcc::context *ctxt)
+{
+  return new pass_ipa_strub (ctxt);
+}
+
+#include "gt-ipa-strub.h"
diff --git a/gcc/ipa-strub.h b/gcc/ipa-strub.h
new file mode 100644
index 0000000000000..29869fadfa6c9
--- /dev/null
+++ b/gcc/ipa-strub.h
@@ -0,0 +1,45 @@ 
+/* strub (stack scrubbing) infrastructure.
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Alexandre Oliva <oliva@adacore.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+/* Return TRUE if CALLEE can be inlined into CALLER, as far as stack scrubbing
+   constraints are concerned.  CALLEE doesn't have to be called directly by
+   CALLER, but the returned value says nothing about intervening functions.  */
+extern bool strub_inlinable_to_p (cgraph_node *callee, cgraph_node *caller);
+
+/* Return FALSE if NODE is a strub context, and TRUE otherwise.  */
+extern bool strub_splittable_p (cgraph_node *node);
+
+/* Locate and return the watermark_ptr parameter for FNDECL.  If FNDECL is not a
+   strub context, return NULL.  */
+extern tree strub_watermark_parm (tree fndecl);
+
+/* Make a function type or declaration callable.  */
+extern void strub_make_callable (tree fndecl);
+
+/* Return zero iff ID is NOT an acceptable parameter for a user-supplied strub
+   attribute for a function.  Otherwise, return >0 if it enables strub, <0 if it
+   does not.  Return +/-1 if the attribute-modified type is compatible with the
+   type without the attribute, or +/-2 if it is not compatible.  */
+extern int strub_validate_fn_attr_parm (tree id);
+
+/* Like comptypes, return 0 if t1 and t2 are not compatible, 1 if they are
+   compatible, and 2 if they are nearly compatible.  Same strub mode is
+   compatible, interface-compatible strub modes are nearly compatible.  */
+extern int strub_comptypes (tree t1, tree t2);
diff --git a/gcc/passes.def b/gcc/passes.def
index 1e1950bdb39cb..d515e77be0399 100644
--- a/gcc/passes.def
+++ b/gcc/passes.def
@@ -52,6 +52,7 @@  along with GCC; see the file COPYING3.  If not see
   INSERT_PASSES_AFTER (all_small_ipa_passes)
   NEXT_PASS (pass_ipa_free_lang_data);
   NEXT_PASS (pass_ipa_function_and_variable_visibility);
+  NEXT_PASS (pass_ipa_strub_mode);
   NEXT_PASS (pass_build_ssa_passes);
   PUSH_INSERT_PASSES_WITHIN (pass_build_ssa_passes)
       NEXT_PASS (pass_fixup_cfg);
@@ -115,6 +116,7 @@  along with GCC; see the file COPYING3.  If not see
   POP_INSERT_PASSES ()
 
   NEXT_PASS (pass_ipa_remove_symbols);
+  NEXT_PASS (pass_ipa_strub);
   NEXT_PASS (pass_ipa_oacc);
   PUSH_INSERT_PASSES_WITHIN (pass_ipa_oacc)
       NEXT_PASS (pass_ipa_pta);
diff --git a/gcc/testsuite/c-c++-common/strub-O0.c b/gcc/testsuite/c-c++-common/strub-O0.c
new file mode 100644
index 0000000000000..c7a79a6ea0d8a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-O0.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O0 -fstrub=strict -fdump-rtl-expand" } */
+
+/* At -O0, none of the strub builtins are expanded inline.  */
+
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-rtl-dump "strub_enter" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_update" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_leave" "expand" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-O1.c b/gcc/testsuite/c-c++-common/strub-O1.c
new file mode 100644
index 0000000000000..96285c975d98e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-O1.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O1 -fstrub=strict -fdump-rtl-expand" } */
+
+/* At -O1, without -fno-inline, we fully expand enter, but neither update nor
+   leave.  */
+
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-rtl-dump-not "strub_enter" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_update" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_leave" "expand" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-O2.c b/gcc/testsuite/c-c++-common/strub-O2.c
new file mode 100644
index 0000000000000..8edc0d8aa1321
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-O2.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstrub=strict -fdump-rtl-expand" } */
+
+/* At -O2, without -fno-inline, we fully expand enter and update, and add a test
+   around the leave call.  */
+
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-rtl-dump-not "strub_enter" "expand" } } */
+/* { dg-final { scan-rtl-dump-not "strub_update" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_leave" "expand" } } */
+/* { dg-final { scan-rtl-dump "\[(\]call\[^\n\]*strub_leave.*\n\[(\]code_label" "expand" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-O2fni.c b/gcc/testsuite/c-c++-common/strub-O2fni.c
new file mode 100644
index 0000000000000..c6d900cf3c45b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-O2fni.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstrub=strict -fdump-rtl-expand -fno-inline" } */
+
+/* With -fno-inline, none of the strub builtins are inlined.  */
+
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-rtl-dump "strub_enter" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_update" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_leave" "expand" } } */
+/* { dg-final { scan-rtl-dump-not "\[(\]call\[^\n\]*strub_leave.*\n\[(\]code_label" "expand" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-O3.c b/gcc/testsuite/c-c++-common/strub-O3.c
new file mode 100644
index 0000000000000..33ee465e51cb6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-O3.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fstrub=strict -fdump-rtl-expand" } */
+
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-rtl-dump-not "strub_enter" "expand" } } */
+/* { dg-final { scan-rtl-dump-not "strub_update" "expand" } } */
+/* { dg-final { scan-rtl-dump-not "strub_leave" "expand" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-O3fni.c b/gcc/testsuite/c-c++-common/strub-O3fni.c
new file mode 100644
index 0000000000000..2936f82079e18
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-O3fni.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fstrub=strict -fdump-rtl-expand -fno-inline" } */
+
+/* With -fno-inline, none of the strub builtins are inlined.  */
+
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-rtl-dump "strub_enter" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_update" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_leave" "expand" } } */
+/* { dg-final { scan-rtl-dump-not "\[(\]call\[^\n\]*strub_leave.*\n\[(\]code_label" "expand" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-Og.c b/gcc/testsuite/c-c++-common/strub-Og.c
new file mode 100644
index 0000000000000..479746e57d87e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-Og.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Og -fstrub=strict -fdump-rtl-expand" } */
+
+/* At -Og, without -fno-inline, we fully expand enter, but neither update nor
+   leave.  */
+
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-rtl-dump-not "strub_enter" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_update" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_leave" "expand" } } */
+/* { dg-final { scan-rtl-dump-not "\[(\]call\[^\n\]*strub_leave.*\n\[(\]code_label" "expand" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-Os.c b/gcc/testsuite/c-c++-common/strub-Os.c
new file mode 100644
index 0000000000000..2241d4ea07f27
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-Os.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Os -fstrub=strict -fdump-rtl-expand" } */
+
+/* At -Os, without -fno-inline, we fully expand enter, and also update.  The
+   expanded update might be larger than a call proper, but argument saving and
+   restoring required by the call will most often make it larger.  The leave
+   call is left untouched.  */
+
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-rtl-dump-not "strub_enter" "expand" } } */
+/* { dg-final { scan-rtl-dump-not "strub_update" "expand" } } */
+/* { dg-final { scan-rtl-dump "strub_leave" "expand" } } */
+/* { dg-final { scan-rtl-dump-not "\[(\]call\[^\n\]*strub_leave.*\n\[(\]code_label" "expand" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-all1.c b/gcc/testsuite/c-c++-common/strub-all1.c
new file mode 100644
index 0000000000000..a322bcc5da606
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-all1.c
@@ -0,0 +1,32 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=all -fdump-ipa-strubm -fdump-ipa-strub" } */
+
+/* h becomes STRUB_CALLABLE, rather than STRUB_INLINABLE, because of the
+   strub-enabling -fstrub flag, and gets inlined before pass_ipa_strub.  */
+static inline void
+__attribute__ ((__always_inline__))
+h() {
+}
+
+/* g becomes STRUB_AT_CALLS, because of the flag.  */
+static inline void
+g() {
+  h();
+}
+
+/* f becomes STRUB_INTERNAL because of the flag, and gets split into
+   STRUB_WRAPPER and STRUB_WRAPPED.  */
+void
+f() {
+  g();
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 3 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]callable\[)\]" 1 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]at-calls\[)\]" 1 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 1 "strubm" } } */
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 3 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]at-calls\[)\]" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapped\[)\]" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapper\[)\]" 1 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-all2.c b/gcc/testsuite/c-c++-common/strub-all2.c
new file mode 100644
index 0000000000000..db60026d0e080
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-all2.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=all -fdump-ipa-strubm -fdump-ipa-strub" } */
+
+/* g becomes STRUB_INTERNAL, because of the flag.  Without inline, force_output
+   is set for static non-inline functions when not optimizing, and that keeps
+   only_called_directly_p from returning true, which makes STRUB_AT_CALLS
+   non-viable.  */
+static void
+g() {
+}
+
+/* f becomes STRUB_INTERNAL because of the flag, and gets split into
+   STRUB_WRAPPER and STRUB_WRAPPED.  */
+void
+f() {
+  g();
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 2 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 2 "strubm" } } */
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapped\[)\]" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapper\[)\]" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-apply1.c b/gcc/testsuite/c-c++-common/strub-apply1.c
new file mode 100644
index 0000000000000..2f462adc1efe0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-apply1.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict" } */
+
+void __attribute__ ((__strub__ ("callable")))
+apply_function (void *args)
+{
+  __builtin_apply (0, args, 0);
+}
+
+void __attribute__ ((__strub__ ("internal")))
+apply_args (int i, int j, double d)
+{
+  void *args = __builtin_apply_args ();
+  apply_function (args);
+}
diff --git a/gcc/testsuite/c-c++-common/strub-apply2.c b/gcc/testsuite/c-c++-common/strub-apply2.c
new file mode 100644
index 0000000000000..a5d7551f5da5c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-apply2.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict" } */
+
+extern void __attribute__ ((__strub__))
+apply_function (void *args);
+
+void __attribute__ ((__strub__))
+apply_args (int i, int j, double d) /* { dg-error "selected" } */
+{
+  void *args = __builtin_apply_args (); /* { dg-message "does not support" } */
+  apply_function (args);
+}
diff --git a/gcc/testsuite/c-c++-common/strub-apply3.c b/gcc/testsuite/c-c++-common/strub-apply3.c
new file mode 100644
index 0000000000000..64422a0d1e880
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-apply3.c
@@ -0,0 +1,8 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict" } */
+
+void __attribute__ ((__strub__))
+apply_function (void *args)
+{
+  __builtin_apply (0, args, 0); /* { dg-error "in .strub. context" } */
+}
diff --git a/gcc/testsuite/c-c++-common/strub-apply4.c b/gcc/testsuite/c-c++-common/strub-apply4.c
new file mode 100644
index 0000000000000..15ffaa031b899
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-apply4.c
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstrub=strict -fdump-ipa-strubm" } */
+
+/* Check that implicit enabling of strub mode selects internal strub when the
+   function uses __builtin_apply_args, that prevents the optimization to
+   at-calls mode.  */
+
+int __attribute__ ((__strub__)) var;
+
+static inline void
+apply_args (int i, int j, double d)
+{
+  var++;
+  __builtin_apply_args ();
+}
+
+void f() {
+  apply_args (1, 2, 3);
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 1 "strubm" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-at-calls1.c b/gcc/testsuite/c-c++-common/strub-at-calls1.c
new file mode 100644
index 0000000000000..b70843b4215a4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-at-calls1.c
@@ -0,0 +1,30 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=at-calls -fdump-ipa-strubm -fdump-ipa-strub" } */
+
+/* h becomes STRUB_CALLABLE, rather than STRUB_INLINABLE, because of the
+   strub-enabling -fstrub flag, and gets inlined before pass_ipa_strub.  */
+static inline void
+__attribute__ ((__always_inline__))
+h() {
+}
+
+/* g becomes STRUB_AT_CALLS, because of the flag.  */
+static inline void
+g() {
+  h();
+}
+
+/* f does NOT become STRUB_AT_CALLS because it is visible; it becomes
+   STRUB_CALLABLE.  */
+void
+f() {
+  g();
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 3 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]at-calls\[)\]" 1 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]callable\[)\]" 2 "strubm" } } */
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]at-calls\[)\]" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]callable\[)\]" 1 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-at-calls2.c b/gcc/testsuite/c-c++-common/strub-at-calls2.c
new file mode 100644
index 0000000000000..97a3988a6b922
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-at-calls2.c
@@ -0,0 +1,23 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=at-calls -fdump-ipa-strubm -fdump-ipa-strub" } */
+
+/* g does NOT become STRUB_AT_CALLS because it's not viable.  Without inline,
+   force_output is set for static non-inline functions when not optimizing, and
+   that keeps only_called_directly_p from returning true, which makes
+   STRUB_AT_CALLS non-viable.  It becomes STRUB_CALLABLE instead.  */
+static void
+g() {
+}
+
+/* f does NOT become STRUB_AT_CALLS because it is visible; it becomes
+   STRUB_CALLABLE.  */
+void
+f() {
+  g();
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 2 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]callable\[)\]" 2 "strubm" } } */
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]callable\[)\]" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-defer-O1.c b/gcc/testsuite/c-c++-common/strub-defer-O1.c
new file mode 100644
index 0000000000000..3d73431b3dcd3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-defer-O1.c
@@ -0,0 +1,7 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=strict -O1" } */
+
+/* Check that a strub function called by another strub function does NOT defer
+   the strubbing to its caller at -O1.  */
+
+#include "strub-defer-O2.c"
diff --git a/gcc/testsuite/c-c++-common/strub-defer-O2.c b/gcc/testsuite/c-c++-common/strub-defer-O2.c
new file mode 100644
index 0000000000000..fddf3c745e7e6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-defer-O2.c
@@ -0,0 +1,8 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=strict -O2" } */
+
+/* Check that a strub function called by another strub function does NOT defer
+   the strubbing to its caller at -O2.  */
+
+#define EXPECT_DEFERRAL !
+#include "strub-defer-O3.c"
diff --git a/gcc/testsuite/c-c++-common/strub-defer-O3.c b/gcc/testsuite/c-c++-common/strub-defer-O3.c
new file mode 100644
index 0000000000000..7ebc65b58dd72
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-defer-O3.c
@@ -0,0 +1,110 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=strict -O3" } */
+
+/* Check that a strub function called by another strub function defers the
+   strubbing to its caller at -O3.  */
+
+#ifndef EXPECT_DEFERRAL
+/* Other strub-defer*.c tests override this macro.  */
+# define EXPECT_DEFERRAL
+#endif
+
+const char test_string[] = "\x55\xde\xad\xbe\xef\xc0\x1d\xca\xfe\x55\xaa";
+
+/* Pad before and after the string on the stack, so that it's not overwritten by
+   regular stack use.  */
+#define PAD 7
+
+static inline __attribute__ ((__always_inline__, __strub__ ("callable")))
+char *
+leak_string (void)
+{
+  /* We use this variable to avoid any stack red zone.  Stack scrubbing covers
+     it, but __builtin_stack_address, that we take as a reference, doesn't, so
+     if e.g. callable() were to store the string in the red zone, we wouldn't
+     find it because it would be outside the range we searched.  */
+  typedef void __attribute__ ((__strub__ ("callable"))) callable_t (char *);
+  callable_t *f = 0;
+
+  char s[2 * PAD + 1][sizeof (test_string)];
+  __builtin_strcpy (s[PAD], test_string);
+  asm ("" : "+m" (s), "+r" (f));
+
+  if (__builtin_expect (!f, 1))
+    return (char*)__builtin_stack_address ();
+
+  f (s[PAD]);
+  return 0;
+}
+
+static inline __attribute__ ((__always_inline__, __strub__ ("callable")))
+int
+look_for_string (char *e)
+{
+  char *p = (char*)__builtin_stack_address ();
+
+  if (p == e)
+    __builtin_abort ();
+
+  if (p > e)
+    {
+      char *q = p;
+      p = e;
+      e = q;
+    }
+
+  for (char *re = e - sizeof (test_string); p < re; p++)
+    for (int i = 0; p[i] == test_string[i]; i++)
+      if (i == sizeof (test_string) - 1)
+	return i;
+
+  return 0;
+}
+
+static __attribute__ ((__strub__ ("at-calls"), __noinline__, __noclone__))
+char *
+at_calls ()
+{
+  return leak_string ();
+}
+
+static __attribute__ ((__strub__ ("at-calls")))
+char *
+deferred_at_calls ()
+{
+  char *ret;
+  int i = 1;
+  /* Since these test check stack contents above the top of the stack, an
+     unexpected asynchronous signal or interrupt might overwrite the bits we
+     expect to find and cause spurious fails.  Tolerate one such overall
+     spurious fail by retrying.  */
+  while (EXPECT_DEFERRAL !look_for_string ((ret = at_calls ())))
+    if (!i--) __builtin_abort ();
+  return ret;
+}
+
+static __attribute__ ((__strub__ ("internal")))
+char *
+deferred_internal ()
+{
+  int i = 1;
+  char *ret;
+  while (EXPECT_DEFERRAL !look_for_string ((ret = at_calls ())))
+    if (!i--) __builtin_abort ();
+  return ret;
+}
+
+int main ()
+{
+  int i = 1;
+  /* These calls should not be subject to spurious fails: whether or not some
+     asynchronous event overwrites the scrubbed stack space, the string won't
+     remain there.  Unless the asynchronous event happens to write the string
+     where we look for it, but what are the odds?  Anyway, it doesn't hurt to
+     retry, even if just for symmetry.  */
+  while (look_for_string (deferred_at_calls ()))
+    if (!i--) __builtin_abort ();
+  while (look_for_string (deferred_internal ()))
+    if (!i--) __builtin_abort ();
+  __builtin_exit (0);
+}
diff --git a/gcc/testsuite/c-c++-common/strub-defer-Os.c b/gcc/testsuite/c-c++-common/strub-defer-Os.c
new file mode 100644
index 0000000000000..fbaf85fe0fafe
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-defer-Os.c
@@ -0,0 +1,7 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=strict -Os" } */
+
+/* Check that a strub function called by another strub function defers the
+   strubbing to its caller at -Os.  */
+
+#include "strub-defer-O3.c"
diff --git a/gcc/testsuite/c-c++-common/strub-internal1.c b/gcc/testsuite/c-c++-common/strub-internal1.c
new file mode 100644
index 0000000000000..e9d7b7b9ee0a8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-internal1.c
@@ -0,0 +1,31 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=internal -fdump-ipa-strubm -fdump-ipa-strub" } */
+
+/* h becomes STRUB_CALLABLE, rather than STRUB_INLINABLE, because of the
+   strub-enabling -fstrub flag, and gets inlined before pass_ipa_strub.  */
+static inline void
+__attribute__ ((__always_inline__))
+h() {
+}
+
+/* g becomes STRUB_INTERNAL because of the flag, and gets split into
+   STRUB_WRAPPER and STRUB_WRAPPED.  */
+static inline void
+g() {
+  h();
+}
+
+/* f becomes STRUB_INTERNAL because of the flag, and gets split into
+   STRUB_WRAPPER and STRUB_WRAPPED.  */
+void
+f() {
+  g();
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 3 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]callable\[)\]" 1 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 2 "strubm" } } */
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapped\[)\]" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapper\[)\]" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-internal2.c b/gcc/testsuite/c-c++-common/strub-internal2.c
new file mode 100644
index 0000000000000..8b8e15a51c71c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-internal2.c
@@ -0,0 +1,21 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=internal -fdump-ipa-strubm -fdump-ipa-strub" } */
+
+/* g becomes STRUB_INTERNAL, because of the flag.  */
+static void
+g() {
+}
+
+/* f becomes STRUB_INTERNAL because of the flag, and gets split into
+   STRUB_WRAPPER and STRUB_WRAPPED.  */
+void
+f() {
+  g();
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 2 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 2 "strubm" } } */
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapped\[)\]" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapper\[)\]" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-parms1.c b/gcc/testsuite/c-c++-common/strub-parms1.c
new file mode 100644
index 0000000000000..0a4a7539d3489
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-parms1.c
@@ -0,0 +1,48 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+#include <stdarg.h>
+
+void __attribute__ ((__strub__ ("internal")))
+small_args (int i, long long l, void *p, void **q, double d, char c)
+{
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*small_args\[^ \]*.strub.\[0-9\]* \[(\]int i, long long int l, void \\* p, void \\* \\* q, double d, char c, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump " \[^ \]*small_args\[^ \]*.strub.\[0-9\]* \[(\]\[^&\]*&.strub.watermark.\[0-9\]*\[)\]" "strub" } } */
+
+
+struct large_arg {
+  int x[128];
+};
+
+void __attribute__ ((__strub__ ("internal")))
+large_byref_arg (struct large_arg la)
+{
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*large_byref_arg\[^ \]*.strub.\[0-9\]* \[(\]struct large_arg & la, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump " \[^ \]*large_byref_arg\[^ \]*.strub.\[0-9\]* \[(\]&\[^&\]*&.strub.watermark.\[0-9\]*\[)\]" "strub" } } */
+
+void __attribute__ ((__strub__ ("internal")))
+std_arg (int i, ...)
+{
+  va_list vl;
+  va_start (vl, i);
+  va_end (vl);
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*std_arg\[^ \]*.strub.\[0-9\]* \[(\]int i, \[^&,\]* &\[^&,\]*.strub.va_list_ptr, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump " \[^ \]*std_arg\[^ \]*.strub.\[0-9\]* \[(\]\[^&\]*&.strub.va_list.\[0-9\]*, &.strub.watermark.\[0-9\]*\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump-times "va_start \\(" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "va_copy \\(" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "va_end \\(" 2 "strub" } } */
+
+void __attribute__ ((__strub__ ("internal")))
+apply_args (int i, int j, double d)
+{
+  __builtin_apply_args ();
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*apply_args\[^ \]*.strub.\[0-9\]* \[(\]int i, int j, double d, void \\*\[^&,\]*.strub.apply_args, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump " \[^ \]*apply_args\[^ \]*.strub.\[0-9\]* \[(\]\[^&\]*.strub.apply_args.\[0-9\]*_\[0-9\]*, &.strub.watermark.\[0-9\]*\[)\]" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-parms2.c b/gcc/testsuite/c-c++-common/strub-parms2.c
new file mode 100644
index 0000000000000..147171d96d5a1
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-parms2.c
@@ -0,0 +1,36 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+#include <stdarg.h>
+
+void __attribute__ ((__strub__ ("at-calls")))
+small_args (int i, long long l, void *p, void **q, double d, char c)
+{
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*small_args\[^ \]* \[(\]int i, long long int l, void \\* p, void \\* \\* q, double d, char c, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+
+
+struct large_arg {
+  int x[128];
+};
+
+void __attribute__ ((__strub__ ("at-calls")))
+large_byref_arg (struct large_arg la)
+{
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*large_byref_arg\[^ \]* \[(\]struct large_arg la, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+
+void __attribute__ ((__strub__ ("at-calls")))
+std_arg (int i, ...)
+{
+  va_list vl;
+  va_start (vl, i);
+  va_end (vl);
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*std_arg\[^ \]* \[(\]int i, void \\* &\[^&,\]*.strub.watermark_ptr\[, .]*\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump-times "va_start \\(" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-not "va_copy \\(" "strub" } } */
+/* { dg-final { scan-ipa-dump-times "va_end \\(" 1 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-parms3.c b/gcc/testsuite/c-c++-common/strub-parms3.c
new file mode 100644
index 0000000000000..4e92682895a43
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-parms3.c
@@ -0,0 +1,58 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that uses of a strub variable implicitly enables internal strub for
+   publicly-visible functions, and causes the same transformations to their
+   signatures as those in strub-parms1.c.  */
+
+#include <stdarg.h>
+
+int __attribute__ ((__strub__)) var;
+
+void
+small_args (int i, long long l, void *p, void **q, double d, char c)
+{
+  var++;
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*small_args\[^ \]*.strub.\[0-9\]* \[(\]int i, long long int l, void \\* p, void \\* \\* q, double d, char c, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump " \[^ \]*small_args\[^ \]*.strub.\[0-9\]* \[(\]\[^&\]*&.strub.watermark.\[0-9\]*\[)\]" "strub" } } */
+
+
+struct large_arg {
+  int x[128];
+};
+
+void
+large_byref_arg (struct large_arg la)
+{
+  var++;
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*large_byref_arg\[^ \]*.strub.\[0-9\]* \[(\]struct large_arg & la, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump " \[^ \]*large_byref_arg\[^ \]*.strub.\[0-9\]* \[(\]&\[^&\]*&.strub.watermark.\[0-9\]*\[)\]" "strub" } } */
+
+void
+std_arg (int i, ...)
+{
+  va_list vl;
+  va_start (vl, i);
+  var++;
+  va_end (vl);
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*std_arg\[^ \]*.strub.\[0-9\]* \[(\]int i, \[^&,\]* &\[^&,\]*.strub.va_list_ptr, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump " \[^ \]*std_arg\[^ \]*.strub.\[0-9\]* \[(\]\[^&\]*&.strub.va_list.\[0-9\]*, &.strub.watermark.\[0-9\]*\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump-times "va_start \\(" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "va_copy \\(" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "va_end \\(" 2 "strub" } } */
+
+void
+apply_args (int i, int j, double d)
+{
+  var++;
+  __builtin_apply_args ();
+}
+
+/* { dg-final { scan-ipa-dump "\n(void )?\[^ \]*apply_args\[^ \]*.strub.\[0-9\]* \[(\]int i, int j, double d, void \\*\[^&,\]*.strub.apply_args, void \\* &\[^&,\]*.strub.watermark_ptr\[)\]" "strub" } } */
+/* { dg-final { scan-ipa-dump " \[^ \]*apply_args\[^ \]*.strub.\[0-9\]* \[(\]\[^&\]*.strub.apply_args.\[0-9\]*_\[0-9\]*, &.strub.watermark.\[0-9\]*\[)\]" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-relaxed1.c b/gcc/testsuite/c-c++-common/strub-relaxed1.c
new file mode 100644
index 0000000000000..e2f9d8aebca58
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-relaxed1.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=relaxed -fdump-ipa-strubm -fdump-ipa-strub" } */
+
+/* The difference between relaxed and strict in this case is that we accept the
+   call from one internal-strub function to another.  Without the error,
+   inlining takes place.  */
+
+#include "strub-strict1.c"
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 3 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]inlinable\[)\]" 1 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]at-calls-opt\[)\]" 1 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 1 "strubm" } } */
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 3 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]at-calls-opt\[)\]" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapped\[)\]" 1 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapper\[)\]" 1 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-relaxed2.c b/gcc/testsuite/c-c++-common/strub-relaxed2.c
new file mode 100644
index 0000000000000..98474435d2e59
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-relaxed2.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=relaxed -fdump-ipa-strubm -fdump-ipa-strub" } */
+
+/* The difference between relaxed and strict in this case is that we accept the
+   call from one internal-strub function to another.  */
+
+#include "strub-strict2.c"
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 2 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 2 "strubm" } } */
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapped\[)\]" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]wrapper\[)\]" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-short-O0-exc.c b/gcc/testsuite/c-c++-common/strub-short-O0-exc.c
new file mode 100644
index 0000000000000..1de15342595e4
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-short-O0-exc.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O0 -fstrub=strict -fexceptions -fdump-ipa-strub" } */
+
+/* Check that the expected strub calls are issued.  */
+
+#include "torture/strub-callable1.c"
+
+/* { dg-final { scan-ipa-dump-times "strub_enter" 45 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_update" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_leave" 89 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-short-O0.c b/gcc/testsuite/c-c++-common/strub-short-O0.c
new file mode 100644
index 0000000000000..f9209c819004b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-short-O0.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O0 -fstrub=strict -fno-exceptions -fdump-ipa-strub" } */
+
+/* Check that the expected strub calls are issued.  */
+
+#include "torture/strub-callable1.c"
+
+/* { dg-final { scan-ipa-dump-times "strub_enter" 45 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_update" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_leave" 45 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-short-O1.c b/gcc/testsuite/c-c++-common/strub-short-O1.c
new file mode 100644
index 0000000000000..bed1dcfb54a45
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-short-O1.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O1 -fstrub=strict -fno-exceptions -fdump-ipa-strub" } */
+
+/* Check that the expected strub calls are issued.  */
+
+#include "torture/strub-callable1.c"
+
+/* { dg-final { scan-ipa-dump-times "strub_enter" 45 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_update" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_leave" 45 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-short-O2.c b/gcc/testsuite/c-c++-common/strub-short-O2.c
new file mode 100644
index 0000000000000..6bf0071f52b93
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-short-O2.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstrub=strict -fno-exceptions -fdump-ipa-strub" } */
+
+/* Check that the expected strub calls are issued.  */
+
+#include "torture/strub-callable1.c"
+
+/* { dg-final { scan-ipa-dump-times "strub_enter" 45 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_update" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_leave" 45 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-short-O3.c b/gcc/testsuite/c-c++-common/strub-short-O3.c
new file mode 100644
index 0000000000000..4732f515bf70c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-short-O3.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O3 -fstrub=strict -fno-exceptions -fdump-ipa-strub" } */
+
+/* Check that the expected strub calls are issued.  At -O3 and -Os, we omit
+   enter and leave calls within strub contexts, passing on the enclosing
+   watermark.  */
+
+#include "torture/strub-callable1.c"
+
+/* { dg-final { scan-ipa-dump-times "strub_enter" 15 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_update" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_leave" 15 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-short-Os.c b/gcc/testsuite/c-c++-common/strub-short-Os.c
new file mode 100644
index 0000000000000..8d6424c479a3a
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-short-Os.c
@@ -0,0 +1,12 @@ 
+/* { dg-do compile } */
+/* { dg-options "-Os -fstrub=strict -fno-exceptions -fdump-ipa-strub" } */
+
+/* Check that the expected strub calls are issued.  At -O3 and -Os, we omit
+   enter and leave calls within strub contexts, passing on the enclosing
+   watermark.  */
+
+#include "torture/strub-callable1.c"
+
+/* { dg-final { scan-ipa-dump-times "strub_enter" 15 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_update" 4 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_leave" 15 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-strict1.c b/gcc/testsuite/c-c++-common/strub-strict1.c
new file mode 100644
index 0000000000000..368522442066e
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-strict1.c
@@ -0,0 +1,36 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strubm" } */
+
+static int __attribute__ ((__strub__)) var;
+
+/* h becomes STRUB_INLINABLE, because of the use of the strub variable,
+   and the always_inline flag.  It would get inlined before pass_ipa_strub, if
+   it weren't for the error.  */
+static inline void
+__attribute__ ((__always_inline__))
+h() {
+  var++;
+}
+
+/* g becomes STRUB_AT_CALLS_OPT, because of the use of the strub variable, and
+   the viability of at-calls strubbing.  Though internally a strub context, its
+   interface is not strub-enabled, so it's not callable from within strub
+   contexts.  */
+static inline void
+g() {
+  var--;
+  h();
+}
+
+/* f becomes STRUB_INTERNAL because of the use of the strub variable, and gets
+   split into STRUB_WRAPPER and STRUB_WRAPPED.  */
+void
+f() {
+  var++;
+  g();  /* { dg-error "calling non-.strub." } */
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 3 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]inlinable\[)\]" 1 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]at-calls-opt\[)\]" 1 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 1 "strubm" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-strict2.c b/gcc/testsuite/c-c++-common/strub-strict2.c
new file mode 100644
index 0000000000000..b4f2888321821
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-strict2.c
@@ -0,0 +1,25 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strubm" } */
+
+static int __attribute__ ((__strub__)) var;
+
+/* g becomes STRUB_INTERNAL because of the use of the strub variable, and gets
+   split into STRUB_WRAPPER and STRUB_WRAPPED.  It's not STRUB_AT_CALLS_OPT
+   because force_output is set for static non-inline functions when not
+   optimizing, and that keeps only_called_directly_p from returning true, which
+   makes STRUB_AT_CALLS[_OPT] non-viable.  */
+static void
+g() {
+  var--;
+}
+
+/* f becomes STRUB_INTERNAL because of the use of the strub variable, and gets
+   split into STRUB_WRAPPER and STRUB_WRAPPED.  */
+void
+f() {
+  var++;
+  g();  /* { dg-error "calling non-.strub." } */
+}
+
+/* { dg-final { scan-ipa-dump-times "strub \[(\]" 2 "strubm" } } */
+/* { dg-final { scan-ipa-dump-times "strub \[(\]internal\[)\]" 2 "strubm" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-tail-O1.c b/gcc/testsuite/c-c++-common/strub-tail-O1.c
new file mode 100644
index 0000000000000..e48e0610e079b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-tail-O1.c
@@ -0,0 +1,8 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O1 -fstrub=strict -fno-exceptions -fdump-ipa-strub" } */
+
+#include "strub-tail-O2.c"
+
+/* { dg-final { scan-ipa-dump-times "strub_enter" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_update" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_leave" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-tail-O2.c b/gcc/testsuite/c-c++-common/strub-tail-O2.c
new file mode 100644
index 0000000000000..87cda7ab21b16
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-tail-O2.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-O2 -fstrub=strict -fno-exceptions -fdump-ipa-strub" } */
+
+/* Check that the expected strub calls are issued.
+   Tail calls are short-circuited at -O2+.  */
+
+int __attribute__ ((__strub__))
+g (int i, int j) {
+  return g (i, j);
+}
+
+/* { dg-final { scan-ipa-dump-times "strub_enter" 0 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_update" 2 "strub" } } */
+/* { dg-final { scan-ipa-dump-times "strub_leave" 0 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/strub-var1.c b/gcc/testsuite/c-c++-common/strub-var1.c
new file mode 100644
index 0000000000000..eb6250fd39c90
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/strub-var1.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+
+int __attribute__ ((strub)) x;
+float __attribute__ ((strub)) f;
+double __attribute__ ((strub)) d;
+
+/* The attribute applies to the type of the declaration, i.e., to the pointer
+   variable p, not to the pointed-to integer.  */
+int __attribute__ ((strub)) *
+p = &x; /* { dg-message "incompatible|invalid conversion" } */
+
+typedef int __attribute__ ((strub)) strub_int;
+strub_int *q = &x; /* Now this is compatible.  */
+
+int __attribute__ ((strub))
+a[2]; /* { dg-warning "does not apply to elements" } */
+
+int __attribute__ ((vector_size (4 * sizeof (int))))
+    __attribute__ ((strub))
+v; /* { dg-warning "does not apply to elements" } */
+
+struct s {
+  int i, j;
+} __attribute__ ((strub)) w; /* { dg-warning "does not apply to fields" } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-callable1.c b/gcc/testsuite/c-c++-common/torture/strub-callable1.c
new file mode 100644
index 0000000000000..b5e45ab0525ad
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-callable1.c
@@ -0,0 +1,9 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict" } */
+
+/* Check that strub and non-strub functions can be called from non-strub
+   contexts, and that strub and callable functions can be called from strub
+   contexts.  */
+
+#define OMIT_IMPERMISSIBLE_CALLS 1
+#include "strub-callable2.c"
diff --git a/gcc/testsuite/c-c++-common/torture/strub-callable2.c b/gcc/testsuite/c-c++-common/torture/strub-callable2.c
new file mode 100644
index 0000000000000..96aa7fe4b07f7
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-callable2.c
@@ -0,0 +1,264 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict" } */
+
+/* Check that impermissible (cross-strub-context) calls are reported.  */
+
+extern int __attribute__ ((__strub__ ("callable"))) xcallable (void);
+extern int __attribute__ ((__strub__ ("internal"))) xinternal (void);
+extern int __attribute__ ((__strub__ ("at-calls"))) xat_calls (void);
+extern int __attribute__ ((__strub__ ("disabled"))) xdisabled (void);
+
+int __attribute__ ((__strub__ ("callable"))) callable (void);
+int __attribute__ ((__strub__ ("internal"))) internal (void);
+int __attribute__ ((__strub__ ("at-calls"))) at_calls (void);
+int __attribute__ ((__strub__ ("disabled"))) disabled (void);
+
+int __attribute__ ((__strub__)) var;
+int var_user (void);
+
+static inline int __attribute__ ((__always_inline__, __strub__ ("callable")))
+icallable (void);
+static inline int __attribute__ ((__always_inline__, __strub__ ("internal")))
+iinternal (void);
+static inline int __attribute__ ((__always_inline__, __strub__ ("at-calls")))
+iat_calls (void);
+static inline int __attribute__ ((__always_inline__, __strub__ ("disabled")))
+idisabled (void);
+static inline int __attribute__ ((__always_inline__))
+ivar_user (void);
+
+static inline int __attribute__ ((__always_inline__, __strub__ ("callable")))
+i_callable (void) { return 0; }
+static inline int __attribute__ ((__always_inline__, __strub__ ("internal")))
+i_internal (void) { return var; }
+static inline int __attribute__ ((__always_inline__, __strub__ ("at-calls")))
+i_at_calls (void) { return var; }
+static inline int __attribute__ ((__always_inline__, __strub__ ("disabled")))
+i_disabled (void) { return 0; }
+static inline int __attribute__ ((__always_inline__))
+i_var_user (void) { return var; }
+
+#define CALLS_GOOD_FOR_STRUB_CONTEXT(ISEP)	\
+  do {						\
+    ret += i ## ISEP ## at_calls ();		\
+    ret += i ## ISEP ## internal ();		\
+    ret += i ## ISEP ## var_user ();		\
+  } while (0)
+
+#define CALLS_GOOD_FOR_NONSTRUB_CONTEXT(ISEP)	\
+  do {						\
+    ret += internal ();				\
+    ret += disabled ();				\
+    ret += var_user ();				\
+						\
+    ret += i ## ISEP ## disabled ();		\
+						\
+    ret += xinternal ();			\
+    ret += xdisabled ();			\
+  } while (0)
+
+#define CALLS_GOOD_FOR_EITHER_CONTEXT(ISEP)	\
+  do {						\
+    ret += i ## ISEP ## callable ();		\
+						\
+    ret += callable ();				\
+    ret += at_calls ();				\
+						\
+    ret += xat_calls ();			\
+    ret += xcallable ();			\
+  } while (0)
+
+/* Not a strub context, so it can call anything.
+   Explicitly declared as callable even from within strub contexts.  */
+int __attribute__ ((__strub__ ("callable")))
+callable (void) {
+  int ret = 0;
+
+  /* CALLS_GOOD_FOR_STRUB_CONTEXT(); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += iat_calls (); /* { dg-error "in non-.strub. context" } */
+    ret += iinternal (); /* { dg-error "in non-.strub. context" } */
+    ret += ivar_user (); /* { dg-error "in non-.strub. context" } */
+#endif
+  CALLS_GOOD_FOR_EITHER_CONTEXT();
+  CALLS_GOOD_FOR_NONSTRUB_CONTEXT();
+
+  return ret;
+}
+
+/* Internal strubbing means the body is a strub context, so it can only call
+   strub functions, and it's not itself callable from strub functions.  */
+int __attribute__ ((__strub__ ("internal")))
+internal (void) {
+  int ret = var;
+
+  CALLS_GOOD_FOR_STRUB_CONTEXT();
+  CALLS_GOOD_FOR_EITHER_CONTEXT();
+  /* CALLS_GOOD_FOR_NONSTRUB_CONTEXT(); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += internal (); /* { dg-error "in .strub. context" } */
+    ret += disabled (); /* { dg-error "in .strub. context" } */
+    ret += var_user (); /* { dg-error "in .strub. context" } */
+
+    ret += idisabled (); /* { dg-error "in .strub. context" } */
+
+    ret += xinternal (); /* { dg-error "in .strub. context" } */
+    ret += xdisabled (); /* { dg-error "in .strub. context" } */
+#endif
+
+  return ret;
+}
+
+int __attribute__ ((__strub__ ("at-calls")))
+at_calls (void) {
+  int ret = var;
+
+  CALLS_GOOD_FOR_STRUB_CONTEXT();
+  CALLS_GOOD_FOR_EITHER_CONTEXT();
+  /* CALLS_GOOD_FOR_NONSTRUB_CONTEXT(); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += internal (); /* { dg-error "in .strub. context" } */
+    ret += disabled (); /* { dg-error "in .strub. context" } */
+    ret += var_user (); /* { dg-error "in .strub. context" } */
+
+    ret += idisabled (); /* { dg-error "in .strub. context" } */
+
+    ret += xinternal (); /* { dg-error "in .strub. context" } */
+    ret += xdisabled (); /* { dg-error "in .strub. context" } */
+#endif
+
+  return ret;
+}
+
+int __attribute__ ((__strub__ ("disabled")))
+disabled () {
+  int ret = 0;
+
+  /* CALLS_GOOD_FOR_STRUB_CONTEXT(); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += iat_calls (); /* { dg-error "in non-.strub. context" } */
+    ret += iinternal (); /* { dg-error "in non-.strub. context" } */
+    ret += ivar_user (); /* { dg-error "in non-.strub. context" } */
+#endif
+  CALLS_GOOD_FOR_EITHER_CONTEXT();
+  CALLS_GOOD_FOR_NONSTRUB_CONTEXT();
+
+  return ret;
+}  
+
+int
+var_user (void) {
+  int ret = var;
+
+  CALLS_GOOD_FOR_STRUB_CONTEXT();
+  CALLS_GOOD_FOR_EITHER_CONTEXT();
+  /* CALLS_GOOD_FOR_NONSTRUB_CONTEXT(); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += internal (); /* { dg-error "in .strub. context" } */
+    ret += disabled (); /* { dg-error "in .strub. context" } */
+    ret += var_user (); /* { dg-error "in .strub. context" } */
+
+    ret += idisabled (); /* { dg-error "in .strub. context" } */
+
+    ret += xinternal (); /* { dg-error "in .strub. context" } */
+    ret += xdisabled (); /* { dg-error "in .strub. context" } */
+#endif
+
+  return ret;
+}
+
+int
+icallable (void)
+{
+  int ret = 0;
+
+  /* CALLS_GOOD_FOR_STRUB_CONTEXT(_); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += i_at_calls (); /* { dg-error "in non-.strub. context" } */
+    ret += i_internal (); /* { dg-error "in non-.strub. context" } */
+    ret += i_var_user (); /* { dg-error "in non-.strub. context" } */
+#endif
+  CALLS_GOOD_FOR_EITHER_CONTEXT(_);
+  CALLS_GOOD_FOR_NONSTRUB_CONTEXT(_);
+
+  return ret;
+}
+
+int
+iinternal (void) {
+  int ret = var;
+
+  CALLS_GOOD_FOR_STRUB_CONTEXT(_);
+  CALLS_GOOD_FOR_EITHER_CONTEXT(_);
+  /* CALLS_GOOD_FOR_NONSTRUB_CONTEXT(_); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += internal (); /* { dg-error "in .strub. context" } */
+    ret += disabled (); /* { dg-error "in .strub. context" } */
+    ret += var_user (); /* { dg-error "in .strub. context" } */
+
+    ret += i_disabled (); /* { dg-error "in .strub. context" } */
+
+    ret += xinternal (); /* { dg-error "in .strub. context" } */
+    ret += xdisabled (); /* { dg-error "in .strub. context" } */
+#endif
+
+  return ret;
+}
+
+int __attribute__ ((__always_inline__, __strub__ ("at-calls")))
+iat_calls (void) {
+  int ret = var;
+
+  CALLS_GOOD_FOR_STRUB_CONTEXT(_);
+  CALLS_GOOD_FOR_EITHER_CONTEXT(_);
+  /* CALLS_GOOD_FOR_NONSTRUB_CONTEXT(_); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += internal (); /* { dg-error "in .strub. context" } */
+    ret += disabled (); /* { dg-error "in .strub. context" } */
+    ret += var_user (); /* { dg-error "in .strub. context" } */
+
+    ret += i_disabled (); /* { dg-error "in .strub. context" } */
+
+    ret += xinternal (); /* { dg-error "in .strub. context" } */
+    ret += xdisabled (); /* { dg-error "in .strub. context" } */
+#endif
+
+  return ret;
+}
+
+int
+idisabled () {
+  int ret = 0;
+
+  /* CALLS_GOOD_FOR_STRUB_CONTEXT(_); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += i_at_calls (); /* { dg-error "in non-.strub. context" } */
+    ret += i_internal (); /* { dg-error "in non-.strub. context" } */
+    ret += i_var_user (); /* { dg-error "in non-.strub. context" } */
+#endif
+  CALLS_GOOD_FOR_EITHER_CONTEXT(_);
+  CALLS_GOOD_FOR_NONSTRUB_CONTEXT(_);
+
+  return ret;
+}  
+
+int
+ivar_user (void) {
+  int ret = var;
+
+  CALLS_GOOD_FOR_STRUB_CONTEXT(_);
+  CALLS_GOOD_FOR_EITHER_CONTEXT(_);
+  /* CALLS_GOOD_FOR_NONSTRUB_CONTEXT(_); */
+#if !OMIT_IMPERMISSIBLE_CALLS
+    ret += internal (); /* { dg-error "in .strub. context" } */
+    ret += disabled (); /* { dg-error "in .strub. context" } */
+    ret += var_user (); /* { dg-error "in .strub. context" } */
+
+    ret += i_disabled (); /* { dg-error "in .strub. context" } */
+
+    ret += xinternal (); /* { dg-error "in .strub. context" } */
+    ret += xdisabled (); /* { dg-error "in .strub. context" } */
+#endif
+
+  return ret;
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-const1.c b/gcc/testsuite/c-c++-common/torture/strub-const1.c
new file mode 100644
index 0000000000000..2857195706ed6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-const1.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that, along with a strub const function call, we issue an asm statement
+   to make sure the watermark passed to it is held in memory before the call,
+   and another to make sure it is not assumed to be unchanged.  */
+
+int __attribute__ ((__strub__, __const__))
+f() {
+  return 0;
+}
+
+int
+g() {
+  return f();
+}
+
+/* { dg-final { scan-ipa-dump-times "__asm__" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-const2.c b/gcc/testsuite/c-c++-common/torture/strub-const2.c
new file mode 100644
index 0000000000000..98a92bc9eac2b
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-const2.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that, along with a strub implicitly-const function call, we issue an
+   asm statement to make sure the watermark passed to it is held in memory
+   before the call, and another to make sure it is not assumed to be
+   unchanged.  */
+
+int __attribute__ ((__strub__))
+#if ! __OPTIMIZE__
+__attribute__ ((__const__))
+#endif
+f() {
+  return 0;
+}
+
+int
+g() {
+  return f();
+}
+
+/* { dg-final { scan-ipa-dump-times "__asm__" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-const3.c b/gcc/testsuite/c-c++-common/torture/strub-const3.c
new file mode 100644
index 0000000000000..5511a6e1e71d3
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-const3.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that, along with a strub const wrapping call, we issue an asm statement
+   to make sure the watermark passed to it is held in memory before the call,
+   and another to make sure it is not assumed to be unchanged.  */
+
+int __attribute__ ((__strub__ ("internal"), __const__))
+f() {
+  return 0;
+}
+
+/* { dg-final { scan-ipa-dump-times "__asm__" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-const4.c b/gcc/testsuite/c-c++-common/torture/strub-const4.c
new file mode 100644
index 0000000000000..47ee927964dff
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-const4.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that, along with a strub implicitly-const wrapping call, we issue an
+   asm statement to make sure the watermark passed to it is held in memory
+   before the call, and another to make sure it is not assumed to be
+   unchanged.  */
+
+int __attribute__ ((__strub__ ("internal")))
+#if ! __OPTIMIZE__
+__attribute__ ((__const__))
+#endif
+f() {
+  return 0;
+}
+
+/* { dg-final { scan-ipa-dump-times "__asm__" 2 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-data1.c b/gcc/testsuite/c-c++-common/torture/strub-data1.c
new file mode 100644
index 0000000000000..7c27a2a1a6dca
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-data1.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* The pointed-to data enables strubbing if accessed.  */
+int __attribute__ ((__strub__)) var;
+
+int f() {
+  return var;
+}
+
+/* { dg-final { scan-ipa-dump "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_update" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-data2.c b/gcc/testsuite/c-c++-common/torture/strub-data2.c
new file mode 100644
index 0000000000000..e66d903780afd
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-data2.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* The pointer itself is a strub variable, enabling internal strubbing when
+   its value is used.  */
+int __attribute__ ((__strub__)) *ptr;
+
+int *f() {
+  return ptr;
+}
+
+/* { dg-final { scan-ipa-dump "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_update" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-data3.c b/gcc/testsuite/c-c++-common/torture/strub-data3.c
new file mode 100644
index 0000000000000..5e08e0e58c658
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-data3.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* The pointer itself is a strub variable, that would enable internal strubbing
+   if its value was used.  Here, it's only overwritten, so no strub.  */
+int __attribute__ ((__strub__)) var;
+
+void f() {
+  var = 0;
+}
+
+/* { dg-final { scan-ipa-dump-not "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_update" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-data4.c b/gcc/testsuite/c-c++-common/torture/strub-data4.c
new file mode 100644
index 0000000000000..a818e7a38bb5f
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-data4.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* The pointer itself is a strub variable, that would enable internal strubbing
+   if its value was used.  Here, it's only overwritten, so no strub.  */
+int __attribute__ ((__strub__)) *ptr;
+
+void f() {
+  ptr = 0;
+}
+
+/* { dg-final { scan-ipa-dump-not "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_update" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-data5.c b/gcc/testsuite/c-c++-common/torture/strub-data5.c
new file mode 100644
index 0000000000000..ddb0b5c0543b0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-data5.c
@@ -0,0 +1,15 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict" } */
+
+/* It would be desirable to issue at least warnings for these.  */
+
+typedef int __attribute__ ((__strub__)) strub_int;
+strub_int *ptr;
+
+int *f () {
+  return ptr; /* { dg-message "incompatible|invalid conversion" } */
+}
+
+strub_int *g () {
+  return f (); /* { dg-message "incompatible|invalid conversion" } */
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-indcall1.c b/gcc/testsuite/c-c++-common/torture/strub-indcall1.c
new file mode 100644
index 0000000000000..c165f312f16de
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-indcall1.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+typedef void __attribute__ ((__strub__)) fntype ();
+fntype (*ptr);
+
+void f() {
+  ptr ();
+}
+
+/* { dg-final { scan-ipa-dump "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump "(&\.strub\.watermark\.\[0-9\]\+)" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_update" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-indcall2.c b/gcc/testsuite/c-c++-common/torture/strub-indcall2.c
new file mode 100644
index 0000000000000..69fcff8d3763d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-indcall2.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+typedef void __attribute__ ((__strub__)) fntype (int, int);
+fntype (*ptr);
+
+void f() {
+  ptr (0, 0);
+}
+
+/* { dg-final { scan-ipa-dump "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump "(0, 0, &\.strub\.watermark\.\[0-9\]\+)" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_update" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-indcall3.c b/gcc/testsuite/c-c++-common/torture/strub-indcall3.c
new file mode 100644
index 0000000000000..ff006224909bd
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-indcall3.c
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+typedef void __attribute__ ((__strub__)) fntype (int, int, ...);
+fntype (*ptr);
+
+void f() {
+  ptr (0, 0, 1, 1);
+}
+
+/* { dg-final { scan-ipa-dump "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump "(0, 0, &\.strub\.watermark\.\[0-9\]\+, 1, 1)" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_update" "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-inlinable1.c b/gcc/testsuite/c-c++-common/torture/strub-inlinable1.c
new file mode 100644
index 0000000000000..614b02228ba29
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-inlinable1.c
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=relaxed" } */
+
+inline void __attribute__ ((strub ("internal"), always_inline))
+inl_int_ali (void)
+{
+  /* No internal wrapper, so this body ALWAYS gets inlined,
+     but it cannot be called from non-strub contexts.  */
+}
+
+void
+bat (void)
+{
+  /* Not allowed, not a strub context.  */
+  inl_int_ali (); /* { dg-error "context" } */
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-inlinable2.c b/gcc/testsuite/c-c++-common/torture/strub-inlinable2.c
new file mode 100644
index 0000000000000..f9a6b4a16faf8
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-inlinable2.c
@@ -0,0 +1,7 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=all" } */
+
+#include "strub-inlinable1.c"
+
+/* With -fstrub=all, the caller becomes a strub context, so the strub-inlinable
+   callee is not rejected.  */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-ptrfn1.c b/gcc/testsuite/c-c++-common/torture/strub-ptrfn1.c
new file mode 100644
index 0000000000000..b4a7f3992bbaa
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-ptrfn1.c
@@ -0,0 +1,10 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict" } */
+
+typedef void ft (void);
+typedef void ft2 (int, int);
+extern ft __attribute__ ((__strub__)) fnac;
+
+ft * f (void) {
+  return fnac; /* { dg-message "incompatible|invalid conversion" } */
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-ptrfn2.c b/gcc/testsuite/c-c++-common/torture/strub-ptrfn2.c
new file mode 100644
index 0000000000000..d9d2c0caec42d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-ptrfn2.c
@@ -0,0 +1,55 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=relaxed -Wpedantic" } */
+
+/* C++ does not warn about the partial incompatibilities.
+
+   The d_p () calls are actually rejected, even in C++, but they are XFAILed
+   here because we don't get far enough in the compilation as to observe them,
+   because the incompatibilities are errors without -fpermissive.
+   strub-ptrfn3.c uses -fpermissive to check those.
+ */
+
+extern int __attribute__ ((strub ("callable"))) bac (void);
+extern int __attribute__ ((strub ("disabled"))) bad (void);
+extern int __attribute__ ((strub ("internal"))) bar (void);
+extern int __attribute__ ((strub ("at-calls"))) bal (void);
+
+void __attribute__ ((strub))
+bap (void)
+{
+  int __attribute__ ((strub ("disabled"))) (*d_p) (void) = bad;
+  int __attribute__ ((strub ("callable"))) (*c_p) (void) = bac;
+  int __attribute__ ((strub ("at-calls"))) (*a_p) (void) = bal;
+
+  d_p = bac; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bad; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bar; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bal; /* { dg-message "incompatible|invalid conversion" } */
+  a_p = bac; /* { dg-message "incompatible|invalid conversion" } */
+
+  d_p (); /* { dg-error "indirect non-.strub. call in .strub. context" "" { xfail c++ } } */
+  c_p ();
+  a_p ();
+}
+
+void __attribute__ ((strub))
+baP (void)
+{
+  typedef int __attribute__ ((strub ("disabled"))) d_fn_t (void);
+  typedef int __attribute__ ((strub ("callable"))) c_fn_t (void);
+  typedef int __attribute__ ((strub ("at-calls"))) a_fn_t (void);
+
+  d_fn_t *d_p = bad;
+  c_fn_t *c_p = bac;
+  a_fn_t *a_p = bal;
+
+  d_p = bac; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bad; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bar; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bal; /* { dg-message "incompatible|invalid conversion" } */
+  a_p = bac; /* { dg-message "incompatible|invalid conversion" } */
+
+  d_p (); /* { dg-error "indirect non-.strub. call in .strub. context" "" { xfail c++ } } */
+  c_p ();
+  a_p ();
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-ptrfn3.c b/gcc/testsuite/c-c++-common/torture/strub-ptrfn3.c
new file mode 100644
index 0000000000000..e1f179e160e5c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-ptrfn3.c
@@ -0,0 +1,50 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=relaxed -Wpedantic -fpermissive" } */
+/* { dg-prune-output "command-line option .-fpermissive." } */
+
+/* See strub-ptrfn2.c.  */
+
+extern int __attribute__ ((strub ("callable"))) bac (void);
+extern int __attribute__ ((strub ("disabled"))) bad (void);
+extern int __attribute__ ((strub ("internal"))) bar (void);
+extern int __attribute__ ((strub ("at-calls"))) bal (void);
+
+void __attribute__ ((strub))
+bap (void)
+{
+  int __attribute__ ((strub ("disabled"))) (*d_p) (void) = bad;
+  int __attribute__ ((strub ("callable"))) (*c_p) (void) = bac;
+  int __attribute__ ((strub ("at-calls"))) (*a_p) (void) = bal;
+
+  d_p = bac; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bad; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bar; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bal; /* { dg-message "incompatible|invalid conversion" } */
+  a_p = bac; /* { dg-message "incompatible|invalid conversion" } */
+
+  d_p (); /* { dg-error "indirect non-.strub. call in .strub. context" } */
+  c_p ();
+  a_p ();
+}
+
+void __attribute__ ((strub))
+baP (void)
+{
+  typedef int __attribute__ ((strub ("disabled"))) d_fn_t (void);
+  typedef int __attribute__ ((strub ("callable"))) c_fn_t (void);
+  typedef int __attribute__ ((strub ("at-calls"))) a_fn_t (void);
+
+  d_fn_t *d_p = bad;
+  c_fn_t *c_p = bac;
+  a_fn_t *a_p = bal;
+
+  d_p = bac; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bad; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bar; /* { dg-warning "not quite compatible" "" { xfail c++ } } */
+  c_p = bal; /* { dg-message "incompatible|invalid conversion" } */
+  a_p = bac; /* { dg-message "incompatible|invalid conversion" } */
+
+  d_p (); /* { dg-error "indirect non-.strub. call in .strub. context" } */
+  c_p ();
+  a_p ();
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-ptrfn4.c b/gcc/testsuite/c-c++-common/torture/strub-ptrfn4.c
new file mode 100644
index 0000000000000..70b558afad040
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-ptrfn4.c
@@ -0,0 +1,43 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=relaxed" } */
+
+/* This is strub-ptrfn2.c without -Wpedantic.
+
+   Even C doesn't report the (not-quite-)compatible conversions without it.  */
+
+extern int __attribute__ ((strub ("callable"))) bac (void);
+extern int __attribute__ ((strub ("disabled"))) bad (void);
+extern int __attribute__ ((strub ("internal"))) bar (void);
+extern int __attribute__ ((strub ("at-calls"))) bal (void);
+
+void __attribute__ ((strub))
+bap (void)
+{
+  int __attribute__ ((strub ("disabled"))) (*d_p) (void) = bad;
+  int __attribute__ ((strub ("callable"))) (*c_p) (void) = bac;
+  int __attribute__ ((strub ("at-calls"))) (*a_p) (void) = bal;
+
+  d_p = bac;
+  c_p = bad;
+  c_p = bar;
+  c_p = bal; /* { dg-message "incompatible|invalid conversion" } */
+  a_p = bac; /* { dg-message "incompatible|invalid conversion" } */
+}
+
+void __attribute__ ((strub))
+baP (void)
+{
+  typedef int __attribute__ ((strub ("disabled"))) d_fn_t (void);
+  typedef int __attribute__ ((strub ("callable"))) c_fn_t (void);
+  typedef int __attribute__ ((strub ("at-calls"))) a_fn_t (void);
+
+  d_fn_t *d_p = bad;
+  c_fn_t *c_p = bac;
+  a_fn_t *a_p = bal;
+
+  d_p = bac;
+  c_p = bad;
+  c_p = bar;
+  c_p = bal; /* { dg-message "incompatible|invalid conversion" } */
+  a_p = bac; /* { dg-message "incompatible|invalid conversion" } */
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-pure1.c b/gcc/testsuite/c-c++-common/torture/strub-pure1.c
new file mode 100644
index 0000000000000..a262a086837b2
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-pure1.c
@@ -0,0 +1,18 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that, along with a strub pure function call, we issue an asm statement
+   to make sure the watermark passed to it is not assumed to be unchanged.  */
+
+int __attribute__ ((__strub__, __pure__))
+f() {
+  static int i; /* Stop it from being detected as const.  */
+  return i;
+}
+
+int
+g() {
+  return f();
+}
+
+/* { dg-final { scan-ipa-dump-times "__asm__" 1 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-pure2.c b/gcc/testsuite/c-c++-common/torture/strub-pure2.c
new file mode 100644
index 0000000000000..4c4bd50c209a0
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-pure2.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that, along with a strub implicitly-pure function call, we issue an asm
+   statement to make sure the watermark passed to it is not assumed to be
+   unchanged.  */
+
+int __attribute__ ((__strub__))
+#if ! __OPTIMIZE__ /* At -O0, implicit pure detection doesn't run.  */
+__attribute__ ((__pure__))
+#endif
+f() {
+  static int i; /* Stop it from being detected as const.  */
+  return i;
+}
+
+int
+g() {
+  return f();
+}
+
+/* { dg-final { scan-ipa-dump-times "__asm__" 1 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-pure3.c b/gcc/testsuite/c-c++-common/torture/strub-pure3.c
new file mode 100644
index 0000000000000..ce195c6b1f1b6
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-pure3.c
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that, along with a strub pure wrapping call, we issue an asm statement
+   to make sure the watermark passed to it is not assumed to be unchanged.  */
+
+int __attribute__ ((__strub__ ("internal"), __pure__))
+f() {
+  static int i; /* Stop it from being detected as const.  */
+  return i;
+}
+
+/* { dg-final { scan-ipa-dump-times "__asm__" 1 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-pure4.c b/gcc/testsuite/c-c++-common/torture/strub-pure4.c
new file mode 100644
index 0000000000000..75cd54ccb5b5d
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-pure4.c
@@ -0,0 +1,17 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+/* Check that, along with a strub implicitly-pure wrapping call, we issue an asm
+   statement to make sure the watermark passed to it is not assumed to be
+   unchanged.  */
+
+int __attribute__ ((__strub__ ("internal")))
+#if ! __OPTIMIZE__ /* At -O0, implicit pure detection doesn't run.  */
+__attribute__ ((__pure__))
+#endif
+f() {
+  static int i; /* Stop it from being detected as const.  */
+  return i;
+}
+
+/* { dg-final { scan-ipa-dump-times "__asm__" 1 "strub" } } */
diff --git a/gcc/testsuite/c-c++-common/torture/strub-run1.c b/gcc/testsuite/c-c++-common/torture/strub-run1.c
new file mode 100644
index 0000000000000..7458b3fb54da5
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-run1.c
@@ -0,0 +1,95 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=strict" } */
+
+/* Check that a non-strub function leaves a string behind in the stack, and that
+   equivalent strub functions don't.  Avoid the use of red zones by avoiding
+   leaf functions.  */
+
+const char test_string[] = "\x55\xde\xad\xbe\xef\xc0\x1d\xca\xfe\x55\xaa";
+
+/* Pad before and after the string on the stack, so that it's not overwritten by
+   regular stack use.  */
+#define PAD 7
+
+static inline __attribute__ ((__always_inline__, __strub__ ("callable")))
+char *
+leak_string (void)
+{
+  /* We use this variable to avoid any stack red zone.  Stack scrubbing covers
+     it, but __builtin_stack_address, that we take as a reference, doesn't, so
+     if e.g. callable() were to store the string in the red zone, we wouldn't
+     find it because it would be outside the range we searched.  */
+  typedef void __attribute__ ((__strub__ ("callable"))) callable_t (char *);
+  callable_t *f = 0;
+
+  char s[2 * PAD + 1][sizeof (test_string)];
+  __builtin_strcpy (s[PAD], test_string);
+  asm ("" : "+m" (s), "+r" (f));
+
+  if (__builtin_expect (!f, 1))
+    return (char *) __builtin_stack_address ();
+
+  f (s[PAD]);
+  return 0;
+}
+
+static inline __attribute__ ((__always_inline__))
+int
+look_for_string (char *e)
+{
+  char *p = (char *) __builtin_stack_address ();
+
+  if (p == e)
+    __builtin_abort ();
+
+  if (p > e)
+    {
+      char *q = p;
+      p = e;
+      e = q;
+    }
+
+  for (char *re = e - sizeof (test_string); p < re; p++)
+    for (int i = 0; p[i] == test_string[i]; i++)
+      if (i == sizeof (test_string) - 1)
+	return i;
+
+  return 0;
+}
+
+static __attribute__ ((__noinline__, __noclone__))
+char *
+callable ()
+{
+  return leak_string ();
+}
+
+static __attribute__ ((__strub__ ("at-calls")))
+char *
+at_calls ()
+{
+  return leak_string ();
+}
+
+static __attribute__ ((__strub__ ("internal")))
+char *
+internal ()
+{
+  return leak_string ();
+}
+
+int main ()
+{
+  /* Since these test check stack contents above the top of the stack, an
+     unexpected asynchronous signal or interrupt might overwrite the bits we
+     expect to find and cause spurious fails.  Tolerate one such overall
+     spurious fail by retrying.  */
+  int i = 1;
+  while (!look_for_string (callable ()))
+    if (!i--) __builtin_abort ();
+  while (look_for_string (at_calls ()))
+    if (!i--) __builtin_abort ();
+  while (look_for_string (internal ()))
+    if (!i--) __builtin_abort ();
+  __builtin_exit (0);
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-run2.c b/gcc/testsuite/c-c++-common/torture/strub-run2.c
new file mode 100644
index 0000000000000..5d60a7775f4bb
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-run2.c
@@ -0,0 +1,84 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=strict" } */
+
+/* Check that a non-strub function leaves a string behind in the stack, and that
+   equivalent strub functions don't.  Allow red zones to be used.  */
+
+const char test_string[] = "\x55\xde\xad\xbe\xef\xc0\x1d\xca\xfe\x55\xaa";
+
+/* Pad before and after the string on the stack, so that it's not overwritten by
+   regular stack use.  */
+#define PAD 7
+
+static inline __attribute__ ((__always_inline__, __strub__ ("callable")))
+char *
+leak_string (void)
+{
+  int len = sizeof (test_string);
+  asm ("" : "+rm" (len));
+  char s[2 * PAD + 1][len];
+  __builtin_strcpy (s[PAD], test_string);
+  asm ("" : "+m" (s));
+  return (char *) __builtin_stack_address ();
+}
+
+static inline __attribute__ ((__always_inline__))
+int
+look_for_string (char *e)
+{
+  char *p = (char *) __builtin_stack_address ();
+
+  if (p == e)
+    __builtin_abort ();
+
+  if (p > e)
+    {
+      char *q = p;
+      p = e;
+      e = q;
+    }
+
+  for (char *re = e - sizeof (test_string); p < re; p++)
+    for (int i = 0; p[i] == test_string[i]; i++)
+      if (i == sizeof (test_string) - 1)
+	return i;
+
+  return 0;
+}
+
+static __attribute__ ((__noinline__, __noclone__))
+char *
+callable ()
+{
+  return leak_string ();
+}
+
+static __attribute__ ((__strub__ ("at-calls")))
+char *
+at_calls ()
+{
+  return leak_string ();
+}
+
+static __attribute__ ((__strub__ ("internal")))
+char *
+internal ()
+{
+  return leak_string ();
+}
+
+int main ()
+{
+  /* Since these test check stack contents above the top of the stack, an
+     unexpected asynchronous signal or interrupt might overwrite the bits we
+     expect to find and cause spurious fails.  Tolerate one such overall
+     spurious fail by retrying.  */
+  int i = 1;
+  while (!look_for_string (callable ()))
+    if (!i--) __builtin_abort ();
+  while (look_for_string (at_calls ()))
+    if (!i--) __builtin_abort ();
+  while (look_for_string (internal ()))
+    if (!i--) __builtin_abort ();
+  __builtin_exit (0);
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-run3.c b/gcc/testsuite/c-c++-common/torture/strub-run3.c
new file mode 100644
index 0000000000000..c2ad710858e87
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-run3.c
@@ -0,0 +1,80 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=strict" } */
+/* { dg-require-effective-target alloca } */
+
+/* Check that a non-strub function leaves a string behind in the stack, and that
+   equivalent strub functions don't.  */
+
+const char test_string[] = "\x55\xde\xad\xbe\xef\xc0\x1d\xca\xfe\x55\xaa";
+
+static inline __attribute__ ((__always_inline__, __strub__ ("callable")))
+char *
+leak_string (void)
+{
+  int len = sizeof (test_string);
+  char *s = (char *) __builtin_alloca (len);
+  __builtin_strcpy (s, test_string);
+  asm ("" : "+m" (s));
+  return (char *) __builtin_stack_address ();
+}
+
+static inline __attribute__ ((__always_inline__))
+int
+look_for_string (char *e)
+{
+  char *p = (char *) __builtin_stack_address ();
+
+  if (p == e)
+    __builtin_abort ();
+
+  if (p > e)
+    {
+      char *q = p;
+      p = e;
+      e = q;
+    }
+
+  for (char *re = e - sizeof (test_string); p < re; p++)
+    for (int i = 0; p[i] == test_string[i]; i++)
+      if (i == sizeof (test_string) - 1)
+	return i;
+
+  return 0;
+}
+
+static __attribute__ ((__noinline__, __noclone__))
+char *
+callable ()
+{
+  return leak_string ();
+}
+
+static __attribute__ ((__strub__ ("at-calls")))
+char *
+at_calls ()
+{
+  return leak_string ();
+}
+
+static __attribute__ ((__strub__ ("internal")))
+char *
+internal ()
+{
+  return leak_string ();
+}
+
+int main ()
+{
+  /* Since these test check stack contents above the top of the stack, an
+     unexpected asynchronous signal or interrupt might overwrite the bits we
+     expect to find and cause spurious fails.  Tolerate one such overall
+     spurious fail by retrying.  */
+  int i = 1;
+  while (!look_for_string (callable ()))
+    if (!i--) __builtin_abort ();
+  while (look_for_string (at_calls ()))
+    if (!i--) __builtin_abort ();
+  while (look_for_string (internal ()))
+    if (!i--) __builtin_abort ();
+  __builtin_exit (0);
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-run4.c b/gcc/testsuite/c-c++-common/torture/strub-run4.c
new file mode 100644
index 0000000000000..3b36b8e5d68ef
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-run4.c
@@ -0,0 +1,106 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=all" } */
+/* { dg-require-effective-target alloca } */
+
+/* Check that multi-level, multi-inlined functions still get cleaned up as
+   expected, without overwriting temporary stack allocations while they should
+   still be available.  */
+
+#ifndef ATTR_STRUB_AT_CALLS
+# define ATTR_STRUB_AT_CALLS /* Defined in strub-run4d.c.  */
+#endif
+
+const char test_string[] = "\x55\xde\xad\xbe\xef\xc0\x1d\xca\xfe\x55\xaa";
+
+static inline __attribute__ ((__always_inline__))
+char *
+leak_string (void)
+{
+  int __attribute__ ((__strub__)) len = 512;
+  asm ("" : "+r" (len));
+  char s[len];
+  __builtin_strcpy (s, test_string);
+  __builtin_strcpy (s + len - sizeof (test_string), test_string);
+  asm ("" : "+m" (s));
+  return (char *) __builtin_stack_address ();
+}
+
+static inline __attribute__ ((__always_inline__))
+int
+look_for_string (char *e)
+{
+  char *p = (char *) __builtin_stack_address ();
+
+  if (p == e)
+    __builtin_abort ();
+
+  if (p > e)
+    {
+      char *q = p;
+      p = e;
+      e = q;
+    }
+
+  for (char *re = e - sizeof (test_string); p < re; p++)
+    for (int i = 0; p[i] == test_string[i]; i++)
+      if (i == sizeof (test_string) - 1)
+	return i;
+
+  return 0;
+}
+
+static inline ATTR_STRUB_AT_CALLS
+char *
+innermost ()
+{
+  int __attribute__ ((__strub__)) len = 512;
+  asm ("" : "+r" (len));
+  char s[len];
+  __builtin_strcpy (s, test_string);
+  __builtin_strcpy (s + len - sizeof (test_string), test_string);
+  asm ("" : "+m" (s));
+  char *ret = leak_string ();
+  if (__builtin_strcmp (s, test_string) != 0)
+    __builtin_abort ();
+  if (__builtin_strcmp (s + len - sizeof (test_string), test_string) != 0)
+    __builtin_abort ();
+  return ret;
+}
+
+static inline ATTR_STRUB_AT_CALLS
+char *
+intermediate ()
+{
+  int __attribute__ ((__strub__)) len = 512;
+  asm ("" : "+r" (len));
+  char s[len];
+  __builtin_strcpy (s, test_string);
+  __builtin_strcpy (s + len - sizeof (test_string), test_string);
+  asm ("" : "+m" (s));
+  char *ret = innermost ();
+  if (__builtin_strcmp (s, test_string) != 0)
+    __builtin_abort ();
+  if (__builtin_strcmp (s + len - sizeof (test_string), test_string) != 0)
+    __builtin_abort ();
+  return ret;
+}
+
+static inline __attribute__ ((__strub__ ("internal")))
+char *
+internal ()
+{
+  return intermediate ();
+}
+
+int __attribute__ ((__strub__ ("disabled")))
+main ()
+{
+  /* Since these test check stack contents above the top of the stack, an
+     unexpected asynchronous signal or interrupt might overwrite the bits we
+     expect to find and cause spurious fails.  Tolerate one such overall
+     spurious fail by retrying.  */
+  int i = 1;
+  while (look_for_string (internal ()))
+    if (!i--) __builtin_abort ();
+  __builtin_exit (0);
+}
diff --git a/gcc/testsuite/c-c++-common/torture/strub-run4c.c b/gcc/testsuite/c-c++-common/torture/strub-run4c.c
new file mode 100644
index 0000000000000..57f9baf758ded
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-run4c.c
@@ -0,0 +1,5 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=at-calls" } */
+/* { dg-require-effective-target alloca } */
+
+#include "strub-run4.c"
diff --git a/gcc/testsuite/c-c++-common/torture/strub-run4d.c b/gcc/testsuite/c-c++-common/torture/strub-run4d.c
new file mode 100644
index 0000000000000..08de3f1c3b17c
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-run4d.c
@@ -0,0 +1,7 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=strict" } */
+/* { dg-require-effective-target alloca } */
+
+#define ATTR_STRUB_AT_CALLS __attribute__ ((__strub__ ("at-calls")))
+
+#include "strub-run4.c"
diff --git a/gcc/testsuite/c-c++-common/torture/strub-run4i.c b/gcc/testsuite/c-c++-common/torture/strub-run4i.c
new file mode 100644
index 0000000000000..459f6886c5499
--- /dev/null
+++ b/gcc/testsuite/c-c++-common/torture/strub-run4i.c
@@ -0,0 +1,5 @@ 
+/* { dg-do run } */
+/* { dg-options "-fstrub=internal" } */
+/* { dg-require-effective-target alloca } */
+
+#include "strub-run4.c"
diff --git a/gcc/testsuite/g++.dg/strub-run1.C b/gcc/testsuite/g++.dg/strub-run1.C
new file mode 100644
index 0000000000000..0d367fb83d09d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/strub-run1.C
@@ -0,0 +1,19 @@ 
+// { dg-do run }
+// { dg-options "-fstrub=internal" }
+
+// Check that we don't get extra copies.
+
+struct T {
+  T &self;
+  void check () const { if (&self != this) __builtin_abort (); }
+  T() : self (*this) { check (); }
+  T(const T& ck) : self (*this) { ck.check (); check (); }
+  ~T() { check (); }
+};
+
+T foo (T q) { q.check (); return T(); }
+T bar (T p) { p.check (); return foo (p); }
+
+int main () {
+  bar (T()).check ();
+}
diff --git a/gcc/testsuite/g++.dg/torture/strub-init1.C b/gcc/testsuite/g++.dg/torture/strub-init1.C
new file mode 100644
index 0000000000000..c226ab10ff651
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/strub-init1.C
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+extern int __attribute__((__strub__)) initializer ();
+
+int f() {
+  static int x = initializer ();
+  return x;
+}
+
+/* { dg-final { scan-ipa-dump "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_update" "strub" } } */
diff --git a/gcc/testsuite/g++.dg/torture/strub-init2.C b/gcc/testsuite/g++.dg/torture/strub-init2.C
new file mode 100644
index 0000000000000..a7911f1fa7212
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/strub-init2.C
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+extern int __attribute__((__strub__)) initializer ();
+
+static int x = initializer ();
+
+int f() {
+  return x;
+}
+
+/* { dg-final { scan-ipa-dump "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_update" "strub" } } */
diff --git a/gcc/testsuite/g++.dg/torture/strub-init3.C b/gcc/testsuite/g++.dg/torture/strub-init3.C
new file mode 100644
index 0000000000000..6ebebcd01e8ea
--- /dev/null
+++ b/gcc/testsuite/g++.dg/torture/strub-init3.C
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-options "-fstrub=strict -fdump-ipa-strub" } */
+
+extern int __attribute__((__strub__)) initializer ();
+
+int f() {
+  int x = initializer ();
+  return x;
+}
+
+/* { dg-final { scan-ipa-dump "strub_enter" "strub" } } */
+/* { dg-final { scan-ipa-dump "strub_leave" "strub" } } */
+/* { dg-final { scan-ipa-dump-not "strub_update" "strub" } } */
diff --git a/gcc/testsuite/gnat.dg/strub_access.adb b/gcc/testsuite/gnat.dg/strub_access.adb
new file mode 100644
index 0000000000000..29e6996ecf61c
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_access.adb
@@ -0,0 +1,21 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=relaxed -fdump-ipa-strubm" }
+
+--  The main subprogram doesn't read from the automatic variable, but
+--  being an automatic variable, its presence should be enough for the
+--  procedure to get strub enabled.
+
+procedure Strub_Access is
+   type Strub_Int is new Integer;
+   pragma Machine_Attribute (Strub_Int, "strub");
+   
+   X : aliased Strub_Int := 0;
+
+   function F (P : access Strub_Int) return Strub_Int is (P.all);
+
+begin
+   X := F (X'Access);
+end Strub_Access;
+
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]internal\[)\]\[)\]" 1 "strubm" } }
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]at-calls-opt\[)\]\[)\]" 1 "strubm" } }
diff --git a/gcc/testsuite/gnat.dg/strub_access1.adb b/gcc/testsuite/gnat.dg/strub_access1.adb
new file mode 100644
index 0000000000000..dae4706016436
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_access1.adb
@@ -0,0 +1,16 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=relaxed" }
+
+--  Check that we reject 'Access of a strub variable whose type does
+--  not carry a strub modifier.
+
+procedure Strub_Access1 is
+   X : aliased Integer := 0;
+   pragma Machine_Attribute (X, "strub");
+
+   function F (P : access Integer) return Integer is (P.all);
+   
+begin
+   X := F (X'Unchecked_access); -- OK.
+   X := F (X'Access); -- { dg-error "target access type drops .strub. mode" }
+end Strub_Access1;
diff --git a/gcc/testsuite/gnat.dg/strub_attr.adb b/gcc/testsuite/gnat.dg/strub_attr.adb
new file mode 100644
index 0000000000000..10445d7cf8451
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_attr.adb
@@ -0,0 +1,37 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=strict -fdump-ipa-strubm -fdump-ipa-strub" }
+
+package body Strub_Attr is
+   E : exception;
+
+   procedure P (X : Integer) is
+   begin
+      raise E;
+   end;
+   
+   function F (X : Integer) return Integer is
+   begin
+      return X * X;
+   end;
+   
+   function G return Integer is (F (X));
+   --  function G return Integer is (FP (X));
+   --  Calling G would likely raise an exception, because although FP
+   --  carries the strub at-calls attribute needed to call F, the
+   --  attribute is dropped from the type used for the call proper.
+end Strub_Attr;
+
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]internal\[)\]\[)\]" 2 "strubm" } }
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]at-calls\[)\]\[)\]" 0 "strubm" } }
+--  { dg-final { scan-ipa-dump-times "\[(\]strub\[)\]" 1 "strubm" } }
+
+--  { dg-final { scan-ipa-dump-times "strub.watermark_ptr" 6 "strub" } }
+--  We have 1 at-calls subprogram (F) and 2 wrapped (P and G).
+--  For each of them, there's one match for the wrapped signature, 
+--  and one for the update call.
+
+--  { dg-final { scan-ipa-dump-times "strub.watermark" 27 "strub" } }
+--  The 6 matches above, plus:
+--  5*2: wm var decl, enter, call, leave and clobber for each wrapper;
+--  2*1: an extra leave and clobber for the exception paths in the wrappers.
+--  7*1: for the F call in G, including EH path.
diff --git a/gcc/testsuite/gnat.dg/strub_attr.ads b/gcc/testsuite/gnat.dg/strub_attr.ads
new file mode 100644
index 0000000000000..a94c23bf41833
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_attr.ads
@@ -0,0 +1,12 @@ 
+package Strub_Attr is
+   procedure P (X : Integer);
+   pragma Machine_Attribute (P, "strub", "internal");
+
+   function F (X : Integer) return Integer;
+   pragma Machine_Attribute (F, "strub");
+
+   X : Integer := 0;
+   pragma Machine_Attribute (X, "strub");
+
+   function G return Integer;
+end Strub_Attr;
diff --git a/gcc/testsuite/gnat.dg/strub_disp.adb b/gcc/testsuite/gnat.dg/strub_disp.adb
new file mode 100644
index 0000000000000..3dbcc4a357cba
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_disp.adb
@@ -0,0 +1,64 @@ 
+--  { dg-do compile }
+
+procedure Strub_Disp is
+   package Foo is
+      type A is tagged null record;
+
+      procedure P (I : Integer; X : A);
+      pragma Machine_Attribute (P, "strub", "at-calls");
+      
+      function F (X : access A) return Integer;
+
+      type B is new A with null record;
+      
+      overriding
+      procedure P (I : Integer; X : B); -- { dg-error "requires the same .strub. mode" }
+
+      overriding
+      function F (X : access B) return Integer;
+      pragma Machine_Attribute (F, "strub", "at-calls"); -- { dg-error "requires the same .strub. mode" }
+
+   end Foo;
+
+   package body Foo is
+      procedure P (I : Integer; X : A) is
+      begin
+	 null;
+      end;
+      
+      function F (X : access A) return Integer is (0);
+
+      overriding
+      procedure P (I : Integer; X : B) is
+      begin
+	 P (I, A (X));
+      end;
+      
+      overriding
+      function F (X : access B) return Integer is (1);
+   end Foo;
+
+   use Foo;
+
+   procedure Q (X : A'Class) is
+   begin
+      P (-1, X);
+   end;
+
+   XA : aliased A;
+   XB : aliased B;
+   I : Integer := 0;
+   XC : access A'Class;
+begin
+   Q (XA);
+   Q (XB);
+   
+   I := I + F (XA'Access);
+   I := I + F (XB'Access);
+
+   XC := XA'Access;
+   I := I + F (XC);
+
+   XC := XB'Access;
+   I := I + F (XC);
+end Strub_Disp;
diff --git a/gcc/testsuite/gnat.dg/strub_disp1.adb b/gcc/testsuite/gnat.dg/strub_disp1.adb
new file mode 100644
index 0000000000000..09756a74b7d81
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_disp1.adb
@@ -0,0 +1,79 @@ 
+--  { dg-do compile }
+--  { dg-options "-fdump-ipa-strub" }
+
+-- Check that at-calls dispatching calls are transformed.
+
+procedure Strub_Disp1 is
+   package Foo is
+      type A is tagged null record;
+
+      procedure P (I : Integer; X : A);
+      pragma Machine_Attribute (P, "strub", "at-calls");
+      
+      function F (X : access A) return Integer;
+      pragma Machine_Attribute (F, "strub", "at-calls");
+
+      type B is new A with null record;
+      
+      overriding
+      procedure P (I : Integer; X : B);
+      pragma Machine_Attribute (P, "strub", "at-calls");
+
+      overriding
+      function F (X : access B) return Integer;
+      pragma Machine_Attribute (F, "strub", "at-calls");
+
+   end Foo;
+
+   package body Foo is
+      procedure P (I : Integer; X : A) is
+      begin
+	 null;
+      end;
+      
+      function F (X : access A) return Integer is (0);
+
+      overriding
+      procedure P (I : Integer; X : B) is
+      begin
+	 P (I, A (X)); -- strub-at-calls non-dispatching call
+      end;
+      
+      overriding
+      function F (X : access B) return Integer is (1);
+   end Foo;
+
+   use Foo;
+
+   procedure Q (X : A'Class) is
+   begin
+      P (-1, X); -- strub-at-calls dispatching call.
+   end;
+
+   XA : aliased A;
+   XB : aliased B;
+   I : Integer := 0;
+   XC : access A'Class;
+begin
+   Q (XA);
+   Q (XB);
+   
+   I := I + F (XA'Access); -- strub-at-calls non-dispatching call
+   I := I + F (XB'Access); -- strub-at-calls non-dispatching call
+
+   XC := XA'Access;
+   I := I + F (XC); -- strub-at-calls dispatching call.
+
+   XC := XB'Access;
+   I := I + F (XC); -- strub-at-calls dispatching call.
+end Strub_Disp1;
+
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]at-calls\[)\]\[)\]" 4 "strub" } }
+
+--  Count the strub-at-calls non-dispatching calls 
+--  (+ 2 each, for the matching prototypes)
+--  { dg-final { scan-ipa-dump-times "foo\.p \[(\]\[^\n\]*watermark" 3 "strub" } }
+--  { dg-final { scan-ipa-dump-times "foo\.f \[(\]\[^\n\]*watermark" 4 "strub" } }
+
+--  Count the strub-at-calls dispatching calls.
+--  { dg-final { scan-ipa-dump-times "_\[0-9\]* \[(\]\[^\n\]*watermark" 3 "strub" } }
diff --git a/gcc/testsuite/gnat.dg/strub_ind.adb b/gcc/testsuite/gnat.dg/strub_ind.adb
new file mode 100644
index 0000000000000..da56acaa957d2
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_ind.adb
@@ -0,0 +1,33 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=strict" }
+
+--  This is essentially the same test as strub_attr.adb, 
+--  but applying attributes to access types as well.
+--  That doesn't quite work yet, so we get an error we shouldn't get.
+
+package body Strub_Ind is
+   E : exception;
+
+   function G return Integer;
+
+   procedure P (X : Integer) is
+   begin
+      raise E;
+   end;
+   
+   function F (X : Integer) return Integer is
+   begin
+      return X * X;
+   end;
+   
+   function G return Integer is (FP (X));
+
+   type GT is access function return Integer;
+
+   type GT_SAC is access function return Integer;
+   pragma Machine_Attribute (GT_SAC, "strub", "at-calls");
+
+   GP : GT_SAC := GT_SAC (GT'(G'Access)); -- { dg-error "incompatible" }
+   -- pragma Machine_Attribute (GP, "strub", "at-calls");
+
+end Strub_Ind;
diff --git a/gcc/testsuite/gnat.dg/strub_ind.ads b/gcc/testsuite/gnat.dg/strub_ind.ads
new file mode 100644
index 0000000000000..99a65fc24b1ec
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_ind.ads
@@ -0,0 +1,17 @@ 
+package Strub_Ind is
+   procedure P (X : Integer);
+   pragma Machine_Attribute (P, "strub", "internal");
+
+   function F (X : Integer) return Integer;
+   pragma Machine_Attribute (F, "strub");
+
+   X : Integer := 0;
+   pragma Machine_Attribute (X, "strub");
+
+   type FT is access function (X : Integer) return Integer;
+   pragma Machine_Attribute (FT, "strub", "at-calls");
+
+   FP : FT := F'Access;
+   -- pragma Machine_Attribute (FP, "strub", "at-calls"); -- not needed
+
+end Strub_Ind;
diff --git a/gcc/testsuite/gnat.dg/strub_ind1.adb b/gcc/testsuite/gnat.dg/strub_ind1.adb
new file mode 100644
index 0000000000000..825e395e6819c
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_ind1.adb
@@ -0,0 +1,41 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=strict -fdump-ipa-strubm" }
+
+--  This is essentially the same test as strub_attr.adb, 
+--  but with an explicit conversion.
+
+package body Strub_Ind1 is
+   E : exception;
+
+   type Strub_Int is New Integer;
+   pragma Machine_Attribute (Strub_Int, "strub");
+
+   function G return Integer;
+   pragma Machine_Attribute (G, "strub", "disabled");
+
+   procedure P (X : Integer) is
+   begin
+      raise E;
+   end;
+   
+   function G return Integer is (FP (X));
+
+   type GT is access function return Integer;
+   pragma Machine_Attribute (GT, "strub", "disabled");
+
+   type GT_SC is access function return Integer;
+   pragma Machine_Attribute (GT_SC, "strub", "callable");
+
+   GP : GT_SC := GT_SC (GT'(G'Access));
+   --  pragma Machine_Attribute (GP, "strub", "callable"); -- not needed.
+
+   function F (X : Integer) return Integer is
+   begin
+      return X * GP.all;
+   end;
+   
+end Strub_Ind1;
+
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]disabled\[)\]\[)\]" 1 "strubm" } }
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]internal\[)\]\[)\]" 1 "strubm" } }
+--  { dg-final { scan-ipa-dump-times "\[(\]strub\[)\]" 1 "strubm" } }
diff --git a/gcc/testsuite/gnat.dg/strub_ind1.ads b/gcc/testsuite/gnat.dg/strub_ind1.ads
new file mode 100644
index 0000000000000..d3f1273b3a6b9
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_ind1.ads
@@ -0,0 +1,17 @@ 
+package Strub_Ind1 is
+   procedure P (X : Integer);
+   pragma Machine_Attribute (P, "strub", "internal");
+
+   function F (X : Integer) return Integer;
+   pragma Machine_Attribute (F, "strub");
+
+   X : aliased Integer := 0;
+   pragma Machine_Attribute (X, "strub");
+
+   type FT is access function (X : Integer) return Integer;
+   pragma Machine_Attribute (FT, "strub", "at-calls");
+
+   FP : FT := F'Access;
+   pragma Machine_Attribute (FP, "strub", "at-calls");
+
+end Strub_Ind1;
diff --git a/gcc/testsuite/gnat.dg/strub_ind2.adb b/gcc/testsuite/gnat.dg/strub_ind2.adb
new file mode 100644
index 0000000000000..e918b39263117
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_ind2.adb
@@ -0,0 +1,34 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=strict" }
+
+--  This is essentially the same test as strub_attr.adb, 
+--  but with an explicit conversion.
+
+package body Strub_Ind2 is
+   E : exception;
+
+   function G return Integer;
+   pragma Machine_Attribute (G, "strub", "callable");
+
+   procedure P (X : Integer) is
+   begin
+      raise E;
+   end;
+   
+   function G return Integer is (FP (X));
+
+   type GT is access function return Integer;
+   pragma Machine_Attribute (GT, "strub", "callable");
+
+   type GT_SD is access function return Integer;
+   pragma Machine_Attribute (GT_SD, "strub", "disabled");
+
+   GP : GT_SD := GT_SD (GT'(G'Access));
+   --  pragma Machine_Attribute (GP, "strub", "disabled"); -- not needed.
+
+   function F (X : Integer) return Integer is
+   begin
+      return X * GP.all; --  { dg-error "using non-.strub. type" }
+   end;
+   
+end Strub_Ind2;
diff --git a/gcc/testsuite/gnat.dg/strub_ind2.ads b/gcc/testsuite/gnat.dg/strub_ind2.ads
new file mode 100644
index 0000000000000..e13865ec49c38
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_ind2.ads
@@ -0,0 +1,17 @@ 
+package Strub_Ind2 is
+   procedure P (X : Integer);
+   pragma Machine_Attribute (P, "strub", "internal");
+
+   function F (X : Integer) return Integer;
+   pragma Machine_Attribute (F, "strub");
+
+   X : Integer := 0;
+   pragma Machine_Attribute (X, "strub");
+
+   type FT is access function (X : Integer) return Integer;
+   pragma Machine_Attribute (FT, "strub", "at-calls");
+
+   FP : FT := F'Access;
+   pragma Machine_Attribute (FP, "strub", "at-calls");
+
+end Strub_Ind2;
diff --git a/gcc/testsuite/gnat.dg/strub_intf.adb b/gcc/testsuite/gnat.dg/strub_intf.adb
new file mode 100644
index 0000000000000..8f0212a75866f
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_intf.adb
@@ -0,0 +1,93 @@ 
+--  { dg-do compile }
+
+--  Check that strub mode mismatches between overrider and overridden
+--  subprograms are reported.
+
+procedure Strub_Intf is
+   package Foo is
+      type TP is interface;
+      procedure P (I : Integer; X : TP) is abstract;
+      pragma Machine_Attribute (P, "strub", "at-calls"); -- { dg-error "requires the same .strub. mode" }
+
+      type TF is interface;
+      function F (X : access TF) return Integer is abstract;
+
+      type TX is interface;
+      procedure P (I : Integer; X : TX) is abstract;
+
+      type TI is interface and TP and TF and TX;
+      --  When we freeze TI, we detect the mismatch between the
+      --  inherited P and another parent's P.  Because TP appears
+      --  before TX, we inherit P from TP, and report the mismatch at
+      --  the pragma inherited from TP against TX's P.  In contrast,
+      --  when we freeze TII below, since TX appears before TP, we
+      --  report the error at the line in which the inherited
+      --  subprogram is synthesized, namely the line below, against
+      --  the line of the pragma.
+
+      type TII is interface and TX and TP and TF; -- { dg-error "requires the same .strub. mode" }
+
+      function F (X : access TI) return Integer is abstract;
+      pragma Machine_Attribute (F, "strub", "at-calls"); -- { dg-error "requires the same .strub. mode" }
+
+      type A is new TI with null record;
+
+      procedure P (I : Integer; X : A);
+      pragma Machine_Attribute (P, "strub", "at-calls"); -- { dg-error "requires the same .strub. mode" }
+      
+      function F (X : access A) return Integer; -- { dg-error "requires the same .strub. mode" }
+
+      type B is new TI with null record;
+      
+      overriding
+      procedure P (I : Integer; X : B); -- { dg-error "requires the same .strub. mode" }
+
+      overriding
+      function F (X : access B) return Integer;
+      pragma Machine_Attribute (F, "strub", "at-calls"); -- { dg-error "requires the same .strub. mode" }
+
+   end Foo;
+
+   package body Foo is
+      procedure P (I : Integer; X : A) is
+      begin
+	 null;
+      end;
+      
+      function F (X : access A) return Integer is (0);
+
+      overriding
+      procedure P (I : Integer; X : B) is
+      begin
+	 null;
+      end;
+      
+      overriding
+      function F (X : access B) return Integer is (1);
+
+   end Foo;
+
+   use Foo;
+
+   procedure Q (X : TX'Class) is
+   begin
+      P (-1, X);
+   end;
+
+   XA : aliased A;
+   XB : aliased B;
+   I : Integer := 0;
+   XC : access TI'Class;
+begin
+   Q (XA);
+   Q (XB);
+   
+   I := I + F (XA'Access);
+   I := I + F (XB'Access);
+
+   XC := XA'Access;
+   I := I + F (XC);
+
+   XC := XB'Access;
+   I := I + F (XC);
+end Strub_Intf;
diff --git a/gcc/testsuite/gnat.dg/strub_intf1.adb b/gcc/testsuite/gnat.dg/strub_intf1.adb
new file mode 100644
index 0000000000000..bf77321cef790
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_intf1.adb
@@ -0,0 +1,86 @@ 
+--  { dg-do compile }
+--  { dg-options "-fdump-ipa-strub" }
+
+-- Check that at-calls dispatching calls to interfaces are transformed.
+
+procedure Strub_Intf1 is
+   package Foo is
+      type TX is Interface;
+      procedure P (I : Integer; X : TX) is abstract;
+      pragma Machine_Attribute (P, "strub", "at-calls");
+      function F (X : access TX) return Integer is abstract;
+      pragma Machine_Attribute (F, "strub", "at-calls");
+
+      type A is new TX with null record;
+
+      procedure P (I : Integer; X : A);
+      pragma Machine_Attribute (P, "strub", "at-calls");
+      
+      function F (X : access A) return Integer;
+      pragma Machine_Attribute (F, "strub", "at-calls");
+
+      type B is new TX with null record;
+      
+      overriding
+      procedure P (I : Integer; X : B);
+      pragma Machine_Attribute (P, "strub", "at-calls");
+
+      overriding
+      function F (X : access B) return Integer;
+      pragma Machine_Attribute (F, "strub", "at-calls");
+
+   end Foo;
+
+   package body Foo is
+      procedure P (I : Integer; X : A) is
+      begin
+	 null;
+      end;
+      
+      function F (X : access A) return Integer is (0);
+
+      overriding
+      procedure P (I : Integer; X : B) is
+      begin
+	 null;
+      end;
+      
+      overriding
+      function F (X : access B) return Integer is (1);
+
+   end Foo;
+
+   use Foo;
+
+   procedure Q (X : TX'Class) is
+   begin
+      P (-1, X);
+   end;
+
+   XA : aliased A;
+   XB : aliased B;
+   I : Integer := 0;
+   XC : access TX'Class;
+begin
+   Q (XA);
+   Q (XB);
+   
+   I := I + F (XA'Access);
+   I := I + F (XB'Access);
+
+   XC := XA'Access;
+   I := I + F (XC);
+
+   XC := XB'Access;
+   I := I + F (XC);
+end Strub_Intf1;
+
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]at-calls\[)\]\[)\]" 4 "strub" } }
+
+--  Count the strub-at-calls non-dispatching calls 
+--  (+ 2 each, for the matching prototypes)
+--  { dg-final { scan-ipa-dump-times "foo\.p \[(\]\[^\n\]*watermark" 2 "strub" } }
+--  { dg-final { scan-ipa-dump-times "foo\.f \[(\]\[^\n\]*watermark" 4 "strub" } }
+
+--  Count the strub-at-calls dispatching calls.
+--  { dg-final { scan-ipa-dump-times "_\[0-9\]* \[(\]\[^\n\]*watermark" 3 "strub" } }
diff --git a/gcc/testsuite/gnat.dg/strub_intf2.adb b/gcc/testsuite/gnat.dg/strub_intf2.adb
new file mode 100644
index 0000000000000..e8880dbc43730
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_intf2.adb
@@ -0,0 +1,55 @@ 
+--  { dg-do compile }
+
+--  Check that strub mode mismatches between overrider and overridden
+--  subprograms are reported even when the overriders for an
+--  interface's subprograms are inherited from a type that is not a
+--  descendent of the interface.
+
+procedure Strub_Intf2 is
+   package Foo is
+      type A is tagged null record;
+
+      procedure P (I : Integer; X : A);
+      pragma Machine_Attribute (P, "strub", "at-calls"); -- { dg-error "requires the same .strub. mode" }
+      
+      function F (X : access A) return Integer;
+
+      type TX is Interface;
+
+      procedure P (I : Integer; X : TX) is abstract; 
+
+      function F (X : access TX) return Integer is abstract;
+      pragma Machine_Attribute (F, "strub", "at-calls");
+
+      type B is new A and TX with null record; -- { dg-error "requires the same .strub. mode" }
+
+   end Foo;
+
+   package body Foo is
+      procedure P (I : Integer; X : A) is
+      begin
+	 null;
+      end;
+      
+      function F (X : access A) return Integer is (0);
+
+   end Foo;
+
+   use Foo;
+
+   procedure Q (X : TX'Class) is
+   begin
+      P (-1, X);
+   end;
+
+   XB : aliased B;
+   I : Integer := 0;
+   XC : access TX'Class;
+begin
+   Q (XB);
+   
+   I := I + F (XB'Access);
+
+   XC := XB'Access;
+   I := I + F (XC);
+end Strub_Intf2;
diff --git a/gcc/testsuite/gnat.dg/strub_renm.adb b/gcc/testsuite/gnat.dg/strub_renm.adb
new file mode 100644
index 0000000000000..217367e712d82
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_renm.adb
@@ -0,0 +1,21 @@ 
+--  { dg-do compile }
+
+procedure Strub_Renm is
+   procedure P (X : Integer);
+   pragma Machine_Attribute (P, "strub", "at-calls");
+
+   function F return Integer;
+   pragma Machine_Attribute (F, "strub", "internal");
+
+   procedure Q (X : Integer) renames P; -- { dg-error "requires the same .strub. mode" }
+
+   function G return Integer renames F;
+   pragma Machine_Attribute (G, "strub", "callable"); -- { dg-error "requires the same .strub. mode" }
+
+   procedure P (X : Integer) is null;
+   function F return Integer is (0);
+
+begin
+   P (F);
+   Q (G);
+end Strub_Renm;
diff --git a/gcc/testsuite/gnat.dg/strub_renm1.adb b/gcc/testsuite/gnat.dg/strub_renm1.adb
new file mode 100644
index 0000000000000..a11adbfb5a9d6
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_renm1.adb
@@ -0,0 +1,32 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=relaxed -fdump-ipa-strub" }
+
+procedure Strub_Renm1 is
+   V : Integer := 0;
+   pragma Machine_Attribute (V, "strub");
+
+   procedure P (X : Integer);
+   pragma Machine_Attribute (P, "strub", "at-calls");
+
+   function F return Integer;
+
+   procedure Q (X : Integer) renames P;
+   pragma Machine_Attribute (Q, "strub", "at-calls");
+
+   function G return Integer renames F;
+   pragma Machine_Attribute (G, "strub", "internal");
+
+   procedure P (X : Integer) is null;
+   function F return Integer is (0);
+
+begin
+   P (F);
+   Q (G);
+end Strub_Renm1;
+
+--  This is for P; Q is an alias.
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]at-calls\[)\]\[)\]" 1 "strub" } }
+
+--  This is *not* for G, but for Strub_Renm1.
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]wrapped\[)\]\[)\]" 1 "strub" } }
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]wrapper\[)\]\[)\]" 1 "strub" } }
diff --git a/gcc/testsuite/gnat.dg/strub_renm2.adb b/gcc/testsuite/gnat.dg/strub_renm2.adb
new file mode 100644
index 0000000000000..c488c20826fdb
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_renm2.adb
@@ -0,0 +1,32 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=strict -fdump-ipa-strub" }
+
+procedure Strub_Renm2 is
+   V : Integer := 0;
+   pragma Machine_Attribute (V, "strub");
+
+   procedure P (X : Integer);
+   pragma Machine_Attribute (P, "strub", "at-calls");
+
+   function F return Integer;
+
+   procedure Q (X : Integer) renames P;
+   pragma Machine_Attribute (Q, "strub", "at-calls");
+
+   type T is access function return Integer;
+
+   type TC is access function return Integer;
+   pragma Machine_Attribute (TC, "strub", "callable");
+
+   FCptr : constant TC := TC (T'(F'Access));
+
+   function G return Integer renames FCptr.all;
+   pragma Machine_Attribute (G, "strub", "callable");
+
+   procedure P (X : Integer) is null;
+   function F return Integer is (0);
+
+begin
+   P (F);  -- { dg-error "calling non-.strub." }
+   Q (G);  -- ok, G is callable.
+end Strub_Renm2;
diff --git a/gcc/testsuite/gnat.dg/strub_var.adb b/gcc/testsuite/gnat.dg/strub_var.adb
new file mode 100644
index 0000000000000..3d158de28031f
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_var.adb
@@ -0,0 +1,16 @@ 
+--  { dg-do compile }
+--  { dg-options "-fstrub=strict -fdump-ipa-strubm" }
+
+-- We don't read from the automatic variable, but being an automatic
+--  variable, its presence should be enough for the procedure to get
+--  strub enabled.
+
+with Strub_Attr;
+procedure Strub_Var is
+   X : Integer := 0;
+   pragma Machine_Attribute (X, "strub");
+begin
+   X := Strub_Attr.F (0);
+end Strub_Var;
+
+--  { dg-final { scan-ipa-dump-times "\[(\]strub \[(\]internal\[)\]\[)\]" 1 "strubm" } }
diff --git a/gcc/testsuite/gnat.dg/strub_var1.adb b/gcc/testsuite/gnat.dg/strub_var1.adb
new file mode 100644
index 0000000000000..6a504e09198b6
--- /dev/null
+++ b/gcc/testsuite/gnat.dg/strub_var1.adb
@@ -0,0 +1,20 @@ 
+--  { dg-do compile }
+
+with Strub_Attr;
+procedure Strub_Var1 is
+   type TA  -- { dg-warning "does not apply to elements" }
+      is array (1..2) of Integer;
+   pragma Machine_Attribute (TA, "strub");
+   
+   A : TA := (0, 0);  -- { dg-warning "does not apply to elements" }
+   
+   type TR is record  -- { dg-warning "does not apply to fields" }
+      M, N : Integer;
+   end record;
+   pragma Machine_Attribute (TR, "strub");
+   
+   R : TR := (0, 0);
+
+begin
+   A(2) := Strub_Attr.F (A(1));
+end Strub_Var1;
diff --git a/gcc/tree-cfg.cc b/gcc/tree-cfg.cc
index ffeb20b717aea..f748ca8508940 100644
--- a/gcc/tree-cfg.cc
+++ b/gcc/tree-cfg.cc
@@ -5774,6 +5774,7 @@  gimple_verify_flow_info (void)
 	{
 	  gimple *stmt = gsi_stmt (gsi);
 
+	  /* Do NOT disregard debug stmts after found_ctrl_stmt.  */
 	  if (found_ctrl_stmt)
 	    {
 	      error ("control flow in the middle of basic block %d",
diff --git a/gcc/tree-pass.h b/gcc/tree-pass.h
index 09e6ada5b2f91..36f955a0b37ee 100644
--- a/gcc/tree-pass.h
+++ b/gcc/tree-pass.h
@@ -510,8 +510,9 @@  extern gimple_opt_pass *make_pass_adjust_alignment (gcc::context *ctxt);
 
 /* IPA Passes */
 extern simple_ipa_opt_pass *make_pass_ipa_lower_emutls (gcc::context *ctxt);
-extern simple_ipa_opt_pass
-							      *make_pass_ipa_function_and_variable_visibility (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_ipa_function_and_variable_visibility (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_ipa_strub_mode (gcc::context *ctxt);
+extern simple_ipa_opt_pass *make_pass_ipa_strub (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_tree_profile (gcc::context *ctxt);
 extern simple_ipa_opt_pass *make_pass_ipa_auto_profile (gcc::context *ctxt);
 
diff --git a/gcc/tree-ssa-ccp.cc b/gcc/tree-ssa-ccp.cc
index 1a555ae682638..03ff88afadddd 100644
--- a/gcc/tree-ssa-ccp.cc
+++ b/gcc/tree-ssa-ccp.cc
@@ -3073,7 +3073,9 @@  optimize_stack_restore (gimple_stmt_iterator i)
       if (!callee
 	  || !fndecl_built_in_p (callee, BUILT_IN_NORMAL)
 	  /* All regular builtins are ok, just obviously not alloca.  */
-	  || ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (callee)))
+	  || ALLOCA_FUNCTION_CODE_P (DECL_FUNCTION_CODE (callee))
+	  /* Do not remove stack updates before strub leave.  */
+	  || fndecl_built_in_p (callee, BUILT_IN___STRUB_LEAVE))
 	return NULL_TREE;
 
       if (fndecl_built_in_p (callee, BUILT_IN_STACK_RESTORE))
diff --git a/libgcc/Makefile.in b/libgcc/Makefile.in
index 8dedd10f79a30..d8163c5af9903 100644
--- a/libgcc/Makefile.in
+++ b/libgcc/Makefile.in
@@ -433,6 +433,9 @@  LIB2ADD += enable-execute-stack.c
 # Control Flow Redundancy hardening out-of-line checker.
 LIB2ADD += $(srcdir)/hardcfr.c
 
+# Stack scrubbing infrastructure.
+LIB2ADD += $(srcdir)/strub.c
+
 # While emutls.c has nothing to do with EH, it is in LIB2ADDEH*
 # instead of LIB2ADD because that's the way to be sure on some targets
 # (e.g. *-*-darwin*) only one copy of it is linked.
diff --git a/libgcc/libgcc2.h b/libgcc/libgcc2.h
index 06e26cad884b4..0b97eec472586 100644
--- a/libgcc/libgcc2.h
+++ b/libgcc/libgcc2.h
@@ -551,6 +551,10 @@  extern int __parityDI2 (UDWtype);
 
 extern void __enable_execute_stack (void *);
 
+extern void __strub_enter (void **);
+extern void __strub_update (void**);
+extern void __strub_leave (void **);
+
 #ifndef HIDE_EXPORTS
 #pragma GCC visibility pop
 #endif
diff --git a/libgcc/strub.c b/libgcc/strub.c
new file mode 100644
index 0000000000000..2a2b930b6039d
--- /dev/null
+++ b/libgcc/strub.c
@@ -0,0 +1,149 @@ 
+/* Stack scrubbing infrastructure
+   Copyright (C) 2021-2022 Free Software Foundation, Inc.
+   Contributed by Alexandre Oliva <oliva@adacore.com>
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify it under
+the terms of the GNU General Public License as published by the Free
+Software Foundation; either version 3, or (at your option) any later
+version.
+
+GCC is distributed in the hope that it will be useful, but WITHOUT ANY
+WARRANTY; without even the implied warranty of MERCHANTABILITY or
+FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License
+for more details.
+
+Under Section 7 of GPL version 3, you are granted additional
+permissions described in the GCC Runtime Library Exception, version
+3.1, as published by the Free Software Foundation.
+
+You should have received a copy of the GNU General Public License and
+a copy of the GCC Runtime Library Exception along with this program;
+see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
+<http://www.gnu.org/licenses/>.  */
+
+#include "tconfig.h"
+#include "tsystem.h"
+#include "coretypes.h"
+#include "tm.h"
+#include "libgcc_tm.h"
+#include "libgcc2.h"
+
+#ifndef STACK_GROWS_DOWNWARD
+# define TOPS >
+#else
+# define TOPS <
+#endif
+
+#define ATTRIBUTE_STRUB_CALLABLE __attribute__ ((__strub__ ("callable")))
+
+/* Enter a stack scrubbing context, initializing the watermark to the caller's
+   stack address.  */
+void ATTRIBUTE_STRUB_CALLABLE
+__strub_enter (void **watermark)
+{
+  *watermark = __builtin_frame_address (0);
+}
+
+/* Update the watermark within a stack scrubbing context with the current stack
+   pointer.  */
+void ATTRIBUTE_STRUB_CALLABLE
+__strub_update (void **watermark)
+{
+  void *sp = __builtin_frame_address (0);
+
+  if (sp TOPS *watermark)
+    *watermark = sp;
+}
+
+#if TARGET_STRUB_USE_DYNAMIC_ARRAY && ! defined TARGET_STRUB_MAY_USE_MEMSET
+# define TARGET_STRUB_MAY_USE_MEMSET 1
+#endif
+
+#if defined __x86_64__ && __OPTIMIZE__
+# define TARGET_STRUB_DISABLE_RED_ZONE \
+  /* __attribute__ ((__target__ ("no-red-zone"))) // not needed when optimizing */
+#elif !defined RED_ZONE_SIZE || defined __i386__
+# define TARGET_STRUB_DISABLE_RED_ZONE
+#endif
+
+#ifndef TARGET_STRUB_DISABLE_RED_ZONE
+/* Dummy function, called to force the caller to not be a leaf function, so
+   that it can't use the red zone.  */
+static void ATTRIBUTE_STRUB_CALLABLE
+__attribute__ ((__noinline__, __noipa__))
+__strub_dummy_force_no_leaf (void)
+{
+}
+#endif
+
+/* Leave a stack scrubbing context, clearing the stack between its top and
+   *MARK.  */
+void ATTRIBUTE_STRUB_CALLABLE
+#if ! TARGET_STRUB_MAY_USE_MEMSET
+__attribute__ ((__optimize__ ("-fno-tree-loop-distribute-patterns")))
+#endif
+#ifdef TARGET_STRUB_DISABLE_RED_ZONE
+TARGET_STRUB_DISABLE_RED_ZONE
+#endif
+__strub_leave (void **mark)
+{
+  void *sp = __builtin_stack_address ();
+
+  void **base, **end;
+#ifndef STACK_GROWS_DOWNWARD
+  base = sp; /* ??? Do we need an offset here?  */
+  end = *mark;
+#else
+  base = *mark;
+  end = sp; /* ??? Does any platform require an offset here?  */
+#endif
+
+  if (! (base < end))
+    return;
+
+#if TARGET_STRUB_USE_DYNAMIC_ARRAY
+  /* Compute the length without assuming the pointers are both sufficiently
+     aligned.  They should be, but pointer differences expected to be exact may
+     yield unexpected results when the assumption doesn't hold.  Given the
+     potential security implications, compute the length without that
+     expectation.  If the pointers are misaligned, we may leave a partial
+     unscrubbed word behind.  */
+  ptrdiff_t len = ((char *)end - (char *)base) / sizeof (void *);
+  /* Allocate a dynamically-sized array covering the desired range, so that we
+     can safely call memset on it.  */
+  void *ptr[len];
+  base = &ptr[0];
+  end = &ptr[len];
+#elifndef TARGET_STRUB_DISABLE_RED_ZONE
+  /* Prevent the use of the red zone, by making this function non-leaf through
+     an unreachable call that, because of the asm stmt, the compiler will
+     consider reachable.  */
+  asm goto ("" : : : : no_leaf);
+  if (0)
+    {
+    no_leaf:
+      __strub_dummy_force_no_leaf ();
+      return;
+    }
+#endif
+
+  /* ldist may turn these loops into a memset (thus the conditional
+     -fno-tree-loop-distribute-patterns above).  Without the dynamic array
+     above, that call would likely be unsafe: possibly tail-called, and likely
+     scribbling over its own stack frame.  */
+#ifndef STACK_GROWS_DOWNWARD
+  do
+    *base++ = 0;
+  while (base < end);
+  /* Make sure the stack overwrites are not optimized away.  */
+  asm ("" : : "m" (end[0]));
+#else
+  do
+    *--end = 0;
+  while (base < end);
+  /* Make sure the stack overwrites are not optimized away.  */
+  asm ("" : : "m" (base[0]));
+#endif
+}