From patchwork Tue Oct 18 15:12:12 2022
X-Patchwork-Submitter: Manolis Tsamis
X-Patchwork-Id: 4214
From: Manolis Tsamis
To: gcc-patches@gcc.gnu.org
Subject: [PATCH v2] Enable shrink wrapping for the RISC-V target.
Date: Tue, 18 Oct 2022 17:12:12 +0200
Message-Id: <20221018151212.1523137-1-manolis.tsamis@vrull.eu>
X-Mailer: git-send-email 2.34.1
Cc: Vineet Gupta, Kito Cheng, Philipp Tomsich

This commit implements the target macros (TARGET_SHRINK_WRAP_*) that
enable separate shrink wrapping for function prologues/epilogues in
RISC-V.

Tested against SPEC CPU 2017, this change always has a net-positive
effect on the dynamic instruction count.
See the following table for the breakdown on how this reduces the
number of dynamic instructions per workload on a like-for-like basis
(i.e., same config file; suppressing shrink-wrapping with
-fno-shrink-wrap):

                     # dynamic instructions
                 w/o shrink-wrap   w/ shrink-wrap     reduction
500.perlbench_r    1265716786593    1262156218578    3560568015  0.28%
500.perlbench_r     779224795689     765337009025   13887786664  1.78%
500.perlbench_r     724087331471     711307152522   12780178949  1.77%
502.gcc_r           204259864844     194517006339    9742858505  4.77%
502.gcc_r           244047794302     231555834722   12491959580  5.12%
502.gcc_r           230896069400     221877703011    9018366389  3.91%
502.gcc_r           192130616624     183856450605    8274166019  4.31%
502.gcc_r           258875074079     247756203226   11118870853  4.30%
505.mcf_r           662653430325     660678680547    1974749778  0.30%
520.omnetpp_r       985114167068     934191310154   50922856914  5.17%
523.xalancbmk_r     927037633578     921688937650    5348695928  0.58%
525.x264_r          490953958454     490565583447     388375007  0.08%
525.x264_r         1994662294421    1993171932425    1490361996  0.07%
525.x264_r         1897617120450    1896062750609    1554369841  0.08%
531.deepsjeng_r    1695189878907    1669304130411   25885748496  1.53%
541.leela_r        1925941222222    1897900861198   28040361024  1.46%
548.exchange2_r    2073816227944    2073816226729          1215  0.00%
557.xz_r            379572090003     379057409041     514680962  0.14%
557.xz_r            953117469352     952680431430     437037922  0.05%
557.xz_r            536859579650     536456690164     402889486  0.08%
                  18421773405376   18223938521833  197834883543  1.07%  totals

Signed-off-by: Manolis Tsamis

gcc/ChangeLog:

        * config/riscv/riscv.cc (struct machine_function): Add array to store
        register wrapping information.
        (riscv_for_each_saved_reg): Skip registers that are wrapped separately.
        (riscv_get_separate_components): New function.
        (riscv_components_for_bb): Likewise.
        (riscv_disqualify_components): Likewise.
        (riscv_process_components): Likewise.
        (riscv_emit_prologue_components): Likewise.
        (riscv_emit_epilogue_components): Likewise.
        (riscv_set_handled_components): Likewise.
        (TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS): Define.
        (TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB): Likewise.
        (TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS): Likewise.
        (TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS): Likewise.
        (TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS): Likewise.
        (TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS): Likewise.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/shrink-wrap-1.c: New test.
---
Changes in v2:
  - Fixed code and comment formatting for testcase.
  - Rebased code and resolved compile issues due to gp_sp_offset and
    GET_MODE_SIZE being poly_int.

 gcc/config/riscv/riscv.cc                     | 187 +++++++++++++++++-
 .../gcc.target/riscv/shrink-wrap-1.c          |  24 +++
 2 files changed, 209 insertions(+), 2 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/shrink-wrap-1.c

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index ad57b995e7b..a84805af4e7 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -26,6 +26,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "system.h"
 #include "coretypes.h"
 #include "target.h"
+#include "backend.h"
 #include "tm.h"
 #include "rtl.h"
 #include "regs.h"
@@ -51,6 +52,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "optabs.h"
 #include "bitmap.h"
 #include "df.h"
+#include "function-abi.h"
 #include "diagnostic.h"
 #include "builtins.h"
 #include "predict.h"
@@ -154,6 +156,11 @@ struct GTY(()) machine_function {
 
   /* The current frame information, calculated by riscv_compute_frame_info.  */
   struct riscv_frame_info frame;
+
+  /* The components already handled by separate shrink-wrapping, which should
+     not be considered by the prologue and epilogue.  */
+  bool reg_is_wrapped_separately[FIRST_PSEUDO_REGISTER];
+
 };
 
 /* Information about a single argument.  */
@@ -4681,7 +4688,7 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, riscv_save_restore_fn fn,
   for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
     if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
       {
-        bool handle_reg = TRUE;
+        bool handle_reg = !cfun->machine->reg_is_wrapped_separately[regno];
 
         /* If this is a normal return in a function that calls the eh_return
            builtin, then do not restore the eh return data registers as that
@@ -4712,9 +4719,11 @@ riscv_for_each_saved_reg (poly_int64 sp_offset, riscv_save_restore_fn fn,
   for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
     if (BITSET_P (cfun->machine->frame.fmask, regno - FP_REG_FIRST))
       {
+        bool handle_reg = !cfun->machine->reg_is_wrapped_separately[regno];
         machine_mode mode = TARGET_DOUBLE_FLOAT ? DFmode : SFmode;
 
-        riscv_save_restore_reg (mode, regno, offset, fn);
+        if (handle_reg)
+          riscv_save_restore_reg (mode, regno, offset, fn);
         offset -= GET_MODE_SIZE (mode).to_constant ();
       }
 }
@@ -5139,6 +5148,156 @@ riscv_epilogue_uses (unsigned int regno)
   return false;
 }
 
+/* Implement TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS.  */
+
+static sbitmap
+riscv_get_separate_components (void)
+{
+  HOST_WIDE_INT offset;
+  sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
+  bitmap_clear (components);
+
+  if (riscv_use_save_libcall (&cfun->machine->frame)
+      || cfun->machine->interrupt_handler_p)
+    return components;
+
+  offset = cfun->machine->frame.gp_sp_offset.to_constant ();
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
+      {
+        if (SMALL_OPERAND (offset))
+          bitmap_set_bit (components, regno);
+
+        offset -= UNITS_PER_WORD;
+      }
+
+  offset = cfun->machine->frame.fp_sp_offset.to_constant ();
+  for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.fmask, regno - FP_REG_FIRST))
+      {
+        machine_mode mode = TARGET_DOUBLE_FLOAT ? DFmode : SFmode;
+
+        if (SMALL_OPERAND (offset))
+          bitmap_set_bit (components, regno);
+
+        offset -= GET_MODE_SIZE (mode).to_constant ();
+      }
+
+  /* Don't mess with the hard frame pointer.  */
+  if (frame_pointer_needed)
+    bitmap_clear_bit (components, HARD_FRAME_POINTER_REGNUM);
+
+  bitmap_clear_bit (components, RETURN_ADDR_REGNUM);
+
+  return components;
+}
+
+/* Implement TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB.  */
+
+static sbitmap
+riscv_components_for_bb (basic_block bb)
+{
+  bitmap in = DF_LIVE_IN (bb);
+  bitmap gen = &DF_LIVE_BB_INFO (bb)->gen;
+  bitmap kill = &DF_LIVE_BB_INFO (bb)->kill;
+
+  sbitmap components = sbitmap_alloc (FIRST_PSEUDO_REGISTER);
+  bitmap_clear (components);
+
+  function_abi_aggregator callee_abis;
+  rtx_insn *insn;
+  FOR_BB_INSNS (bb, insn)
+    if (CALL_P (insn))
+      callee_abis.note_callee_abi (insn_callee_abi (insn));
+  HARD_REG_SET extra_caller_saves = callee_abis.caller_save_regs (*crtl->abi);
+
+  /* GPRs are used in a bb if they are in the IN, GEN, or KILL sets.  */
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (!fixed_regs[regno]
+        && !crtl->abi->clobbers_full_reg_p (regno)
+        && (TEST_HARD_REG_BIT (extra_caller_saves, regno)
+            || bitmap_bit_p (in, regno)
+            || bitmap_bit_p (gen, regno)
+            || bitmap_bit_p (kill, regno)))
+      bitmap_set_bit (components, regno);
+
+  for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (!fixed_regs[regno]
+        && !crtl->abi->clobbers_full_reg_p (regno)
+        && (TEST_HARD_REG_BIT (extra_caller_saves, regno)
+            || bitmap_bit_p (in, regno)
+            || bitmap_bit_p (gen, regno)
+            || bitmap_bit_p (kill, regno)))
+      bitmap_set_bit (components, regno);
+
+  return components;
+}
+
+/* Implement TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS.  */
+
+static void
+riscv_disqualify_components (sbitmap, edge, sbitmap, bool)
+{
+  /* Nothing to do for riscv.  */
+}
+
+static void
+riscv_process_components (sbitmap components, bool prologue_p)
+{
+  HOST_WIDE_INT offset;
+  riscv_save_restore_fn fn = prologue_p? riscv_save_reg : riscv_restore_reg;
+
+  offset = cfun->machine->frame.gp_sp_offset.to_constant ();
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.mask, regno - GP_REG_FIRST))
+      {
+        if (bitmap_bit_p (components, regno))
+          riscv_save_restore_reg (word_mode, regno, offset, fn);
+
+        offset -= UNITS_PER_WORD;
+      }
+
+  offset = cfun->machine->frame.fp_sp_offset.to_constant ();
+  for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (BITSET_P (cfun->machine->frame.fmask, regno - FP_REG_FIRST))
+      {
+        machine_mode mode = TARGET_DOUBLE_FLOAT ? DFmode : SFmode;
+
+        if (bitmap_bit_p (components, regno))
+          riscv_save_restore_reg (mode, regno, offset, fn);
+
+        offset -= GET_MODE_SIZE (mode).to_constant ();
+      }
+}
+
+/* Implement TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS.  */
+
+static void
+riscv_emit_prologue_components (sbitmap components)
+{
+  riscv_process_components (components, true);
+}
+
+/* Implement TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS.  */
+
+static void
+riscv_emit_epilogue_components (sbitmap components)
+{
+  riscv_process_components (components, false);
+}
+
+static void
+riscv_set_handled_components (sbitmap components)
+{
+  for (unsigned int regno = GP_REG_FIRST; regno <= GP_REG_LAST; regno++)
+    if (bitmap_bit_p (components, regno))
+      cfun->machine->reg_is_wrapped_separately[regno] = true;
+
+  for (unsigned int regno = FP_REG_FIRST; regno <= FP_REG_LAST; regno++)
+    if (bitmap_bit_p (components, regno))
+      cfun->machine->reg_is_wrapped_separately[regno] = true;
+}
+
 /* Return nonzero if this function is known to have a null epilogue.
    This allows the optimizer to omit jumps to jumps if no stack
    was created.  */
@@ -6458,6 +6617,30 @@ riscv_vector_alignment (const_tree type)
 #undef TARGET_FUNCTION_ARG_BOUNDARY
 #define TARGET_FUNCTION_ARG_BOUNDARY riscv_function_arg_boundary
 
+#undef TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS
+#define TARGET_SHRINK_WRAP_GET_SEPARATE_COMPONENTS \
+  riscv_get_separate_components
+
+#undef TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB
+#define TARGET_SHRINK_WRAP_COMPONENTS_FOR_BB \
+  riscv_components_for_bb
+
+#undef TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS
+#define TARGET_SHRINK_WRAP_DISQUALIFY_COMPONENTS \
+  riscv_disqualify_components
+
+#undef TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS
+#define TARGET_SHRINK_WRAP_EMIT_PROLOGUE_COMPONENTS \
+  riscv_emit_prologue_components
+
+#undef TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS
+#define TARGET_SHRINK_WRAP_EMIT_EPILOGUE_COMPONENTS \
+  riscv_emit_epilogue_components
+
+#undef TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS
+#define TARGET_SHRINK_WRAP_SET_HANDLED_COMPONENTS \
+  riscv_set_handled_components
+
 /* The generic ELF target does not always have TLS support.  */
 #ifdef HAVE_AS_TLS
 #undef TARGET_HAVE_TLS
diff --git a/gcc/testsuite/gcc.target/riscv/shrink-wrap-1.c b/gcc/testsuite/gcc.target/riscv/shrink-wrap-1.c
new file mode 100644
index 00000000000..e1e07c3d4c5
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/shrink-wrap-1.c
@@ -0,0 +1,24 @@
+/* { dg-do compile } */
+/* { dg-options "-fshrink-wrap" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Os" "-Oz" } } */
+
+void g(void);
+
+void f(int x)
+{
+  if (x)
+    {
+      /* Force saving of some callee-saved registers.  With shrink wrapping
+         enabled these only need to be saved if x is non-zero.  */
+      register int s2 asm("18") = x;
+      register int s3 asm("19") = x;
+      register int s4 asm("20") = x;
+      asm("" : : "r"(s2));
+      asm("" : : "r"(s3));
+      asm("" : : "r"(s4));
+      g();
+    }
+}
+
+/* The resulting code should do nothing if X is 0.  */
+/* { dg-final { scan-assembler "bne\ta0,zero,.*\n.*ret" } } */
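
Note for reviewers: the eligibility walk in riscv_get_separate_components
can be modelled outside of GCC. The sketch below is only an illustration
under stated assumptions, not code from the patch. The names
fits_small_operand and separate_gpr_components are hypothetical, a plain
32-bit mask stands in for the sbitmap, and only the GPR loop is modelled.
It assumes SMALL_OPERAND accepts exactly the 12-bit signed immediate range
[-2048, 2047] and that ra is x1 and s0/fp is x8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors SMALL_OPERAND, i.e. a 12-bit signed immediate.  */
static bool
fits_small_operand (int64_t offset)
{
  return offset >= -2048 && offset <= 2047;
}

/* Hypothetical model of the GPR loop.  Bit N of MASK means xN has a save
   slot.  TOP_OFFSET is the sp-relative offset of the first slot; each
   further saved register sits WORD_SIZE bytes lower.  Returns the set of
   registers that could be saved/restored separately.  */
static uint32_t
separate_gpr_components (uint32_t mask, int64_t top_offset, int word_size,
                         bool frame_pointer_needed)
{
  assert (word_size == 4 || word_size == 8);

  uint32_t components = 0;
  int64_t offset = top_offset;

  for (unsigned int regno = 0; regno < 32; regno++)
    if (mask & (UINT32_C (1) << regno))
      {
        /* Only slots reachable with a single load/store qualify.  */
        if (fits_small_operand (offset))
          components |= UINT32_C (1) << regno;

        /* The offset advances for every saved register, eligible or not.  */
        offset -= word_size;
      }

  /* Leave the hard frame pointer (x8) to the regular prologue.  */
  if (frame_pointer_needed)
    components &= ~(UINT32_C (1) << 8);

  /* The return address (x1) is never wrapped separately.  */
  components &= ~(UINT32_C (1) << 1);

  return components;
}
```

With ra, s2, s3 and s4 saved (mask 0x1C0002), a top offset of 24 and
8-byte words, the model keeps s2, s3 and s4 and drops ra. With a top
offset of 2056 only the slot at 2040 remains eligible, because 2056 and
2048 do not fit the immediate.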