[RFC,v1,08/10] ifcvt: add if-conversion to conditional-zero instructions
Checks
Commit Message
Some architectures, as it the case on RISC-V with the proposed
ZiCondOps and the vendor-defined XVentanaCondOps, define a
conditional-zero instruction that is equivalent to:
- the positive form: rd = (rc != 0) ? rs : 0
- the negated form: rd = (rc == 0) ? rs : 0
While noce_try_store_flag_mask will somewhat work for this case, it
will generate a number of atomic RTX that will misdirect the cost
calculation and may be too long (i.e., 4 RTX and more) to successfully
merge at combine-time.
Instead, we add two new transforms that attempt to build up what we
define as the canonical form of a conditional-zero expression:
(set (match_operand 0 "register_operand" "=r")
(and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
(const_int 0)))
(match_operand 2 "register_operand" "r")))
Architectures that provide a conditional-zero are thus expected to
define an instruction matching this pattern in their backend.
Based on this, we support the following cases:
- noce_try_condzero:
a ? a : b
a ? b : 0 (and then/else swapped)
!a ? b : 0 (and then/else swapped)
- noce_try_condzero_arith:
conditional-plus, conditional-minus, conditional-and,
conditional-or, conditional-xor, conditional-shift,
conditional-and
Given that this is hooked into the CE passes, it is less powerful than
a tree-pass (e.g., it can not transform cases where an extension, such
as for uint16_t operations is in either the then or else-branch
together with the arithmetic) but already covers a good array of cases
and triggers across SPEC CPU 2017.
Adding transformations in a tree pass should come in a future
improvement.
gcc/ChangeLog:
* ifcvt.cc (noce_emit_insn): Add prototype.
(noce_emit_condzero): Helper for noce_try_condzero and
noce_try_condzero_arith transforms.
(noce_try_condzero): New transform.
(noce_try_condzero_arith): New transform for conditional
arithmetic that can be built up by exploiting that the
conditional-zero instruction will inject 0, which acts
as the neutral element for operations.
(noce_process_if_block): Call noce_try_condzero and
noce_try_condzero_arith.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/xventanacondops-and-01.c: New test.
* gcc.target/riscv/xventanacondops-and-02.c: New test.
* gcc.target/riscv/xventanacondops-eq-01.c: New test.
* gcc.target/riscv/xventanacondops-eq-02.c: New test.
* gcc.target/riscv/xventanacondops-lt-01.c: New test.
* gcc.target/riscv/xventanacondops-ne-01.c: New test.
* gcc.target/riscv/xventanacondops-xor-01.c: New test.
Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
---
gcc/ifcvt.cc | 216 ++++++++++++++++++
.../gcc.target/riscv/zicond-and-01.c | 16 ++
.../gcc.target/riscv/zicond-and-02.c | 15 ++
gcc/testsuite/gcc.target/riscv/zicond-eq-01.c | 11 +
gcc/testsuite/gcc.target/riscv/zicond-eq-02.c | 14 ++
gcc/testsuite/gcc.target/riscv/zicond-lt-01.c | 16 ++
gcc/testsuite/gcc.target/riscv/zicond-ne-01.c | 10 +
.../gcc.target/riscv/zicond-xor-01.c | 14 ++
8 files changed, 312 insertions(+)
create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-and-01.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-and-02.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-eq-01.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-eq-02.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-lt-01.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-ne-01.c
create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-xor-01.c
Comments
On Fri, Feb 10, 2023 at 2:47 PM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> Some architectures, as it the case on RISC-V with the proposed
> ZiCondOps and the vendor-defined XVentanaCondOps, define a
> conditional-zero instruction that is equivalent to:
> - the positive form: rd = (rc != 0) ? rs : 0
> - the negated form: rd = (rc == 0) ? rs : 0
>
> While noce_try_store_flag_mask will somewhat work for this case, it
> will generate a number of atomic RTX that will misdirect the cost
> calculation and may be too long (i.e., 4 RTX and more) to successfully
> merge at combine-time.
Can you expand on this? Especially when there are patterns that use
(if_then_else) already.
>
> Instead, we add two new transforms that attempt to build up what we
> define as the canonical form of a conditional-zero expression:
>
> (set (match_operand 0 "register_operand" "=r")
> (and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
> (const_int 0)))
> (match_operand 2 "register_operand" "r")))
Again why are you not using:
(set (reg) (if_then_else (eq_ne (reg) (const_int 0)) (reg) (const_int 0)))
Form instead of the really bad "canonical" form of the above?
Thanks,
Andrew Pinski
>
> Architectures that provide a conditional-zero are thus expected to
> define an instruction matching this pattern in their backend.
>
> Based on this, we support the following cases:
> - noce_try_condzero:
> a ? a : b
> a ? b : 0 (and then/else swapped)
> !a ? b : 0 (and then/else swapped)
> - noce_try_condzero_arith:
> conditional-plus, conditional-minus, conditional-and,
> conditional-or, conditional-xor, conditional-shift,
> conditional-and
>
> Given that this is hooked into the CE passes, it is less powerful than
> a tree-pass (e.g., it can not transform cases where an extension, such
> as for uint16_t operations is in either the then or else-branch
> together with the arithmetic) but already covers a good array of cases
> and triggers across SPEC CPU 2017.
>
> Adding transformations in a tree pass should come in a future
> improvement.
>
> gcc/ChangeLog:
>
> * ifcvt.cc (noce_emit_insn): Add prototype.
> (noce_emit_condzero): Helper for noce_try_condzero and
> noce_try_condzero_arith transforms.
> (noce_try_condzero): New transform.
> (noce_try_condzero_arith): New transform for conditional
> arithmetic that can be built up by exploiting that the
> conditional-zero instruction will inject 0, which acts
> as the neutral element for operations.
> (noce_process_if_block): Call noce_try_condzero and
> noce_try_condzero_arith.
>
> gcc/testsuite/ChangeLog:
>
> * gcc.target/riscv/xventanacondops-and-01.c: New test.
> * gcc.target/riscv/xventanacondops-and-02.c: New test.
> * gcc.target/riscv/xventanacondops-eq-01.c: New test.
> * gcc.target/riscv/xventanacondops-eq-02.c: New test.
> * gcc.target/riscv/xventanacondops-lt-01.c: New test.
> * gcc.target/riscv/xventanacondops-ne-01.c: New test.
> * gcc.target/riscv/xventanacondops-xor-01.c: New test.
>
> Signed-off-by: Philipp Tomsich <philipp.tomsich@vrull.eu>
> ---
>
> gcc/ifcvt.cc | 216 ++++++++++++++++++
> .../gcc.target/riscv/zicond-and-01.c | 16 ++
> .../gcc.target/riscv/zicond-and-02.c | 15 ++
> gcc/testsuite/gcc.target/riscv/zicond-eq-01.c | 11 +
> gcc/testsuite/gcc.target/riscv/zicond-eq-02.c | 14 ++
> gcc/testsuite/gcc.target/riscv/zicond-lt-01.c | 16 ++
> gcc/testsuite/gcc.target/riscv/zicond-ne-01.c | 10 +
> .../gcc.target/riscv/zicond-xor-01.c | 14 ++
> 8 files changed, 312 insertions(+)
> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-and-01.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-and-02.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-eq-01.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-eq-02.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-lt-01.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-ne-01.c
> create mode 100644 gcc/testsuite/gcc.target/riscv/zicond-xor-01.c
>
> diff --git a/gcc/ifcvt.cc b/gcc/ifcvt.cc
> index 008796838f7..7ac3bd8f18e 100644
> --- a/gcc/ifcvt.cc
> +++ b/gcc/ifcvt.cc
> @@ -97,6 +97,7 @@ static int find_if_case_2 (basic_block, edge, edge);
> static int dead_or_predicable (basic_block, basic_block, basic_block,
> edge, int);
> static void noce_emit_move_insn (rtx, rtx);
> +static rtx_insn *noce_emit_insn (rtx);
> static rtx_insn *block_has_only_trap (basic_block);
> static void need_cmov_or_rewire (basic_block, hash_set<rtx_insn *> *,
> hash_map<rtx_insn *, int> *);
> @@ -787,6 +788,9 @@ static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
> static int noce_try_minmax (struct noce_if_info *);
> static int noce_try_abs (struct noce_if_info *);
> static int noce_try_sign_mask (struct noce_if_info *);
> +static rtx noce_emit_condzero (struct noce_if_info *, rtx, bool = false);
> +static int noce_try_condzero (struct noce_if_info *);
> +static int noce_try_condzero_arith (struct noce_if_info *);
>
> /* Return the comparison code for reversed condition for IF_INFO,
> or UNKNOWN if reversing the condition is not possible. */
> @@ -1664,6 +1668,214 @@ noce_try_addcc (struct noce_if_info *if_info)
> return FALSE;
> }
>
> +/* Helper to noce_try_condzero: cond ? a : 0. */
> +static rtx
> +noce_emit_condzero (struct noce_if_info *if_info, rtx a, bool reverse)
> +{
> + /* The canonical form for a conditional-zero-or-value is:
> + (set (match_operand 0 "register_operand" "=r")
> + (and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
> + (const_int 0)))
> + (match_operand 2 "register_operand" "r")))
> + */
> +
> + machine_mode opmode = GET_MODE (if_info->x);
> + enum rtx_code code = GET_CODE (if_info->cond);
> + rtx cond;
> + rtx op_a = XEXP (if_info->cond, 0);
> + rtx op_b = XEXP (if_info->cond, 1);
> +
> + /* If it is not a EQ/NE comparison against const0_rtx, canonicalize
> + by first synthesizing a truth-value and then building a NE
> + condition around it. */
> + if ((code != EQ && code != NE) || XEXP (if_info->cond, 1) != const0_rtx)
> + {
> + rtx tmp = gen_reg_rtx (opmode);
> +
> + start_sequence ();
> + cond = gen_rtx_fmt_ee (code, opmode, op_a, op_b);
> + if (!noce_emit_insn (gen_rtx_SET (tmp, cond)))
> + {
> + end_sequence ();
> +
> + /* If we can't emit this pattern, try to reverse it and
> + invert the polarity of the second test. */
> + start_sequence ();
> + cond = gen_rtx_fmt_ee (reverse_condition (code), opmode, op_a, op_b);
> + if (!noce_emit_insn (gen_rtx_SET (tmp, cond))) {
> + end_sequence ();
> + return NULL_RTX;
> + }
> +
> + /* We have recovered by reversing the first comparison,
> + so we need change the second one around as well... */
> + reverse = !reverse;
> + }
> + rtx_insn *seq = get_insns ();
> + end_sequence ();
> + emit_insn (seq);
> +
> + /* Set up the second comparison that will be embedded in the
> + canonical conditional-zero-or-value RTX. */
> + code = NE;
> + op_a = tmp;
> + op_b = const0_rtx;
> + }
> +
> + cond = gen_rtx_fmt_ee (reverse ? reverse_condition (code) : code,
> + opmode, op_a, op_b);
> +
> + /* Build (and (neg (eq_or_ne ... const0_rtx)) (reg <a>)) */
> + rtx target = gen_reg_rtx (opmode);
> + rtx czero = gen_rtx_AND (opmode, gen_rtx_NEG (opmode, cond), a);
> + noce_emit_move_insn (target, czero);
> +
> + return target;
> +}
> +
> +/* Use a conditional-zero instruction for "if (test) x = 0;", if available. */
> +static int
> +noce_try_condzero (struct noce_if_info *if_info)
> +{
> + rtx target;
> + rtx_insn *seq;
> + int reversep = 0;
> + /* Keep local copies of the constituent elements of if_info, as we
> + may be changing them. We are not allowed to modify if_info
> + though, as we may fail in this function and can't leave different
> + semantics behind for the next functions. */
> + rtx a = if_info->a;
> + rtx b = if_info->b;
> + rtx x = if_info->x;
> + rtx cond = if_info->cond;
> + enum rtx_code code = GET_CODE (cond);
> + rtx cond_arg0 = XEXP (cond, 0);
> + rtx cond_arg1 = XEXP (cond, 1);
> + rtx orig_b = NULL_RTX;
> +
> + if (!noce_simple_bbs (if_info))
> + return FALSE;
> +
> + /* We may encounter the form "(b != 0) ? b : a", which can be
> + simplified to "b | ((b != 0) ? 0 : a)". */
> + if (code == NE && cond_arg1 == const0_rtx &&
> + REG_P (b) && rtx_equal_p (b, cond_arg0))
> + {
> + orig_b = b;
> + b = const0_rtx;
> + }
> +
> + /* We may encounter the form "(b == 0) ? b : a", which can be
> + simplied to "(b == 0) ? 0 : a". */
> + if (code == EQ && cond_arg1 == const0_rtx &&
> + REG_P (b) && rtx_equal_p (b, cond_arg0))
> + {
> + b = const0_rtx;
> + }
> +
> + start_sequence ();
> +
> + if ((a == const0_rtx && (REG_P (b) || rtx_equal_p (b, x)))
> + || ((reversep = (noce_reversed_cond_code (if_info) != UNKNOWN))
> + && b == const0_rtx && (REG_P (a) || rtx_equal_p (a, x))))
> + {
> + target = noce_emit_condzero(if_info, reversep ? a : b, reversep);
> +
> + /* Handle the case where we replace b in "(b != 0) ? b : a" with
> + with const0_rtx to then emit "b | ((b != 0) ? 0 : a)". */
> + if (orig_b && target)
> + target = expand_simple_binop (GET_MODE (x), IOR, orig_b,
> + target, x, 0, OPTAB_WIDEN);
> +
> + if (target)
> + {
> + if (target != if_info->x)
> + noce_emit_move_insn (if_info->x, target);
> +
> + seq = end_ifcvt_sequence (if_info);
> + if (!seq || !targetm.noce_conversion_profitable_p (seq, if_info))
> + return FALSE;
> +
> + emit_insn_before_setloc (seq, if_info->jump,
> + INSN_LOCATION (if_info->insn_a));
> + if_info->transform_name = "noce_try_condzero";
> +
> + return TRUE;
> + }
> + }
> +
> + end_sequence ();
> +
> + return FALSE;
> +}
> +
> +/* Convert "if (test) x op= a;" to a branchless sequence using the
> + canonical form for a conditional-zero. */
> +static int
> +noce_try_condzero_arith (struct noce_if_info *if_info)
> +{
> + rtx target;
> + rtx_insn *seq;
> + rtx_code op = GET_CODE (if_info->a);
> + const rtx arg0 = XEXP (if_info->a, 0);
> + const rtx arg1 = XEXP (if_info->a, 1);
> +
> + if (!noce_simple_bbs (if_info))
> + return FALSE;
> +
> + /* Check for no else condition. */
> + if (!rtx_equal_p (if_info->x, if_info->b))
> + return FALSE;
> +
> + if (op != PLUS && op != MINUS && op != IOR && op != XOR &&
> + op != ASHIFT && op != ASHIFTRT && op != LSHIFTRT && op != AND)
> + return FALSE;
> +
> + if (!rtx_equal_p (if_info->x, arg0))
> + return FALSE;
> +
> + start_sequence ();
> +
> + target = noce_emit_condzero(if_info, arg1, op != AND ? true : false);
> +
> + if (target)
> + {
> + rtx op1 = if_info->x;
> +
> + if (op == AND)
> + {
> + /* Emit "tmp = x & val;" followed by "tmp |= !cond ? x : 0;" */
> + op1 = expand_simple_binop (GET_MODE (if_info->x), AND, op1,
> + arg1, NULL_RTX, 0, OPTAB_WIDEN);
> + op = IOR;
> + }
> +
> + if (op1)
> + target = expand_simple_binop (GET_MODE (if_info->x), op, op1,
> + target, if_info->x, 0, OPTAB_WIDEN);
> + }
> +
> + if (target)
> + {
> + if (target != if_info->x)
> + noce_emit_move_insn (if_info->x, target);
> +
> + seq = end_ifcvt_sequence (if_info);
> + if (!seq || !targetm.noce_conversion_profitable_p (seq, if_info))
> + return FALSE;
> +
> + emit_insn_before_setloc(seq, if_info->jump,
> + INSN_LOCATION(if_info->insn_a));
> + if_info->transform_name = "noce_try_condzero_arith";
> +
> + return TRUE;
> + }
> +
> + end_sequence ();
> +
> + return FALSE;
> +}
> +
> /* Convert "if (test) x = 0;" to "x &= -(test == 0);" */
>
> static int
> @@ -3967,8 +4179,12 @@ noce_process_if_block (struct noce_if_info *if_info)
> {
> if (noce_try_addcc (if_info))
> goto success;
> + if (noce_try_condzero (if_info))
> + goto success;
> if (noce_try_store_flag_mask (if_info))
> goto success;
> + if (noce_try_condzero_arith (if_info))
> + goto success;
> if (HAVE_conditional_move
> && noce_try_cmove_arith (if_info))
> goto success;
> diff --git a/gcc/testsuite/gcc.target/riscv/zicond-and-01.c b/gcc/testsuite/gcc.target/riscv/zicond-and-01.c
> new file mode 100644
> index 00000000000..d9b0ff00756
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zicond-and-01.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zicond -mabi=lp64 -mbranch-cost=4" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
> +
> +long and1(long a, long b, long c, long d)
> +{
> + if (c < d)
> + a &= b;
> +
> + return a;
> +}
> +
> +/* { dg-final { scan-assembler-times "and\t" 1 } } */
> +/* { dg-final { scan-assembler-times "slt" 1 } } */
> +/* { dg-final { scan-assembler-times "czero.nez" 1 } } */
> +/* { dg-final { scan-assembler-times "or\t" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zicond-and-02.c b/gcc/testsuite/gcc.target/riscv/zicond-and-02.c
> new file mode 100644
> index 00000000000..80f417cfb54
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zicond-and-02.c
> @@ -0,0 +1,15 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zicond -mabi=lp64 -mbranch-cost=4" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
> +
> +int and2(int a, int b, long c)
> +{
> + if (c)
> + a &= b;
> +
> + return a;
> +}
> +
> +/* { dg-final { scan-assembler-times "and\t" 1 } } */
> +/* { dg-final { scan-assembler-times "czero.nez" 1 } } */
> +/* { dg-final { scan-assembler-times "or\t" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zicond-eq-01.c b/gcc/testsuite/gcc.target/riscv/zicond-eq-01.c
> new file mode 100644
> index 00000000000..4f933c1db60
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zicond-eq-01.c
> @@ -0,0 +1,11 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> +
> +long
> +eq1 (long a, long b)
> +{
> + return (a == 0) ? b : 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "czero.nez" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zicond-eq-02.c b/gcc/testsuite/gcc.target/riscv/zicond-eq-02.c
> new file mode 100644
> index 00000000000..a7bc747ab1d
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zicond-eq-02.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> +
> +long
> +eq2 (long a, long b)
> +{
> + if (a == 0)
> + return b;
> +
> + return 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "czero.nez" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zicond-lt-01.c b/gcc/testsuite/gcc.target/riscv/zicond-lt-01.c
> new file mode 100644
> index 00000000000..830bfc6449f
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zicond-lt-01.c
> @@ -0,0 +1,16 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
> +
> +long long sink (long long);
> +
> +long long lt3 (long long a, long long b)
> +{
> + if (a < b)
> + b = 0;
> +
> + return sink(b);
> +}
> +
> +/* { dg-final { scan-assembler-times "czero.nez\t" 1 } } */
> +/* { dg-final { scan-assembler-times "slt\t" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zicond-ne-01.c b/gcc/testsuite/gcc.target/riscv/zicond-ne-01.c
> new file mode 100644
> index 00000000000..f25e601ae3c
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zicond-ne-01.c
> @@ -0,0 +1,10 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
> +
> +long long ne1(long long a, long long b)
> +{
> + return (a != 0) ? b : 0;
> +}
> +
> +/* { dg-final { scan-assembler-times "czero.eqz" 1 } } */
> diff --git a/gcc/testsuite/gcc.target/riscv/zicond-xor-01.c b/gcc/testsuite/gcc.target/riscv/zicond-xor-01.c
> new file mode 100644
> index 00000000000..c45a3be2680
> --- /dev/null
> +++ b/gcc/testsuite/gcc.target/riscv/zicond-xor-01.c
> @@ -0,0 +1,14 @@
> +/* { dg-do compile } */
> +/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
> +/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
> +
> +long xor1(long crc, long poly)
> +{
> + if (crc & 1)
> + crc ^= poly;
> +
> + return crc;
> +}
> +
> +/* { dg-final { scan-assembler-times "czero.eqz" 1 } } */
> +/* { dg-final { scan-assembler-times "xor\t" 1 } } */
> --
> 2.34.1
>
On 2/10/23 15:41, Philipp Tomsich wrote:
> Some architectures, as it the case on RISC-V with the proposed
> ZiCondOps and the vendor-defined XVentanaCondOps, define a
> conditional-zero instruction that is equivalent to:
> - the positive form: rd = (rc != 0) ? rs : 0
> - the negated form: rd = (rc == 0) ? rs : 0
>
> While noce_try_store_flag_mask will somewhat work for this case, it
> will generate a number of atomic RTX that will misdirect the cost
> calculation and may be too long (i.e., 4 RTX and more) to successfully
> merge at combine-time.
>
> Instead, we add two new transforms that attempt to build up what we
> define as the canonical form of a conditional-zero expression:
>
> (set (match_operand 0 "register_operand" "=r")
> (and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
> (const_int 0)))
> (match_operand 2 "register_operand" "r")))
>
> Architectures that provide a conditional-zero are thus expected to
> define an instruction matching this pattern in their backend.
>
> Based on this, we support the following cases:
> - noce_try_condzero:
> a ? a : b
> a ? b : 0 (and then/else swapped)
> !a ? b : 0 (and then/else swapped)
> - noce_try_condzero_arith:
> conditional-plus, conditional-minus, conditional-and,
> conditional-or, conditional-xor, conditional-shift,
> conditional-and
>
> Given that this is hooked into the CE passes, it is less powerful than
> a tree-pass (e.g., it can not transform cases where an extension, such
> as for uint16_t operations is in either the then or else-branch
> together with the arithmetic) but already covers a good array of cases
> and triggers across SPEC CPU 2017.
>
> Adding transformations in a tree pass should come in a future
> improvement.
>
> gcc/ChangeLog:
>
> * ifcvt.cc (noce_emit_insn): Add prototype.
> (noce_emit_condzero): Helper for noce_try_condzero and
> noce_try_condzero_arith transforms.
> (noce_try_condzero): New transform.
> (noce_try_condzero_arith): New transform for conditional
> arithmetic that can be built up by exploiting that the
> conditional-zero instruction will inject 0, which acts
> as the neutral element for operations.
> (noce_process_if_block): Call noce_try_condzero and
> noce_try_condzero_arith.
A few things we've noted while working with this patch internally.
1. The canonical conditional-zero-or-value requires operand 2 to be a
register. That will tend to inhibit if-conversion of something like a
shift-by-constant. This showed up somewhere in leela.
2. Internally we fixed #1 by forcing that argument into a register when
it was a constant and reload hasn't completed. This (naturally)
resulted in more if-conversions happening. That in turn caused
perlbench to fail. This was tracked down to a conditional-and sequence.
I think the call to noce_emit_condzero needs adjustment to pass arg0
instead of arg1 when OP == AND.
3. The canaonical conditional-zero-or-value assumes the target can do a
generic SEQ/SNE of two register values. As you know, on RISC-V we have
SEQZ/SNEZ. So we've added another fallback path to handle that case in
noce_emit_condzero. You subtract the two values, then you can do an
SEQZ/SNEZ on the result.
Formatting nits noted below.
> + if (!noce_emit_insn (gen_rtx_SET (tmp, cond))) {
> + end_sequence ();
> + return NULL_RTX;
> + }
Open-curly on new line, indented two places from the IF, similarly for
the close curly. That will probably require adjusting the indentation
of the statements.
> + if (code == NE && cond_arg1 == const0_rtx &&
> + REG_P (b) && rtx_equal_p (b, cond_arg0))
Don't end with &&, instead bring it down indented one position from the
open parenthesis.
> + {
> + orig_b = b;
> + b = const0_rtx;
> + }
> +
> + /* We may encounter the form "(b == 0) ? b : a", which can be
> + simplied to "(b == 0) ? 0 : a". */
> + if (code == EQ && cond_arg1 == const0_rtx &&
> + REG_P (b) && rtx_equal_p (b, cond_arg0))
Likewise.
> + {
> + target = noce_emit_condzero(if_info, reversep ? a : b, reversep);
Missing whitespace before open-paren of function call.
> +
> + if (op != PLUS && op != MINUS && op != IOR && op != XOR &&
> + op != ASHIFT && op != ASHIFTRT && op != LSHIFTRT && op != AND)
Bring the "&&" down to the next line here too.
> +
> + if (!rtx_equal_p (if_info->x, arg0))
> + return FALSE;
?!? What's this for?
> +
> + target = noce_emit_condzero(if_info, arg1, op != AND ? true : false);
Missing space before open-paren.
> +
> + emit_insn_before_setloc(seq, if_info->jump,
> + INSN_LOCATION(if_info->insn_a));
Here too.
I think the ChangeLog entry for the testsuite needs its filenames
updated :-)
While this was submitted before stage1 closed, I think we're probably
too late for it to go forward until we re-open stage1.
jeff
Andrew Pinski via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> On Fri, Feb 10, 2023 at 2:47 PM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
>>
>> Some architectures, as it the case on RISC-V with the proposed
>> ZiCondOps and the vendor-defined XVentanaCondOps, define a
>> conditional-zero instruction that is equivalent to:
>> - the positive form: rd = (rc != 0) ? rs : 0
>> - the negated form: rd = (rc == 0) ? rs : 0
>>
>> While noce_try_store_flag_mask will somewhat work for this case, it
>> will generate a number of atomic RTX that will misdirect the cost
>> calculation and may be too long (i.e., 4 RTX and more) to successfully
>> merge at combine-time.
>
> Can you expand on this? Especially when there are patterns that use
> (if_then_else) already.
>
>>
>> Instead, we add two new transforms that attempt to build up what we
>> define as the canonical form of a conditional-zero expression:
>>
>> (set (match_operand 0 "register_operand" "=r")
>> (and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
>> (const_int 0)))
>> (match_operand 2 "register_operand" "r")))
>
> Again why are you not using:
> (set (reg) (if_then_else (eq_ne (reg) (const_int 0)) (reg) (const_int 0)))
> Form instead of the really bad "canonical" form of the above?
I don't think one form is inherently better than the other if we think
about just this one case. But I agree that the if_then_else form is
currently the canonical form for the operation, and extends more
naturally to the general case. AArch64 already matches specifically
for it (with xzr providing the zero value).
Thanks,
Richard
On 2/13/23 10:32, Richard Sandiford via Gcc-patches wrote:
> Andrew Pinski via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
>> On Fri, Feb 10, 2023 at 2:47 PM Philipp Tomsich
>> <philipp.tomsich@vrull.eu> wrote:
>>>
>>> Some architectures, as it the case on RISC-V with the proposed
>>> ZiCondOps and the vendor-defined XVentanaCondOps, define a
>>> conditional-zero instruction that is equivalent to:
>>> - the positive form: rd = (rc != 0) ? rs : 0
>>> - the negated form: rd = (rc == 0) ? rs : 0
>>>
>>> While noce_try_store_flag_mask will somewhat work for this case, it
>>> will generate a number of atomic RTX that will misdirect the cost
>>> calculation and may be too long (i.e., 4 RTX and more) to successfully
>>> merge at combine-time.
>>
>> Can you expand on this? Especially when there are patterns that use
>> (if_then_else) already.
>>
>>>
>>> Instead, we add two new transforms that attempt to build up what we
>>> define as the canonical form of a conditional-zero expression:
>>>
>>> (set (match_operand 0 "register_operand" "=r")
>>> (and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
>>> (const_int 0)))
>>> (match_operand 2 "register_operand" "r")))
>>
>> Again why are you not using:
>> (set (reg) (if_then_else (eq_ne (reg) (const_int 0)) (reg) (const_int 0)))
>> Form instead of the really bad "canonical" form of the above?
>
> I don't think one form is inherently better than the other if we think
> about just this one case. But I agree that the if_then_else form is
> currently the canonical form for the operation, and extends more
> naturally to the general case. AArch64 already matches specifically
> for it (with xzr providing the zero value).
The more I think about it, the more I prefer the if-then-else form. My
biggest hesitation with getting behind one form or the other is a lack
of knowledge about which is likely better interpreted by simplify-rtx
and friends -- though it may not matter much in practice.
Jeff
On Mon, Feb 13, 2023 at 10:43 AM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 2/13/23 10:32, Richard Sandiford via Gcc-patches wrote:
> > Andrew Pinski via Gcc-patches <gcc-patches@gcc.gnu.org> writes:
> >> On Fri, Feb 10, 2023 at 2:47 PM Philipp Tomsich
> >> <philipp.tomsich@vrull.eu> wrote:
> >>>
> >>> Some architectures, as it the case on RISC-V with the proposed
> >>> ZiCondOps and the vendor-defined XVentanaCondOps, define a
> >>> conditional-zero instruction that is equivalent to:
> >>> - the positive form: rd = (rc != 0) ? rs : 0
> >>> - the negated form: rd = (rc == 0) ? rs : 0
> >>>
> >>> While noce_try_store_flag_mask will somewhat work for this case, it
> >>> will generate a number of atomic RTX that will misdirect the cost
> >>> calculation and may be too long (i.e., 4 RTX and more) to successfully
> >>> merge at combine-time.
> >>
> >> Can you expand on this? Especially when there are patterns that use
> >> (if_then_else) already.
> >>
> >>>
> >>> Instead, we add two new transforms that attempt to build up what we
> >>> define as the canonical form of a conditional-zero expression:
> >>>
> >>> (set (match_operand 0 "register_operand" "=r")
> >>> (and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
> >>> (const_int 0)))
> >>> (match_operand 2 "register_operand" "r")))
> >>
> >> Again why are you not using:
> >> (set (reg) (if_then_else (eq_ne (reg) (const_int 0)) (reg) (const_int 0)))
> >> Form instead of the really bad "canonical" form of the above?
> >
> > I don't think one form is inherently better than the other if we think
> > about just this one case. But I agree that the if_then_else form is
> > currently the canonical form for the operation, and extends more
> > naturally to the general case. AArch64 already matches specifically
> > for it (with xzr providing the zero value).
> The more I think about it, the more I prefer the if-then-else form. My
> biggest hesitation with getting behind one form or the other is a lack
> of knowledge about which is likely better interpreted by simplify-rtx
> and friends -- though it may not matter much in practice.
In the case of IF_THEN_ELSE, combine tries a few things:
/* If this is a simple operation applied to an IF_THEN_ELSE, try
applying it to the arms of the IF_THEN_ELSE. This often simplifies
things. Check for cases where both arms are testing the same
condition.
Also see simplify_if_then_else inside combine.cc.
I don't think the first one could be done not using IF_THEN_ELSE.
Thanks,
Andrew Pinski
>
> Jeff
On Mon, 13 Feb 2023, Jeff Law via Gcc-patches wrote:
> 3. The canaonical conditional-zero-or-value assumes the target can do a
> generic SEQ/SNE of two register values. As you know, on RISC-V we have
> SEQZ/SNEZ. So we've added another fallback path to handle that case in
> noce_emit_condzero. You subtract the two values, then you can do an SEQZ/SNEZ
> on the result.
NB these machine operations are identical to MIPSr6 SELEQZ and SELNEZ
instructions (cf. ISA_HAS_SEL), so why can't we just duplicate what the
MIPS backend does? Or did the MIPS backend do something wrong here?
Maciej
On 2/28/23 09:42, Maciej W. Rozycki wrote:
> On Mon, 13 Feb 2023, Jeff Law via Gcc-patches wrote:
>
>> 3. The canaonical conditional-zero-or-value assumes the target can do a
>> generic SEQ/SNE of two register values. As you know, on RISC-V we have
>> SEQZ/SNEZ. So we've added another fallback path to handle that case in
>> noce_emit_condzero. You subtract the two values, then you can do an SEQZ/SNEZ
>> on the result.
>
> NB these machine operations are identical to MIPSr6 SELEQZ and SELNEZ
> instructions (cf. ISA_HAS_SEL), so why can't we just duplicate what the
> MIPS backend does? Or did the MIPS backend do something wrong here?
That's the form that Andrew (and subsequently I) both suggested using.
Switching to that form may in turn make some of these other issues go away.
jeff
@@ -97,6 +97,7 @@ static int find_if_case_2 (basic_block, edge, edge);
static int dead_or_predicable (basic_block, basic_block, basic_block,
edge, int);
static void noce_emit_move_insn (rtx, rtx);
+static rtx_insn *noce_emit_insn (rtx);
static rtx_insn *block_has_only_trap (basic_block);
static void need_cmov_or_rewire (basic_block, hash_set<rtx_insn *> *,
hash_map<rtx_insn *, int> *);
@@ -787,6 +788,9 @@ static rtx noce_get_alt_condition (struct noce_if_info *, rtx, rtx_insn **);
static int noce_try_minmax (struct noce_if_info *);
static int noce_try_abs (struct noce_if_info *);
static int noce_try_sign_mask (struct noce_if_info *);
+static rtx noce_emit_condzero (struct noce_if_info *, rtx, bool = false);
+static int noce_try_condzero (struct noce_if_info *);
+static int noce_try_condzero_arith (struct noce_if_info *);
/* Return the comparison code for reversed condition for IF_INFO,
or UNKNOWN if reversing the condition is not possible. */
@@ -1664,6 +1668,214 @@ noce_try_addcc (struct noce_if_info *if_info)
return FALSE;
}
+/* Helper to noce_try_condzero: cond ? a : 0. */
+static rtx
+noce_emit_condzero (struct noce_if_info *if_info, rtx a, bool reverse)
+{
+ /* The canonical form for a conditional-zero-or-value is:
+ (set (match_operand 0 "register_operand" "=r")
+ (and (neg (eq_or_ne (match_operand 1 "register_operand" "r")
+ (const_int 0)))
+ (match_operand 2 "register_operand" "r")))
+ */
+
+ machine_mode opmode = GET_MODE (if_info->x);
+ enum rtx_code code = GET_CODE (if_info->cond);
+ rtx cond;
+ rtx op_a = XEXP (if_info->cond, 0);
+ rtx op_b = XEXP (if_info->cond, 1);
+
+ /* If it is not a EQ/NE comparison against const0_rtx, canonicalize
+ by first synthesizing a truth-value and then building a NE
+ condition around it. */
+ if ((code != EQ && code != NE) || XEXP (if_info->cond, 1) != const0_rtx)
+ {
+ rtx tmp = gen_reg_rtx (opmode);
+
+ start_sequence ();
+ cond = gen_rtx_fmt_ee (code, opmode, op_a, op_b);
+ if (!noce_emit_insn (gen_rtx_SET (tmp, cond)))
+ {
+ end_sequence ();
+
+ /* If we can't emit this pattern, try to reverse it and
+ invert the polarity of the second test. */
+ start_sequence ();
+ cond = gen_rtx_fmt_ee (reverse_condition (code), opmode, op_a, op_b);
+ if (!noce_emit_insn (gen_rtx_SET (tmp, cond))) {
+ end_sequence ();
+ return NULL_RTX;
+ }
+
+ /* We have recovered by reversing the first comparison,
+ so we need change the second one around as well... */
+ reverse = !reverse;
+ }
+ rtx_insn *seq = get_insns ();
+ end_sequence ();
+ emit_insn (seq);
+
+ /* Set up the second comparison that will be embedded in the
+ canonical conditional-zero-or-value RTX. */
+ code = NE;
+ op_a = tmp;
+ op_b = const0_rtx;
+ }
+
+ cond = gen_rtx_fmt_ee (reverse ? reverse_condition (code) : code,
+ opmode, op_a, op_b);
+
+ /* Build (and (neg (eq_or_ne ... const0_rtx)) (reg <a>)) */
+ rtx target = gen_reg_rtx (opmode);
+ rtx czero = gen_rtx_AND (opmode, gen_rtx_NEG (opmode, cond), a);
+ noce_emit_move_insn (target, czero);
+
+ return target;
+}
+
+/* Use a conditional-zero instruction for "if (test) x = 0;", if available. */
+static int
+noce_try_condzero (struct noce_if_info *if_info)
+{
+ rtx target;
+ rtx_insn *seq;
+ int reversep = 0;
+ /* Keep local copies of the constituent elements of if_info, as we
+ may be changing them. We are not allowed to modify if_info
+ though, as we may fail in this function and can't leave different
+ semantics behind for the next functions. */
+ rtx a = if_info->a;
+ rtx b = if_info->b;
+ rtx x = if_info->x;
+ rtx cond = if_info->cond;
+ enum rtx_code code = GET_CODE (cond);
+ rtx cond_arg0 = XEXP (cond, 0);
+ rtx cond_arg1 = XEXP (cond, 1);
+ rtx orig_b = NULL_RTX;
+
+ if (!noce_simple_bbs (if_info))
+ return FALSE;
+
+ /* We may encounter the form "(b != 0) ? b : a", which can be
+ simplified to "b | ((b != 0) ? 0 : a)". */
+ if (code == NE && cond_arg1 == const0_rtx &&
+ REG_P (b) && rtx_equal_p (b, cond_arg0))
+ {
+ orig_b = b;
+ b = const0_rtx;
+ }
+
+ /* We may encounter the form "(b == 0) ? b : a", which can be
+ simplied to "(b == 0) ? 0 : a". */
+ if (code == EQ && cond_arg1 == const0_rtx &&
+ REG_P (b) && rtx_equal_p (b, cond_arg0))
+ {
+ b = const0_rtx;
+ }
+
+ start_sequence ();
+
+ if ((a == const0_rtx && (REG_P (b) || rtx_equal_p (b, x)))
+ || ((reversep = (noce_reversed_cond_code (if_info) != UNKNOWN))
+ && b == const0_rtx && (REG_P (a) || rtx_equal_p (a, x))))
+ {
+ target = noce_emit_condzero(if_info, reversep ? a : b, reversep);
+
+ /* Handle the case where we replace b in "(b != 0) ? b : a" with
+ with const0_rtx to then emit "b | ((b != 0) ? 0 : a)". */
+ if (orig_b && target)
+ target = expand_simple_binop (GET_MODE (x), IOR, orig_b,
+ target, x, 0, OPTAB_WIDEN);
+
+ if (target)
+ {
+ if (target != if_info->x)
+ noce_emit_move_insn (if_info->x, target);
+
+ seq = end_ifcvt_sequence (if_info);
+ if (!seq || !targetm.noce_conversion_profitable_p (seq, if_info))
+ return FALSE;
+
+ emit_insn_before_setloc (seq, if_info->jump,
+ INSN_LOCATION (if_info->insn_a));
+ if_info->transform_name = "noce_try_condzero";
+
+ return TRUE;
+ }
+ }
+
+ end_sequence ();
+
+ return FALSE;
+}
+
+/* Convert "if (test) x op= a;" to a branchless sequence using the
+ canonical form for a conditional-zero. */
+static int
+noce_try_condzero_arith (struct noce_if_info *if_info)
+{
+ rtx target;
+ rtx_insn *seq;
+ rtx_code op = GET_CODE (if_info->a);
+ const rtx arg0 = XEXP (if_info->a, 0);
+ const rtx arg1 = XEXP (if_info->a, 1);
+
+ if (!noce_simple_bbs (if_info))
+ return FALSE;
+
+ /* Check for no else condition. */
+ if (!rtx_equal_p (if_info->x, if_info->b))
+ return FALSE;
+
+ if (op != PLUS && op != MINUS && op != IOR && op != XOR &&
+ op != ASHIFT && op != ASHIFTRT && op != LSHIFTRT && op != AND)
+ return FALSE;
+
+ if (!rtx_equal_p (if_info->x, arg0))
+ return FALSE;
+
+ start_sequence ();
+
+ target = noce_emit_condzero(if_info, arg1, op != AND ? true : false);
+
+ if (target)
+ {
+ rtx op1 = if_info->x;
+
+ if (op == AND)
+ {
+ /* Emit "tmp = x & val;" followed by "tmp |= !cond ? x : 0;" */
+ op1 = expand_simple_binop (GET_MODE (if_info->x), AND, op1,
+ arg1, NULL_RTX, 0, OPTAB_WIDEN);
+ op = IOR;
+ }
+
+ if (op1)
+ target = expand_simple_binop (GET_MODE (if_info->x), op, op1,
+ target, if_info->x, 0, OPTAB_WIDEN);
+ }
+
+ if (target)
+ {
+ if (target != if_info->x)
+ noce_emit_move_insn (if_info->x, target);
+
+ seq = end_ifcvt_sequence (if_info);
+ if (!seq || !targetm.noce_conversion_profitable_p (seq, if_info))
+ return FALSE;
+
+ emit_insn_before_setloc(seq, if_info->jump,
+ INSN_LOCATION(if_info->insn_a));
+ if_info->transform_name = "noce_try_condzero_arith";
+
+ return TRUE;
+ }
+
+ end_sequence ();
+
+ return FALSE;
+}
+
/* Convert "if (test) x = 0;" to "x &= -(test == 0);" */
static int
@@ -3967,8 +4179,12 @@ noce_process_if_block (struct noce_if_info *if_info)
{
if (noce_try_addcc (if_info))
goto success;
+ if (noce_try_condzero (if_info))
+ goto success;
if (noce_try_store_flag_mask (if_info))
goto success;
+ if (noce_try_condzero_arith (if_info))
+ goto success;
if (HAVE_conditional_move
&& noce_try_cmove_arith (if_info))
goto success;
new file mode 100644
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64 -mbranch-cost=4" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
+
+long and1(long a, long b, long c, long d)
+{
+ if (c < d)
+ a &= b;
+
+ return a;
+}
+
+/* { dg-final { scan-assembler-times "and\t" 1 } } */
+/* { dg-final { scan-assembler-times "slt" 1 } } */
+/* { dg-final { scan-assembler-times "czero.nez" 1 } } */
+/* { dg-final { scan-assembler-times "or\t" 1 } } */
new file mode 100644
@@ -0,0 +1,15 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64 -mbranch-cost=4" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
+
+int and2(int a, int b, long c)
+{
+ if (c)
+ a &= b;
+
+ return a;
+}
+
+/* { dg-final { scan-assembler-times "and\t" 1 } } */
+/* { dg-final { scan-assembler-times "czero.nez" 1 } } */
+/* { dg-final { scan-assembler-times "or\t" 1 } } */
new file mode 100644
@@ -0,0 +1,11 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+long
+eq1 (long a, long b)
+{
+ return (a == 0) ? b : 0;
+}
+
+/* { dg-final { scan-assembler-times "czero.nez" 1 } } */
new file mode 100644
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+long
+eq2 (long a, long b)
+{
+ if (a == 0)
+ return b;
+
+ return 0;
+}
+
+/* { dg-final { scan-assembler-times "czero.nez" 1 } } */
new file mode 100644
@@ -0,0 +1,16 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
+
+long long sink (long long);
+
+long long lt3 (long long a, long long b)
+{
+ if (a < b)
+ b = 0;
+
+ return sink(b);
+}
+
+/* { dg-final { scan-assembler-times "czero.nez\t" 1 } } */
+/* { dg-final { scan-assembler-times "slt\t" 1 } } */
new file mode 100644
@@ -0,0 +1,10 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" } } */
+
+long long ne1(long long a, long long b)
+{
+ return (a != 0) ? b : 0;
+}
+
+/* { dg-final { scan-assembler-times "czero.eqz" 1 } } */
new file mode 100644
@@ -0,0 +1,14 @@
+/* { dg-do compile } */
+/* { dg-options "-march=rv64gc_zicond -mabi=lp64" } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-Og" "-Os" "-Oz" } } */
+
+long xor1(long crc, long poly)
+{
+ if (crc & 1)
+ crc ^= poly;
+
+ return crc;
+}
+
+/* { dg-final { scan-assembler-times "czero.eqz" 1 } } */
+/* { dg-final { scan-assembler-times "xor\t" 1 } } */