diff mbox series

RISC-V: Support vfwnmacc/vfwmsac/vfwnmsac combine lowering

Message ID	20230628115559.116166-1-juzhe.zhong@rivai.ai
State	Unresolved
Headers	Received-SPF: pass (google.com: domain of gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org designates 8.43.85.97 as permitted sender) client-ip=8.43.85.97;
	DMARC-Filter: OpenDMARC Filter v1.4.2 sourceware.org 0D77C3858D35
	From: Juzhe-Zhong <juzhe.zhong@rivai.ai>
	To: gcc-patches@gcc.gnu.org
	Cc: kito.cheng@gmail.com, kito.cheng@sifive.com, palmer@dabbelt.com, palmer@rivosinc.com, jeffreyalaw@gmail.com, rdapp.gcc@gmail.com, Juzhe-Zhong <juzhe.zhong@rivai.ai>
	Subject: [PATCH] RISC-V: Support vfwnmacc/vfwmsac/vfwnmsac combine lowering
	Date: Wed, 28 Jun 2023 19:55:59 +0800
	Message-Id: <20230628115559.116166-1-juzhe.zhong@rivai.ai>
	MIME-Version: 1.0
	Content-Transfer-Encoding: 8bit
	Feedback-ID: bizesmtp:rivai.ai:qybglogicsvrgz:qybglogicsvrgz7a-one-0
	Precedence: list
	Errors-To: gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org
	Sender: "Gcc-patches" <gcc-patches-bounces+ouuuleilei=gmail.com@gcc.gnu.org>
	X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
Series	RISC-V: Support vfwnmacc/vfwmsac/vfwnmsac combine lowering

Checks

Context	Check	Description
snail/gcc-patch-check	warning	Git am fail log

Commit Message

juzhe.zhong@rivai.ai June 28, 2023, 11:55 a.m. UTC

  Similar to vfwmacc. Add combine patterns as follows:

For vfwnmsac:
1. (set (reg) (fma (neg (float_extend (reg))) (float_extend (reg)) (reg)))
2. (set (reg) (fma (neg (float_extend (reg))) (reg) (reg)))

For vfwmsac:
1. (set (reg) (fma (float_extend (reg)) (float_extend (reg)) (neg (reg))))
2. (set (reg) (fma (float_extend (reg)) (reg) (neg (reg))))

For vfwnmacc:
1. (set (reg) (fma (neg (float_extend (reg))) (float_extend (reg)) (neg (reg))))
2. (set (reg) (fma (neg (float_extend (reg))) (reg) (neg (reg))))
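
For reference, the three RTL shapes above correspond to the per-element arithmetic that the RVV specification gives for the widening multiply-accumulate instructions. The sketch below is a scalar C model of the float-to-double case only. The `ref_*` function names are illustrative and do not appear in the patch.

```c
/* Scalar model of the three widening multiply-accumulate forms.
   acc is the wide accumulator (the destination register); a and b are
   the narrow sources, widened before the multiply.  */

/* vfwnmsac: fma (neg (ext a), ext b, acc)  ==>  -(a * b) + acc  */
double ref_vfwnmsac (double acc, float a, float b)
{
  return -((double) a * (double) b) + acc;
}

/* vfwmsac: fma (ext a, ext b, neg acc)  ==>  (a * b) - acc  */
double ref_vfwmsac (double acc, float a, float b)
{
  return (double) a * (double) b - acc;
}

/* vfwnmacc: fma (neg (ext a), ext b, neg acc)  ==>  -(a * b) - acc  */
double ref_vfwnmacc (double acc, float a, float b)
{
  return -((double) a * (double) b) - acc;
}
```

These are the same expressions the new widen-10.c, widen-11.c and widen-12.c tests use in their loop bodies, one per instruction.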

gcc/ChangeLog:

        * config/riscv/autovec-opt.md (*double_widen_fnma<mode>): New pattern.
        (*single_widen_fnma<mode>): Ditto.
        (*double_widen_fms<mode>): Ditto.
        (*single_widen_fms<mode>): Ditto.
        (*double_widen_fnms<mode>): Ditto.
        (*single_widen_fnms<mode>): Ditto.

gcc/testsuite/ChangeLog:

        * gcc.target/riscv/rvv/autovec/widen/widen-10.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen-11.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen-12.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen-complicate-7.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen-complicate-8.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen-complicate-9.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-10.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-11.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run-12.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-10.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-11.c: New test.
        * gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-12.c: New test.

---
 gcc/config/riscv/autovec-opt.md               | 182 ++++++++++++++++++
 .../riscv/rvv/autovec/widen/widen-10.c        |  22 +++
 .../riscv/rvv/autovec/widen/widen-11.c        |  22 +++
 .../riscv/rvv/autovec/widen/widen-12.c        |  22 +++
 .../rvv/autovec/widen/widen-complicate-7.c    |  27 +++
 .../rvv/autovec/widen/widen-complicate-8.c    |  27 +++
 .../rvv/autovec/widen/widen-complicate-9.c    |  27 +++
 .../riscv/rvv/autovec/widen/widen_run-10.c    |  32 +++
 .../riscv/rvv/autovec/widen/widen_run-11.c    |  32 +++
 .../riscv/rvv/autovec/widen/widen_run-12.c    |  32 +++
 .../rvv/autovec/widen/widen_run_zvfh-10.c     |  32 +++
 .../rvv/autovec/widen/widen_run_zvfh-11.c     |  32 +++
 .../rvv/autovec/widen/widen_run_zvfh-12.c     |  32 +++
 13 files changed, 521 insertions(+)
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-7.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-8.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-9.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-12.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-10.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-11.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-12.c

Comments

Jeff Law June 28, 2023, 6:16 p.m. UTC | #1

On 6/28/23 05:55, Juzhe-Zhong wrote:

> +
> +;; This helps to match ext + fnma.
> +(define_insn_and_split "*single_widen_fnma<mode>"
> +  [(set (match_operand:VWEXTF 0 "register_operand")
> +	(fma:VWEXTF
> +	  (neg:VWEXTF
> +	    (float_extend:VWEXTF
> +	      (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand")))
> +	  (match_operand:VWEXTF 3 "register_operand")
> +	  (match_operand:VWEXTF 1 "register_operand")))]
I'd like to understand this better.  It looks like it's meant to be a 
bridge to another pattern.  However, it looks like it would be a 4->1 
pattern without needing a bridge.  So I'd like to know why that code 
isn't working.

Can you send the before/after combine dumps which show this bridge 
pattern being used?

The same concern exists with the other bridge patterns, but I don't 
think I need to see the before/after for each of them.



Thanks,
Jeff

juzhe.zhong@rivai.ai June 28, 2023, 10:10 p.m. UTC | #2

Sure.

https://godbolt.org/z/8857KzTno 

Failed to match this instruction:
(set (reg:VNx2DF 134 [ vect__31.47 ])
    (fma:VNx2DF (neg:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 136 [ vect__28.44 ])))
        (reg:VNx2DF 150 [ vect__8.12 ])
        (reg:VNx2DF 171 [ vect__29.45 ])))



juzhe.zhong@rivai.ai
 

Jeff Law June 28, 2023, 10:43 p.m. UTC | #3

On 6/28/23 16:10, 钟居哲 wrote:
> Sure.
> 
> https://godbolt.org/z/8857KzTno
> 
> Failed to match this instruction:
> (set (reg:VNx2DF 134 [ vect__31.47 ])
>     (fma:VNx2DF (neg:VNx2DF (float_extend:VNx2DF (reg:VNx2SF 136 [ vect__28.44 ])))
>         (reg:VNx2DF 150 [ vect__8.12 ])
>         (reg:VNx2DF 171 [ vect__29.45 ])))
Please attach the full dump.  I would expect to see additional attempts 
with more operands replaced.

jeff

juzhe.zhong@rivai.ai June 28, 2023, 10:56 p.m. UTC | #4

juzhe.zhong@rivai.ai
 

Jeff Law June 29, 2023, 11:43 p.m. UTC | #5

On 6/28/23 16:56, 钟居哲 wrote:
>     Please attach the full dump.  I would expect to see additional attempts
>     with more operands replaced.
Thanks for the dump.  I think this is fundamentally the same issue as the 
widening problem.

Drop those intermediate patterns.  They're not needed/helpful.  You may 
need a dependency height reduction pattern to get the code you want, but 
I see no evidence those extra patterns will solve anything.

jeff

juzhe.zhong@rivai.ai June 30, 2023, 1:14 a.m. UTC | #6

No, reduction patterns won't help.
As I said in the vfwmul patch thread, you should make sure your environment is working and then try again.

Thanks.



juzhe.zhong@rivai.ai
 

Jeff Law June 30, 2023, 1:26 a.m. UTC | #7

On 6/29/23 19:14, juzhe.zhong@rivai.ai wrote:
> No, reduction patterns won't help.
> As I said in vfwmul patch. You should make sure your environment is 
> working then try again.
I've triple checked this already.

I checked it again and your patch does not impact behavior, nor should 
it.   I checked it on top of these trunk commits:

14bfda6084eaca07c842566a34316974907958e2
e714af12e3bee0032d8d226f87d92c9bc46f0269

I checked it with the code from the godbolt links you suggested with the 
options shown in those links.

More importantly, your explanation of what the pattern is supposed to do 
shows a misunderstanding of what combine's capabilities actually are.  A 
bridge or intermediate pattern is not needed here, combine can 
substitute multiple sources in combination attempts as can be clearly 
seen from the dump fragments I posted.

The only reason I didn't reject the patch at the outset was the 
possibility that we were trying to combine more than 4 
instructions, or that something about the number of operands, 
unspecs, or whatever was getting in the way.

This patch is not needed and does not affect code generation.

I would strongly suggest looking at a dependency height reduction 
pattern if you want to optimize that code further.

Jeff

juzhe.zhong@rivai.ai June 30, 2023, 1:32 a.m. UTC | #8

>> I've triple checked this already.
You mean you still don't see vfwmul.vv?

That's odd. Let's wait for Kito or Robin to test this patch.
Then I believe they will see what I mean.

>> I would strongly suggest looking at a dependency height reduction
>> pattern if you want to optimize that code further.
I tried that a long time ago. It turned out to be better to do this in the combine pass, in both GCC and LLVM.

Never mind. I already have this implementation in my downstream tree, so this won't affect my downstream GCC maintenance.
It's OK if this patch is not approved, since I can get the perfect codegen in my downstream.

Thanks.

juzhe.zhong@rivai.ai


Robin Dapp July 3, 2023, 7:48 a.m. UTC | #9

To reiterate, this is OK from my side.  As discussed in the other
thread, Jeff would like to have more info on whether a bridge pattern
is needed at all and I agreed to get back to it in a while.  Until
then, we can merge this.

Regards
 Robin

Kito Cheng July 3, 2023, 9:01 a.m. UTC | #10

I tried this locally: widen-complicate-7.c, widen-complicate-8.c and
widen-complicate-9.c need those bridge patterns, otherwise they fail to
combine. So this gets an explicit LGTM from my side.
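
For orientation, the loop in widen-complicate-7.c can be written out for the double/float case as below. This is only a scalar restatement of the test's macro. The function name `fnmsac4` and the `const` qualifiers are illustrative and not part of the patch. Unlike widen-10.c, each narrow input feeds more than one widening multiply-accumulate here: `a` is used three times, `b` and `a2` twice each, and `b2` once.

```c
/* Scalar restatement of the widen-complicate-7.c loop for
   TYPE1 = double, TYPE2 = float.  Every statement has the
   vfwnmsac shape: dst = -(a * b) + dst.  */
void fnmsac4 (double *restrict dst, double *restrict dst2,
              double *restrict dst3, double *restrict dst4,
              const float *restrict a, const float *restrict b,
              const float *restrict a2, const float *restrict b2, int n)
{
  for (int i = 0; i < n; i++)
    {
      dst[i]  += -((double) a[i]  * (double) b[i]);   /* uses a,  b   */
      dst2[i] += -((double) a2[i] * (double) b[i]);   /* uses a2, b   */
      dst3[i] += -((double) a2[i] * (double) a[i]);   /* uses a2, a   */
      dst4[i] += -((double) a[i]  * (double) b2[i]);  /* uses a,  b2  */
    }
}
```

The test expects eight `vfwnmsac.vv` instructions in total: four statements for each of the two type pairs it instantiates.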


juzhe.zhong@rivai.ai July 3, 2023, 9:12 a.m. UTC | #11

Thanks kito. 
Lehua will merge it for me.

juzhe.zhong@rivai.ai


Lehua Ding July 3, 2023, 9:27 a.m. UTC | #12

Committed, thanks Robin, Kito, and Jeff.

Patch

diff --git a/gcc/config/riscv/autovec-opt.md b/gcc/config/riscv/autovec-opt.md
index 1a1cef0eaa5..0c0ba685d6b 100644
--- a/gcc/config/riscv/autovec-opt.md
+++ b/gcc/config/riscv/autovec-opt.md
@@ -502,3 +502,185 @@ 
   }
   [(set_attr "type" "vfwmuladd")
    (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] VFWNMSAC
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vfwnmsac.vv
+;; -------------------------------------------------------------------------
+
+;; Combine ext + ext + fnma ===> widen fnma.
+;; In most circumstances, the loop vectorizer generates the following IR:
+;; vect__8.176_40 = (vector([2,2]) double) vect__7.175_41;
+;; vect__11.180_35 = (vector([2,2]) double) vect__10.179_36;
+;; vect__13.182_33 = .FNMA (vect__11.180_35, vect__8.176_40, vect__4.172_45);
+(define_insn_and_split "*double_widen_fnma<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand")
+	(fma:VWEXTF
+	  (neg:VWEXTF
+	    (float_extend:VWEXTF
+	      (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand")))
+	  (float_extend:VWEXTF
+	    (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand"))
+	  (match_operand:VWEXTF 1 "register_operand")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    riscv_vector::emit_vlmax_fp_ternary_insn (code_for_pred_widen_mul_neg (PLUS, <MODE>mode),
+					      riscv_vector::RVV_WIDEN_TERNOP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vfwmuladd")
+   (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+
+;; This helps to match ext + fnma.
+(define_insn_and_split "*single_widen_fnma<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand")
+	(fma:VWEXTF
+	  (neg:VWEXTF
+	    (float_extend:VWEXTF
+	      (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand")))
+	  (match_operand:VWEXTF 3 "register_operand")
+	  (match_operand:VWEXTF 1 "register_operand")))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_extend (<MODE>mode);
+    rtx tmp = gen_reg_rtx (<MODE>mode);
+    rtx ext_ops[] = {tmp, operands[2]};
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ext_ops);
+
+    rtx dst = expand_ternary_op (<MODE>mode, fnma_optab, tmp, operands[3],
+				 operands[1], operands[0], 0);
+    emit_move_insn (operands[0], dst);
+    DONE;
+  }
+  [(set_attr "type" "vfwmuladd")
+   (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] VFWMSAC
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vfwmsac.vv
+;; -------------------------------------------------------------------------
+
+;; Combine ext + ext + fms ===> widen fms.
+;; In most circumstances, the loop vectorizer generates the following IR:
+;; vect__8.176_40 = (vector([2,2]) double) vect__7.175_41;
+;; vect__11.180_35 = (vector([2,2]) double) vect__10.179_36;
+;; vect__13.182_33 = .FMS (vect__11.180_35, vect__8.176_40, vect__4.172_45);
+(define_insn_and_split "*double_widen_fms<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand")
+	(fma:VWEXTF
+	  (float_extend:VWEXTF
+	    (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand"))
+	  (float_extend:VWEXTF
+	    (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand"))
+	  (neg:VWEXTF
+	    (match_operand:VWEXTF 1 "register_operand"))))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    riscv_vector::emit_vlmax_fp_ternary_insn (code_for_pred_widen_mul (MINUS, <MODE>mode),
+					      riscv_vector::RVV_WIDEN_TERNOP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vfwmuladd")
+   (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+
+;; This helps to match ext + fms.
+(define_insn_and_split "*single_widen_fms<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand")
+	(fma:VWEXTF
+	  (float_extend:VWEXTF
+	    (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand"))
+	  (match_operand:VWEXTF 3 "register_operand")
+	  (neg:VWEXTF
+	    (match_operand:VWEXTF 1 "register_operand"))))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_extend (<MODE>mode);
+    rtx tmp = gen_reg_rtx (<MODE>mode);
+    rtx ext_ops[] = {tmp, operands[2]};
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ext_ops);
+
+    rtx dst = expand_ternary_op (<MODE>mode, fms_optab, tmp, operands[3],
+				 operands[1], operands[0], 0);
+    emit_move_insn (operands[0], dst);
+    DONE;
+  }
+  [(set_attr "type" "vfwmuladd")
+   (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+
+;; -------------------------------------------------------------------------
+;; ---- [FP] VFWNMACC
+;; -------------------------------------------------------------------------
+;; Includes:
+;; - vfwnmacc.vv
+;; -------------------------------------------------------------------------
+
+;; Combine ext + ext + fnms ===> widen fnms.
+;; In most circumstances, the loop vectorizer generates the following IR:
+;; vect__8.176_40 = (vector([2,2]) double) vect__7.175_41;
+;; vect__11.180_35 = (vector([2,2]) double) vect__10.179_36;
+;; vect__13.182_33 = .FNMS (vect__11.180_35, vect__8.176_40, vect__4.172_45);
+(define_insn_and_split "*double_widen_fnms<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand")
+	(fma:VWEXTF
+	  (neg:VWEXTF
+	    (float_extend:VWEXTF
+	      (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand")))
+	  (float_extend:VWEXTF
+	    (match_operand:<V_DOUBLE_TRUNC> 3 "register_operand"))
+	  (neg:VWEXTF
+	    (match_operand:VWEXTF 1 "register_operand"))))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    riscv_vector::emit_vlmax_fp_ternary_insn (code_for_pred_widen_mul_neg (MINUS, <MODE>mode),
+					      riscv_vector::RVV_WIDEN_TERNOP, operands);
+    DONE;
+  }
+  [(set_attr "type" "vfwmuladd")
+   (set_attr "mode" "<V_DOUBLE_TRUNC>")])
+
+;; This helps to match ext + fnms.
+(define_insn_and_split "*single_widen_fnms<mode>"
+  [(set (match_operand:VWEXTF 0 "register_operand")
+	(fma:VWEXTF
+	  (neg:VWEXTF
+	    (float_extend:VWEXTF
+	      (match_operand:<V_DOUBLE_TRUNC> 2 "register_operand")))
+	  (match_operand:VWEXTF 3 "register_operand")
+	  (neg:VWEXTF
+	    (match_operand:VWEXTF 1 "register_operand"))))]
+  "TARGET_VECTOR && can_create_pseudo_p ()"
+  "#"
+  "&& 1"
+  [(const_int 0)]
+  {
+    insn_code icode = code_for_pred_extend (<MODE>mode);
+    rtx tmp = gen_reg_rtx (<MODE>mode);
+    rtx ext_ops[] = {tmp, operands[2]};
+    riscv_vector::emit_vlmax_insn (icode, riscv_vector::RVV_UNOP, ext_ops);
+
+    rtx dst = expand_ternary_op (<MODE>mode, fnms_optab, tmp, operands[3],
+				 operands[1], operands[0], 0);
+    emit_move_insn (operands[0], dst);
+    DONE;
+  }
+  [(set_attr "type" "vfwmuladd")
+   (set_attr "mode" "<V_DOUBLE_TRUNC>")])
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-10.c
new file mode 100644
index 00000000000..490f1a41068
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-10.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -O3 -ffast-math" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE1, TYPE2)                                                \
+  __attribute__ ((noipa)) void vwmacc_##TYPE1_##TYPE2 (TYPE1 *__restrict dst,  \
+						       TYPE2 *__restrict a,    \
+						       TYPE2 *__restrict b,    \
+						       int n)                  \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      dst[i] += -((TYPE1) a[i] * (TYPE1) b[i]);                                \
+  }
+
+#define TEST_ALL()                                                             \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
+
+TEST_ALL ()
+
+/* { dg-final { scan-assembler-times {\tvfwnmsac\.vv} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-11.c
new file mode 100644
index 00000000000..4d44a40fed3
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-11.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -O3 -ffast-math" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE1, TYPE2)                                                \
+  __attribute__ ((noipa)) void vwmacc_##TYPE1_##TYPE2 (TYPE1 *__restrict dst,  \
+						       TYPE2 *__restrict a,    \
+						       TYPE2 *__restrict b,    \
+						       int n)                  \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      dst[i] = (TYPE1) a[i] * (TYPE1) b[i] - dst[i];                           \
+  }
+
+#define TEST_ALL()                                                             \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
+
+TEST_ALL ()
+
+/* { dg-final { scan-assembler-times {\tvfwmsac\.vv} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-12.c
new file mode 100644
index 00000000000..2cb2a1edebf
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-12.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -O3 -ffast-math" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE1, TYPE2)                                                \
+  __attribute__ ((noipa)) void vwmacc_##TYPE1_##TYPE2 (TYPE1 *__restrict dst,  \
+						       TYPE2 *__restrict a,    \
+						       TYPE2 *__restrict b,    \
+						       int n)                  \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      dst[i] = -((TYPE1) a[i] * (TYPE1) b[i]) - dst[i];                        \
+  }
+
+#define TEST_ALL()                                                             \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
+
+TEST_ALL ()
+
+/* { dg-final { scan-assembler-times {\tvfwnmacc\.vv} 2 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-7.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-7.c
new file mode 100644
index 00000000000..2e3f6664d93
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-7.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE1, TYPE2)                                                \
+  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
+    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
+    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
+    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      {                                                                        \
+	dst[i] += -((TYPE1) a[i] * (TYPE1) b[i]);                              \
+	dst2[i] += -((TYPE1) a2[i] * (TYPE1) b[i]);                            \
+	dst3[i] += -((TYPE1) a2[i] * (TYPE1) a[i]);                            \
+	dst4[i] += -((TYPE1) a[i] * (TYPE1) b2[i]);                            \
+      }                                                                        \
+  }
+
+#define TEST_ALL()                                                             \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
+
+TEST_ALL ()
+
+/* { dg-final { scan-assembler-times {\tvfwnmsac\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-8.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-8.c
new file mode 100644
index 00000000000..2acfbd01c6d
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-8.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE1, TYPE2)                                                \
+  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
+    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
+    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
+    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      {                                                                        \
+	dst[i] = (TYPE1) a[i] * (TYPE1) b[i] - dst[i];                         \
+	dst2[i] = (TYPE1) a2[i] * (TYPE1) b[i] - dst2[i];                      \
+	dst3[i] = (TYPE1) a2[i] * (TYPE1) a[i] - dst3[i];                      \
+	dst4[i] = (TYPE1) a[i] * (TYPE1) b2[i] - dst4[i];                      \
+      }                                                                        \
+  }
+
+#define TEST_ALL()                                                             \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
+
+TEST_ALL ()
+
+/* { dg-final { scan-assembler-times {\tvfwmsac\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-9.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-9.c
new file mode 100644
index 00000000000..da7f870c12b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen-complicate-9.c
@@ -0,0 +1,27 @@ 
+/* { dg-do compile } */
+/* { dg-additional-options "-march=rv32gcv_zvfh -mabi=ilp32d --param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <stdint-gcc.h>
+
+#define TEST_TYPE(TYPE1, TYPE2)                                                \
+  __attribute__ ((noipa)) void vwadd_##TYPE1_##TYPE2 (                         \
+    TYPE1 *__restrict dst, TYPE1 *__restrict dst2, TYPE1 *__restrict dst3,     \
+    TYPE1 *__restrict dst4, TYPE2 *__restrict a, TYPE2 *__restrict b,          \
+    TYPE2 *__restrict a2, TYPE2 *__restrict b2, int n)                         \
+  {                                                                            \
+    for (int i = 0; i < n; i++)                                                \
+      {                                                                        \
+	dst[i] = -((TYPE1) a[i] * (TYPE1) b[i]) - dst[i];                      \
+	dst2[i] = -((TYPE1) a2[i] * (TYPE1) b[i]) - dst2[i];                   \
+	dst3[i] = -((TYPE1) a2[i] * (TYPE1) a[i]) - dst3[i];                   \
+	dst4[i] = -((TYPE1) a[i] * (TYPE1) b2[i]) - dst4[i];                   \
+      }                                                                        \
+  }
+
+#define TEST_ALL()                                                             \
+  TEST_TYPE (float, _Float16)                                                  \
+  TEST_TYPE (double, float)
+
+TEST_ALL ()
+
+/* { dg-final { scan-assembler-times {\tvfwnmacc\.vv} 8 } } */
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-10.c
new file mode 100644
index 00000000000..262660c5bcd
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-10.c
@@ -0,0 +1,32 @@ 
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-10.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  TYPE1 dst2##TYPE1[SZ];                                                       \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = (LIMIT + i) & 1964;                                        \
+      dst##TYPE1[i] = (LIMIT + i) & 628;                                       \
+      dst2##TYPE1[i] = (LIMIT + i) & 628;                                      \
+    }                                                                          \
+  vwmacc_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                 \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i]                                                      \
+	    == -((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]) + dst2##TYPE1[i]);
+
+#define RUN_ALL() RUN (double, float, -2147483648)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-11.c
new file mode 100644
index 00000000000..246999cab62
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-11.c
@@ -0,0 +1,32 @@ 
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-11.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  TYPE1 dst2##TYPE1[SZ];                                                       \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = (LIMIT + i) & 1964;                                        \
+      dst##TYPE1[i] = (LIMIT + i) & 628;                                       \
+      dst2##TYPE1[i] = (LIMIT + i) & 628;                                      \
+    }                                                                          \
+  vwmacc_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                 \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i]                                                      \
+	    == ((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]) - dst2##TYPE1[i]);
+
+#define RUN_ALL() RUN (double, float, -2147483648)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-12.c
new file mode 100644
index 00000000000..2a6a03b5b35
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run-12.c
@@ -0,0 +1,32 @@ 
+/* { dg-do run { target { riscv_vector } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-12.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  TYPE1 dst2##TYPE1[SZ];                                                       \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = (LIMIT + i) & 1964;                                        \
+      dst##TYPE1[i] = (LIMIT + i) & 628;                                       \
+      dst2##TYPE1[i] = (LIMIT + i) & 628;                                      \
+    }                                                                          \
+  vwmacc_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                 \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i]                                                      \
+	    == -((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]) - dst2##TYPE1[i]);
+
+#define RUN_ALL() RUN (double, float, -2147483648)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-10.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-10.c
new file mode 100644
index 00000000000..f678c35f81f
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-10.c
@@ -0,0 +1,32 @@ 
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-10.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  TYPE1 dst2##TYPE1[SZ];                                                       \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = (LIMIT + i) & 1964;                                        \
+      dst##TYPE1[i] = (LIMIT + i) & 628;                                       \
+      dst2##TYPE1[i] = (LIMIT + i) & 628;                                      \
+    }                                                                          \
+  vwmacc_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                 \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i]                                                      \
+	    == -((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]) + dst2##TYPE1[i]);
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-11.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-11.c
new file mode 100644
index 00000000000..294f77dbc46
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-11.c
@@ -0,0 +1,32 @@ 
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-11.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  TYPE1 dst2##TYPE1[SZ];                                                       \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = (LIMIT + i) & 1964;                                        \
+      dst##TYPE1[i] = (LIMIT + i) & 628;                                       \
+      dst2##TYPE1[i] = (LIMIT + i) & 628;                                      \
+    }                                                                          \
+  vwmacc_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                 \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i]                                                      \
+	    == ((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]) - dst2##TYPE1[i]);
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}
diff --git a/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-12.c b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-12.c
new file mode 100644
index 00000000000..013291cdc60
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/rvv/autovec/widen/widen_run_zvfh-12.c
@@ -0,0 +1,32 @@ 
+/* { dg-do run { target { riscv_vector && riscv_zvfh_hw } } } */
+/* { dg-additional-options "--param=riscv-autovec-preference=scalable -ffast-math" } */
+
+#include <assert.h>
+#include "widen-12.c"
+
+#define SZ 512
+
+#define RUN(TYPE1, TYPE2, LIMIT)                                               \
+  TYPE2 a##TYPE2[SZ];                                                          \
+  TYPE2 b##TYPE2[SZ];                                                          \
+  TYPE1 dst##TYPE1[SZ];                                                        \
+  TYPE1 dst2##TYPE1[SZ];                                                       \
+  for (int i = 0; i < SZ; i++)                                                 \
+    {                                                                          \
+      a##TYPE2[i] = LIMIT + i % 8723;                                          \
+      b##TYPE2[i] = (LIMIT + i) & 1964;                                        \
+      dst##TYPE1[i] = (LIMIT + i) & 628;                                       \
+      dst2##TYPE1[i] = (LIMIT + i) & 628;                                      \
+    }                                                                          \
+  vwmacc_##TYPE1_##TYPE2 (dst##TYPE1, a##TYPE2, b##TYPE2, SZ);                 \
+  for (int i = 0; i < SZ; i++)                                                 \
+    assert (dst##TYPE1[i]                                                      \
+	    == -((TYPE1) a##TYPE2[i] * (TYPE1) b##TYPE2[i]) - dst2##TYPE1[i]);
+
+#define RUN_ALL() RUN (float, _Float16, -32768)
+
+int
+main ()
+{
+  RUN_ALL ()
+}