[v4,07/10] vect: Verify that GET_MODE_NUNITS is a multiple of 2.

Message ID 20230417183701.2249183-8-collison@rivosinc.com
State Accepted
Series RISC-V: Add autovec support

Commit Message

Michael Collison April 17, 2023, 6:36 p.m. UTC
  While working on autovectorization for the RISC-V port I encountered an issue
where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is
evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode)
where GET_MODE_NUNITS is equal to one.

Tested on RISCV and x86_64-linux-gnu. Okay?

2023-03-09  Michael Collison  <collison@rivosinc.com>

	* tree-vect-slp.cc (can_duplicate_and_interleave_p):
	Check that GET_MODE_NUNITS is a multiple of 2.
---
 gcc/tree-vect-slp.cc | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)
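
To illustrate the kind of check the ChangeLog entry describes, here is a minimal sketch. It is not GCC's poly_int code: `Poly2` and `multiple_of` are hypothetical names, and the model assumes an element count of the form c0 + c1 * x, where x is only known at run time. Under that assumption the count is a multiple of 2 for every x exactly when both coefficients are.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a two-coefficient polynomial element count:
// the value is c0 + c1 * x, where x >= 0 is only known at run time.
struct Poly2 {
  std::int64_t c0;
  std::int64_t c1;
};

// Returns true and stores value / factor in *quotient if `value` is a
// multiple of `factor` for every x, i.e. if both coefficients are.
// Leaves *quotient untouched and returns false otherwise.
bool multiple_of(Poly2 value, std::int64_t factor, Poly2 *quotient) {
  assert(factor != 0);
  if (value.c0 % factor != 0 || value.c1 % factor != 0)
    return false;
  quotient->c0 = value.c0 / factor;
  quotient->c1 = value.c1 / factor;
  return true;
}
```

In this model a count of 1 + 1x (as for VNx1DImode) fails the check, so the halving is never attempted, whereas 2 + 2x passes and yields 1 + 1x as the half.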
  

Comments

Richard Biener April 18, 2023, 6:11 a.m. UTC | #1
On Mon, Apr 17, 2023 at 8:42 PM Michael Collison <collison@rivosinc.com> wrote:
>
> While working on autovectorization for the RISC-V port I encountered an issue
> where can_duplicate_and_interleave_p assumes that GET_MODE_NUNITS is
> evenly divisible by two. The RISC-V target has vector modes (e.g. VNx1DImode)
> where GET_MODE_NUNITS is equal to one.
>
> Tested on RISCV and x86_64-linux-gnu. Okay?

OK.

  
Kito Cheng April 18, 2023, 2:28 p.m. UTC | #2
Wait, VNx1DImode can really evaluate to just one element with
-march=rv64g_zve64x.

I think this should just be fixed in the backend by this patch:

https://patchwork.ozlabs.org/project/gcc/patch/20230414014518.15458-1-juzhe.zhong@rivai.ai/

  
Kito Cheng April 18, 2023, 6:21 p.m. UTC | #3
Some more background about RVV:

RISC-V provides different VLEN configurations through different ISA
extensions such as `zve32x`, `zve64x` and `v`:
zve32x only guarantees that the minimum VLEN is 32 bits,
zve64x guarantees that the minimum VLEN is 64 bits,
and v guarantees that the minimum VLEN is 128 bits.

Current status (without that patch):

Zve32x: The mode for one vector register is VNx1SImode, and VNx1DImode
is an invalid mode
 - one vector register can hold 1 + 1x SImode where x is 0~n, so it
might hold just one SI

Zve64x: The mode for one vector register is VNx1DImode or VNx2SImode
 - one vector register can hold 1 + 1x DImode where x is 0~n, so it
might hold just one DI
 - one vector register can hold 2 + 2x SImode where x is 0~n, so it
might hold just two SI

So what I want to say here is that, in theory, it is really NOT safe to
assume that VNx1DImode holds two or more DI.

However, the `v` extension guarantees that the minimum VLEN is 128 bits.

We are trying to introduce another type/mode mapping for this configuration:

v: The mode for one vector register is VNx2DImode or VNx4SImode
 - one vector register can hold 2 + 2x DImode where x is 0~n, so it
will hold at least two DI
 - one vector register can hold 4 + 4x SImode where x is 0~n, so it
will hold at least four SI

So GET_MODE_NUNITS for a single vector register with DI mode will
become 2 (VNx2DImode) whenever that is really possible, which is a more
precise way to model the vector extension for RISC-V.
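
The element counts listed above follow from dividing the guaranteed minimum VLEN by the element width. The sketch below only restates that arithmetic; `min_units` and `runtime_units` are hypothetical helper names and are not part of GCC or the RISC-V backend.

```cpp
// Minimum number of elements of width `element_bits` that one vector
// register is guaranteed to hold when the extension guarantees a VLEN of
// at least `min_vlen_bits`. A result of 0 means a single element does not
// fit, i.e. there is no valid single-register mode for that element width.
constexpr unsigned min_units(unsigned min_vlen_bits, unsigned element_bits) {
  return element_bits == 0 ? 0 : min_vlen_bits / element_bits;
}

// Element count of a "base + base * x" mode for a given runtime x, matching
// the "1 + 1x" / "2 + 2x" notation used in the message above.
constexpr unsigned runtime_units(unsigned base, unsigned x) {
  return base + base * x;
}

// zve32x (minimum VLEN 32): one SI, no DI mode.
static_assert(min_units(32, 32) == 1, "zve32x holds at least one SI");
static_assert(min_units(32, 64) == 0, "zve32x has no single-register DI mode");
// zve64x (minimum VLEN 64): one DI or two SI.
static_assert(min_units(64, 64) == 1, "zve64x holds at least one DI");
static_assert(min_units(64, 32) == 2, "zve64x holds at least two SI");
// v (minimum VLEN 128): two DI or four SI.
static_assert(min_units(128, 64) == 2, "v holds at least two DI");
static_assert(min_units(128, 32) == 4, "v holds at least four SI");
```

With x = 0, `runtime_units(1, 0)` is 1, which is the single-DI case that makes halving the element count unsafe for VNx1DImode.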



  
juzhe.zhong@rivai.ai April 18, 2023, 10:48 p.m. UTC | #4
Yes, as Kito said.
We won't enable VNx1DImode in auto-vectorization, so it's meaningless to fix it here.
We dynamically adjust the minimum vector length for each '-march' according to the RVV ISA specification.
So we strongly suggest dropping this fix.

Thanks.


juzhe.zhong@rivai.ai
 
  
Michael Collison April 18, 2023, 11:19 p.m. UTC | #5
Juzhe and Kito,

Thank you for the clarification.

On 4/18/23 18:48, juzhe.zhong@rivai.ai wrote:
> Yes, as Kito said.
> We won't enable VNx1DImode in auto-vectorization, so it's meaningless
> to fix it here.
> We dynamically adjust the minimum vector length for each '-march'
> according to the RVV ISA specification.
> So we strongly suggest dropping this fix.
>
> Thanks.
  
Richard Sandiford April 20, 2023, 10:01 a.m. UTC | #6
<juzhe.zhong@rivai.ai> writes:
> Yes, as Kito said.
> We won't enable VNx1DImode in auto-vectorization, so it's meaningless to fix it here.
> We dynamically adjust the minimum vector length for each '-march' according to the RVV ISA specification.
> So we strongly suggest dropping this fix.

I think the patch should go in regardless.  If we have a port with
a VNx1 mode then the exact_div is at best dubious and at worst wrong.

Thanks,
Richard
  

Patch

diff --git a/gcc/tree-vect-slp.cc b/gcc/tree-vect-slp.cc
index d73deaecce0..a64fe454e19 100644
--- a/gcc/tree-vect-slp.cc
+++ b/gcc/tree-vect-slp.cc
@@ -423,10 +423,13 @@  can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
 	    (GET_MODE_BITSIZE (int_mode), 1);
 	  tree vector_type
 	    = get_vectype_for_scalar_type (vinfo, int_type, count);
+	  poly_int64 half_nelts;
 	  if (vector_type
 	      && VECTOR_MODE_P (TYPE_MODE (vector_type))
 	      && known_eq (GET_MODE_SIZE (TYPE_MODE (vector_type)),
-			   GET_MODE_SIZE (base_vector_mode)))
+			   GET_MODE_SIZE (base_vector_mode))
+	      && multiple_p (GET_MODE_NUNITS (TYPE_MODE (vector_type)),
+			     2, &half_nelts))
 	    {
 	      /* Try fusing consecutive sequences of COUNT / NVECTORS elements
 		 together into elements of type INT_TYPE and using the result
@@ -434,7 +437,7 @@  can_duplicate_and_interleave_p (vec_info *vinfo, unsigned int count,
 	      poly_uint64 nelts = GET_MODE_NUNITS (TYPE_MODE (vector_type));
 	      vec_perm_builder sel1 (nelts, 2, 3);
 	      vec_perm_builder sel2 (nelts, 2, 3);
-	      poly_int64 half_nelts = exact_div (nelts, 2);
+
 	      for (unsigned int i = 0; i < 3; ++i)
 		{
 		  sel1.quick_push (i);