[V2,2/4,RFC] RISC-V: Add vector related reservations

Message ID 20240110013159.2645757-3-ewlu@rivosinc.com
State Unresolved
Headers
Series RISC-V: Associate typed insns to dfa reservation |

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Edwin Lu Jan. 10, 2024, 1:31 a.m. UTC
  This patch copies the vector reservations from generic-ooo.md and
inserts them into generic.md and sifive.md. Creates new vector crypto related
insn reservations.

gcc/ChangeLog:

	* config/riscv/generic-ooo.md (generic_ooo_crypto_aes): create reservation
	(generic_ooo_crypto_sha): ditto
	(generic_ooo_crypto_sm): ditto
	(generic_ooo_vec_vesetvl): ditto
	(generic_ooo_vec_vsetvl): ditto
	* config/riscv/generic.md (pipe0): ditto
	(generic_vec_load): ditto
	(generic_vec_store): ditto
	(generic_vec_loadstore_seg): ditto
	(generic_vec_alu): ditto
	(generic_vec_fcmp): ditto
	(generic_vec_imul): ditto
	(generic_vec_fadd): ditto
	(generic_vec_fmul): ditto
	(generic_crypto): ditto
	(generic_crypto_aes): ditto
	(generic_crypto_sha): ditto
	(generic_crypto_sm): ditto
	(generic_perm): ditto
	(generic_vec_reduction): ditto
	(generic_vec_ordered_reduction): ditto
	(generic_vec_idiv): ditto
	(generic_vec_float_divsqrt): ditto
	(generic_vec_mask): ditto
	(generic_vec_vesetvl): ditto
	(generic_vec_setrm): ditto
	(generic_vec_readlen): ditto
	* config/riscv/sifive-7.md (sifive_7): ditto
	(sifive_7_vec_load): ditto
	(sifive_7_vec_store): ditto
	(sifive_7_vec_loadstore_seg): ditto
	(sifive_7_vec_alu): ditto
	(sifive_7_vec_fcmp): ditto
	(sifive_7_vec_imul): ditto
	(sifive_7_vec_fadd): ditto
	(sifive_7_vec_fmul): ditto
	(sifive_7_crypto): ditto
	(sifive_7_crypto_aes): ditto
	(sifive_7_crypto_sha): ditto
	(sifive_7_crypto_sm): ditto
	(sifive_7_perm): ditto
	(sifive_7_vec_reduction): ditto
	(sifive_7_vec_ordered_reduction): ditto
	(sifive_7_vec_idiv): ditto
	(sifive_7_vec_float_divsqrt): ditto
	(sifive_7_vec_mask): ditto
	(sifive_7_vec_vesetvl): ditto
	(sifive_7_vec_setrm): ditto
	(sifive_7_vec_readlen): ditto

Signed-off-by: Edwin Lu <ewlu@rivosinc.com>
Co-authored-by: Robin Dapp <rdapp.gcc@gmail.com>
---
V2:
- Remove unnecessary syntax changes in generic-ooo
- Add new vector crypto reservations and types to
  pipelines
---
 gcc/config/riscv/generic-ooo.md |  27 +++++-
 gcc/config/riscv/generic.md     | 143 +++++++++++++++++++++++++++++++
 gcc/config/riscv/sifive-7.md    | 144 ++++++++++++++++++++++++++++++++
 3 files changed, 311 insertions(+), 3 deletions(-)
  

Comments

Robin Dapp Jan. 10, 2024, 4 p.m. UTC | #1
Hi Edwin,

> This patch copies the vector reservations from generic-ooo.md and
> inserts them into generic.md and sifive.md. Creates new vector crypto related
> insn reservations.

In principle, the changes look good to me but I wonder if we could
split off the vector parts from generic-ooo into their own md file
(generic-vector-ooo or so?) and include this in the others?  Or is
there a reason why you decided against this?

A recurring question in vector cost model discussions seems to be how
to handle the situation when a tune model does not specify a "vector tune
model".  The problem exists for the scheduler descriptions and the
normal vector cost model (and possibly insn_costs as well).

Juzhe just implemented a fallback so we always use the "generic rvv" cost
model.  Your changes would be in the same vein and if we could split
them off then we'd be able to easier exchange one scheduler descriptions
for another one (say if one tune model wants to use an in-order vector
model).

There is also still the question of whether to set all latencies
to 1 for an OOO core but this question should be settled separately
as soon as we have proper hardware benchmark results.  If so we
would probably rename generic-vector-ooo into
generic-vector-in-order ;)

Regards
 Robin
  
Edwin Lu Jan. 10, 2024, 6:24 p.m. UTC | #2
Hi Robin,
On 1/10/2024 8:00 AM, Robin Dapp wrote:
> Hi Edwin,
> 
>> This patch copies the vector reservations from generic-ooo.md and
>> inserts them into generic.md and sifive.md. Creates new vector crypto related
>> insn reservations.
> 
> In principle, the changes look good to me but I wonder if we could
> split off the vector parts from generic-ooo into their own md file
> (generic-vector-ooo or so?) and include this in the others?  Or is
> there a reason why you decided against this?
> 
I forgot we could include other md files into another file (I'll double 
check that there isn't anything fancy for including other pipelines), 
but I also thought that eventually all the tunes would have their own 
vector cost pipelines. Since all the pipelines should be tuned to their 
cost model, they would be different anyway. If it would be simpler for 
now, I could separate the files out.

> A recurring question in vector cost model discussions seems to be how
> to handle the situation when a tune model does not specify a "vector tune
> model".  The problem exists for the scheduler descriptions and the
> normal vector cost model (and possibly insn_costs as well).
> 
> Juzhe just implemented a fallback so we always use the "generic rvv" cost
> model.  Your changes would be in the same vein and if we could split
> them off then we'd be able to easier exchange one scheduler descriptions
> for another one (say if one tune model wants to use an in-order vector
> model).
> 
I think I'm getting a bit confused. Is there a reason why we would want 
to exchange scheduler descriptions like the example you provided? I'm 
just thinking why a in-order model would want to use an ooo vector model 
and vice versa. Please correct me if I got the wrong idea.

I also want to double check, isn't forcing all typed instructions to be 
part of a dfa pipeline in effect removing a situation where a tune model 
does not specify a "vector tune model"? At least from my testing with 
the assert statement, I get ICEs when trying to run the testsuite 
without the vector tune model even on gc.

> There is also still the question of whether to set all latencies
> to 1 for an OOO core but this question should be settled separately
> as soon as we have proper hardware benchmark results.  If so we
> would probably rename generic-vector-ooo into
> generic-vector-in-order ;)
> 
> Regards
>   Robin
> 
> 
I agree the latencies can be tweaked after we get those benchmarks :)

Edwin
  
Robin Dapp Jan. 10, 2024, 7:16 p.m. UTC | #3
> Since all the pipelines should be tuned to their cost model, they
> would be different anyway. If it would be simpler for now, I could
> separate the files out.
> I think I'm getting a bit confused. Is there a reason why we would
> want to exchange scheduler descriptions like the example you
> provided? I'm just thinking why a in-order model would want to use an
> ooo vector model and vice versa. Please correct me if I got the wrong
> idea.

Yeah, the confusion is understandable as it's all in flow and several
things I mentioned are artifacts of us not yet being stabilized (or
actually having hard data to base our decisions on).

Usually, once a uarch has settled there is no reason to exchange
anything, just smaller tweaks might be done.  I was more thinking of
the near to mid-term future where larger changes like ripping out
one thing and using another one altogether might still happen.

Regarding out of order vs in order - for in-order pipelines we will
always want to get latencies right.  For out of order it is a balancing
act (proper latencies often mean more spilling and the processor will
reorder correctly anyway).

So you're mostly right that the argument is not very strong as soon
as we really know what to do and not to do.

> I also want to double check, isn't forcing all typed instructions to
> be part of a dfa pipeline in effect removing a situation where a tune
> model does not specify a "vector tune model"? At least from my
> testing with the assert statement, I get ICEs when trying to run the
> testsuite without the vector tune model even on gc.

There are (at least) three parts of the "tune model":
 - vector cost model, specifying the cost of generic vector operations,
   not necessarily corresponding to an insn
 - insn cost, specifying the cost of an individual insn, usually close
   to latency but sometimes also "complexity" or other things.
 - insn latency and other hardware scheduler properties.

We can leave out any of those which will make us fall back to default
values.  Even if we forced a scheduler description we could still have
the default fallback for the other two and generate unfavorable code
as a result.

However, this is of course not desirable and we will soon have a
reasonable vector cost model that corresponds to the non-uarch
specific properties of the vector spec.  Once this is in place
we will also want a somewhat generic vector scheduler description
that goes hand in hand with that.  Despite the name, the vector
part of generic-ooo could be used for in-order vector uarchs and
we might want to define a different description for out-of-order
uarchs.  That's a separate discussion but at least for that
contingency it would make sense to easily interchange the scheduler
description ;)

Regards
 Robin
  
Edwin Lu Jan. 10, 2024, 10:39 p.m. UTC | #4
>> Since all the pipelines should be tuned to their cost model, they
>> would be different anyway. If it would be simpler for now, I could
>> separate the files out.
>> I think I'm getting a bit confused. Is there a reason why we would
>> want to exchange scheduler descriptions like the example you
>> provided? I'm just thinking why a in-order model would want to use an
>> ooo vector model and vice versa. Please correct me if I got the wrong
>> idea.
> 
> Yeah, the confusion is understandable as it's all in flow and several
> things I mentioned are artifacts of us not yet being stabilized (or
> actually having hard data to base our decisions on).
> 
> Usually, once a uarch has settled there is no reason to exchange
> anything, just smaller tweaks might be done.  I was more thinking of
> the near to mid-term future where larger changes like ripping out
> one thing and using another one altogether might still happen.
> 
> Regarding out of order vs in order - for in-order pipelines we will
> always want to get latencies right.  For out of order it is a balancing
> act (proper latencies often mean more spilling and the processor will
> reorder correctly anyway).
> 
> So you're mostly right that the argument is not very strong as soon
> as we really know what to do and not to do.
> 
That makes sense to me!
>> I also want to double check, isn't forcing all typed instructions to
>> be part of a dfa pipeline in effect removing a situation where a tune
>> model does not specify a "vector tune model"? At least from my
>> testing with the assert statement, I get ICEs when trying to run the
>> testsuite without the vector tune model even on gc.
> 
> There are (at least) three parts of the "tune model":
>   - vector cost model, specifying the cost of generic vector operations,
>     not necessarily corresponding to an insn
>   - insn cost, specifying the cost of an individual insn, usually close
>     to latency but sometimes also "complexity" or other things.
>   - insn latency and other hardware scheduler properties.
> 
> We can leave out any of those which will make us fall back to default
> values.  Even if we forced a scheduler description we could still have
> the default fallback for the other two and generate unfavorable code
> as a result.
> 
So if I'm understanding things correctly, the costs the Juzhe is working 
on in riscv.cc would be part of the vector cost model since they don't 
correspond to individual instructions and only target vector code. These 
costs would be the default fallback in the event of having no scheduler 
descriptions for the insn.

The vector pipelines I'm working describes the insn latency categorized 
by the insn type. The scheduler will attempt to generate favorable code 
by this description but also consider the vector cost model. That is, 
it's possible for an insn with a latency of 1 and cost of 10 to be 
replaced by an insn with a latency of 2 and cost of 3.

The insn cost is the cost of every insn which can be specified 
elsewhere. These override the values in the vector cost model for vector 
insns? Where are these specified?

Then, all of these combined form a tune model like generic-ooo or rocket.
> However, this is of course not desirable and we will soon have a
> reasonable vector cost model that corresponds to the non-uarch
> specific properties of the vector spec.  Once this is in place
> we will also want a somewhat generic vector scheduler description
> that goes hand in hand with that.  Despite the name, the vector
> part of generic-ooo could be used for in-order vector uarchs and
> we might want to define a different description for out-of-order
> uarchs.  That's a separate discussion but at least for that
> contingency it would make sense to easily interchange the scheduler
> description ;)
> 
I think I understand everything. I'm currently testing a run with a 
generic-vector-ooo file and I'm a little unsure how we would create a 
second generic-vector-in-order file such that each insn maps only to one 
reservation without using tune attributes but I guess that will be an 
implementation detail for later :)

Edwin
  

Patch

diff --git a/gcc/config/riscv/generic-ooo.md b/gcc/config/riscv/generic-ooo.md
index ef8cb96daf4..fb5f34c0ef2 100644
--- a/gcc/config/riscv/generic-ooo.md
+++ b/gcc/config/riscv/generic-ooo.md
@@ -195,7 +195,8 @@  (define_insn_reservation "generic_ooo_popcount" 2
 (define_insn_reservation "generic_ooo_vec_alu" 3
   (and (eq_attr "tune" "generic_ooo")
        (eq_attr "type" "vialu,viwalu,vext,vicalu,vshift,vnshift,viminmax,vicmp,\
-		        vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector"))
+		        vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector,\
+			vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,vror,vwsll"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector float comparison, conversion etc.
@@ -209,7 +210,8 @@  (define_insn_reservation "generic_ooo_vec_fcmp" 3
 ;; Vector integer multiplication.
 (define_insn_reservation "generic_ooo_vec_imul" 4
   (and (eq_attr "tune" "generic_ooo")
-       (eq_attr "type" "vimul,viwmul,vimuladd,viwmuladd,vsmul"))
+       (eq_attr "type" "vimul,viwmul,vimuladd,viwmuladd,vsmul,vclmul,vclmulh,\
+			vghsh,vgmul"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector float addition.
@@ -230,6 +232,25 @@  (define_insn_reservation "generic_ooo_crypto" 4
        (eq_attr "type" "crypto"))
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
+;; Vector crypto, AES
+(define_insn_reservation "generic_ooo_crypto_aes" 4
+  (and (eq_attr "tune" "generic_ooo")
+       (eq_attr "type" "vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz"))
+  "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
+
+;; Vector crypto, sha
+(define_insn_reservation "generic_ooo_crypto_sha" 4
+  (and (eq_attr "tune" "generic_ooo")
+       (eq_attr "type" "vsha2ms,vsha2ch,vsha2cl"))
+  "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
+
+;; Vector crypto, SM3/4
+(define_insn_reservation "generic_ooo_crypto_sm" 4
+  (and (eq_attr "tune" "generic_ooo")
+       (eq_attr "type" "vsm4k,vsm4r,vsm3me,vsm3c"))
+  "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
+
+
 ;; Vector permute.
 (define_insn_reservation "generic_ooo_perm" 3
   (and (eq_attr "tune" "generic_ooo")
@@ -271,7 +292,7 @@  (define_insn_reservation "generic_ooo_vec_mask" 2
   "generic_ooo_vxu_issue,generic_ooo_vxu_alu")
 
 ;; Vector vsetvl.
-(define_insn_reservation "generic_ooo_vec_vesetvl" 1
+(define_insn_reservation "generic_ooo_vec_vsetvl" 1
   (and (eq_attr "tune" "generic_ooo")
        (eq_attr "type" "vsetvl,vsetvl_pre"))
   "generic_ooo_vxu_issue")
diff --git a/gcc/config/riscv/generic.md b/gcc/config/riscv/generic.md
index 45986cfea89..45c9027229a 100644
--- a/gcc/config/riscv/generic.md
+++ b/gcc/config/riscv/generic.md
@@ -25,6 +25,15 @@  (define_cpu_unit "alu" "pipe0")
 (define_cpu_unit "imuldiv" "pipe0")
 (define_cpu_unit "fdivsqrt" "pipe0")
 
+;; Separate issue queue for vector instructions.
+(define_cpu_unit "generic_vxu_issue" "pipe0")
+
+;; Vector execution unit.
+(define_cpu_unit "generic_vxu_alu" "pipe0")
+
+;; Vector subunit that does mult/div/sqrt.
+(define_cpu_unit "generic_vxu_multicycle" "pipe0")
+
 (define_insn_reservation "generic_alu" 1
   (and (eq_attr "tune" "generic")
        (eq_attr "type" "unknown,const,arith,shift,slt,multi,auipc,nop,logical,\
@@ -102,3 +111,137 @@  (define_insn_reservation "generic_fsqrt" 25
        (eq_attr "type" "fsqrt"))
   "fdivsqrt*25")
 
+;; Vector load/store
+(define_insn_reservation "generic_vec_load" 6
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vlde,vldm,vlds,vldux,vldox,vldff,vldr"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+(define_insn_reservation "generic_vec_store" 6
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vste,vstm,vsts,vstux,vstox,vstr"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector segment loads/stores.
+(define_insn_reservation "generic_vec_loadstore_seg" 10
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vlsegde,vlsegds,vlsegdux,vlsegdox,vlsegdff,\
+			vssegte,vssegts,vssegtux,vssegtox"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Regular vector operations and integer comparisons.
+(define_insn_reservation "generic_vec_alu" 3
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vialu,viwalu,vext,vicalu,vshift,vnshift,viminmax,vicmp,\
+		        vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector,\
+			vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,vror,vwsll"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector float comparison, conversion etc.
+(define_insn_reservation "generic_vec_fcmp" 3
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vfrecp,vfminmax,vfcmp,vfsgnj,vfclass,vfcvtitof,\
+			vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,vfncvtitof,\
+			vfncvtftoi,vfncvtftof"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector integer multiplication.
+(define_insn_reservation "generic_vec_imul" 4
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vimul,viwmul,vimuladd,viwmuladd,vsmul,vclmul,vclmulh,\
+			vghsh,vgmul"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector float addition.
+(define_insn_reservation "generic_vec_fadd" 4
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vfalu,vfwalu"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector float multiplication and FMA.
+(define_insn_reservation "generic_vec_fmul" 6
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vfmul,vfwmul,vfmuladd,vfwmuladd"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector crypto, assumed to be a generic operation for now.
+(define_insn_reservation "generic_crypto" 4
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "crypto"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector crypto, AES
+(define_insn_reservation "generic_crypto_aes" 4
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector crypto, sha
+(define_insn_reservation "generic_crypto_sha" 4
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vsha2ms,vsha2ch,vsha2cl"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector crypto, SM3/4
+(define_insn_reservation "generic_crypto_sm" 4
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vsm4k,vsm4r,vsm3me,vsm3c"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector permute.
+(define_insn_reservation "generic_perm" 3
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vimerge,vfmerge,vslideup,vslidedown,vislide1up,\
+			vislide1down,vfslide1up,vfslide1down,vgather,vcompress"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector reduction.
+(define_insn_reservation "generic_vec_reduction" 8
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vired,viwred,vfredu,vfwredu"))
+  "generic_vxu_issue,generic_vxu_multicycle")
+
+;; Vector ordered reduction, assume the latency number is for
+;; a 128-bit vector.  It is scaled in riscv_sched_adjust_cost
+;; for larger vectors.
+(define_insn_reservation "generic_vec_ordered_reduction" 10
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vfredo,vfwredo"))
+  "generic_vxu_issue,generic_vxu_multicycle*3")
+
+;; Vector integer division, assume not pipelined.
+(define_insn_reservation "generic_vec_idiv" 16
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vidiv"))
+  "generic_vxu_issue,generic_vxu_multicycle*3")
+
+;; Vector float divisions and sqrt, assume not pipelined.
+(define_insn_reservation "generic_vec_float_divsqrt" 16
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vfdiv,vfsqrt"))
+  "generic_vxu_issue,generic_vxu_multicycle*3")
+
+;; Vector mask operations.
+(define_insn_reservation "generic_vec_mask" 2
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,\
+			vfmovvf,vfmovfv"))
+  "generic_vxu_issue,generic_vxu_alu")
+
+;; Vector vsetvl.
+(define_insn_reservation "generic_vec_vesetvl" 1
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "vsetvl,vsetvl_pre"))
+  "generic_vxu_issue")
+
+;; Vector rounding mode setters, assume pipeline barrier.
+(define_insn_reservation "generic_vec_setrm" 20
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "wrvxrm,wrfrm"))
+  "generic_vxu_issue,generic_vxu_issue*3")
+
+;; Vector read vlen/vlenb.
+(define_insn_reservation "generic_vec_readlen" 4
+  (and (eq_attr "tune" "generic")
+       (eq_attr "type" "rdvlenb,rdvl"))
+  "generic_vxu_issue,generic_vxu_issue")
diff --git a/gcc/config/riscv/sifive-7.md b/gcc/config/riscv/sifive-7.md
index 52904f546ed..97ec7d0847c 100644
--- a/gcc/config/riscv/sifive-7.md
+++ b/gcc/config/riscv/sifive-7.md
@@ -12,6 +12,15 @@  (define_cpu_unit "sifive_7_B" "sifive_7")
 (define_cpu_unit "sifive_7_idiv" "sifive_7")
 (define_cpu_unit "sifive_7_fpu" "sifive_7")
 
+;; Separate issue queue for vector instructions.
+(define_cpu_unit "sifive_7_vxu_issue" "sifive_7")
+
+;; Vector execution unit.
+(define_cpu_unit "sifive_7_vxu_alu" "sifive_7")
+
+;; Vector subunit that does mult/div/sqrt.
+(define_cpu_unit "sifive_7_vxu_multicycle" "sifive_7")
+
 (define_insn_reservation "sifive_7_load" 3
   (and (eq_attr "tune" "sifive_7")
        (eq_attr "type" "load"))
@@ -119,6 +128,141 @@  (define_insn_reservation "sifive_7_popcount" 2
        (eq_attr "type" "cpop,clmul"))
   "sifive_7_A")
 
+;; Vector load/store
+(define_insn_reservation "sifive_7_vec_load" 6
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vlde,vldm,vlds,vldux,vldox,vldff,vldr"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+(define_insn_reservation "sifive_7_vec_store" 6
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vste,vstm,vsts,vstux,vstox,vstr"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector segment loads/stores.
+(define_insn_reservation "sifive_7_vec_loadstore_seg" 10
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vlsegde,vlsegds,vlsegdux,vlsegdox,vlsegdff,\
+			vssegte,vssegts,vssegtux,vssegtox"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Regular vector operations and integer comparisons.
+(define_insn_reservation "sifive_7_vec_alu" 3
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vialu,viwalu,vext,vicalu,vshift,vnshift,viminmax,vicmp,\
+		        vimov,vsalu,vaalu,vsshift,vnclip,vmov,vfmov,vector,\
+			vandn,vbrev,vbrev8,vrev8,vclz,vctz,vrol,vror,vwsll"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector float comparison, conversion etc.
+(define_insn_reservation "sifive_7_vec_fcmp" 3
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vfrecp,vfminmax,vfcmp,vfsgnj,vfclass,vfcvtitof,\
+			vfcvtftoi,vfwcvtitof,vfwcvtftoi,vfwcvtftof,vfncvtitof,\
+			vfncvtftoi,vfncvtftof"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector integer multiplication.
+(define_insn_reservation "sifive_7_vec_imul" 4
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vimul,viwmul,vimuladd,viwmuladd,vsmul,vclmul,vclmulh,\
+			vghsh,vgmul"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector float addition.
+(define_insn_reservation "sifive_7_vec_fadd" 4
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vfalu,vfwalu"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector float multiplication and FMA.
+(define_insn_reservation "sifive_7_vec_fmul" 6
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vfmul,vfwmul,vfmuladd,vfwmuladd"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector crypto, assumed to be a generic operation for now.
+(define_insn_reservation "sifive_7_crypto" 4
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "crypto"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector crypto, AES
+(define_insn_reservation "sifive_7_crypto_aes" 4
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vaesef,vaesem,vaesdf,vaesdm,vaeskf1,vaeskf2,vaesz"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector crypto, sha
+(define_insn_reservation "sifive_7_crypto_sha" 4
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vsha2ms,vsha2ch,vsha2cl"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector crypto, SM3/4
+(define_insn_reservation "sifive_7_crypto_sm" 4
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vsm4k,vsm4r,vsm3me,vsm3c"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector permute.
+(define_insn_reservation "sifive_7_perm" 3
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vimerge,vfmerge,vslideup,vslidedown,vislide1up,\
+			vislide1down,vfslide1up,vfslide1down,vgather,vcompress"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector reduction.
+(define_insn_reservation "sifive_7_vec_reduction" 8
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vired,viwred,vfredu,vfwredu"))
+  "sifive_7_vxu_issue,sifive_7_vxu_multicycle")
+
+;; Vector ordered reduction, assume the latency number is for
+;; a 128-bit vector.  It is scaled in riscv_sched_adjust_cost
+;; for larger vectors.
+(define_insn_reservation "sifive_7_vec_ordered_reduction" 10
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vfredo,vfwredo"))
+  "sifive_7_vxu_issue,sifive_7_vxu_multicycle*3")
+
+;; Vector integer division, assume not pipelined.
+(define_insn_reservation "sifive_7_vec_idiv" 16
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vidiv"))
+  "sifive_7_vxu_issue,sifive_7_vxu_multicycle*3")
+
+;; Vector float divisions and sqrt, assume not pipelined.
+(define_insn_reservation "sifive_7_vec_float_divsqrt" 16
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vfdiv,vfsqrt"))
+  "sifive_7_vxu_issue,sifive_7_vxu_multicycle*3")
+
+;; Vector mask operations.
+(define_insn_reservation "sifive_7_vec_mask" 2
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vmalu,vmpop,vmffs,vmsfs,vmiota,vmidx,vimovvx,vimovxv,\
+			vfmovvf,vfmovfv"))
+  "sifive_7_vxu_issue,sifive_7_vxu_alu")
+
+;; Vector vsetvl.
+(define_insn_reservation "sifive_7_vec_vesetvl" 1
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "vsetvl,vsetvl_pre"))
+  "sifive_7_vxu_issue")
+
+;; Vector rounding mode setters, assume pipeline barrier.
+(define_insn_reservation "sifive_7_vec_setrm" 20
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "wrvxrm,wrfrm"))
+  "sifive_7_vxu_issue,sifive_7_vxu_issue*3")
+
+;; Vector read vlen/vlenb.
+(define_insn_reservation "sifive_7_vec_readlen" 4
+  (and (eq_attr "tune" "sifive_7")
+       (eq_attr "type" "rdvlenb,rdvl"))
+  "sifive_7_vxu_issue,sifive_7_vxu_issue")
+
 (define_bypass 1 "sifive_7_load,sifive_7_alu,sifive_7_mul,sifive_7_f2i,sifive_7_sfb_alu"
   "sifive_7_alu,sifive_7_branch")