[v2,2/2] riscv: thead: Add support for the XTheadFMemIdx ISA extension

Message ID 20231020095348.2455729-3-christoph.muellner@vrull.eu
State Unresolved
Series riscv: Adding support for XTHead(F)MemIdx

Checks

Context Check Description
snail/gcc-patch-check warning Git am fail log

Commit Message

Christoph Müllner Oct. 20, 2023, 9:53 a.m. UTC
  From: Christoph Müllner <christoph.muellner@vrull.eu>

The XTheadFMemIdx ISA extension provides additional load and store
instructions for floating-point registers with new addressing modes.

The following memory access types are supported:
* load/store: [w,d] (single-precision FP, double-precision FP)

The following addressing modes are supported:
* register offset with additional immediate offset (4 instructions):
  flr<type>, fsr<type>
* zero-extended register offset with additional immediate offset
  (4 instructions): flur<type>, fsur<type>
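
As a rough illustration of the two addressing modes listed above, the
effective address can be modelled in plain C. This is only a sketch: it
assumes the immediate is a small left-shift amount applied to the index
register (as the ashift patterns later in this patch suggest), and the
helper names th_ea_reg and th_ea_ureg are hypothetical, not part of GCC
or the ISA specification.

```c
#include <stdint.h>

/* Hypothetical model of the flr<type>/fsr<type> addressing mode:
   base + (index << shift), using the full 64-bit index register.  */
uint64_t
th_ea_reg (uint64_t base, uint64_t index, unsigned shift)
{
  return base + (index << shift);
}

/* Hypothetical model of the flur<type>/fsur<type> addressing mode:
   only the low 32 bits of the index are used, zero-extended to
   64 bits before shifting.  */
uint64_t
th_ea_ureg (uint64_t base, uint64_t index, unsigned shift)
{
  return base + ((uint64_t) (uint32_t) index << shift);
}
```

The only difference between the two helpers is the truncating cast, which
is what lets the compiler drop a separate zero-extension of the index.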

These addressing modes are also part of the similar XTheadMemIdx
ISA extension support, whose code is reused and extended to support
floating-point registers.

One challenge that this patch needs to solve is GP registers in FP-mode
(e.g. "(reg:DF a2)"), which cannot be handled by the XTheadFMemIdx
instructions. Such registers are the result of independent
optimizations, which can happen after register allocation.
This patch uses a simple but efficient method to address this:
add a dependency on XTheadMemIdx to the XTheadFMemIdx optimizations.
This allows the instructions from XTheadMemIdx to be used for
such registers.
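
The fallback described above can be sketched as a small selection
function. This is a simplified model, not the code in th_output_move:
the enums and the function pick_indexed_load are hypothetical stand-ins
for GCC's register classes and machine modes, and it assumes that an
SF/DF value in a GP register is loaded with the integer th.lrw/th.lrd
instructions from XTheadMemIdx.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for GCC's register classes and FP modes.  */
enum reg_kind { REG_GP, REG_FP };
enum fp_mode { MODE_SF, MODE_DF };

/* Pick a load mnemonic for an FP-mode value, depending on where the
   register allocator placed it.  Returns NULL if no indexed load
   applies and the generic move code has to handle it.  */
const char *
pick_indexed_load (enum reg_kind dest, enum fp_mode mode,
		   bool have_memidx, bool have_fmemidx)
{
  /* Value lives in an FP register: use the XTheadFMemIdx load.  */
  if (dest == REG_FP && have_fmemidx)
    return mode == MODE_SF ? "th.flrw" : "th.flrd";

  /* FP-mode value in a GP register, e.g. (reg:DF a2): fall back to
     the integer indexed loads from XTheadMemIdx.  */
  if (dest == REG_GP && have_memidx)
    return mode == MODE_SF ? "th.lrw" : "th.lrd";

  return NULL;
}
```

With XTheadFMemIdx alone, the REG_GP case has no instruction to fall back
to, which is why the optimizations also require XTheadMemIdx.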

The added tests ensure that this feature won't regress without notice.
Testing: GCC regression test suite and SPEC CPU 2017 intrate (base&peak).

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>

gcc/ChangeLog:

	* config/riscv/riscv.cc (riscv_index_reg_class):
	Return GR_REGS for XTheadFMemIdx.
	(riscv_regno_ok_for_index_p): Add support for XTheadFMemIdx.
	* config/riscv/riscv.h (HARDFP_REG_P): New macro.
	* config/riscv/thead.cc (is_fmemidx_mode): New function.
	(th_memidx_classify_address_index): Add support for XTheadFMemIdx.
	(th_fmemidx_output_index): New function.
	(th_output_move): Add support for XTheadFMemIdx.
	* config/riscv/thead.md (TH_M_ANYF): New mode iterator.
	(TH_M_NOEXTF): Likewise.
	(*th_fmemidx_movsf_hardfloat): New INSN.
	(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
	(*th_fmemidx_I_a): Likewise.
	(*th_fmemidx_I_c): Likewise.
	(*th_fmemidx_US_a): Likewise.
	(*th_fmemidx_US_c): Likewise.
	(*th_fmemidx_UZ_a): Likewise.
	(*th_fmemidx_UZ_c): Likewise.

gcc/testsuite/ChangeLog:

	* gcc.target/riscv/xtheadfmemidx-index-update.c: New test.
	* gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test.
	* gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadfmemidx-index.c: New test.
	* gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test.
	* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test.
	* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test.
	* gcc.target/riscv/xtheadfmemidx-uindex.c: New test.
---
 gcc/config/riscv/riscv.cc                     |   4 +-
 gcc/config/riscv/riscv.h                      |   2 +
 gcc/config/riscv/thead.cc                     |  65 ++++++-
 gcc/config/riscv/thead.md                     | 161 ++++++++++++++++++
 .../riscv/xtheadfmemidx-index-update.c        |  20 +++
 .../xtheadfmemidx-index-xtheadbb-update.c     |  20 +++
 .../riscv/xtheadfmemidx-index-xtheadbb.c      |  22 +++
 .../gcc.target/riscv/xtheadfmemidx-index.c    |  22 +++
 .../riscv/xtheadfmemidx-uindex-update.c       |  20 +++
 .../xtheadfmemidx-uindex-xtheadbb-update.c    |  20 +++
 .../riscv/xtheadfmemidx-uindex-xtheadbb.c     |  24 +++
 .../gcc.target/riscv/xtheadfmemidx-uindex.c   |  25 +++
 12 files changed, 400 insertions(+), 5 deletions(-)
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-update.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c
 create mode 100644 gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex.c
  

Comments

Jeff Law Oct. 29, 2023, 10:25 p.m. UTC | #1
On 10/20/23 03:53, Christoph Muellner wrote:
> From: Christoph Müllner <christoph.muellner@vrull.eu>
> 
> The XTheadFMemIdx ISA extension provides additional load and store
> instructions for floating-point registers with new addressing modes.
> 
> The following memory access types are supported:
> * load/store: [w,d] (single-precision FP, double-precision FP)
> 
> The following addressing modes are supported:
> * register offset with additional immediate offset (4 instructions):
>    flr<type>, fsr<type>
> * zero-extended register offset with additional immediate offset
>    (4 instructions): flur<type>, fsur<type>
> 
> These addressing modes are also part of the similar XTheadMemIdx
> ISA extension support, whose code is reused and extended to support
> floating-point registers.
> 
> One challenge that this patch needs to solve is GP registers in FP-mode
> (e.g. "(reg:DF a2)"), which cannot be handled by the XTheadFMemIdx
> instructions. Such registers are the result of independent
> optimizations, which can happen after register allocation.
> This patch uses a simple but efficient method to address this:
> add a dependency on XTheadMemIdx to the XTheadFMemIdx optimizations.
> This allows the instructions from XTheadMemIdx to be used for
> such registers.
Or alternately define secondary reloads so that you can get a scratch 
register to reload the address into a GPR.  Your call on whether or not 
to try to implement that.  I guess it largely depends on how likely it 
is you'll have one extension defined, but not the other.

> 
> The added tests ensure that this feature won't regress without notice.
> Testing: GCC regression test suite and SPEC CPU 2017 intrate (base&peak).
> 
> Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> 
> gcc/ChangeLog:
> 
> 	* config/riscv/riscv.cc (riscv_index_reg_class):
> 	Return GR_REGS for XTheadFMemIdx.
> 	(riscv_regno_ok_for_index_p): Add support for XTheadFMemIdx.
> 	* config/riscv/riscv.h (HARDFP_REG_P): New macro.
> 	* config/riscv/thead.cc (is_fmemidx_mode): New function.
> 	(th_memidx_classify_address_index): Add support for XTheadFMemIdx.
> 	(th_fmemidx_output_index): New function.
> 	(th_output_move): Add support for XTheadFMemIdx.
> 	* config/riscv/thead.md (TH_M_ANYF): New mode iterator.
> 	(TH_M_NOEXTF): Likewise.
> 	(*th_fmemidx_movsf_hardfloat): New INSN.
> 	(*th_fmemidx_movdf_hardfloat_rv64): Likewise.
> 	(*th_fmemidx_I_a): Likewise.
> 	(*th_fmemidx_I_c): Likewise.
> 	(*th_fmemidx_US_a): Likewise.
> 	(*th_fmemidx_US_c): Likewise.
> 	(*th_fmemidx_UZ_a): Likewise.
> 	(*th_fmemidx_UZ_c): Likewise.
> 
> gcc/testsuite/ChangeLog:
> 
> 	* gcc.target/riscv/xtheadfmemidx-index-update.c: New test.
> 	* gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test.
> 	* gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test.
> 	* gcc.target/riscv/xtheadfmemidx-index.c: New test.
> 	* gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test.
> 	* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test.
> 	* gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test.
> 	* gcc.target/riscv/xtheadfmemidx-uindex.c: New test.
> ---
Same note as with the prior patch WRT wrapping assembly instructions 
when using scan-assembler.
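
The review of the prior patch is not part of this thread, so the
following is only a guess at what the note is about: it assumes the
concern is that a bare pattern such as "slli" matches as a substring
anywhere in the assembler output, and that wrapping the mnemonic in word
boundaries avoids this. The C sketch below models the difference; both
function names are hypothetical.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Count plain substring occurrences of NEEDLE in TEXT.  */
size_t
count_substring (const char *text, const char *needle)
{
  size_t n = 0;
  if (*needle == '\0')
    return 0;
  for (const char *p = strstr (text, needle); p; p = strstr (p + 1, needle))
    n++;
  return n;
}

/* Word characters: letters, digits and underscore.  */
static int
is_word_char (char c)
{
  return isalnum ((unsigned char) c) || c == '_';
}

/* Count occurrences of NEEDLE that have no word character directly
   before or after them.  */
size_t
count_whole_word (const char *text, const char *needle)
{
  size_t n = 0, len = strlen (needle);
  if (len == 0)
    return 0;
  for (const char *p = strstr (text, needle); p; p = strstr (p + 1, needle))
    {
      int left_ok = (p == text) || !is_word_char (p[-1]);
      int right_ok = !is_word_char (p[len]);
      if (left_ok && right_ok)
	n++;
    }
  return n;
}
```

For output containing one "slli", one "slliw" and a file name such as
"myslli.c", the substring count is 3 while the whole-word count is 1, so
an unanchored scan-assembler-times check can miscount.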



> diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> index eb162abcb92..1e9813b4f39 100644
> --- a/gcc/config/riscv/riscv.h
> +++ b/gcc/config/riscv/riscv.h
> @@ -372,6 +372,8 @@ ASM_MISA_SPEC
>     ((unsigned int) ((int) (REGNO) - GP_REG_FIRST) < GP_REG_NUM)
>   #define FP_REG_P(REGNO)  \
>     ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM)
> +#define HARDFP_REG_P(REGNO)  \
> +  ((REGNO) >= FP_REG_FIRST && (REGNO) <= FP_REG_LAST)
>   #define V_REG_P(REGNO)  \
>     ((unsigned int) ((int) (REGNO) - V_REG_FIRST) < V_REG_NUM)
>   #define VL_REG_P(REGNO) ((REGNO) == VL_REGNUM)

> @@ -755,6 +768,40 @@ th_memidx_output_index (rtx x, machine_mode mode, bool load)
>     return buf;
>   }
>   
> +/* Provide a buffer for a th.flX/th.fluX/th.fsX/th.fsuX instruction
> +   for the given MODE. If LOAD is true, a load instruction will be
> +   provided (otherwise, a store instruction). If X is not suitable
> +   return NULL.  */
> +
> +static const char *
> +th_fmemidx_output_index (rtx x, machine_mode mode, bool load)
> +{
> +  struct riscv_address_info info;
> +  static char buf[128] = {0};
Same comment WRT static buffers as in the previous patch.

OK for the trunk after fixing the testcases and potentially adjusting 
the static buffer.  No need for another review round; post for 
the archiver and commit.

jeff
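
The earlier static-buffer comment is not quoted in this thread, so the
sketch below only shows the general property such comments usually
concern: every call to a function that formats into a function-local
static buffer returns the same pointer, so a later call overwrites an
earlier result. Both helpers are hypothetical and merely mimic the
th.fl/th.fs template formatting in the patch.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical formatter that mirrors the pattern in the patch: the
   result lives in one function-local static buffer.  UINDEX is "u"
   or "" and LOAD selects between load and store.  */
const char *
format_static (const char *uindex, int load)
{
  static char buf[32];
  snprintf (buf, sizeof (buf), "th.f%s%srw", load ? "l" : "s", uindex);
  return buf;
}

/* Alternative with caller-provided storage: each result stays valid
   for as long as the caller keeps its own buffer.  */
const char *
format_into (char *out, size_t size, const char *uindex, int load)
{
  snprintf (out, size, "th.f%s%srw", load ? "l" : "s", uindex);
  return out;
}
```

Whether this matters here depends on whether the returned template is
consumed before the next call; that is a judgment for the patch author
and is not settled by this sketch.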
  
Christoph Müllner Oct. 31, 2023, 1:43 p.m. UTC | #2
On Sun, Oct 29, 2023 at 11:25 PM Jeff Law <jeffreyalaw@gmail.com> wrote:
>
>
>
> On 10/20/23 03:53, Christoph Muellner wrote:
> > From: Christoph Müllner <christoph.muellner@vrull.eu>
> >
> > The XTheadFMemIdx ISA extension provides additional load and store
> > instructions for floating-point registers with new addressing modes.
> >
> > The following memory access types are supported:
> > * load/store: [w,d] (single-precision FP, double-precision FP)
> >
> > The following addressing modes are supported:
> > * register offset with additional immediate offset (4 instructions):
> >    flr<type>, fsr<type>
> > * zero-extended register offset with additional immediate offset
> >    (4 instructions): flur<type>, fsur<type>
> >
> > These addressing modes are also part of the similar XTheadMemIdx
> > ISA extension support, whose code is reused and extended to support
> > floating-point registers.
> >
> > One challenge that this patch needs to solve is GP registers in FP-mode
> > (e.g. "(reg:DF a2)"), which cannot be handled by the XTheadFMemIdx
> > instructions. Such registers are the result of independent
> > optimizations, which can happen after register allocation.
> > This patch uses a simple but efficient method to address this:
> > add a dependency on XTheadMemIdx to the XTheadFMemIdx optimizations.
> > This allows the instructions from XTheadMemIdx to be used for
> > such registers.
> Or alternately define secondary reloads so that you can get a scratch
> register to reload the address into a GPR.  Your call on whether or not
> to try to implement that.  I guess it largely depends on how likely it
> is you'll have one extension defined, but not the other.

I started implementing this but concluded that it is not worth the effort,
given that all cores that implement one extension also support the other.


> > The added tests ensure that this feature won't regress without notice.
> > Testing: GCC regression test suite and SPEC CPU 2017 intrate (base&peak).
> >
> > Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
> >
> > gcc/ChangeLog:
> >
> >       * config/riscv/riscv.cc (riscv_index_reg_class):
> >       Return GR_REGS for XTheadFMemIdx.
> >       (riscv_regno_ok_for_index_p): Add support for XTheadFMemIdx.
> >       * config/riscv/riscv.h (HARDFP_REG_P): New macro.
> >       * config/riscv/thead.cc (is_fmemidx_mode): New function.
> >       (th_memidx_classify_address_index): Add support for XTheadFMemIdx.
> >       (th_fmemidx_output_index): New function.
> >       (th_output_move): Add support for XTheadFMemIdx.
> >       * config/riscv/thead.md (TH_M_ANYF): New mode iterator.
> >       (TH_M_NOEXTF): Likewise.
> >       (*th_fmemidx_movsf_hardfloat): New INSN.
> >       (*th_fmemidx_movdf_hardfloat_rv64): Likewise.
> >       (*th_fmemidx_I_a): Likewise.
> >       (*th_fmemidx_I_c): Likewise.
> >       (*th_fmemidx_US_a): Likewise.
> >       (*th_fmemidx_US_c): Likewise.
> >       (*th_fmemidx_UZ_a): Likewise.
> >       (*th_fmemidx_UZ_c): Likewise.
> >
> > gcc/testsuite/ChangeLog:
> >
> >       * gcc.target/riscv/xtheadfmemidx-index-update.c: New test.
> >       * gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c: New test.
> >       * gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c: New test.
> >       * gcc.target/riscv/xtheadfmemidx-index.c: New test.
> >       * gcc.target/riscv/xtheadfmemidx-uindex-update.c: New test.
> >       * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c: New test.
> >       * gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c: New test.
> >       * gcc.target/riscv/xtheadfmemidx-uindex.c: New test.
> > ---
> Same note as with the prior patch WRT wrapping assembly instructions
> when using scan-assembler.

Will do.

>
>
>
> > diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
> > index eb162abcb92..1e9813b4f39 100644
> > --- a/gcc/config/riscv/riscv.h
> > +++ b/gcc/config/riscv/riscv.h
> > @@ -372,6 +372,8 @@ ASM_MISA_SPEC
> >     ((unsigned int) ((int) (REGNO) - GP_REG_FIRST) < GP_REG_NUM)
> >   #define FP_REG_P(REGNO)  \
> >     ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM)
> > +#define HARDFP_REG_P(REGNO)  \
> > +  ((REGNO) >= FP_REG_FIRST && (REGNO) <= FP_REG_LAST)
> >   #define V_REG_P(REGNO)  \
> >     ((unsigned int) ((int) (REGNO) - V_REG_FIRST) < V_REG_NUM)
> >   #define VL_REG_P(REGNO) ((REGNO) == VL_REGNUM)
>
> > @@ -755,6 +768,40 @@ th_memidx_output_index (rtx x, machine_mode mode, bool load)
> >     return buf;
> >   }
> >
> > +/* Provide a buffer for a th.flX/th.fluX/th.fsX/th.fsuX instruction
> > +   for the given MODE. If LOAD is true, a load instruction will be
> > +   provided (otherwise, a store instruction). If X is not suitable
> > +   return NULL.  */
> > +
> > +static const char *
> > +th_fmemidx_output_index (rtx x, machine_mode mode, bool load)
> > +{
> > +  struct riscv_address_info info;
> > +  static char buf[128] = {0};
> Same comment WRT static buffers as in the previous patch.
>
> OK for the trunk after fixing the testcases and potentially adjusting
> the static buffer.  No need for another review round; post for
> the archiver and commit.

Thanks!

>
> jeff
  

Patch

diff --git a/gcc/config/riscv/riscv.cc b/gcc/config/riscv/riscv.cc
index 70aaaa53b76..2ecdd521b75 100644
--- a/gcc/config/riscv/riscv.cc
+++ b/gcc/config/riscv/riscv.cc
@@ -1084,7 +1084,7 @@  riscv_regno_mode_ok_for_base_p (int regno,
 enum reg_class
 riscv_index_reg_class ()
 {
-  if (TARGET_XTHEADMEMIDX)
+  if (TARGET_XTHEADMEMIDX || TARGET_XTHEADFMEMIDX)
     return GR_REGS;
 
   return NO_REGS;
@@ -1097,7 +1097,7 @@  riscv_index_reg_class ()
 int
 riscv_regno_ok_for_index_p (int regno)
 {
-  if (TARGET_XTHEADMEMIDX)
+  if (TARGET_XTHEADMEMIDX || TARGET_XTHEADFMEMIDX)
     return riscv_regno_mode_ok_for_base_p (regno, VOIDmode, 1);
 
   return 0;
diff --git a/gcc/config/riscv/riscv.h b/gcc/config/riscv/riscv.h
index eb162abcb92..1e9813b4f39 100644
--- a/gcc/config/riscv/riscv.h
+++ b/gcc/config/riscv/riscv.h
@@ -372,6 +372,8 @@  ASM_MISA_SPEC
   ((unsigned int) ((int) (REGNO) - GP_REG_FIRST) < GP_REG_NUM)
 #define FP_REG_P(REGNO)  \
   ((unsigned int) ((int) (REGNO) - FP_REG_FIRST) < FP_REG_NUM)
+#define HARDFP_REG_P(REGNO)  \
+  ((REGNO) >= FP_REG_FIRST && (REGNO) <= FP_REG_LAST)
 #define V_REG_P(REGNO)  \
   ((unsigned int) ((int) (REGNO) - V_REG_FIRST) < V_REG_NUM)
 #define VL_REG_P(REGNO) ((REGNO) == VL_REGNUM)
diff --git a/gcc/config/riscv/thead.cc b/gcc/config/riscv/thead.cc
index 236b590fd80..ebad379dbc5 100644
--- a/gcc/config/riscv/thead.cc
+++ b/gcc/config/riscv/thead.cc
@@ -592,6 +592,18 @@  is_memidx_mode (machine_mode mode)
   return false;
 }
 
+static bool
+is_fmemidx_mode (machine_mode mode)
+{
+  if (mode == SFmode && TARGET_HARD_FLOAT)
+    return true;
+
+  if (mode == DFmode && TARGET_DOUBLE_FLOAT)
+    return true;
+
+  return false;
+}
+
 /* Return true if X is a valid address for T-Head's memory addressing modes
    with scaled register offsets for machine mode MODE.
    If it is, fill in INFO appropriately (if non-NULL).
@@ -602,7 +614,8 @@  th_memidx_classify_address_index (struct riscv_address_info *info, rtx x,
 				  machine_mode mode, bool strict_p)
 {
   /* Ensure that the mode is supported.  */
-  if (!(TARGET_XTHEADMEMIDX && is_memidx_mode (mode)))
+  if (!(TARGET_XTHEADMEMIDX && is_memidx_mode (mode))
+      && !(TARGET_XTHEADFMEMIDX && is_fmemidx_mode (mode)))
     return false;
 
   if (GET_CODE (x) != PLUS)
@@ -755,6 +768,40 @@  th_memidx_output_index (rtx x, machine_mode mode, bool load)
   return buf;
 }
 
+/* Provide a buffer for a th.flX/th.fluX/th.fsX/th.fsuX instruction
+   for the given MODE. If LOAD is true, a load instruction will be
+   provided (otherwise, a store instruction). If X is not suitable
+   return NULL.  */
+
+static const char *
+th_fmemidx_output_index (rtx x, machine_mode mode, bool load)
+{
+  struct riscv_address_info info;
+  static char buf[128] = {0};
+
+  /* Validate x.  */
+  if (!th_memidx_classify_address_index (&info, x, mode, false))
+    return NULL;
+
+  int index = exact_log2 (GET_MODE_SIZE (mode).to_constant ()) - 2;
+  bool uindex = info.type == ADDRESS_REG_UREG;
+
+  const char *const insn[][2] = {
+    {
+      "th.fs%srw\t%%z1,%%0",
+      "th.fs%srd\t%%z1,%%0"
+    },
+    {
+      "th.fl%srw\t%%0,%%1",
+      "th.fl%srd\t%%0,%%1"
+    }
+  };
+
+  snprintf (buf, sizeof (buf), insn[load][index], uindex ? "u" : "");
+
+  return buf;
+}
+
 /* Return true if X is a valid address for T-Head's memory addressing modes
    for machine mode MODE.  If it is, fill in INFO appropriately (if non-NULL).
    If STRICT_P is true then REG_OK_STRICT is in effect.  */
@@ -804,25 +851,37 @@  th_output_move (rtx dest, rtx src)
   if (dest_code == REG && src_code == MEM)
     {
       rtx x = XEXP (src, 0);
-      if (GET_MODE_CLASS (mode) == MODE_INT)
+      if (GET_MODE_CLASS (mode) == MODE_INT
+	  || (GET_MODE_CLASS (mode) == MODE_FLOAT && GP_REG_P (REGNO (dest))))
 	{
 	  if ((insn = th_memidx_output_index (x, mode, true)))
 	    return insn;
 	  if ((insn = th_memidx_output_modify (x, mode, true)))
 	    return insn;
 	}
+      else if (GET_MODE_CLASS (mode) == MODE_FLOAT && HARDFP_REG_P (REGNO (dest)))
+	{
+	  if ((insn = th_fmemidx_output_index (x, mode, true)))
+	    return insn;
+	}
     }
   else if (dest_code == MEM && (src_code == REG || src == CONST0_RTX (mode)))
     {
       rtx x = XEXP (dest, 0);
       if (GET_MODE_CLASS (mode) == MODE_INT
-	  || src == CONST0_RTX (mode))
+	  || src == CONST0_RTX (mode)
+	  || (GET_MODE_CLASS (mode) == MODE_FLOAT && GP_REG_P (REGNO (src))))
 	{
 	  if ((insn = th_memidx_output_index (x, mode, false)))
 	    return insn;
 	  if ((insn = th_memidx_output_modify (x, mode, false)))
 	    return insn;
 	}
+      else if (GET_MODE_CLASS (mode) == MODE_FLOAT && HARDFP_REG_P (REGNO (src)))
+	{
+	  if ((insn = th_fmemidx_output_index (x, mode, false)))
+	    return insn;
+	}
     }
   return NULL;
 }
diff --git a/gcc/config/riscv/thead.md b/gcc/config/riscv/thead.md
index e9a8bf579d0..1840245748d 100644
--- a/gcc/config/riscv/thead.md
+++ b/gcc/config/riscv/thead.md
@@ -587,15 +587,24 @@  (define_insn "*th_memidx_bb_extendqi<SUPERQI:mode>2"
   [(set_attr "move_type" "shift_shift,load,load,load,load,load")
    (set_attr "mode" "<SUPERQI:MODE>")])
 
+;; All modes that are supported by XTheadMemIdx
 (define_mode_iterator TH_M_ANYI [(QI "TARGET_XTHEADMEMIDX")
                                  (HI "TARGET_XTHEADMEMIDX")
                                  (SI "TARGET_XTHEADMEMIDX")
                                  (DI "TARGET_64BIT && TARGET_XTHEADMEMIDX")])
 
+;; All modes that are supported by XTheadFMemIdx
+(define_mode_iterator TH_M_ANYF [(SF "TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX")
+                                 (DF "TARGET_DOUBLE_FLOAT && TARGET_XTHEADFMEMIDX")])
+
 ;; All non-extension modes that are supported by XTheadMemIdx
 (define_mode_iterator TH_M_NOEXTI [(SI "!TARGET_64BIT && TARGET_XTHEADMEMIDX")
                                    (DI "TARGET_64BIT && TARGET_XTHEADMEMIDX")])
 
+;; All non-extension modes that are supported by XTheadFMemIdx
+(define_mode_iterator TH_M_NOEXTF [(SF "TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX")
+                                   (DF "TARGET_DOUBLE_FLOAT && TARGET_XTHEADFMEMIDX")])
+
 ;; XTheadMemIdx optimizations
 ;; All optimizations attempt to improve the operand utilization of
 ;; XTheadMemIdx instructions, where one sign or zero extended
@@ -812,4 +821,156 @@  (define_insn_and_split "*th_memidx_UZ_c"
         (match_dup 0))]
 )
 
+;; XTheadFMemIdx
+
+(define_insn "*th_fmemidx_movsf_hardfloat"
+  [(set (match_operand:SF 0 "nonimmediate_operand" "=f,th_m_mir,f,th_m_miu")
+	(match_operand:SF 1 "move_operand"         " th_m_mir,f,th_m_miu,f"))]
+  "TARGET_HARD_FLOAT && TARGET_XTHEADFMEMIDX
+   && (register_operand (operands[0], SFmode)
+       || reg_or_0_operand (operands[1], SFmode))"
+  { return riscv_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "fpload,fpstore,fpload,fpstore")
+   (set_attr "mode" "SF")])
+
+(define_insn "*th_fmemidx_movdf_hardfloat_rv64"
+  [(set (match_operand:DF 0 "nonimmediate_operand" "=f,th_m_mir,f,th_m_miu")
+	(match_operand:DF 1 "move_operand"         " th_m_mir,f,th_m_miu,f"))]
+  "TARGET_64BIT && TARGET_DOUBLE_FLOAT && TARGET_XTHEADFMEMIDX
+   && (register_operand (operands[0], DFmode)
+       || reg_or_0_operand (operands[1], DFmode))"
+  { return riscv_output_move (operands[0], operands[1]); }
+  [(set_attr "move_type" "fpload,fpstore,fpload,fpstore")
+   (set_attr "mode" "DF")])
+
+;; XTheadFMemIdx optimizations
+;; Similar to the XTheadMemIdx optimizations, but with fewer cases.
+;; Note that we might get GP registers in FP-mode (reg:DF a2)
+;; which cannot be handled by the XTheadFMemIdx instructions.
+;; This might even happen after register allocation.
+;; We could implement splitters that undo the combiner results
+;; if "after_reload && !HARDFP_REG_P (operands[0])", but this
+;; raises even more questions (e.g. split into what?).
+;; So let's solve this by simply requiring XTheadMemIdx
+;; which provides the necessary instructions to cover this case.
+
+(define_insn_and_split "*th_fmemidx_I_a"
+  [(set (match_operand:TH_M_NOEXTF 0 "register_operand" "=f")
+        (mem:TH_M_NOEXTF (plus:X
+          (mult:X (match_operand:X 1 "register_operand" "r")
+                  (match_operand:QI 2 "immediate_operand" "i"))
+          (match_operand:X 3 "register_operand" "r"))))]
+  "TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
+   && CONST_INT_P (operands[2])
+   && pow2p_hwi (INTVAL (operands[2]))
+   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (mem:TH_M_NOEXTF (plus:X
+          (match_dup 3)
+          (ashift:X (match_dup 1) (match_dup 2)))))]
+  { operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
+  }
+)
+
+(define_insn_and_split "*th_fmemidx_I_c"
+  [(set (mem:TH_M_ANYF (plus:X
+          (mult:X (match_operand:X 1 "register_operand" "r")
+                  (match_operand:QI 2 "immediate_operand" "i"))
+          (match_operand:X 3 "register_operand" "r")))
+        (match_operand:TH_M_ANYF 0 "register_operand" "f"))]
+  "TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
+   && CONST_INT_P (operands[2])
+   && pow2p_hwi (INTVAL (operands[2]))
+   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)"
+  "#"
+  "&& 1"
+  [(set (mem:TH_M_ANYF (plus:X
+          (match_dup 3)
+          (ashift:X (match_dup 1) (match_dup 2))))
+        (match_dup 0))]
+  { operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
+  }
+)
+
+(define_insn_and_split "*th_fmemidx_US_a"
+  [(set (match_operand:TH_M_NOEXTF 0 "register_operand" "=f")
+        (mem:TH_M_NOEXTF (plus:DI
+          (and:DI
+            (mult:DI (match_operand:DI 1 "register_operand" "r")
+                     (match_operand:QI 2 "immediate_operand" "i"))
+            (match_operand:DI 3 "immediate_operand" "i"))
+          (match_operand:DI 4 "register_operand" "r"))))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
+   && CONST_INT_P (operands[2])
+   && pow2p_hwi (INTVAL (operands[2]))
+   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)
+   && CONST_INT_P (operands[3])
+   && (INTVAL (operands[3]) >> exact_log2 (INTVAL (operands[2]))) == 0xffffffff"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (mem:TH_M_NOEXTF (plus:DI
+          (match_dup 4)
+          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 2)))))]
+  { operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
+  }
+)
+
+(define_insn_and_split "*th_fmemidx_US_c"
+  [(set (mem:TH_M_ANYF (plus:DI
+          (and:DI
+            (mult:DI (match_operand:DI 1 "register_operand" "r")
+                     (match_operand:QI 2 "immediate_operand" "i"))
+            (match_operand:DI 3 "immediate_operand" "i"))
+          (match_operand:DI 4 "register_operand" "r")))
+        (match_operand:TH_M_ANYF 0 "register_operand" "f"))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
+   && CONST_INT_P (operands[2])
+   && pow2p_hwi (INTVAL (operands[2]))
+   && IN_RANGE (exact_log2 (INTVAL (operands[2])), 1, 3)
+   && CONST_INT_P (operands[3])
+   && (INTVAL (operands[3]) >> exact_log2 (INTVAL (operands[2]))) == 0xffffffff"
+  "#"
+  "&& 1"
+  [(set (mem:TH_M_ANYF (plus:DI
+          (match_dup 4)
+          (ashift:DI (zero_extend:DI (match_dup 1)) (match_dup 2))))
+        (match_dup 0))]
+  { operands[1] = gen_lowpart (SImode, operands[1]);
+    operands[2] = GEN_INT (exact_log2 (INTVAL (operands [2])));
+  }
+)
+
+(define_insn_and_split "*th_fmemidx_UZ_a"
+  [(set (match_operand:TH_M_NOEXTF 0 "register_operand" "=f")
+        (mem:TH_M_NOEXTF (plus:DI
+          (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
+          (match_operand:DI 2 "register_operand" "r"))))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX
+   && (!HARD_REGISTER_NUM_P (REGNO (operands[0])) || HARDFP_REG_P (REGNO (operands[0])))"
+  "#"
+  "&& 1"
+  [(set (match_dup 0)
+        (mem:TH_M_NOEXTF (plus:DI
+          (match_dup 2)
+          (zero_extend:DI (match_dup 1)))))]
+)
+
+(define_insn_and_split "*th_fmemidx_UZ_c"
+  [(set (mem:TH_M_ANYF (plus:DI
+          (zero_extend:DI (match_operand:SI 1 "register_operand" "r"))
+          (match_operand:DI 2 "register_operand" "r")))
+        (match_operand:TH_M_ANYF 0 "register_operand" "f"))]
+  "TARGET_64BIT && TARGET_XTHEADMEMIDX && TARGET_XTHEADFMEMIDX"
+  "#"
+  "&& 1"
+  [(set (mem:TH_M_ANYF (plus:DI
+          (match_dup 2)
+          (zero_extend:DI (match_dup 1))))
+        (match_dup 0))]
+)
+
 (include "thead-peephole.md")
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
new file mode 100644
index 00000000000..931770037ca
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-update.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32imafc_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LR_REG_IMM_UPD(float, 0)
+#if __riscv_xlen == 64
+LR_REG_IMM_UPD(double, 2)
+#endif
+
+SR_REG_IMM_UPD(float, 1)
+#if __riscv_xlen == 64
+SR_REG_IMM_UPD(double, 3)
+#endif
+
+/* If the shifted value is used later, we cannot eliminate it.  */
+/* { dg-final { scan-assembler-times "slli" 1 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "slli" 3 { target { rv64 } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
new file mode 100644
index 00000000000..fe4be4d095c
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb-update.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32imafc_xtheadbb_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LR_REG_IMM_UPD(float, 0)
+#if __riscv_xlen == 64
+LR_REG_IMM_UPD(double, 2)
+#endif
+
+SR_REG_IMM_UPD(float, 1)
+#if __riscv_xlen == 64
+SR_REG_IMM_UPD(double, 3)
+#endif
+
+/* If the shifted value is used later, we cannot eliminate it.  */
+/* { dg-final { scan-assembler-times "slli" 1 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "slli" 3 { target { rv64 } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
new file mode 100644
index 00000000000..dff76894802
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index-xtheadbb.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32imafc_xtheadbb_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LR_REG_IMM(float, 0)
+/* { dg-final { scan-assembler-times "th.flrw\t\[^\n\r\]*0\[\n\r\]" 1 } } */
+#if __riscv_xlen == 64
+LR_REG_IMM(double, 2)
+/* { dg-final { scan-assembler-times "th.flrd\t\[^\n\r\]*2\[\n\r\]" 1 { target { rv64 } } } } */
+#endif
+
+SR_REG_IMM(float, 1)
+/* { dg-final { scan-assembler-times "th.fsrw\t\[^\n\r\]*1\[\n\r\]" 1 } } */
+#if __riscv_xlen == 64
+SR_REG_IMM(double, 3)
+/* { dg-final { scan-assembler-times "th.fsrd\t\[^\n\r\]*3\[\n\r\]" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "slli" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c
new file mode 100644
index 00000000000..5d8800863b4
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-index.c
@@ -0,0 +1,22 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32imafc_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LR_REG_IMM(float, 0)
+/* { dg-final { scan-assembler-times "th.flrw\t\[^\n\r\]*0\[\n\r\]" 1 } } */
+#if __riscv_xlen == 64
+LR_REG_IMM(double, 2)
+/* { dg-final { scan-assembler-times "th.flrd\t\[^\n\r\]*2\[\n\r\]" 1 { target { rv64 } } } } */
+#endif
+
+SR_REG_IMM(float, 1)
+/* { dg-final { scan-assembler-times "th.fsrw\t\[^\n\r\]*1\[\n\r\]" 1 } } */
+#if __riscv_xlen == 64
+SR_REG_IMM(double, 3)
+/* { dg-final { scan-assembler-times "th.fsrd\t\[^\n\r\]*3\[\n\r\]" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "slli" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-update.c b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-update.c
new file mode 100644
index 00000000000..63e96be3741
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-update.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32imafc_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LRU_REG_IMM_UPD(float, 0)
+#if __riscv_xlen == 64
+LRU_REG_IMM_UPD(double, 2)
+#endif
+
+SRU_REG_IMM_UPD(float, 1)
+#if __riscv_xlen == 64
+SRU_REG_IMM_UPD(double, 3)
+#endif
+
+/* If the shifted value is used later, we cannot eliminate it.  */
+/* { dg-final { scan-assembler-times "slli" 1 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "slli" 3 { target { rv64 } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c
new file mode 100644
index 00000000000..8100720bbe8
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb-update.c
@@ -0,0 +1,20 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32imafc_xtheadbb_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LRU_REG_IMM_UPD(float, 0)
+#if __riscv_xlen == 64
+LRU_REG_IMM_UPD(double, 2)
+#endif
+
+SRU_REG_IMM_UPD(float, 1)
+#if __riscv_xlen == 64
+SRU_REG_IMM_UPD(double, 3)
+#endif
+
+/* If the shifted value is used later, we cannot eliminate it.  */
+/* { dg-final { scan-assembler-times "slli" 1 { target { rv32 } } } } */
+/* { dg-final { scan-assembler-times "slli" 3 { target { rv64 } } } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c
new file mode 100644
index 00000000000..a37473469be
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex-xtheadbb.c
@@ -0,0 +1,24 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadbb_xtheadmemidx_xtheadfmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32imafc_xtheadbb_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LRU_REG_IMM(float, 0)
+/* { dg-final { scan-assembler-times "th.flurw\t\[^\n\r\]*0\[\n\r\]" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.flrw\t\[^\n\r\]*0\[\n\r\]" 1 { target { rv32 } } } } */
+#if __riscv_xlen == 64
+LRU_REG_IMM(double, 2)
+/* { dg-final { scan-assembler-times "th.flurd\t\[^\n\r\]*2\[\n\r\]" 1 { target { rv64 } } } } */
+#endif
+
+SRU_REG_IMM(float, 1)
+/* { dg-final { scan-assembler-times "th.fsurw\t\[^\n\r\]*1\[\n\r\]" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.fsrw\t\[^\n\r\]*1\[\n\r\]" 1 { target { rv32 } } } } */
+#if __riscv_xlen == 64
+SRU_REG_IMM(double, 3)
+/* { dg-final { scan-assembler-times "th.fsurd\t\[^\n\r\]*3\[\n\r\]" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "slli" } } */
diff --git a/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex.c b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex.c
new file mode 100644
index 00000000000..ca4783b400b
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/xtheadfmemidx-uindex.c
@@ -0,0 +1,25 @@ 
+/* { dg-do compile } */
+/* { dg-skip-if "" { *-*-* } { "-O0" "-O1" "-Og" } } */
+/* { dg-options "-march=rv64gc_xtheadmemidx_xtheadfmemidx" { target { rv64 } } } */
+/* { dg-options "-march=rv32imafc_xtheadmemidx_xtheadfmemidx -mabi=ilp32f" { target { rv32 } } } */
+
+#include "xtheadmemidx-helpers.h"
+
+LRU_REG_IMM(float, 0)
+/* { dg-final { scan-assembler-times "th.flurw\t\[^\n\r\]*0\[\n\r\]" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.flrw\t\[^\n\r\]*0\[\n\r\]" 1 { target { rv32 } } } } */
+#if __riscv_xlen == 64
+LRU_REG_IMM(double, 2)
+/* { dg-final { scan-assembler-times "th.flurd\t\[^\n\r\]*2\[\n\r\]" 1 { target { rv64 } } } } */
+#endif
+
+SRU_REG_IMM(float, 1)
+/* { dg-final { scan-assembler-times "th.fsurw\t\[^\n\r\]*1\[\n\r\]" 1 { target { rv64 } } } } */
+/* { dg-final { scan-assembler-times "th.fsrw\t\[^\n\r\]*1\[\n\r\]" 1 { target { rv32 } } } } */
+#if __riscv_xlen == 64
+SRU_REG_IMM(double, 3)
+/* { dg-final { scan-assembler-times "th.fsurd\t\[^\n\r\]*3\[\n\r\]" 1 { target { rv64 } } } } */
+#endif
+
+/* { dg-final { scan-assembler-not "slli" } } */
+