[RFC,0/1] RISC-V: Common register pair framework

Message ID cover.1664599545.git.research_trasio@irq.a4lg.com

Message

Tsukasa OI Oct. 1, 2022, 4:45 a.m. UTC
  Hi all RISC-V folks,

My 'Zfinx' fixes PATCH 3/3 is going to be revised due to:

(1) Lack of 'Zhinxmin' + 'Z[dq]inx' support
(2) Insufficient test coverage
(3) Possible room to improve maintainability

This week, I worked on (1) and (2), and the improvements went well.  For
(3), I might be able to offer another proposal.


Register pairs using GPRs (that require certain alignments) are not
exclusive to 'Z[dq]inx'.  We currently expect the following extensions to
use aligned register pairs:

-   'Zdinx' (ratified)
-   'Zqinx' (once proposed but not ratified)
-   'Zbpbo'       (a part of 'P'-extension proposal)
    Only used for instruction aliases WEXT and WEXTI.
-   'Zpsfoperand' (a part of 'P'-extension proposal)
    For instance, a 'P'-extension proposal implementation
    <https://github.com/riscvarchive/riscv-binutils-gdb/pull/257>
    uses "nds_rdp", "nds_rsp" and "nds_rtp" for register pair operands.

Because of this, a common framework for handling register pairs would make
supporting them a lot easier, but only if the following statement is
acceptable:

    If an invalid / reserved encoding is disassembled,
    it might be disassembled as an instruction with invalid operands.

Take RVE, for example.

If an instruction (for use in the RV32E environment) has the encoding
"add x14, x15, x16", x16 is an invalid GPR for RVE because RVE has 16 GPRs
(x0-x15) instead of 32 (x0-x31).

The best disassembler result (for me) would be ".4byte 0x1078733", meaning
that the instruction is not recognized.  However, this approach is
sometimes hard to implement and forces the disassembler to buffer its
output before printing.

The second best solution would be... "add x14, x15, invalid16" or something
like that.  "invalid16" signals that the encoding is not valid in the
current context (e.g. ELF attributes indicating RV32E).  Note that a similar
solution, "unknown", is already used by the RISC-V disassembler.

    .insn 0x0107e753

This is an invalid encoding of "fadd.s fa4, fa5, fa6" with a reserved
rounding mode (0b110).  Current GNU Binutils disassembles this word as:

    fadd.s fa4,fa5,fa6,unknown

"unknown" makes sense here.
Extending this idea looks like a viable solution to me.


So, my big question is: is it acceptable to extend this idea?
Here are just some of the cases it would cover:

-   RVE: invalid register number
-   Register Pairs (Z[dq]inx and Zpsfoperand): invalid register number
-   Shift amount (SHAMT): equal to or greater than XLEN
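
Editorial sketch, not part of the patch: the cases above all boil down to
checking one decoded field against a context-dependent limit.  The helper
names below (insn_bits, gpr_is_valid, rm_is_reserved, shamt_is_valid) are
hypothetical and do not exist in Binutils; the register pair case is left
out because it depends on the operand length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extract LEN bits of a 32-bit instruction word, starting at bit LO.
   Hypothetical helper, not a Binutils function.  */
static unsigned insn_bits(uint32_t insn, unsigned lo, unsigned len)
{
    assert(len < 32 && lo + len <= 32);
    return (insn >> lo) & ((1u << len) - 1u);
}

/* RVE only has x0-x15, so register numbers 16-31 are invalid there.  */
static bool gpr_is_valid(unsigned regno, bool is_rve)
{
    return regno < (is_rve ? 16u : 32u);
}

/* Rounding modes 0b101 and 0b110 are reserved encodings.  */
static bool rm_is_reserved(unsigned rm)
{
    return rm == 5 || rm == 6;
}

/* A shift amount is only valid if it is smaller than XLEN.  */
static bool shamt_is_valid(unsigned shamt, unsigned xlen)
{
    return shamt < xlen;
}
```

With the two words from this mail, the RS2 field (bits 24:20) of 0x01078733
is 16, which gpr_is_valid() rejects under RVE, and the rounding mode field
(bits 14:12) of 0x0107e753 is 0b110, which rm_is_reserved() flags.  A
disassembler following the proposal would print a placeholder operand in
exactly those situations.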



If the idea above is acceptable to RISC-V GNU toolchain developers, this
patchset provides a common framework for register pairs, for both
'Z[dq]inx' and 'Zpsfoperand'.

Operand Format:
1.  'l' (stands for "length")
2.  One of the following:
    '1' for  32-bit data  (or less; though redundant, makes code readable)
    '2' for  64-bit data  (RV32: 2 registers)
    '4' for 128-bit data  (RV32: 4 registers, RV64: 2 registers)
3.  One of the following:
    'd' for RD
    's' for RS1
    't' for RS2
    'r' for RS3
    'u' for RS1 and RS2 (where RS1 == RS2)
    (note that a GPR is expected here, even on 'r' and 'u')
    (To be added later for 'P' extension proposal; 'Zbpbo' WEXT aliases):
        'F' for RS1 and RS3 (RV32 "l2F" only; RS1 is even and RS3==RS1+1)

For instance, "l2d" means a 64-bit wide destination register operand.  On
RV32, it requires a register pair and the register number must be even.
On RV64, it represents a single GPR with no alignment requirements.

When assembling, it indirectly raises an error for an invalid register.
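
Editorial sketch, not the code from the patch: one way to read the operand
format is "how many XLEN-bit registers does this operand need, and is the
register number aligned to that count?".  The function names are
hypothetical.  The rule that a group of N registers must start at a
multiple of N is an assumption that generalizes the "must be even" rule
stated above for the 2-register case.

```c
#include <stdbool.h>

/* Number of XLEN-bit GPRs needed for length code '1', '2' or '4'.
   Returns 0 for an unknown length code.  Hypothetical helper.  */
static unsigned regs_needed(char len_code, unsigned xlen)
{
    unsigned width;

    switch (len_code) {
    case '1': width = 32;  break;
    case '2': width = 64;  break;
    case '4': width = 128; break;
    default:  return 0;
    }
    return width <= xlen ? 1 : width / xlen;
}

/* Check REGNO against an operand spec such as "l2d".
   ASSUMPTION: a group of N registers must start at a multiple of N.
   The field letter (spec[2]) only selects RD/RS1/RS2/RS3 and does not
   affect the alignment rule, so it is not inspected here.  */
static bool regno_is_valid(const char *spec, unsigned xlen, unsigned regno)
{
    unsigned n;

    if (spec[0] != 'l' || spec[1] == '\0' || spec[2] == '\0')
        return false;
    n = regs_needed(spec[1], xlen);
    if (n == 0 || regno > 31)
        return false;
    return regno % n == 0;
}
```

Under these assumptions, regno_is_valid("l2d", 32, 11) is false because x11
is odd, while the same operand on RV64 needs only one register and accepts
any register number.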

When disassembling, it first accepts all register numbers, but the output
depends on whether the register number satisfies the register pair
alignment requirements:

-   If valid, a regular GPR operand is printed.
    (Style: dis_style_register)
-   If not,   "invalid%d" (where %d is the register number) is printed.
    (Style: dis_style_text)
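
Editorial sketch of the disassembler side, under the same alignment
assumption as the format description: the operand is formatted either as a
register name or as "invalid%d", and the caller is told which style applies.
The enum and function below are hypothetical stand-ins so that the example is
self-contained; they are not the code in opcodes/riscv-dis.c.

```c
#include <stddef.h>
#include <stdio.h>

/* Stand-ins for the two styles named above (hypothetical names).  */
enum sketch_style { SKETCH_STYLE_TEXT, SKETCH_STYLE_REGISTER };

/* ABI names of the 32 integer registers.  */
static const char *const sketch_gpr_names[32] = {
    "zero", "ra", "sp", "gp", "tp", "t0", "t1", "t2",
    "s0",   "s1", "a0", "a1", "a2", "a3", "a4", "a5",
    "a6",   "a7", "s2", "s3", "s4", "s5", "s6", "s7",
    "s8",   "s9", "s10", "s11", "t3", "t4", "t5", "t6"
};

/* Format REGNO for an operand that occupies NREGS registers.
   ASSUMPTION: REGNO is valid only if it is a multiple of NREGS.  */
static enum sketch_style
format_pair_operand(char *buf, size_t size, unsigned regno, unsigned nregs)
{
    if (regno < 32 && nregs != 0 && regno % nregs == 0) {
        snprintf(buf, size, "%s", sketch_gpr_names[regno]);
        return SKETCH_STYLE_REGISTER;
    }
    snprintf(buf, size, "invalid%u", regno);
    return SKETCH_STYLE_TEXT;
}
```

For a 2-register operand, register number 10 is formatted as "a0" with the
register style, while register number 11 is formatted as "invalid11" with
the text style.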

I confirmed that this is sufficient to implement 'Z[dq]inx' and
'Zpsfoperand' except old 'P'-proposal aliases WEXT and WEXTI (now
implemented as aliases of FSR and FSRI; "l2F" will be added later).

I would like to hear your thoughts.

Thanks,
Tsukasa




Appendix: About my new 'Z[dq]inx' register pair implementation
          based on this Framework

My original 'Z[dq]inx' register pair implementation was intended to
implement the best solution for the disassembler.  In contrast, my new
'Z[dq]inx' register pair validation based on this framework has the
following benefits:

1.  Although an error is indirectly generated (as before), it explicitly
    checks whether the register number is valid directly in the assembler.
2.  My previous implementation required custom match_opcode functions and
    separate opcode entries as shown below (for Zdinx):
    -   'D'
    -   'Zdinx' (XLEN==32)
    -   'Zdinx' (XLEN==64)
    This makes code maintenance harder.  The new implementation still
    requires splitting the opcode entry, but only into:
    -   'D'
    -   'Zdinx' (for all XLEN)
    Besides reducing the amount of changes, this improves maintainability.
3.  Operand length and type fields are adjacent.
    In contrast to my previous implementation (operand types and custom
    match_opcode function representing length for all operands),
    it's pretty easy to understand.

My new attempt at 'Z[dq]inx' (still in development) is available at:
<https://github.com/a4lg/binutils-gdb/tree/riscv-float-combined-2>




Tsukasa OI (1):
  RISC-V: Implement common register pair framework

 gas/config/tc-riscv.c | 72 +++++++++++++++++++++++++++++++++++++++++++
 opcodes/riscv-dis.c   | 31 +++++++++++++++++++
 2 files changed, 103 insertions(+)


base-commit: 06bed95d8d2bac94956509dfc1f223d00e51eafb
  

Comments

Nelson Chu Oct. 1, 2022, 7:17 a.m. UTC | #1
On Sat, Oct 1, 2022 at 12:46 PM Tsukasa OI <research_trasio@irq.a4lg.com> wrote:
> The second best solution would be... "add x14, x15, invalid16" or something
> like that.  "invalid16" is used so that the encoding is not valid on the
> current situation (e.g. ELF attributes meaning RV32E).  Note that similar
> solution "unknown" is already used by the RISC-V disassembler.
>
>     .insn 0x0107e753

The main purpose of .insn is to let users encode (custom) instructions
that the assembler does not support yet.  Therefore, trying to recognize
and report something for .insn is rather unnecessary, even though it can
also be used to encode existing instructions.  We should advise users not
to write supported instructions with .insn directives.

> This is an invalid encoding of "fadd.s fa4, fa5, fa6" with a reserved
> rounding mode (0b110).  Current GNU Binutils disassembles this word as:
>
>     fadd.s fa4,fa5,fa6,unknown
>
> "unknown" makes sense here.
> Extending this idea looks a viable solution to me.
>
> For instance, "l2d" means a 64-bit width destination register operand.  On
> RV32, it would require a register pair and the register number must be even.
> On RV64, it represents a GPR with no alignment requirements.
>
> When assembling, it indirectly raises an error for an invalid register.

I would suggest not spending too much time on these error-reporting
topics.  I used to do something similar for the RVV constraints, but all
of those checks have since been abandoned, because the stricter the
assembler's checks are, the more hardware test checking fails; see
https://github.com/riscvarchive/riscv-binutils-gdb/pull/193.
Unfortunately, there is no such thing as the best of both worlds.

Nelson

  
Tsukasa OI Oct. 6, 2022, 12:21 p.m. UTC | #2
On 2022/10/01 16:17, Nelson Chu wrote:
> On Sat, Oct 1, 2022 at 12:46 PM Tsukasa OI <research_trasio@irq.a4lg.com> wrote:
>> When assembling, it indirectly raises an error for an invalid register.
> 
> I would suggest you not to spend too much time on these topics about
> error reporting.  I used to do something similar for rvv constraints,
> but now all of them are abandoned, since the stricter the assembler
> checks, the hardware test checking will fail,
> https://github.com/riscvarchive/riscv-binutils-gdb/pull/193.
> Unfortunately, there is no such thing as the best of both worlds.

I'm talking about compliance with the specification.  Hardware test
checking is not very relevant to the toolchain (though not totally
irrelevant).  Please enlighten me as to what exactly you mean.  As far as
I understand, you have not given any valid reason _not_ to do such
checks.

I provided two viable PoCs (Z[dq]inx and 'P'-extension proposal).  Why
are you doing that?!

Tsukasa
