LoongArch: Add support for b ".L1" and beq "$t0", "$t1", ".L1"

Message ID 20231206031724.2330403-1-mengqinggang@loongson.cn
State Accepted
Headers
Series LoongArch: Add support for b ".L1" and beq "$t0", "$t1", ".L1" |

Checks

Context Check Description
snail/binutils-gdb-check success Github commit url

Commit Message

mengqinggang Dec. 6, 2023, 3:17 a.m. UTC
  Support register and symbol names enclosed in double quotation marks.
---
 .../gas/loongarch/double_quotation_marks.d    | 11 +++++
 .../gas/loongarch/double_quotation_marks.s    |  2 +
 opcodes/loongarch-coder.c                     | 41 ++++++++++++++++---
 3 files changed, 48 insertions(+), 6 deletions(-)
 create mode 100644 gas/testsuite/gas/loongarch/double_quotation_marks.d
 create mode 100644 gas/testsuite/gas/loongarch/double_quotation_marks.s
  

Comments

WANG Xuerui Dec. 6, 2023, 3:28 a.m. UTC | #1
Hi,

On 12/6/23 11:17, mengqinggang wrote:
> Support register and symbol names enclosed in double quotation marks.
> ---
>   .../gas/loongarch/double_quotation_marks.d    | 11 +++++
>   .../gas/loongarch/double_quotation_marks.s    |  2 +
>   opcodes/loongarch-coder.c                     | 41 ++++++++++++++++---
>   3 files changed, 48 insertions(+), 6 deletions(-)
>   create mode 100644 gas/testsuite/gas/loongarch/double_quotation_marks.d
>   create mode 100644 gas/testsuite/gas/loongarch/double_quotation_marks.s
>
> [snip]
>
> diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.s b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> new file mode 100644
> index 00000000000..bb8acb99a40
> --- /dev/null
> +++ b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> @@ -0,0 +1,2 @@
> +# Before only support beq $t0, $t1, .L1
> +beq "$t0", "$t1", ".L1"
May you provide some explanation as to this feature's intended use case? 
Because it seems pointless otherwise without some kind of scenario 
that's both valuable to support and impossible to do so without this 
feature...
  
Fangrui Song Dec. 6, 2023, 3:45 a.m. UTC | #2
On Tue, Dec 5, 2023 at 7:28 PM WANG Xuerui <i.swmail@xen0n.name> wrote:
>
> Hi,
>
> On 12/6/23 11:17, mengqinggang wrote:
> > Support register and symbol names enclosed in double quotation marks.
> > ---
> >   .../gas/loongarch/double_quotation_marks.d    | 11 +++++
> >   .../gas/loongarch/double_quotation_marks.s    |  2 +
> >   opcodes/loongarch-coder.c                     | 41 ++++++++++++++++---
> >   3 files changed, 48 insertions(+), 6 deletions(-)
> >   create mode 100644 gas/testsuite/gas/loongarch/double_quotation_marks.d
> >   create mode 100644 gas/testsuite/gas/loongarch/double_quotation_marks.s
> >
> > [snip]
> >
> > diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.s b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > new file mode 100644
> > index 00000000000..bb8acb99a40
> > --- /dev/null
> > +++ b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > @@ -0,0 +1,2 @@
> > +# Before only support beq $t0, $t1, .L1
> > +beq "$t0", "$t1", ".L1"
> May you provide some explanation as to this feature's intended use case?

Agree

> Because it seems pointless otherwise without some kind of scenario
> that's both valuable to support and impossible to do so without this
> feature...

I think it will be consistent (with other architectures) to support
quoted symbols (".L1"), but it would be odd to support quoted
registers.
  
Xi Ruoyao Dec. 6, 2023, 4:55 a.m. UTC | #3
On Tue, 2023-12-05 at 19:45 -0800, Fangrui Song wrote:
> > > diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.s b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > new file mode 100644
> > > index 00000000000..bb8acb99a40
> > > --- /dev/null
> > > +++ b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > @@ -0,0 +1,2 @@
> > > +# Before only support beq $t0, $t1, .L1
> > > +beq "$t0", "$t1", ".L1"
> > May you provide some explanation as to this feature's intended use case?
> 
> Agree
> 
> > Because it seems pointless otherwise without some kind of scenario
> > that's both valuable to support and impossible to do so without this
> > feature...
> 
> I think it will be consistent (with other architectures) to support
> quoted symbols (".L1"), but it would be odd to support quoted
> registers.

Just curiously: why did the other architectures have to support quoted
symbols anyway?
  
mengqinggang Dec. 6, 2023, 6:07 a.m. UTC | #4
This is a question discovered during the porting of RSEQ system call, 
and other architectures support this feature.


在 2023/12/6 上午11:28, WANG Xuerui 写道:
> Hi,
>
> On 12/6/23 11:17, mengqinggang wrote:
>> Support register and symbol names enclosed in double quotation marks.
>> ---
>>   .../gas/loongarch/double_quotation_marks.d    | 11 +++++
>>   .../gas/loongarch/double_quotation_marks.s    |  2 +
>>   opcodes/loongarch-coder.c                     | 41 ++++++++++++++++---
>>   3 files changed, 48 insertions(+), 6 deletions(-)
>>   create mode 100644 
>> gas/testsuite/gas/loongarch/double_quotation_marks.d
>>   create mode 100644 
>> gas/testsuite/gas/loongarch/double_quotation_marks.s
>>
>> [snip]
>>
>> diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.s 
>> b/gas/testsuite/gas/loongarch/double_quotation_marks.s
>> new file mode 100644
>> index 00000000000..bb8acb99a40
>> --- /dev/null
>> +++ b/gas/testsuite/gas/loongarch/double_quotation_marks.s
>> @@ -0,0 +1,2 @@
>> +# Before only support beq $t0, $t1, .L1
>> +beq "$t0", "$t1", ".L1"
> May you provide some explanation as to this feature's intended use 
> case? Because it seems pointless otherwise without some kind of 
> scenario that's both valuable to support and impossible to do so 
> without this feature...
  
Fangrui Song Dec. 6, 2023, 6:14 a.m. UTC | #5
On Tue, Dec 5, 2023 at 8:55 PM Xi Ruoyao <xry111@xry111.site> wrote:
>
> On Tue, 2023-12-05 at 19:45 -0800, Fangrui Song wrote:
> > > > diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.s b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > > new file mode 100644
> > > > index 00000000000..bb8acb99a40
> > > > --- /dev/null
> > > > +++ b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > > @@ -0,0 +1,2 @@
> > > > +# Before only support beq $t0, $t1, .L1
> > > > +beq "$t0", "$t1", ".L1"
> > > May you provide some explanation as to this feature's intended use case?
> >
> > Agree
> >
> > > Because it seems pointless otherwise without some kind of scenario
> > > that's both valuable to support and impossible to do so without this
> > > feature...
> >
> > I think it will be consistent (with other architectures) to support
> > quoted symbols (".L1"), but it would be odd to support quoted
> > registers.
>
> Just curiously: why did the other architectures have to support quoted
> symbols anyway?

gas allows you to define a symbol whose name contains a special character, e.g.

"a b":
call "a b"

I am not familiar with gas code, but if the patch needs to parse
quotes manually, it feels like a lack of needed abstraction.
I am expecting some function like parseIdentifier to abstract out the logic.
  
Xi Ruoyao Dec. 6, 2023, 6:16 a.m. UTC | #6
On Wed, 2023-12-06 at 14:07 +0800, mengqinggang wrote:
> This is a question discovered during the porting of RSEQ system call, 
> and other architectures support this feature.

The other architectures support quoted symbols, but not quoted register
names.

On x86_64 if we write

   movq $1, "%rax"

It will be explained as "saving 1 into a symbol named %rax", instead of
"loading 1 into the register rax".  I. e. once it's quoted, it will be
treated as a symbol name instead of register name.

> 在 2023/12/6 上午11:28, WANG Xuerui 写道:
> > Hi,
> > 
> > On 12/6/23 11:17, mengqinggang wrote:
> > > Support register and symbol names enclosed in double quotation
> > > marks.
> > > ---
> > >   .../gas/loongarch/double_quotation_marks.d    | 11 +++++
> > >   .../gas/loongarch/double_quotation_marks.s    |  2 +
> > >   opcodes/loongarch-coder.c                     | 41
> > > ++++++++++++++++---
> > >   3 files changed, 48 insertions(+), 6 deletions(-)
> > >   create mode 100644 
> > > gas/testsuite/gas/loongarch/double_quotation_marks.d
> > >   create mode 100644 
> > > gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > 
> > > [snip]
> > > 
> > > diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.s 
> > > b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > new file mode 100644
> > > index 00000000000..bb8acb99a40
> > > --- /dev/null
> > > +++ b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > @@ -0,0 +1,2 @@
> > > +# Before only support beq $t0, $t1, .L1
> > > +beq "$t0", "$t1", ".L1"
> > May you provide some explanation as to this feature's intended use 
> > case? Because it seems pointless otherwise without some kind of 
> > scenario that's both valuable to support and impossible to do so 
> > without this feature...
>
  
Xi Ruoyao Dec. 6, 2023, 6:22 a.m. UTC | #7
On Wed, 2023-12-06 at 14:16 +0800, Xi Ruoyao wrote:
> On Wed, 2023-12-06 at 14:07 +0800, mengqinggang wrote:
> > This is a question discovered during the porting of RSEQ system call, 
> > and other architectures support this feature.
> 
> The other architectures support quoted symbols, but not quoted register
> names.
> 
> On x86_64 if we write
> 
>    movq $1, "%rax"
> 
> It will be explained as "saving 1 into a symbol named %rax", instead of
> "loading 1 into the register rax".  I. e. once it's quoted, it will be
> treated as a symbol name instead of register name.

So we should just reject

    beq "$t0", $t1, .L1

because to be consistent with other architectures, this instruction
should be explained as "if a symbol named $t0 contains the same value as
the register $t1, then jump".  But our beq instruction does not support
a symbol name as the first operand.

OTOH

    beq $t0, $t1, ".L1"

should be accepted.
  
mengqinggang Dec. 6, 2023, 6:27 a.m. UTC | #8
You are right, I will change it to only symbols that can be enclosed in 
double quotation marks.


在 2023/12/6 下午2:22, Xi Ruoyao 写道:
> On Wed, 2023-12-06 at 14:16 +0800, Xi Ruoyao wrote:
>> On Wed, 2023-12-06 at 14:07 +0800, mengqinggang wrote:
>>> This is a question discovered during the porting of RSEQ system call,
>>> and other architectures support this feature.
>> The other architectures support quoted symbols, but not quoted register
>> names.
>>
>> On x86_64 if we write
>>
>>     movq $1, "%rax"
>>
>> It will be explained as "saving 1 into a symbol named %rax", instead of
>> "loading 1 into the register rax".  I. e. once it's quoted, it will be
>> treated as a symbol name instead of register name.
> So we should just reject
>
>      beq "$t0", $t1, .L1
>
> because to be consistent with other architectures, this instruction
> should be explained as "if a symbol named $t0 contains the same value as
> the register $t1, then jump".  But our beq instruction does not support
> a symbol name as the first operand.
>
> OTOH
>
>      beq $t0, $t1, ".L1"
>
> should be accepted.
>
  
Xi Ruoyao Dec. 6, 2023, 6:35 a.m. UTC | #9
On Tue, 2023-12-05 at 22:14 -0800, Fangrui Song wrote:
> On Tue, Dec 5, 2023 at 8:55 PM Xi Ruoyao <xry111@xry111.site> wrote:
> > 
> > On Tue, 2023-12-05 at 19:45 -0800, Fangrui Song wrote:
> > > > > diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.s b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > > > new file mode 100644
> > > > > index 00000000000..bb8acb99a40
> > > > > --- /dev/null
> > > > > +++ b/gas/testsuite/gas/loongarch/double_quotation_marks.s
> > > > > @@ -0,0 +1,2 @@
> > > > > +# Before only support beq $t0, $t1, .L1
> > > > > +beq "$t0", "$t1", ".L1"
> > > > May you provide some explanation as to this feature's intended use case?
> > > 
> > > Agree
> > > 
> > > > Because it seems pointless otherwise without some kind of scenario
> > > > that's both valuable to support and impossible to do so without this
> > > > feature...
> > > 
> > > I think it will be consistent (with other architectures) to support
> > > quoted symbols (".L1"), but it would be odd to support quoted
> > > registers.
> > 
> > Just curiously: why did the other architectures have to support quoted
> > symbols anyway?
> 
> gas allows you to define a symbol whose name contains a special character, e.g.
> 
> "a b":
> call "a b"
> 
> I am not familiar with gas code, but if the patch needs to parse
> quotes manually, it feels like a lack of needed abstraction.
> I am expecting some function like parseIdentifier to abstract out the logic.

And there are some caveats handling the quoted strings:  on x86_64

    call "\""

is explained as "call a function named <double-quote-mark>".  I'm not
sure if we want to support the same thing...
  

Patch

diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.d b/gas/testsuite/gas/loongarch/double_quotation_marks.d
new file mode 100644
index 00000000000..99c6d343e4c
--- /dev/null
+++ b/gas/testsuite/gas/loongarch/double_quotation_marks.d
@@ -0,0 +1,11 @@ 
+#as:
+#objdump: -dr
+
+.*:[    ]+file format .*
+
+
+Disassembly of section .text:
+
+.* <.text>:
+[ 	]+0:[ 	]+5800018d[ 	]+beq[ 	]+\$t0, \$t1, 0[ 	]+# 0x0
+[ 	]+0: R_LARCH_B16[ 	]+.L1
diff --git a/gas/testsuite/gas/loongarch/double_quotation_marks.s b/gas/testsuite/gas/loongarch/double_quotation_marks.s
new file mode 100644
index 00000000000..bb8acb99a40
--- /dev/null
+++ b/gas/testsuite/gas/loongarch/double_quotation_marks.s
@@ -0,0 +1,2 @@ 
+# Before only support beq $t0, $t1, .L1
+beq "$t0", "$t1", ".L1"
diff --git a/opcodes/loongarch-coder.c b/opcodes/loongarch-coder.c
index a68ae1c3106..198de81aa25 100644
--- a/opcodes/loongarch-coder.c
+++ b/opcodes/loongarch-coder.c
@@ -254,16 +254,45 @@  loongarch_split_args_by_comma (char *args, const char *arg_strs[])
 {
   size_t num = 0;
 
+  /* Supporting <b ".L1"> or <beq "r1", "r2", ".L1">, ignore the first '"'.
+     Mismatched '"' is judged by common code and the white characters also
+     removed by common code. The code like <b ".L1"> or <b "r1","r2",".L1">.  */
+  if ('"' == *args)
+    args++;
+
   if (*args)
     arg_strs[num++] = args;
   for (; *args; args++)
-    if (*args == ',')
-      {
-	if (MAX_ARG_NUM_PLUS_2 - 1 == num)
+    {
+      /* Supporting <b ".L1"> or <beq "r1","r2",".L1>",
+	 ignore the last '"'.  */
+      if (('"' == *args) && ('\0' == *(args+1)))
+	{
+	  *args = '\0';
 	  break;
-	else
-	  *args = '\0', arg_strs[num++] = args + 1;
-      }
+	}
+
+      /* Supporting <b ".L1"> or <beq "r1","r2",".L1>",
+	 ignore the '"' before ','.  */
+      if (('"' == *args) && (',' == *(args+1)))
+	*args = '\0';
+
+      if (*args == ',')
+	{
+	  if (MAX_ARG_NUM_PLUS_2 - 1 == num)
+	    break;
+	  else
+	    {
+	      *args = '\0';
+	      /* Supporting <b ".L1"> or <beq "r1","r2",".L1">,
+		 ignore the '"' after ','.  */
+	      if('"' == *(args+1))
+		arg_strs[num++] = args + 2;
+	      else
+		arg_strs[num++] = args + 1;
+	    }
+	}
+    }
   arg_strs[num] = NULL;
   return num;
 }