ARM: memset: cast the constant byte to unsigned char

Message ID 20230517181353.381073-1-kursad.oney@broadcom.com
State New

Commit Message

Kursad Oney May 17, 2023, 6:13 p.m. UTC
The memset() description in ISO/IEC 9899:1999 (and elsewhere) says:

	The memset function copies the value of c (converted to an
	unsigned char) into each of the first n characters of the
	object pointed to by s.

The kernel's arm32 memset does not cast c to unsigned char. As a result,
the following code produces erroneous output:

	char a[128];
	memset(a, -128, sizeof(a));

This is because gcc will generally emit the following code before
it calls memset():

	mov   r0, r7
	mvn   r1, #127        ; 0x7f
	bl    00000000 <memset>

r1 holds 0xffffff80 when memset() uses it, so the 'a' array ends up
with -128 in only one of every four bytes, while the other bytes are
incorrectly set to -1. Printing the first 8 bytes shows:

	test_module: -128 -1 -1 -1
	test_module: -1 -1 -1 -128

The change here is to 'and' r1 with 255 before it is used.
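
In C terms, the added instruction corresponds to the conversion the
standard asks for. A minimal sketch of that behaviour as a portable
byte-wise routine follows; the name ref_memset is hypothetical and is
not kernel code, it only illustrates the conversion of c:

```c
#include <stddef.h>

/*
 * Hypothetical reference routine, not kernel code: fills the first n
 * bytes of s with c converted to unsigned char.
 */
void *ref_memset(void *s, int c, size_t n)
{
	unsigned char *p = s;
	unsigned char byte = (unsigned char)c;	/* -128 becomes 0x80 */

	while (n--)
		*p++ = byte;
	return s;
}
```

With this routine, ref_memset(a, -128, sizeof(a)) stores 0x80 in every
byte, which reads back as -128 where char is signed.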

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Kursad Oney <kursad.oney@broadcom.com>

---

 arch/arm/lib/memset.S | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Kursad Oney July 7, 2023, 5:45 p.m. UTC | #1
Hi Ard,

On Wed, May 17, 2023 at 2:14 PM Kursad Oney <kursad.oney@broadcom.com> wrote:
>
> memset() description in ISO/IEC 9899:1999 (and elsewhere) says:
>
>         The memset function copies the value of c (converted to an
>         unsigned char) into each of the first n characters of the
>         object pointed to by s.
>
> The kernel's arm32 memset does not cast c to unsigned char. This results
> in the following code to produce erroneous output:
>
>         char a[128];
>         memset(a, -128, sizeof(a));
>
> This is because gcc will generally emit the following code before
> it calls memset() :
>
>         mov   r0, r7
>         mvn   r1, #127        ; 0x7f
>         bl    00000000 <memset>
>
> r1 ends up with 0xffffff80 before being used by memset() and the
> 'a' array will have -128 once in every four bytes while the other
> bytes will be set incorrectly to -1 like this (printing the first
> 8 bytes) :
>
>         test_module: -128 -1 -1 -1
>         test_module: -1 -1 -1 -128
>
> The change here is to 'and' r1 with 255 before it is used.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Kursad Oney <kursad.oney@broadcom.com>
>
> ---
>
>  arch/arm/lib/memset.S | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S
> index d71ab61430b2..de75ae4d5ab4 100644
> --- a/arch/arm/lib/memset.S
> +++ b/arch/arm/lib/memset.S
> @@ -17,6 +17,7 @@ ENTRY(__memset)
>  ENTRY(mmioset)
>  WEAK(memset)
>  UNWIND( .fnstart         )
> +       and     r1, r1, #255            @ cast to unsigned char
>         ands    r3, r0, #3              @ 1 unaligned?
>         mov     ip, r0                  @ preserve r0 as return value
>         bne     6f                      @ 1
> --
> 2.37.3
>

I didn't get any reaction to this patch so I added you to see if you
could help review it or direct me to the right channel. Thank you!
kursad
  
Ard Biesheuvel Aug. 3, 2023, 1:59 p.m. UTC | #2
On Wed, 17 May 2023 at 20:14, Kursad Oney <kursad.oney@broadcom.com> wrote:
>
> memset() description in ISO/IEC 9899:1999 (and elsewhere) says:
>
>         The memset function copies the value of c (converted to an
>         unsigned char) into each of the first n characters of the
>         object pointed to by s.
>
> The kernel's arm32 memset does not cast c to unsigned char. This results
> in the following code to produce erroneous output:
>
>         char a[128];
>         memset(a, -128, sizeof(a));
>
> This is because gcc will generally emit the following code before
> it calls memset() :
>
>         mov   r0, r7
>         mvn   r1, #127        ; 0x7f
>         bl    00000000 <memset>
>
> r1 ends up with 0xffffff80 before being used by memset() and the
> 'a' array will have -128 once in every four bytes while the other
> bytes will be set incorrectly to -1 like this (printing the first
> 8 bytes) :
>
>         test_module: -128 -1 -1 -1
>         test_module: -1 -1 -1 -128
>
> The change here is to 'and' r1 with 255 before it is used.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Kursad Oney <kursad.oney@broadcom.com>
>
> ---
>
>  arch/arm/lib/memset.S | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S
> index d71ab61430b2..de75ae4d5ab4 100644
> --- a/arch/arm/lib/memset.S
> +++ b/arch/arm/lib/memset.S
> @@ -17,6 +17,7 @@ ENTRY(__memset)
>  ENTRY(mmioset)
>  WEAK(memset)
>  UNWIND( .fnstart         )
> +       and     r1, r1, #255            @ cast to unsigned char
>         ands    r3, r0, #3              @ 1 unaligned?
>         mov     ip, r0                  @ preserve r0 as return value
>         bne     6f                      @ 1

Yes, this is clearly a bug. The value in R1 is expanded to 32 bits like this:

1:      orr     r1, r1, r1, lsl #8
        orr     r1, r1, r1, lsl #16

which assumes that the upper bytes are 0x0, which they are not in this case.
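
The effect of those two instructions can be modelled in C. This is only
a sketch of the arithmetic: the function names expand_fill and
expand_fill_masked are hypothetical, and the real routine is the
assembly in arch/arm/lib/memset.S.

```c
#include <stdint.h>

/*
 * Models the two 'orr' instructions that replicate the fill byte
 * across a 32-bit word. r1 is the second argument as passed in.
 */
uint32_t expand_fill(uint32_t r1)
{
	r1 |= r1 << 8;		/* orr r1, r1, r1, lsl #8  */
	r1 |= r1 << 16;		/* orr r1, r1, r1, lsl #16 */
	return r1;
}

/* Same expansion with the mask from this patch applied first. */
uint32_t expand_fill_masked(uint32_t r1)
{
	r1 &= 255;		/* and r1, r1, #255 */
	return expand_fill(r1);
}
```

For r1 = 0xffffff80, expand_fill() returns 0xffffff80 (only one byte per
word is 0x80), while expand_fill_masked() returns 0x80808080.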



Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
  
Linus Walleij Aug. 10, 2023, 8:15 a.m. UTC | #3
On Wed, May 17, 2023 at 8:14 PM Kursad Oney <kursad.oney@broadcom.com> wrote:

> memset() description in ISO/IEC 9899:1999 (and elsewhere) says:
>
>         The memset function copies the value of c (converted to an
>         unsigned char) into each of the first n characters of the
>         object pointed to by s.
>
> The kernel's arm32 memset does not cast c to unsigned char. This results
> in the following code to produce erroneous output:
>
>         char a[128];
>         memset(a, -128, sizeof(a));
>
> This is because gcc will generally emit the following code before
> it calls memset() :
>
>         mov   r0, r7
>         mvn   r1, #127        ; 0x7f
>         bl    00000000 <memset>
>
> r1 ends up with 0xffffff80 before being used by memset() and the
> 'a' array will have -128 once in every four bytes while the other
> bytes will be set incorrectly to -1 like this (printing the first
> 8 bytes) :
>
>         test_module: -128 -1 -1 -1
>         test_module: -1 -1 -1 -128
>
> The change here is to 'and' r1 with 255 before it is used.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Kursad Oney <kursad.oney@broadcom.com>

Wow you found this old thing!
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Can you please put this into Russell's patch tracker?
https://www.arm.linux.org.uk/developer/

Yours,
Linus Walleij
  

Patch

diff --git a/arch/arm/lib/memset.S b/arch/arm/lib/memset.S
index d71ab61430b2..de75ae4d5ab4 100644
--- a/arch/arm/lib/memset.S
+++ b/arch/arm/lib/memset.S
@@ -17,6 +17,7 @@  ENTRY(__memset)
 ENTRY(mmioset)
 WEAK(memset)
 UNWIND( .fnstart         )
+	and	r1, r1, #255		@ cast to unsigned char
 	ands	r3, r0, #3		@ 1 unaligned?
 	mov	ip, r0			@ preserve r0 as return value
 	bne	6f			@ 1