[RFC] VMCI: Silence memcpy() run-time false positive warning

Message ID 20240101130828.3666251-1-harshit.m.mogalapalli@oracle.com
State New

Commit Message

Harshit Mogalapalli Jan. 1, 2024, 1:08 p.m. UTC
Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.

memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)

WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237
dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237

Some code commentary, based on my understanding:

#define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size)
/* This is 24 + payload_size */

memcpy(&dg_info->msg, dg, dg_size);
	Destination = &dg_info->msg --> a 24-byte structure (struct vmci_datagram)
	Source = dg --> points to a 24-byte structure (struct vmci_datagram)
	Size = dg_size = 24 + payload_size

payload_size = 56 - 24 = 32, i.e. Syzkaller managed to set payload_size to 32.

struct delayed_datagram_info {
        struct datagram_entry *entry;
        struct work_struct work;
        bool in_dg_host_queue;
        /* msg and msg_payload must be together. */
        struct vmci_datagram msg;
        u8 msg_payload[];
};

The extra payload bytes land in msg_payload[], which immediately follows
msg, so no memory is corrupted, but the fortified memcpy() emits a
run-time warning while fuzzing with Syzkaller.

One way to silence the warning is to split the memcpy() in two: the
first call copies the msg header and the second copies the payload.

Reported-by: syzkaller <syzkaller@googlegroups.com>
Suggested-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
---
This patch has only been tested with the C reproducer; no
driver-specific testing has been done.
---
 drivers/misc/vmw_vmci/vmci_datagram.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
  

Comments

Greg KH Jan. 1, 2024, 1:55 p.m. UTC | #1
On Mon, Jan 01, 2024 at 05:08:28AM -0800, Harshit Mogalapalli wrote:
> Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
> 
> memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
> at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)
> 
> WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237
> dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237
> 
> Some code commentary, based on my understanding:
> 
> 544 #define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size)
> /// This is 24 + payload_size
> 
> memcpy(&dg_info->msg, dg, dg_size);
> 	Destination = dg_info->msg ---> this is a 24 byte
> 					structure(struct vmci_datagram)
> 	Source = dg --> this is a 24 byte structure (struct vmci_datagram)
> 	Size = dg_size = 24 + payload_size
> 
> 
> {payload_size = 56-24 =32} -- Syzkaller managed to set payload_size to 32.
> 
>  35 struct delayed_datagram_info {
>  36         struct datagram_entry *entry;
>  37         struct work_struct work;
>  38         bool in_dg_host_queue;
>  39         /* msg and msg_payload must be together. */
>  40         struct vmci_datagram msg;
>  41         u8 msg_payload[];
>  42 };
> 
> So those extra bytes of payload are copied into msg_payload[], so there
> is no bug, but a run time warning is seen while fuzzing with Syzkaller.
> 
> One possible way to silence the warning is to split the memcpy() into
> two parts -- one -- copying the msg and second taking care of payload.

And what are the performance impacts of this?

thanks,

greg k-h
  
Gustavo A. R. Silva Jan. 1, 2024, 5:43 p.m. UTC | #2
On 1/1/24 07:08, Harshit Mogalapalli wrote:
> Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
> 
> memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
> at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)

This is not a 'false positive warning.' This is a legitimate warning
coming from the fortified memcpy().

Under FORTIFY_SOURCE we should not copy data across multiple members
in a structure. For that we have alternatives like struct_group(), or as
in this case, splitting memcpy(), or as I suggest below, a mix of
direct assignment and memcpy().


> 
> WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237
> dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237
> 
> Some code commentary, based on my understanding:
> 
> 544 #define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size)
> /// This is 24 + payload_size
> 
> memcpy(&dg_info->msg, dg, dg_size);
> 	Destination = dg_info->msg ---> this is a 24 byte
> 					structure(struct vmci_datagram)
> 	Source = dg --> this is a 24 byte structure (struct vmci_datagram)
> 	Size = dg_size = 24 + payload_size
> 
> 
> {payload_size = 56-24 =32} -- Syzkaller managed to set payload_size to 32.
> 
>   35 struct delayed_datagram_info {
>   36         struct datagram_entry *entry;
>   37         struct work_struct work;
>   38         bool in_dg_host_queue;
>   39         /* msg and msg_payload must be together. */
>   40         struct vmci_datagram msg;
>   41         u8 msg_payload[];
>   42 };
> 
> So those extra bytes of payload are copied into msg_payload[], so there
> is no bug, but a run time warning is seen while fuzzing with Syzkaller.
> 
> One possible way to silence the warning is to split the memcpy() into
> two parts -- one -- copying the msg and second taking care of payload.
> 
> Reported-by: syzkaller <syzkaller@googlegroups.com>
> Suggested-by: Vegard Nossum <vegard.nossum@oracle.com>
> Signed-off-by: Harshit Mogalapalli <harshit.m.mogalapalli@oracle.com>
> ---
> This patch is only tested with the C reproducer, not any testing
> specific to driver is done.
> ---
>   drivers/misc/vmw_vmci/vmci_datagram.c | 4 +++-
>   1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/misc/vmw_vmci/vmci_datagram.c b/drivers/misc/vmw_vmci/vmci_datagram.c
> index f50d22882476..b43661590f56 100644
> --- a/drivers/misc/vmw_vmci/vmci_datagram.c
> +++ b/drivers/misc/vmw_vmci/vmci_datagram.c
> @@ -216,6 +216,7 @@ static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
>   		if (dst_entry->run_delayed ||
>   		    dg->src.context == VMCI_HOST_CONTEXT_ID) {
>   			struct delayed_datagram_info *dg_info;
> +			size_t payload_size = dg_size - VMCI_DG_HEADERSIZE;

This seems to be the same as `dg->payload_size`, so I don't think a new
variable is necessary.

>   
>   			if (atomic_add_return(1, &delayed_dg_host_queue_size)
>   			    == VMCI_MAX_DELAYED_DG_HOST_QUEUE_SIZE) {
> @@ -234,7 +235,8 @@ static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
>   
>   			dg_info->in_dg_host_queue = true;
>   			dg_info->entry = dst_entry;
> -			memcpy(&dg_info->msg, dg, dg_size);
> +			memcpy(&dg_info->msg, dg, VMCI_DG_HEADERSIZE);
> +			memcpy(&dg_info->msg_payload, dg + 1, payload_size);

I think a direct assignment and a call to memcpy() is better in this case,
something like this:

dg_info->msg = *dg;
memcpy(&dg_info->msg_payload, dg + 1, dg->payload_size);

However, that `dg + 1` thing is making my eyes twitch. Where exactly are we
making sure that `dg` actually points to an area in memory bigger than
`sizeof(*dg)`?...
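
As a rough illustration of the assignment-plus-memcpy() approach, here is a minimal user-space sketch. The types `dg_hdr` and `dg_info` and the helpers `dg_alloc()` and `dg_info_dup()` are hypothetical stand-ins, not the VMCI definitions; the only property carried over from the thread is a 24-byte header followed by a flexible payload array. It assumes the caller has already guaranteed that `payload_size` bytes follow the header.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct vmci_datagram: a 24-byte header. */
struct dg_hdr {
	uint64_t dst;
	uint64_t src;
	uint64_t payload_size;
};

/* Hypothetical stand-in for struct delayed_datagram_info. */
struct dg_info {
	int in_queue;
	struct dg_hdr msg;
	uint8_t msg_payload[];	/* must directly follow msg */
};

/* Build a header followed by len payload bytes in a single allocation. */
static struct dg_hdr *dg_alloc(const void *payload, size_t len)
{
	struct dg_hdr *dg = malloc(sizeof(*dg) + len);

	if (!dg)
		return NULL;
	dg->dst = 1;
	dg->src = 2;
	dg->payload_size = len;
	memcpy(dg + 1, payload, len);	/* payload starts right after the header */
	return dg;
}

/* Duplicate header and payload without any write that spans two members. */
static struct dg_info *dg_info_dup(const struct dg_hdr *dg)
{
	struct dg_info *info;

	assert(dg != NULL);
	info = malloc(sizeof(*info) + dg->payload_size);
	if (!info)
		return NULL;
	info->in_queue = 1;
	info->msg = *dg;	/* header: plain struct assignment */
	memcpy(info->msg_payload, dg + 1, dg->payload_size);	/* payload only */
	return info;
}
```

Each write now targets exactly one member, so a per-member bounds check has nothing to object to, while the bytes that end up in the destination are the same as with the single spanning copy.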

We could also use struct_size() during allocation, some lines above:

-                       dg_info = kmalloc(sizeof(*dg_info) +
-                                   (size_t) dg->payload_size, GFP_ATOMIC);
+                       dg_info = kmalloc(struct_size(dg_info, msg_payload, dg->payload_size),
+                                         GFP_ATOMIC);
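
For readers unfamiliar with struct_size(): it computes the header size plus `count` trailing elements and saturates at SIZE_MAX on overflow, so an oversized request makes the allocation fail instead of silently wrapping. The sketch below is a user-space approximation of that idea only; `flex_size()` is a hypothetical helper, not the kernel macro, and it relies on the GCC/Clang overflow builtins.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space analogue of struct_size(): returns
 * base + elem * count, or SIZE_MAX if either step would overflow.
 */
static size_t flex_size(size_t base, size_t elem, size_t count)
{
	size_t bytes, total;

	if (__builtin_mul_overflow(elem, count, &bytes) ||
	    __builtin_add_overflow(base, bytes, &total))
		return SIZE_MAX;
	return total;
}
```

Compared with an open-coded `sizeof(*p) + n`, the saturating result cannot wrap around to a small value that would under-allocate the buffer.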

--
Gustavo

>   
>   			INIT_WORK(&dg_info->work, dg_delayed_dispatch);
>   			schedule_work(&dg_info->work);
  
Harshit Mogalapalli Jan. 2, 2024, 6:34 p.m. UTC | #3
Hi Greg,

On 01/01/24 7:25 pm, Greg Kroah-Hartman wrote:
> On Mon, Jan 01, 2024 at 05:08:28AM -0800, Harshit Mogalapalli wrote:
>> Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
>>
>> memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
>> at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)
>>
>> WARNING: CPU: 0 PID: 1555 at drivers/misc/vmw_vmci/vmci_datagram.c:237
>> dg_dispatch_as_host+0x88e/0xa60 drivers/misc/vmw_vmci/vmci_datagram.c:237
>>
>> Some code commentary, based on my understanding:
>>
>> 544 #define VMCI_DG_SIZE(_dg) (VMCI_DG_HEADERSIZE + (size_t)(_dg)->payload_size)
>> /// This is 24 + payload_size
>>
>> memcpy(&dg_info->msg, dg, dg_size);
>> 	Destination = dg_info->msg ---> this is a 24 byte
>> 					structure(struct vmci_datagram)
>> 	Source = dg --> this is a 24 byte structure (struct vmci_datagram)
>> 	Size = dg_size = 24 + payload_size
>>
>>
>> {payload_size = 56-24 =32} -- Syzkaller managed to set payload_size to 32.
>>
>>   35 struct delayed_datagram_info {
>>   36         struct datagram_entry *entry;
>>   37         struct work_struct work;
>>   38         bool in_dg_host_queue;
>>   39         /* msg and msg_payload must be together. */
>>   40         struct vmci_datagram msg;
>>   41         u8 msg_payload[];
>>   42 };
>>
>> So those extra bytes of payload are copied into msg_payload[], so there
>> is no bug, but a run time warning is seen while fuzzing with Syzkaller.
>>
>> One possible way to silence the warning is to split the memcpy() into
>> two parts -- one -- copying the msg and second taking care of payload.
> 
> And what are the performance impacts of this?
> 

I haven't done any performance tests on this.

I tried to look at the diff in the assembly code but couldn't draw any
conclusion about performance from it. Also, Gustavo suggested replacing
the two memcpy() calls with a direct assignment for the header and a
single memcpy() for the payload.

Is there a way to do perf analysis based on code without access to hardware?

Thanks,
Harshit

> thanks,
> 
> greg k-h
  
Harshit Mogalapalli Jan. 2, 2024, 6:37 p.m. UTC | #4
Hi Gustavo,

On 01/01/24 11:13 pm, Gustavo A. R. Silva wrote:
> 
> 
> On 1/1/24 07:08, Harshit Mogalapalli wrote:
>> Syzkaller hit 'WARNING in dg_dispatch_as_host' bug.
>>
>> memcpy: detected field-spanning write (size 56) of single field "&dg_info->msg"
>> at drivers/misc/vmw_vmci/vmci_datagram.c:237 (size 24)
> 
> This is not a 'false positive warning.' This is a legitimate warning
> coming from the fortified memcpy().
> 
> Under FORTIFY_SOURCE we should not copy data across multiple members
> in a structure. For that we have alternatives like struct_group(), or as
> in this case, splitting memcpy(), or as I suggest below, a mix of
> direct assignment and memcpy().
> 

Thanks for sharing this.
> 
>>
>> struct vmci_datagram *dg)
>>           if (dst_entry->run_delayed ||
>>               dg->src.context == VMCI_HOST_CONTEXT_ID) {
>>               struct delayed_datagram_info *dg_info;
>> +            size_t payload_size = dg_size - VMCI_DG_HEADERSIZE;
> 
> This seems to be the same as `dg->payload_size`, so I don't think a new
> variable is necessary.
> 

Oh right, this is unnecessary. I will remove it.

>>               if (atomic_add_return(1, &delayed_dg_host_queue_size)
>>                   == VMCI_MAX_DELAYED_DG_HOST_QUEUE_SIZE) {
>> @@ -234,7 +235,8 @@ static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
>>               dg_info->in_dg_host_queue = true;
>>               dg_info->entry = dst_entry;
>> -            memcpy(&dg_info->msg, dg, dg_size);
>> +            memcpy(&dg_info->msg, dg, VMCI_DG_HEADERSIZE);
>> +            memcpy(&dg_info->msg_payload, dg + 1, payload_size);
> 
> I think a direct assignment and a call to memcpy() is better in this case,
> something like this:
> 
> dg_info->msg = *dg;
> memcpy(&dg_info->msg_payload, dg + 1, dg->payload_size);
> 
> However, that `dg + 1` thing is making my eyes twitch. Where exactly are we
> making sure that `dg` actually points to an area in memory bigger than
> `sizeof(*dg)`?...
>

Going up on the call tree:

-> vmci_transport_dgram_enqueue()
--> vmci_datagram_send()
---> vmci_datagram_dispatch()
----> dg_dispatch_as_host()

static int vmci_transport_dgram_enqueue(
        struct vsock_sock *vsk,
        struct sockaddr_vm *remote_addr,
        struct msghdr *msg,
        size_t len)
{
        int err;
        struct vmci_datagram *dg;

        if (len > VMCI_MAX_DG_PAYLOAD_SIZE)
                return -EMSGSIZE;

        if (!vmci_transport_allow_dgram(vsk, remote_addr->svm_cid))
                return -EPERM;

        /* Allocate a buffer for the user's message and our packet header. */
        dg = kmalloc(len + sizeof(*dg), GFP_KERNEL);
        if (!dg)
                return -ENOMEM;

^^^ dg = kmalloc(len + sizeof(*dg), GFP_KERNEL);
From this I think we can say that, on this path, the memory allocated
for dg is sizeof(*dg) + len bytes, so the payload sits right after the
header.
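
One way to make the `dg + 1` assumption explicit is to validate the payload length against the size of the buffer that holds the datagram before dereferencing past the header. The sketch below is user-space C with hypothetical names (`dg_hdr`, `dg_payload_checked()`); it assumes the caller knows the total buffer length, which is not something the quoted kernel code passes around.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 24-byte header, standing in for struct vmci_datagram. */
struct dg_hdr {
	uint64_t dst;
	uint64_t src;
	uint64_t payload_size;
};

/*
 * Return a pointer to the payload that follows the header, or NULL if
 * a buffer of buf_len bytes cannot hold the header plus payload_size bytes.
 */
static const void *dg_payload_checked(const struct dg_hdr *dg, size_t buf_len)
{
	if (buf_len < sizeof(*dg))
		return NULL;
	/* Subtract first so the comparison itself cannot overflow. */
	if (dg->payload_size > buf_len - sizeof(*dg))
		return NULL;
	return dg + 1;	/* first byte after the header */
}
```

With a check like this in place, `dg + 1` is only ever used when the bytes behind it are known to belong to the same buffer.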


> We could also use struct_size() during allocation, some lines above:
> 
> -                       dg_info = kmalloc(sizeof(*dg_info) +
> -                                   (size_t) dg->payload_size, GFP_ATOMIC);
> +                       dg_info = kmalloc(struct_size(dg_info, msg_payload, dg->payload_size),
> +                                         GFP_ATOMIC);
> 
Thanks again for the suggestion.

I still haven't worked out how performance compares before and after
the patch. Once I have some reasoning, I will include the above changes
and send a V2.

Thanks,
Harshit
> -- 
> Gustavo
> 
>>               INIT_WORK(&dg_info->work, dg_delayed_dispatch);
>>               schedule_work(&dg_info->work);
  
Vegard Nossum Jan. 4, 2024, 6:31 p.m. UTC | #5
On 01/01/2024 14:55, Greg Kroah-Hartman wrote:
> On Mon, Jan 01, 2024 at 05:08:28AM -0800, Harshit Mogalapalli wrote:
>> One possible way to silence the warning is to split the memcpy() into
>> two parts -- one -- copying the msg and second taking care of payload.
> 
> And what are the performance impacts of this?

I did a disassembly diff for the version of the patch that uses
dg->payload_size directly in the second memcpy and I get this as the
only change:

@@ -419,11 +419,16 @@
         mov    %rax,%rbx
         test   %rax,%rax
         je
+       mov    0x0(%rbp),%rdx
         mov    %r14,(%rax)
-       mov    %r13,%rdx
-       mov    %rbp,%rsi
-       lea    0x30(%rax),%rdi
+       lea    0x18(%rbp),%rsi
+       lea    0x48(%rax),%rdi
         movb   $0x1,0x28(%rax)
+       mov    %rdx,0x30(%rax)
+       mov    0x8(%rbp),%rdx
+       mov    %rdx,0x38(%rax)
+       mov    0x10(%rbp),%rdx
+       mov    %rdx,0x40(%rax)
         call
         mov    0x0(%rip),%rsi        #
         lea    0x8(%rbx),%rdx

Basically, I believe it's inlining the first constant-size memcpy and
keeping the second one as a call.

Overall, the number of memory accesses should be the same.

The biggest impact that I can see is therefore the code size (which
isn't much).

There is also a kmalloc() on the same code path that I assume would
dwarf any performance impact from this patch -- but happy to be corrected.


Vegard
  
Gustavo A. R. Silva Jan. 4, 2024, 7:02 p.m. UTC | #6
On 1/4/24 12:31, Vegard Nossum wrote:
> 
> On 01/01/2024 14:55, Greg Kroah-Hartman wrote:
>> On Mon, Jan 01, 2024 at 05:08:28AM -0800, Harshit Mogalapalli wrote:
>>> One possible way to silence the warning is to split the memcpy() into
>>> two parts -- one -- copying the msg and second taking care of payload.
>>
>> And what are the performance impacts of this?
> 
> I did a disassembly diff for the version of the patch that uses
> dg->payload_size directly in the second memcpy and I get this as the
> only change:
> 
> @@ -419,11 +419,16 @@
>          mov    %rax,%rbx
>          test   %rax,%rax
>          je
> +       mov    0x0(%rbp),%rdx
>          mov    %r14,(%rax)
> -       mov    %r13,%rdx
> -       mov    %rbp,%rsi
> -       lea    0x30(%rax),%rdi
> +       lea    0x18(%rbp),%rsi
> +       lea    0x48(%rax),%rdi
>          movb   $0x1,0x28(%rax)
> +       mov    %rdx,0x30(%rax)
> +       mov    0x8(%rbp),%rdx
> +       mov    %rdx,0x38(%rax)
> +       mov    0x10(%rbp),%rdx
> +       mov    %rdx,0x40(%rax)
>          call
>          mov    0x0(%rip),%rsi        #
>          lea    0x8(%rbx),%rdx
> 
> Basically, I believe it's inlining the first constant-size memcpy and
> keeping the second one as a call.
> 
> Overall, the number of memory accesses should be the same.
> 
> The biggest impact that I can see is therefore the code size (which
> isn't much).

Yep, I don't think this is a problem.

I look forward to reviewing v2 of this patch.

Thanks
--
Gustavo

> 
> There is also a kmalloc() on the same code path that I assume would
> dwarf any performance impact from this patch -- but happy to be corrected.
> 
> 
> Vegard
>
  

Patch

diff --git a/drivers/misc/vmw_vmci/vmci_datagram.c b/drivers/misc/vmw_vmci/vmci_datagram.c
index f50d22882476..b43661590f56 100644
--- a/drivers/misc/vmw_vmci/vmci_datagram.c
+++ b/drivers/misc/vmw_vmci/vmci_datagram.c
@@ -216,6 +216,7 @@  static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
 		if (dst_entry->run_delayed ||
 		    dg->src.context == VMCI_HOST_CONTEXT_ID) {
 			struct delayed_datagram_info *dg_info;
+			size_t payload_size = dg_size - VMCI_DG_HEADERSIZE;
 
 			if (atomic_add_return(1, &delayed_dg_host_queue_size)
 			    == VMCI_MAX_DELAYED_DG_HOST_QUEUE_SIZE) {
@@ -234,7 +235,8 @@  static int dg_dispatch_as_host(u32 context_id, struct vmci_datagram *dg)
 
 			dg_info->in_dg_host_queue = true;
 			dg_info->entry = dst_entry;
-			memcpy(&dg_info->msg, dg, dg_size);
+			memcpy(&dg_info->msg, dg, VMCI_DG_HEADERSIZE);
+			memcpy(&dg_info->msg_payload, dg + 1, payload_size);
 
 			INIT_WORK(&dg_info->work, dg_delayed_dispatch);
 			schedule_work(&dg_info->work);