[net-next] net: ethtool: Fix out-of-bounds copy to user

Message ID 20230601112839.13799-1-dinghui@sangfor.com.cn
State New
Headers
Series [net-next] net: ethtool: Fix out-of-bounds copy to user |

Commit Message

Ding Hui June 1, 2023, 11:28 a.m. UTC
  When we get statistics by ethtool during changing the number of NIC
channels greater, the utility may crash due to memory corruption.

The NIC drivers callback get_sset_count() could return a calculated
length depends on current number of channels (e.g. i40e, igb).

The ethtool allocates a user buffer with the first ioctl returned
length and invokes the second ioctl to get data. The kernel copies
data to the user buffer but without checking its length. If the length
returned by the second get_sset_count() is greater than the length
allocated by the user, it will lead to an out-of-bounds copy.

Fix it by restricting the copy length not exceed the buffer length
specified by userspace.

Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
---
 net/ethtool/ioctl.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)
  

Comments

Alexander Duyck June 1, 2023, 3:04 p.m. UTC | #1
On Thu, 2023-06-01 at 19:28 +0800, Ding Hui wrote:
> When we get statistics by ethtool during changing the number of NIC
> channels greater, the utility may crash due to memory corruption.
> 
> The NIC drivers callback get_sset_count() could return a calculated
> length depends on current number of channels (e.g. i40e, igb).
> 

The drivers shouldn't be changing that value. If the drivers are doing
this they should be fixed to provide a fixed length in terms of their
strings.

> The ethtool allocates a user buffer with the first ioctl returned
> length and invokes the second ioctl to get data. The kernel copies
> data to the user buffer but without checking its length. If the length
> returned by the second get_sset_count() is greater than the length
> allocated by the user, it will lead to an out-of-bounds copy.
> 
> Fix it by restricting the copy length not exceed the buffer length
> specified by userspace.
> 
> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>

Changing the copy size would not fix this. The problem is the driver
will be overwriting with the size that it thinks it should be using.
Reducing the value that is provided for the memory allocations will
cause the driver to corrupt memory.

> ---
>  net/ethtool/ioctl.c | 16 +++++++++-------
>  1 file changed, 9 insertions(+), 7 deletions(-)
> 
> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
> index 6bb778e10461..82a975a9c895 100644
> --- a/net/ethtool/ioctl.c
> +++ b/net/ethtool/ioctl.c
> @@ -1902,7 +1902,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
>  	if (copy_from_user(&test, useraddr, sizeof(test)))
>  		return -EFAULT;
>  
> -	test.len = test_len;
> +	test.len = min_t(u32, test.len, test_len);
>  	data = kcalloc(test_len, sizeof(u64), GFP_USER);
>  	if (!data)
>  		return -ENOMEM;

This is the wrong spot to be doing this. You need to use test_len for
your allocation as that is what the driver will be writing to. You
should look at adjusting after the allocation call and before you do
the copy

> @@ -1915,7 +1915,8 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
>  	if (copy_to_user(useraddr, &test, sizeof(test)))
>  		goto out;
>  	useraddr += sizeof(test);
> -	if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
> +	if (test.len &&
> +	    copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
>  		goto out;
>  	ret = 0;
>  

I don't believe this is adding any value. I wouldn't bother with
checking for lengths of 0.

> @@ -1940,10 +1941,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
>  		return -ENOMEM;
>  	WARN_ON_ONCE(!ret);
>  
> -	gstrings.len = ret;
> +	gstrings.len = min_t(u32, gstrings.len, ret);
>  
>  	if (gstrings.len) {
> -		data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
> +		data = vzalloc(array_size(ret, ETH_GSTRING_LEN));
>  		if (!data)
>  			return -ENOMEM;
>  

Same here. We should be using the returned value for the allocations
and tests, and then doing the min adjustment after the allocationis
completed.

> @@ -2055,9 +2056,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
>  	if (copy_from_user(&stats, useraddr, sizeof(stats)))
>  		return -EFAULT;
>  
> -	stats.n_stats = n_stats;
> +	stats.n_stats = min_t(u32, stats.n_stats, n_stats);
>  
> -	if (n_stats) {
> +	if (stats.n_stats) {
>  		data = vzalloc(array_size(n_stats, sizeof(u64)));
>  		if (!data)
>  			return -ENOMEM;

Same here. We should be using n_stats, not stats.n_stats and adjust
before you do the final copy.

> @@ -2070,7 +2071,8 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
>  	if (copy_to_user(useraddr, &stats, sizeof(stats)))
>  		goto out;
>  	useraddr += sizeof(stats);
> -	if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64))))
> +	if (stats.n_stats &&
> +	    copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64))))
>  		goto out;
>  	ret = 0;
>  

Again. I am not sure what value is being added. If n_stats is 0 then I
am pretty sure this will do nothing anyway.
  
Ding Hui June 2, 2023, 1:46 a.m. UTC | #2
On 2023/6/1 23:04, Alexander H Duyck wrote:
> On Thu, 2023-06-01 at 19:28 +0800, Ding Hui wrote:
>> When we get statistics by ethtool during changing the number of NIC
>> channels greater, the utility may crash due to memory corruption.
>>
>> The NIC drivers callback get_sset_count() could return a calculated
>> length depends on current number of channels (e.g. i40e, igb).
>>
> 
> The drivers shouldn't be changing that value. If the drivers are doing
> this they should be fixed to provide a fixed length in terms of their
> strings.
> 

Is there an explicit declaration for the rule?
I searched the ethernet drivers, found that many drivers that support
multiple queues return calculated length by number of queues rather than
fixed value. So pushing all these drivers to follow the rule is hard
to me.

>> The ethtool allocates a user buffer with the first ioctl returned
>> length and invokes the second ioctl to get data. The kernel copies
>> data to the user buffer but without checking its length. If the length
>> returned by the second get_sset_count() is greater than the length
>> allocated by the user, it will lead to an out-of-bounds copy.
>>
>> Fix it by restricting the copy length not exceed the buffer length
>> specified by userspace.
>>
>> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> 
> Changing the copy size would not fix this. The problem is the driver
> will be overwriting with the size that it thinks it should be using.
> Reducing the value that is provided for the memory allocations will
> cause the driver to corrupt memory.
> 

I noticed that, in fact I did use the returned length to allocate
kernel memory, and only use adjusted length to copy to user.

>> ---
>>   net/ethtool/ioctl.c | 16 +++++++++-------
>>   1 file changed, 9 insertions(+), 7 deletions(-)
>>
>> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
>> index 6bb778e10461..82a975a9c895 100644
>> --- a/net/ethtool/ioctl.c
>> +++ b/net/ethtool/ioctl.c
>> @@ -1902,7 +1902,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
>>   	if (copy_from_user(&test, useraddr, sizeof(test)))
>>   		return -EFAULT;
>>   
>> -	test.len = test_len;
>> +	test.len = min_t(u32, test.len, test_len);
>>   	data = kcalloc(test_len, sizeof(u64), GFP_USER);
>>   	if (!data)
>>   		return -ENOMEM;
> 
> This is the wrong spot to be doing this. You need to use test_len for
> your allocation as that is what the driver will be writing to. You
> should look at adjusting after the allocation call and before you do
> the copy

data = kcalloc(test_len, sizeof(u64), GFP_USER);  // yes, **test_len** for kernel memory
...
copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)) // **test.len** only for copy to user

> 
>> @@ -1915,7 +1915,8 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
>>   	if (copy_to_user(useraddr, &test, sizeof(test)))
>>   		goto out;
>>   	useraddr += sizeof(test);
>> -	if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
>> +	if (test.len &&
>> +	    copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
>>   		goto out;
>>   	ret = 0;
>>   
> 
> I don't believe this is adding any value. I wouldn't bother with
> checking for lengths of 0.
> 

Yes, we already checked the data ptr is not NULL, so we don't need checking test.len.
I'll remove it in v2.

>> @@ -1940,10 +1941,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
>>   		return -ENOMEM;
>>   	WARN_ON_ONCE(!ret);
>>   
>> -	gstrings.len = ret;
>> +	gstrings.len = min_t(u32, gstrings.len, ret);
>>   
>>   	if (gstrings.len) {
>> -		data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
>> +		data = vzalloc(array_size(ret, ETH_GSTRING_LEN));
>>   		if (!data)
>>   			return -ENOMEM;
>>   
> 
> Same here. We should be using the returned value for the allocations
> and tests, and then doing the min adjustment after the allocationis
> completed.
> 

gstrings.len = min_t(u32, gstrings.len, ret);   // adjusting

if (gstrings.len) {  // checking the adjusted gstrings.len can avoid unnecessary vzalloc and __ethtool_get_strings()
vzalloc(array_size(ret, ETH_GSTRING_LEN)); // **ret** for kernel memory, rather than adjusted lenght

At last, adjusted gstrings.len for copy to user

>> @@ -2055,9 +2056,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
>>   	if (copy_from_user(&stats, useraddr, sizeof(stats)))
>>   		return -EFAULT;
>>   
>> -	stats.n_stats = n_stats;
>> +	stats.n_stats = min_t(u32, stats.n_stats, n_stats);
>>   
>> -	if (n_stats) {
>> +	if (stats.n_stats) {
>>   		data = vzalloc(array_size(n_stats, sizeof(u64)));
>>   		if (!data)
>>   			return -ENOMEM;
> 
> Same here. We should be using n_stats, not stats.n_stats and adjust
> before you do the final copy.
> 
>> @@ -2070,7 +2071,8 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
>>   	if (copy_to_user(useraddr, &stats, sizeof(stats)))
>>   		goto out;
>>   	useraddr += sizeof(stats);
>> -	if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64))))
>> +	if (stats.n_stats &&
>> +	    copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64))))
>>   		goto out;
>>   	ret = 0;
>>   
> 
> Again. I am not sure what value is being added. If n_stats is 0 then I
> am pretty sure this will do nothing anyway.
> 

Not really no, n_stats is returned value, and stats.n_stats is adjusted value,
if the adjusted stats.n_stats is 0, we avoid memory allocation and setting data ptr
to NULL, copy_to_user() with NULL ptr maybe cause warn log.
  
Andrew Lunn June 2, 2023, 12:26 p.m. UTC | #3
> > Changing the copy size would not fix this. The problem is the driver
> > will be overwriting with the size that it thinks it should be using.
> > Reducing the value that is provided for the memory allocations will
> > cause the driver to corrupt memory.
> > 
> 
> I noticed that, in fact I did use the returned length to allocate
> kernel memory, and only use adjusted length to copy to user.

This is also something i checked when quickly looking at the patch. It
does look correct.

Also, RTNL should be held during the time both calls are made into the
driver. So nothing from userspace should be able to get in the middle
of these calls to change the number of queues.

	Andrew
  
Ding Hui June 2, 2023, 3:01 p.m. UTC | #4
On 2023/6/2 8:26 下午, Andrew Lunn wrote:
>>> Changing the copy size would not fix this. The problem is the driver
>>> will be overwriting with the size that it thinks it should be using.
>>> Reducing the value that is provided for the memory allocations will
>>> cause the driver to corrupt memory.
>>>
>>
>> I noticed that, in fact I did use the returned length to allocate
>> kernel memory, and only use adjusted length to copy to user.
> 
> This is also something i checked when quickly looking at the patch. It
> does look correct.
> 

Thanks.

> Also, RTNL should be held during the time both calls are made into the
> driver. So nothing from userspace should be able to get in the middle
> of these calls to change the number of queues.
> 

The RTNL lock is already be held during every each ioctl in dev_ethtool().

     rtnl_lock();
     rc = __dev_ethtool(net, ifr, useraddr, ethcmd, state);
     rtnl_unlock();
  
Alexander Duyck June 2, 2023, 3:30 p.m. UTC | #5
On Thu, Jun 1, 2023 at 6:46 PM Ding Hui <dinghui@sangfor.com.cn> wrote:
>
> On 2023/6/1 23:04, Alexander H Duyck wrote:
> > On Thu, 2023-06-01 at 19:28 +0800, Ding Hui wrote:
> >> When we get statistics by ethtool during changing the number of NIC
> >> channels greater, the utility may crash due to memory corruption.
> >>
> >> The NIC drivers callback get_sset_count() could return a calculated
> >> length depends on current number of channels (e.g. i40e, igb).
> >>
> >
> > The drivers shouldn't be changing that value. If the drivers are doing
> > this they should be fixed to provide a fixed length in terms of their
> > strings.
> >
>
> Is there an explicit declaration for the rule?
> I searched the ethernet drivers, found that many drivers that support
> multiple queues return calculated length by number of queues rather than
> fixed value. So pushing all these drivers to follow the rule is hard
> to me.
>
> >> The ethtool allocates a user buffer with the first ioctl returned
> >> length and invokes the second ioctl to get data. The kernel copies
> >> data to the user buffer but without checking its length. If the length
> >> returned by the second get_sset_count() is greater than the length
> >> allocated by the user, it will lead to an out-of-bounds copy.
> >>
> >> Fix it by restricting the copy length not exceed the buffer length
> >> specified by userspace.
> >>
> >> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> >
> > Changing the copy size would not fix this. The problem is the driver
> > will be overwriting with the size that it thinks it should be using.
> > Reducing the value that is provided for the memory allocations will
> > cause the driver to corrupt memory.
> >
>
> I noticed that, in fact I did use the returned length to allocate
> kernel memory, and only use adjusted length to copy to user.

Ah, okay I hadn't noticed that part. Although that leads me to the
question of if any of the drivers might be modifying the length values
stored in the structures. We may want to add a new stack variable to
track what the clamped value is for these rather than just leaving the
value stored in the structure.

> >> ---
> >>   net/ethtool/ioctl.c | 16 +++++++++-------
> >>   1 file changed, 9 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
> >> index 6bb778e10461..82a975a9c895 100644
> >> --- a/net/ethtool/ioctl.c
> >> +++ b/net/ethtool/ioctl.c
> >> @@ -1902,7 +1902,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
> >>      if (copy_from_user(&test, useraddr, sizeof(test)))
> >>              return -EFAULT;
> >>
> >> -    test.len = test_len;
> >> +    test.len = min_t(u32, test.len, test_len);
> >>      data = kcalloc(test_len, sizeof(u64), GFP_USER);
> >>      if (!data)
> >>              return -ENOMEM;
> >
> > This is the wrong spot to be doing this. You need to use test_len for
> > your allocation as that is what the driver will be writing to. You
> > should look at adjusting after the allocation call and before you do
> > the copy
>
> data = kcalloc(test_len, sizeof(u64), GFP_USER);  // yes, **test_len** for kernel memory
> ...
> copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)) // **test.len** only for copy to user

One other thought on this. Would we ever expect the length value to
change? For many of these I wonder if we shouldn't just return an
error in the case that there isn't enough space to store the test
results.

It might make sense to look at adding a return of ENOSPC or EFBIG when
we encounter a size difference where our output is too big for the
provided userspace buffer. At least with that we are not losing data.

> >
> >> @@ -1915,7 +1915,8 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
> >>      if (copy_to_user(useraddr, &test, sizeof(test)))
> >>              goto out;
> >>      useraddr += sizeof(test);
> >> -    if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
> >> +    if (test.len &&
> >> +        copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
> >>              goto out;
> >>      ret = 0;
> >>
> >
> > I don't believe this is adding any value. I wouldn't bother with
> > checking for lengths of 0.
> >
>
> Yes, we already checked the data ptr is not NULL, so we don't need checking test.len.
> I'll remove it in v2.
>
> >> @@ -1940,10 +1941,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
> >>              return -ENOMEM;
> >>      WARN_ON_ONCE(!ret);
> >>
> >> -    gstrings.len = ret;
> >> +    gstrings.len = min_t(u32, gstrings.len, ret);
> >>
> >>      if (gstrings.len) {
> >> -            data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
> >> +            data = vzalloc(array_size(ret, ETH_GSTRING_LEN));
> >>              if (!data)
> >>                      return -ENOMEM;
> >>
> >
> > Same here. We should be using the returned value for the allocations
> > and tests, and then doing the min adjustment after the allocationis
> > completed.
> >
>
> gstrings.len = min_t(u32, gstrings.len, ret);   // adjusting
>
> if (gstrings.len) {  // checking the adjusted gstrings.len can avoid unnecessary vzalloc and __ethtool_get_strings()
> vzalloc(array_size(ret, ETH_GSTRING_LEN)); // **ret** for kernel memory, rather than adjusted lenght
>
> At last, adjusted gstrings.len for copy to user

I see what you are talking about now.

> >> @@ -2055,9 +2056,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
> >>      if (copy_from_user(&stats, useraddr, sizeof(stats)))
> >>              return -EFAULT;
> >>
> >> -    stats.n_stats = n_stats;
> >> +    stats.n_stats = min_t(u32, stats.n_stats, n_stats);
> >>
> >> -    if (n_stats) {
> >> +    if (stats.n_stats) {
> >>              data = vzalloc(array_size(n_stats, sizeof(u64)));
> >>              if (!data)
> >>                      return -ENOMEM;
> >
> > Same here. We should be using n_stats, not stats.n_stats and adjust
> > before you do the final copy.
> >
> >> @@ -2070,7 +2071,8 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
> >>      if (copy_to_user(useraddr, &stats, sizeof(stats)))
> >>              goto out;
> >>      useraddr += sizeof(stats);
> >> -    if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64))))
> >> +    if (stats.n_stats &&
> >> +        copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64))))
> >>              goto out;
> >>      ret = 0;
> >>
> >
> > Again. I am not sure what value is being added. If n_stats is 0 then I
> > am pretty sure this will do nothing anyway.
> >
>
> Not really no, n_stats is returned value, and stats.n_stats is adjusted value,
> if the adjusted stats.n_stats is 0, we avoid memory allocation and setting data ptr
> to NULL, copy_to_user() with NULL ptr maybe cause warn log.

I see now. So data is NULL if stats.n_stats is 0.
  
Andrew Lunn June 2, 2023, 3:37 p.m. UTC | #6
> > Also, RTNL should be held during the time both calls are made into the
> > driver. So nothing from userspace should be able to get in the middle
> > of these calls to change the number of queues.
> > 
> 
> The RTNL lock is already be held during every each ioctl in dev_ethtool().
> 
>     rtnl_lock();
>     rc = __dev_ethtool(net, ifr, useraddr, ethcmd, state);
>     rtnl_unlock();

Yes, exactly. So the kernel should be safe from buffer overruns.

Userspace will not get more than it asked for. It might get less, and
it could be different to the previous calls. But i'm not aware of
anything which says anything about the consistency between different
invocations of ethtool -S.

	Andrew
  
Alexander Duyck June 2, 2023, 4:01 p.m. UTC | #7
On Fri, Jun 2, 2023 at 8:42 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > > Also, RTNL should be held during the time both calls are made into the
> > > driver. So nothing from userspace should be able to get in the middle
> > > of these calls to change the number of queues.
> > >
> >
> > The RTNL lock is already be held during every each ioctl in dev_ethtool().
> >
> >     rtnl_lock();
> >     rc = __dev_ethtool(net, ifr, useraddr, ethcmd, state);
> >     rtnl_unlock();
>
> Yes, exactly. So the kernel should be safe from buffer overruns.
>
> Userspace will not get more than it asked for. It might get less, and
> it could be different to the previous calls. But i'm not aware of
> anything which says anything about the consistency between different
> invocations of ethtool -S.

The problem is the userspace allocation ends up requiring we make two
calls into the stack. So it takes the lock once to get the count,
performs the allocation, and then calls into ethtool again taking the
lock and by then the value may have changed.

Within each call it is held the entire time, but the userspace has to
make two calls. So in between the two the number of rings could
potentially change.

What this change is essentially doing is clamping the copied data to
the lesser of the current value versus the value when the userspace
was allocated. However I am wondering now if we shouldn't just update
the size value and return that as some sort of error for the userspace
to potentially reallocate and repeat until it has the right size.
  
Andrew Lunn June 2, 2023, 4:37 p.m. UTC | #8
> What this change is essentially doing is clamping the copied data to
> the lesser of the current value versus the value when the userspace
> was allocated. However I am wondering now if we shouldn't just update
> the size value and return that as some sort of error for the userspace
> to potentially reallocate and repeat until it has the right size.

I'm not sure we should be putting any effort into the IOCTL
interface. It is deprecated. We should fix overrun problems, but i
would not change the API. Netlink handles this atomically, and that is
the interface tools should be using, not this IOCTL.

    Andrew
  
Alexander Duyck June 2, 2023, 6:02 p.m. UTC | #9
On Fri, Jun 2, 2023 at 9:37 AM Andrew Lunn <andrew@lunn.ch> wrote:
>
> > What this change is essentially doing is clamping the copied data to
> > the lesser of the current value versus the value when the userspace
> > was allocated. However I am wondering now if we shouldn't just update
> > the size value and return that as some sort of error for the userspace
> > to potentially reallocate and repeat until it has the right size.
>
> I'm not sure we should be putting any effort into the IOCTL
> interface. It is deprecated. We should fix overrun problems, but i
> would not change the API. Netlink handles this atomically, and that is
> the interface tools should be using, not this IOCTL.

If that is the case maybe it would just make more sense to just return
an error if we are at risk of overrunning the userspace allocated
buffer.
  
Ding Hui June 3, 2023, 1:51 a.m. UTC | #10
On 2023/6/3 2:02, Alexander Duyck wrote:
> On Fri, Jun 2, 2023 at 9:37 AM Andrew Lunn <andrew@lunn.ch> wrote:
>>
>>> What this change is essentially doing is clamping the copied data to
>>> the lesser of the current value versus the value when the userspace
>>> was allocated. However I am wondering now if we shouldn't just update
>>> the size value and return that as some sort of error for the userspace
>>> to potentially reallocate and repeat until it has the right size.
>>
>> I'm not sure we should be putting any effort into the IOCTL
>> interface. It is deprecated. We should fix overrun problems, but i
>> would not change the API. Netlink handles this atomically, and that is
>> the interface tools should be using, not this IOCTL.
> 
> If that is the case maybe it would just make more sense to just return
> an error if we are at risk of overrunning the userspace allocated
> buffer.
> 

In that case, I can modify to return an error, however, I think the
ENOSPC or EFBIG mentioned in a previous email may not be suitable,
maybe like others length/size checking return EINVAL.

Another thing I wondered is that should I update the current length
back to user if user buffer is not enough, assuming we update the new
length with error returned, the userspace can use it to reallocate
buffer if he wants to, which can avoid re-call previous ioctl to get
the new length.
  
Jakub Kicinski June 3, 2023, 5:55 a.m. UTC | #11
On Sat, 3 Jun 2023 09:51:34 +0800 Ding Hui wrote:
> > If that is the case maybe it would just make more sense to just return
> > an error if we are at risk of overrunning the userspace allocated
> > buffer.
> 
> In that case, I can modify to return an error, however, I think the
> ENOSPC or EFBIG mentioned in a previous email may not be suitable,
> maybe like others length/size checking return EINVAL.
> 
> Another thing I wondered is that should I update the current length
> back to user if user buffer is not enough, assuming we update the new
> length with error returned, the userspace can use it to reallocate
> buffer if he wants to, which can avoid re-call previous ioctl to get
> the new length.

This entire thread presupposes that user provides the length of 
the buffer. I don't see that in the code. Take ethtool_get_stats()
as an example, you assume that stats.n_stats is set correctly,
but it's not enforced today. Some app somewhere may pass in zeroed
out stats and work just fine.
  
Ding Hui June 3, 2023, 7:11 a.m. UTC | #12
On 2023/6/3 13:55, Jakub Kicinski wrote:
> On Sat, 3 Jun 2023 09:51:34 +0800 Ding Hui wrote:
>>> If that is the case maybe it would just make more sense to just return
>>> an error if we are at risk of overrunning the userspace allocated
>>> buffer.
>>
>> In that case, I can modify to return an error, however, I think the
>> ENOSPC or EFBIG mentioned in a previous email may not be suitable,
>> maybe like others length/size checking return EINVAL.
>>
>> Another thing I wondered is that should I update the current length
>> back to user if user buffer is not enough, assuming we update the new
>> length with error returned, the userspace can use it to reallocate
>> buffer if he wants to, which can avoid re-call previous ioctl to get
>> the new length.
> 
> This entire thread presupposes that user provides the length of
> the buffer. I don't see that in the code. Take ethtool_get_stats()
> as an example, you assume that stats.n_stats is set correctly,
> but it's not enforced today. Some app somewhere may pass in zeroed
> out stats and work just fine.
> 

Yes.

I checked the others ioctl (e.g. ethtool_get_eeprom(), ethtool_get_features()),
and searched the git log of ethtool utility, so I think that is an implicit
rule and the check is missed in kernel where the patch involves.

Without this rule, we cannot guarantee the safety of copy to user.

Should we keep to be compatible with that incorrect userspace usage?
  
Jakub Kicinski June 4, 2023, 5:47 p.m. UTC | #13
On Sat, 3 Jun 2023 15:11:29 +0800 Ding Hui wrote:
> Yes.
> 
> I checked the others ioctl (e.g. ethtool_get_eeprom(), ethtool_get_features()),
> and searched the git log of ethtool utility, so I think that is an implicit
> rule and the check is missed in kernel where the patch involves.
> 
> Without this rule, we cannot guarantee the safety of copy to user.
> 
> Should we keep to be compatible with that incorrect userspace usage?

If such incorrect user space exists we do, if it doesn't we don't.
Problem is that we don't know what exists out there.

Maybe we can add a pr_err_once() complaining about bad usage for now
and see if anyone reports back that they are hitting it?
  
Ding Hui June 5, 2023, 3:39 a.m. UTC | #14
On 2023/6/5 1:47, Jakub Kicinski wrote:
> On Sat, 3 Jun 2023 15:11:29 +0800 Ding Hui wrote:
>> Yes.
>>
>> I checked the others ioctl (e.g. ethtool_get_eeprom(), ethtool_get_features()),
>> and searched the git log of ethtool utility, so I think that is an implicit
>> rule and the check is missed in kernel where the patch involves.
>>
>> Without this rule, we cannot guarantee the safety of copy to user.
>>
>> Should we keep to be compatible with that incorrect userspace usage?
> 
> If such incorrect user space exists we do, if it doesn't we don't.
> Problem is that we don't know what exists out there.
> 
> Maybe we can add a pr_err_once() complaining about bad usage for now
> and see if anyone reports back that they are hitting it?
> 

How about this:

Case 1:
If the user len/n_stats is not zero, we will treat it as correct usage
(although we cannot distinguish between the real correct usage and
uninitialized usage). Return -EINVAL if current length exceed the one
user specified.

Case 2:
If it is zero, we will treat it as incorrect usage, we can add a
pr_err_once() for it and keep to be compatible with it for a period of time.
At a suitable time in the future, this part can be removed by maintainers.
  
Jakub Kicinski June 5, 2023, 6:39 p.m. UTC | #15
On Mon, 5 Jun 2023 11:39:59 +0800 Ding Hui wrote:
> Case 1:
> If the user len/n_stats is not zero, we will treat it as correct usage
> (although we cannot distinguish between the real correct usage and
> uninitialized usage). Return -EINVAL if current length exceed the one
> user specified.

This assumes user will zero-initialize the value rather than do
something like:

	buf = malloc(1 << 16); // 64k should always be enough
	ioctl(s, ETHTOOL_GSTATS, buf)

	for (i = 0; i < buf.n_stats; i++)
		/* use stats */

:(

> Case 2:
> If it is zero, we will treat it as incorrect usage, we can add a
> pr_err_once() for it and keep to be compatible with it for a period of time.
> At a suitable time in the future, this part can be removed by maintainers.
  
Ding Hui June 8, 2023, 9:06 a.m. UTC | #16
On 2023/6/6 2:39, Jakub Kicinski wrote:
> On Mon, 5 Jun 2023 11:39:59 +0800 Ding Hui wrote:
>> Case 1:
>> If the user len/n_stats is not zero, we will treat it as correct usage
>> (although we cannot distinguish between the real correct usage and
>> uninitialized usage). Return -EINVAL if current length exceed the one
>> user specified.
> 
> This assumes user will zero-initialize the value rather than do
> something like:
> 
> 	buf = malloc(1 << 16); // 64k should always be enough
> 	ioctl(s, ETHTOOL_GSTATS, buf)
> 
> 	for (i = 0; i < buf.n_stats; i++)
> 		/* use stats */
> 
> :(
> 

Sorry for late.

Now I'm not sure what can I do next besides reporting the issue.

>> Case 2:
>> If it is zero, we will treat it as incorrect usage, we can add a
>> pr_err_once() for it and keep to be compatible with it for a period of time.
>> At a suitable time in the future, this part can be removed by maintainers.
>
  
Alexander Duyck June 8, 2023, 2:17 p.m. UTC | #17
On Thu, Jun 8, 2023 at 2:06 AM Ding Hui <dinghui@sangfor.com.cn> wrote:
>
> On 2023/6/6 2:39, Jakub Kicinski wrote:
> > On Mon, 5 Jun 2023 11:39:59 +0800 Ding Hui wrote:
> >> Case 1:
> >> If the user len/n_stats is not zero, we will treat it as correct usage
> >> (although we cannot distinguish between the real correct usage and
> >> uninitialized usage). Return -EINVAL if current length exceed the one
> >> user specified.
> >
> > This assumes user will zero-initialize the value rather than do
> > something like:
> >
> >       buf = malloc(1 << 16); // 64k should always be enough
> >       ioctl(s, ETHTOOL_GSTATS, buf)
> >
> >       for (i = 0; i < buf.n_stats; i++)
> >               /* use stats */
> >
> > :(
> >
>
> Sorry for late.
>
> Now I'm not sure what can I do next besides reporting the issue.

Well as I said before. The issue points to a driver problem more than
anything else.

Normally the solution is to make it so that the counts don't
fluctuate. That mostly consists of providing strings equal to the
maximum count, and then 0 populating the data for any fields that
don't exist in the case of ethtool -S.

So for example in the case of igb which you had pointed out you could
take a look at ixgbe for inspiration on how to fix it since the two
drivers should be similar and one has the issue and one does not.
  
Ding Hui June 9, 2023, 3:25 p.m. UTC | #18
On 2023/6/8 22:17, Alexander Duyck wrote:
> 
> Well as I said before. The issue points to a driver problem more than
> anything else.
> 
> Normally the solution is to make it so that the counts don't
> fluctuate. That mostly consists of providing strings equal to the
> maximum count, and then 0 populating the data for any fields that
> don't exist in the case of ethtool -S.
> 
> So for example in the case of igb which you had pointed out you could
> take a look at ixgbe for inspiration on how to fix it since the two
> drivers should be similar and one has the issue and one does not.
> 

Thanks.

FYI, I just did some rough research about which drivers return constant count
and which not.

Variable (61/188):

drivers/net/dsa/bcm_sf2.c:    .get_sset_count         = bcm_sf2_sw_get_sset_count,
drivers/net/ethernet/amazon/ena/ena_ethtool.c:        .get_sset_count         = ena_get_sset_count,
drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c: .get_sset_count = xgbe_get_sset_count,
drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c:  .get_sset_count      = aq_ethtool_get_sset_count,
drivers/net/ethernet/broadcom/bcmsysport.c:   .get_sset_count         = bcm_sysport_get_sset_count,
drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c:  .get_sset_count         = bnx2x_get_sset_count,
drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c:    .get_sset_count         = bnxt_get_sset_count,
drivers/net/ethernet/brocade/bna/bnad_ethtool.c:      .get_sset_count = bnad_get_sset_count,
drivers/net/ethernet/cadence/macb_main.c:     .get_sset_count         = gem_get_sset_count,
drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:   .get_sset_count         = lio_get_sset_count,
drivers/net/ethernet/cavium/liquidio/lio_ethtool.c:   .get_sset_count         = lio_vf_get_sset_count,
drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c:  .get_sset_count         = nicvf_get_sset_count,
drivers/net/ethernet/emulex/benet/be_ethtool.c:       .get_sset_count = be_get_sset_count,
drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c:   .get_sset_count = dpaa_get_sset_count,
drivers/net/ethernet/freescale/enetc/enetc_ethtool.c: .get_sset_count = enetc_get_sset_count,
drivers/net/ethernet/fungible/funeth/funeth_ethtool.c:        .get_sset_count      = fun_get_sset_count,
drivers/net/ethernet/google/gve/gve_ethtool.c:        .get_sset_count = gve_get_sset_count,
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c:   .get_sset_count = hns3_get_sset_count,
drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c:      .get_sset_count = hclge_get_sset_count,
drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c:    .get_sset_count = hclgevf_get_sset_count,
drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c:    .get_sset_count = hns_ae_get_sset_count,
drivers/net/ethernet/hisilicon/hns/hns_ethtool.c:     .get_sset_count = hns_get_sset_count,
drivers/net/ethernet/huawei/hinic/hinic_ethtool.c:    .get_sset_count = hinic_get_sset_count,
drivers/net/ethernet/ibm/ibmvnic.c:   .get_sset_count         = ibmvnic_get_sset_count,
drivers/net/ethernet/intel/i40e/i40e_ethtool.c:       .get_sset_count         = i40e_get_sset_count,
drivers/net/ethernet/intel/iavf/iavf_ethtool.c:       .get_sset_count         = iavf_get_sset_count,
drivers/net/ethernet/intel/ice/ice_ethtool.c: .get_sset_count         = ice_get_sset_count,
drivers/net/ethernet/intel/igb/igb_ethtool.c: .get_sset_count         = igb_get_sset_count,
drivers/net/ethernet/intel/igc/igc_ethtool.c: .get_sset_count         = igc_ethtool_get_sset_count,
drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c:     .get_sset_count         = ixgbe_get_sset_count,
drivers/net/ethernet/intel/ixgbevf/ethtool.c: .get_sset_count         = ixgbevf_get_sset_count,
drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c:      .get_sset_count         = mvpp2_ethtool_get_sset_count,
drivers/net/ethernet/marvell/octeon_ep/octep_ethtool.c:       .get_sset_count = octep_get_sset_count,
drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c:    .get_sset_count         = otx2_get_sset_count,
drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c:    .get_sset_count         = otx2vf_get_sset_count,
drivers/net/ethernet/mellanox/mlx4/en_ethtool.c:      .get_sset_count = mlx4_en_get_sset_count,
drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: .get_sset_count    = mlx5e_get_sset_count,
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:     .get_sset_count    = mlx5e_rep_get_sset_count,
drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c:      .get_sset_count     = mlx5i_get_sset_count,
drivers/net/ethernet/microsoft/mana/mana_ethtool.c:   .get_sset_count         = mana_get_sset_count,
drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_net_get_sset_count,
drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_port_get_sset_count,
drivers/net/ethernet/pensando/ionic/ionic_ethtool.c:  .get_sset_count         = ionic_get_sset_count,
drivers/net/ethernet/qlogic/qede/qede_ethtool.c:      .get_sset_count                 = qede_get_sset_count,
drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c:  .get_sset_count = qlcnic_get_sset_count,
drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c:  .get_sset_count         = qlcnic_get_sset_count,
drivers/net/ethernet/sfc/ef100_ethtool.c:     .get_sset_count         = efx_ethtool_get_sset_count,
drivers/net/ethernet/sfc/ethtool.c:   .get_sset_count         = efx_ethtool_get_sset_count,
drivers/net/ethernet/sfc/falcon/ethtool.c:    .get_sset_count         = ef4_ethtool_get_sset_count,
drivers/net/ethernet/sfc/siena/ethtool.c:     .get_sset_count         = efx_siena_ethtool_get_sset_count,
drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c: .get_sset_count = stmmac_get_sset_count,
drivers/net/ethernet/sun/niu.c:       .get_sset_count         = niu_get_sset_count,
drivers/net/ethernet/sun/sunvnet.c:   .get_sset_count         = vnet_get_sset_count,
drivers/net/ethernet/ti/cpsw.c:       .get_sset_count         = cpsw_get_sset_count,
drivers/net/ethernet/ti/cpsw_new.c:   .get_sset_count         = cpsw_get_sset_count,
drivers/net/hyperv/netvsc_drv.c:      .get_sset_count = netvsc_get_sset_count,
drivers/net/ifb.c:    .get_sset_count         = ifb_get_sset_count,
drivers/net/veth.c:   .get_sset_count         = veth_get_sset_count,
drivers/net/virtio_net.c:     .get_sset_count = virtnet_get_sset_count,
drivers/net/vmxnet3/vmxnet3_ethtool.c:        .get_sset_count    = vmxnet3_get_sset_count,
drivers/s390/net/qeth_ethtool.c:      .get_sset_count = qeth_get_sset_count,


Constant (127/188):

arch/um/drivers/vector_kern.c:        .get_sset_count = vector_get_sset_count,
drivers/infiniband/ulp/ipoib/ipoib_ethtool.c: .get_sset_count         = ipoib_get_sset_count,
drivers/infiniband/ulp/opa_vnic/opa_vnic_ethtool.c:   .get_sset_count = vnic_get_sset_count,
drivers/net/can/flexcan/flexcan-ethtool.c:    .get_sset_count = flexcan_get_sset_count,
drivers/net/can/slcan/slcan-ethtool.c:        .get_sset_count = slcan_get_sset_count,
drivers/net/dsa/b53/b53_common.c:     .get_sset_count         = b53_get_sset_count,
drivers/net/dsa/dsa_loop.c:   .get_sset_count         = dsa_loop_get_sset_count,
drivers/net/dsa/hirschmann/hellcreek.c:       .get_sset_count        = hellcreek_get_sset_count,
drivers/net/dsa/lan9303-core.c:       .get_sset_count         = lan9303_get_sset_count,
drivers/net/dsa/lantiq_gswip.c:       .get_sset_count         = gswip_get_sset_count,
drivers/net/dsa/microchip/ksz_common.c:       .get_sset_count         = ksz_sset_count,
drivers/net/dsa/mt7530.c:     .get_sset_count         = mt7530_get_sset_count,
drivers/net/dsa/mv88e6xxx/chip.c:     .get_sset_count         = mv88e6xxx_get_sset_count,
drivers/net/dsa/ocelot/felix.c:       .get_sset_count                 = felix_get_sset_count,
drivers/net/dsa/qca/qca8k-8xxx.c:     .get_sset_count         = qca8k_get_sset_count,
drivers/net/dsa/realtek/rtl8365mb.c:  .get_sset_count = rtl8365mb_get_sset_count,
drivers/net/dsa/realtek/rtl8366rb.c:  .get_sset_count = rtl8366_get_sset_count,
drivers/net/dsa/rzn1_a5psw.c: .get_sset_count = a5psw_get_sset_count,
drivers/net/dsa/sja1105/sja1105_main.c:       .get_sset_count         = sja1105_get_sset_count,
drivers/net/dsa/vitesse-vsc73xx-core.c:       .get_sset_count = vsc73xx_get_sset_count,
drivers/net/dsa/xrs700x/xrs700x.c:    .get_sset_count         = xrs700x_get_sset_count,
drivers/net/ethernet/3com/3c59x.c:    .get_sset_count         = vortex_get_sset_count,
drivers/net/ethernet/alacritech/slicoss.c:    .get_sset_count         = slic_get_sset_count,
drivers/net/ethernet/altera/altera_tse_ethtool.c:     .get_sset_count = tse_sset_count,
drivers/net/ethernet/amd/pcnet32.c:   .get_sset_count         = pcnet32_get_sset_count,
drivers/net/ethernet/apm/xgene-v2/ethtool.c:  .get_sset_count = xge_get_sset_count,
drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c:  .get_sset_count = xgene_get_sset_count,
drivers/net/ethernet/asix/ax88796c_ioctl.c:   .get_sset_count         = ax88796c_get_sset_count,
drivers/net/ethernet/atheros/ag71xx.c:        .get_sset_count                 = ag71xx_ethtool_get_sset_count,
drivers/net/ethernet/atheros/alx/ethtool.c:   .get_sset_count = alx_get_sset_count,
drivers/net/ethernet/atheros/atlx/atl1.c:     .get_sset_count         = atl1_get_sset_count,
drivers/net/ethernet/broadcom/b44.c:  .get_sset_count         = b44_get_sset_count,
drivers/net/ethernet/broadcom/bcm63xx_enet.c: .get_sset_count         = bcm_enet_get_sset_count,
drivers/net/ethernet/broadcom/bcm63xx_enet.c: .get_sset_count         = bcm_enetsw_get_sset_count,
drivers/net/ethernet/broadcom/bgmac.c:        .get_sset_count         = bgmac_get_sset_count,
drivers/net/ethernet/broadcom/bnx2.c: .get_sset_count         = bnx2_get_sset_count,
drivers/net/ethernet/broadcom/genet/bcmgenet.c:       .get_sset_count         = bcmgenet_get_sset_count,
drivers/net/ethernet/broadcom/tg3.c:  .get_sset_count         = tg3_get_sset_count,
drivers/net/ethernet/calxeda/xgmac.c: .get_sset_count = xgmac_get_sset_count,
drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c:      .get_sset_count = get_sset_count,
drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c:   .get_sset_count    = get_sset_count,
drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c:  .get_sset_count         = cxgb4vf_get_sset_count,
drivers/net/ethernet/chelsio/cxgb/cxgb2.c:    .get_sset_count    = get_sset_count,
drivers/net/ethernet/cisco/enic/enic_ethtool.c:       .get_sset_count = enic_get_sset_count,
drivers/net/ethernet/cortina/gemini.c:        .get_sset_count = gmac_get_sset_count,
drivers/net/ethernet/dlink/sundance.c:        .get_sset_count = get_sset_count,
drivers/net/ethernet/engleder/tsnep_ethtool.c:        .get_sset_count = tsnep_ethtool_get_sset_count,
drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c: .get_sset_count = dpaa2_eth_get_sset_count,
drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-ethtool.c:  .get_sset_count         = dpaa2_switch_ethtool_get_sset_count,
drivers/net/ethernet/freescale/fec_main.c:    .get_sset_count         = fec_enet_get_sset_count,
drivers/net/ethernet/freescale/gianfar_ethtool.c:     .get_sset_count = gfar_sset_count,
drivers/net/ethernet/freescale/ucc_geth_ethtool.c:    .get_sset_count         = uec_get_sset_count,
drivers/net/ethernet/ibm/ehea/ehea_ethtool.c: .get_sset_count = ehea_get_sset_count,
drivers/net/ethernet/ibm/emac/core.c: .get_sset_count = emac_ethtool_get_sset_count,
drivers/net/ethernet/ibm/ibmveth.c:   .get_sset_count                  = ibmveth_get_sset_count,
drivers/net/ethernet/intel/e1000/e1000_ethtool.c:     .get_sset_count         = e1000_get_sset_count,
drivers/net/ethernet/intel/e1000e/ethtool.c:  .get_sset_count         = e1000e_get_sset_count,
drivers/net/ethernet/intel/e100.c:    .get_sset_count         = e100_get_sset_count,
drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c:     .get_sset_count         = fm10k_get_sset_count,
drivers/net/ethernet/intel/ice/ice_ethtool.c: .get_sset_count         = ice_repr_get_sset_count,
drivers/net/ethernet/intel/igbvf/ethtool.c:   .get_sset_count         = igbvf_get_sset_count,
drivers/net/ethernet/marvell/mv643xx_eth.c:   .get_sset_count         = mv643xx_eth_get_sset_count,
drivers/net/ethernet/marvell/mvneta.c:        .get_sset_count = mvneta_ethtool_get_sset_count,
drivers/net/ethernet/marvell/prestera/prestera_ethtool.c:     .get_sset_count = prestera_ethtool_get_sset_count,
drivers/net/ethernet/marvell/skge.c:  .get_sset_count = skge_get_sset_count,
drivers/net/ethernet/marvell/sky2.c:  .get_sset_count = sky2_get_sset_count,
drivers/net/ethernet/mediatek/mtk_eth_soc.c:  .get_sset_count         = mtk_get_sset_count,
drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_ethtool.c:        .get_sset_count         = mlxbf_gige_get_sset_count,
drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c:       .get_sset_count                 = mlxsw_sp_port_get_sset_count,
drivers/net/ethernet/micrel/ksz884x.c:        .get_sset_count         = netdev_get_sset_count,
drivers/net/ethernet/microchip/lan743x_ethtool.c:     .get_sset_count = lan743x_ethtool_get_sset_count,
drivers/net/ethernet/microchip/lan966x/lan966x_ethtool.c:     .get_sset_count         = lan966x_get_sset_count,
drivers/net/ethernet/microchip/sparx5/sparx5_ethtool.c:       .get_sset_count         = sparx5_get_sset_count,
drivers/net/ethernet/mscc/ocelot_net.c:       .get_sset_count         = ocelot_port_get_sset_count,
drivers/net/ethernet/myricom/myri10ge/myri10ge.c:     .get_sset_count = myri10ge_get_sset_count,
drivers/net/ethernet/neterion/s2io.c: .get_sset_count = s2io_get_sset_count,
drivers/net/ethernet/nvidia/forcedeth.c:      .get_sset_count = nv_get_sset_count,
drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c:      .get_sset_count = pch_gbe_get_sset_count,
drivers/net/ethernet/pasemi/pasemi_mac_ethtool.c:     .get_sset_count         = pasemi_mac_get_sset_count,
drivers/net/ethernet/qlogic/netxen/netxen_nic_ethtool.c:      .get_sset_count = netxen_get_sset_count,
drivers/net/ethernet/qualcomm/emac/emac-ethtool.c:    .get_sset_count  = emac_get_sset_count,
drivers/net/ethernet/qualcomm/qca_debug.c:    .get_sset_count = qcaspi_get_sset_count,
drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c:      .get_sset_count = rmnet_get_sset_count,
drivers/net/ethernet/realtek/8139cp.c:        .get_sset_count         = cp_get_sset_count,
drivers/net/ethernet/realtek/8139too.c:       .get_sset_count         = rtl8139_get_sset_count,
drivers/net/ethernet/realtek/r8169_main.c:    .get_sset_count         = rtl8169_get_sset_count,
drivers/net/ethernet/renesas/ravb_main.c:     .get_sset_count         = ravb_get_sset_count,
drivers/net/ethernet/renesas/sh_eth.c:        .get_sset_count     = sh_eth_get_sset_count,
drivers/net/ethernet/rocker/rocker_main.c:    .get_sset_count         = rocker_port_get_sset_count,
drivers/net/ethernet/samsung/sxgbe/sxgbe_ethtool.c:   .get_sset_count = sxgbe_get_sset_count,
drivers/net/ethernet/silan/sc92031.c: .get_sset_count         = sc92031_ethtool_get_sset_count,
drivers/net/ethernet/sun/cassini.c:   .get_sset_count         = cas_get_sset_count,
drivers/net/ethernet/synopsys/dwc-xlgmac-ethtool.c:   .get_sset_count = xlgmac_ethtool_get_sset_count,
drivers/net/ethernet/tehuti/tehuti.c:         .get_sset_count = bdx_get_sset_count,
drivers/net/ethernet/ti/am65-cpsw-ethtool.c:  .get_sset_count         = am65_cpsw_get_sset_count,
drivers/net/ethernet/ti/netcp_ethss.c:        .get_sset_count         = keystone_get_sset_count,
drivers/net/ethernet/toshiba/spider_net_ethtool.c:    .get_sset_count         = spider_net_get_sset_count,
drivers/net/ethernet/toshiba/tc35815.c:       .get_sset_count         = tc35815_get_sset_count,
drivers/net/ethernet/vertexcom/mse102x.c:     .get_sset_count         = mse102x_get_sset_count,
drivers/net/ethernet/via/via-velocity.c:      .get_sset_count         = velocity_get_sset_count,
drivers/net/fjes/fjes_ethtool.c:              .get_sset_count   = fjes_get_sset_count,
drivers/net/phy/adin.c:               .get_sset_count = adin_get_sset_count,
drivers/net/phy/aquantia_main.c:      .get_sset_count = aqr107_get_sset_count,
drivers/net/phy/at803x.c:     .get_sset_count         = at803x_get_sset_count,
drivers/net/phy/bcm7xxx.c:    .get_sset_count = bcm_phy_get_sset_count,                       \
drivers/net/phy/bcm-cygnus.c: .get_sset_count = bcm_phy_get_sset_count,
drivers/net/phy/broadcom.c:   .get_sset_count = bcm_phy_get_sset_count,
drivers/net/phy/icplus.c:     .get_sset_count = ip101g_get_sset_count,
drivers/net/phy/marvell.c:            .get_sset_count = marvell_get_sset_count,
drivers/net/phy/micrel.c:     .get_sset_count = kszphy_get_sset_count,
drivers/net/phy/mscc/mscc_main.c:     .get_sset_count = &vsc85xx_get_sset_count,
drivers/net/phy/nxp-c45-tja11xx.c:            .get_sset_count         = nxp_c45_get_sset_count,
drivers/net/phy/nxp-cbtx.c:           .get_sset_count         = cbtx_get_sset_count,
drivers/net/phy/nxp-tja11xx.c:                .get_sset_count = tja11xx_get_sset_count,
drivers/net/phy/smsc.c:       .get_sset_count = smsc_get_sset_count,
drivers/net/usb/asix_devices.c:       .get_sset_count         = ax88772_ethtool_get_sset_count,
drivers/net/usb/cdc_ncm.c:    .get_sset_count         = cdc_ncm_get_sset_count,
drivers/net/usb/lan78xx.c:    .get_sset_count = lan78xx_get_sset_count,
drivers/net/usb/r8152.c:      .get_sset_count = rtl8152_get_sset_count,
drivers/net/usb/smsc95xx.c:   .get_sset_count = smsc95xx_ethtool_get_sset_count,
drivers/net/wireless/ath/ath6kl/cfg80211.c:   .get_sset_count = ath6kl_get_sset_count,
drivers/net/wireless/marvell/libertas/ethtool.c:      .get_sset_count = lbs_mesh_ethtool_get_sset_count,
drivers/net/xen-netback/interface.c:  .get_sset_count = xenvif_get_sset_count,
drivers/net/xen-netfront.c:   .get_sset_count = xennet_get_sset_count,
drivers/staging/qlge/qlge_ethtool.c:  .get_sset_count = qlge_get_sset_count,
net/batman-adv/soft-interface.c:      .get_sset_count = batadv_get_sset_count,
net/mac80211/ethtool.c:       .get_sset_count = ieee80211_get_sset_count,
  
Jakub Kicinski June 9, 2023, 5:13 p.m. UTC | #19
On Fri, 9 Jun 2023 23:25:34 +0800 Ding Hui wrote:
> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_net_get_sset_count,
> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_port_get_sset_count,

Not sure if your research is accurate, NFP does not change the number
of stats. The number depends on the device and the FW loaded, but those
are constant during a lifetime of a netdevice.
  
Alexander Duyck June 9, 2023, 5:59 p.m. UTC | #20
On Fri, Jun 9, 2023 at 10:13 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Fri, 9 Jun 2023 23:25:34 +0800 Ding Hui wrote:
> > drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_net_get_sset_count,
> > drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_port_get_sset_count,
>
> Not sure if your research is accurate, NFP does not change the number
> of stats. The number depends on the device and the FW loaded, but those
> are constant during a lifetime of a netdevice.

Yeah, the value doesn't need to be a constant, it just need to be constant.

So for example in the ixgbe driver I believe we were using the upper
limits on the Tx and Rx queues which last I recall are stored in the
netdev itself.
  
Ding Hui June 10, 2023, 3:47 a.m. UTC | #21
On 2023/6/10 1:59, Alexander Duyck wrote:
> On Fri, Jun 9, 2023 at 10:13 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>
>> On Fri, 9 Jun 2023 23:25:34 +0800 Ding Hui wrote:
>>> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_net_get_sset_count,
>>> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_port_get_sset_count,
>>
>> Not sure if your research is accurate, NFP does not change the number
>> of stats. The number depends on the device and the FW loaded, but those
>> are constant during a lifetime of a netdevice.

Sorry, my research is rough indeed.

> 
> Yeah, the value doesn't need to be a constant, it just need to be constant.
> 
> So for example in the ixgbe driver I believe we were using the upper
> limits on the Tx and Rx queues which last I recall are stored in the
> netdev itself.
> 
Thanks to point that, the examples NFP and ixgbe do help me.
  
Ding Hui June 10, 2023, 4:01 a.m. UTC | #22
On 2023/6/10 11:47, Ding Hui wrote:
> On 2023/6/10 1:59, Alexander Duyck wrote:
>> On Fri, Jun 9, 2023 at 10:13 AM Jakub Kicinski <kuba@kernel.org> wrote:
>>>
>>> On Fri, 9 Jun 2023 23:25:34 +0800 Ding Hui wrote:
>>>> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_net_get_sset_count,
>>>> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count         = nfp_port_get_sset_count,
>>>
>>> Not sure if your research is accurate, NFP does not change the number
>>> of stats. The number depends on the device and the FW loaded, but those
>>> are constant during a lifetime of a netdevice.
> 
> Sorry, my research is rough indeed.
> 
>>
>> Yeah, the value doesn't need to be a constant, it just need to be constant.
>>
>> So for example in the ixgbe driver I believe we were using the upper
>> limits on the Tx and Rx queues which last I recall are stored in the
>> netdev itself.
>>
> Thanks to point that, the examples NFP and ixgbe do help me.

The commit f8ba7db85035 ("ice: Report stats for allocated queues via ethtool stats")
also helps a lot.
  

Patch

diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 6bb778e10461..82a975a9c895 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1902,7 +1902,7 @@  static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
 	if (copy_from_user(&test, useraddr, sizeof(test)))
 		return -EFAULT;
 
-	test.len = test_len;
+	test.len = min_t(u32, test.len, test_len);
 	data = kcalloc(test_len, sizeof(u64), GFP_USER);
 	if (!data)
 		return -ENOMEM;
@@ -1915,7 +1915,8 @@  static int ethtool_self_test(struct net_device *dev, char __user *useraddr)
 	if (copy_to_user(useraddr, &test, sizeof(test)))
 		goto out;
 	useraddr += sizeof(test);
-	if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
+	if (test.len &&
+	    copy_to_user(useraddr, data, array_size(test.len, sizeof(u64))))
 		goto out;
 	ret = 0;
 
@@ -1940,10 +1941,10 @@  static int ethtool_get_strings(struct net_device *dev, void __user *useraddr)
 		return -ENOMEM;
 	WARN_ON_ONCE(!ret);
 
-	gstrings.len = ret;
+	gstrings.len = min_t(u32, gstrings.len, ret);
 
 	if (gstrings.len) {
-		data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN));
+		data = vzalloc(array_size(ret, ETH_GSTRING_LEN));
 		if (!data)
 			return -ENOMEM;
 
@@ -2055,9 +2056,9 @@  static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
 	if (copy_from_user(&stats, useraddr, sizeof(stats)))
 		return -EFAULT;
 
-	stats.n_stats = n_stats;
+	stats.n_stats = min_t(u32, stats.n_stats, n_stats);
 
-	if (n_stats) {
+	if (stats.n_stats) {
 		data = vzalloc(array_size(n_stats, sizeof(u64)));
 		if (!data)
 			return -ENOMEM;
@@ -2070,7 +2071,8 @@  static int ethtool_get_stats(struct net_device *dev, void __user *useraddr)
 	if (copy_to_user(useraddr, &stats, sizeof(stats)))
 		goto out;
 	useraddr += sizeof(stats);
-	if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64))))
+	if (stats.n_stats &&
+	    copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64))))
 		goto out;
 	ret = 0;