Message ID | 20230601112839.13799-1-dinghui@sangfor.com.cn |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp255636vqr; Thu, 1 Jun 2023 04:52:36 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ52Ax/4MBl4Fj9PkgEV8gjMSJYmAyGpc8oN1vzn99aHZN7ryPwNktHCf3NGMEA3CyqkNAGg X-Received: by 2002:a17:90a:d082:b0:253:828a:28f8 with SMTP id k2-20020a17090ad08200b00253828a28f8mr6464377pju.25.1685620356570; Thu, 01 Jun 2023 04:52:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1685620356; cv=none; d=google.com; s=arc-20160816; b=Ms2rQ+42cRYBKqUn8hjhKKlsOC7zsqPpr4+mkTM26TQSbRe4Ra7fwGVlqmxxqj+Ydh fo15wG4ZVGRn73peCG5FEydrah2FzuprDhezE9akqU8XxcDHRqM1CRNqlWdObhL8/suA 8jguNHlUlqwu1UPmY0IUYNlNVSg2EzDCQkXSsmrQMK7gAXBtuVvaAuZG9GRVQzUtW1Ga rXRvEDcAD/RNPJnboh20E2bb2kkasnsmVPkywpz03TpMxJR7F3oDlxCRGzcNH6eFkS8v VIgpJVB02tna6TqcaM5GDG8WqIGe5TZ+Q9bH9iedemQaFV1y37C2b4voHmGT2PZCNYE1 wkMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:message-id:date:subject:cc:to:from; bh=vdEJ5wr0TqisBsvZJ20L31Z5ULfvSW4NtCYCRek2ru0=; b=fOMwbej2sDi75k6v2oHy8aTsNJ+JVArREb+BYkXQ/d8KH1L/3fQN2HQcng7nvCCK47 z0lCSXIAk8dOXmC99+C7k5ADvQKQMCBPMjo8DWRuBpwosXZQ4Z4+guzhmDviPYRXxKZM h3Q7h+IDkk217cpznOhiHpQofW0Qv8rzVc0+U/zjaWzuJzsgS5dyyKVvqem76sgpD4xs IuVzm1v0goO03jIvvrTFfjl0vz3vW1WYHG8mBrsA6ubhgymJrd4Mq/5f2/2QH/Z7/DOO fW9DlrERfZNw1ryGdRxH6IuSNpBnlZ8fa/OXiAjwmBA7bVjJBHvDVQ0kFQK8fIPd1spR 8H8w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id k65-20020a17090a3ec700b0025350783742si1084110pjc.5.2023.06.01.04.52.21; Thu, 01 Jun 2023 04:52:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=sangfor.com.cn Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232884AbjFAL3B (ORCPT <rfc822;limurcpp@gmail.com> + 99 others); Thu, 1 Jun 2023 07:29:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40012 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232670AbjFAL27 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 1 Jun 2023 07:28:59 -0400 Received: from mail-m12745.qiye.163.com (mail-m12745.qiye.163.com [115.236.127.45]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 05AC818F; Thu, 1 Jun 2023 04:28:47 -0700 (PDT) Received: from localhost.localdomain (unknown [IPV6:240e:3b7:327f:140:e827:966c:49bc:3060]) by mail-m12745.qiye.163.com (Hmail) with ESMTPA id 94D9E9A0A0F; Thu, 1 Jun 2023 19:28:42 +0800 (CST) From: Ding Hui <dinghui@sangfor.com.cn> To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, pengdonglin@sangfor.com.cn, huangcun@sangfor.com.cn, Ding Hui <dinghui@sangfor.com.cn> Subject: [PATCH net-next] net: ethtool: Fix out-of-bounds copy to user Date: Thu, 1 Jun 2023 19:28:39 +0800 Message-Id: <20230601112839.13799-1-dinghui@sangfor.com.cn> X-Mailer: git-send-email 2.17.1 X-HM-Spam-Status: e1kfGhgUHx5ZQUpXWQgPGg8OCBgUHx5ZQUlOS1dZFg8aDwILHllBWSg2Ly tZV1koWUFITzdXWS1ZQUlXWQ8JGhUIEh9ZQVlDQkxOVkxJGkpOHUxKHUlKT1UTARMWGhIXJBQOD1 lXWRgSC1lBWUlPSx5BSBlMQUhJTB1BSk9LQR5DSUxBQk1NGEFPQhkYQUhLTUtZV1kWGg8SFR0UWU FZT0tIVUpKS0hKTFVKS0tVS1kG X-HM-Tid: 0a8876b79570b218kuuu94d9e9a0a0f X-HM-MType: 1 X-HM-Sender-Digest: e1kMHhlZQR0aFwgeV1kSHx4VD1lBWUc6OFE6Tgw5FD1CDD4qKhNCKx8Z Iw5PCjpVSlVKTUNOTUpDQklISEtNVTMWGhIXVR8SFRwTDhI7CBoVHB0UCVUYFBZVGBVFWVdZEgtZ QVlJT0seQUgZTEFISUwdQUpPS0EeQ0lMQUJNTRhBT0IZGEFIS01LWVdZCAFZQUhDTks3Bg++ X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1767501051018923045?= X-GMAIL-MSGID: =?utf-8?q?1767501051018923045?= |
Series |
[net-next] net: ethtool: Fix out-of-bounds copy to user
|
|
Commit Message
Ding Hui
June 1, 2023, 11:28 a.m. UTC
When we get statistics by ethtool during changing the number of NIC
channels greater, the utility may crash due to memory corruption.
The NIC drivers callback get_sset_count() could return a calculated
length depends on current number of channels (e.g. i40e, igb).
The ethtool allocates a user buffer with the first ioctl returned
length and invokes the second ioctl to get data. The kernel copies
data to the user buffer but without checking its length. If the length
returned by the second get_sset_count() is greater than the length
allocated by the user, it will lead to an out-of-bounds copy.
Fix it by restricting the copy length not exceed the buffer length
specified by userspace.
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
---
net/ethtool/ioctl.c | 16 +++++++++-------
1 file changed, 9 insertions(+), 7 deletions(-)
Comments
On Thu, 2023-06-01 at 19:28 +0800, Ding Hui wrote: > When we get statistics by ethtool during changing the number of NIC > channels greater, the utility may crash due to memory corruption. > > The NIC drivers callback get_sset_count() could return a calculated > length depends on current number of channels (e.g. i40e, igb). > The drivers shouldn't be changing that value. If the drivers are doing this they should be fixed to provide a fixed length in terms of their strings. > The ethtool allocates a user buffer with the first ioctl returned > length and invokes the second ioctl to get data. The kernel copies > data to the user buffer but without checking its length. If the length > returned by the second get_sset_count() is greater than the length > allocated by the user, it will lead to an out-of-bounds copy. > > Fix it by restricting the copy length not exceed the buffer length > specified by userspace. > > Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> Changing the copy size would not fix this. The problem is the driver will be overwriting with the size that it thinks it should be using. Reducing the value that is provided for the memory allocations will cause the driver to corrupt memory. > --- > net/ethtool/ioctl.c | 16 +++++++++------- > 1 file changed, 9 insertions(+), 7 deletions(-) > > diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c > index 6bb778e10461..82a975a9c895 100644 > --- a/net/ethtool/ioctl.c > +++ b/net/ethtool/ioctl.c > @@ -1902,7 +1902,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr) > if (copy_from_user(&test, useraddr, sizeof(test))) > return -EFAULT; > > - test.len = test_len; > + test.len = min_t(u32, test.len, test_len); > data = kcalloc(test_len, sizeof(u64), GFP_USER); > if (!data) > return -ENOMEM; This is the wrong spot to be doing this. You need to use test_len for your allocation as that is what the driver will be writing to. You should look at adjusting after the allocation call and before you do the copy > @@ -1915,7 +1915,8 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr) > if (copy_to_user(useraddr, &test, sizeof(test))) > goto out; > useraddr += sizeof(test); > - if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)))) > + if (test.len && > + copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)))) > goto out; > ret = 0; > I don't believe this is adding any value. I wouldn't bother with checking for lengths of 0. > @@ -1940,10 +1941,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr) > return -ENOMEM; > WARN_ON_ONCE(!ret); > > - gstrings.len = ret; > + gstrings.len = min_t(u32, gstrings.len, ret); > > if (gstrings.len) { > - data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN)); > + data = vzalloc(array_size(ret, ETH_GSTRING_LEN)); > if (!data) > return -ENOMEM; > Same here. We should be using the returned value for the allocations and tests, and then doing the min adjustment after the allocationis completed. > @@ -2055,9 +2056,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) > if (copy_from_user(&stats, useraddr, sizeof(stats))) > return -EFAULT; > > - stats.n_stats = n_stats; > + stats.n_stats = min_t(u32, stats.n_stats, n_stats); > > - if (n_stats) { > + if (stats.n_stats) { > data = vzalloc(array_size(n_stats, sizeof(u64))); > if (!data) > return -ENOMEM; Same here. We should be using n_stats, not stats.n_stats and adjust before you do the final copy. > @@ -2070,7 +2071,8 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) > if (copy_to_user(useraddr, &stats, sizeof(stats))) > goto out; > useraddr += sizeof(stats); > - if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64)))) > + if (stats.n_stats && > + copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64)))) > goto out; > ret = 0; > Again. I am not sure what value is being added. If n_stats is 0 then I am pretty sure this will do nothing anyway.
On 2023/6/1 23:04, Alexander H Duyck wrote: > On Thu, 2023-06-01 at 19:28 +0800, Ding Hui wrote: >> When we get statistics by ethtool during changing the number of NIC >> channels greater, the utility may crash due to memory corruption. >> >> The NIC drivers callback get_sset_count() could return a calculated >> length depends on current number of channels (e.g. i40e, igb). >> > > The drivers shouldn't be changing that value. If the drivers are doing > this they should be fixed to provide a fixed length in terms of their > strings. > Is there an explicit declaration for the rule? I searched the ethernet drivers, found that many drivers that support multiple queues return calculated length by number of queues rather than fixed value. So pushing all these drivers to follow the rule is hard to me. >> The ethtool allocates a user buffer with the first ioctl returned >> length and invokes the second ioctl to get data. The kernel copies >> data to the user buffer but without checking its length. If the length >> returned by the second get_sset_count() is greater than the length >> allocated by the user, it will lead to an out-of-bounds copy. >> >> Fix it by restricting the copy length not exceed the buffer length >> specified by userspace. >> >> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> > > Changing the copy size would not fix this. The problem is the driver > will be overwriting with the size that it thinks it should be using. > Reducing the value that is provided for the memory allocations will > cause the driver to corrupt memory. > I noticed that, in fact I did use the returned length to allocate kernel memory, and only use adjusted length to copy to user. >> --- >> net/ethtool/ioctl.c | 16 +++++++++------- >> 1 file changed, 9 insertions(+), 7 deletions(-) >> >> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c >> index 6bb778e10461..82a975a9c895 100644 >> --- a/net/ethtool/ioctl.c >> +++ b/net/ethtool/ioctl.c >> @@ -1902,7 +1902,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr) >> if (copy_from_user(&test, useraddr, sizeof(test))) >> return -EFAULT; >> >> - test.len = test_len; >> + test.len = min_t(u32, test.len, test_len); >> data = kcalloc(test_len, sizeof(u64), GFP_USER); >> if (!data) >> return -ENOMEM; > > This is the wrong spot to be doing this. You need to use test_len for > your allocation as that is what the driver will be writing to. You > should look at adjusting after the allocation call and before you do > the copy data = kcalloc(test_len, sizeof(u64), GFP_USER); // yes, **test_len** for kernel memory ... copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)) // **test.len** only for copy to user > >> @@ -1915,7 +1915,8 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr) >> if (copy_to_user(useraddr, &test, sizeof(test))) >> goto out; >> useraddr += sizeof(test); >> - if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)))) >> + if (test.len && >> + copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)))) >> goto out; >> ret = 0; >> > > I don't believe this is adding any value. I wouldn't bother with > checking for lengths of 0. > Yes, we already checked the data ptr is not NULL, so we don't need checking test.len. I'll remove it in v2. >> @@ -1940,10 +1941,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr) >> return -ENOMEM; >> WARN_ON_ONCE(!ret); >> >> - gstrings.len = ret; >> + gstrings.len = min_t(u32, gstrings.len, ret); >> >> if (gstrings.len) { >> - data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN)); >> + data = vzalloc(array_size(ret, ETH_GSTRING_LEN)); >> if (!data) >> return -ENOMEM; >> > > Same here. We should be using the returned value for the allocations > and tests, and then doing the min adjustment after the allocationis > completed. > gstrings.len = min_t(u32, gstrings.len, ret); // adjusting if (gstrings.len) { // checking the adjusted gstrings.len can avoid unnecessary vzalloc and __ethtool_get_strings() vzalloc(array_size(ret, ETH_GSTRING_LEN)); // **ret** for kernel memory, rather than adjusted lenght At last, adjusted gstrings.len for copy to user >> @@ -2055,9 +2056,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) >> if (copy_from_user(&stats, useraddr, sizeof(stats))) >> return -EFAULT; >> >> - stats.n_stats = n_stats; >> + stats.n_stats = min_t(u32, stats.n_stats, n_stats); >> >> - if (n_stats) { >> + if (stats.n_stats) { >> data = vzalloc(array_size(n_stats, sizeof(u64))); >> if (!data) >> return -ENOMEM; > > Same here. We should be using n_stats, not stats.n_stats and adjust > before you do the final copy. > >> @@ -2070,7 +2071,8 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) >> if (copy_to_user(useraddr, &stats, sizeof(stats))) >> goto out; >> useraddr += sizeof(stats); >> - if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64)))) >> + if (stats.n_stats && >> + copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64)))) >> goto out; >> ret = 0; >> > > Again. I am not sure what value is being added. If n_stats is 0 then I > am pretty sure this will do nothing anyway. > Not really no, n_stats is returned value, and stats.n_stats is adjusted value, if the adjusted stats.n_stats is 0, we avoid memory allocation and setting data ptr to NULL, copy_to_user() with NULL ptr maybe cause warn log.
> > Changing the copy size would not fix this. The problem is the driver > > will be overwriting with the size that it thinks it should be using. > > Reducing the value that is provided for the memory allocations will > > cause the driver to corrupt memory. > > > > I noticed that, in fact I did use the returned length to allocate > kernel memory, and only use adjusted length to copy to user. This is also something i checked when quickly looking at the patch. It does look correct. Also, RTNL should be held during the time both calls are made into the driver. So nothing from userspace should be able to get in the middle of these calls to change the number of queues. Andrew
On 2023/6/2 8:26 下午, Andrew Lunn wrote: >>> Changing the copy size would not fix this. The problem is the driver >>> will be overwriting with the size that it thinks it should be using. >>> Reducing the value that is provided for the memory allocations will >>> cause the driver to corrupt memory. >>> >> >> I noticed that, in fact I did use the returned length to allocate >> kernel memory, and only use adjusted length to copy to user. > > This is also something i checked when quickly looking at the patch. It > does look correct. > Thanks. > Also, RTNL should be held during the time both calls are made into the > driver. So nothing from userspace should be able to get in the middle > of these calls to change the number of queues. > The RTNL lock is already be held during every each ioctl in dev_ethtool(). rtnl_lock(); rc = __dev_ethtool(net, ifr, useraddr, ethcmd, state); rtnl_unlock();
On Thu, Jun 1, 2023 at 6:46 PM Ding Hui <dinghui@sangfor.com.cn> wrote: > > On 2023/6/1 23:04, Alexander H Duyck wrote: > > On Thu, 2023-06-01 at 19:28 +0800, Ding Hui wrote: > >> When we get statistics by ethtool during changing the number of NIC > >> channels greater, the utility may crash due to memory corruption. > >> > >> The NIC drivers callback get_sset_count() could return a calculated > >> length depends on current number of channels (e.g. i40e, igb). > >> > > > > The drivers shouldn't be changing that value. If the drivers are doing > > this they should be fixed to provide a fixed length in terms of their > > strings. > > > > Is there an explicit declaration for the rule? > I searched the ethernet drivers, found that many drivers that support > multiple queues return calculated length by number of queues rather than > fixed value. So pushing all these drivers to follow the rule is hard > to me. > > >> The ethtool allocates a user buffer with the first ioctl returned > >> length and invokes the second ioctl to get data. The kernel copies > >> data to the user buffer but without checking its length. If the length > >> returned by the second get_sset_count() is greater than the length > >> allocated by the user, it will lead to an out-of-bounds copy. > >> > >> Fix it by restricting the copy length not exceed the buffer length > >> specified by userspace. > >> > >> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn> > > > > Changing the copy size would not fix this. The problem is the driver > > will be overwriting with the size that it thinks it should be using. > > Reducing the value that is provided for the memory allocations will > > cause the driver to corrupt memory. > > > > I noticed that, in fact I did use the returned length to allocate > kernel memory, and only use adjusted length to copy to user. Ah, okay I hadn't noticed that part. Although that leads me to the question of if any of the drivers might be modifying the length values stored in the structures. We may want to add a new stack variable to track what the clamped value is for these rather than just leaving the value stored in the structure. > >> --- > >> net/ethtool/ioctl.c | 16 +++++++++------- > >> 1 file changed, 9 insertions(+), 7 deletions(-) > >> > >> diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c > >> index 6bb778e10461..82a975a9c895 100644 > >> --- a/net/ethtool/ioctl.c > >> +++ b/net/ethtool/ioctl.c > >> @@ -1902,7 +1902,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr) > >> if (copy_from_user(&test, useraddr, sizeof(test))) > >> return -EFAULT; > >> > >> - test.len = test_len; > >> + test.len = min_t(u32, test.len, test_len); > >> data = kcalloc(test_len, sizeof(u64), GFP_USER); > >> if (!data) > >> return -ENOMEM; > > > > This is the wrong spot to be doing this. You need to use test_len for > > your allocation as that is what the driver will be writing to. You > > should look at adjusting after the allocation call and before you do > > the copy > > data = kcalloc(test_len, sizeof(u64), GFP_USER); // yes, **test_len** for kernel memory > ... > copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)) // **test.len** only for copy to user One other thought on this. Would we ever expect the length value to change? For many of these I wonder if we shouldn't just return an error in the case that there isn't enough space to store the test results. It might make sense to look at adding a return of ENOSPC or EFBIG when we encounter a size difference where our output is too big for the provided userspace buffer. At least with that we are not losing data. > > > >> @@ -1915,7 +1915,8 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr) > >> if (copy_to_user(useraddr, &test, sizeof(test))) > >> goto out; > >> useraddr += sizeof(test); > >> - if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)))) > >> + if (test.len && > >> + copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)))) > >> goto out; > >> ret = 0; > >> > > > > I don't believe this is adding any value. I wouldn't bother with > > checking for lengths of 0. > > > > Yes, we already checked the data ptr is not NULL, so we don't need checking test.len. > I'll remove it in v2. > > >> @@ -1940,10 +1941,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr) > >> return -ENOMEM; > >> WARN_ON_ONCE(!ret); > >> > >> - gstrings.len = ret; > >> + gstrings.len = min_t(u32, gstrings.len, ret); > >> > >> if (gstrings.len) { > >> - data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN)); > >> + data = vzalloc(array_size(ret, ETH_GSTRING_LEN)); > >> if (!data) > >> return -ENOMEM; > >> > > > > Same here. We should be using the returned value for the allocations > > and tests, and then doing the min adjustment after the allocationis > > completed. > > > > gstrings.len = min_t(u32, gstrings.len, ret); // adjusting > > if (gstrings.len) { // checking the adjusted gstrings.len can avoid unnecessary vzalloc and __ethtool_get_strings() > vzalloc(array_size(ret, ETH_GSTRING_LEN)); // **ret** for kernel memory, rather than adjusted lenght > > At last, adjusted gstrings.len for copy to user I see what you are talking about now. > >> @@ -2055,9 +2056,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) > >> if (copy_from_user(&stats, useraddr, sizeof(stats))) > >> return -EFAULT; > >> > >> - stats.n_stats = n_stats; > >> + stats.n_stats = min_t(u32, stats.n_stats, n_stats); > >> > >> - if (n_stats) { > >> + if (stats.n_stats) { > >> data = vzalloc(array_size(n_stats, sizeof(u64))); > >> if (!data) > >> return -ENOMEM; > > > > Same here. We should be using n_stats, not stats.n_stats and adjust > > before you do the final copy. > > > >> @@ -2070,7 +2071,8 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) > >> if (copy_to_user(useraddr, &stats, sizeof(stats))) > >> goto out; > >> useraddr += sizeof(stats); > >> - if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64)))) > >> + if (stats.n_stats && > >> + copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64)))) > >> goto out; > >> ret = 0; > >> > > > > Again. I am not sure what value is being added. If n_stats is 0 then I > > am pretty sure this will do nothing anyway. > > > > Not really no, n_stats is returned value, and stats.n_stats is adjusted value, > if the adjusted stats.n_stats is 0, we avoid memory allocation and setting data ptr > to NULL, copy_to_user() with NULL ptr maybe cause warn log. I see now. So data is NULL if stats.n_stats is 0.
> > Also, RTNL should be held during the time both calls are made into the > > driver. So nothing from userspace should be able to get in the middle > > of these calls to change the number of queues. > > > > The RTNL lock is already be held during every each ioctl in dev_ethtool(). > > rtnl_lock(); > rc = __dev_ethtool(net, ifr, useraddr, ethcmd, state); > rtnl_unlock(); Yes, exactly. So the kernel should be safe from buffer overruns. Userspace will not get more than it asked for. It might get less, and it could be different to the previous calls. But i'm not aware of anything which says anything about the consistency between different invocations of ethtool -S. Andrew
On Fri, Jun 2, 2023 at 8:42 AM Andrew Lunn <andrew@lunn.ch> wrote: > > > > Also, RTNL should be held during the time both calls are made into the > > > driver. So nothing from userspace should be able to get in the middle > > > of these calls to change the number of queues. > > > > > > > The RTNL lock is already be held during every each ioctl in dev_ethtool(). > > > > rtnl_lock(); > > rc = __dev_ethtool(net, ifr, useraddr, ethcmd, state); > > rtnl_unlock(); > > Yes, exactly. So the kernel should be safe from buffer overruns. > > Userspace will not get more than it asked for. It might get less, and > it could be different to the previous calls. But i'm not aware of > anything which says anything about the consistency between different > invocations of ethtool -S. The problem is the userspace allocation ends up requiring we make two calls into the stack. So it takes the lock once to get the count, performs the allocation, and then calls into ethtool again taking the lock and by then the value may have changed. Within each call it is held the entire time, but the userspace has to make two calls. So in between the two the number of rings could potentially change. What this change is essentially doing is clamping the copied data to the lesser of the current value versus the value when the userspace was allocated. However I am wondering now if we shouldn't just update the size value and return that as some sort of error for the userspace to potentially reallocate and repeat until it has the right size.
> What this change is essentially doing is clamping the copied data to > the lesser of the current value versus the value when the userspace > was allocated. However I am wondering now if we shouldn't just update > the size value and return that as some sort of error for the userspace > to potentially reallocate and repeat until it has the right size. I'm not sure we should be putting any effort into the IOCTL interface. It is deprecated. We should fix overrun problems, but i would not change the API. Netlink handles this atomically, and that is the interface tools should be using, not this IOCTL. Andrew
On Fri, Jun 2, 2023 at 9:37 AM Andrew Lunn <andrew@lunn.ch> wrote: > > > What this change is essentially doing is clamping the copied data to > > the lesser of the current value versus the value when the userspace > > was allocated. However I am wondering now if we shouldn't just update > > the size value and return that as some sort of error for the userspace > > to potentially reallocate and repeat until it has the right size. > > I'm not sure we should be putting any effort into the IOCTL > interface. It is deprecated. We should fix overrun problems, but i > would not change the API. Netlink handles this atomically, and that is > the interface tools should be using, not this IOCTL. If that is the case maybe it would just make more sense to just return an error if we are at risk of overrunning the userspace allocated buffer.
On 2023/6/3 2:02, Alexander Duyck wrote: > On Fri, Jun 2, 2023 at 9:37 AM Andrew Lunn <andrew@lunn.ch> wrote: >> >>> What this change is essentially doing is clamping the copied data to >>> the lesser of the current value versus the value when the userspace >>> was allocated. However I am wondering now if we shouldn't just update >>> the size value and return that as some sort of error for the userspace >>> to potentially reallocate and repeat until it has the right size. >> >> I'm not sure we should be putting any effort into the IOCTL >> interface. It is deprecated. We should fix overrun problems, but i >> would not change the API. Netlink handles this atomically, and that is >> the interface tools should be using, not this IOCTL. > > If that is the case maybe it would just make more sense to just return > an error if we are at risk of overrunning the userspace allocated > buffer. > In that case, I can modify to return an error, however, I think the ENOSPC or EFBIG mentioned in a previous email may not be suitable, maybe like others length/size checking return EINVAL. Another thing I wondered is that should I update the current length back to user if user buffer is not enough, assuming we update the new length with error returned, the userspace can use it to reallocate buffer if he wants to, which can avoid re-call previous ioctl to get the new length.
On Sat, 3 Jun 2023 09:51:34 +0800 Ding Hui wrote: > > If that is the case maybe it would just make more sense to just return > > an error if we are at risk of overrunning the userspace allocated > > buffer. > > In that case, I can modify to return an error, however, I think the > ENOSPC or EFBIG mentioned in a previous email may not be suitable, > maybe like others length/size checking return EINVAL. > > Another thing I wondered is that should I update the current length > back to user if user buffer is not enough, assuming we update the new > length with error returned, the userspace can use it to reallocate > buffer if he wants to, which can avoid re-call previous ioctl to get > the new length. This entire thread presupposes that user provides the length of the buffer. I don't see that in the code. Take ethtool_get_stats() as an example, you assume that stats.n_stats is set correctly, but it's not enforced today. Some app somewhere may pass in zeroed out stats and work just fine.
On 2023/6/3 13:55, Jakub Kicinski wrote: > On Sat, 3 Jun 2023 09:51:34 +0800 Ding Hui wrote: >>> If that is the case maybe it would just make more sense to just return >>> an error if we are at risk of overrunning the userspace allocated >>> buffer. >> >> In that case, I can modify to return an error, however, I think the >> ENOSPC or EFBIG mentioned in a previous email may not be suitable, >> maybe like others length/size checking return EINVAL. >> >> Another thing I wondered is that should I update the current length >> back to user if user buffer is not enough, assuming we update the new >> length with error returned, the userspace can use it to reallocate >> buffer if he wants to, which can avoid re-call previous ioctl to get >> the new length. > > This entire thread presupposes that user provides the length of > the buffer. I don't see that in the code. Take ethtool_get_stats() > as an example, you assume that stats.n_stats is set correctly, > but it's not enforced today. Some app somewhere may pass in zeroed > out stats and work just fine. > Yes. I checked the others ioctl (e.g. ethtool_get_eeprom(), ethtool_get_features()), and searched the git log of ethtool utility, so I think that is an implicit rule and the check is missed in kernel where the patch involves. Without this rule, we cannot guarantee the safety of copy to user. Should we keep to be compatible with that incorrect userspace usage?
On Sat, 3 Jun 2023 15:11:29 +0800 Ding Hui wrote: > Yes. > > I checked the others ioctl (e.g. ethtool_get_eeprom(), ethtool_get_features()), > and searched the git log of ethtool utility, so I think that is an implicit > rule and the check is missed in kernel where the patch involves. > > Without this rule, we cannot guarantee the safety of copy to user. > > Should we keep to be compatible with that incorrect userspace usage? If such incorrect user space exists we do, if it doesn't we don't. Problem is that we don't know what exists out there. Maybe we can add a pr_err_once() complaining about bad usage for now and see if anyone reports back that they are hitting it?
On 2023/6/5 1:47, Jakub Kicinski wrote: > On Sat, 3 Jun 2023 15:11:29 +0800 Ding Hui wrote: >> Yes. >> >> I checked the others ioctl (e.g. ethtool_get_eeprom(), ethtool_get_features()), >> and searched the git log of ethtool utility, so I think that is an implicit >> rule and the check is missed in kernel where the patch involves. >> >> Without this rule, we cannot guarantee the safety of copy to user. >> >> Should we keep to be compatible with that incorrect userspace usage? > > If such incorrect user space exists we do, if it doesn't we don't. > Problem is that we don't know what exists out there. > > Maybe we can add a pr_err_once() complaining about bad usage for now > and see if anyone reports back that they are hitting it? > How about this: Case 1: If the user len/n_stats is not zero, we will treat it as correct usage (although we cannot distinguish between the real correct usage and uninitialized usage). Return -EINVAL if current length exceed the one user specified. Case 2: If it is zero, we will treat it as incorrect usage, we can add a pr_err_once() for it and keep to be compatible with it for a period of time. At a suitable time in the future, this part can be removed by maintainers.
On Mon, 5 Jun 2023 11:39:59 +0800 Ding Hui wrote: > Case 1: > If the user len/n_stats is not zero, we will treat it as correct usage > (although we cannot distinguish between the real correct usage and > uninitialized usage). Return -EINVAL if current length exceed the one > user specified. This assumes user will zero-initialize the value rather than do something like: buf = malloc(1 << 16); // 64k should always be enough ioctl(s, ETHTOOL_GSTATS, buf) for (i = 0; i < buf.n_stats; i++) /* use stats */ :( > Case 2: > If it is zero, we will treat it as incorrect usage, we can add a > pr_err_once() for it and keep to be compatible with it for a period of time. > At a suitable time in the future, this part can be removed by maintainers.
On 2023/6/6 2:39, Jakub Kicinski wrote: > On Mon, 5 Jun 2023 11:39:59 +0800 Ding Hui wrote: >> Case 1: >> If the user len/n_stats is not zero, we will treat it as correct usage >> (although we cannot distinguish between the real correct usage and >> uninitialized usage). Return -EINVAL if current length exceed the one >> user specified. > > This assumes user will zero-initialize the value rather than do > something like: > > buf = malloc(1 << 16); // 64k should always be enough > ioctl(s, ETHTOOL_GSTATS, buf) > > for (i = 0; i < buf.n_stats; i++) > /* use stats */ > > :( > Sorry for late. Now I'm not sure what can I do next besides reporting the issue. >> Case 2: >> If it is zero, we will treat it as incorrect usage, we can add a >> pr_err_once() for it and keep to be compatible with it for a period of time. >> At a suitable time in the future, this part can be removed by maintainers. >
On Thu, Jun 8, 2023 at 2:06 AM Ding Hui <dinghui@sangfor.com.cn> wrote: > > On 2023/6/6 2:39, Jakub Kicinski wrote: > > On Mon, 5 Jun 2023 11:39:59 +0800 Ding Hui wrote: > >> Case 1: > >> If the user len/n_stats is not zero, we will treat it as correct usage > >> (although we cannot distinguish between the real correct usage and > >> uninitialized usage). Return -EINVAL if current length exceed the one > >> user specified. > > > > This assumes user will zero-initialize the value rather than do > > something like: > > > > buf = malloc(1 << 16); // 64k should always be enough > > ioctl(s, ETHTOOL_GSTATS, buf) > > > > for (i = 0; i < buf.n_stats; i++) > > /* use stats */ > > > > :( > > > > Sorry for late. > > Now I'm not sure what can I do next besides reporting the issue. Well as I said before. The issue points to a driver problem more than anything else. Normally the solution is to make it so that the counts don't fluctuate. That mostly consists of providing strings equal to the maximum count, and then 0 populating the data for any fields that don't exist in the case of ethtool -S. So for example in the case of igb which you had pointed out you could take a look at ixgbe for inspiration on how to fix it since the two drivers should be similar and one has the issue and one does not.
On 2023/6/8 22:17, Alexander Duyck wrote: > > Well as I said before. The issue points to a driver problem more than > anything else. > > Normally the solution is to make it so that the counts don't > fluctuate. That mostly consists of providing strings equal to the > maximum count, and then 0 populating the data for any fields that > don't exist in the case of ethtool -S. > > So for example in the case of igb which you had pointed out you could > take a look at ixgbe for inspiration on how to fix it since the two > drivers should be similar and one has the issue and one does not. > Thanks. FYI, I just did some rough research about which drivers return constant count and which not. Variable (61/188): drivers/net/dsa/bcm_sf2.c: .get_sset_count = bcm_sf2_sw_get_sset_count, drivers/net/ethernet/amazon/ena/ena_ethtool.c: .get_sset_count = ena_get_sset_count, drivers/net/ethernet/amd/xgbe/xgbe-ethtool.c: .get_sset_count = xgbe_get_sset_count, drivers/net/ethernet/aquantia/atlantic/aq_ethtool.c: .get_sset_count = aq_ethtool_get_sset_count, drivers/net/ethernet/broadcom/bcmsysport.c: .get_sset_count = bcm_sysport_get_sset_count, drivers/net/ethernet/broadcom/bnx2x/bnx2x_ethtool.c: .get_sset_count = bnx2x_get_sset_count, drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c: .get_sset_count = bnxt_get_sset_count, drivers/net/ethernet/brocade/bna/bnad_ethtool.c: .get_sset_count = bnad_get_sset_count, drivers/net/ethernet/cadence/macb_main.c: .get_sset_count = gem_get_sset_count, drivers/net/ethernet/cavium/liquidio/lio_ethtool.c: .get_sset_count = lio_get_sset_count, drivers/net/ethernet/cavium/liquidio/lio_ethtool.c: .get_sset_count = lio_vf_get_sset_count, drivers/net/ethernet/cavium/thunder/nicvf_ethtool.c: .get_sset_count = nicvf_get_sset_count, drivers/net/ethernet/emulex/benet/be_ethtool.c: .get_sset_count = be_get_sset_count, drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c: .get_sset_count = dpaa_get_sset_count, drivers/net/ethernet/freescale/enetc/enetc_ethtool.c: .get_sset_count = enetc_get_sset_count, drivers/net/ethernet/fungible/funeth/funeth_ethtool.c: .get_sset_count = fun_get_sset_count, drivers/net/ethernet/google/gve/gve_ethtool.c: .get_sset_count = gve_get_sset_count, drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c: .get_sset_count = hns3_get_sset_count, drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c: .get_sset_count = hclge_get_sset_count, drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c: .get_sset_count = hclgevf_get_sset_count, drivers/net/ethernet/hisilicon/hns/hns_ae_adapt.c: .get_sset_count = hns_ae_get_sset_count, drivers/net/ethernet/hisilicon/hns/hns_ethtool.c: .get_sset_count = hns_get_sset_count, drivers/net/ethernet/huawei/hinic/hinic_ethtool.c: .get_sset_count = hinic_get_sset_count, drivers/net/ethernet/ibm/ibmvnic.c: .get_sset_count = ibmvnic_get_sset_count, drivers/net/ethernet/intel/i40e/i40e_ethtool.c: .get_sset_count = i40e_get_sset_count, drivers/net/ethernet/intel/iavf/iavf_ethtool.c: .get_sset_count = iavf_get_sset_count, drivers/net/ethernet/intel/ice/ice_ethtool.c: .get_sset_count = ice_get_sset_count, drivers/net/ethernet/intel/igb/igb_ethtool.c: .get_sset_count = igb_get_sset_count, drivers/net/ethernet/intel/igc/igc_ethtool.c: .get_sset_count = igc_ethtool_get_sset_count, drivers/net/ethernet/intel/ixgbe/ixgbe_ethtool.c: .get_sset_count = ixgbe_get_sset_count, drivers/net/ethernet/intel/ixgbevf/ethtool.c: .get_sset_count = ixgbevf_get_sset_count, drivers/net/ethernet/marvell/mvpp2/mvpp2_main.c: .get_sset_count = mvpp2_ethtool_get_sset_count, drivers/net/ethernet/marvell/octeon_ep/octep_ethtool.c: .get_sset_count = octep_get_sset_count, drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c: .get_sset_count = otx2_get_sset_count, drivers/net/ethernet/marvell/octeontx2/nic/otx2_ethtool.c: .get_sset_count = otx2vf_get_sset_count, drivers/net/ethernet/mellanox/mlx4/en_ethtool.c: .get_sset_count = mlx4_en_get_sset_count, drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c: .get_sset_count = mlx5e_get_sset_count, drivers/net/ethernet/mellanox/mlx5/core/en_rep.c: .get_sset_count = mlx5e_rep_get_sset_count, drivers/net/ethernet/mellanox/mlx5/core/ipoib/ethtool.c: .get_sset_count = mlx5i_get_sset_count, drivers/net/ethernet/microsoft/mana/mana_ethtool.c: .get_sset_count = mana_get_sset_count, drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_net_get_sset_count, drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_port_get_sset_count, drivers/net/ethernet/pensando/ionic/ionic_ethtool.c: .get_sset_count = ionic_get_sset_count, drivers/net/ethernet/qlogic/qede/qede_ethtool.c: .get_sset_count = qede_get_sset_count, drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c: .get_sset_count = qlcnic_get_sset_count, drivers/net/ethernet/qlogic/qlcnic/qlcnic_ethtool.c: .get_sset_count = qlcnic_get_sset_count, drivers/net/ethernet/sfc/ef100_ethtool.c: .get_sset_count = efx_ethtool_get_sset_count, drivers/net/ethernet/sfc/ethtool.c: .get_sset_count = efx_ethtool_get_sset_count, drivers/net/ethernet/sfc/falcon/ethtool.c: .get_sset_count = ef4_ethtool_get_sset_count, drivers/net/ethernet/sfc/siena/ethtool.c: .get_sset_count = efx_siena_ethtool_get_sset_count, drivers/net/ethernet/stmicro/stmmac/stmmac_ethtool.c: .get_sset_count = stmmac_get_sset_count, drivers/net/ethernet/sun/niu.c: .get_sset_count = niu_get_sset_count, drivers/net/ethernet/sun/sunvnet.c: .get_sset_count = vnet_get_sset_count, drivers/net/ethernet/ti/cpsw.c: .get_sset_count = cpsw_get_sset_count, drivers/net/ethernet/ti/cpsw_new.c: .get_sset_count = cpsw_get_sset_count, drivers/net/hyperv/netvsc_drv.c: .get_sset_count = netvsc_get_sset_count, drivers/net/ifb.c: .get_sset_count = ifb_get_sset_count, drivers/net/veth.c: .get_sset_count = veth_get_sset_count, drivers/net/virtio_net.c: .get_sset_count = virtnet_get_sset_count, drivers/net/vmxnet3/vmxnet3_ethtool.c: .get_sset_count = vmxnet3_get_sset_count, drivers/s390/net/qeth_ethtool.c: .get_sset_count = qeth_get_sset_count, Constant (127/188): arch/um/drivers/vector_kern.c: .get_sset_count = vector_get_sset_count, drivers/infiniband/ulp/ipoib/ipoib_ethtool.c: .get_sset_count = ipoib_get_sset_count, drivers/infiniband/ulp/opa_vnic/opa_vnic_ethtool.c: .get_sset_count = vnic_get_sset_count, drivers/net/can/flexcan/flexcan-ethtool.c: .get_sset_count = flexcan_get_sset_count, drivers/net/can/slcan/slcan-ethtool.c: .get_sset_count = slcan_get_sset_count, drivers/net/dsa/b53/b53_common.c: .get_sset_count = b53_get_sset_count, drivers/net/dsa/dsa_loop.c: .get_sset_count = dsa_loop_get_sset_count, drivers/net/dsa/hirschmann/hellcreek.c: .get_sset_count = hellcreek_get_sset_count, drivers/net/dsa/lan9303-core.c: .get_sset_count = lan9303_get_sset_count, drivers/net/dsa/lantiq_gswip.c: .get_sset_count = gswip_get_sset_count, drivers/net/dsa/microchip/ksz_common.c: .get_sset_count = ksz_sset_count, drivers/net/dsa/mt7530.c: .get_sset_count = mt7530_get_sset_count, drivers/net/dsa/mv88e6xxx/chip.c: .get_sset_count = mv88e6xxx_get_sset_count, drivers/net/dsa/ocelot/felix.c: .get_sset_count = felix_get_sset_count, drivers/net/dsa/qca/qca8k-8xxx.c: .get_sset_count = qca8k_get_sset_count, drivers/net/dsa/realtek/rtl8365mb.c: .get_sset_count = rtl8365mb_get_sset_count, drivers/net/dsa/realtek/rtl8366rb.c: .get_sset_count = rtl8366_get_sset_count, drivers/net/dsa/rzn1_a5psw.c: .get_sset_count = a5psw_get_sset_count, drivers/net/dsa/sja1105/sja1105_main.c: .get_sset_count = sja1105_get_sset_count, drivers/net/dsa/vitesse-vsc73xx-core.c: .get_sset_count = vsc73xx_get_sset_count, drivers/net/dsa/xrs700x/xrs700x.c: .get_sset_count = xrs700x_get_sset_count, drivers/net/ethernet/3com/3c59x.c: .get_sset_count = vortex_get_sset_count, drivers/net/ethernet/alacritech/slicoss.c: .get_sset_count = slic_get_sset_count, drivers/net/ethernet/altera/altera_tse_ethtool.c: .get_sset_count = tse_sset_count, drivers/net/ethernet/amd/pcnet32.c: .get_sset_count = pcnet32_get_sset_count, drivers/net/ethernet/apm/xgene-v2/ethtool.c: .get_sset_count = xge_get_sset_count, drivers/net/ethernet/apm/xgene/xgene_enet_ethtool.c: .get_sset_count = xgene_get_sset_count, drivers/net/ethernet/asix/ax88796c_ioctl.c: .get_sset_count = ax88796c_get_sset_count, drivers/net/ethernet/atheros/ag71xx.c: .get_sset_count = ag71xx_ethtool_get_sset_count, drivers/net/ethernet/atheros/alx/ethtool.c: .get_sset_count = alx_get_sset_count, drivers/net/ethernet/atheros/atlx/atl1.c: .get_sset_count = atl1_get_sset_count, drivers/net/ethernet/broadcom/b44.c: .get_sset_count = b44_get_sset_count, drivers/net/ethernet/broadcom/bcm63xx_enet.c: .get_sset_count = bcm_enet_get_sset_count, drivers/net/ethernet/broadcom/bcm63xx_enet.c: .get_sset_count = bcm_enetsw_get_sset_count, drivers/net/ethernet/broadcom/bgmac.c: .get_sset_count = bgmac_get_sset_count, drivers/net/ethernet/broadcom/bnx2.c: .get_sset_count = bnx2_get_sset_count, drivers/net/ethernet/broadcom/genet/bcmgenet.c: .get_sset_count = bcmgenet_get_sset_count, drivers/net/ethernet/broadcom/tg3.c: .get_sset_count = tg3_get_sset_count, drivers/net/ethernet/calxeda/xgmac.c: .get_sset_count = xgmac_get_sset_count, drivers/net/ethernet/chelsio/cxgb3/cxgb3_main.c: .get_sset_count = get_sset_count, drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c: .get_sset_count = get_sset_count, drivers/net/ethernet/chelsio/cxgb4vf/cxgb4vf_main.c: .get_sset_count = cxgb4vf_get_sset_count, drivers/net/ethernet/chelsio/cxgb/cxgb2.c: .get_sset_count = get_sset_count, drivers/net/ethernet/cisco/enic/enic_ethtool.c: .get_sset_count = enic_get_sset_count, drivers/net/ethernet/cortina/gemini.c: .get_sset_count = gmac_get_sset_count, drivers/net/ethernet/dlink/sundance.c: .get_sset_count = get_sset_count, drivers/net/ethernet/engleder/tsnep_ethtool.c: .get_sset_count = tsnep_ethtool_get_sset_count, drivers/net/ethernet/freescale/dpaa2/dpaa2-ethtool.c: .get_sset_count = dpaa2_eth_get_sset_count, drivers/net/ethernet/freescale/dpaa2/dpaa2-switch-ethtool.c: .get_sset_count = dpaa2_switch_ethtool_get_sset_count, drivers/net/ethernet/freescale/fec_main.c: .get_sset_count = fec_enet_get_sset_count, drivers/net/ethernet/freescale/gianfar_ethtool.c: .get_sset_count = gfar_sset_count, drivers/net/ethernet/freescale/ucc_geth_ethtool.c: .get_sset_count = uec_get_sset_count, drivers/net/ethernet/ibm/ehea/ehea_ethtool.c: .get_sset_count = ehea_get_sset_count, drivers/net/ethernet/ibm/emac/core.c: .get_sset_count = emac_ethtool_get_sset_count, drivers/net/ethernet/ibm/ibmveth.c: .get_sset_count = ibmveth_get_sset_count, drivers/net/ethernet/intel/e1000/e1000_ethtool.c: .get_sset_count = e1000_get_sset_count, drivers/net/ethernet/intel/e1000e/ethtool.c: .get_sset_count = e1000e_get_sset_count, drivers/net/ethernet/intel/e100.c: .get_sset_count = e100_get_sset_count, drivers/net/ethernet/intel/fm10k/fm10k_ethtool.c: .get_sset_count = fm10k_get_sset_count, drivers/net/ethernet/intel/ice/ice_ethtool.c: .get_sset_count = ice_repr_get_sset_count, drivers/net/ethernet/intel/igbvf/ethtool.c: .get_sset_count = igbvf_get_sset_count, drivers/net/ethernet/marvell/mv643xx_eth.c: .get_sset_count = mv643xx_eth_get_sset_count, drivers/net/ethernet/marvell/mvneta.c: .get_sset_count = mvneta_ethtool_get_sset_count, drivers/net/ethernet/marvell/prestera/prestera_ethtool.c: .get_sset_count = prestera_ethtool_get_sset_count, drivers/net/ethernet/marvell/skge.c: .get_sset_count = skge_get_sset_count, drivers/net/ethernet/marvell/sky2.c: .get_sset_count = sky2_get_sset_count, drivers/net/ethernet/mediatek/mtk_eth_soc.c: .get_sset_count = mtk_get_sset_count, drivers/net/ethernet/mellanox/mlxbf_gige/mlxbf_gige_ethtool.c: .get_sset_count = mlxbf_gige_get_sset_count, drivers/net/ethernet/mellanox/mlxsw/spectrum_ethtool.c: .get_sset_count = mlxsw_sp_port_get_sset_count, drivers/net/ethernet/micrel/ksz884x.c: .get_sset_count = netdev_get_sset_count, drivers/net/ethernet/microchip/lan743x_ethtool.c: .get_sset_count = lan743x_ethtool_get_sset_count, drivers/net/ethernet/microchip/lan966x/lan966x_ethtool.c: .get_sset_count = lan966x_get_sset_count, drivers/net/ethernet/microchip/sparx5/sparx5_ethtool.c: .get_sset_count = sparx5_get_sset_count, drivers/net/ethernet/mscc/ocelot_net.c: .get_sset_count = ocelot_port_get_sset_count, drivers/net/ethernet/myricom/myri10ge/myri10ge.c: .get_sset_count = myri10ge_get_sset_count, drivers/net/ethernet/neterion/s2io.c: .get_sset_count = s2io_get_sset_count, drivers/net/ethernet/nvidia/forcedeth.c: .get_sset_count = nv_get_sset_count, drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_ethtool.c: .get_sset_count = pch_gbe_get_sset_count, drivers/net/ethernet/pasemi/pasemi_mac_ethtool.c: .get_sset_count = pasemi_mac_get_sset_count, drivers/net/ethernet/qlogic/netxen/netxen_nic_ethtool.c: .get_sset_count = netxen_get_sset_count, drivers/net/ethernet/qualcomm/emac/emac-ethtool.c: .get_sset_count = emac_get_sset_count, drivers/net/ethernet/qualcomm/qca_debug.c: .get_sset_count = qcaspi_get_sset_count, drivers/net/ethernet/qualcomm/rmnet/rmnet_vnd.c: .get_sset_count = rmnet_get_sset_count, drivers/net/ethernet/realtek/8139cp.c: .get_sset_count = cp_get_sset_count, drivers/net/ethernet/realtek/8139too.c: .get_sset_count = rtl8139_get_sset_count, drivers/net/ethernet/realtek/r8169_main.c: .get_sset_count = rtl8169_get_sset_count, drivers/net/ethernet/renesas/ravb_main.c: .get_sset_count = ravb_get_sset_count, drivers/net/ethernet/renesas/sh_eth.c: .get_sset_count = sh_eth_get_sset_count, drivers/net/ethernet/rocker/rocker_main.c: .get_sset_count = rocker_port_get_sset_count, drivers/net/ethernet/samsung/sxgbe/sxgbe_ethtool.c: .get_sset_count = sxgbe_get_sset_count, drivers/net/ethernet/silan/sc92031.c: .get_sset_count = sc92031_ethtool_get_sset_count, drivers/net/ethernet/sun/cassini.c: .get_sset_count = cas_get_sset_count, drivers/net/ethernet/synopsys/dwc-xlgmac-ethtool.c: .get_sset_count = xlgmac_ethtool_get_sset_count, drivers/net/ethernet/tehuti/tehuti.c: .get_sset_count = bdx_get_sset_count, drivers/net/ethernet/ti/am65-cpsw-ethtool.c: .get_sset_count = am65_cpsw_get_sset_count, drivers/net/ethernet/ti/netcp_ethss.c: .get_sset_count = keystone_get_sset_count, drivers/net/ethernet/toshiba/spider_net_ethtool.c: .get_sset_count = spider_net_get_sset_count, drivers/net/ethernet/toshiba/tc35815.c: .get_sset_count = tc35815_get_sset_count, drivers/net/ethernet/vertexcom/mse102x.c: .get_sset_count = mse102x_get_sset_count, drivers/net/ethernet/via/via-velocity.c: .get_sset_count = velocity_get_sset_count, drivers/net/fjes/fjes_ethtool.c: .get_sset_count = fjes_get_sset_count, drivers/net/phy/adin.c: .get_sset_count = adin_get_sset_count, drivers/net/phy/aquantia_main.c: .get_sset_count = aqr107_get_sset_count, drivers/net/phy/at803x.c: .get_sset_count = at803x_get_sset_count, drivers/net/phy/bcm7xxx.c: .get_sset_count = bcm_phy_get_sset_count, \ drivers/net/phy/bcm-cygnus.c: .get_sset_count = bcm_phy_get_sset_count, drivers/net/phy/broadcom.c: .get_sset_count = bcm_phy_get_sset_count, drivers/net/phy/icplus.c: .get_sset_count = ip101g_get_sset_count, drivers/net/phy/marvell.c: .get_sset_count = marvell_get_sset_count, drivers/net/phy/micrel.c: .get_sset_count = kszphy_get_sset_count, drivers/net/phy/mscc/mscc_main.c: .get_sset_count = &vsc85xx_get_sset_count, drivers/net/phy/nxp-c45-tja11xx.c: .get_sset_count = nxp_c45_get_sset_count, drivers/net/phy/nxp-cbtx.c: .get_sset_count = cbtx_get_sset_count, drivers/net/phy/nxp-tja11xx.c: .get_sset_count = tja11xx_get_sset_count, drivers/net/phy/smsc.c: .get_sset_count = smsc_get_sset_count, drivers/net/usb/asix_devices.c: .get_sset_count = ax88772_ethtool_get_sset_count, drivers/net/usb/cdc_ncm.c: .get_sset_count = cdc_ncm_get_sset_count, drivers/net/usb/lan78xx.c: .get_sset_count = lan78xx_get_sset_count, drivers/net/usb/r8152.c: .get_sset_count = rtl8152_get_sset_count, drivers/net/usb/smsc95xx.c: .get_sset_count = smsc95xx_ethtool_get_sset_count, drivers/net/wireless/ath/ath6kl/cfg80211.c: .get_sset_count = ath6kl_get_sset_count, drivers/net/wireless/marvell/libertas/ethtool.c: .get_sset_count = lbs_mesh_ethtool_get_sset_count, drivers/net/xen-netback/interface.c: .get_sset_count = xenvif_get_sset_count, drivers/net/xen-netfront.c: .get_sset_count = xennet_get_sset_count, drivers/staging/qlge/qlge_ethtool.c: .get_sset_count = qlge_get_sset_count, net/batman-adv/soft-interface.c: .get_sset_count = batadv_get_sset_count, net/mac80211/ethtool.c: .get_sset_count = ieee80211_get_sset_count,
On Fri, 9 Jun 2023 23:25:34 +0800 Ding Hui wrote: > drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_net_get_sset_count, > drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_port_get_sset_count, Not sure if your research is accurate, NFP does not change the number of stats. The number depends on the device and the FW loaded, but those are constant during a lifetime of a netdevice.
On Fri, Jun 9, 2023 at 10:13 AM Jakub Kicinski <kuba@kernel.org> wrote: > > On Fri, 9 Jun 2023 23:25:34 +0800 Ding Hui wrote: > > drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_net_get_sset_count, > > drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_port_get_sset_count, > > Not sure if your research is accurate, NFP does not change the number > of stats. The number depends on the device and the FW loaded, but those > are constant during a lifetime of a netdevice. Yeah, the value doesn't need to be a constant, it just need to be constant. So for example in the ixgbe driver I believe we were using the upper limits on the Tx and Rx queues which last I recall are stored in the netdev itself.
On 2023/6/10 1:59, Alexander Duyck wrote: > On Fri, Jun 9, 2023 at 10:13 AM Jakub Kicinski <kuba@kernel.org> wrote: >> >> On Fri, 9 Jun 2023 23:25:34 +0800 Ding Hui wrote: >>> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_net_get_sset_count, >>> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_port_get_sset_count, >> >> Not sure if your research is accurate, NFP does not change the number >> of stats. The number depends on the device and the FW loaded, but those >> are constant during a lifetime of a netdevice. Sorry, my research is rough indeed. > > Yeah, the value doesn't need to be a constant, it just need to be constant. > > So for example in the ixgbe driver I believe we were using the upper > limits on the Tx and Rx queues which last I recall are stored in the > netdev itself. > Thanks to point that, the examples NFP and ixgbe do help me.
On 2023/6/10 11:47, Ding Hui wrote: > On 2023/6/10 1:59, Alexander Duyck wrote: >> On Fri, Jun 9, 2023 at 10:13 AM Jakub Kicinski <kuba@kernel.org> wrote: >>> >>> On Fri, 9 Jun 2023 23:25:34 +0800 Ding Hui wrote: >>>> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_net_get_sset_count, >>>> drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c: .get_sset_count = nfp_port_get_sset_count, >>> >>> Not sure if your research is accurate, NFP does not change the number >>> of stats. The number depends on the device and the FW loaded, but those >>> are constant during a lifetime of a netdevice. > > Sorry, my research is rough indeed. > >> >> Yeah, the value doesn't need to be a constant, it just need to be constant. >> >> So for example in the ixgbe driver I believe we were using the upper >> limits on the Tx and Rx queues which last I recall are stored in the >> netdev itself. >> > Thanks to point that, the examples NFP and ixgbe do help me. The commit f8ba7db85035 ("ice: Report stats for allocated queues via ethtool stats") also helps a lot.
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 6bb778e10461..82a975a9c895 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -1902,7 +1902,7 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr) if (copy_from_user(&test, useraddr, sizeof(test))) return -EFAULT; - test.len = test_len; + test.len = min_t(u32, test.len, test_len); data = kcalloc(test_len, sizeof(u64), GFP_USER); if (!data) return -ENOMEM; @@ -1915,7 +1915,8 @@ static int ethtool_self_test(struct net_device *dev, char __user *useraddr) if (copy_to_user(useraddr, &test, sizeof(test))) goto out; useraddr += sizeof(test); - if (copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)))) + if (test.len && + copy_to_user(useraddr, data, array_size(test.len, sizeof(u64)))) goto out; ret = 0; @@ -1940,10 +1941,10 @@ static int ethtool_get_strings(struct net_device *dev, void __user *useraddr) return -ENOMEM; WARN_ON_ONCE(!ret); - gstrings.len = ret; + gstrings.len = min_t(u32, gstrings.len, ret); if (gstrings.len) { - data = vzalloc(array_size(gstrings.len, ETH_GSTRING_LEN)); + data = vzalloc(array_size(ret, ETH_GSTRING_LEN)); if (!data) return -ENOMEM; @@ -2055,9 +2056,9 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) if (copy_from_user(&stats, useraddr, sizeof(stats))) return -EFAULT; - stats.n_stats = n_stats; + stats.n_stats = min_t(u32, stats.n_stats, n_stats); - if (n_stats) { + if (stats.n_stats) { data = vzalloc(array_size(n_stats, sizeof(u64))); if (!data) return -ENOMEM; @@ -2070,7 +2071,8 @@ static int ethtool_get_stats(struct net_device *dev, void __user *useraddr) if (copy_to_user(useraddr, &stats, sizeof(stats))) goto out; useraddr += sizeof(stats); - if (n_stats && copy_to_user(useraddr, data, array_size(n_stats, sizeof(u64)))) + if (stats.n_stats && + copy_to_user(useraddr, data, array_size(stats.n_stats, sizeof(u64)))) goto out; ret = 0;