Message ID | 20230509111148.4608-3-dinghui@sangfor.com.cn |
---|---|
State | New |
Headers |
From: Ding Hui <dinghui@sangfor.com.cn> To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com, intel-wired-lan@lists.osuosl.org, jesse.brandeburg@intel.com, anthony.l.nguyen@intel.com Cc: keescook@chromium.org, grzegorzx.szczurek@intel.com, mateusz.palczewski@intel.com, mitch.a.williams@intel.com, gregory.v.rose@intel.com, jeffrey.t.kirsher@intel.com, michal.kubiak@intel.com, simon.horman@corigine.com, madhu.chittim@intel.com, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-hardening@vger.kernel.org, pengdonglin@sangfor.com.cn, huangcun@sangfor.com.cn, Ding Hui <dinghui@sangfor.com.cn> Subject: [PATCH net v5 2/2] iavf: Fix out-of-bounds when setting channels on remove Date: Tue, 9 May 2023 19:11:48 +0800 Message-Id: <20230509111148.4608-3-dinghui@sangfor.com.cn> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230509111148.4608-1-dinghui@sangfor.com.cn> References: <20230509111148.4608-1-dinghui@sangfor.com.cn> |
Series |
iavf: Fix issues when setting channels concurrency with removing
|
|
Commit Message
Ding Hui
May 9, 2023, 11:11 a.m. UTC
If the channel count is increased during iavf_remove() and waiting for
the reset to complete times out, the timeout path returns an error but
still sets num_active_queues to the requested value. That leads to an
out-of-bounds access like the one in the following logs, because
num_active_queues is then greater than the number of tx/rx_rings[]
entries actually allocated.
Reproducer:
[root@host ~]# cat repro.sh
#!/bin/bash

pf_dbsf="0000:41:00.0"
vf0_dbsf="0000:41:02.0"
g_pids=()

function do_set_numvf()
{
    echo 2 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs
    sleep $((RANDOM%3+1))
    echo 0 >/sys/bus/pci/devices/${pf_dbsf}/sriov_numvfs
    sleep $((RANDOM%3+1))
}

function do_set_channel()
{
    local nic=$(ls -1 --indicator-style=none /sys/bus/pci/devices/${vf0_dbsf}/net/)
    [ -z "$nic" ] && { sleep $((RANDOM%3)) ; return 1; }
    ifconfig $nic 192.168.18.5 netmask 255.255.255.0
    ifconfig $nic up
    ethtool -L $nic combined 1
    ethtool -L $nic combined 4
    sleep $((RANDOM%3))
}

function on_exit()
{
    local pid
    for pid in "${g_pids[@]}"; do
        kill -0 "$pid" &>/dev/null && kill "$pid" &>/dev/null
    done
    g_pids=()
}

trap "on_exit; exit" EXIT

while :; do do_set_numvf ; done &
g_pids+=($!)
while :; do do_set_channel ; done &
g_pids+=($!)

wait
Result:
[ 3506.152887] iavf 0000:41:02.0: Removing device
[ 3510.400799] ==================================================================
[ 3510.400820] BUG: KASAN: slab-out-of-bounds in iavf_free_all_tx_resources+0x156/0x160 [iavf]
[ 3510.400823] Read of size 8 at addr ffff88b6f9311008 by task repro.sh/55536
[ 3510.400823]
[ 3510.400830] CPU: 101 PID: 55536 Comm: repro.sh Kdump: loaded Tainted: G O --------- -t - 4.18.0 #1
[ 3510.400832] Hardware name: Powerleader PR2008AL/H12DSi-N6, BIOS 2.0 04/09/2021
[ 3510.400835] Call Trace:
[ 3510.400851] dump_stack+0x71/0xab
[ 3510.400860] print_address_description+0x6b/0x290
[ 3510.400865] ? iavf_free_all_tx_resources+0x156/0x160 [iavf]
[ 3510.400868] kasan_report+0x14a/0x2b0
[ 3510.400873] iavf_free_all_tx_resources+0x156/0x160 [iavf]
[ 3510.400880] iavf_remove+0x2b6/0xc70 [iavf]
[ 3510.400884] ? iavf_free_all_rx_resources+0x160/0x160 [iavf]
[ 3510.400891] ? wait_woken+0x1d0/0x1d0
[ 3510.400895] ? notifier_call_chain+0xc1/0x130
[ 3510.400903] pci_device_remove+0xa8/0x1f0
[ 3510.400910] device_release_driver_internal+0x1c6/0x460
[ 3510.400916] pci_stop_bus_device+0x101/0x150
[ 3510.400919] pci_stop_and_remove_bus_device+0xe/0x20
[ 3510.400924] pci_iov_remove_virtfn+0x187/0x420
[ 3510.400927] ? pci_iov_add_virtfn+0xe10/0xe10
[ 3510.400929] ? pci_get_subsys+0x90/0x90
[ 3510.400932] sriov_disable+0xed/0x3e0
[ 3510.400936] ? bus_find_device+0x12d/0x1a0
[ 3510.400953] i40e_free_vfs+0x754/0x1210 [i40e]
[ 3510.400966] ? i40e_reset_all_vfs+0x880/0x880 [i40e]
[ 3510.400968] ? pci_get_device+0x7c/0x90
[ 3510.400970] ? pci_get_subsys+0x90/0x90
[ 3510.400982] ? pci_vfs_assigned.part.7+0x144/0x210
[ 3510.400987] ? __mutex_lock_slowpath+0x10/0x10
[ 3510.400996] i40e_pci_sriov_configure+0x1fa/0x2e0 [i40e]
[ 3510.401001] sriov_numvfs_store+0x214/0x290
[ 3510.401005] ? sriov_totalvfs_show+0x30/0x30
[ 3510.401007] ? __mutex_lock_slowpath+0x10/0x10
[ 3510.401011] ? __check_object_size+0x15a/0x350
[ 3510.401018] kernfs_fop_write+0x280/0x3f0
[ 3510.401022] vfs_write+0x145/0x440
[ 3510.401025] ksys_write+0xab/0x160
[ 3510.401028] ? __ia32_sys_read+0xb0/0xb0
[ 3510.401031] ? fput_many+0x1a/0x120
[ 3510.401032] ? filp_close+0xf0/0x130
[ 3510.401038] do_syscall_64+0xa0/0x370
[ 3510.401041] ? page_fault+0x8/0x30
[ 3510.401043] entry_SYSCALL_64_after_hwframe+0x65/0xca
[ 3510.401073] RIP: 0033:0x7f3a9bb842c0
[ 3510.401079] Code: 73 01 c3 48 8b 0d d8 cb 2c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 0f 1f 44 00 00 83 3d 89 24 2d 00 00 75 10 b8 01 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 31 c3 48 83 ec 08 e8 fe dd 01 00 48 89 04 24
[ 3510.401080] RSP: 002b:00007ffc05f1fe18 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 3510.401083] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007f3a9bb842c0
[ 3510.401085] RDX: 0000000000000002 RSI: 0000000002327408 RDI: 0000000000000001
[ 3510.401086] RBP: 0000000002327408 R08: 00007f3a9be53780 R09: 00007f3a9c8a4700
[ 3510.401086] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000002
[ 3510.401087] R13: 0000000000000001 R14: 00007f3a9be52620 R15: 0000000000000001
[ 3510.401090]
[ 3510.401093] Allocated by task 76795:
[ 3510.401098] kasan_kmalloc+0xa6/0xd0
[ 3510.401099] __kmalloc+0xfb/0x200
[ 3510.401104] iavf_init_interrupt_scheme+0x26f/0x1310 [iavf]
[ 3510.401108] iavf_watchdog_task+0x1d58/0x4050 [iavf]
[ 3510.401114] process_one_work+0x56a/0x11f0
[ 3510.401115] worker_thread+0x8f/0xf40
[ 3510.401117] kthread+0x2a0/0x390
[ 3510.401119] ret_from_fork+0x1f/0x40
[ 3510.401122] 0xffffffffffffffff
[ 3510.401123]
In timeout handling, we should keep the original num_active_queues
and reset num_req_queues to 0.
Fixes: 4e5e6b5d9d13 ("iavf: Fix return of set the new channel count")
Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
Cc: Donglin Peng <pengdonglin@sangfor.com.cn>
Cc: Huang Cun <huangcun@sangfor.com.cn>
---
v4 to v5:
- remove testing __IAVF_IN_REMOVE_TASK condition
- update commit message
- remove Reviewed-by tags to review again
v3 to v4:
- nothing changed
v2 to v3:
- fix review tag
v1 to v2:
- add reproduction script
---
drivers/net/ethernet/intel/iavf/iavf_ethtool.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Tue, May 09, 2023 at 07:11:48PM +0800, Ding Hui wrote:
> [...]

Thanks,
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Leon Romanovsky
> Sent: Tuesday, May 9, 2023 15:40
> To: Ding, Hui <dinghui@sangfor.com.cn>
> Subject: Re: [Intel-wired-lan] [PATCH net v5 2/2] iavf: Fix out-of-bounds when setting channels on remove
>
> [...]

Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
diff --git a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
index 6f171d1d85b7..92443f8e9fbd 100644
--- a/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
+++ b/drivers/net/ethernet/intel/iavf/iavf_ethtool.c
@@ -1863,7 +1863,7 @@ static int iavf_set_channels(struct net_device *netdev,
 	}
 	if (i == IAVF_RESET_WAIT_COMPLETE_COUNT) {
 		adapter->flags &= ~IAVF_FLAG_REINIT_ITR_NEEDED;
-		adapter->num_active_queues = num_req;
+		adapter->num_req_queues = 0;
 		return -EOPNOTSUPP;
 	}
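
For readers who want to see why the timeout branch matters, here is a minimal userspace sketch of the accounting problem. It is not driver code: struct model_adapter and every model_* function are hypothetical stand-ins. It assumes the request is recorded in num_req_queues before the wait starts and that the rings were allocated for one queue when the 4-queue request times out, mirroring the "combined 1" then "combined 4" sequence in the reproducer.

```c
/* Userspace model only. struct model_adapter and the model_* helpers are
 * hypothetical; they are not the real iavf structures or functions. */
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct model_adapter {
	int *tx_rings;         /* stands in for tx_rings[] */
	int num_allocated;     /* ring slots that were really allocated */
	int num_active_queues; /* slots that teardown will walk */
	int num_req_queues;    /* pending request, 0 means none */
};

/* Allocate rings for num queues and mark all of them active. */
static int model_init(struct model_adapter *a, int num)
{
	assert(num > 0);
	a->tx_rings = calloc((size_t)num, sizeof(*a->tx_rings));
	if (!a->tx_rings)
		return -1;
	a->num_allocated = num;
	a->num_active_queues = num;
	a->num_req_queues = 0;
	return 0;
}

/* Timeout branch of a channel-count request. The reset never completed,
 * so the rings were NOT reallocated for num_req queues. */
static int model_set_channels_timeout(struct model_adapter *a, int num_req,
				      bool fixed)
{
	a->num_req_queues = num_req; /* assumed: recorded before waiting */
	if (fixed)
		a->num_req_queues = 0;          /* behaviour after the patch */
	else
		a->num_active_queues = num_req; /* behaviour before the patch */
	return -1; /* stands in for -EOPNOTSUPP */
}

/* Number of slots teardown would touch beyond the allocation. */
static int model_oob_slots(const struct model_adapter *a)
{
	int extra = a->num_active_queues - a->num_allocated;

	return extra > 0 ? extra : 0;
}

static void model_free(struct model_adapter *a)
{
	free(a->tx_rings);
	a->tx_rings = NULL;
	a->num_allocated = 0;
	a->num_active_queues = 0;
}

/* Out-of-bounds slot count after a timed-out 1 -> 4 request. */
static int model_scenario(bool fixed)
{
	struct model_adapter a;
	int oob;

	if (model_init(&a, 1))
		return -1;
	model_set_channels_timeout(&a, 4, fixed);
	oob = model_oob_slots(&a);
	model_free(&a);
	return oob;
}
```

In this model, model_scenario(false) reports 3 slots past the allocation, which corresponds to the kind of read KASAN flags in iavf_free_all_tx_resources(), while model_scenario(true) reports 0 because num_active_queues still matches what was allocated.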