[RFC,net] sfc: Fix use-after-free due to selftest_work

Message ID 20230412005013.30456-1-dinghui@sangfor.com.cn
State New
Headers
Series [RFC,net] sfc: Fix use-after-free due to selftest_work |

Commit Message

Ding Hui April 12, 2023, 12:50 a.m. UTC
  There is a use-after-free scenario that is:

When netif_running() is false, user set mac address or vlan tag to VF,
the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
and efx_net_open(), since netif_running() is false, the port will not
start and keep port_enabled false, but selftest_worker is scheduled
in efx_net_open().

If we remove the device before selftest_worker run, the efx is freed,
then we will get a UAF in run_timer_softirq() like this:

[ 1178.907941] ==================================================================
[ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
[ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
[ 1178.907950]
[ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
[ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
[ 1178.907955] Call Trace:
[ 1178.907956]  <IRQ>
[ 1178.907960]  dump_stack+0x71/0xab
[ 1178.907963]  print_address_description+0x6b/0x290
[ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
[ 1178.907967]  kasan_report+0x14a/0x2b0
[ 1178.907968]  run_timer_softirq+0xdea/0xe90
[ 1178.907971]  ? init_timer_key+0x170/0x170
[ 1178.907973]  ? hrtimer_cancel+0x20/0x20
[ 1178.907976]  ? sched_clock+0x5/0x10
[ 1178.907978]  ? sched_clock_cpu+0x18/0x170
[ 1178.907981]  __do_softirq+0x1c8/0x5fa
[ 1178.907985]  irq_exit+0x213/0x240
[ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
[ 1178.907989]  apic_timer_interrupt+0xf/0x20
[ 1178.907990]  </IRQ>
[ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370

I am thinking about several ways to fix the issue:

[1] In this RFC, I cancel the selftest_worker unconditionally in
efx_pci_remove().

[2] Add a test condition, only invoke efx_selftest_async_start() when
efx->port_enabled is true in efx_net_open().

[3] Move invoking efx_selftest_async_start() from efx_net_open() to
efx_start_all() or efx_start_port(), that matching cancel action in
efx_stop_port().

[4] However, I also notice that in efx_ef10_set_mac_address(), the
efx_net_open() depends on original port_enabled, but others are not,
if we change all efx_net_open() depends on old state like
efx_ef10_set_mac_address() does, the UAF can also be fixed in theory.

But I'm not sure which is better, is there any suggestions? Thanks.

Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
---
 drivers/net/ethernet/sfc/efx.c | 2 ++
 1 file changed, 2 insertions(+)
  

Comments

Jacob Keller April 12, 2023, 10:34 p.m. UTC | #1
On 4/11/2023 5:50 PM, Ding Hui wrote:
> There is a use-after-free scenario that is:
> 
> When netif_running() is false, user set mac address or vlan tag to VF,
> the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
> and efx_net_open(), since netif_running() is false, the port will not
> start and keep port_enabled false, but selftest_worker is scheduled
> in efx_net_open().
> 
> If we remove the device before selftest_worker run, the efx is freed,
> then we will get a UAF in run_timer_softirq() like this:
> 
> [ 1178.907941] ==================================================================
> [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
> [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
> [ 1178.907950]
> [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
> [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
> [ 1178.907955] Call Trace:
> [ 1178.907956]  <IRQ>
> [ 1178.907960]  dump_stack+0x71/0xab
> [ 1178.907963]  print_address_description+0x6b/0x290
> [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
> [ 1178.907967]  kasan_report+0x14a/0x2b0
> [ 1178.907968]  run_timer_softirq+0xdea/0xe90
> [ 1178.907971]  ? init_timer_key+0x170/0x170
> [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
> [ 1178.907976]  ? sched_clock+0x5/0x10
> [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
> [ 1178.907981]  __do_softirq+0x1c8/0x5fa
> [ 1178.907985]  irq_exit+0x213/0x240
> [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
> [ 1178.907989]  apic_timer_interrupt+0xf/0x20
> [ 1178.907990]  </IRQ>
> [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
> 
> I am thinking about several ways to fix the issue:
> 
> [1] In this RFC, I cancel the selftest_worker unconditionally in
> efx_pci_remove().
> 
> [2] Add a test condition, only invoke efx_selftest_async_start() when
> efx->port_enabled is true in efx_net_open().
> 
> [3] Move invoking efx_selftest_async_start() from efx_net_open() to
> efx_start_all() or efx_start_port(), that matching cancel action in
> efx_stop_port().
> 
> [4] However, I also notice that in efx_ef10_set_mac_address(), the
> efx_net_open() depends on original port_enabled, but others are not,
> if we change all efx_net_open() depends on old state like
> efx_ef10_set_mac_address() does, the UAF can also be fixed in theory.
> 
> But I'm not sure which is better, is there any suggestions? Thanks.
> 

I think this fix makes the most sense to me.

> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> ---

net patches need a Fixes tag indicating what commit this fixes. This
being RFC is likely why that was left off?

>  drivers/net/ethernet/sfc/efx.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
> index 884d8d168862..dd0b2363eed1 100644
> --- a/drivers/net/ethernet/sfc/efx.c
> +++ b/drivers/net/ethernet/sfc/efx.c
> @@ -876,6 +876,8 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>  	efx->state = STATE_UNINIT;
>  	rtnl_unlock();
>  
> +	efx_selftest_async_cancel(efx);
> +
>  	if (efx->type->sriov_fini)
>  		efx->type->sriov_fini(efx);
>
  
Ding Hui April 13, 2023, 1:12 a.m. UTC | #2
On 2023/4/13 6:34, Jacob Keller wrote:
> 
> 
> On 4/11/2023 5:50 PM, Ding Hui wrote:
>> There is a use-after-free scenario that is:
>>
>> When netif_running() is false, user set mac address or vlan tag to VF,
>> the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
>> and efx_net_open(), since netif_running() is false, the port will not
>> start and keep port_enabled false, but selftest_worker is scheduled
>> in efx_net_open().
>>
>> If we remove the device before selftest_worker run, the efx is freed,
>> then we will get a UAF in run_timer_softirq() like this:
>>
>> [ 1178.907941] ==================================================================
>> [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
>> [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
>> [ 1178.907950]
>> [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
>> [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
>> [ 1178.907955] Call Trace:
>> [ 1178.907956]  <IRQ>
>> [ 1178.907960]  dump_stack+0x71/0xab
>> [ 1178.907963]  print_address_description+0x6b/0x290
>> [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
>> [ 1178.907967]  kasan_report+0x14a/0x2b0
>> [ 1178.907968]  run_timer_softirq+0xdea/0xe90
>> [ 1178.907971]  ? init_timer_key+0x170/0x170
>> [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
>> [ 1178.907976]  ? sched_clock+0x5/0x10
>> [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
>> [ 1178.907981]  __do_softirq+0x1c8/0x5fa
>> [ 1178.907985]  irq_exit+0x213/0x240
>> [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
>> [ 1178.907989]  apic_timer_interrupt+0xf/0x20
>> [ 1178.907990]  </IRQ>
>> [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
>>
>> I am thinking about several ways to fix the issue:
>>
>> [1] In this RFC, I cancel the selftest_worker unconditionally in
>> efx_pci_remove().
>>
>> [2] Add a test condition, only invoke efx_selftest_async_start() when
>> efx->port_enabled is true in efx_net_open().
>>
>> [3] Move invoking efx_selftest_async_start() from efx_net_open() to
>> efx_start_all() or efx_start_port(), that matching cancel action in
>> efx_stop_port().
>>
>> [4] However, I also notice that in efx_ef10_set_mac_address(), the
>> efx_net_open() depends on original port_enabled, but others are not,
>> if we change all efx_net_open() depends on old state like
>> efx_ef10_set_mac_address() does, the UAF can also be fixed in theory.
>>
>> But I'm not sure which is better, is there any suggestions? Thanks.
>>
> 
> I think this fix makes the most sense to me.
> 
>> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
>> ---
> 
> net patches need a Fixes tag indicating what commit this fixes. This
> being RFC is likely why that was left off?
> 

Thanks.

The commit dd40781e3a4e ("sfc: Run event/IRQ self-test asynchronously 
when interface is brought up") add efx_selftest_async_start() into
efx_net_open(), it was okay then since efx_net_open() was only invoked
by the callback. Base on the original purpose of this commit, the
way [2][3] makes sense.

The commit e340be923012 ("sfc: add ndo_set_vf_mac() function for EF10")
first add efx_ef10_sriov_set_vf_mac(), it invoke efx_net_open(), then
this UAF scenario started.

I'll remove RFC and add Fixes in v2.

Fixes: e340be923012 ("sfc: add ndo_set_vf_mac() function for EF10")

>>   drivers/net/ethernet/sfc/efx.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
>> index 884d8d168862..dd0b2363eed1 100644
>> --- a/drivers/net/ethernet/sfc/efx.c
>> +++ b/drivers/net/ethernet/sfc/efx.c
>> @@ -876,6 +876,8 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>>   	efx->state = STATE_UNINIT;
>>   	rtnl_unlock();
>>   
>> +	efx_selftest_async_cancel(efx);
>> +
>>   	if (efx->type->sriov_fini)
>>   		efx->type->sriov_fini(efx);
>>   
>
  
Martin Habets April 13, 2023, 7:37 a.m. UTC | #3
On Wed, Apr 12, 2023 at 08:50:13AM +0800, Ding Hui wrote:
> There is a use-after-free scenario that is:
> 
> When netif_running() is false, user set mac address or vlan tag to VF,
> the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
> and efx_net_open(), since netif_running() is false, the port will not
> start and keep port_enabled false, but selftest_worker is scheduled
> in efx_net_open().
> 
> If we remove the device before selftest_worker run, the efx is freed,
> then we will get a UAF in run_timer_softirq() like this:
> 
> [ 1178.907941] ==================================================================
> [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
> [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
> [ 1178.907950]
> [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
> [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
> [ 1178.907955] Call Trace:
> [ 1178.907956]  <IRQ>
> [ 1178.907960]  dump_stack+0x71/0xab
> [ 1178.907963]  print_address_description+0x6b/0x290
> [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
> [ 1178.907967]  kasan_report+0x14a/0x2b0
> [ 1178.907968]  run_timer_softirq+0xdea/0xe90
> [ 1178.907971]  ? init_timer_key+0x170/0x170
> [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
> [ 1178.907976]  ? sched_clock+0x5/0x10
> [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
> [ 1178.907981]  __do_softirq+0x1c8/0x5fa
> [ 1178.907985]  irq_exit+0x213/0x240
> [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
> [ 1178.907989]  apic_timer_interrupt+0xf/0x20
> [ 1178.907990]  </IRQ>
> [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
> 
> I am thinking about several ways to fix the issue:
> 
> [1] In this RFC, I cancel the selftest_worker unconditionally in
> efx_pci_remove().
> 
> [2] Add a test condition, only invoke efx_selftest_async_start() when
> efx->port_enabled is true in efx_net_open().
> 
> [3] Move invoking efx_selftest_async_start() from efx_net_open() to
> efx_start_all() or efx_start_port(), that matching cancel action in
> efx_stop_port().

I think moving this to efx_start_port() is best, as you say to match
the cancel in efx_stop_port().

Thanks,
Martin

> 
> [4] However, I also notice that in efx_ef10_set_mac_address(), the
> efx_net_open() depends on original port_enabled, but others are not,
> if we change all efx_net_open() depends on old state like
> efx_ef10_set_mac_address() does, the UAF can also be fixed in theory.
> 
> But I'm not sure which is better, is there any suggestions? Thanks.
> 
> Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> ---
>  drivers/net/ethernet/sfc/efx.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
> index 884d8d168862..dd0b2363eed1 100644
> --- a/drivers/net/ethernet/sfc/efx.c
> +++ b/drivers/net/ethernet/sfc/efx.c
> @@ -876,6 +876,8 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
>  	efx->state = STATE_UNINIT;
>  	rtnl_unlock();
>  
> +	efx_selftest_async_cancel(efx);
> +
>  	if (efx->type->sriov_fini)
>  		efx->type->sriov_fini(efx);
>  
> -- 
> 2.17.1
  
Martin Habets April 13, 2023, 7:52 a.m. UTC | #4
On Wed, Apr 12, 2023 at 03:34:51PM -0700, Jacob Keller wrote:
> 
> 
> On 4/11/2023 5:50 PM, Ding Hui wrote:
> > There is a use-after-free scenario that is:
> > 
> > When netif_running() is false, user set mac address or vlan tag to VF,
> > the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
> > and efx_net_open(), since netif_running() is false, the port will not
> > start and keep port_enabled false, but selftest_worker is scheduled
> > in efx_net_open().
> > 
> > If we remove the device before selftest_worker run, the efx is freed,
> > then we will get a UAF in run_timer_softirq() like this:
> > 
> > [ 1178.907941] ==================================================================
> > [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
> > [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
> > [ 1178.907950]
> > [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
> > [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
> > [ 1178.907955] Call Trace:
> > [ 1178.907956]  <IRQ>
> > [ 1178.907960]  dump_stack+0x71/0xab
> > [ 1178.907963]  print_address_description+0x6b/0x290
> > [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
> > [ 1178.907967]  kasan_report+0x14a/0x2b0
> > [ 1178.907968]  run_timer_softirq+0xdea/0xe90
> > [ 1178.907971]  ? init_timer_key+0x170/0x170
> > [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
> > [ 1178.907976]  ? sched_clock+0x5/0x10
> > [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
> > [ 1178.907981]  __do_softirq+0x1c8/0x5fa
> > [ 1178.907985]  irq_exit+0x213/0x240
> > [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
> > [ 1178.907989]  apic_timer_interrupt+0xf/0x20
> > [ 1178.907990]  </IRQ>
> > [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
> > 
> > I am thinking about several ways to fix the issue:
> > 
> > [1] In this RFC, I cancel the selftest_worker unconditionally in
> > efx_pci_remove().
> > 
> > [2] Add a test condition, only invoke efx_selftest_async_start() when
> > efx->port_enabled is true in efx_net_open().
> > 
> > [3] Move invoking efx_selftest_async_start() from efx_net_open() to
> > efx_start_all() or efx_start_port(), that matching cancel action in
> > efx_stop_port().
> > 
> > [4] However, I also notice that in efx_ef10_set_mac_address(), the
> > efx_net_open() depends on original port_enabled, but others are not,
> > if we change all efx_net_open() depends on old state like
> > efx_ef10_set_mac_address() does, the UAF can also be fixed in theory.
> > 
> > But I'm not sure which is better, is there any suggestions? Thanks.
> > 
> 
> I think this fix makes the most sense to me.

For me this is too late. It leaves a gap where the selftest timer is still running
but the NIC has already stopped sending events. So we could still get a
failure "channel %d timed out waiting for event queue" from the selftest.

Martin

> 
> > Signed-off-by: Ding Hui <dinghui@sangfor.com.cn>
> > ---
> 
> net patches need a Fixes tag indicating what commit this fixes. This
> being RFC is likely why that was left off?
> 
> >  drivers/net/ethernet/sfc/efx.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
> > index 884d8d168862..dd0b2363eed1 100644
> > --- a/drivers/net/ethernet/sfc/efx.c
> > +++ b/drivers/net/ethernet/sfc/efx.c
> > @@ -876,6 +876,8 @@ static void efx_pci_remove(struct pci_dev *pci_dev)
> >  	efx->state = STATE_UNINIT;
> >  	rtnl_unlock();
> >  
> > +	efx_selftest_async_cancel(efx);
> > +
> >  	if (efx->type->sriov_fini)
> >  		efx->type->sriov_fini(efx);
> >
  
Ding Hui April 13, 2023, 8:11 a.m. UTC | #5
On 2023/4/13 15:52, Martin Habets wrote:
> On Wed, Apr 12, 2023 at 03:34:51PM -0700, Jacob Keller wrote:
>>
>>
>> On 4/11/2023 5:50 PM, Ding Hui wrote:
>>> There is a use-after-free scenario that is:
>>>
>>> When netif_running() is false, user set mac address or vlan tag to VF,
>>> the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
>>> and efx_net_open(), since netif_running() is false, the port will not
>>> start and keep port_enabled false, but selftest_worker is scheduled
>>> in efx_net_open().
>>>
>>> If we remove the device before selftest_worker run, the efx is freed,
>>> then we will get a UAF in run_timer_softirq() like this:
>>>
>>> [ 1178.907941] ==================================================================
>>> [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
>>> [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
>>> [ 1178.907950]
>>> [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
>>> [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
>>> [ 1178.907955] Call Trace:
>>> [ 1178.907956]  <IRQ>
>>> [ 1178.907960]  dump_stack+0x71/0xab
>>> [ 1178.907963]  print_address_description+0x6b/0x290
>>> [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
>>> [ 1178.907967]  kasan_report+0x14a/0x2b0
>>> [ 1178.907968]  run_timer_softirq+0xdea/0xe90
>>> [ 1178.907971]  ? init_timer_key+0x170/0x170
>>> [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
>>> [ 1178.907976]  ? sched_clock+0x5/0x10
>>> [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
>>> [ 1178.907981]  __do_softirq+0x1c8/0x5fa
>>> [ 1178.907985]  irq_exit+0x213/0x240
>>> [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
>>> [ 1178.907989]  apic_timer_interrupt+0xf/0x20
>>> [ 1178.907990]  </IRQ>
>>> [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
>>>
>>> I am thinking about several ways to fix the issue:
>>>
>>> [1] In this RFC, I cancel the selftest_worker unconditionally in
>>> efx_pci_remove().
>>>
>>> [2] Add a test condition, only invoke efx_selftest_async_start() when
>>> efx->port_enabled is true in efx_net_open().
>>>
>>> [3] Move invoking efx_selftest_async_start() from efx_net_open() to
>>> efx_start_all() or efx_start_port(), that matching cancel action in
>>> efx_stop_port().
>>>
>>> [4] However, I also notice that in efx_ef10_set_mac_address(), the
>>> efx_net_open() depends on original port_enabled, but others are not,
>>> if we change all efx_net_open() depends on old state like
>>> efx_ef10_set_mac_address() does, the UAF can also be fixed in theory.
>>>
>>> But I'm not sure which is better, is there any suggestions? Thanks.
>>>
>>
>> I think this fix makes the most sense to me.
> 
> For me this is too late. It leaves a gap where the selftest timer is still running
> but the NIC has already stopped sending events. So we could still get a
> failure "channel %d timed out waiting for event queue" from the selftest.
> 

Yes, assuming not consider removing, scheduled selftest_work if NIC not
brought up actually, we will also get this failure log.
  
Ding Hui April 13, 2023, 8:35 a.m. UTC | #6
On 2023/4/13 15:37, Martin Habets wrote:
> On Wed, Apr 12, 2023 at 08:50:13AM +0800, Ding Hui wrote:
>> There is a use-after-free scenario that is:
>>
>> When netif_running() is false, user set mac address or vlan tag to VF,
>> the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
>> and efx_net_open(), since netif_running() is false, the port will not
>> start and keep port_enabled false, but selftest_worker is scheduled
>> in efx_net_open().
>>
>> If we remove the device before selftest_worker run, the efx is freed,
>> then we will get a UAF in run_timer_softirq() like this:
>>
>> [ 1178.907941] ==================================================================
>> [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
>> [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
>> [ 1178.907950]
>> [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
>> [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
>> [ 1178.907955] Call Trace:
>> [ 1178.907956]  <IRQ>
>> [ 1178.907960]  dump_stack+0x71/0xab
>> [ 1178.907963]  print_address_description+0x6b/0x290
>> [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
>> [ 1178.907967]  kasan_report+0x14a/0x2b0
>> [ 1178.907968]  run_timer_softirq+0xdea/0xe90
>> [ 1178.907971]  ? init_timer_key+0x170/0x170
>> [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
>> [ 1178.907976]  ? sched_clock+0x5/0x10
>> [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
>> [ 1178.907981]  __do_softirq+0x1c8/0x5fa
>> [ 1178.907985]  irq_exit+0x213/0x240
>> [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
>> [ 1178.907989]  apic_timer_interrupt+0xf/0x20
>> [ 1178.907990]  </IRQ>
>> [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
>>
>> I am thinking about several ways to fix the issue:
>>
>> [1] In this RFC, I cancel the selftest_worker unconditionally in
>> efx_pci_remove().
>>
>> [2] Add a test condition, only invoke efx_selftest_async_start() when
>> efx->port_enabled is true in efx_net_open().
>>
>> [3] Move invoking efx_selftest_async_start() from efx_net_open() to
>> efx_start_all() or efx_start_port(), that matching cancel action in
>> efx_stop_port().
> 
> I think moving this to efx_start_port() is best, as you say to match
> the cancel in efx_stop_port().
> 

If moving to efx_start_port(), should we worry about that IRQ_TIMEOUT
is still enough?

I'm not sure if there is a long time waiting from starting of schedule
selftest_work to the ending of efx_net_open().
  
Martin Habets April 14, 2023, 9:44 a.m. UTC | #7
On Thu, Apr 13, 2023 at 04:35:08PM +0800, Ding Hui wrote:
> On 2023/4/13 15:37, Martin Habets wrote:
> > On Wed, Apr 12, 2023 at 08:50:13AM +0800, Ding Hui wrote:
> > > There is a use-after-free scenario that is:
> > > 
> > > When netif_running() is false, user set mac address or vlan tag to VF,
> > > the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
> > > and efx_net_open(), since netif_running() is false, the port will not
> > > start and keep port_enabled false, but selftest_worker is scheduled
> > > in efx_net_open().
> > > 
> > > If we remove the device before selftest_worker run, the efx is freed,
> > > then we will get a UAF in run_timer_softirq() like this:
> > > 
> > > [ 1178.907941] ==================================================================
> > > [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
> > > [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
> > > [ 1178.907950]
> > > [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
> > > [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
> > > [ 1178.907955] Call Trace:
> > > [ 1178.907956]  <IRQ>
> > > [ 1178.907960]  dump_stack+0x71/0xab
> > > [ 1178.907963]  print_address_description+0x6b/0x290
> > > [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
> > > [ 1178.907967]  kasan_report+0x14a/0x2b0
> > > [ 1178.907968]  run_timer_softirq+0xdea/0xe90
> > > [ 1178.907971]  ? init_timer_key+0x170/0x170
> > > [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
> > > [ 1178.907976]  ? sched_clock+0x5/0x10
> > > [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
> > > [ 1178.907981]  __do_softirq+0x1c8/0x5fa
> > > [ 1178.907985]  irq_exit+0x213/0x240
> > > [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
> > > [ 1178.907989]  apic_timer_interrupt+0xf/0x20
> > > [ 1178.907990]  </IRQ>
> > > [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
> > > 
> > > I am thinking about several ways to fix the issue:
> > > 
> > > [1] In this RFC, I cancel the selftest_worker unconditionally in
> > > efx_pci_remove().
> > > 
> > > [2] Add a test condition, only invoke efx_selftest_async_start() when
> > > efx->port_enabled is true in efx_net_open().
> > > 
> > > [3] Move invoking efx_selftest_async_start() from efx_net_open() to
> > > efx_start_all() or efx_start_port(), that matching cancel action in
> > > efx_stop_port().
> > 
> > I think moving this to efx_start_port() is best, as you say to match
> > the cancel in efx_stop_port().
> > 
> 
> If moving to efx_start_port(), should we worry about that IRQ_TIMEOUT
> is still enough?

1 second is a long time for a machine running code, so it does not worry me.

> I'm not sure if there is a long time waiting from starting of schedule
> selftest_work to the ending of efx_net_open().

I see your point. Looking at efx_start_all() there is the call to
efx_start_datapath() after the call to efx_net_open(), which takes a
relatively long time (well under 200ms though).
Logically it would be better to move efx_selftest_async_start() after this
call. What do you think?

The point here is that efx_start_all() calls efx_start_port() early, and
efx_stop_all() also calls efx_stop_port() early. The calling sequence is
correct but they are not the strict inverse of each other.

Martin

> 
> -- 
> Thanks,
> - Ding Hui
  
Ding Hui April 14, 2023, 11:03 a.m. UTC | #8
On 2023/4/14 17:44, Martin Habets wrote:
> On Thu, Apr 13, 2023 at 04:35:08PM +0800, Ding Hui wrote:
>> On 2023/4/13 15:37, Martin Habets wrote:
>>> On Wed, Apr 12, 2023 at 08:50:13AM +0800, Ding Hui wrote:
>>>> There is a use-after-free scenario that is:
>>>>
>>>> When netif_running() is false, user set mac address or vlan tag to VF,
>>>> the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
>>>> and efx_net_open(), since netif_running() is false, the port will not
>>>> start and keep port_enabled false, but selftest_worker is scheduled
>>>> in efx_net_open().
>>>>
>>>> If we remove the device before selftest_worker run, the efx is freed,
>>>> then we will get a UAF in run_timer_softirq() like this:
>>>>
>>>> [ 1178.907941] ==================================================================
>>>> [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
>>>> [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
>>>> [ 1178.907950]
>>>> [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
>>>> [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
>>>> [ 1178.907955] Call Trace:
>>>> [ 1178.907956]  <IRQ>
>>>> [ 1178.907960]  dump_stack+0x71/0xab
>>>> [ 1178.907963]  print_address_description+0x6b/0x290
>>>> [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
>>>> [ 1178.907967]  kasan_report+0x14a/0x2b0
>>>> [ 1178.907968]  run_timer_softirq+0xdea/0xe90
>>>> [ 1178.907971]  ? init_timer_key+0x170/0x170
>>>> [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
>>>> [ 1178.907976]  ? sched_clock+0x5/0x10
>>>> [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
>>>> [ 1178.907981]  __do_softirq+0x1c8/0x5fa
>>>> [ 1178.907985]  irq_exit+0x213/0x240
>>>> [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
>>>> [ 1178.907989]  apic_timer_interrupt+0xf/0x20
>>>> [ 1178.907990]  </IRQ>
>>>> [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
>>>>
>>>> I am thinking about several ways to fix the issue:
>>>>
>>>> [1] In this RFC, I cancel the selftest_worker unconditionally in
>>>> efx_pci_remove().
>>>>
>>>> [2] Add a test condition, only invoke efx_selftest_async_start() when
>>>> efx->port_enabled is true in efx_net_open().
>>>>
>>>> [3] Move invoking efx_selftest_async_start() from efx_net_open() to
>>>> efx_start_all() or efx_start_port(), that matching cancel action in
>>>> efx_stop_port().
>>>
>>> I think moving this to efx_start_port() is best, as you say to match
>>> the cancel in efx_stop_port().
>>>
>>
>> If moving to efx_start_port(), should we worry about that IRQ_TIMEOUT
>> is still enough?
> 
> 1 second is a long time for a machine running code, so it does not worry me.
> 
>> I'm not sure if there is a long time waiting from starting of schedule
>> selftest_work to the ending of efx_net_open().
> 
> I see your point. Looking at efx_start_all() there is the call to
> efx_start_datapath() after the call to efx_net_open(), which takes a
                                          ^^^^^^^^^^^^
Do you mean efx_start_port()?

> relatively long time (well under 200ms though).
> Logically it would be better to move efx_selftest_async_start() after this
> call. What do you think?

Agree with you.

> The point here is that efx_start_all() calls efx_start_port() early, and
> efx_stop_all() also calls efx_stop_port() early. The calling sequence is
> correct but they are not the strict inverse of each other.
> 

Yeah, that is what I noticed monitor_work does.
Then I'll move efx_selftest_async_start() into efx_start_all(), follows
the monitor_work.
  
Martin Habets April 14, 2023, 12:33 p.m. UTC | #9
On Fri, Apr 14, 2023 at 07:03:41PM +0800, Ding Hui wrote:
> On 2023/4/14 17:44, Martin Habets wrote:
> > On Thu, Apr 13, 2023 at 04:35:08PM +0800, Ding Hui wrote:
> > > On 2023/4/13 15:37, Martin Habets wrote:
> > > > On Wed, Apr 12, 2023 at 08:50:13AM +0800, Ding Hui wrote:
> > > > > There is a use-after-free scenario that is:
> > > > > 
> > > > > When netif_running() is false, user set mac address or vlan tag to VF,
> > > > > the xxx_set_vf_mac() or xxx_set_vf_vlan() will invoke efx_net_stop()
> > > > > and efx_net_open(), since netif_running() is false, the port will not
> > > > > start and keep port_enabled false, but selftest_worker is scheduled
> > > > > in efx_net_open().
> > > > > 
> > > > > If we remove the device before selftest_worker run, the efx is freed,
> > > > > then we will get a UAF in run_timer_softirq() like this:
> > > > > 
> > > > > [ 1178.907941] ==================================================================
> > > > > [ 1178.907948] BUG: KASAN: use-after-free in run_timer_softirq+0xdea/0xe90
> > > > > [ 1178.907950] Write of size 8 at addr ff11001f449cdc80 by task swapper/47/0
> > > > > [ 1178.907950]
> > > > > [ 1178.907953] CPU: 47 PID: 0 Comm: swapper/47 Kdump: loaded Tainted: G           O     --------- -t - 4.18.0 #1
> > > > > [ 1178.907954] Hardware name: SANGFOR X620G40/WI2HG-208T1061A, BIOS SPYH051032-U01 04/01/2022
> > > > > [ 1178.907955] Call Trace:
> > > > > [ 1178.907956]  <IRQ>
> > > > > [ 1178.907960]  dump_stack+0x71/0xab
> > > > > [ 1178.907963]  print_address_description+0x6b/0x290
> > > > > [ 1178.907965]  ? run_timer_softirq+0xdea/0xe90
> > > > > [ 1178.907967]  kasan_report+0x14a/0x2b0
> > > > > [ 1178.907968]  run_timer_softirq+0xdea/0xe90
> > > > > [ 1178.907971]  ? init_timer_key+0x170/0x170
> > > > > [ 1178.907973]  ? hrtimer_cancel+0x20/0x20
> > > > > [ 1178.907976]  ? sched_clock+0x5/0x10
> > > > > [ 1178.907978]  ? sched_clock_cpu+0x18/0x170
> > > > > [ 1178.907981]  __do_softirq+0x1c8/0x5fa
> > > > > [ 1178.907985]  irq_exit+0x213/0x240
> > > > > [ 1178.907987]  smp_apic_timer_interrupt+0xd0/0x330
> > > > > [ 1178.907989]  apic_timer_interrupt+0xf/0x20
> > > > > [ 1178.907990]  </IRQ>
> > > > > [ 1178.907991] RIP: 0010:mwait_idle+0xae/0x370
> > > > > 
> > > > > I am thinking about several ways to fix the issue:
> > > > > 
> > > > > [1] In this RFC, I cancel the selftest_worker unconditionally in
> > > > > efx_pci_remove().
> > > > > 
> > > > > [2] Add a test condition, only invoke efx_selftest_async_start() when
> > > > > efx->port_enabled is true in efx_net_open().
> > > > > 
> > > > > [3] Move invoking efx_selftest_async_start() from efx_net_open() to
> > > > > efx_start_all() or efx_start_port(), that matching cancel action in
> > > > > efx_stop_port().
> > > > 
> > > > I think moving this to efx_start_port() is best, as you say to match
> > > > the cancel in efx_stop_port().
> > > > 
> > > 
> > > If moving to efx_start_port(), should we worry about that IRQ_TIMEOUT
> > > is still enough?
> > 
> > 1 second is a long time for a machine running code, so it does not worry me.
> > 
> > > I'm not sure if there is a long time waiting from starting of schedule
> > > selftest_work to the ending of efx_net_open().
> > 
> > I see your point. Looking at efx_start_all() there is the call to
> > efx_start_datapath() after the call to efx_net_open(), which takes a
>                                          ^^^^^^^^^^^^
> Do you mean efx_start_port()?

Woops, yes that what I meant.

> > relatively long time (well under 200ms though).
> > Logically it would be better to move efx_selftest_async_start() after this
> > call. What do you think?
> 
> Agree with you.
> 
> > The point here is that efx_start_all() calls efx_start_port() early, and
> > efx_stop_all() also calls efx_stop_port() early. The calling sequence is
> > correct but they are not the strict inverse of each other.
> > 
> 
> Yeah, that is what I noticed monitor_work does.
> Then I'll move efx_selftest_async_start() into efx_start_all(), follows
> the monitor_work.

Sounds good.

Thanks,
Martin

> -- 
> Thanks,
> - Ding Hui
  

Patch

diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 884d8d168862..dd0b2363eed1 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -876,6 +876,8 @@  static void efx_pci_remove(struct pci_dev *pci_dev)
 	efx->state = STATE_UNINIT;
 	rtnl_unlock();
 
+	efx_selftest_async_cancel(efx);
+
 	if (efx->type->sriov_fini)
 		efx->type->sriov_fini(efx);