[for-next,v2] RDMA/rxe: Fix mr->map double free

Message ID 1667099073-2-1-git-send-email-lizhijian@fujitsu.com
State New
Headers
Series [for-next,v2] RDMA/rxe: Fix mr->map double free |

Commit Message

Zhijian Li (Fujitsu) Oct. 30, 2022, 3:04 a.m. UTC
  rxe_mr_cleanup() which tries to free mr->map again will be called
when rxe_mr_init_user() fails.

[43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
[43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
[43895.945208] Call Trace:
[43895.946130]  <TASK>
[43895.946931]  dump_stack_lvl+0x45/0x5d
[43895.948049]  panic+0x19e/0x349
[43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
[43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
[43895.952589]  ? preempt_count_sub+0x14/0xc0
[43895.953809]  end_report.part.0+0x54/0x7c
[43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
[43895.956406]  kasan_report.cold+0xa/0xf
[43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
[43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
[43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
[43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
[43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
[43895.964921]  ? __lock_acquire+0x876/0x31e0
[43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
[43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
[43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
[43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
[43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
[43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
[43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
[43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
[43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]

This issue was fistrly exposed since
commit: b18c7da63fcb ("RDMA/rxe: Fix memory leak in error path code")
and then we fixed it in
commit: 8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()")
but this fix was reverted together at last by
commit: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")

Fixes: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
---
 drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)
  

Comments

Zhijian Li (Fujitsu) Nov. 12, 2022, 3:29 a.m. UTC | #1
ping...

Hi Yanjun


I hope you could take a look to this bug fix more earlier.


Thanks
Zhijian



On 30/10/2022 11:04, Li Zhijian wrote:
> rxe_mr_cleanup() which tries to free mr->map again will be called
> when rxe_mr_init_user() fails.
> 
> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [43895.945208] Call Trace:
> [43895.946130]  <TASK>
> [43895.946931]  dump_stack_lvl+0x45/0x5d
> [43895.948049]  panic+0x19e/0x349
> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [43895.952589]  ? preempt_count_sub+0x14/0xc0
> [43895.953809]  end_report.part.0+0x54/0x7c
> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.956406]  kasan_report.cold+0xa/0xf
> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
> [43895.964921]  ? __lock_acquire+0x876/0x31e0
> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
> 
> This issue was fistrly exposed since
> commit: b18c7da63fcb ("RDMA/rxe: Fix memory leak in error path code")
> and then we fixed it in
> commit: 8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()")
> but this fix was reverted together at last by
> commit: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
> 
> Fixes: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index d4f10c2d1aa7..7c99d1591580 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
>   		kfree(mr->map[i]);
>   
>   	kfree(mr->map);
> +	mr->map = NULL;
>   err1:
>   	return -ENOMEM;
>   }
> @@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   	int			num_buf;
>   	void			*vaddr;
>   	int err;
> -	int i;
>   
>   	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>   	if (IS_ERR(umem)) {
> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   				pr_warn("%s: Unable to get virtual address\n",
>   						__func__);
>   				err = -ENOMEM;
> -				goto err_cleanup_map;
> +				goto err_release_umem;
>   			}
> -
>   			buf->addr = (uintptr_t)vaddr;
>   			buf->size = PAGE_SIZE;
>   			num_buf++;
> @@ -182,10 +181,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   
>   	return 0;
>   
> -err_cleanup_map:
> -	for (i = 0; i < mr->num_map; i++)
> -		kfree(mr->map[i]);
> -	kfree(mr->map);
>   err_release_umem:
>   	ib_umem_release(umem);
>   err_out:
  
Zhu Yanjun Nov. 13, 2022, 12:40 p.m. UTC | #2
在 2022/11/12 11:29, lizhijian@fujitsu.com 写道:
> ping...
Hi, Zhijian

The changes are too much. I need time to delve into it.

Zhu Yanjun

> 
> Hi Yanjun
> 
> 
> I hope you could take a look to this bug fix more earlier.
> 
> 
> Thanks
> Zhijian
> 
> 
> 
> On 30/10/2022 11:04, Li Zhijian wrote:
>> rxe_mr_cleanup() which tries to free mr->map again will be called
>> when rxe_mr_init_user() fails.
>>
>> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
>> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>> [43895.945208] Call Trace:
>> [43895.946130]  <TASK>
>> [43895.946931]  dump_stack_lvl+0x45/0x5d
>> [43895.948049]  panic+0x19e/0x349
>> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
>> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>> [43895.952589]  ? preempt_count_sub+0x14/0xc0
>> [43895.953809]  end_report.part.0+0x54/0x7c
>> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.956406]  kasan_report.cold+0xa/0xf
>> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
>> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
>> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
>> [43895.964921]  ? __lock_acquire+0x876/0x31e0
>> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
>> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
>> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
>> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
>> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
>> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
>>
>> This issue was fistrly exposed since
>> commit: b18c7da63fcb ("RDMA/rxe: Fix memory leak in error path code")
>> and then we fixed it in
>> commit: 8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()")
>> but this fix was reverted together at last by
>> commit: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
>>
>> Fixes: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
>> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
>> ---
>>    drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
>>    1 file changed, 2 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
>> index d4f10c2d1aa7..7c99d1591580 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
>> @@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
>>    		kfree(mr->map[i]);
>>    
>>    	kfree(mr->map);
>> +	mr->map = NULL;
>>    err1:
>>    	return -ENOMEM;
>>    }
>> @@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    	int			num_buf;
>>    	void			*vaddr;
>>    	int err;
>> -	int i;
>>    
>>    	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>>    	if (IS_ERR(umem)) {
>> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    				pr_warn("%s: Unable to get virtual address\n",
>>    						__func__);
>>    				err = -ENOMEM;
>> -				goto err_cleanup_map;
>> +				goto err_release_umem;
>>    			}
>> -
>>    			buf->addr = (uintptr_t)vaddr;
>>    			buf->size = PAGE_SIZE;
>>    			num_buf++;
>> @@ -182,10 +181,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>    
>>    	return 0;
>>    
>> -err_cleanup_map:
>> -	for (i = 0; i < mr->num_map; i++)
>> -		kfree(mr->map[i]);
>> -	kfree(mr->map);
>>    err_release_umem:
>>    	ib_umem_release(umem);
>>    err_out:
  
Zhu Yanjun Nov. 16, 2022, 12:10 a.m. UTC | #3
在 2022/10/30 11:04, Li Zhijian 写道:
> rxe_mr_cleanup() which tries to free mr->map again will be called
> when rxe_mr_init_user() fails.
> 
> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [43895.945208] Call Trace:
> [43895.946130]  <TASK>
> [43895.946931]  dump_stack_lvl+0x45/0x5d
> [43895.948049]  panic+0x19e/0x349
> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [43895.952589]  ? preempt_count_sub+0x14/0xc0
> [43895.953809]  end_report.part.0+0x54/0x7c
> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.956406]  kasan_report.cold+0xa/0xf
> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
> [43895.964921]  ? __lock_acquire+0x876/0x31e0
> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
> 
> This issue was fistrly exposed since
> commit: b18c7da63fcb ("RDMA/rxe: Fix memory leak in error path code")
> and then we fixed it in
> commit: 8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()")
> but this fix was reverted together at last by
> commit: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
> 
> Fixes: 1e75550648da (Revert "RDMA/rxe: Create duplicate mapping tables for FMRs")
> Signed-off-by: Li Zhijian <lizhijian@fujitsu.com>
> ---
>   drivers/infiniband/sw/rxe/rxe_mr.c | 9 ++-------
>   1 file changed, 2 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
> index d4f10c2d1aa7..7c99d1591580 100644
> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
> @@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
>   		kfree(mr->map[i]);
>   
>   	kfree(mr->map);
> +	mr->map = NULL;
>   err1:
>   	return -ENOMEM;
>   }
> @@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   	int			num_buf;
>   	void			*vaddr;
>   	int err;
> -	int i;
>   
>   	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>   	if (IS_ERR(umem)) {
> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   				pr_warn("%s: Unable to get virtual address\n",
>   						__func__);
>   				err = -ENOMEM;
> -				goto err_cleanup_map;
> +				goto err_release_umem;

This call trace results from page_address's returning NULL, then goto 
err_cleanup_map where mr->map[i] and mr->map are freed.

And finally rxe_reg_user_mr gets an error from rxe_mr_init_user, the 
function rxe_mr_cleanup is called to handle mr to free mr->map[i] and 
mr->map again.

So mr->map[i] and mr->map are double freed.

As such, this commit is reasonable.

But why page_address will return NULL?

Zhu Yanjun

>   			}
> -
>   			buf->addr = (uintptr_t)vaddr;
>   			buf->size = PAGE_SIZE;
>   			num_buf++;
> @@ -182,10 +181,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>   
>   	return 0;
>   
> -err_cleanup_map:
> -	for (i = 0; i < mr->num_map; i++)
> -		kfree(mr->map[i]);
> -	kfree(mr->map);
>   err_release_umem:
>   	ib_umem_release(umem);
>   err_out:
  
Zhijian Li (Fujitsu) Nov. 16, 2022, 2:39 a.m. UTC | #4
On 16/11/2022 08:10, Yanjun Zhu wrote:
>>
>> index d4f10c2d1aa7..7c99d1591580 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c
>> @@ -99,6 +99,7 @@ static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
>>           kfree(mr->map[i]);
>>       kfree(mr->map);
>> +    mr->map = NULL;
>>   err1:
>>       return -ENOMEM;
>>   }
>> @@ -122,7 +123,6 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 
>> start, u64 length, u64 iova,
>>       int            num_buf;
>>       void            *vaddr;
>>       int err;
>> -    int i;
>>       umem = ib_umem_get(&rxe->ib_dev, start, length, access);
>>       if (IS_ERR(umem)) {
>> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 
>> start, u64 length, u64 iova,
>>                   pr_warn("%s: Unable to get virtual address\n",
>>                           __func__);
>>                   err = -ENOMEM;
>> -                goto err_cleanup_map;
>> +                goto err_release_umem;
> 
> This call trace results from page_address's returning NULL, then goto 
> err_cleanup_map where mr->map[i] and mr->map are freed.
> 
> And finally rxe_reg_user_mr gets an error from rxe_mr_init_user, the 
> function rxe_mr_cleanup is called to handle mr to free mr->map[i] and 
> mr->map again.
> 
> So mr->map[i] and mr->map are double freed.
> 
> As such, this commit is reasonable.
> 
> But why page_address will return NULL?

ENOMEM? but I don't think we need taking too much care upon the reason.

this patch is most likely porting the reverted back, commit: 
8ff5f5d9d8cf ("RDMA/rxe: Prevent double freeing rxe_map_set()")

Actually, the double free can be triggered by below error path too.

149         err = rxe_mr_alloc(mr, num_buf); 

150         if (err) { 

151                 pr_warn("%s: Unable to allocate memory for map\n", 

152                                 __func__); 

153                 goto err_release_umem; 

154         }

where rxe_mr_alloc() freed the memory but don't set 'mr->map = NULL'

Thanks
Zhijian


> 
> Zhu Yanjun
>
  
Jason Gunthorpe Nov. 19, 2022, 12:15 a.m. UTC | #5
On Sun, Oct 30, 2022 at 03:04:33AM +0000, Li Zhijian wrote:
> rxe_mr_cleanup() which tries to free mr->map again will be called
> when rxe_mr_init_user() fails.
> 
> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
> [43895.945208] Call Trace:
> [43895.946130]  <TASK>
> [43895.946931]  dump_stack_lvl+0x45/0x5d
> [43895.948049]  panic+0x19e/0x349
> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
> [43895.952589]  ? preempt_count_sub+0x14/0xc0
> [43895.953809]  end_report.part.0+0x54/0x7c
> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.956406]  kasan_report.cold+0xa/0xf
> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
> [43895.964921]  ? __lock_acquire+0x876/0x31e0
> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]

Please dont include timestamps in commit messages

> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>  				pr_warn("%s: Unable to get virtual address\n",
>  						__func__);
>  				err = -ENOMEM;
> -				goto err_cleanup_map;
> +				goto err_release_umem;
>  			}
> -

page_address() fails if this is a highmem system and the page hasn't
been kmap'd yet. So the right thing to do is to use kmap..

But this looks right, so applied to for-next

Thanks,
Jason
  
Zhu Yanjun Nov. 19, 2022, 8:25 a.m. UTC | #6
在 2022/11/19 8:15, Jason Gunthorpe 写道:
> On Sun, Oct 30, 2022 at 03:04:33AM +0000, Li Zhijian wrote:
>> rxe_mr_cleanup() which tries to free mr->map again will be called
>> when rxe_mr_init_user() fails.
>>
>> [43895.939883] CPU: 0 PID: 4917 Comm: rdma_flush_serv Kdump: loaded Not tainted 6.1.0-rc1-roce-flush+ #25
>> [43895.942341] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
>> [43895.945208] Call Trace:
>> [43895.946130]  <TASK>
>> [43895.946931]  dump_stack_lvl+0x45/0x5d
>> [43895.948049]  panic+0x19e/0x349
>> [43895.949010]  ? panic_print_sys_info.part.0+0x77/0x77
>> [43895.950356]  ? asm_sysvec_apic_timer_interrupt+0x16/0x20
>> [43895.952589]  ? preempt_count_sub+0x14/0xc0
>> [43895.953809]  end_report.part.0+0x54/0x7c
>> [43895.954993]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.956406]  kasan_report.cold+0xa/0xf
>> [43895.957668]  ? rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.959090]  rxe_mr_cleanup+0x9d/0xf0 [rdma_rxe]
>> [43895.960502]  __rxe_cleanup+0x10a/0x1e0 [rdma_rxe]
>> [43895.961983]  rxe_reg_user_mr+0xb7/0xd0 [rdma_rxe]
>> [43895.963456]  ib_uverbs_reg_mr+0x26a/0x480 [ib_uverbs]
>> [43895.964921]  ? __lock_acquire+0x876/0x31e0
>> [43895.966182]  ? ib_uverbs_ex_create_wq+0x630/0x630 [ib_uverbs]
>> [43895.967739]  ? uverbs_fill_udata+0x1c6/0x330 [ib_uverbs]
>> [43895.969204]  ib_uverbs_handler_UVERBS_METHOD_INVOKE_WRITE+0x1a2/0x250 [ib_uverbs]
>> [43895.971126]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.973094]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.975096]  ? uverbs_fill_udata+0x25f/0x330 [ib_uverbs]
>> [43895.976466]  ib_uverbs_cmd_verbs+0x1397/0x15a0 [ib_uverbs]
>> [43895.977930]  ? ib_uverbs_handler_UVERBS_METHOD_QUERY_CONTEXT+0x1a0/0x1a0 [ib_uverbs]
>> [43895.979937]  ? uverbs_fill_udata+0x330/0x330 [ib_uverbs]
> 
> Please dont include timestamps in commit messages
> 
>> @@ -163,9 +163,8 @@ int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
>>   				pr_warn("%s: Unable to get virtual address\n",
>>   						__func__);
>>   				err = -ENOMEM;
>> -				goto err_cleanup_map;
>> +				goto err_release_umem;
>>   			}
>> -
> 
> page_address() fails if this is a highmem system and the page hasn't
> been kmap'd yet. So the right thing to do is to use kmap..

Sure.

sgt_append.sgt is allocated in this function ib_umem_get. And the 
function sg_alloc_append_table_from_pages is called to allocate memory.

147 struct ib_umem *ib_umem_get(struct ib_device *device, unsigned long 
addr,
148                             size_t size, int access)
149 {
...
230                 ret = sg_alloc_append_table_from_pages(
231                         &umem->sgt_append, page_list, pinned, 0,
232                         pinned << PAGE_SHIFT, 
ib_dma_max_seg_size(device),
233                         npages, GFP_KERNEL);
...

And it seems that it is not highmem.

So page_address will not be NULL?

As such, it is not necessary to test the return vaue of page_address?

If so, can we add a new commit to avoid testing of the return value of 
page_address?

Zhu Yanjun

> 
> But this looks right, so applied to for-next
> 
> Thanks,
> Jason
  

Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index d4f10c2d1aa7..7c99d1591580 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -99,6 +99,7 @@  static int rxe_mr_alloc(struct rxe_mr *mr, int num_buf)
 		kfree(mr->map[i]);
 
 	kfree(mr->map);
+	mr->map = NULL;
 err1:
 	return -ENOMEM;
 }
@@ -122,7 +123,6 @@  int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 	int			num_buf;
 	void			*vaddr;
 	int err;
-	int i;
 
 	umem = ib_umem_get(&rxe->ib_dev, start, length, access);
 	if (IS_ERR(umem)) {
@@ -163,9 +163,8 @@  int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 				pr_warn("%s: Unable to get virtual address\n",
 						__func__);
 				err = -ENOMEM;
-				goto err_cleanup_map;
+				goto err_release_umem;
 			}
-
 			buf->addr = (uintptr_t)vaddr;
 			buf->size = PAGE_SIZE;
 			num_buf++;
@@ -182,10 +181,6 @@  int rxe_mr_init_user(struct rxe_dev *rxe, u64 start, u64 length, u64 iova,
 
 	return 0;
 
-err_cleanup_map:
-	for (i = 0; i < mr->num_map; i++)
-		kfree(mr->map[i]);
-	kfree(mr->map);
 err_release_umem:
 	ib_umem_release(umem);
 err_out: