RDMA/restrack: Fix potential invalid address access

Message ID 20240301095514.3598280-1-haowenchao2@huawei.com
State New
Headers
Series RDMA/restrack: Fix potential invalid address access |

Commit Message

Wenchao Hao March 1, 2024, 9:55 a.m. UTC
  struct rdma_restrack_entry's kern_name was set to KBUILD_MODNAME
in ib_create_cq(), while if the module exited but forgot del this
rdma_restrack_entry, it would cause a invalid address access in
rdma_restrack_clean() when print the owner of this rdma_restrack_entry.

Fix this issue by using kstrdup() to set rdma_restrack_entry's
kern_name.

Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
---
 drivers/infiniband/core/restrack.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
  

Comments

Leon Romanovsky March 3, 2024, 12:57 p.m. UTC | #1
On Fri, Mar 01, 2024 at 05:55:15PM +0800, Wenchao Hao wrote:
> struct rdma_restrack_entry's kern_name was set to KBUILD_MODNAME
> in ib_create_cq(), while if the module exited but forgot del this
> rdma_restrack_entry, it would cause a invalid address access in
> rdma_restrack_clean() when print the owner of this rdma_restrack_entry.

How is it possible to exit owner module without cleaning the resources?

Thanks

> 
> Fix this issue by using kstrdup() to set rdma_restrack_entry's
> kern_name.
> 
> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
> ---
>  drivers/infiniband/core/restrack.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
> index 01a499a8b88d..6605011c4edc 100644
> --- a/drivers/infiniband/core/restrack.c
> +++ b/drivers/infiniband/core/restrack.c
> @@ -177,7 +177,8 @@ static void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
>  void rdma_restrack_set_name(struct rdma_restrack_entry *res, const char *caller)
>  {
>  	if (caller) {
> -		res->kern_name = caller;
> +		kfree(res->kern_name);
> +		res->kern_name = kstrdup(caller, GFP_KERNEL);
>  		return;
>  	}
>  
> @@ -195,7 +196,7 @@ void rdma_restrack_parent_name(struct rdma_restrack_entry *dst,
>  			       const struct rdma_restrack_entry *parent)
>  {
>  	if (rdma_is_kernel_res(parent))
> -		dst->kern_name = parent->kern_name;
> +		dst->kern_name = kstrdup(parent->kern_name, GFP_KERNEL);
>  	else
>  		rdma_restrack_attach_task(dst, parent->task);
>  }
> @@ -306,6 +307,7 @@ static void restrack_release(struct kref *kref)
>  		put_task_struct(res->task);
>  		res->task = NULL;
>  	}
> +	kfree(res->kern_name);
>  	complete(&res->comp);
>  }
>  
> -- 
> 2.32.0
>
  
Wenchao Hao March 4, 2024, 3:21 a.m. UTC | #2
On 2024/3/3 20:57, Leon Romanovsky wrote:
> On Fri, Mar 01, 2024 at 05:55:15PM +0800, Wenchao Hao wrote:
>> struct rdma_restrack_entry's kern_name was set to KBUILD_MODNAME
>> in ib_create_cq(), while if the module exited but forgot del this
>> rdma_restrack_entry, it would cause a invalid address access in
>> rdma_restrack_clean() when print the owner of this rdma_restrack_entry.
> 
> How is it possible to exit owner module without cleaning the resources?
> 

I meet this issue with one of our product who develop their owner kernel
modules based on ib_core, and there are terrible logic with the exit
code which cause resource leak.

Of curse it's bug of module who did not clear resource when exit, but
I think ib_core should avoid accessing memory of other modules directly
to provides better stability.

What's more, from the context of rdma_restrack_clean() when print
"restack: %s %s object allocated by %s is not freed ...", it seems
designed for the above scene where client has bug to alerts there
are resource leak, so we should not panic on this log print.

> Thanks
> 
>>
>> Fix this issue by using kstrdup() to set rdma_restrack_entry's
>> kern_name.
>>
>> Signed-off-by: Wenchao Hao <haowenchao2@huawei.com>
>> ---
>>   drivers/infiniband/core/restrack.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
>> index 01a499a8b88d..6605011c4edc 100644
>> --- a/drivers/infiniband/core/restrack.c
>> +++ b/drivers/infiniband/core/restrack.c
>> @@ -177,7 +177,8 @@ static void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
>>   void rdma_restrack_set_name(struct rdma_restrack_entry *res, const char *caller)
>>   {
>>   	if (caller) {
>> -		res->kern_name = caller;
>> +		kfree(res->kern_name);
>> +		res->kern_name = kstrdup(caller, GFP_KERNEL);
>>   		return;
>>   	}
>>   
>> @@ -195,7 +196,7 @@ void rdma_restrack_parent_name(struct rdma_restrack_entry *dst,
>>   			       const struct rdma_restrack_entry *parent)
>>   {
>>   	if (rdma_is_kernel_res(parent))
>> -		dst->kern_name = parent->kern_name;
>> +		dst->kern_name = kstrdup(parent->kern_name, GFP_KERNEL);
>>   	else
>>   		rdma_restrack_attach_task(dst, parent->task);
>>   }
>> @@ -306,6 +307,7 @@ static void restrack_release(struct kref *kref)
>>   		put_task_struct(res->task);
>>   		res->task = NULL;
>>   	}
>> +	kfree(res->kern_name);
>>   	complete(&res->comp);
>>   }
>>   
>> -- 
>> 2.32.0
>>
  
Leon Romanovsky March 4, 2024, 7:35 a.m. UTC | #3
On Mon, Mar 04, 2024 at 11:21:19AM +0800, Wenchao Hao wrote:
> On 2024/3/3 20:57, Leon Romanovsky wrote:
> > On Fri, Mar 01, 2024 at 05:55:15PM +0800, Wenchao Hao wrote:
> > > struct rdma_restrack_entry's kern_name was set to KBUILD_MODNAME
> > > in ib_create_cq(), while if the module exited but forgot del this
> > > rdma_restrack_entry, it would cause a invalid address access in
> > > rdma_restrack_clean() when print the owner of this rdma_restrack_entry.
> > 
> > How is it possible to exit owner module without cleaning the resources?
> > 
> 
> I meet this issue with one of our product who develop their owner kernel
> modules based on ib_core, and there are terrible logic with the exit
> code which cause resource leak.
> 
> Of curse it's bug of module who did not clear resource when exit, but
> I think ib_core should avoid accessing memory of other modules directly
> to provides better stability.
> 
> What's more, from the context of rdma_restrack_clean() when print
> "restack: %s %s object allocated by %s is not freed ...", it seems
> designed for the above scene where client has bug to alerts there
> are resource leak, so we should not panic on this log print.

Can you please share the kernel panic?

Thanks
  
Wenchao Hao March 4, 2024, 1:33 p.m. UTC | #4
On 2024/3/4 15:35, Leon Romanovsky wrote:
> On Mon, Mar 04, 2024 at 11:21:19AM +0800, Wenchao Hao wrote:
>> On 2024/3/3 20:57, Leon Romanovsky wrote:
>>> On Fri, Mar 01, 2024 at 05:55:15PM +0800, Wenchao Hao wrote:
>>>> struct rdma_restrack_entry's kern_name was set to KBUILD_MODNAME
>>>> in ib_create_cq(), while if the module exited but forgot del this
>>>> rdma_restrack_entry, it would cause a invalid address access in
>>>> rdma_restrack_clean() when print the owner of this rdma_restrack_entry.
>>>
>>> How is it possible to exit owner module without cleaning the resources?
>>>
>>
>> I meet this issue with one of our product who develop their owner kernel
>> modules based on ib_core, and there are terrible logic with the exit
>> code which cause resource leak.
>>
>> Of curse it's bug of module who did not clear resource when exit, but
>> I think ib_core should avoid accessing memory of other modules directly
>> to provides better stability.
>>
>> What's more, from the context of rdma_restrack_clean() when print
>> "restack: %s %s object allocated by %s is not freed ...", it seems
>> designed for the above scene where client has bug to alerts there
>> are resource leak, so we should not panic on this log print.
> 
> Can you please share the kernel panic?
> 

Sorry, there is no stack or panic info recorded. This is because
another issue of "printk". It seems printk would cause a deadlock
when printk access invalid address with our kernel.

Actually, I found this issue with ftrace/kprobe but not printk, that's
why it takes me a long time to address it.

BTW, I am not developer of rdma, after found the issue, I think it's
better to enhance, so send this patch, and the patch has been tested with
the origin scene.

> Thanks
  

Patch

diff --git a/drivers/infiniband/core/restrack.c b/drivers/infiniband/core/restrack.c
index 01a499a8b88d..6605011c4edc 100644
--- a/drivers/infiniband/core/restrack.c
+++ b/drivers/infiniband/core/restrack.c
@@ -177,7 +177,8 @@  static void rdma_restrack_attach_task(struct rdma_restrack_entry *res,
 void rdma_restrack_set_name(struct rdma_restrack_entry *res, const char *caller)
 {
 	if (caller) {
-		res->kern_name = caller;
+		kfree(res->kern_name);
+		res->kern_name = kstrdup(caller, GFP_KERNEL);
 		return;
 	}
 
@@ -195,7 +196,7 @@  void rdma_restrack_parent_name(struct rdma_restrack_entry *dst,
 			       const struct rdma_restrack_entry *parent)
 {
 	if (rdma_is_kernel_res(parent))
-		dst->kern_name = parent->kern_name;
+		dst->kern_name = kstrdup(parent->kern_name, GFP_KERNEL);
 	else
 		rdma_restrack_attach_task(dst, parent->task);
 }
@@ -306,6 +307,7 @@  static void restrack_release(struct kref *kref)
 		put_task_struct(res->task);
 		res->task = NULL;
 	}
+	kfree(res->kern_name);
 	complete(&res->comp);
 }