net/mlx5e: fix a double-free in arfs_create_groups

Message ID 20231224081348.3535146-1-alexious@zju.edu.cn
State New

Commit Message

Zhipeng Lu Dec. 24, 2023, 8:13 a.m. UTC
When the allocation of `in` by kvzalloc fails, arfs_create_groups frees
ft->g and returns an error. However, arfs_create_table, the only caller of
arfs_create_groups, handles this error by calling
mlx5e_destroy_flow_table, in which ft->g is freed again.

Fixes: 1cabe6b0965e ("net/mlx5e: Create aRFS flow tables")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
---
 drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Simon Horman Jan. 3, 2024, 5:22 p.m. UTC | #1
On Sun, Dec 24, 2023 at 04:13:48PM +0800, Zhipeng Lu wrote:
> When the allocation of `in` by kvzalloc fails, arfs_create_groups frees
> ft->g and returns an error. However, arfs_create_table, the only caller of
> arfs_create_groups, handles this error by calling
> mlx5e_destroy_flow_table, in which ft->g is freed again.
> 
> Fixes: 1cabe6b0965e ("net/mlx5e: Create aRFS flow tables")
> Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>

Thanks,

I agree this addresses the issue that you describe.
And as a minimal fix it looks good.

Reviewed-by: Simon Horman <horms@kernel.org>

However, I would like to suggest that some clean-up work could
take place as a follow-up.

I think that the error handling in this area of the code
is rather fragile. This is because initialisation is not necessarily
unwound on error within the function in which the initialisation occurs.

I think it would be better if arfs_create_groups():

1. Released the resources it allocates, including ft->g and
   elements of ft->g, on error.
2. This was achieved by using a goto unwind ladder.
3. The caller treated ft->g as uninitialised if
   arfs_create_groups fails.

Likewise, I think that:

* arfs_create_groups should initialise ft->num_groups

And further, logic similar to the above should guide
how arfs_create_table() initialises ft->t and cleans it
up on error.

I did not look at the code beyond the scope described above.
But the above are general principles that may well apply in
other nearby code too.
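
For illustration, the goto unwind ladder suggested in points 1-3 above could look roughly like the following userspace sketch. It is not the driver code: struct flow_table, NUM_GROUPS, the fail_at knob and destroy_groups are hypothetical stand-ins, and malloc/calloc/free stand in for the kernel allocators. The function frees everything it allocated on error and only publishes ft->g and ft->num_groups on success, so the caller can treat both as uninitialised after a failure.

```c
#include <errno.h>
#include <stdlib.h>

#define NUM_GROUPS 3	/* hypothetical group count */

/* Hypothetical stand-in for struct mlx5e_flow_table. */
struct flow_table {
	void **g;
	int num_groups;
};

/*
 * fail_at is a hypothetical test knob: the index of the group whose
 * creation fails, or -1 for no failure.
 */
static int create_groups(struct flow_table *ft, int fail_at)
{
	void **g;
	void *in;
	int i, err;

	g = calloc(NUM_GROUPS, sizeof(*g));
	if (!g)
		return -ENOMEM;

	in = malloc(64);
	if (!in) {
		err = -ENOMEM;
		goto err_free_g;
	}

	for (i = 0; i < NUM_GROUPS; i++) {
		g[i] = (i == fail_at) ? NULL : malloc(16);
		if (!g[i]) {
			err = -ENOMEM;
			goto err_free_groups;
		}
	}

	free(in);
	/* Publish to the caller only once everything has succeeded. */
	ft->g = g;
	ft->num_groups = NUM_GROUPS;
	return 0;

err_free_groups:
	/* Unwind only the groups created before the failing one. */
	while (--i >= 0)
		free(g[i]);
	free(in);
err_free_g:
	free(g);
	return err;
}

/* Teardown for the success case only. */
static void destroy_groups(struct flow_table *ft)
{
	for (int i = 0; i < ft->num_groups; i++)
		free(ft->g[i]);
	free(ft->g);
	ft->g = NULL;
	ft->num_groups = 0;
}
```

With this shape, a caller that sees an error from create_groups has nothing of ft->g to clean up, so a double free through a shared destroy path cannot arise from this function.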

...
  
Zhipeng Lu Jan. 8, 2024, 9:12 a.m. UTC | #2
> On Sun, Dec 24, 2023 at 04:13:48PM +0800, Zhipeng Lu wrote:
> > When the allocation of `in` by kvzalloc fails, arfs_create_groups frees
> > ft->g and returns an error. However, arfs_create_table, the only caller of
> > arfs_create_groups, handles this error by calling
> > mlx5e_destroy_flow_table, in which ft->g is freed again.
> > 
> > Fixes: 1cabe6b0965e ("net/mlx5e: Create aRFS flow tables")
> > Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
> 
> Thanks,
> 
> I agree this addresses the issue that you describe.
> And as a minimal fix it looks good.
> 
> Reviewed-by: Simon Horman <horms@kernel.org>
> 
> However, I would like to suggest that some clean-up work could
> take place as a follow-up.
> 
> I think that the error handling in this area of the code
> is rather fragile. This is because initialisation is not necessarily
> unwound on error within the function in which the initialisation occurs.
> 
> I think it would be better if arfs_create_groups():
> 
> 1. Released the resources it allocates, including ft->g and
>    elements of ft->g, on error.
> 2. This was achieved by using a goto unwind ladder.
> 3. The caller treated ft->g as uninitialised if
>    arfs_create_groups fails.
>
 
Agreed, I think an unwind ladder for arfs_create_groups is much better.
I'll follow this idea and send a v2 patch later.
Another comment below.

> Likewise, I think that:
> 
> * arfs_create_groups should initialise ft->num_groups
> 
> And further, logic similar to the above should guide
> how arfs_create_table() initialises ft->t and cleans it
> up on error.
> 

I think the ft->t you mentioned comes from mlx5_create_flow_table.
I'd like to make the life cycle of ft->t similar to that of ft->g in
arfs_create_groups, but that would require adding an argument to
mlx5_create_flow_table so that ft can be passed to it. However,
mlx5_create_flow_table is called in more than 30 different places
throughout the kernel. So such a modification could be a separate
refactoring patch, but it is probably outside the scope of this fix.

> I did not look at the code beyond the scope described above.
> But the above are general principles that may well apply in
> other nearby code too.
> 
> ...
  
Simon Horman Jan. 8, 2024, 11:05 a.m. UTC | #3
On Mon, Jan 08, 2024 at 05:12:06PM +0800, alexious@zju.edu.cn wrote:
> 
> 
> > On Sun, Dec 24, 2023 at 04:13:48PM +0800, Zhipeng Lu wrote:
> > > When the allocation of `in` by kvzalloc fails, arfs_create_groups frees
> > > ft->g and returns an error. However, arfs_create_table, the only caller of
> > > arfs_create_groups, handles this error by calling
> > > mlx5e_destroy_flow_table, in which ft->g is freed again.
> > > 
> > > Fixes: 1cabe6b0965e ("net/mlx5e: Create aRFS flow tables")
> > > Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
> > 
> > Thanks,
> > 
> > I agree this addresses the issue that you describe.
> > And as a minimal fix it looks good.
> > 
> > Reviewed-by: Simon Horman <horms@kernel.org>
> > 
> > However, I would like to suggest that some clean-up work could
> > take place as a follow-up.
> > 
> > I think that the error handling in this area of the code
> > is rather fragile. This is because initialisation is not necessarily
> > unwound on error within the function in which the initialisation occurs.
> > 
> > I think it would be better if arfs_create_groups():
> > 
> > 1. Released the resources it allocates, including ft->g and
> >    elements of ft->g, on error.
> > 2. This was achieved by using a goto unwind ladder.
> > 3. The caller treated ft->g as uninitialised if
> >    arfs_create_groups fails.
> >
>  
> Agreed, I think an unwind ladder for arfs_create_groups is much better.
> I'll follow this idea and send a v2 patch later.

Thanks.

> Another comment below.
> 
> > Likewise, I think that:
> > 
> > * arfs_create_groups should initialise ft->num_groups
> > 
> > And further, logic similar to the above should guide
> > how arfs_create_table() initialises ft->t and cleans it
> > up on error.
> > 
> 
> I think the ft->t you mentioned comes from mlx5_create_flow_table.
> I'd like to make the life cycle of ft->t similar to that of ft->g in
> arfs_create_groups, but that would require adding an argument to
> mlx5_create_flow_table so that ft can be passed to it. However,
> mlx5_create_flow_table is called in more than 30 different places
> throughout the kernel. So such a modification could be a separate
> refactoring patch, but it is probably outside the scope of this fix.

I agree there is no need to solve all problems in this patch :)

> > I did not look at the code beyond the scope described above.
> > But the above are general principles that may well apply in
> > other nearby code too.
> > 
> > ...
  

Patch

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c b/drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c
index bb7f86c993e5..d9a60bd04167 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_arfs.c
@@ -257,6 +257,7 @@  static int arfs_create_groups(struct mlx5e_flow_table *ft,
 	in = kvzalloc(inlen, GFP_KERNEL);
 	if  (!in || !ft->g) {
 		kfree(ft->g);
+		ft->g = NULL;
 		kvfree(in);
 		return -ENOMEM;
 	}
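
As a rough illustration of why the added line is enough for the case described in the commit message, the userspace sketch below mimics the patched error path and a caller-side cleanup that frees ft->g unconditionally. Everything in it is an assumption made for the example: struct flow_table, create_groups, destroy_flow_table and the fail_in_alloc knob are hypothetical stand-ins, and calloc/malloc/free replace the kernel allocators. Like kfree(NULL), free(NULL) is a no-op, so clearing the pointer turns the second free into a harmless call.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mlx5e_flow_table; only ft->g matters here. */
struct flow_table {
	void **g;
};

/*
 * Mimics the patched error path. fail_in_alloc is a hypothetical knob that
 * simulates the allocation of `in` failing.
 */
static int create_groups(struct flow_table *ft, int fail_in_alloc)
{
	void *in;

	ft->g = calloc(4, sizeof(*ft->g));
	in = fail_in_alloc ? NULL : malloc(64);
	if (!in || !ft->g) {
		free(ft->g);
		ft->g = NULL;	/* the one-line fix: later cleanup sees NULL */
		free(in);
		return -ENOMEM;
	}

	free(in);
	return 0;
}

/* Mimics a caller-side cleanup that frees ft->g unconditionally. */
static void destroy_flow_table(struct flow_table *ft)
{
	free(ft->g);	/* no-op when ft->g is NULL */
	ft->g = NULL;
}
```

Without the `ft->g = NULL;` assignment, the cleanup function in this sketch would pass an already-freed pointer to free(), which is the double free the patch addresses.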