[RFC] mm/mempolicy: Fix memory leak in set_mempolicy_home_node system call

Message ID 20221214222110.200487-1-mathieu.desnoyers@efficios.com
State New

Commit Message

Mathieu Desnoyers Dec. 14, 2022, 10:21 p.m. UTC
When encountering any vma in the range with policy other than MPOL_BIND
or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put
on the policy just allocated with mpol_dup().

This allows arbitrary users to leak kernel memory.

Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Cc: Ben Widawsky <ben.widawsky@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: <linux-api@vger.kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: stable@vger.kernel.org # 5.17+
---
 mm/mempolicy.c | 1 +
 1 file changed, 1 insertion(+)
  

Comments

Randy Dunlap Dec. 14, 2022, 11:16 p.m. UTC | #1
On 12/14/22 14:21, Mathieu Desnoyers wrote:
> When encountering any vma in the range with policy other than MPOL_BIND
> or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put
> on the policy just allocated with mpol_dup().
> 
> This allows arbitrary users to leak kernel memory.
> 
> Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Ben Widawsky <ben.widawsky@intel.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Feng Tang <feng.tang@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: <linux-api@vger.kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org # 5.17+

Reviewed-by: Randy Dunlap <rdunlap@infradead.org>

Thanks.

> ---
>  mm/mempolicy.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 61aa9aedb728..02c8a712282f 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1540,6 +1540,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>  		 * the home node for vmas we already updated before.
>  		 */
>  		if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) {
> +			mpol_put(new);
>  			err = -EOPNOTSUPP;
>  			break;
>  		}
  
Huang, Ying Dec. 15, 2022, 6:34 a.m. UTC | #2
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:

> When encountering any vma in the range with policy other than MPOL_BIND
> or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put
> on the policy just allocated with mpol_dup().
>
> This allows arbitrary users to leak kernel memory.
>
> Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Ben Widawsky <ben.widawsky@intel.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Feng Tang <feng.tang@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: <linux-api@vger.kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org # 5.17+

Reviewed-by: "Huang, Ying" <ying.huang@intel.com>

Thanks!

> ---
>  mm/mempolicy.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 61aa9aedb728..02c8a712282f 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1540,6 +1540,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>  		 * the home node for vmas we already updated before.
>  		 */
>  		if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) {
> +			mpol_put(new);
>  			err = -EOPNOTSUPP;
>  			break;
>  		}
  
Michal Hocko Dec. 15, 2022, 7:51 a.m. UTC | #3
On Wed 14-12-22 17:21:10, Mathieu Desnoyers wrote:
> When encountering any vma in the range with policy other than MPOL_BIND
> or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put
> on the policy just allocated with mpol_dup().
> 
> This allows arbitrary users to leak kernel memory.
> 
> Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Ben Widawsky <ben.widawsky@intel.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Feng Tang <feng.tang@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: <linux-api@vger.kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org # 5.17+

Acked-by: Michal Hocko <mhocko@suse.com>
Thanks for catching this!

Btw. looking at the code again it seems rather pointless to duplicate
the policy just to throw it away anyway. A slightly bigger diff but this
looks more reasonable to me. What do you think? I can also send it as a
clean up on top of your fix.
---
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 61aa9aedb728..918cdc8a7f0c 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1489,7 +1489,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
-	struct mempolicy *new;
+	struct mempolicy *new, *old;
 	unsigned long vmstart;
 	unsigned long vmend;
 	unsigned long end;
@@ -1521,30 +1521,28 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
 		return 0;
 	mmap_write_lock(mm);
 	for_each_vma_range(vmi, vma, end) {
-		vmstart = max(start, vma->vm_start);
-		vmend   = min(end, vma->vm_end);
-		new = mpol_dup(vma_policy(vma));
-		if (IS_ERR(new)) {
-			err = PTR_ERR(new);
-			break;
-		}
-		/*
-		 * Only update home node if there is an existing vma policy
-		 */
-		if (!new)
-			continue;
-
 		/*
 		 * If any vma in the range got policy other than MPOL_BIND
 		 * or MPOL_PREFERRED_MANY we return error. We don't reset
 		 * the home node for vmas we already updated before.
 		 */
-		if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) {
+		old = vma_policy(vma);
+		if (!old)
+			continue;
+		if (old->mode != MPOL_BIND && old->mode != MPOL_PREFERRED_MANY) {
 			err = -EOPNOTSUPP;
 			break;
 		}
 
+		new = mpol_dup(vma_policy(vma));
+		if (IS_ERR(new)) {
+			err = PTR_ERR(new);
+			break;
+		}
+
 		new->home_node = home_node;
+		vmstart = max(start, vma->vm_start);
+		vmend   = min(end, vma->vm_end);
 		err = mbind_range(mm, vmstart, vmend, new);
 		mpol_put(new);
 		if (err)
  
Aneesh Kumar K.V Dec. 15, 2022, 1:56 p.m. UTC | #4
Mathieu Desnoyers <mathieu.desnoyers@efficios.com> writes:

> When encountering any vma in the range with policy other than MPOL_BIND
> or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put
> on the policy just allocated with mpol_dup().
>
> This allows arbitrary users to leak kernel memory.
>

Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>

> Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
> Cc: Ben Widawsky <ben.widawsky@intel.com>
> Cc: Dave Hansen <dave.hansen@linux.intel.com>
> Cc: Feng Tang <feng.tang@intel.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Andrea Arcangeli <aarcange@redhat.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Randy Dunlap <rdunlap@infradead.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Andi Kleen <ak@linux.intel.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Huang Ying <ying.huang@intel.com>
> Cc: <linux-api@vger.kernel.org>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: stable@vger.kernel.org # 5.17+
> ---
>  mm/mempolicy.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 61aa9aedb728..02c8a712282f 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1540,6 +1540,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>  		 * the home node for vmas we already updated before.
>  		 */
>  		if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) {
> +			mpol_put(new);
>  			err = -EOPNOTSUPP;
>  			break;
>  		}
> -- 
> 2.25.1
  
Aneesh Kumar K.V Dec. 15, 2022, 1:57 p.m. UTC | #5
Michal Hocko <mhocko@suse.com> writes:

> On Wed 14-12-22 17:21:10, Mathieu Desnoyers wrote:
>> When encountering any vma in the range with policy other than MPOL_BIND
>> or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put
>> on the policy just allocated with mpol_dup().
>> 
>> This allows arbitrary users to leak kernel memory.
>> 
>> Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> Cc: Ben Widawsky <ben.widawsky@intel.com>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: Feng Tang <feng.tang@intel.com>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Randy Dunlap <rdunlap@infradead.org>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Andi Kleen <ak@linux.intel.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Huang Ying <ying.huang@intel.com>
>> Cc: <linux-api@vger.kernel.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: stable@vger.kernel.org # 5.17+
>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Thanks for catching this!
>
> Btw. looking at the code again it seems rather pointless to duplicate
> the policy just to throw it away anyway. A slightly bigger diff but this
> looks more reasonable to me. What do you think? I can also send it as a
> clean up on top of your fix.
> ---
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 61aa9aedb728..918cdc8a7f0c 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1489,7 +1489,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>  {
>  	struct mm_struct *mm = current->mm;
>  	struct vm_area_struct *vma;
> -	struct mempolicy *new;
> +	struct mempolicy *new, *old;
>  	unsigned long vmstart;
>  	unsigned long vmend;
>  	unsigned long end;
> @@ -1521,30 +1521,28 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>  		return 0;
>  	mmap_write_lock(mm);
>  	for_each_vma_range(vmi, vma, end) {
> -		vmstart = max(start, vma->vm_start);
> -		vmend   = min(end, vma->vm_end);
> -		new = mpol_dup(vma_policy(vma));
> -		if (IS_ERR(new)) {
> -			err = PTR_ERR(new);
> -			break;
> -		}
> -		/*
> -		 * Only update home node if there is an existing vma policy
> -		 */
> -		if (!new)
> -			continue;
> -
>  		/*
>  		 * If any vma in the range got policy other than MPOL_BIND
>  		 * or MPOL_PREFERRED_MANY we return error. We don't reset
>  		 * the home node for vmas we already updated before.
>  		 */
> -		if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) {
> +		old = vma_policy(vma);
> +		if (!old)
> +			continue;
> +		if (old->mode != MPOL_BIND && old->mode != MPOL_PREFERRED_MANY) {
>  			err = -EOPNOTSUPP;
>  			break;
>  		}
>  
> +		new = mpol_dup(vma_policy(vma));

		new = mpol_dup(old);

> +		if (IS_ERR(new)) {
> +			err = PTR_ERR(new);
> +			break;
> +		}
> +
>  		new->home_node = home_node;
> +		vmstart = max(start, vma->vm_start);
> +		vmend   = min(end, vma->vm_end);
>  		err = mbind_range(mm, vmstart, vmend, new);
>  		mpol_put(new);
>  		if (err)
> -- 
> Michal Hocko
> SUSE Labs
  
Mathieu Desnoyers Dec. 15, 2022, 2:33 p.m. UTC | #6
On 2022-12-15 02:51, Michal Hocko wrote:
> On Wed 14-12-22 17:21:10, Mathieu Desnoyers wrote:
>> When encountering any vma in the range with policy other than MPOL_BIND
>> or MPOL_PREFERRED_MANY, an error is returned without issuing a mpol_put
>> on the policy just allocated with mpol_dup().
>>
>> This allows arbitrary users to leak kernel memory.
>>
>> Fixes: c6018b4b2549 ("mm/mempolicy: add set_mempolicy_home_node syscall")
>> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
>> Cc: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
>> Cc: Ben Widawsky <ben.widawsky@intel.com>
>> Cc: Dave Hansen <dave.hansen@linux.intel.com>
>> Cc: Feng Tang <feng.tang@intel.com>
>> Cc: Michal Hocko <mhocko@kernel.org>
>> Cc: Andrea Arcangeli <aarcange@redhat.com>
>> Cc: Mel Gorman <mgorman@techsingularity.net>
>> Cc: Mike Kravetz <mike.kravetz@oracle.com>
>> Cc: Randy Dunlap <rdunlap@infradead.org>
>> Cc: Vlastimil Babka <vbabka@suse.cz>
>> Cc: Andi Kleen <ak@linux.intel.com>
>> Cc: Dan Williams <dan.j.williams@intel.com>
>> Cc: Huang Ying <ying.huang@intel.com>
>> Cc: <linux-api@vger.kernel.org>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: stable@vger.kernel.org # 5.17+
> 
> Acked-by: Michal Hocko <mhocko@suse.com>
> Thanks for catching this!
> 
> Btw. looking at the code again it seems rather pointless to duplicate
> the policy just to throw it away anyway. A slightly bigger diff but this
> looks more reasonable to me. What do you think? I can also send it as a
> clean up on top of your fix.

I think it would be best if this comes as a cleanup on top of my fix. 
The diff is larger than the minimal change needed to fix the leak in 
stable branches.

Your approach looks fine, except for the vma_policy(vma) -> old change 
already spotted by Aneesh.

Thanks,

Mathieu

> ---
> diff --git a/mm/mempolicy.c b/mm/mempolicy.c
> index 61aa9aedb728..918cdc8a7f0c 100644
> --- a/mm/mempolicy.c
> +++ b/mm/mempolicy.c
> @@ -1489,7 +1489,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>   {
>   	struct mm_struct *mm = current->mm;
>   	struct vm_area_struct *vma;
> -	struct mempolicy *new;
> +	struct mempolicy *new, *old;
>   	unsigned long vmstart;
>   	unsigned long vmend;
>   	unsigned long end;
> @@ -1521,30 +1521,28 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
>   		return 0;
>   	mmap_write_lock(mm);
>   	for_each_vma_range(vmi, vma, end) {
> -		vmstart = max(start, vma->vm_start);
> -		vmend   = min(end, vma->vm_end);
> -		new = mpol_dup(vma_policy(vma));
> -		if (IS_ERR(new)) {
> -			err = PTR_ERR(new);
> -			break;
> -		}
> -		/*
> -		 * Only update home node if there is an existing vma policy
> -		 */
> -		if (!new)
> -			continue;
> -
>   		/*
>   		 * If any vma in the range got policy other than MPOL_BIND
>   		 * or MPOL_PREFERRED_MANY we return error. We don't reset
>   		 * the home node for vmas we already updated before.
>   		 */
> -		if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) {
> +		old = vma_policy(vma);
> +		if (!old)
> +			continue;
> +		if (old->mode != MPOL_BIND && old->mode != MPOL_PREFERRED_MANY) {
>   			err = -EOPNOTSUPP;
>   			break;
>   		}
>   
> +		new = mpol_dup(vma_policy(vma));
> +		if (IS_ERR(new)) {
> +			err = PTR_ERR(new);
> +			break;
> +		}
> +
>   		new->home_node = home_node;
> +		vmstart = max(start, vma->vm_start);
> +		vmend   = min(end, vma->vm_end);
>   		err = mbind_range(mm, vmstart, vmend, new);
>   		mpol_put(new);
>   		if (err)
  

Patch

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 61aa9aedb728..02c8a712282f 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1540,6 +1540,7 @@  SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le
 		 * the home node for vmas we already updated before.
 		 */
 		if (new->mode != MPOL_BIND && new->mode != MPOL_PREFERRED_MANY) {
+			mpol_put(new);
 			err = -EOPNOTSUPP;
 			break;
 		}
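
The leak and the two ways of avoiding it discussed in this thread can be illustrated with a small userspace model. This is only a sketch, not kernel code: `struct policy`, `policy_dup()`, `policy_put()`, `live_copies` and both `set_home_node_*()` functions are hypothetical stand-ins for `struct mempolicy`, `mpol_dup()`, `mpol_put()` and the syscall loop, and a plain assignment stands in for `mbind_range()`. The counter tracks duplicated policies that have not been released yet.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins; these are not the kernel definitions. */
enum mode { MODE_DEFAULT, MODE_BIND, MODE_PREFERRED_MANY };

struct policy {
	enum mode mode;
	int home_node;
};

static int live_copies;	/* duplicated policies not yet released */

static struct policy *policy_dup(const struct policy *old)
{
	struct policy *new = malloc(sizeof(*new));

	if (!new)
		return NULL;
	*new = *old;
	live_copies++;
	return new;
}

static void policy_put(struct policy *pol)
{
	assert(live_copies > 0);
	free(pol);
	live_copies--;
}

static int mode_supported(enum mode m)
{
	return m == MODE_BIND || m == MODE_PREFERRED_MANY;
}

/*
 * Duplicate first, then check the mode, as in the code being fixed.
 * with_fix == 0 models the leak; with_fix == 1 models the one-line fix.
 */
static int set_home_node_dup_first(struct policy **vmas, int n, int node,
				   int with_fix)
{
	for (int i = 0; i < n; i++) {
		struct policy *new;

		if (!vmas[i])
			continue;	/* no existing policy on this vma */
		new = policy_dup(vmas[i]);
		if (!new)
			return -ENOMEM;
		if (!mode_supported(new->mode)) {
			if (with_fix)
				policy_put(new);	/* the added release */
			return -EOPNOTSUPP;
		}
		new->home_node = node;
		vmas[i]->home_node = new->home_node;	/* stands in for mbind_range() */
		policy_put(new);
	}
	return 0;
}

/* Check the existing policy first and duplicate only when it is needed. */
static int set_home_node_check_first(struct policy **vmas, int n, int node)
{
	for (int i = 0; i < n; i++) {
		struct policy *old = vmas[i], *new;

		if (!old)
			continue;
		if (!mode_supported(old->mode))
			return -EOPNOTSUPP;	/* nothing allocated yet */
		new = policy_dup(old);
		if (!new)
			return -ENOMEM;
		new->home_node = node;
		old->home_node = new->home_node;	/* stands in for mbind_range() */
		policy_put(new);
	}
	return 0;
}
```

In this model, running the dup-first loop without the fix over a range that contains an unsupported policy returns -EOPNOTSUPP and leaves `live_copies` at 1. With the fix, or with the check-first ordering, the counter is back at 0 on the same input.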