[3/5] padata: grab parallel_data refcnt for reorder

Message ID 20221019083708.27138-4-nstange@suse.de
State New
Headers
Series padata: fix liftime issues after ->serial() has completed |

Commit Message

Nicolai Stange Oct. 19, 2022, 8:37 a.m. UTC
  On entry of padata_do_serial(), the in-flight padata_priv owns a reference
to the associated parallel_data instance.

However, as soon as the padata_priv got enqueued on the reorder list, it
can be completed from a different context, causing the reference to get
released in the course.

This would potentially cause UAFs from the subsequent padata_reorder()
operations invoked from the enqueueing padata_do_serial() or from the
reorder work.

Note that this is a purely theroretical concern, the problem has never been
actually observed -- it would require multiple pcrypt request submissions
racing against each other, ultimately a pcrypt instance destruction
(DELALG) short after request completions as well as unfortunate timing.

However, for the sake of correctness, it is still worth fixing.

Make padata_do_serial() grab a reference count on the parallel_data for
the subsequent reorder operation(s). As long as the padata_priv has not
been enqueued, this is safe, because as mentioned above, in-flight
pdata_privs own a reference already.

Note that padata_reorder() might schedule another padata_reorder() work
and thus, care must be taken to not prematurely release that "reorder
refcount" from padata_do_serial() again in case that has happened.
Make padata_reorder() return a bool for indicating whether or not a
reorder work has been scheduled. Let padata_do_serial() drop its refcount
only if this is not the case. Accordingly, make the reorder work handler,
invoke_padata_reorder(), drop it then as appropriate.

A remark on the commit chosen for the Fixes tag reference below: before
commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock by using per-instance
padata queues"), the padata_parallel lifetime had been tied to the
padata_instance. The padata_free() resp. padata_stop() issued a
synchronize_rcu() before padata_free_pd() from the instance destruction
path, rendering UAFs from the padata_do_serial()=>padata_reorder()
invocations with BHs disabled impossible AFAICS. With that, the
padata_reorder() work remains to be considered. Before
commit b128a3040935 ("padata: allocate workqueue internally"), the
workqueue got destroyed (from pcrypt), hence drained, before the padata
instance destruction, but this change moved that to after the
padata_free_pd() invocation from __padata_free(). So, while the Fixes
reference from below is most likely technically correct, I would still like
to reiterate that this problem is probably hard to trigger in practice,
even more so before commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock
by using per-instance padata queues").

Fixes: b128a3040935 ("padata: allocate workqueue internally")
Signed-off-by: Nicolai Stange <nstange@suse.de>
---
 kernel/padata.c | 26 +++++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)
  

Comments

Daniel Jordan Oct. 28, 2022, 4:04 p.m. UTC | #1
On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
> to the associated parallel_data instance.
> 
> However, as soon as the padata_priv got enqueued on the reorder list, it
> can be completed from a different context, causing the reference to get
> released in the course.
> 
> This would potentially cause UAFs from the subsequent padata_reorder()
> operations invoked from the enqueueing padata_do_serial() or from the
> reorder work.
> 
> Note that this is a purely theroretical concern, the problem has never been
> actually observed -- it would require multiple pcrypt request submissions
> racing against each other, ultimately a pcrypt instance destruction
> (DELALG) short after request completions as well as unfortunate timing.
> 
> However, for the sake of correctness, it is still worth fixing.
> 
> Make padata_do_serial() grab a reference count on the parallel_data for
> the subsequent reorder operation(s). As long as the padata_priv has not
> been enqueued, this is safe, because as mentioned above, in-flight
> pdata_privs own a reference already.
> 
> Note that padata_reorder() might schedule another padata_reorder() work
> and thus, care must be taken to not prematurely release that "reorder
> refcount" from padata_do_serial() again in case that has happened.
> Make padata_reorder() return a bool for indicating whether or not a
> reorder work has been scheduled. Let padata_do_serial() drop its refcount
> only if this is not the case. Accordingly, make the reorder work handler,
> invoke_padata_reorder(), drop it then as appropriate.
> 
> A remark on the commit chosen for the Fixes tag reference below: before
> commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock by using per-instance
> padata queues"), the padata_parallel lifetime had been tied to the
> padata_instance. The padata_free() resp. padata_stop() issued a
> synchronize_rcu() before padata_free_pd() from the instance destruction
> path, rendering UAFs from the padata_do_serial()=>padata_reorder()
> invocations with BHs disabled impossible AFAICS. With that, the
> padata_reorder() work remains to be considered. Before
> commit b128a3040935 ("padata: allocate workqueue internally"), the
> workqueue got destroyed (from pcrypt), hence drained, before the padata
> instance destruction, but this change moved that to after the
> padata_free_pd() invocation from __padata_free(). So, while the Fixes
> reference from below is most likely technically correct, I would still like
> to reiterate that this problem is probably hard to trigger in practice,
> even more so before commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock
> by using per-instance padata queues").
> 
> Fixes: b128a3040935 ("padata: allocate workqueue internally")
> Signed-off-by: Nicolai Stange <nstange@suse.de>
> ---
>  kernel/padata.c | 26 +++++++++++++++++++++-----
>  1 file changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 0bf8c80dad5a..b79226727ef7 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -275,7 +275,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
>  	return padata;
>  }
>  
> -static void padata_reorder(struct parallel_data *pd)
> +static bool padata_reorder(struct parallel_data *pd)
>  {
>  	struct padata_instance *pinst = pd->ps->pinst;
>  	int cb_cpu;
> @@ -294,7 +294,7 @@ static void padata_reorder(struct parallel_data *pd)
>  	 * care for all the objects enqueued during the holdtime of the lock.
>  	 */
>  	if (!spin_trylock_bh(&pd->lock))
> -		return;
> +		return false;
>  
>  	while (1) {
>  		padata = padata_find_next(pd, true);
> @@ -331,17 +331,23 @@ static void padata_reorder(struct parallel_data *pd)
>  
>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
> -		queue_work(pinst->serial_wq, &pd->reorder_work);
> +		return queue_work(pinst->serial_wq, &pd->reorder_work);
> +
> +	return false;
>  }
>  
>  static void invoke_padata_reorder(struct work_struct *work)
>  {
>  	struct parallel_data *pd;
> +	bool keep_refcnt;
>  
>  	local_bh_disable();
>  	pd = container_of(work, struct parallel_data, reorder_work);
> -	padata_reorder(pd);
> +	keep_refcnt = padata_reorder(pd);
>  	local_bh_enable();
> +
> +	if (!keep_refcnt)
> +		padata_put_pd(pd);
>  }
>  
>  static void padata_serial_worker(struct work_struct *serial_work)
> @@ -392,6 +398,15 @@ void padata_do_serial(struct padata_priv *padata)
>  	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
>  	struct padata_priv *cur;
>  
> +	/*
> +	 * The in-flight padata owns a reference on pd. However, as
> +	 * soon as it's been enqueued on the reorder list, another
> +	 * task can dequeue and complete it, thereby dropping the
> +	 * reference. Grab another reference here, it will eventually
> +	 * be released from a reorder work, if any, or below.
> +	 */
> +	padata_get_pd(pd);
> +
>  	spin_lock(&reorder->lock);
>  	/* Sort in ascending order of sequence number. */
>  	list_for_each_entry_reverse(cur, &reorder->list, list)
> @@ -407,7 +422,8 @@ void padata_do_serial(struct padata_priv *padata)
>  	 */
>  	smp_mb();
>  
> -	padata_reorder(pd);
> +	if (!padata_reorder(pd))
> +		padata_put_pd(pd);

do_serial is supposed to be called with BHs disabled and will be in
every case after a fix for a separate issue that I plan to send this
cycle.  Given that it (will soon...) always happen under RCU protection,
part of this issue could be addressed like this, which puts the expense
of dealing with this rare problem in the slow path:

diff --git a/kernel/padata.c b/kernel/padata.c
index 0bf8c80dad5a..cd6740ae6629 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -1110,6 +1110,12 @@ void padata_free_shell(struct padata_shell *ps)
 	if (!ps)
 		return;
 
+	/*
+	 * Wait for all _do_serial calls to finish to avoid touching freed pd's
+	 * and ps's.
+	 */
+	synchronize_rcu();
+
 	mutex_lock(&ps->pinst->lock);
 	list_del(&ps->list);
 	padata_put_pd(rcu_dereference_protected(ps->pd, 1));

pcrypt calls padata_free_shell() when all outstanding transforms (and
thus requests, I think) have been freed/completed, so no new task can
come into padata_reorder.  synchronize_rcu() then flushes out any
remaining _reorder calls.

This doesn't deal with pending reorder_work items, but we can fix it
later in the series.

What do you think?
  
Nicolai Stange Nov. 9, 2022, 1:03 p.m. UTC | #2
Daniel Jordan <daniel.m.jordan@oracle.com> writes:

> On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
>> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
>> to the associated parallel_data instance.
>> 
>> However, as soon as the padata_priv got enqueued on the reorder list, it
>> can be completed from a different context, causing the reference to get
>> released in the course.
>> 
>> This would potentially cause UAFs from the subsequent padata_reorder()
>> operations invoked from the enqueueing padata_do_serial() or from the
>> reorder work.
>> 

<snip>

>> ---
>>  kernel/padata.c | 26 +++++++++++++++++++++-----
>>  1 file changed, 21 insertions(+), 5 deletions(-)
>> 
>> diff --git a/kernel/padata.c b/kernel/padata.c
>> index 0bf8c80dad5a..b79226727ef7 100644
>> --- a/kernel/padata.c
>> +++ b/kernel/padata.c
>> @@ -275,7 +275,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
>>  	return padata;
>>  }
>>  
>> -static void padata_reorder(struct parallel_data *pd)
>> +static bool padata_reorder(struct parallel_data *pd)
>>  {
>>  	struct padata_instance *pinst = pd->ps->pinst;
>>  	int cb_cpu;
>> @@ -294,7 +294,7 @@ static void padata_reorder(struct parallel_data *pd)
>>  	 * care for all the objects enqueued during the holdtime of the lock.
>>  	 */
>>  	if (!spin_trylock_bh(&pd->lock))
>> -		return;
>> +		return false;
>>  
>>  	while (1) {
>>  		padata = padata_find_next(pd, true);
>> @@ -331,17 +331,23 @@ static void padata_reorder(struct parallel_data *pd)
>>  
>>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
>>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
>> -		queue_work(pinst->serial_wq, &pd->reorder_work);
>> +		return queue_work(pinst->serial_wq, &pd->reorder_work);
>> +
>> +	return false;
>>  }
>>  
>>  static void invoke_padata_reorder(struct work_struct *work)
>>  {
>>  	struct parallel_data *pd;
>> +	bool keep_refcnt;
>>  
>>  	local_bh_disable();
>>  	pd = container_of(work, struct parallel_data, reorder_work);
>> -	padata_reorder(pd);
>> +	keep_refcnt = padata_reorder(pd);
>>  	local_bh_enable();
>> +
>> +	if (!keep_refcnt)
>> +		padata_put_pd(pd);
>>  }
>>  
>>  static void padata_serial_worker(struct work_struct *serial_work)
>> @@ -392,6 +398,15 @@ void padata_do_serial(struct padata_priv *padata)
>>  	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
>>  	struct padata_priv *cur;
>>  
>> +	/*
>> +	 * The in-flight padata owns a reference on pd. However, as
>> +	 * soon as it's been enqueued on the reorder list, another
>> +	 * task can dequeue and complete it, thereby dropping the
>> +	 * reference. Grab another reference here, it will eventually
>> +	 * be released from a reorder work, if any, or below.
>> +	 */
>> +	padata_get_pd(pd);
>> +
>>  	spin_lock(&reorder->lock);
>>  	/* Sort in ascending order of sequence number. */
>>  	list_for_each_entry_reverse(cur, &reorder->list, list)
>> @@ -407,7 +422,8 @@ void padata_do_serial(struct padata_priv *padata)
>>  	 */
>>  	smp_mb();
>>  
>> -	padata_reorder(pd);
>> +	if (!padata_reorder(pd))
>> +		padata_put_pd(pd);
>
> do_serial is supposed to be called with BHs disabled and will be in
> every case after a fix for a separate issue that I plan to send this
> cycle.  Given that it (will soon...) always happen under RCU protection,
> part of this issue could be addressed like this, which puts the expense
> of dealing with this rare problem in the slow path:
>
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 0bf8c80dad5a..cd6740ae6629 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -1110,6 +1110,12 @@ void padata_free_shell(struct padata_shell *ps)
>  	if (!ps)
>  		return;
>  
> +	/*
> +	 * Wait for all _do_serial calls to finish to avoid touching freed pd's
> +	 * and ps's.
> +	 */
> +	synchronize_rcu();
> +
>  	mutex_lock(&ps->pinst->lock);
>  	list_del(&ps->list);
>  	padata_put_pd(rcu_dereference_protected(ps->pd, 1));
>
> pcrypt calls padata_free_shell() when all outstanding transforms (and
> thus requests, I think) have been freed/completed, so no new task can
> come into padata_reorder.  synchronize_rcu() then flushes out any
> remaining _reorder calls.
>
> This doesn't deal with pending reorder_work items, but we can fix it
> later in the series.
>
> What do you think?

Yes, I think that would work. Will you handle it alongside that fix for
a separate issue you mentioned above? Or shall I once this fix has
landed?

Thanks!

Nicolai
  
Daniel Jordan Nov. 10, 2022, 11:16 p.m. UTC | #3
On Wed, Nov 09, 2022 at 02:03:15PM +0100, Nicolai Stange wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
> 
> > On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
> >> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
> >> to the associated parallel_data instance.
> >> 
> >> However, as soon as the padata_priv got enqueued on the reorder list, it
> >> can be completed from a different context, causing the reference to get
> >> released in the course.
> >> 
> >> This would potentially cause UAFs from the subsequent padata_reorder()
> >> operations invoked from the enqueueing padata_do_serial() or from the
> >> reorder work.
> >> 
> 
> <snip>
> 
> >> ---
> >>  kernel/padata.c | 26 +++++++++++++++++++++-----
> >>  1 file changed, 21 insertions(+), 5 deletions(-)
> >> 
> >> diff --git a/kernel/padata.c b/kernel/padata.c
> >> index 0bf8c80dad5a..b79226727ef7 100644
> >> --- a/kernel/padata.c
> >> +++ b/kernel/padata.c
> >> @@ -275,7 +275,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
> >>  	return padata;
> >>  }
> >>  
> >> -static void padata_reorder(struct parallel_data *pd)
> >> +static bool padata_reorder(struct parallel_data *pd)
> >>  {
> >>  	struct padata_instance *pinst = pd->ps->pinst;
> >>  	int cb_cpu;
> >> @@ -294,7 +294,7 @@ static void padata_reorder(struct parallel_data *pd)
> >>  	 * care for all the objects enqueued during the holdtime of the lock.
> >>  	 */
> >>  	if (!spin_trylock_bh(&pd->lock))
> >> -		return;
> >> +		return false;
> >>  
> >>  	while (1) {
> >>  		padata = padata_find_next(pd, true);
> >> @@ -331,17 +331,23 @@ static void padata_reorder(struct parallel_data *pd)
> >>  
> >>  	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
> >>  	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
> >> -		queue_work(pinst->serial_wq, &pd->reorder_work);
> >> +		return queue_work(pinst->serial_wq, &pd->reorder_work);
> >> +
> >> +	return false;
> >>  }
> >>  
> >>  static void invoke_padata_reorder(struct work_struct *work)
> >>  {
> >>  	struct parallel_data *pd;
> >> +	bool keep_refcnt;
> >>  
> >>  	local_bh_disable();
> >>  	pd = container_of(work, struct parallel_data, reorder_work);
> >> -	padata_reorder(pd);
> >> +	keep_refcnt = padata_reorder(pd);
> >>  	local_bh_enable();
> >> +
> >> +	if (!keep_refcnt)
> >> +		padata_put_pd(pd);
> >>  }
> >>  
> >>  static void padata_serial_worker(struct work_struct *serial_work)
> >> @@ -392,6 +398,15 @@ void padata_do_serial(struct padata_priv *padata)
> >>  	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
> >>  	struct padata_priv *cur;
> >>  
> >> +	/*
> >> +	 * The in-flight padata owns a reference on pd. However, as
> >> +	 * soon as it's been enqueued on the reorder list, another
> >> +	 * task can dequeue and complete it, thereby dropping the
> >> +	 * reference. Grab another reference here, it will eventually
> >> +	 * be released from a reorder work, if any, or below.
> >> +	 */
> >> +	padata_get_pd(pd);
> >> +
> >>  	spin_lock(&reorder->lock);
> >>  	/* Sort in ascending order of sequence number. */
> >>  	list_for_each_entry_reverse(cur, &reorder->list, list)
> >> @@ -407,7 +422,8 @@ void padata_do_serial(struct padata_priv *padata)
> >>  	 */
> >>  	smp_mb();
> >>  
> >> -	padata_reorder(pd);
> >> +	if (!padata_reorder(pd))
> >> +		padata_put_pd(pd);
> >
> > do_serial is supposed to be called with BHs disabled and will be in
> > every case after a fix for a separate issue that I plan to send this
> > cycle.  Given that it (will soon...) always happen under RCU protection,
> > part of this issue could be addressed like this, which puts the expense
> > of dealing with this rare problem in the slow path:
> >
> > diff --git a/kernel/padata.c b/kernel/padata.c
> > index 0bf8c80dad5a..cd6740ae6629 100644
> > --- a/kernel/padata.c
> > +++ b/kernel/padata.c
> > @@ -1110,6 +1110,12 @@ void padata_free_shell(struct padata_shell *ps)
> >  	if (!ps)
> >  		return;
> >  
> > +	/*
> > +	 * Wait for all _do_serial calls to finish to avoid touching freed pd's
> > +	 * and ps's.
> > +	 */
> > +	synchronize_rcu();
> > +
> >  	mutex_lock(&ps->pinst->lock);
> >  	list_del(&ps->list);
> >  	padata_put_pd(rcu_dereference_protected(ps->pd, 1));
> >
> > pcrypt calls padata_free_shell() when all outstanding transforms (and
> > thus requests, I think) have been freed/completed, so no new task can
> > come into padata_reorder.  synchronize_rcu() then flushes out any
> > remaining _reorder calls.
> >
> > This doesn't deal with pending reorder_work items, but we can fix it
> > later in the series.
> >
> > What do you think?
> 
> Yes, I think that would work. Will you handle it alongside that fix for
> a separate issue you mentioned above? Or shall I once this fix has
> landed?

Please go ahead and do it yourself.  I'll send mine soon, I think it
should be ok for them to go in in the same cycle.
  

Patch

diff --git a/kernel/padata.c b/kernel/padata.c
index 0bf8c80dad5a..b79226727ef7 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -275,7 +275,7 @@  static struct padata_priv *padata_find_next(struct parallel_data *pd,
 	return padata;
 }
 
-static void padata_reorder(struct parallel_data *pd)
+static bool padata_reorder(struct parallel_data *pd)
 {
 	struct padata_instance *pinst = pd->ps->pinst;
 	int cb_cpu;
@@ -294,7 +294,7 @@  static void padata_reorder(struct parallel_data *pd)
 	 * care for all the objects enqueued during the holdtime of the lock.
 	 */
 	if (!spin_trylock_bh(&pd->lock))
-		return;
+		return false;
 
 	while (1) {
 		padata = padata_find_next(pd, true);
@@ -331,17 +331,23 @@  static void padata_reorder(struct parallel_data *pd)
 
 	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
 	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
-		queue_work(pinst->serial_wq, &pd->reorder_work);
+		return queue_work(pinst->serial_wq, &pd->reorder_work);
+
+	return false;
 }
 
 static void invoke_padata_reorder(struct work_struct *work)
 {
 	struct parallel_data *pd;
+	bool keep_refcnt;
 
 	local_bh_disable();
 	pd = container_of(work, struct parallel_data, reorder_work);
-	padata_reorder(pd);
+	keep_refcnt = padata_reorder(pd);
 	local_bh_enable();
+
+	if (!keep_refcnt)
+		padata_put_pd(pd);
 }
 
 static void padata_serial_worker(struct work_struct *serial_work)
@@ -392,6 +398,15 @@  void padata_do_serial(struct padata_priv *padata)
 	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
 	struct padata_priv *cur;
 
+	/*
+	 * The in-flight padata owns a reference on pd. However, as
+	 * soon as it's been enqueued on the reorder list, another
+	 * task can dequeue and complete it, thereby dropping the
+	 * reference. Grab another reference here, it will eventually
+	 * be released from a reorder work, if any, or below.
+	 */
+	padata_get_pd(pd);
+
 	spin_lock(&reorder->lock);
 	/* Sort in ascending order of sequence number. */
 	list_for_each_entry_reverse(cur, &reorder->list, list)
@@ -407,7 +422,8 @@  void padata_do_serial(struct padata_priv *padata)
 	 */
 	smp_mb();
 
-	padata_reorder(pd);
+	if (!padata_reorder(pd))
+		padata_put_pd(pd);
 }
 EXPORT_SYMBOL(padata_do_serial);