[2/5] padata: make padata_free_shell() to respect pd's ->refcnt

Message ID 20221019083708.27138-3-nstange@suse.de
State New
Headers
Series padata: fix liftime issues after ->serial() has completed |

Commit Message

Nicolai Stange Oct. 19, 2022, 8:37 a.m. UTC
  On a PREEMPT kernel, the following has been observed while running
pcrypt_aead01 from LTP:

  [ ] general protection fault: 0000 [#1] PREEMPT_RT SMP PTI
  <...>
  [ ] Workqueue: pdecrypt_parallel padata_parallel_worker
  [ ] RIP: 0010:padata_reorder+0x19/0x120
  <...>
  [ ] Call Trace:
  [ ]  padata_parallel_worker+0xa3/0xf0
  [ ]  process_one_work+0x1db/0x4a0
  [ ]  worker_thread+0x2d/0x3c0
  [ ]  ? process_one_work+0x4a0/0x4a0
  [ ]  kthread+0x159/0x180
  [ ]  ? kthread_park+0xb0/0xb0
  [ ]  ret_from_fork+0x35/0x40

The pcrypt_aead01 testcase basically runs a NEWALG/DELALG sequence for some
fixed pcrypt instance in a loop, back to back.

The problem is that once the last ->serial() in padata_serial_worker() is
getting invoked, the pcrypt requests from the selftests would signal
completion, and pcrypt_aead01 can move on and subsequently issue a DELALG.
Upon pcrypt instance deregistration, the associated padata_shell would get
destroyed, which in turn would unconditionally free the associated
parallel_data instance.

If padata_serial_worker() now resumes operation after e.g. having
previously been preempted upon the return from the last of those ->serial()
callbacks, its subsequent accesses to pd for managing the ->refcnt will
all be UAFs. In particular, if the memory backing pd has meanwhile been
reused for some new parallel_data allocation, e.g in the course of
processing another subsequent NEWALG request, the padata_serial_worker()
might find the initial ->refcnt of one and free pd from under that NEWALG
or the associated selftests respectively, leading to "secondary" UAFs such
as in the Oops above.

Note that as it currently stands, a padata_shell owns a reference on its
associated parallel_data already. So fix the UAF in padata_serial_worker()
by making padata_free_shell() to not unconditionally free the shell's
associated parallel_data, but to properly drop that reference via
padata_put_pd() instead.

Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")
Signed-off-by: Nicolai Stange <nstange@suse.de>
---
 kernel/padata.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Daniel Jordan Oct. 28, 2022, 2:35 p.m. UTC | #1
On Wed, Oct 19, 2022 at 10:37:05AM +0200, Nicolai Stange wrote:
> On a PREEMPT kernel, the following has been observed while running
> pcrypt_aead01 from LTP:
> 
>   [ ] general protection fault: 0000 [#1] PREEMPT_RT SMP PTI
>   <...>
>   [ ] Workqueue: pdecrypt_parallel padata_parallel_worker
>   [ ] RIP: 0010:padata_reorder+0x19/0x120
>   <...>
>   [ ] Call Trace:
>   [ ]  padata_parallel_worker+0xa3/0xf0
>   [ ]  process_one_work+0x1db/0x4a0
>   [ ]  worker_thread+0x2d/0x3c0
>   [ ]  ? process_one_work+0x4a0/0x4a0
>   [ ]  kthread+0x159/0x180
>   [ ]  ? kthread_park+0xb0/0xb0
>   [ ]  ret_from_fork+0x35/0x40
> 
> The pcrypt_aead01 testcase basically runs a NEWALG/DELALG sequence for some
> fixed pcrypt instance in a loop, back to back.
> 
> The problem is that once the last ->serial() in padata_serial_worker() is
> getting invoked, the pcrypt requests from the selftests would signal
> completion, and pcrypt_aead01 can move on and subsequently issue a DELALG.
> Upon pcrypt instance deregistration, the associated padata_shell would get
> destroyed, which in turn would unconditionally free the associated
> parallel_data instance.
> 
> If padata_serial_worker() now resumes operation after e.g. having
> previously been preempted upon the return from the last of those ->serial()
> callbacks, its subsequent accesses to pd for managing the ->refcnt will
> all be UAFs. In particular, if the memory backing pd has meanwhile been
> reused for some new parallel_data allocation, e.g in the course of
> processing another subsequent NEWALG request, the padata_serial_worker()
> might find the initial ->refcnt of one and free pd from under that NEWALG
> or the associated selftests respectively, leading to "secondary" UAFs such
> as in the Oops above.
> 
> Note that as it currently stands, a padata_shell owns a reference on its
> associated parallel_data already. So fix the UAF in padata_serial_worker()
> by making padata_free_shell() to not unconditionally free the shell's
> associated parallel_data, but to properly drop that reference via
> padata_put_pd() instead.
> 
> Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")

It looks like this issue goes back to the first padata commit.  For
instance, pd->refcnt goes to zero after the last _priv is serialized,
padata_free is called in another task, and a particularly sluggish
padata_reorder call touches pd after.

So wouldn't it be

Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface")

?

Otherwise,

Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>

> Signed-off-by: Nicolai Stange <nstange@suse.de>
> ---
>  kernel/padata.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/kernel/padata.c b/kernel/padata.c
> index 3bd1e23f089b..0bf8c80dad5a 100644
> --- a/kernel/padata.c
> +++ b/kernel/padata.c
> @@ -1112,7 +1112,7 @@ void padata_free_shell(struct padata_shell *ps)
>  
>  	mutex_lock(&ps->pinst->lock);
>  	list_del(&ps->list);
> -	padata_free_pd(rcu_dereference_protected(ps->pd, 1));
> +	padata_put_pd(rcu_dereference_protected(ps->pd, 1));
>  	mutex_unlock(&ps->pinst->lock);
>  
>  	kfree(ps);
> -- 
> 2.37.3
>
  
Daniel Jordan Oct. 28, 2022, 4:22 p.m. UTC | #2
On Fri, Oct 28, 2022 at 10:35:49AM -0400, Daniel Jordan wrote:
> On Wed, Oct 19, 2022 at 10:37:05AM +0200, Nicolai Stange wrote:
> > Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")
> 
> It looks like this issue goes back to the first padata commit.  For
> instance, pd->refcnt goes to zero after the last _priv is serialized,
> padata_free is called in another task, and a particularly sluggish
> padata_reorder call touches pd after.
> 
> So wouldn't it be
> 
> Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface")
> 
> ?

I guess not, by my own logic from 2/5.  I think it might be

Fixes: d46a5ac7a7e2 ("padata: Use a timer to handle remaining objects in the reorder queues")
  
Nicolai Stange Nov. 9, 2022, 1:02 p.m. UTC | #3
Daniel Jordan <daniel.m.jordan@oracle.com> writes:

> On Wed, Oct 19, 2022 at 10:37:05AM +0200, Nicolai Stange wrote:
>> 
>> Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")
>
> It looks like this issue goes back to the first padata commit.  For
> instance, pd->refcnt goes to zero after the last _priv is serialized,
> padata_free is called in another task, and a particularly sluggish
> padata_reorder call touches pd after.
>
> So wouldn't it be
>
> Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface")

I chose 07928d9bfc81 ("padata: Remove broken queue flushing"), because
that one reads like it fixed a couple of much more severe padata
lifetime issues, it only missed the relatively minor one addressed here,
in a sense.

Or to put it the other way around: if one were to backport this patch
here, 07928d9bfc81 should probably get picked first, I think.

But I'd be fine with any Fixes tag, of course, I don't have a strong
opinion on this matter.

Thanks!

Nicolai

>
> ?
>
> Otherwise,
>
> Acked-by: Daniel Jordan <daniel.m.jordan@oracle.com>
>
>> Signed-off-by: Nicolai Stange <nstange@suse.de>
>> ---
  
Daniel Jordan Nov. 10, 2022, 10:05 p.m. UTC | #4
On Wed, Nov 09, 2022 at 02:02:37PM +0100, Nicolai Stange wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
> 
> > On Wed, Oct 19, 2022 at 10:37:05AM +0200, Nicolai Stange wrote:
> >> 
> >> Fixes: 07928d9bfc81 ("padata: Remove broken queue flushing")
> >
> > It looks like this issue goes back to the first padata commit.  For
> > instance, pd->refcnt goes to zero after the last _priv is serialized,
> > padata_free is called in another task, and a particularly sluggish
> > padata_reorder call touches pd after.
> >
> > So wouldn't it be
> >
> > Fixes: 16295bec6398 ("padata: Generic parallelization/serialization interface")
> 
> I chose 07928d9bfc81 ("padata: Remove broken queue flushing"), because
> that one reads like it fixed a couple of much more severe padata
> lifetime issues, it only missed the relatively minor one addressed here,
> in a sense.
> 
> Or to put it the other way around: if one were to backport this patch
> here, 07928d9bfc81 should probably get picked first, I think.
> 
> But I'd be fine with any Fixes tag, of course, I don't have a strong
> opinion on this matter.

Ok, makes sense, your Fixes is fine then.  Please put that justification
in the changelog for the next version.
  

Patch

diff --git a/kernel/padata.c b/kernel/padata.c
index 3bd1e23f089b..0bf8c80dad5a 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -1112,7 +1112,7 @@  void padata_free_shell(struct padata_shell *ps)
 
 	mutex_lock(&ps->pinst->lock);
 	list_del(&ps->list);
-	padata_free_pd(rcu_dereference_protected(ps->pd, 1));
+	padata_put_pd(rcu_dereference_protected(ps->pd, 1));
 	mutex_unlock(&ps->pinst->lock);
 
 	kfree(ps);