Message ID | 20221019083708.27138-4-nstange@suse.de |
---|---|
State | New |
Headers | From: Nicolai Stange <nstange@suse.de> To: Steffen Klassert <steffen.klassert@secunet.com>, Daniel Jordan <daniel.m.jordan@oracle.com> Cc: Herbert Xu <herbert@gondor.apana.org.au>, Martin Doucha <mdoucha@suse.cz>, linux-crypto@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [PATCH 3/5] padata: grab parallel_data refcnt for reorder Date: Wed, 19 Oct 2022 10:37:06 +0200 Message-Id: <20221019083708.27138-4-nstange@suse.de> In-Reply-To: <20221019083708.27138-1-nstange@suse.de> References: <20221019083708.27138-1-nstange@suse.de> |
Series |
padata: fix lifetime issues after ->serial() has completed
|
|
Commit Message
Nicolai Stange
Oct. 19, 2022, 8:37 a.m. UTC
On entry of padata_do_serial(), the in-flight padata_priv owns a reference
to the associated parallel_data instance.
However, as soon as the padata_priv got enqueued on the reorder list, it
can be completed from a different context, causing the reference to get
released in the course.
This would potentially cause UAFs from the subsequent padata_reorder()
operations invoked from the enqueueing padata_do_serial() or from the
reorder work.
Note that this is a purely theoretical concern; the problem has never
actually been observed -- it would require multiple pcrypt request
submissions racing against each other, ultimately a pcrypt instance
destruction (DELALG) shortly after request completion, as well as
unfortunate timing.
However, for the sake of correctness, it is still worth fixing.
Make padata_do_serial() grab a reference count on the parallel_data for
the subsequent reorder operation(s). As long as the padata_priv has not
been enqueued, this is safe because, as mentioned above, in-flight
padata_privs own a reference already.
Note that padata_reorder() might schedule another padata_reorder() work
and thus, care must be taken to not prematurely release that "reorder
refcount" from padata_do_serial() again in case that has happened.
Make padata_reorder() return a bool indicating whether or not a
reorder work has been scheduled. Let padata_do_serial() drop its refcount
only if this is not the case. Accordingly, make the reorder work handler,
invoke_padata_reorder(), drop it as appropriate.
A remark on the commit chosen for the Fixes tag reference below: before
commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock by using per-instance
padata queues"), the parallel_data lifetime had been tied to the
padata_instance. padata_free() and padata_stop(), respectively, issued a
synchronize_rcu() before padata_free_pd() from the instance destruction
path, rendering UAFs from the padata_do_serial()=>padata_reorder()
invocations with BHs disabled impossible AFAICS. With that, the
padata_reorder() work remains to be considered. Before
commit b128a3040935 ("padata: allocate workqueue internally"), the
workqueue got destroyed (from pcrypt), hence drained, before the padata
instance destruction, but this change moved that to after the
padata_free_pd() invocation from __padata_free(). So, while the Fixes
reference from below is most likely technically correct, I would still like
to reiterate that this problem is probably hard to trigger in practice,
even more so before commit bbefa1dd6a6d ("crypto: pcrypt - Avoid deadlock
by using per-instance padata queues").
Fixes: b128a3040935 ("padata: allocate workqueue internally")
Signed-off-by: Nicolai Stange <nstange@suse.de>
---
kernel/padata.c | 26 +++++++++++++++++++++-----
1 file changed, 21 insertions(+), 5 deletions(-)
Comments
On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
> to the associated parallel_data instance.
>
> <snip>
>
> @@ -407,7 +422,8 @@ void padata_do_serial(struct padata_priv *padata)
>  	 */
>  	smp_mb();
>
> -	padata_reorder(pd);
> +	if (!padata_reorder(pd))
> +		padata_put_pd(pd);

do_serial is supposed to be called with BHs disabled and will be in
every case after a fix for a separate issue that I plan to send this
cycle. Given that it (will soon...) always happen under RCU protection,
part of this issue could be addressed like this, which puts the expense
of dealing with this rare problem in the slow path:

diff --git a/kernel/padata.c b/kernel/padata.c
index 0bf8c80dad5a..cd6740ae6629 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -1110,6 +1110,12 @@ void padata_free_shell(struct padata_shell *ps)
 	if (!ps)
 		return;
 
+	/*
+	 * Wait for all _do_serial calls to finish to avoid touching freed pd's
+	 * and ps's.
+	 */
+	synchronize_rcu();
+
 	mutex_lock(&ps->pinst->lock);
 	list_del(&ps->list);
 	padata_put_pd(rcu_dereference_protected(ps->pd, 1));

pcrypt calls padata_free_shell() when all outstanding transforms (and
thus requests, I think) have been freed/completed, so no new task can
come into padata_reorder. synchronize_rcu() then flushes out any
remaining _reorder calls.

This doesn't deal with pending reorder_work items, but we can fix it
later in the series.

What do you think?
Daniel Jordan <daniel.m.jordan@oracle.com> writes:

> On Wed, Oct 19, 2022 at 10:37:06AM +0200, Nicolai Stange wrote:
>> On entry of padata_do_serial(), the in-flight padata_priv owns a reference
>> to the associated parallel_data instance.
>>
>> <snip>
>
> do_serial is supposed to be called with BHs disabled and will be in
> every case after a fix for a separate issue that I plan to send this
> cycle. Given that it (will soon...) always happen under RCU protection,
> part of this issue could be addressed like this, which puts the expense
> of dealing with this rare problem in the slow path:
>
> <snip>
>
> pcrypt calls padata_free_shell() when all outstanding transforms (and
> thus requests, I think) have been freed/completed, so no new task can
> come into padata_reorder. synchronize_rcu() then flushes out any
> remaining _reorder calls.
>
> This doesn't deal with pending reorder_work items, but we can fix it
> later in the series.
>
> What do you think?

Yes, I think that would work. Will you handle it alongside that fix for
a separate issue you mentioned above? Or shall I once this fix has
landed?

Thanks!

Nicolai
On Wed, Nov 09, 2022 at 02:03:15PM +0100, Nicolai Stange wrote:
> Daniel Jordan <daniel.m.jordan@oracle.com> writes:
>
> <snip>
>
> Yes, I think that would work. Will you handle it alongside that fix for
> a separate issue you mentioned above? Or shall I once this fix has
> landed?

Please go ahead and do it yourself. I'll send mine soon, I think it
should be ok for them to go in in the same cycle.
diff --git a/kernel/padata.c b/kernel/padata.c
index 0bf8c80dad5a..b79226727ef7 100644
--- a/kernel/padata.c
+++ b/kernel/padata.c
@@ -275,7 +275,7 @@ static struct padata_priv *padata_find_next(struct parallel_data *pd,
 	return padata;
 }
 
-static void padata_reorder(struct parallel_data *pd)
+static bool padata_reorder(struct parallel_data *pd)
 {
 	struct padata_instance *pinst = pd->ps->pinst;
 	int cb_cpu;
@@ -294,7 +294,7 @@ static void padata_reorder(struct parallel_data *pd)
 	 * care for all the objects enqueued during the holdtime of the lock.
 	 */
 	if (!spin_trylock_bh(&pd->lock))
-		return;
+		return false;
 
 	while (1) {
 		padata = padata_find_next(pd, true);
@@ -331,17 +331,23 @@ static void padata_reorder(struct parallel_data *pd)
 
 	reorder = per_cpu_ptr(pd->reorder_list, pd->cpu);
 	if (!list_empty(&reorder->list) && padata_find_next(pd, false))
-		queue_work(pinst->serial_wq, &pd->reorder_work);
+		return queue_work(pinst->serial_wq, &pd->reorder_work);
+
+	return false;
 }
 
 static void invoke_padata_reorder(struct work_struct *work)
 {
 	struct parallel_data *pd;
+	bool keep_refcnt;
 
 	local_bh_disable();
 	pd = container_of(work, struct parallel_data, reorder_work);
-	padata_reorder(pd);
+	keep_refcnt = padata_reorder(pd);
 	local_bh_enable();
+
+	if (!keep_refcnt)
+		padata_put_pd(pd);
 }
 
 static void padata_serial_worker(struct work_struct *serial_work)
@@ -392,6 +398,15 @@ void padata_do_serial(struct padata_priv *padata)
 	struct padata_list *reorder = per_cpu_ptr(pd->reorder_list, hashed_cpu);
 	struct padata_priv *cur;
 
+	/*
+	 * The in-flight padata owns a reference on pd. However, as
+	 * soon as it's been enqueued on the reorder list, another
+	 * task can dequeue and complete it, thereby dropping the
+	 * reference. Grab another reference here, it will eventually
+	 * be released from a reorder work, if any, or below.
+	 */
+	padata_get_pd(pd);
+
 	spin_lock(&reorder->lock);
 	/* Sort in ascending order of sequence number. */
 	list_for_each_entry_reverse(cur, &reorder->list, list)
@@ -407,7 +422,8 @@ void padata_do_serial(struct padata_priv *padata)
 	 */
 	smp_mb();
 
-	padata_reorder(pd);
+	if (!padata_reorder(pd))
+		padata_put_pd(pd);
 }
 EXPORT_SYMBOL(padata_do_serial);