[6/7] workqueue: Allow cancel_work_sync() and disable_work() from atomic contexts on BH work items
Commit Message
Now that work_grab_pending() can always grab the PENDING bit without
sleeping, the only thing that prevents allowing cancel_work_sync() of a BH
work item from an atomic context is the flushing of the in-flight instance.
When we're flushing a BH work item for cancel_work_sync(), we know that the
work item is not queued and must be executing in a BH context, which means
that it's safe to busy-wait for its completion from a non-hardirq atomic
context.
This patch updates __flush_work() so that it busy-waits when flushing a BH
work item for cancel_work_sync(). might_sleep() is pushed from
start_flush_work() to its callers: when operating on a BH work item,
__cancel_work_sync() now enforces !in_hardirq() instead of might_sleep().
This allows cancel_work_sync() and disable_work() to be called from
non-hardirq atomic contexts on BH work items.
v2: Lai pointed out that __flush_work() was accessing pool->flags outside
the RCU critical section protecting the pool pointer. Fix it by testing
and remembering the result inside the RCU critical section.
Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Lai Jiangshan <jiangshanlai@gmail.com>
---
kernel/workqueue.c | 64 +++++++++++++++++++++++++++++++++++-----------
1 file changed, 49 insertions(+), 15 deletions(-)
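
As a usage illustration, here is a minimal sketch of what this change permits. All names below (my_bh_fn, my_bh_work, my_timer_fn) are hypothetical and not part of the patch; system_bh_wq is the BH workqueue added earlier in this series. Timer callbacks run in BH context, i.e. atomic but not hardirq, so with this patch they may cancel a BH work item synchronously:

#include <linux/workqueue.h>
#include <linux/timer.h>

/* Hypothetical BH work item; assumed queued via queue_work(system_bh_wq, ...). */
static void my_bh_fn(struct work_struct *work)
{
	/* executes in BH (softirq) context */
}
static DECLARE_WORK(my_bh_work, my_bh_fn);

static void my_timer_fn(struct timer_list *t)
{
	/* Timer callbacks run in BH context: atomic, but not hardirq. */
	cancel_work_sync(&my_bh_work);	/* busy-waits instead of sleeping */
}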
Comments
Hello
On Thu, Feb 22, 2024 at 1:43 AM Tejun Heo <tj@kernel.org> wrote:
> @@ -4077,11 +4076,37 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
>
>  	rcu_read_lock();
>  	pool = start_flush_work(work, &barr, from_cancel);
> +	is_bh = pool && (pool->flags & POOL_BH);
>  	rcu_read_unlock();
>  	if (!pool)
>  		return false;
>
> -	wait_for_completion(&barr.done);
> +	if (is_bh && from_cancel) {
Can "*work_data_bits(work) & WORK_OFFQ_BH" be used here?
If so, the previous patch will not be needed.
Thanks.
Lai
> +		/*
> +		 * We're flushing a BH work item which is being canceled. It
> +		 * must have been executing during start_flush_work() and can't
> +		 * currently be queued. If @work is still executing, we know it
> +		 * is running in the BH context and thus can be busy-waited.
> +		 *
> +		 * On RT, prevent a live lock when %current has preempted soft
> +		 * interrupt processing or is preventing ksoftirqd from running,
> +		 * by repeatedly flipping BH. If the BH work item runs on a
> +		 * different CPU then this has no effect other than doing the
> +		 * BH disable/enable dance for nothing. This is copied from
> +		 * kernel/softirq.c::tasklet_unlock_spin_wait().
> +		 */
> +		while (!try_wait_for_completion(&barr.done)) {
> +			if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
> +				local_bh_disable();
> +				local_bh_enable();
> +			} else {
> +				cpu_relax();
> +			}
> +		}
> +	} else {
> +		wait_for_completion(&barr.done);
> +	}
> +
>  	destroy_work_on_stack(&barr.work);
>  	return true;
>  }
Hello,
On Thu, Feb 22, 2024 at 12:36:29PM +0800, Lai Jiangshan wrote:
> On Thu, Feb 22, 2024 at 1:43 AM Tejun Heo <tj@kernel.org> wrote:
>
> > @@ -4077,11 +4076,37 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
> >
> >  	rcu_read_lock();
> >  	pool = start_flush_work(work, &barr, from_cancel);
> > +	is_bh = pool && (pool->flags & POOL_BH);
> >  	rcu_read_unlock();
> >  	if (!pool)
> >  		return false;
> >
> > -	wait_for_completion(&barr.done);
> > +	if (is_bh && from_cancel) {
>
> Can "*work_data_bits(work) & WORK_OFFQ_BH" be used here?
> If so, the previous patch will not be needed.
Hmm... yeah, if we test from_cancel first, we should know that the work item
is offq and then can depend on OFFQ_BH. Maybe I'm missing something. Will
try that.
Thanks.
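
For reference, a test based on the work item's own data word, as suggested above, might look roughly like the following inside __flush_work() (a sketch, not the committed code). Once the cancel path has grabbed the PENDING bit, @work is off-queue, so its data word carries OFFQ bits rather than a pwq pointer:

	if (from_cancel) {
		unsigned long data = *work_data_bits(work);

		/* off-queue after a cancel: data holds OFFQ bits, not a pwq */
		if (!WARN_ON_ONCE(data & WORK_STRUCT_PWQ) &&
		    (data & WORK_OFFQ_BH)) {
			/* busy-wait as in the hunk below */
		}
	}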
@@ -4013,8 +4013,6 @@ static struct worker_pool *start_flush_work(struct work_struct *work,
 	struct pool_workqueue *pwq;
 	struct workqueue_struct *wq;
 
-	might_sleep();
-
 	pool = get_work_pool(work);
 	if (!pool)
 		return NULL;
@@ -4068,6 +4066,7 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 {
 	struct wq_barrier barr;
 	struct worker_pool *pool;
+	bool is_bh;
 
 	if (WARN_ON(!wq_online))
 		return false;
@@ -4077,11 +4076,37 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 
 	rcu_read_lock();
 	pool = start_flush_work(work, &barr, from_cancel);
+	is_bh = pool && (pool->flags & POOL_BH);
 	rcu_read_unlock();
 	if (!pool)
 		return false;
 
-	wait_for_completion(&barr.done);
+	if (is_bh && from_cancel) {
+		/*
+		 * We're flushing a BH work item which is being canceled. It
+		 * must have been executing during start_flush_work() and can't
+		 * currently be queued. If @work is still executing, we know it
+		 * is running in the BH context and thus can be busy-waited.
+		 *
+		 * On RT, prevent a live lock when %current has preempted soft
+		 * interrupt processing or is preventing ksoftirqd from running,
+		 * by repeatedly flipping BH. If the BH work item runs on a
+		 * different CPU then this has no effect other than doing the
+		 * BH disable/enable dance for nothing. This is copied from
+		 * kernel/softirq.c::tasklet_unlock_spin_wait().
+		 */
+		while (!try_wait_for_completion(&barr.done)) {
+			if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+				local_bh_disable();
+				local_bh_enable();
+			} else {
+				cpu_relax();
+			}
+		}
+	} else {
+		wait_for_completion(&barr.done);
+	}
+
 	destroy_work_on_stack(&barr.work);
 	return true;
 }
@@ -4099,6 +4124,7 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
  */
 bool flush_work(struct work_struct *work)
 {
+	might_sleep();
 	return __flush_work(work, false);
 }
 EXPORT_SYMBOL_GPL(flush_work);
@@ -4188,6 +4214,11 @@ static bool __cancel_work_sync(struct work_struct *work, u32 cflags)
 
 	ret = __cancel_work(work, cflags | WORK_CANCEL_DISABLE);
 
+	if (*work_data_bits(work) & WORK_OFFQ_BH)
+		WARN_ON_ONCE(in_hardirq());
+	else
+		might_sleep();
+
 	/*
 	 * Skip __flush_work() during early boot when we know that @work isn't
 	 * executing. This allows canceling during early boot.
@@ -4214,19 +4245,19 @@ EXPORT_SYMBOL(cancel_work);
  * cancel_work_sync - cancel a work and wait for it to finish
  * @work: the work to cancel
  *
- * Cancel @work and wait for its execution to finish. This function
- * can be used even if the work re-queues itself or migrates to
- * another workqueue. On return from this function, @work is
- * guaranteed to be not pending or executing on any CPU.
+ * Cancel @work and wait for its execution to finish. This function can be used
+ * even if the work re-queues itself or migrates to another workqueue. On return
+ * from this function, @work is guaranteed to be not pending or executing on any
+ * CPU as long as there aren't racing enqueues.
  *
- * cancel_work_sync(&delayed_work->work) must not be used for
- * delayed_work's. Use cancel_delayed_work_sync() instead.
+ * cancel_work_sync(&delayed_work->work) must not be used for delayed_work's.
+ * Use cancel_delayed_work_sync() instead.
  *
- * The caller must ensure that the workqueue on which @work was last
- * queued can't be destroyed before this function returns.
+ * Must be called from a sleepable context if @work was last queued on a non-BH
+ * workqueue. Can also be called from non-hardirq atomic contexts including BH
+ * if @work was last queued on a BH workqueue.
  *
- * Return:
- * %true if @work was pending, %false otherwise.
+ * Returns %true if @work was pending, %false otherwise.
  */
 bool cancel_work_sync(struct work_struct *work)
 {
@@ -4296,8 +4327,11 @@ EXPORT_SYMBOL_GPL(disable_work);
  * Similar to disable_work() but also wait for @work to finish if currently
  * executing.
  *
- * Must be called from a sleepable context. Returns %true if @work was pending,
- * %false otherwise.
+ * Must be called from a sleepable context if @work was last queued on a non-BH
+ * workqueue. Can also be called from non-hardirq atomic contexts including BH
+ * if @work was last queued on a BH workqueue.
+ *
+ * Returns %true if @work was pending, %false otherwise.
  */
 bool disable_work_sync(struct work_struct *work)
 {
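
As a follow-on usage note, the relaxed context rules combine naturally with disable_work()/enable_work(). A hypothetical quiesce path for the BH work item sketched earlier (illustrative names only):

static void my_quiesce(void)
{
	/* OK from BH or other non-hardirq atomic context for BH items. */
	disable_work_sync(&my_bh_work);
	/* here @my_bh_work is not executing and further queueing fails */
	enable_work(&my_bh_work);
}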