[8/8] dm-verity: Convert from tasklet to BH workqueue

Message ID 20240130091300.2968534-9-tj@kernel.org
State New
Headers
Series [1/8] workqueue: Update lock debugging code |

Commit Message

Tejun Heo Jan. 30, 2024, 9:11 a.m. UTC
  The only generic interface to execute asynchronously in the BH context is
tasklet; however, it's marked deprecated and has some design flaws. To
replace tasklets, BH workqueue support was recently added. A BH workqueue
behaves similarly to regular workqueues except that the queued work items
are executed in the BH context.

This patch converts dm-verity from tasklet to BH workqueue.

This is a minimal conversion which doesn't rename the related names
including the "try_verify_in_tasklet" option. If this patch is applied, a
follow-up patch would be necessary. I couldn't decide whether the option
name would need to be updated too.

Only compile tested. I don't know how to verity.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Alasdair Kergon <agk@redhat.com>
Cc: Mike Snitzer <snitzer@kernel.org>
Cc: Mikulas Patocka <mpatocka@redhat.com>
Cc: dm-devel@lists.linux.dev
---
 drivers/md/dm-verity-target.c | 8 ++++----
 drivers/md/dm-verity.h        | 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)
  

Comments

Mikulas Patocka Jan. 31, 2024, 9:19 p.m. UTC | #1
On Mon, 29 Jan 2024, Tejun Heo wrote:

> The only generic interface to execute asynchronously in the BH context is
> tasklet; however, it's marked deprecated and has some design flaws. To
> replace tasklets, BH workqueue support was recently added. A BH workqueue
> behaves similarly to regular workqueues except that the queued work items
> are executed in the BH context.
> 
> This patch converts dm-verity from tasklet to BH workqueue.
> 
> This is a minimal conversion which doesn't rename the related names
> including the "try_verify_in_tasklet" option. If this patch is applied, a
> follow-up patch would be necessary. I couldn't decide whether the option
> name would need to be updated too.
> 
> Only compile tested. I don't know how to verity.

Download the cryptsetup package with "git clone 
https://gitlab.com/cryptsetup/cryptsetup" and run the testsuite: 
/autogen.sh && ./configure && make && make check

> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Alasdair Kergon <agk@redhat.com>
> Cc: Mike Snitzer <snitzer@kernel.org>
> Cc: Mikulas Patocka <mpatocka@redhat.com>
> Cc: dm-devel@lists.linux.dev
> ---
>  drivers/md/dm-verity-target.c | 8 ++++----
>  drivers/md/dm-verity.h        | 2 +-
>  2 files changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
> index 14e58ae70521..911261de2d08 100644
> --- a/drivers/md/dm-verity-target.c
> +++ b/drivers/md/dm-verity-target.c
> @@ -645,9 +645,9 @@ static void verity_work(struct work_struct *w)
>  	verity_finish_io(io, errno_to_blk_status(verity_verify_io(io)));
>  }
>  
> -static void verity_tasklet(unsigned long data)
> +static void verity_bh_work(struct work_struct *w)
>  {
> -	struct dm_verity_io *io = (struct dm_verity_io *)data;
> +	struct dm_verity_io *io = container_of(w, struct dm_verity_io, bh_work);
>  	int err;
>  
>  	io->in_tasklet = true;
> @@ -675,8 +675,8 @@ static void verity_end_io(struct bio *bio)
>  	}
>  
>  	if (static_branch_unlikely(&use_tasklet_enabled) && io->v->use_tasklet) {
> -		tasklet_init(&io->tasklet, verity_tasklet, (unsigned long)io);
> -		tasklet_schedule(&io->tasklet);
> +		INIT_WORK(&io->bh_work, verity_bh_work);
> +		queue_work(system_bh_wq, &io->bh_work);
>  	} else {
>  		INIT_WORK(&io->work, verity_work);
>  		queue_work(io->v->verify_wq, &io->work);
> diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
> index f9d522c870e6..7c16f834f31a 100644
> --- a/drivers/md/dm-verity.h
> +++ b/drivers/md/dm-verity.h
> @@ -83,7 +83,7 @@ struct dm_verity_io {
>  	struct bvec_iter iter;
>  
>  	struct work_struct work;
> -	struct tasklet_struct tasklet;
> +	struct work_struct bh_work;
>  
>  	/*
>  	 * Three variably-size fields follow this struct:

Do we really need two separate work_structs here? They are never submitted 
concurrently, so I think that one would be enough. Or, am I missing 
something?

Mikulas
  
Tejun Heo Jan. 31, 2024, 9:32 p.m. UTC | #2
Hello,

On Wed, Jan 31, 2024 at 10:19:07PM +0100, Mikulas Patocka wrote:
> > @@ -83,7 +83,7 @@ struct dm_verity_io {
> >  	struct bvec_iter iter;
> >  
> >  	struct work_struct work;
> > -	struct tasklet_struct tasklet;
> > +	struct work_struct bh_work;
> >  
> >  	/*
> >  	 * Three variably-size fields follow this struct:
> 
> Do we really need two separate work_structs here? They are never submitted 
> concurrently, so I think that one would be enough. Or, am I missing 
> something?

I don't know, so just did the dumb thing. If the caller always guarantees
that the work items are never queued at the same time, reusing is fine.
However, the followings might be useful to keep on mind:

- work_struct is pretty small - 4 pointers.

- INIT_WORK() on a queued work item isn't gonna be pretty.

- Flushing and no-concurrent-execution guarantee are broken on INIT_WORK().
  e.g. If you queue_work(), INIT_WORK(), flush_work(), the flush isn't
  actually going to wait for the work item to finish. Also, if you do
  queue_work(), INIT_WORK(), queue_work(), the two queued work item
  instances may end up running concurrently.

Muxing a single work item carries more risks of subtle bugs, but in some
cases, the way it's used is clear (e.g. sequential chaining) and that's
fine.

Thanks.
  
Mikulas Patocka Jan. 31, 2024, 10:02 p.m. UTC | #3
On Wed, 31 Jan 2024, Tejun Heo wrote:

> Hello,
> 
> On Wed, Jan 31, 2024 at 10:19:07PM +0100, Mikulas Patocka wrote:
> > > @@ -83,7 +83,7 @@ struct dm_verity_io {
> > >  	struct bvec_iter iter;
> > >  
> > >  	struct work_struct work;
> > > -	struct tasklet_struct tasklet;
> > > +	struct work_struct bh_work;
> > >  
> > >  	/*
> > >  	 * Three variably-size fields follow this struct:
> > 
> > Do we really need two separate work_structs here? They are never submitted 
> > concurrently, so I think that one would be enough. Or, am I missing 
> > something?
> 
> I don't know, so just did the dumb thing. If the caller always guarantees
> that the work items are never queued at the same time, reusing is fine.
> However, the followings might be useful to keep on mind:
> 
> - work_struct is pretty small - 4 pointers.
> 
> - INIT_WORK() on a queued work item isn't gonna be pretty.
> 
> - Flushing and no-concurrent-execution guarantee are broken on INIT_WORK().
>   e.g. If you queue_work(), INIT_WORK(), flush_work(), the flush isn't
>   actually going to wait for the work item to finish. Also, if you do
>   queue_work(), INIT_WORK(), queue_work(), the two queued work item
>   instances may end up running concurrently.
> 
> Muxing a single work item carries more risks of subtle bugs, but in some
> cases, the way it's used is clear (e.g. sequential chaining) and that's
> fine.

The code doesn't call INIT_WORK() on a queued work item and it doesn't 
flush the workqueue (it destroys it only in a situation when there are no 
work items running) so I think it's safe to use just one work_struct.

Mikulas
  
Linus Torvalds Jan. 31, 2024, 11:19 p.m. UTC | #4
On Wed, 31 Jan 2024 at 13:32, Tejun Heo <tj@kernel.org> wrote:
>
> I don't know, so just did the dumb thing. If the caller always guarantees
> that the work items are never queued at the same time, reusing is fine.

So the reason I thought it would be a good cleanup to introduce that
"atomic" workqueue thing (now "bh") was that this case literally has a
switch between "use tasklets' or "use workqueues".

So it's not even about "reusing" the workqueue, it's literally a
matter of making it always just use workqueues, and the switch then
becomes just *which* workqueue to use - system or bh.

In fact, I suspect there is very little reason ever to *not* just use
the bh one, and even the switch could be removed.

Because I think the only reason the "workqueue of tasklet" choice
existed in the first place was that workqueues were the "proper" data
structure, and the tasklet case was added later as a latency hack, and
everybody knew that tasklets were deprecated.

             Linus
  
Tejun Heo Feb. 1, 2024, 12:04 a.m. UTC | #5
Hello, Linus.

On Wed, Jan 31, 2024 at 03:19:01PM -0800, Linus Torvalds wrote:
> On Wed, 31 Jan 2024 at 13:32, Tejun Heo <tj@kernel.org> wrote:
> >
> > I don't know, so just did the dumb thing. If the caller always guarantees
> > that the work items are never queued at the same time, reusing is fine.
> 
> So the reason I thought it would be a good cleanup to introduce that
> "atomic" workqueue thing (now "bh") was that this case literally has a
> switch between "use tasklets' or "use workqueues".
> 
> So it's not even about "reusing" the workqueue, it's literally a
> matter of making it always just use workqueues, and the switch then
> becomes just *which* workqueue to use - system or bh.

Yeah, that's how the dm-crypt got converted. The patch just before this one.
This one probably can be converted the same way. I don't see the work item
being re-initialized. It probably is better to initialize the work item
together with the enclosing struct and then just queue it when needed.

Mikulas, I couldn't decide what to do with the "try_verify_in_tasklet"
option and just decided to do the minimal thing hoping that someone more
familiar with the code can take over the actual conversion. How much of user
interface commitment is that? Should it be renamed or would it be better to
leave it be?

Thanks.
  
Mike Snitzer Feb. 1, 2024, 12:07 a.m. UTC | #6
On Wed, Jan 31 2024 at  6:19P -0500,
Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Wed, 31 Jan 2024 at 13:32, Tejun Heo <tj@kernel.org> wrote:
> >
> > I don't know, so just did the dumb thing. If the caller always guarantees
> > that the work items are never queued at the same time, reusing is fine.
> 
> So the reason I thought it would be a good cleanup to introduce that
> "atomic" workqueue thing (now "bh") was that this case literally has a
> switch between "use tasklets' or "use workqueues".
> 
> So it's not even about "reusing" the workqueue, it's literally a
> matter of making it always just use workqueues, and the switch then
> becomes just *which* workqueue to use - system or bh.

DM generally always use dedicated workqueues instead of the system.

The dm-crypt tasklet's completion path did punt to the workqueue
otherwise there was use-after-free of the per-bio-data that included
the tasklet. And for verity there was fallback to workqueue if
tasklet-based verification failed. Didn't inspire confidence.

> In fact, I suspect there is very little reason ever to *not* just use
> the bh one, and even the switch could be removed.
>
> Because I think the only reason the "workqueue of tasklet" choice
> existed in the first place was that workqueues were the "proper" data
> structure, and the tasklet case was added later as a latency hack, and
> everybody knew that tasklets were deprecated.

Correct, abusing tasklets was a very contrived latency optimization.
Happy to see it all go away! (hindsight: it never should have gone in).

Mike
  
Mike Snitzer Feb. 1, 2024, 12:19 a.m. UTC | #7
On Wed, Jan 31 2024 at  7:04P -0500,
Tejun Heo <tj@kernel.org> wrote:

> Hello, Linus.
> 
> On Wed, Jan 31, 2024 at 03:19:01PM -0800, Linus Torvalds wrote:
> > On Wed, 31 Jan 2024 at 13:32, Tejun Heo <tj@kernel.org> wrote:
> > >
> > > I don't know, so just did the dumb thing. If the caller always guarantees
> > > that the work items are never queued at the same time, reusing is fine.
> > 
> > So the reason I thought it would be a good cleanup to introduce that
> > "atomic" workqueue thing (now "bh") was that this case literally has a
> > switch between "use tasklets' or "use workqueues".
> > 
> > So it's not even about "reusing" the workqueue, it's literally a
> > matter of making it always just use workqueues, and the switch then
> > becomes just *which* workqueue to use - system or bh.
> 
> Yeah, that's how the dm-crypt got converted. The patch just before this one.
> This one probably can be converted the same way. I don't see the work item
> being re-initialized. It probably is better to initialize the work item
> together with the enclosing struct and then just queue it when needed.

Sounds good.
 
> Mikulas, I couldn't decide what to do with the "try_verify_in_tasklet"
> option and just decided to do the minimal thing hoping that someone more
> familiar with the code can take over the actual conversion. How much of user
> interface commitment is that? Should it be renamed or would it be better to
> leave it be?

cryptsetup did add support for it, so I think it worthwhile to
preserve the option; but it'd be fine to have it just be a backward
compatible alias for a more appropriately named option?

Mike
  
Mike Snitzer Feb. 20, 2024, 7:44 p.m. UTC | #8
On Wed, Jan 31 2024 at  7:19P -0500,
Mike Snitzer <snitzer@kernel.org> wrote:

> On Wed, Jan 31 2024 at  7:04P -0500,
> Tejun Heo <tj@kernel.org> wrote:
> 
> > Hello, Linus.
> > 
> > On Wed, Jan 31, 2024 at 03:19:01PM -0800, Linus Torvalds wrote:
> > > On Wed, 31 Jan 2024 at 13:32, Tejun Heo <tj@kernel.org> wrote:
> > > >
> > > > I don't know, so just did the dumb thing. If the caller always guarantees
> > > > that the work items are never queued at the same time, reusing is fine.
> > > 
> > > So the reason I thought it would be a good cleanup to introduce that
> > > "atomic" workqueue thing (now "bh") was that this case literally has a
> > > switch between "use tasklets' or "use workqueues".
> > > 
> > > So it's not even about "reusing" the workqueue, it's literally a
> > > matter of making it always just use workqueues, and the switch then
> > > becomes just *which* workqueue to use - system or bh.
> > 
> > Yeah, that's how the dm-crypt got converted. The patch just before this one.
> > This one probably can be converted the same way. I don't see the work item
> > being re-initialized. It probably is better to initialize the work item
> > together with the enclosing struct and then just queue it when needed.
> 
> Sounds good.
>  
> > Mikulas, I couldn't decide what to do with the "try_verify_in_tasklet"
> > option and just decided to do the minimal thing hoping that someone more
> > familiar with the code can take over the actual conversion. How much of user
> > interface commitment is that? Should it be renamed or would it be better to
> > leave it be?
> 
> cryptsetup did add support for it, so I think it worthwhile to
> preserve the option; but it'd be fine to have it just be a backward
> compatible alias for a more appropriately named option?

Hey Tejun,

I'm not sure where things stand with the 6.9 workqueue changes to add
BH workqueue.  I had a look at your various branches and I'm not
seeing where you might have staged any conversion patches (like this
dm-verity one).

I just staged various unrelated dm-verity and dm-crypt 6.8 fixes from
Mikulas that I'll be sending to Linus later this week (for v6.8-rc6).
Those changes required rebasing 'dm-6.9' because of conflicts, here is
the dm-6.9 branch:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-6.9

So we'll definitely need to rebase your changes on dm-6.9 to convert
dm-crypt and dm-verity over to your BH workqueue.  Are you OK with
doing that or would you prefer I merge some 6.9 workqueue branch that
you have into dm-6.9? And then Mikulas and I work to make the required
DM target conversion changes?

However you'd like to do it: please let me know where you have the
latest 6.9 code the adds BH workqueue (and if you have DM target
conversion code that reflects your latest).

Thanks,
Mike
  
Tejun Heo Feb. 20, 2024, 8:05 p.m. UTC | #9
Hello, Mike.

On Tue, Feb 20, 2024 at 02:44:29PM -0500, Mike Snitzer wrote:
> I'm not sure where things stand with the 6.9 workqueue changes to add
> BH workqueue.  I had a look at your various branches and I'm not
> seeing where you might have staged any conversion patches (like this
> dm-verity one).

Yeah, the branch is for-6.9-bh-conversions in the wq tree but I haven't
queued the DM patches yet. Wanted to make sure the perf signal from TCP
conversion is okay first. FWIW, while Eric still has concerns, the initial
test didn't show any appreciable regression with production memcache
workload on our side.

> I just staged various unrelated dm-verity and dm-crypt 6.8 fixes from
> Mikulas that I'll be sending to Linus later this week (for v6.8-rc6).
> Those changes required rebasing 'dm-6.9' because of conflicts, here is
> the dm-6.9 branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-6.9
> 
> So we'll definitely need to rebase your changes on dm-6.9 to convert
> dm-crypt and dm-verity over to your BH workqueue.  Are you OK with
> doing that or would you prefer I merge some 6.9 workqueue branch that
> you have into dm-6.9? And then Mikulas and I work to make the required
> DM target conversion changes?

Oh, if you don't mind, it'd be best if you could pull wq/for-6.9 into dm-6.9
and do the conversions there.

Thank you.
  
Mike Snitzer Feb. 22, 2024, 9:24 p.m. UTC | #10
On Tue, Feb 20 2024 at  3:05P -0500,
Tejun Heo <tj@kernel.org> wrote:

> Hello, Mike.
> 
> On Tue, Feb 20, 2024 at 02:44:29PM -0500, Mike Snitzer wrote:
> > I'm not sure where things stand with the 6.9 workqueue changes to add
> > BH workqueue.  I had a look at your various branches and I'm not
> > seeing where you might have staged any conversion patches (like this
> > dm-verity one).
> 
> Yeah, the branch is for-6.9-bh-conversions in the wq tree but I haven't
> queued the DM patches yet. Wanted to make sure the perf signal from TCP
> conversion is okay first. FWIW, while Eric still has concerns, the initial
> test didn't show any appreciable regression with production memcache
> workload on our side.
> 
> > I just staged various unrelated dm-verity and dm-crypt 6.8 fixes from
> > Mikulas that I'll be sending to Linus later this week (for v6.8-rc6).
> > Those changes required rebasing 'dm-6.9' because of conflicts, here is
> > the dm-6.9 branch:
> > https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-6.9
> > 
> > So we'll definitely need to rebase your changes on dm-6.9 to convert
> > dm-crypt and dm-verity over to your BH workqueue.  Are you OK with
> > doing that or would you prefer I merge some 6.9 workqueue branch that
> > you have into dm-6.9? And then Mikulas and I work to make the required
> > DM target conversion changes?
> 
> Oh, if you don't mind, it'd be best if you could pull wq/for-6.9 into dm-6.9
> and do the conversions there.
> 
> Thank you.

I've rebased and made the changes available in this dm-6.9-bh-wq branch:
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-6.9-bh-wq

I left them both DM conversion commits attributed to use despite the
rebase (and churn of renames I did in the dm-verity commit); hopefully
you're fine with that but if not I can split the renames out.

I successfully tested using the cryptsetup testsuite ('make check').
And I've also added these changes to linux-next, via linux-dm.git's
'for-next':
https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=for-next

Mikulas, feel free to review/test and possibly add another patch for
dm-verity that only uses a single work_struct in struct dm_verity_io

Thanks,
Mike
  
Tejun Heo Feb. 23, 2024, 5:22 p.m. UTC | #11
On Thu, Feb 22, 2024 at 04:24:45PM -0500, Mike Snitzer wrote:
> I've rebased and made the changes available in this dm-6.9-bh-wq branch:
> https://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm.git/log/?h=dm-6.9-bh-wq
> 
> I left them both DM conversion commits attributed to use despite the
> rebase (and churn of renames I did in the dm-verity commit); hopefully
> you're fine with that but if not I can split the renames out.

Looks good to me.

Thanks.
  

Patch

diff --git a/drivers/md/dm-verity-target.c b/drivers/md/dm-verity-target.c
index 14e58ae70521..911261de2d08 100644
--- a/drivers/md/dm-verity-target.c
+++ b/drivers/md/dm-verity-target.c
@@ -645,9 +645,9 @@  static void verity_work(struct work_struct *w)
 	verity_finish_io(io, errno_to_blk_status(verity_verify_io(io)));
 }
 
-static void verity_tasklet(unsigned long data)
+static void verity_bh_work(struct work_struct *w)
 {
-	struct dm_verity_io *io = (struct dm_verity_io *)data;
+	struct dm_verity_io *io = container_of(w, struct dm_verity_io, bh_work);
 	int err;
 
 	io->in_tasklet = true;
@@ -675,8 +675,8 @@  static void verity_end_io(struct bio *bio)
 	}
 
 	if (static_branch_unlikely(&use_tasklet_enabled) && io->v->use_tasklet) {
-		tasklet_init(&io->tasklet, verity_tasklet, (unsigned long)io);
-		tasklet_schedule(&io->tasklet);
+		INIT_WORK(&io->bh_work, verity_bh_work);
+		queue_work(system_bh_wq, &io->bh_work);
 	} else {
 		INIT_WORK(&io->work, verity_work);
 		queue_work(io->v->verify_wq, &io->work);
diff --git a/drivers/md/dm-verity.h b/drivers/md/dm-verity.h
index f9d522c870e6..7c16f834f31a 100644
--- a/drivers/md/dm-verity.h
+++ b/drivers/md/dm-verity.h
@@ -83,7 +83,7 @@  struct dm_verity_io {
 	struct bvec_iter iter;
 
 	struct work_struct work;
-	struct tasklet_struct tasklet;
+	struct work_struct bh_work;
 
 	/*
 	 * Three variably-size fields follow this struct: