[v6,1/3] ublk: add opcode offsets for DRV_IN/DRV_OUT

Message ID 20230706130930.64283-2-nmi@metaspace.dk
State New
Headers
Series ublk: enable zoned storage support |

Commit Message

Andreas Hindborg July 6, 2023, 1:09 p.m. UTC
  From: Andreas Hindborg <a.hindborg@samsung.com>

Ublk zoned storage support relies on DRV_IN handling for zone report.
Prepare for this change by adding offsets for the DRV_IN/DRV_OUT commands.

Also add parenthesis to existing opcodes for better macro hygiene.

Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
---
 include/uapi/linux/ublk_cmd.h | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)
  

Comments

Damien Le Moal July 6, 2023, 11:50 p.m. UTC | #1
On 7/6/23 22:09, Andreas Hindborg wrote:
> From: Andreas Hindborg <a.hindborg@samsung.com>
> 
> Ublk zoned storage support relies on DRV_IN handling for zone report.
> Prepare for this change by adding offsets for the DRV_IN/DRV_OUT commands.
> 
> Also add parenthesis to existing opcodes for better macro hygiene.
> 
> Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
> ---
>  include/uapi/linux/ublk_cmd.h | 18 ++++++++++++++----
>  1 file changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
> index 4b8558db90e1..2ebb8d5d827a 100644
> --- a/include/uapi/linux/ublk_cmd.h
> +++ b/include/uapi/linux/ublk_cmd.h
> @@ -229,12 +229,22 @@ struct ublksrv_ctrl_dev_info {
>  	__u64   reserved2;
>  };
>  
> -#define		UBLK_IO_OP_READ		0
> +#define		UBLK_IO_OP_READ			0
>  #define		UBLK_IO_OP_WRITE		1
>  #define		UBLK_IO_OP_FLUSH		2
> -#define		UBLK_IO_OP_DISCARD	3
> -#define		UBLK_IO_OP_WRITE_SAME	4
> -#define		UBLK_IO_OP_WRITE_ZEROES	5
> +#define		UBLK_IO_OP_DISCARD		3
> +#define		UBLK_IO_OP_WRITE_SAME		4
> +#define		UBLK_IO_OP_WRITE_ZEROES		5
> +/*
> + * Passthrough (driver private) operation codes range between

This is unclear... Here, what does "driver" refer to ? If it is the ublk
kernel driver, than these commands should not be defined in the uapi
header file, they should be defined in drivers/block/ublk.h. However, if
these are for the user space driver, like all the other operations, then
let's clearly state so. But then, I still not understand why these need
a different naming pattern using the "__UBLK" prefix...

> + * __UBLK_IO_OP_DRV_IN_START and __UBLK_IO_OP_DRV_IN_END for REQ_OP_DRV_IN type
> + * operations and between __UBLK_IO_OP_DRV_OUT_START and
> + * __UBLK_IO_OP_DRV_OUT_END for REQ_OP_DRV_OUT type operations.
> + */
> +#define		__UBLK_IO_OP_DRV_IN_START	32
> +#define		__UBLK_IO_OP_DRV_IN_END		96
> +#define		__UBLK_IO_OP_DRV_OUT_START	__UBLK_IO_OP_DRV_IN_END
> +#define		__UBLK_IO_OP_DRV_OUT_END	160
>  
>  #define		UBLK_IO_F_FAILFAST_DEV		(1U << 8)
>  #define		UBLK_IO_F_FAILFAST_TRANSPORT	(1U << 9)
  
Ming Lei July 7, 2023, 12:59 a.m. UTC | #2
On Fri, Jul 07, 2023 at 08:50:01AM +0900, Damien Le Moal wrote:
> On 7/6/23 22:09, Andreas Hindborg wrote:
> > From: Andreas Hindborg <a.hindborg@samsung.com>
> > 
> > Ublk zoned storage support relies on DRV_IN handling for zone report.
> > Prepare for this change by adding offsets for the DRV_IN/DRV_OUT commands.
> > 
> > Also add parenthesis to existing opcodes for better macro hygiene.
> > 
> > Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
> > ---
> >  include/uapi/linux/ublk_cmd.h | 18 ++++++++++++++----
> >  1 file changed, 14 insertions(+), 4 deletions(-)
> > 
> > diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
> > index 4b8558db90e1..2ebb8d5d827a 100644
> > --- a/include/uapi/linux/ublk_cmd.h
> > +++ b/include/uapi/linux/ublk_cmd.h
> > @@ -229,12 +229,22 @@ struct ublksrv_ctrl_dev_info {
> >  	__u64   reserved2;
> >  };
> >  
> > -#define		UBLK_IO_OP_READ		0
> > +#define		UBLK_IO_OP_READ			0
> >  #define		UBLK_IO_OP_WRITE		1
> >  #define		UBLK_IO_OP_FLUSH		2
> > -#define		UBLK_IO_OP_DISCARD	3
> > -#define		UBLK_IO_OP_WRITE_SAME	4
> > -#define		UBLK_IO_OP_WRITE_ZEROES	5
> > +#define		UBLK_IO_OP_DISCARD		3
> > +#define		UBLK_IO_OP_WRITE_SAME		4
> > +#define		UBLK_IO_OP_WRITE_ZEROES		5
> > +/*
> > + * Passthrough (driver private) operation codes range between
> 
> This is unclear... Here, what does "driver" refer to ? If it is the ublk
> kernel driver, than these commands should not be defined in the uapi
> header file, they should be defined in drivers/block/ublk.h. However, if
> these are for the user space driver, like all the other operations, then

Like normal IO, these passthrough requests needs userspace to handle too,
usually they just belong to specific ublk target, such as report zones.,
so here it is part of UAPI.

But yes, we should document it clearly, maybe something below?

	Ublk passthrough operation code ranges, and each passthrough
	operation provides generic interface between ublk kernel driver
	and ublk userspace, and this interface is usually used for handling
	generic block layer request, such as command of zoned report zones.
	Passthrough operation is only needed iff ublk kernel driver has to
	be involved for handling this operation.

> let's clearly state so. But then, I still not understand why these need
> a different naming pattern using the "__UBLK" prefix...

I think __UBLK just meant we don't suggest userspace to use it directly,
since the added macros are just for making ranges for DRV_IN and DRV_OUT,
so we can check command direction easily be using this start/end info in
both sides.


Thanks, 
Ming
  
Damien Le Moal July 7, 2023, 1:42 a.m. UTC | #3
On 7/7/23 09:59, Ming Lei wrote:
> On Fri, Jul 07, 2023 at 08:50:01AM +0900, Damien Le Moal wrote:
>> On 7/6/23 22:09, Andreas Hindborg wrote:
>>> From: Andreas Hindborg <a.hindborg@samsung.com>
>>>
>>> Ublk zoned storage support relies on DRV_IN handling for zone report.
>>> Prepare for this change by adding offsets for the DRV_IN/DRV_OUT commands.
>>>
>>> Also add parenthesis to existing opcodes for better macro hygiene.
>>>
>>> Signed-off-by: Andreas Hindborg <a.hindborg@samsung.com>
>>> ---
>>>  include/uapi/linux/ublk_cmd.h | 18 ++++++++++++++----
>>>  1 file changed, 14 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
>>> index 4b8558db90e1..2ebb8d5d827a 100644
>>> --- a/include/uapi/linux/ublk_cmd.h
>>> +++ b/include/uapi/linux/ublk_cmd.h
>>> @@ -229,12 +229,22 @@ struct ublksrv_ctrl_dev_info {
>>>  	__u64   reserved2;
>>>  };
>>>  
>>> -#define		UBLK_IO_OP_READ		0
>>> +#define		UBLK_IO_OP_READ			0
>>>  #define		UBLK_IO_OP_WRITE		1
>>>  #define		UBLK_IO_OP_FLUSH		2
>>> -#define		UBLK_IO_OP_DISCARD	3
>>> -#define		UBLK_IO_OP_WRITE_SAME	4
>>> -#define		UBLK_IO_OP_WRITE_ZEROES	5
>>> +#define		UBLK_IO_OP_DISCARD		3
>>> +#define		UBLK_IO_OP_WRITE_SAME		4
>>> +#define		UBLK_IO_OP_WRITE_ZEROES		5
>>> +/*
>>> + * Passthrough (driver private) operation codes range between
>>
>> This is unclear... Here, what does "driver" refer to ? If it is the ublk
>> kernel driver, than these commands should not be defined in the uapi
>> header file, they should be defined in drivers/block/ublk.h. However, if
>> these are for the user space driver, like all the other operations, then
> 
> Like normal IO, these passthrough requests needs userspace to handle too,
> usually they just belong to specific ublk target, such as report zones.,
> so here it is part of UAPI.
> 
> But yes, we should document it clearly, maybe something below?
> 
> 	Ublk passthrough operation code ranges, and each passthrough
> 	operation provides generic interface between ublk kernel driver
> 	and ublk userspace, and this interface is usually used for handling
> 	generic block layer request, such as command of zoned report zones.
> 	Passthrough operation is only needed iff ublk kernel driver has to
> 	be involved for handling this operation.

Yes, that is better.

> 
>> let's clearly state so. But then, I still not understand why these need
>> a different naming pattern using the "__UBLK" prefix...
> 
> I think __UBLK just meant we don't suggest userspace to use it directly,
> since the added macros are just for making ranges for DRV_IN and DRV_OUT,
> so we can check command direction easily be using this start/end info in
> both sides.

Personally, I would still prefer to not add this "__" prefix as these
are operations that the ublk user driver will have to deal with, like
the other ones. So I do not see the point of that prefix. But no strong
feeling about that :)

> 
> 
> Thanks, 
> Ming
>
  
Christoph Hellwig July 10, 2023, 6:52 a.m. UTC | #4
On Fri, Jul 07, 2023 at 08:59:03AM +0800, Ming Lei wrote:
> > let's clearly state so. But then, I still not understand why these need
> > a different naming pattern using the "__UBLK" prefix...
> 
> I think __UBLK just meant we don't suggest userspace to use it directly,
> since the added macros are just for making ranges for DRV_IN and DRV_OUT,
> so we can check command direction easily be using this start/end info in
> both sides.

Folks, please stop coupling a uapi (or on-disk protocol) too tightly
to Linux internals.  Think of what makes sense as a communication
protocol, not what is an internal kernel interface.

REPORT_ZONES is a sensible command, and supported in ATA/SCSI/NVMe in
one way or another.  In Linux it is a synchronous method call right now
for one reason or another, and most implementation map it to a
passthrough command - be that the actual protocol command or something
internal for virtio.

So for ublk this is just another command like any other, that needs to
be defined and documented.  Nothing internal or driver specific.
  
Ming Lei July 10, 2023, 9:27 a.m. UTC | #5
On Sun, Jul 09, 2023 at 11:52:39PM -0700, Christoph Hellwig wrote:
> On Fri, Jul 07, 2023 at 08:59:03AM +0800, Ming Lei wrote:
> > > let's clearly state so. But then, I still not understand why these need
> > > a different naming pattern using the "__UBLK" prefix...
> > 
> > I think __UBLK just meant we don't suggest userspace to use it directly,
> > since the added macros are just for making ranges for DRV_IN and DRV_OUT,
> > so we can check command direction easily be using this start/end info in
> > both sides.
> 
> Folks, please stop coupling a uapi (or on-disk protocol) too tightly
> to Linux internals.  Think of what makes sense as a communication
> protocol, not what is an internal kernel interface.
> 
> REPORT_ZONES is a sensible command, and supported in ATA/SCSI/NVMe in
> one way or another.  In Linux it is a synchronous method call right now
> for one reason or another, and most implementation map it to a
> passthrough command - be that the actual protocol command or something
> internal for virtio.
> 
> So for ublk this is just another command like any other, that needs to
> be defined and documented.  Nothing internal or driver specific.
 
Yes, that is exactly what we are doing.

The added macros of UBLK_IO_OP_DRV_IN_START[END] are just for supporting
more ublk passthrough commands, and the motivation is for running
check(such as buffer direction) in two sides easily.

However, I think it is just fine to delay to add it until introducing
the 2nd ublk pt command.

Thanks, 
Ming
  
Christoph Hellwig July 10, 2023, 9:32 a.m. UTC | #6
On Mon, Jul 10, 2023 at 05:27:23PM +0800, Ming Lei wrote:
> Yes, that is exactly what we are doing.
> 
> The added macros of UBLK_IO_OP_DRV_IN_START[END] are just for supporting
> more ublk passthrough commands, and the motivation is for running
> check(such as buffer direction) in two sides easily.
> 
> However, I think it is just fine to delay to add it until introducing
> the 2nd ublk pt command.

The concept of a passthrough command just doesn't make sense for an
on the wire protocol.  It is a linux concept that distinguished between
the Linux synthetic command like REQ_OP_READ/WRITE/DISCARD etc that are
well defined and can be used by file systems and other consumers, and
ways to pass through arbitrary blobs only known by the driver.

Anything in a wire protocol needs to be very well defined in that
protocol completely indpendent of what Linux concept it maps to.
Especially as the Linux concepts can change, and fairly frequently do.
  
Ming Lei July 10, 2023, 10:02 a.m. UTC | #7
On Mon, Jul 10, 2023 at 02:32:44AM -0700, Christoph Hellwig wrote:
> On Mon, Jul 10, 2023 at 05:27:23PM +0800, Ming Lei wrote:
> > Yes, that is exactly what we are doing.
> > 
> > The added macros of UBLK_IO_OP_DRV_IN_START[END] are just for supporting
> > more ublk passthrough commands, and the motivation is for running
> > check(such as buffer direction) in two sides easily.
> > 
> > However, I think it is just fine to delay to add it until introducing
> > the 2nd ublk pt command.
> 
> The concept of a passthrough command just doesn't make sense for an
> on the wire protocol.  It is a linux concept that distinguished between
> the Linux synthetic command like REQ_OP_READ/WRITE/DISCARD etc that are
> well defined and can be used by file systems and other consumers, and
> ways to pass through arbitrary blobs only known by the driver.
> 
> Anything in a wire protocol needs to be very well defined in that
> protocol completely indpendent of what Linux concept it maps to.
> Especially as the Linux concepts can change, and fairly frequently do.

Yeah, you are right wrt. linux pt command, and here we shouldn't use
the term of passthrough.

Actually the UAPI is trying to define interface between driver and
userspace, which is just like interface between driver and hardware, such
as, how nvme/sd_zbc is dealing with actual hardware wrt. reporting
zones.

Thanks,
Ming
  
Andreas Hindborg July 11, 2023, 6:23 a.m. UTC | #8
Christoph Hellwig <hch@infradead.org> writes:

> On Mon, Jul 10, 2023 at 05:27:23PM +0800, Ming Lei wrote:
>> Yes, that is exactly what we are doing.
>> 
>> The added macros of UBLK_IO_OP_DRV_IN_START[END] are just for supporting
>> more ublk passthrough commands, and the motivation is for running
>> check(such as buffer direction) in two sides easily.
>> 
>> However, I think it is just fine to delay to add it until introducing
>> the 2nd ublk pt command.
>
> The concept of a passthrough command just doesn't make sense for an
> on the wire protocol.  It is a linux concept that distinguished between
> the Linux synthetic command like REQ_OP_READ/WRITE/DISCARD etc that are
> well defined and can be used by file systems and other consumers, and
> ways to pass through arbitrary blobs only known by the driver.

Yet most on-the-wire protocols for actual hardware does support this
some way or another. But I agree that for ublk it is probably not
needed. It would probably be easier to talk to the ublk daemon through
other means than passthrough in the block layer.

>
> Anything in a wire protocol needs to be very well defined in that
> protocol completely indpendent of what Linux concept it maps to.
> Especially as the Linux concepts can change, and fairly frequently do.

I somewhat agree in the sense that for consistency, we should either
move zone management commands to the DRV_OUT range OR move report_zones
out of this special range and just next to the zone management
operations. I like the latter option better, and I would love to see the
block layer do the same at some point. It feels backwards that
report_zones get special treatment all over the place.

Best regards,
Andreas
  
Christoph Hellwig July 11, 2023, 8:22 a.m. UTC | #9
On Tue, Jul 11, 2023 at 08:23:40AM +0200, Andreas Hindborg (Samsung) wrote:
> Yet most on-the-wire protocols for actual hardware does support this
> some way or another.

Supports what?  Passthrough?  No.

> I somewhat agree in the sense that for consistency, we should either
> move zone management commands to the DRV_OUT range OR move report_zones
> out of this special range and just next to the zone management
> operations. I like the latter option better, and I would love to see the
> block layer do the same at some point. It feels backwards that
> report_zones get special treatment all over the place.

DRV_IN/OUT is purely a Linux concept.  It doesn't make any sense for
a wire protocol.
  
Andreas Hindborg July 11, 2023, 9:02 a.m. UTC | #10
Christoph Hellwig <hch@infradead.org> writes:

> On Tue, Jul 11, 2023 at 08:23:40AM +0200, Andreas Hindborg (Samsung) wrote:
>> Yet most on-the-wire protocols for actual hardware does support this
>> some way or another.
>
> Supports what?  Passthrough?  No.

Both SCSI and NVMe has command identifier ranges reserved for vendor
specific commands. I would assume that one use of these is to implement
passthrough channels to a device for testing out new interfaces. Just
guessing though.

>
>> I somewhat agree in the sense that for consistency, we should either
>> move zone management commands to the DRV_OUT range OR move report_zones
>> out of this special range and just next to the zone management
>> operations. I like the latter option better, and I would love to see the
>> block layer do the same at some point. It feels backwards that
>> report_zones get special treatment all over the place.
>
> DRV_IN/OUT is purely a Linux concept.  It doesn't make any sense for
> a wire protocol.

Ok 👍

BR Andreas
  
Christoph Hellwig July 11, 2023, 9:27 a.m. UTC | #11
On Tue, Jul 11, 2023 at 11:02:15AM +0200, Andreas Hindborg (Samsung) wrote:
> 
> Christoph Hellwig <hch@infradead.org> writes:
> 
> > On Tue, Jul 11, 2023 at 08:23:40AM +0200, Andreas Hindborg (Samsung) wrote:
> >> Yet most on-the-wire protocols for actual hardware does support this
> >> some way or another.
> >
> > Supports what?  Passthrough?  No.
> 
> Both SCSI and NVMe has command identifier ranges reserved for vendor
> specific commands. I would assume that one use of these is to implement
> passthrough channels to a device for testing out new interfaces. Just
> guessing though.

Vendor specific commands is an entirely different concept from Linux
passthrough requests.
  
Andreas Hindborg July 11, 2023, 10:15 a.m. UTC | #12
Christoph Hellwig <hch@infradead.org> writes:

> On Tue, Jul 11, 2023 at 11:02:15AM +0200, Andreas Hindborg (Samsung) wrote:
>> 
>> Christoph Hellwig <hch@infradead.org> writes:
>> 
>> > On Tue, Jul 11, 2023 at 08:23:40AM +0200, Andreas Hindborg (Samsung) wrote:
>> >> Yet most on-the-wire protocols for actual hardware does support this
>> >> some way or another.
>> >
>> > Supports what?  Passthrough?  No.
>> 
>> Both SCSI and NVMe has command identifier ranges reserved for vendor
>> specific commands. I would assume that one use of these is to implement
>> passthrough channels to a device for testing out new interfaces. Just
>> guessing though.
>
> Vendor specific commands is an entirely different concept from Linux
> passthrough requests.

And yet they are somewhat similar, in the sense that they allow the user
of a protocol to express semantics that is not captured in the
established protocol. Uring command passthrough -> request passthrough
-> vendor specific commands. They sort of map well in terms of what they
allow the user to achieve. Or did I misunderstand something completely?
  
Christoph Hellwig July 11, 2023, 12:01 p.m. UTC | #13
On Tue, Jul 11, 2023 at 12:15:18PM +0200, Andreas Hindborg (Samsung) wrote:
> And yet they are somewhat similar, in the sense that they allow the user
> of a protocol to express semantics that is not captured in the
> established protocol. Uring command passthrough -> request passthrough
> -> vendor specific commands. They sort of map well in terms of what they
> allow the user to achieve. Or did I misunderstand something completely?

Well, there is a relationship, but it's one way.

Vendor specific command are basically always going to be used through
a passthrough interface, because they aren't standardized.

But most commands used through a passthrough interface are normal
standardized commands, just either used in a way not supported by
the normal Linux interface or just in creative ways.
  

Patch

diff --git a/include/uapi/linux/ublk_cmd.h b/include/uapi/linux/ublk_cmd.h
index 4b8558db90e1..2ebb8d5d827a 100644
--- a/include/uapi/linux/ublk_cmd.h
+++ b/include/uapi/linux/ublk_cmd.h
@@ -229,12 +229,22 @@  struct ublksrv_ctrl_dev_info {
 	__u64   reserved2;
 };
 
-#define		UBLK_IO_OP_READ		0
+#define		UBLK_IO_OP_READ			0
 #define		UBLK_IO_OP_WRITE		1
 #define		UBLK_IO_OP_FLUSH		2
-#define		UBLK_IO_OP_DISCARD	3
-#define		UBLK_IO_OP_WRITE_SAME	4
-#define		UBLK_IO_OP_WRITE_ZEROES	5
+#define		UBLK_IO_OP_DISCARD		3
+#define		UBLK_IO_OP_WRITE_SAME		4
+#define		UBLK_IO_OP_WRITE_ZEROES		5
+/*
+ * Passthrough (driver private) operation codes range between
+ * __UBLK_IO_OP_DRV_IN_START and __UBLK_IO_OP_DRV_IN_END for REQ_OP_DRV_IN type
+ * operations and between __UBLK_IO_OP_DRV_OUT_START and
+ * __UBLK_IO_OP_DRV_OUT_END for REQ_OP_DRV_OUT type operations.
+ */
+#define		__UBLK_IO_OP_DRV_IN_START	32
+#define		__UBLK_IO_OP_DRV_IN_END		96
+#define		__UBLK_IO_OP_DRV_OUT_START	__UBLK_IO_OP_DRV_IN_END
+#define		__UBLK_IO_OP_DRV_OUT_END	160
 
 #define		UBLK_IO_F_FAILFAST_DEV		(1U << 8)
 #define		UBLK_IO_F_FAILFAST_TRANSPORT	(1U << 9)