[RFC,v2,1/3] drivers/accel: define kconfig and register a new major

Message ID 20221102203405.1797491-2-ogabbay@kernel.org
State New
Headers
Series new subsystem for compute accelerator devices |

Commit Message

Oded Gabbay Nov. 2, 2022, 8:34 p.m. UTC
  Add a new Kconfig for the accel subsystem. The Kconfig currently
contains only the basic CONFIG_ACCEL option that will be used to
decide whether to compile the accel registration code. That code will
be called directly from the DRM core code.

The accelerator devices will be exposed to the user space with a new,
dedicated major number - 261.

The accel init function registers the new major number as a char device
and create corresponding sysfs and debugfs root entries, similar to
what is done in DRM init function.

I added a new header called drm_accel.h to include/drm/, that will hold
the prototypes of the accel_drv.c functions. In case CONFIG_ACCEL is
set to 'N', that header will contain empty inline implementations of
those functions, to allow DRM core code to compile successfully
without dependency on CONFIG_ACCEL.

I Updated the MAINTAINERS file accordingly with the newly added folder
and I have taken the liberty to appropriate the dri-devel mailing list
and the dri-devel IRC channel for the accel subsystem.

Signed-off-by: Oded Gabbay <ogabbay@kernel.org>
---
Changes in v2:
 - Created accel_drv.c that will hold the accel framework core functions,
   instead of embedding the code inside drm core functions.
 - Created drm/drm_accel.h
 - Removed all #ifdef CONFIG_ACCEL from drm_drv.c

 Documentation/admin-guide/devices.txt |   5 ++
 MAINTAINERS                           |   8 ++
 drivers/Kconfig                       |   2 +
 drivers/Makefile                      |   3 +
 drivers/accel/Kconfig                 |  24 ++++++
 drivers/accel/Makefile                |  10 +++
 drivers/accel/accel_drv.c             | 112 ++++++++++++++++++++++++++
 include/drm/drm_accel.h               |  31 +++++++
 8 files changed, 195 insertions(+)
 create mode 100644 drivers/accel/Kconfig
 create mode 100644 drivers/accel/Makefile
 create mode 100644 drivers/accel/accel_drv.c
 create mode 100644 include/drm/drm_accel.h

--
2.25.1
  

Comments

Jeffrey Hugo Nov. 2, 2022, 9:04 p.m. UTC | #1
On 11/2/2022 2:34 PM, Oded Gabbay wrote:
> diff --git a/drivers/accel/accel_drv.c b/drivers/accel/accel_drv.c
> new file mode 100644
> index 000000000000..6132765ea054
> --- /dev/null
> +++ b/drivers/accel/accel_drv.c
> @@ -0,0 +1,112 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +/*
> + * Copyright 2022 HabanaLabs, Ltd.
> + * All Rights Reserved.
> + *
> + */
> +
> +#include <linux/module.h>

Alphebetical order?

> +#include <linux/debugfs.h>
> +#include <linux/device.h>
> +
> +#include <drm/drm_accel.h>
> +#include <drm/drm_ioctl.h>
> +#include <drm/drm_print.h>
> +
> +static struct dentry *accel_debugfs_root;
> +struct class *accel_class;

Static?
  
Randy Dunlap Nov. 2, 2022, 10:58 p.m. UTC | #2
On 11/2/22 13:34, Oded Gabbay wrote:
> diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
> new file mode 100644
> index 000000000000..282ea24f90c5
> --- /dev/null
> +++ b/drivers/accel/Kconfig
> @@ -0,0 +1,24 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Compute Acceleration device configuration
> +#
> +# This framework provides support for compute acceleration devices, such
> +# as, but not limited to, Machine-Learning and Deep-Learning acceleration
> +# devices
> +#
> +menuconfig ACCEL
> +	tristate "Compute Acceleration Framework"
> +	depends on DRM
> +	help
> +	  Framework for device drivers of compute acceleration devices, such
> +	  as, but not limited to, Machine-Learning and Deep-Learning
> +	  acceleration devices.
> +	  If you say Y here, you need to select the module that's right for
> +	  your acceleration device from the list below.
> +	  This framework is integrated with the DRM subsystem as compute
> +	  accelerators and GPUs share a lot in common and can use almost the
> +	  same infrastructure code.
> +	  Having said that, acceleration devices will have a different
> +	  major number than GPUs, and will be exposed to user-space using
> +	  different device files, called accel/accel* (in /dev, sysfs
> +	  and debugfs)

Please add a period at the end of the help text.

+	  and debugfs).
  
Greg KH Nov. 3, 2022, 12:32 a.m. UTC | #3
On Wed, Nov 02, 2022 at 10:34:03PM +0200, Oded Gabbay wrote:
> --- /dev/null
> +++ b/drivers/accel/Kconfig
> @@ -0,0 +1,24 @@
> +# SPDX-License-Identifier: GPL-2.0-only
> +#
> +# Compute Acceleration device configuration
> +#
> +# This framework provides support for compute acceleration devices, such
> +# as, but not limited to, Machine-Learning and Deep-Learning acceleration
> +# devices
> +#
> +menuconfig ACCEL
> +	tristate "Compute Acceleration Framework"
> +	depends on DRM
> +	help
> +	  Framework for device drivers of compute acceleration devices, such
> +	  as, but not limited to, Machine-Learning and Deep-Learning
> +	  acceleration devices.
> +	  If you say Y here, you need to select the module that's right for
> +	  your acceleration device from the list below.
> +	  This framework is integrated with the DRM subsystem as compute
> +	  accelerators and GPUs share a lot in common and can use almost the
> +	  same infrastructure code.
> +	  Having said that, acceleration devices will have a different
> +	  major number than GPUs, and will be exposed to user-space using
> +	  different device files, called accel/accel* (in /dev, sysfs
> +	  and debugfs)

Module name if "M" is chosen?


> +static char *accel_devnode(struct device *dev, umode_t *mode)
> +{
> +	return kasprintf(GFP_KERNEL, "accel/%s", dev_name(dev));
> +}
> +
> +static CLASS_ATTR_STRING(accel_version, 0444, "accel 1.0.0 20221018");

What is a version number doing here?

Please no, I understand that DRM has this, but it really does not make
sense for any in-kernel code.  And that's not how sysfs is supposed to
work anyway (if a file is present, the value is documented, if the file
is not present, the value is just not there, userspace has to handle
it all.)

> +
> +/**
> + * accel_sysfs_init - initialize sysfs helpers
> + *
> + * This is used to create the ACCEL class, which is the implicit parent of any
> + * other top-level ACCEL sysfs objects.
> + *
> + * You must call accel_sysfs_destroy() to release the allocated resources.
> + *
> + * Return: 0 on success, negative error code on failure.
> + */
> +static int accel_sysfs_init(void)
> +{
> +	int err;
> +
> +	accel_class = class_create(THIS_MODULE, "accel");
> +	if (IS_ERR(accel_class))
> +		return PTR_ERR(accel_class);
> +
> +	err = class_create_file(accel_class, &class_attr_accel_version.attr);

Hint, if you ever find yourself adding sysfs files "by hand" like this,
you are doing something wrong.  The driver code should create them
automatically for you by setting up default groups, _OR_ as in this
case, you shouldn't be adding the file at all :)

> +static void accel_sysfs_destroy(void)
> +{
> +	if (IS_ERR_OR_NULL(accel_class))
> +		return;
> +	class_remove_file(accel_class, &class_attr_accel_version.attr);

No need to manually destroy files when you remove a device.  But you
will remove this file anyway for the next version of this patch, so it's
not a big deal :)

> +	class_destroy(accel_class);
> +	accel_class = NULL;
> +}
> +
> +static int accel_stub_open(struct inode *inode, struct file *filp)
> +{
> +	DRM_DEBUG("Operation not supported");

ftrace is wonderful, please use that and not hand-rolled custom "I am
here!" type messages like this.

thanks,

greg k-h
  
Oded Gabbay Nov. 3, 2022, 1:28 p.m. UTC | #4
On Wed, Nov 2, 2022 at 11:04 PM Jeffrey Hugo <quic_jhugo@quicinc.com> wrote:
>
> On 11/2/2022 2:34 PM, Oded Gabbay wrote:
> > diff --git a/drivers/accel/accel_drv.c b/drivers/accel/accel_drv.c
> > new file mode 100644
> > index 000000000000..6132765ea054
> > --- /dev/null
> > +++ b/drivers/accel/accel_drv.c
> > @@ -0,0 +1,112 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +/*
> > + * Copyright 2022 HabanaLabs, Ltd.
> > + * All Rights Reserved.
> > + *
> > + */
> > +
> > +#include <linux/module.h>
>
> Alphebetical order?
ok
>
> > +#include <linux/debugfs.h>
> > +#include <linux/device.h>
> > +
> > +#include <drm/drm_accel.h>
> > +#include <drm/drm_ioctl.h>
> > +#include <drm/drm_print.h>
> > +
> > +static struct dentry *accel_debugfs_root;
> > +struct class *accel_class;
>
> Static?
>
yes, thx.
  
Oded Gabbay Nov. 3, 2022, 1:29 p.m. UTC | #5
On Thu, Nov 3, 2022 at 12:58 AM Randy Dunlap <rdunlap@infradead.org> wrote:
>
>
>
> On 11/2/22 13:34, Oded Gabbay wrote:
> > diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
> > new file mode 100644
> > index 000000000000..282ea24f90c5
> > --- /dev/null
> > +++ b/drivers/accel/Kconfig
> > @@ -0,0 +1,24 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +#
> > +# Compute Acceleration device configuration
> > +#
> > +# This framework provides support for compute acceleration devices, such
> > +# as, but not limited to, Machine-Learning and Deep-Learning acceleration
> > +# devices
> > +#
> > +menuconfig ACCEL
> > +     tristate "Compute Acceleration Framework"
> > +     depends on DRM
> > +     help
> > +       Framework for device drivers of compute acceleration devices, such
> > +       as, but not limited to, Machine-Learning and Deep-Learning
> > +       acceleration devices.
> > +       If you say Y here, you need to select the module that's right for
> > +       your acceleration device from the list below.
> > +       This framework is integrated with the DRM subsystem as compute
> > +       accelerators and GPUs share a lot in common and can use almost the
> > +       same infrastructure code.
> > +       Having said that, acceleration devices will have a different
> > +       major number than GPUs, and will be exposed to user-space using
> > +       different device files, called accel/accel* (in /dev, sysfs
> > +       and debugfs)
>
> Please add a period at the end of the help text.
>
> +         and debugfs).
sure, thx.
Oded
>
> --
> ~Randy
  
Oded Gabbay Nov. 3, 2022, 1:31 p.m. UTC | #6
On Thu, Nov 3, 2022 at 2:31 AM Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
>
> On Wed, Nov 02, 2022 at 10:34:03PM +0200, Oded Gabbay wrote:
> > --- /dev/null
> > +++ b/drivers/accel/Kconfig
> > @@ -0,0 +1,24 @@
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +#
> > +# Compute Acceleration device configuration
> > +#
> > +# This framework provides support for compute acceleration devices, such
> > +# as, but not limited to, Machine-Learning and Deep-Learning acceleration
> > +# devices
> > +#
> > +menuconfig ACCEL
> > +     tristate "Compute Acceleration Framework"
> > +     depends on DRM
> > +     help
> > +       Framework for device drivers of compute acceleration devices, such
> > +       as, but not limited to, Machine-Learning and Deep-Learning
> > +       acceleration devices.
> > +       If you say Y here, you need to select the module that's right for
> > +       your acceleration device from the list below.
> > +       This framework is integrated with the DRM subsystem as compute
> > +       accelerators and GPUs share a lot in common and can use almost the
> > +       same infrastructure code.
> > +       Having said that, acceleration devices will have a different
> > +       major number than GPUs, and will be exposed to user-space using
> > +       different device files, called accel/accel* (in /dev, sysfs
> > +       and debugfs)
>
> Module name if "M" is chosen?
Will add
>
>
> > +static char *accel_devnode(struct device *dev, umode_t *mode)
> > +{
> > +     return kasprintf(GFP_KERNEL, "accel/%s", dev_name(dev));
> > +}
> > +
> > +static CLASS_ATTR_STRING(accel_version, 0444, "accel 1.0.0 20221018");
>
> What is a version number doing here?
>
> Please no, I understand that DRM has this, but it really does not make
> sense for any in-kernel code.  And that's not how sysfs is supposed to
> work anyway (if a file is present, the value is documented, if the file
> is not present, the value is just not there, userspace has to handle
> it all.)
Actually I don't know if drm even uses that. I just copied it to be
identical to drm's sysfs, but
as accel doesn't have any history, I can remove it.
>
> > +
> > +/**
> > + * accel_sysfs_init - initialize sysfs helpers
> > + *
> > + * This is used to create the ACCEL class, which is the implicit parent of any
> > + * other top-level ACCEL sysfs objects.
> > + *
> > + * You must call accel_sysfs_destroy() to release the allocated resources.
> > + *
> > + * Return: 0 on success, negative error code on failure.
> > + */
> > +static int accel_sysfs_init(void)
> > +{
> > +     int err;
> > +
> > +     accel_class = class_create(THIS_MODULE, "accel");
> > +     if (IS_ERR(accel_class))
> > +             return PTR_ERR(accel_class);
> > +
> > +     err = class_create_file(accel_class, &class_attr_accel_version.attr);
>
> Hint, if you ever find yourself adding sysfs files "by hand" like this,
> you are doing something wrong.  The driver code should create them
> automatically for you by setting up default groups, _OR_ as in this
> case, you shouldn't be adding the file at all :)
ok, removed.

>
> > +static void accel_sysfs_destroy(void)
> > +{
> > +     if (IS_ERR_OR_NULL(accel_class))
> > +             return;
> > +     class_remove_file(accel_class, &class_attr_accel_version.attr);
>
> No need to manually destroy files when you remove a device.  But you
> will remove this file anyway for the next version of this patch, so it's
> not a big deal :)
:)
>
> > +     class_destroy(accel_class);
> > +     accel_class = NULL;
> > +}
> > +
> > +static int accel_stub_open(struct inode *inode, struct file *filp)
> > +{
> > +     DRM_DEBUG("Operation not supported");
>
> ftrace is wonderful, please use that and not hand-rolled custom "I am
> here!" type messages like this.
I'll just remove it as this line is being replaced anyway in the next patch.
Thanks,
Oded
>
> thanks,
>
> greg k-h
  
Oded Gabbay Nov. 3, 2022, 8:39 p.m. UTC | #7
On Thu, Nov 3, 2022 at 3:31 PM Oded Gabbay <ogabbay@kernel.org> wrote:
>
> On Thu, Nov 3, 2022 at 2:31 AM Greg Kroah-Hartman
> <gregkh@linuxfoundation.org> wrote:
> >
> > On Wed, Nov 02, 2022 at 10:34:03PM +0200, Oded Gabbay wrote:
> > > --- /dev/null
> > > +++ b/drivers/accel/Kconfig
> > > @@ -0,0 +1,24 @@
> > > +# SPDX-License-Identifier: GPL-2.0-only
> > > +#
> > > +# Compute Acceleration device configuration
> > > +#
> > > +# This framework provides support for compute acceleration devices, such
> > > +# as, but not limited to, Machine-Learning and Deep-Learning acceleration
> > > +# devices
> > > +#
> > > +menuconfig ACCEL
> > > +     tristate "Compute Acceleration Framework"
> > > +     depends on DRM
> > > +     help
> > > +       Framework for device drivers of compute acceleration devices, such
> > > +       as, but not limited to, Machine-Learning and Deep-Learning
> > > +       acceleration devices.
> > > +       If you say Y here, you need to select the module that's right for
> > > +       your acceleration device from the list below.
> > > +       This framework is integrated with the DRM subsystem as compute
> > > +       accelerators and GPUs share a lot in common and can use almost the
> > > +       same infrastructure code.
> > > +       Having said that, acceleration devices will have a different
> > > +       major number than GPUs, and will be exposed to user-space using
> > > +       different device files, called accel/accel* (in /dev, sysfs
> > > +       and debugfs)
> >
> > Module name if "M" is chosen?
> Will add
So, unfortunately, the path of doing accel as a kernel module won't
work cleanly (Thanks stanislaw for pointing this out to me).
The reason is the circular dependency between drm and accel. drm calls
accel exported symbols during init and when devices are registering
(all the minor handling), and accel calls drm exported symbols because
I don't want to duplicate the entire drm core code.

I'll keep this menuconfig to provide the ability to disable this code
for people who think it is too "experimental". And in the future, when
drivers will join this subsystem, they will need this place for their
kconfig.

Thanks,
Oded
  
Randy Dunlap Nov. 3, 2022, 11:01 p.m. UTC | #8
On 11/3/22 13:39, Oded Gabbay wrote:
> On Thu, Nov 3, 2022 at 3:31 PM Oded Gabbay <ogabbay@kernel.org> wrote:
>>
>> On Thu, Nov 3, 2022 at 2:31 AM Greg Kroah-Hartman
>> <gregkh@linuxfoundation.org> wrote:
>>>
>>> On Wed, Nov 02, 2022 at 10:34:03PM +0200, Oded Gabbay wrote:
>>>> --- /dev/null
>>>> +++ b/drivers/accel/Kconfig
>>>> @@ -0,0 +1,24 @@
>>>> +# SPDX-License-Identifier: GPL-2.0-only
>>>> +#
>>>> +# Compute Acceleration device configuration
>>>> +#
>>>> +# This framework provides support for compute acceleration devices, such
>>>> +# as, but not limited to, Machine-Learning and Deep-Learning acceleration
>>>> +# devices
>>>> +#
>>>> +menuconfig ACCEL
>>>> +     tristate "Compute Acceleration Framework"
>>>> +     depends on DRM
>>>> +     help
>>>> +       Framework for device drivers of compute acceleration devices, such
>>>> +       as, but not limited to, Machine-Learning and Deep-Learning
>>>> +       acceleration devices.
>>>> +       If you say Y here, you need to select the module that's right for
>>>> +       your acceleration device from the list below.
>>>> +       This framework is integrated with the DRM subsystem as compute
>>>> +       accelerators and GPUs share a lot in common and can use almost the
>>>> +       same infrastructure code.
>>>> +       Having said that, acceleration devices will have a different
>>>> +       major number than GPUs, and will be exposed to user-space using
>>>> +       different device files, called accel/accel* (in /dev, sysfs
>>>> +       and debugfs)
>>>
>>> Module name if "M" is chosen?
>> Will add
> So, unfortunately, the path of doing accel as a kernel module won't
> work cleanly (Thanks stanislaw for pointing this out to me).
> The reason is the circular dependency between drm and accel. drm calls
> accel exported symbols during init and when devices are registering
> (all the minor handling), and accel calls drm exported symbols because
> I don't want to duplicate the entire drm core code.

But DRM is a tristate symbol, so during drm init (loadable module), couldn't
it call accel init code (loadable module)?

Or are you saying that they only work together if both of them are builtin?

> I'll keep this menuconfig to provide the ability to disable this code
> for people who think it is too "experimental". And in the future, when
> drivers will join this subsystem, they will need this place for their
> kconfig.
> 
> Thanks,
> Oded
  
Stanislaw Gruszka Nov. 4, 2022, 7:23 a.m. UTC | #9
On Thu, Nov 03, 2022 at 04:01:08PM -0700, Randy Dunlap wrote:
> >>> Module name if "M" is chosen?
> >> Will add
> > So, unfortunately, the path of doing accel as a kernel module won't
> > work cleanly (Thanks stanislaw for pointing this out to me).
> > The reason is the circular dependency between drm and accel. drm calls
> > accel exported symbols during init and when devices are registering
> > (all the minor handling), and accel calls drm exported symbols because
> > I don't want to duplicate the entire drm core code.
> 
> But DRM is a tristate symbol, so during drm init (loadable module), couldn't
> it call accel init code (loadable module)?
> 
> Or are you saying that they only work together if both of them are builtin?

Yes, with current state of the patches, we can not build code as modules.
There are symbols in accel that are from drm and we use accel symbols
in drm. This could be fixed by separating symbols accel requires in 
separate module i.e. drm_file_helper.ko, however Oded proposed to make
CONFIG_ACCEL compile option for DRM and all accel code will be
build in drm.ko . I think that ok, since accel is not big. 

Regards
Stanislaw
  
Jason Gunthorpe Nov. 7, 2022, 12:56 p.m. UTC | #10
On Thu, Nov 03, 2022 at 10:39:36PM +0200, Oded Gabbay wrote:
> On Thu, Nov 3, 2022 at 3:31 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> >
> > On Thu, Nov 3, 2022 at 2:31 AM Greg Kroah-Hartman
> > <gregkh@linuxfoundation.org> wrote:
> > >
> > > On Wed, Nov 02, 2022 at 10:34:03PM +0200, Oded Gabbay wrote:
> > > > --- /dev/null
> > > > +++ b/drivers/accel/Kconfig
> > > > @@ -0,0 +1,24 @@
> > > > +# SPDX-License-Identifier: GPL-2.0-only
> > > > +#
> > > > +# Compute Acceleration device configuration
> > > > +#
> > > > +# This framework provides support for compute acceleration devices, such
> > > > +# as, but not limited to, Machine-Learning and Deep-Learning acceleration
> > > > +# devices
> > > > +#
> > > > +menuconfig ACCEL
> > > > +     tristate "Compute Acceleration Framework"
> > > > +     depends on DRM
> > > > +     help
> > > > +       Framework for device drivers of compute acceleration devices, such
> > > > +       as, but not limited to, Machine-Learning and Deep-Learning
> > > > +       acceleration devices.
> > > > +       If you say Y here, you need to select the module that's right for
> > > > +       your acceleration device from the list below.
> > > > +       This framework is integrated with the DRM subsystem as compute
> > > > +       accelerators and GPUs share a lot in common and can use almost the
> > > > +       same infrastructure code.
> > > > +       Having said that, acceleration devices will have a different
> > > > +       major number than GPUs, and will be exposed to user-space using
> > > > +       different device files, called accel/accel* (in /dev, sysfs
> > > > +       and debugfs)
> > >
> > > Module name if "M" is chosen?
> > Will add
> So, unfortunately, the path of doing accel as a kernel module won't
> work cleanly (Thanks stanislaw for pointing this out to me).
> The reason is the circular dependency between drm and accel. drm calls
> accel exported symbols during init and when devices are registering
> (all the minor handling), and accel calls drm exported symbols because
> I don't want to duplicate the entire drm core code.

I really don't think this is the right way to integrate with
DRM. Accel should be a layer over top of DRM, not have these wakky
co-dependencies.

The fact you are running into stuff like this already smells really
bad.

Jason
  
Oded Gabbay Nov. 7, 2022, 1:01 p.m. UTC | #11
On Mon, Nov 7, 2022 at 2:56 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Thu, Nov 03, 2022 at 10:39:36PM +0200, Oded Gabbay wrote:
> > On Thu, Nov 3, 2022 at 3:31 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> > >
> > > On Thu, Nov 3, 2022 at 2:31 AM Greg Kroah-Hartman
> > > <gregkh@linuxfoundation.org> wrote:
> > > >
> > > > On Wed, Nov 02, 2022 at 10:34:03PM +0200, Oded Gabbay wrote:
> > > > > --- /dev/null
> > > > > +++ b/drivers/accel/Kconfig
> > > > > @@ -0,0 +1,24 @@
> > > > > +# SPDX-License-Identifier: GPL-2.0-only
> > > > > +#
> > > > > +# Compute Acceleration device configuration
> > > > > +#
> > > > > +# This framework provides support for compute acceleration devices, such
> > > > > +# as, but not limited to, Machine-Learning and Deep-Learning acceleration
> > > > > +# devices
> > > > > +#
> > > > > +menuconfig ACCEL
> > > > > +     tristate "Compute Acceleration Framework"
> > > > > +     depends on DRM
> > > > > +     help
> > > > > +       Framework for device drivers of compute acceleration devices, such
> > > > > +       as, but not limited to, Machine-Learning and Deep-Learning
> > > > > +       acceleration devices.
> > > > > +       If you say Y here, you need to select the module that's right for
> > > > > +       your acceleration device from the list below.
> > > > > +       This framework is integrated with the DRM subsystem as compute
> > > > > +       accelerators and GPUs share a lot in common and can use almost the
> > > > > +       same infrastructure code.
> > > > > +       Having said that, acceleration devices will have a different
> > > > > +       major number than GPUs, and will be exposed to user-space using
> > > > > +       different device files, called accel/accel* (in /dev, sysfs
> > > > > +       and debugfs)
> > > >
> > > > Module name if "M" is chosen?
> > > Will add
> > So, unfortunately, the path of doing accel as a kernel module won't
> > work cleanly (Thanks stanislaw for pointing this out to me).
> > The reason is the circular dependency between drm and accel. drm calls
> > accel exported symbols during init and when devices are registering
> > (all the minor handling), and accel calls drm exported symbols because
> > I don't want to duplicate the entire drm core code.
>
> I really don't think this is the right way to integrate with
> DRM. Accel should be a layer over top of DRM, not have these wakky
> co-dependencies.
>
> The fact you are running into stuff like this already smells really
> bad.
>
> Jason
I don't agree with your statement that it should be "a layer over top of DRM".
Anything on top of DRM is a device driver.
Accel is not a device driver, it is a new type of drm minor / drm driver.

Please look at v3 of the patch-set. There I abandoned the idea of
having accel as a separate module and instead it is part of drm.ko, as
it should be because it is just a new drm minor.

The only alternative imo to that is to abandon the idea of reusing
drm, and just make an independant accel core code.

Oded
  
Jason Gunthorpe Nov. 7, 2022, 1:10 p.m. UTC | #12
On Mon, Nov 07, 2022 at 03:01:08PM +0200, Oded Gabbay wrote:
> I don't agree with your statement that it should be "a layer over top of DRM".
> Anything on top of DRM is a device driver.
> Accel is not a device driver, it is a new type of drm minor / drm driver.

Yeah, I still think this is not the right way, you are getting almost
nothing from DRM and making everything more complicated in the
process.

> The only alternative imo to that is to abandon the idea of reusing
> drm, and just make an independant accel core code.

Not quite really, layer it properly and librarize parts of DRM into
things accel can re-use so they are not intimately tied to the DRM
struct device notion.

IMHO this is much better, because accel has very little need of DRM to
manage a struct device/cdev in the first place.

Jason
  
Stanislaw Gruszka Nov. 7, 2022, 1:25 p.m. UTC | #13
Hi

On Mon, Nov 07, 2022 at 09:10:36AM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 07, 2022 at 03:01:08PM +0200, Oded Gabbay wrote:
> > I don't agree with your statement that it should be "a layer over top of DRM".
> > Anything on top of DRM is a device driver.
> > Accel is not a device driver, it is a new type of drm minor / drm driver.
> 
> Yeah, I still think this is not the right way, you are getting almost
> nothing from DRM and making everything more complicated in the
> process.
> 
> > The only alternative imo to that is to abandon the idea of reusing
> > drm, and just make an independant accel core code.
> 
> Not quite really, layer it properly and librarize parts of DRM into
> things accel can re-use so they are not intimately tied to the DRM
> struct device notion.

What do you mean by "struct device notion" ? struct drm_devce ? 

Regards
Stanislaw
  
Oded Gabbay Nov. 7, 2022, 2:02 p.m. UTC | #14
On Mon, Nov 7, 2022 at 3:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Mon, Nov 07, 2022 at 03:01:08PM +0200, Oded Gabbay wrote:
> > I don't agree with your statement that it should be "a layer over top of DRM".
> > Anything on top of DRM is a device driver.
> > Accel is not a device driver, it is a new type of drm minor / drm driver.
>
> Yeah, I still think this is not the right way, you are getting almost
> nothing from DRM and making everything more complicated in the
> process.
>
> > The only alternative imo to that is to abandon the idea of reusing
> > drm, and just make an independant accel core code.
>
> Not quite really, layer it properly and librarize parts of DRM into
> things accel can re-use so they are not intimately tied to the DRM
> struct device notion.
>
> IMHO this is much better, because accel has very little need of DRM to
> manage a struct device/cdev in the first place.
>
> Jason
I'm not following. How can an accel device be a new type of drm_minor,
if it doesn't have access to all its functions and members ?
How will accel device leverage, for example, the GEM code without
being a drm_minor ?

Librarizing parts of DRM sounds nice in theory but the reality is that
everything there is interconnected, all the structures are
interdependent.
I would have to re-write the entire DRM library to make such a thing
work. I don't think this was the intention.

The current design makes the accel device an integral part of drm,
with very minimal code duplication and without re-writing DRM.

Oded
  
Jason Gunthorpe Nov. 7, 2022, 2:10 p.m. UTC | #15
On Mon, Nov 07, 2022 at 04:02:01PM +0200, Oded Gabbay wrote:
> On Mon, Nov 7, 2022 at 3:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Mon, Nov 07, 2022 at 03:01:08PM +0200, Oded Gabbay wrote:
> > > I don't agree with your statement that it should be "a layer over top of DRM".
> > > Anything on top of DRM is a device driver.
> > > Accel is not a device driver, it is a new type of drm minor / drm driver.
> >
> > Yeah, I still think this is not the right way, you are getting almost
> > nothing from DRM and making everything more complicated in the
> > process.
> >
> > > The only alternative imo to that is to abandon the idea of reusing
> > > drm, and just make an independant accel core code.
> >
> > Not quite really, layer it properly and librarize parts of DRM into
> > things accel can re-use so they are not intimately tied to the DRM
> > struct device notion.
> >
> > IMHO this is much better, because accel has very little need of DRM to
> > manage a struct device/cdev in the first place.
> >
> > Jason
> I'm not following. How can an accel device be a new type of drm_minor,
> if it doesn't have access to all its functions and members ?

"drm_minor" is not necessary anymore. Strictly managing minor numbers
lost its value years ago when /dev/ was reorganized. Just use
dynamic minors fully.

> How will accel device leverage, for example, the GEM code without
> being a drm_minor ?

Split GEM into a library so it doesn't require that.

> Librarizing parts of DRM sounds nice in theory but the reality is that
> everything there is interconnected, all the structures are
> interdependent.

Yes, the kernel is full of stuff that needs improving. Let's not take
shortcuts.

> I would have to re-write the entire DRM library to make such a thing
> work. I don't think this was the intention.

Not necessarily you, whoever someday needs GEM would have to do some
work.

> The current design makes the accel device an integral part of drm,
> with very minimal code duplication and without re-writing DRM.

And it smells bad, you can't even make it into a proper module. Who
knows what other problems will come.

Jason
  
Oded Gabbay Nov. 7, 2022, 3:53 p.m. UTC | #16
On Mon, Nov 7, 2022 at 4:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Mon, Nov 07, 2022 at 04:02:01PM +0200, Oded Gabbay wrote:
> > On Mon, Nov 7, 2022 at 3:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Mon, Nov 07, 2022 at 03:01:08PM +0200, Oded Gabbay wrote:
> > > > I don't agree with your statement that it should be "a layer over top of DRM".
> > > > Anything on top of DRM is a device driver.
> > > > Accel is not a device driver, it is a new type of drm minor / drm driver.
> > >
> > > Yeah, I still think this is not the right way, you are getting almost
> > > nothing from DRM and making everything more complicated in the
> > > process.
> > >
> > > > The only alternative imo to that is to abandon the idea of reusing
> > > > drm, and just make an independant accel core code.
> > >
> > > Not quite really, layer it properly and librarize parts of DRM into
> > > things accel can re-use so they are not intimately tied to the DRM
> > > struct device notion.
> > >
> > > IMHO this is much better, because accel has very little need of DRM to
> > > manage a struct device/cdev in the first place.
> > >
> > > Jason
> > I'm not following. How can an accel device be a new type of drm_minor,
> > if it doesn't have access to all its functions and members ?
>
> "drm_minor" is not necessary anymore. Strictly managing minor numbers
> lost its value years ago when /dev/ was reorganized. Just use
> dynamic minors fully.
drm minor is not just about handling minor numbers. It contains the
entire code to manage devices that register with drm framework (e.g.
supply callbacks to file operations), manage their lifecycle,
resources (e.g. automatic free of resources on release), sysfs,
debugfs, etc.

To take all of that out of drm.ko and make it a separate kernel module
is a big change, which I don't know if the drm people even want me to
do.

>
> > How will accel device leverage, for example, the GEM code without
> > being a drm_minor ?
>
> Split GEM into a library so it doesn't require that.
I don't see the advantage of doing that over defining accel as a new
type of drm minor.
>
> > Librarizing parts of DRM sounds nice in theory but the reality is that
> > everything there is interconnected, all the structures are
> > interdependent.
>
> Yes, the kernel is full of stuff that needs improving. Let's not take
> shortcuts.
It's not about shortcuts. It's about a different way to solve this
issue which I don't think is anyway hacky or inappropriate.

>
> > I would have to re-write the entire DRM library to make such a thing
> > work. I don't think this was the intention.
>
> Not necessarily you, whoever someday needs GEM would have to do some
> work.
>
> > The current design makes the accel device an integral part of drm,
> > with very minimal code duplication and without re-writing DRM.
>
> And it smells bad, you can't even make it into a proper module. Who
> knows what other problems will come.
I would argue that if accel is part of drm, it doesn't have to be a
proper module. I don't see that as a hard requirement.
Oded

>
> Jason
  
Jason Gunthorpe Nov. 7, 2022, 4:30 p.m. UTC | #17
On Mon, Nov 07, 2022 at 05:53:55PM +0200, Oded Gabbay wrote:
> On Mon, Nov 7, 2022 at 4:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> >
> > On Mon, Nov 07, 2022 at 04:02:01PM +0200, Oded Gabbay wrote:
> > > On Mon, Nov 7, 2022 at 3:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > >
> > > > On Mon, Nov 07, 2022 at 03:01:08PM +0200, Oded Gabbay wrote:
> > > > > I don't agree with your statement that it should be "a layer over top of DRM".
> > > > > Anything on top of DRM is a device driver.
> > > > > Accel is not a device driver, it is a new type of drm minor / drm driver.
> > > >
> > > > Yeah, I still think this is not the right way, you are getting almost
> > > > nothing from DRM and making everything more complicated in the
> > > > process.
> > > >
> > > > > The only alternative imo to that is to abandon the idea of reusing
> > > > > drm, and just make an independant accel core code.
> > > >
> > > > Not quite really, layer it properly and librarize parts of DRM into
> > > > things accel can re-use so they are not intimately tied to the DRM
> > > > struct device notion.
> > > >
> > > > IMHO this is much better, because accel has very little need of DRM to
> > > > manage a struct device/cdev in the first place.
> > > >
> > > > Jason
> > > I'm not following. How can an accel device be a new type of drm_minor,
> > > if it doesn't have access to all its functions and members ?
> >
> > "drm_minor" is not necessary anymore. Strictly managing minor numbers
> > lost its value years ago when /dev/ was reorganized. Just use
> > dynamic minors fully.
> drm minor is not just about handling minor numbers. It contains the
> entire code to manage devices that register with drm framework (e.g.
> supply callbacks to file operations), manage their lifecycle,
> resources (e.g. automatic free of resources on release), sysfs,
> debugfs, etc.

This is why you are having such troubles, this is already good library
code. You don't need DRM to wrapper debugfs APIs, for instance. We
have devm, though maybe it is not a good idea, etc

Greg already pointed out the sysfs was not being done correctly
anyhow.

I don't think DRM is improving on these core kernel services. Just use
the normal stuff directly.

> > > How will accel device leverage, for example, the GEM code without
> > > being a drm_minor ?
> >
> > Split GEM into a library so it doesn't require that.
> I don't see the advantage of doing that over defining accel as a new
> type of drm minor.

Making things into smaller libraries is recognized as a far better
kernel approach than trying to make a gigantic wide midlayer that stuffs
itself into everything. LWN called this the "midlayer mistake" and
wrote about the pitfalls a long time ago:

https://lwn.net/Articles/336262/

It is exactly what you are experiencing trying to stretch a
midlayer even further out.

Jason
  
Oded Gabbay Nov. 7, 2022, 7:27 p.m. UTC | #18
On Mon, Nov 7, 2022 at 6:31 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Mon, Nov 07, 2022 at 05:53:55PM +0200, Oded Gabbay wrote:
> > On Mon, Nov 7, 2022 at 4:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > >
> > > On Mon, Nov 07, 2022 at 04:02:01PM +0200, Oded Gabbay wrote:
> > > > On Mon, Nov 7, 2022 at 3:10 PM Jason Gunthorpe <jgg@nvidia.com> wrote:
> > > > >
> > > > > On Mon, Nov 07, 2022 at 03:01:08PM +0200, Oded Gabbay wrote:
> > > > > > I don't agree with your statement that it should be "a layer over top of DRM".
> > > > > > Anything on top of DRM is a device driver.
> > > > > > Accel is not a device driver, it is a new type of drm minor / drm driver.
> > > > >
> > > > > Yeah, I still think this is not the right way, you are getting almost
> > > > > nothing from DRM and making everything more complicated in the
> > > > > process.
> > > > >
> > > > > > The only alternative imo to that is to abandon the idea of reusing
> > > > > > drm, and just make an independant accel core code.
> > > > >
> > > > > Not quite really, layer it properly and librarize parts of DRM into
> > > > > things accel can re-use so they are not intimately tied to the DRM
> > > > > struct device notion.
> > > > >
> > > > > IMHO this is much better, because accel has very little need of DRM to
> > > > > manage a struct device/cdev in the first place.
> > > > >
> > > > > Jason
> > > > I'm not following. How can an accel device be a new type of drm_minor,
> > > > if it doesn't have access to all its functions and members ?
> > >
> > > "drm_minor" is not necessary anymore. Strictly managing minor numbers
> > > lost its value years ago when /dev/ was reorganized. Just use
> > > dynamic minors fully.
> > drm minor is not just about handling minor numbers. It contains the
> > entire code to manage devices that register with drm framework (e.g.
> > supply callbacks to file operations), manage their lifecycle,
> > resources (e.g. automatic free of resources on release), sysfs,
> > debugfs, etc.
>
> This is why you are having such troubles, this is already good library
> code. You don't need DRM to wrapper debugfs APIs, for instance. We
> have devm, though maybe it is not a good idea, etc
>
> Greg already pointed out the sysfs was not being done correctly
> anyhow.
>
> I don't think DRM is improving on these core kernel services. Just use
> the normal stuff directly.
I get what you are saying but if I do all that, then how is this
subsystem related to DRM and re-using its code ? (at least at this
stage)
btw, using the basic stuff directly was my original intention, if you
remember the original accel mail thread from July/August.
And then we all decided in LPC that we shouldn't do that and instead
accel should use the DRM code and just expose a new major+minor for
the new drivers.

So, something doesn't add up...
imo, we need to choose between doing accel either as a small new
feature in drm, or as an independent subsystem.
I just don't see how I do the former without calling drm code directly
and using all its wrappers.

>
> > > > How will accel device leverage, for example, the GEM code without
> > > > being a drm_minor ?
> > >
> > > Split GEM into a library so it doesn't require that.
> > I don't see the advantage of doing that over defining accel as a new
> > type of drm minor.
>
> Making things into smaller libraries is recognized as a far better
> kernel approach than trying to make a gigantic wide midlayer that stuffs
> itself into everything. LWN called this the "midlayer mistake" and
> wrote about the pitfalls a long time ago:
>
> https://lwn.net/Articles/336262/
>
> It is exactly what you are experiencing trying to stretch a
> midlayer even further out.
>
> Jason
I'm all for breaking it down to smaller libraries, I completely agree with you.
But as you wrote above, why do I even need to use the drm wrappers for
the basic stuff ? I'll just call the kernel api directly.
And if that's the case then I don't need to rip that code out of the
heart of drm and make it a separate module.

For GEM (as an example of something less basic) it might be a
different story, but we are not there yet.

Oded
  
Dave Airlie Nov. 7, 2022, 8:18 p.m. UTC | #19
On Mon, 7 Nov 2022 at 23:10, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Mon, Nov 07, 2022 at 03:01:08PM +0200, Oded Gabbay wrote:
> > I don't agree with your statement that it should be "a layer over top of DRM".
> > Anything on top of DRM is a device driver.
> > Accel is not a device driver, it is a new type of drm minor / drm driver.
>
> Yeah, I still think this is not the right way, you are getting almost
> nothing from DRM and making everything more complicated in the
> process.

You are looking at the small picture that is these patches, there are
just infrastructure to start the process of merging drivers and
reusing other parts of the drm code.

We aren't going to ever get anywhere if we start splitting code out of
drm just in case, we get this stuff rolling in the tree and if we have
a pressing need to refactor it out into separate libraries later then
we can address that from a more educated place, instead of just
throwing huge refactors around before we have any code to even use
them.
>
> IMHO this is much better, because accel has very little need of DRM to
> manage a struct device/cdev in the first place.

Right now it doesn't, but when drivers start leveraging the other code
it will reuse a lot more code.

I'm not going to spend too much time entertaining this, devm vs drmm
memory etc are real problems drm has already identified if not
completely solved.

Dave.
  
Dave Airlie Nov. 7, 2022, 8:33 p.m. UTC | #20
> > >
> > > "drm_minor" is not necessary anymore. Strictly managing minor numbers
> > > lost its value years ago when /dev/ was reorganized. Just use
> > > dynamic minors fully.
> > drm minor is not just about handling minor numbers. It contains the
> > entire code to manage devices that register with drm framework (e.g.
> > supply callbacks to file operations), manage their lifecycle,
> > resources (e.g. automatic free of resources on release), sysfs,
> > debugfs, etc.
>
> This is why you are having such troubles, this is already good library
> code. You don't need DRM to wrapper debugfs APIs, for instance. We
> have devm, though maybe it is not a good idea, etc
>
> Greg already pointed out the sysfs was not being done correctly
> anyhow.
>
> I don't think DRM is improving on these core kernel services. Just use
> the normal stuff directly.

At plumbers we decided a direction, I think the direction is good, if
there is refactoring to be done, I'd rather it was done in tree with a
clear direction.

Coming in now and saying we should go down a different path isn't
really helpful. We need to get rolling on this, we have drivers that
want to land somewhere now, which means we need to just get a
framework in place, leveraging drm code is the way to do it.

There is no need to an "accel" module, what does that even buy you,
the idea is to have an accel subsystem that allows drivers to use drm
features, not an accel subsystem that refactors drm features, that
would take years. There are already drivers for this subsystem wanting
to use GEM, and I don't think holding them up for a year to refactor
something that we don't have a clear reason or goal behind
refactoring.

If there is a problem with the drm subsystem interactions with the
kernel standard implementations then let's go fix that and accel will
also get fixed, but there's no reason to start going down that road at
the same time as introducing accel.

Also with the idr/xarray stuff, this isn't the patchset to be
introducing a bunch of new and divergent work, if this patchset
identifies deficiencies then let's document them and work on them in
parallel instead of blocking the initial landing in favour of some
future refactors with no in-tree users.

Dave.
  
Jason Gunthorpe Nov. 8, 2022, 12:28 p.m. UTC | #21
On Tue, Nov 08, 2022 at 06:33:23AM +1000, Dave Airlie wrote:

> At plumbers we decided a direction, I think the direction is good, if
> there is refactoring to be done, I'd rather it was done in tree with a
> clear direction.
> 
> Coming in now and saying we should go down a different path isn't
> really helpful. We need to get rolling on this, we have drivers that
> want to land somewhere now, which means we need to just get a
> framework in place, leveraging drm code is the way to do it.

It is not a different path, at plumbers we decided accel should try to
re-use parts of DRM that make sense. I think that should be done by
making those DRM parts into libraries that can be re-used, not by
trying to twist DRM into something weird.

If this thing needs special major/minor numbers, it's own class, its
own debufs, sysfs, etc, then it should not be abusing the DRM struct
device infrastructure to create that very basic kernel infrastructure.

Somehow we ended up with the worst of both worlds. If you want to to
be DRM then it should just be DRM and we shouldn't see all this core
infrastructue code for debugfs/sysfs/cdevs/etc in thes patches at all.

Jason
  
Dave Airlie Nov. 9, 2022, 7:22 a.m. UTC | #22
On Tue, 8 Nov 2022 at 22:28, Jason Gunthorpe <jgg@nvidia.com> wrote:
>
> On Tue, Nov 08, 2022 at 06:33:23AM +1000, Dave Airlie wrote:
>
> > At plumbers we decided a direction, I think the direction is good, if
> > there is refactoring to be done, I'd rather it was done in tree with a
> > clear direction.
> >
> > Coming in now and saying we should go down a different path isn't
> > really helpful. We need to get rolling on this, we have drivers that
> > want to land somewhere now, which means we need to just get a
> > framework in place, leveraging drm code is the way to do it.
>
> It is not a different path, at plumbers we decided accel should try to
> re-use parts of DRM that make sense. I think that should be done by
> making those DRM parts into libraries that can be re-used, not by
> trying to twist DRM into something weird.

There isn't much twisting here, the thing is this is just the code for sharing,
there isn't going to be mountains more. This code gives accel drivers access
to a lot of things. Refactoring it out will take a year or so, and I don't think
buys us anything.

>
> If this thing needs special major/minor numbers, it's own class, its
> own debufs, sysfs, etc, then it should not be abusing the DRM struct
> device infrastructure to create that very basic kernel infrastructure.
>
> Somehow we ended up with the worst of both worlds. If you want to to
> be DRM then it should just be DRM and we shouldn't see all this core
> infrastructue code for debugfs/sysfs/cdevs/etc in thes patches at all.

We can refactor this out even clearer in the long run if it needs to,
but you are overly focusing on the small picture of these patches and
not the larger sharing this enables.

At this point I'm going to be merging close to what we have here, so
we can move forward with getting some drivers lined up.

Dave.
  

Patch

diff --git a/Documentation/admin-guide/devices.txt b/Documentation/admin-guide/devices.txt
index 9764d6edb189..06c525e01ea5 100644
--- a/Documentation/admin-guide/devices.txt
+++ b/Documentation/admin-guide/devices.txt
@@ -3080,6 +3080,11 @@ 
 		  ...
 		  255 = /dev/osd255	256th OSD Device

+ 261 char	Compute Acceleration Devices
+		  0 = /dev/accel/accel0	First acceleration device
+		  1 = /dev/accel/accel1	Second acceleration device
+		    ...
+
  384-511 char	RESERVED FOR DYNAMIC ASSIGNMENT
 		Character devices that request a dynamic allocation of major
 		number will take numbers starting from 511 and downward,
diff --git a/MAINTAINERS b/MAINTAINERS
index ab07cf28844e..9b34f756e343 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -6825,6 +6825,14 @@  F:	include/drm/drm*
 F:	include/linux/vga*
 F:	include/uapi/drm/drm*

+DRM COMPUTE ACCELERATORS DRIVERS AND FRAMEWORK
+M:	Oded Gabbay <ogabbay@kernel.org>
+L:	dri-devel@lists.freedesktop.org
+S:	Maintained
+C:	irc://irc.oftc.net/dri-devel
+T:	git https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git
+F:	drivers/accel/
+
 DRM DRIVERS FOR ALLWINNER A10
 M:	Maxime Ripard <mripard@kernel.org>
 M:	Chen-Yu Tsai <wens@csie.org>
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 19ee995bd0ae..968bd0a6fd78 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -99,6 +99,8 @@  source "drivers/media/Kconfig"

 source "drivers/video/Kconfig"

+source "drivers/accel/Kconfig"
+
 source "sound/Kconfig"

 source "drivers/hid/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index bdf1c66141c9..658199dcee96 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -62,6 +62,9 @@  obj-y				+= iommu/
 # gpu/ comes after char for AGP vs DRM startup and after iommu
 obj-y				+= gpu/

+# accel is part of drm so it must come after gpu
+obj-$(CONFIG_ACCEL)		+= accel/
+
 obj-$(CONFIG_CONNECTOR)		+= connector/

 # i810fb and intelfb depend on char/agp/
diff --git a/drivers/accel/Kconfig b/drivers/accel/Kconfig
new file mode 100644
index 000000000000..282ea24f90c5
--- /dev/null
+++ b/drivers/accel/Kconfig
@@ -0,0 +1,24 @@ 
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# Compute Acceleration device configuration
+#
+# This framework provides support for compute acceleration devices, such
+# as, but not limited to, Machine-Learning and Deep-Learning acceleration
+# devices
+#
+menuconfig ACCEL
+	tristate "Compute Acceleration Framework"
+	depends on DRM
+	help
+	  Framework for device drivers of compute acceleration devices, such
+	  as, but not limited to, Machine-Learning and Deep-Learning
+	  acceleration devices.
+	  If you say Y here, you need to select the module that's right for
+	  your acceleration device from the list below.
+	  This framework is integrated with the DRM subsystem as compute
+	  accelerators and GPUs share a lot in common and can use almost the
+	  same infrastructure code.
+	  Having said that, acceleration devices will have a different
+	  major number than GPUs, and will be exposed to user-space using
+	  different device files, called accel/accel* (in /dev, sysfs
+	  and debugfs)
diff --git a/drivers/accel/Makefile b/drivers/accel/Makefile
new file mode 100644
index 000000000000..b5b7d812a8ef
--- /dev/null
+++ b/drivers/accel/Makefile
@@ -0,0 +1,10 @@ 
+# SPDX-License-Identifier: GPL-2.0
+
+# Makefile for the accel framework. This framework provides support for
+# compute acceleration devices, such as, but not limited to, Machine-Learning
+# and Deep-Learning acceleration devices
+
+accel-y := \
+	accel_drv.o
+
+obj-$(CONFIG_ACCEL)	+= accel.o
diff --git a/drivers/accel/accel_drv.c b/drivers/accel/accel_drv.c
new file mode 100644
index 000000000000..6132765ea054
--- /dev/null
+++ b/drivers/accel/accel_drv.c
@@ -0,0 +1,112 @@ 
+// SPDX-License-Identifier: GPL-2.0
+
+/*
+ * Copyright 2022 HabanaLabs, Ltd.
+ * All Rights Reserved.
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/debugfs.h>
+#include <linux/device.h>
+
+#include <drm/drm_accel.h>
+#include <drm/drm_ioctl.h>
+#include <drm/drm_print.h>
+
+static struct dentry *accel_debugfs_root;
+struct class *accel_class;
+
+static char *accel_devnode(struct device *dev, umode_t *mode)
+{
+	return kasprintf(GFP_KERNEL, "accel/%s", dev_name(dev));
+}
+
+static CLASS_ATTR_STRING(accel_version, 0444, "accel 1.0.0 20221018");
+
+/**
+ * accel_sysfs_init - initialize sysfs helpers
+ *
+ * This is used to create the ACCEL class, which is the implicit parent of any
+ * other top-level ACCEL sysfs objects.
+ *
+ * You must call accel_sysfs_destroy() to release the allocated resources.
+ *
+ * Return: 0 on success, negative error code on failure.
+ */
+static int accel_sysfs_init(void)
+{
+	int err;
+
+	accel_class = class_create(THIS_MODULE, "accel");
+	if (IS_ERR(accel_class))
+		return PTR_ERR(accel_class);
+
+	err = class_create_file(accel_class, &class_attr_accel_version.attr);
+	if (err) {
+		class_destroy(accel_class);
+		accel_class = NULL;
+		return err;
+	}
+
+	accel_class->devnode = accel_devnode;
+
+	return 0;
+}
+
+/**
+ * accel_sysfs_destroy - destroys ACCEL class
+ *
+ * Destroy the ACCEL device class.
+ */
+static void accel_sysfs_destroy(void)
+{
+	if (IS_ERR_OR_NULL(accel_class))
+		return;
+	class_remove_file(accel_class, &class_attr_accel_version.attr);
+	class_destroy(accel_class);
+	accel_class = NULL;
+}
+
+static int accel_stub_open(struct inode *inode, struct file *filp)
+{
+	DRM_DEBUG("Operation not supported");
+
+	return -EOPNOTSUPP;
+}
+
+static const struct file_operations accel_stub_fops = {
+	.owner = THIS_MODULE,
+	.open = accel_stub_open,
+	.llseek = noop_llseek,
+};
+
+void accel_core_exit(void)
+{
+	unregister_chrdev(ACCEL_MAJOR, "accel");
+	debugfs_remove(accel_debugfs_root);
+	accel_sysfs_destroy();
+}
+
+int __init accel_core_init(void)
+{
+	int ret;
+
+	ret = accel_sysfs_init();
+	if (ret < 0) {
+		DRM_ERROR("Cannot create ACCEL class: %d\n", ret);
+		goto error;
+	}
+
+	accel_debugfs_root = debugfs_create_dir("accel", NULL);
+
+	ret = register_chrdev(ACCEL_MAJOR, "accel", &accel_stub_fops);
+	if (ret < 0)
+		goto error;
+
+error:
+	/* Any cleanup will be done in drm_core_exit() that will call
+	 * to accel_core_exit()
+	 */
+	return ret;
+}
diff --git a/include/drm/drm_accel.h b/include/drm/drm_accel.h
new file mode 100644
index 000000000000..cf43a7b30f34
--- /dev/null
+++ b/include/drm/drm_accel.h
@@ -0,0 +1,31 @@ 
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright 2022 HabanaLabs, Ltd.
+ * All Rights Reserved.
+ *
+ */
+
+#ifndef DRM_ACCEL_H_
+#define DRM_ACCEL_H_
+
+#define ACCEL_MAJOR     261
+
+#if IS_ENABLED(CONFIG_ACCEL)
+
+void accel_core_exit(void);
+int accel_core_init(void);
+
+#else
+
+static inline void accel_core_exit(void)
+{
+}
+
+static inline int __init accel_core_init(void)
+{
+	return 0;
+}
+
+#endif /* IS_ENABLED(CONFIG_ACCEL) */
+
+#endif /* DRM_ACCEL_H_ */