Message ID | 20221106210225.2065371-1-ogabbay@kernel.org |
---|---|
Headers |
From: Oded Gabbay <ogabbay@kernel.org>
To: David Airlie <airlied@gmail.com>, Daniel Vetter <daniel@ffwll.ch>, Arnd Bergmann <arnd@arndb.de>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, linux-kernel@vger.kernel.org, dri-devel@lists.freedesktop.org, Jason Gunthorpe <jgg@nvidia.com>, John Hubbard <jhubbard@nvidia.com>, Alex Deucher <alexander.deucher@amd.com>
Cc: Maarten Lankhorst <maarten.lankhorst@linux.intel.com>, Maxime Ripard <mripard@kernel.org>, Thomas Zimmermann <tzimmermann@suse.de>, Yuji Ishikawa <yuji2.ishikawa@toshiba.co.jp>, Jiho Chu <jiho.chu@samsung.com>, Daniel Stone <daniel@fooishbar.org>, Tvrtko Ursulin <tvrtko.ursulin@linux.intel.com>, Jeffrey Hugo <quic_jhugo@quicinc.com>, Christoph Hellwig <hch@infradead.org>, Kevin Hilman <khilman@baylibre.com>, Jagan Teki <jagan@amarulasolutions.com>, Jacek Lawrynowicz <jacek.lawrynowicz@linux.intel.com>, Maciej Kwapulinski <maciej.kwapulinski@linux.intel.com>, Stanislaw Gruszka <stanislaw.gruszka@linux.intel.com>, Randy Dunlap <rdunlap@infradead.org>
Subject: [RFC PATCH v3 0/3] new subsystem for compute accelerator devices
Date: Sun, 6 Nov 2022 23:02:22 +0200
Message-Id: <20221106210225.2065371-1-ogabbay@kernel.org>
X-Mailer: git-send-email 2.25.1
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Precedence: bulk
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org |
Series |
new subsystem for compute accelerator devices
|
|
Message
Oded Gabbay
Nov. 6, 2022, 9:02 p.m. UTC
This is the third version of the RFC, following the comments given on the second version and, more importantly, following testing done by the VPU driver people and myself.

We found out that there is a circular dependency between DRM and accel. DRM calls accel exported symbols during init and when accel devices are registering (all the minor handling), and accel in turn calls DRM exported symbols. Therefore, if the two components are compiled as separate modules, there is a circular dependency.

To overcome this, I have decided to compile the accel core code as part of the DRM kernel module (drm.ko). IMO, this is in line with the spirit of the design choice to have accel reuse the DRM core code and avoid code duplication.

Another important change is that I have reverted to using IDR for minor handling instead of xarray. This is because I have found that xarray doesn't handle well the scenario where you allocate a NULL entry and then exchange it with a real pointer. It appears xarray still considers that entry a "zero" entry. This is unfortunate because DRM works that way (it first allocates a NULL entry and then replaces the entry with a real pointer).

I decided to revert to IDR because I don't want to hold up these patches, as many people are blocked until the support for accel is merged. The xarray issue should be fixed as a separate patch, either by fixing the xarray code or by changing how DRM + ACCEL do minor id handling.

The patches are in the following repo:
https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v3

As in v2, the HEAD of that branch is a commit adding a dummy driver that registers an accel device using the new framework. This can serve as a simple reference. I have checked inserting and removing the dummy driver, and opening and closing /dev/accel/accel0, and nothing got broken :)

v1 cover letter:
https://lkml.org/lkml/2022/10/22/544

v2 cover letter:
https://lore.kernel.org/lkml/20221102203405.1797491-1-ogabbay@kernel.org/T/

Thanks,
Oded.
Oded Gabbay (3):
  drivers/accel: define kconfig and register a new major
  accel: add dedicated minor for accelerator devices
  drm: initialize accel framework

 Documentation/admin-guide/devices.txt |   5 +
 MAINTAINERS                           |   8 +
 drivers/Kconfig                       |   2 +
 drivers/accel/Kconfig                 |  24 ++
 drivers/accel/drm_accel.c             | 322 ++++++++++++++++++++++++++
 drivers/gpu/drm/Makefile              |   1 +
 drivers/gpu/drm/drm_drv.c             | 102 +++++---
 drivers/gpu/drm/drm_file.c            |   2 +-
 drivers/gpu/drm/drm_sysfs.c           |  24 +-
 include/drm/drm_accel.h               |  97 ++++++++
 include/drm/drm_device.h              |   3 +
 include/drm/drm_drv.h                 |   8 +
 include/drm/drm_file.h                |  21 +-
 13 files changed, 582 insertions(+), 37 deletions(-)
 create mode 100644 drivers/accel/Kconfig
 create mode 100644 drivers/accel/drm_accel.c
 create mode 100644 include/drm/drm_accel.h

--
2.25.1
Comments
On 11/6/2022 2:02 PM, Oded Gabbay wrote:
> This is the third version of the RFC following the comments given on the
> second version, but more importantly, following testing done by the VPU
> driver people and myself. We found out that there is a circular dependency
> between DRM and accel. DRM calls accel exported symbols during init and when
> accel devices are registering (all the minor handling), then accel calls DRM
> exported symbols. Therefore, if the two components are compiled as modules,
> there is a circular dependency.
>
> To overcome this, I have decided to compile the accel core code as part of
> the DRM kernel module (drm.ko). IMO, this is inline with the spirit of the
> design choice to have accel reuse the DRM core code and avoid code
> duplication.
>
> Another important change is that I have reverted back to use IDR for minor
> handling instead of xarray. This is because I have found that xarray doesn't
> handle well the scenario where you allocate a NULL entry and then exchange it
> with a real pointer. It appears xarray still considers that entry a "zero"
> entry. This is unfortunate because DRM works that way (first allocates a NULL
> entry and then replaces the entry with a real pointer).
>
> I decided to revert to IDR because I don't want to hold up these patches,
> as many people are blocked until the support for accel is merged. The xarray
> issue should be fixed as a separate patch by either fixing the xarray code or
> changing how DRM + ACCEL do minor id handling.

This sounds sane to me. However, this appears to be something that Matthew Wilcox should be aware of (added for visibility). Perhaps he has a very quick solution. If not, at least he might have ideas on how best to address it in the future.
On Sun, Nov 06, 2022 at 11:02:22PM +0200, Oded Gabbay wrote:
> Another important change is that I have reverted back to use IDR for minor
> handling instead of xarray. This is because I have found that xarray doesn't
> handle well the scenario where you allocate a NULL entry and then exchange it
> with a real pointer. It appears xarray still considers that entry a "zero"
> entry. This is unfortunate because DRM works that way (first allocates a NULL
> entry and then replaces the entry with a real pointer).

This is what XA_ZERO_ENTRY is for. Some APIs, like xa_alloc, automatically promote NULL to XA_ZERO_ENTRY; others require it to be explicit.

If you use the usual pattern of xa_alloc(NULL), xa_store(!NULL) then you should be fine, as far as I know, so long as the xarray was tagged as allocating.

Jason
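The reserve-then-publish pattern discussed above (allocate an ID with a NULL placeholder, later store the real pointer) can be illustrated with a small userspace model. This is a toy stand-in written for illustration, not the kernel xarray: the toy_* names and the fixed-size table are assumptions, and the "reserved" flag only mimics the role that a zero entry plays in an allocating xarray, where xa_alloc() and xa_store() would be used instead.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_IDS 8

/* Toy ID map; a hypothetical model, NOT the kernel xarray. */
struct toy_idmap {
	void *entry[TOY_MAX_IDS];
	unsigned char reserved[TOY_MAX_IDS]; /* 1 = ID is taken, even if entry is NULL */
};

/*
 * Step 1: reserve the lowest free ID and store 'entry' there.
 * 'entry' may be NULL; the ID is still considered taken.
 * Returns the ID, or -1 if the table is full.
 */
static int toy_alloc(struct toy_idmap *m, void *entry)
{
	for (int id = 0; id < TOY_MAX_IDS; id++) {
		if (!m->reserved[id]) {
			m->reserved[id] = 1;
			m->entry[id] = entry;
			return id;
		}
	}
	return -1;
}

/*
 * Step 2: publish a real pointer into an already reserved slot.
 * Returns 0 on success, -1 if the ID was never allocated.
 */
static int toy_store(struct toy_idmap *m, int id, void *entry)
{
	if (id < 0 || id >= TOY_MAX_IDS || !m->reserved[id])
		return -1;
	m->entry[id] = entry;
	return 0;
}

/* Lookup: a reserved-but-unpublished slot reads back as NULL. */
static void *toy_load(const struct toy_idmap *m, int id)
{
	if (id < 0 || id >= TOY_MAX_IDS)
		return NULL;
	return m->entry[id];
}
```

The point of the model is that a reserved slot and an empty slot both read back as NULL, yet only the empty one can be handed out again by the allocator. That is the distinction the placeholder entry exists to preserve between the two registration steps.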
On Mon, Nov 07, 2022 at 09:07:28AM -0700, Jeffrey Hugo wrote:
> > Another important change is that I have reverted back to use IDR for minor
> > handling instead of xarray. This is because I have found that xarray doesn't
> > handle well the scenario where you allocate a NULL entry and then exchange it
> > with a real pointer. It appears xarray still considers that entry a "zero"
> > entry. This is unfortunate because DRM works that way (first allocates a NULL
> > entry and then replaces the entry with a real pointer).
> >
> > I decided to revert to IDR because I don't want to hold up these patches,
> > as many people are blocked until the support for accel is merged. The xarray
> > issue should be fixed as a separate patch by either fixing the xarray code or
> > changing how DRM + ACCEL do minor id handling.
>
> This sounds sane to me. However, this appears to be something that Matthew
> Wilcox should be aware of (added for visibility). Perhaps he has a very
> quick solution. If not, at-least he might have ideas on how to best address
> in the future.

Thanks for cc'ing me. I wasn't aware of this problem because I hadn't seen Oded's email yet. The "problem" is simply a misuse of the API.
Hi Oded,

On Sun, Nov 6, 2022 at 4:03 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> The patches are in the following repo:
> https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v3
>
> As in v2, The HEAD of that branch is a commit adding a dummy driver that
> registers an accel device using the new framework. This can be served
> as a simple reference. I have checked inserting and removing the dummy driver,
> and opening and closing /dev/accel/accel0 and nothing got broken :)
>
> v1 cover letter:
> https://lkml.org/lkml/2022/10/22/544
>
> v2 cover letter:
> https://lore.kernel.org/lkml/20221102203405.1797491-1-ogabbay@kernel.org/T/

I was in the room at Plumbers when a lot of this was discussed (in 2022 and also 2019), but I haven't really had an opportunity to provide feedback until now. In general, I think it's great and thanks for pushing it forward and getting feedback.

The v1 cover letter mentioned RAS (reliability, availability, serviceability) and Dave also mentioned it here [1]. There was a suggestion to use Netlink. It's an area that I'm fairly interested in because I do a lot of development on the firmware side (and specifically, with Zephyr).

Personally, I think Netlink could be one option for serializing and deserializing RAS information, but it would be helpful for that interface to be somewhat flexible, like a void * and length, and to provide userspace the capability of querying which RAS formats are supported.

For example, Antmicro used OpenAMP + rpmsg in their NVMe accelerator, and gave a talk on it at ZDS and Plumbers this year [2][3].

In Zephyr, the LGPL license for Netlink might be a non-starter (although I'm no lawyer). However, Zephyr does already support OpenAMP, protobufs, JSON, and will soon support Thrift.

Some companies might prefer to use Netlink. Others might prefer to use ASN.1. Some companies might prefer to use key-value pairs and limit the parameters and messages to uint32s. Some might handle all of the RAS details in-kernel, while others might want the kernel to act more like a transport to firmware.

Companies already producing accelerators may have a particular preference for serialization / deserialization in their own datacenters.

With that, it would be helpful to be able to query RAS capabilities via ioctl.

#define ACCEL_CAP_RAS_KEY_VAL_32 BIT(0)
#define ACCEL_CAP_RAS_NETLINK BIT(1)
#define ACCEL_CAP_RAS_JSON BIT(2)
#define ACCEL_CAP_RAS_PROTOBUF BIT(3)
#define ACCEL_CAP_RAS_GRPC BIT(4)
#define ACCEL_CAP_RAS_THRIFT BIT(5)
#define ACCEL_CAP_RAS_JSON BIT(6)
#define ACCEL_CAP_RAS_ASN1 BIT(7)

or something along those lines. Anyway, just putting the idea out there.

I'm sure there are a lot of opinions on this topic and that there are a lot of implications of using this or that serialization format. Obviously there can be security implications as well.

Apologies if I've already missed some of this discussion.

Cheers,

C

[1] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
[2] https://zephyr2022.sched.com/event/10CFD/open-source-nvme-ai-accelerator-platform-with-zephyr-karol-gugala-antmicro
[3] https://lpc.events/event/16/contributions/1245/
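To make the capability-query idea above concrete, here is a small userspace sketch of how a client might interpret such a bitmask once it has been returned by the driver. Everything beyond the ACCEL_CAP_RAS_* names is an assumption made for illustration: struct accel_ras_caps, accel_ras_supports() and accel_ras_pick() are hypothetical, no such ioctl exists in the posted patches, and bit 6 is left unassigned here because the proposal lists JSON a second time at that position.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(n) (UINT32_C(1) << (n))

#define ACCEL_CAP_RAS_KEY_VAL_32 BIT(0)
#define ACCEL_CAP_RAS_NETLINK    BIT(1)
#define ACCEL_CAP_RAS_JSON       BIT(2)
#define ACCEL_CAP_RAS_PROTOBUF   BIT(3)
#define ACCEL_CAP_RAS_GRPC       BIT(4)
#define ACCEL_CAP_RAS_THRIFT     BIT(5)
/* Bit 6 intentionally unassigned in this sketch. */
#define ACCEL_CAP_RAS_ASN1       BIT(7)

/* Hypothetical reply payload of a RAS capability query. */
struct accel_ras_caps {
	uint32_t formats; /* bitwise OR of ACCEL_CAP_RAS_* */
};

/* Returns nonzero if every bit in 'format' is advertised by the device. */
static int accel_ras_supports(const struct accel_ras_caps *caps, uint32_t format)
{
	return (caps->formats & format) == format;
}

/*
 * Walk a caller-supplied preference list and return the first format
 * the device advertises, or 0 if none of them is supported.
 */
static uint32_t accel_ras_pick(const struct accel_ras_caps *caps,
			       const uint32_t *prefs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		if (accel_ras_supports(caps, prefs[i]))
			return prefs[i];
	}
	return 0;
}
```

A bitmask keeps the query cheap and lets a driver advertise several formats at once, at the cost of capping the number of formats at the width of the field. A client that prefers JSON, then protobuf, then Netlink would pass those three constants in that order to accel_ras_pick() and fall back to an error path when it returns 0.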
On Sat, Nov 12, 2022 at 12:04 AM Christopher Friedt <chrisfriedt@gmail.com> wrote:
>
> Hi Oded,
>
> On Sun, Nov 6, 2022 at 4:03 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> > The patches are in the following repo:
> > https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v3
> >
> > As in v2, The HEAD of that branch is a commit adding a dummy driver that
> > registers an accel device using the new framework. This can be served
> > as a simple reference. I have checked inserting and removing the dummy driver,
> > and opening and closing /dev/accel/accel0 and nothing got broken :)
> >
> > v1 cover letter:
> > https://lkml.org/lkml/2022/10/22/544
> >
> > v2 cover letter:
> > https://lore.kernel.org/lkml/20221102203405.1797491-1-ogabbay@kernel.org/T/
>
> I was in the room at Plumbers when a lot of this was discussed (in
> 2022 and also 2019), but I haven't really had an opportunity to
> provide feedback until now. In general, I think it's great and thanks
> for pushing it forward and getting feedback.
>
> The v1 cover letter mentioned RAS (reliability, availability,
> serviceability) and Dave also mentioned it here [1]. There was a
> suggestion to use Netlink. It's an area that I'm fairly interested in
> because I do a lot of development on the firmware side (and
> specifically, with Zephyr).
>
> Personally, I think Netlink could be one option for serializing and
> deserializing RAS information but it would be helpful for that
> interface to be somewhat flexible, like a void * and length, and to
> provide userspace the capability of querying which RAS formats are
> supported.
>
> For example, AntMicro used OpenAMP + rpmsg in their NVMe accelerator,
> and gave a talk on it at ZDS and Plumbers this year [2][3].
>
> In Zephyr, the LGPL license for Netlink might be a non-starter
> (although I'm no lawyer). However, Zephyr does already support
> OpenAMP, protobufs, json, and will soon support Thrift.
>
> Some companies might prefer to use Netlink. Others might prefer to use
> ASN.1. Some companies might prefer to use key-value pairs and limit
> the parameters and messages to uint32s. Some might handle all of the
> RAS details in-kernel, while others might want the kernel to act more
> like a transport to firmware.
>
> Companies already producing accelerators may have a particular
> preference for serialization / deserialization in their own
> datacenters.
>
> With that, it would be helpful to be able to query RAS capabilities via ioctl.
>
> #define ACCEL_CAP_RAS_KEY_VAL_32 BIT(0)
> #define ACCEL_CAP_RAS_NETLINK BIT(1)
> #define ACCEL_CAP_RAS_JSON BIT(2)
> #define ACCEL_CAP_RAS_PROTOBUF BIT(3)
> #define ACCEL_CAP_RAS_GRPC BIT(4)
> #define ACCEL_CAP_RAS_THRIFT BIT(5)
> #define ACCEL_CAP_RAS_JSON BIT(6)
> #define ACCEL_CAP_RAS_ASN1 BIT(7)
>
> or something along those lines. Anyway, just putting the idea out there.
>
> I'm sure there are a lot of opinions on this topic and that there are
> a lot of implications of using this or that serialization format.
> Obviously there can be security implications as well.
>
> Apologies if I've already missed some of this discussion.
>
> Cheers,
>
> C
>
> [1] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> [2] https://zephyr2022.sched.com/event/10CFD/open-source-nvme-ai-accelerator-platform-with-zephyr-karol-gugala-antmicro
> [3] https://lpc.events/event/16/contributions/1245/

Hi Christopher,

Thanks for all this information. At this stage, I'm mainly trying to gather information on the current status of RAS in the OCP (Open Compute Project) and the Linux kernel, so your email was on point :)

It seems to me that this topic is broader than just accelerators or GPUs, because there are other device types that implement some kind of RAS (e.g. NICs). My gut feeling is that the end solution would be some kind of generic kernel driver/framework that will expose RAS to userspace for any device type, but it's too early to tell.

I'll update once I have the full picture.

Thanks,
Oded