From patchwork Mon Feb 6 09:05:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 53100
From: Yi Liu
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com,
 kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com,
 yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com,
 jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
 suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org,
 intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 linux-s390@vger.kernel.org, Joao Martins
Subject: [PATCH v2 12/14] vfio: Add cdev for vfio_device
Date: Mon, 6 Feb 2023 01:05:30 -0800
Message-Id: <20230206090532.95598-13-yi.l.liu@intel.com>
In-Reply-To: <20230206090532.95598-1-yi.l.liu@intel.com>
References: <20230206090532.95598-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

This allows user to directly open a vfio device w/o using the legacy
container/group interface, as a prerequisite for supporting new iommu
features like nested translation.

The device fd opened in this manner cannot access the device yet: the
fops open() does not open the device until a successful BIND_IOMMUFD,
which will be added in the next patch.

With this patch, devices registered to vfio core have both the group and
device interfaces created.

 - group interface : /dev/vfio/$groupID
 - device interface: /dev/vfio/devices/vfioX (X is the minor number and
                     is unique across devices)

Given a vfio device, the user can identify the matching vfioX by checking
the sysfs path of the device. Take PCI device (0000:6a:01.0) as an
example: /sys/bus/pci/devices/0000\:6a\:01.0/vfio-dev/vfio0/dev contains
the major:minor of the matching vfioX. Userspace then opens
/dev/vfio/devices/vfioX and checks with fstat that the major:minor
matches.

The vfio_device cdev logic in this patch:
*) __vfio_register_dev() path ends up doing cdev_device_add() for each
   vfio_device;
*) vfio_unregister_group_dev() path does cdev_device_del();

Signed-off-by: Yi Liu
Signed-off-by: Joao Martins
---
 drivers/vfio/Kconfig       | 11 +++++++
 drivers/vfio/Makefile      |  1 +
 drivers/vfio/device_cdev.c | 64 ++++++++++++++++++++++++++++++++++++++
 drivers/vfio/vfio.h        | 26 ++++++++++++++++
 drivers/vfio/vfio_main.c   | 41 +++++++++++++++++++++---
 include/linux/vfio.h       |  2 ++
 6 files changed, 141 insertions(+), 4 deletions(-)
 create mode 100644 drivers/vfio/device_cdev.c

diff --git a/drivers/vfio/Kconfig b/drivers/vfio/Kconfig
index a8f544629467..0476abf154f2 100644
--- a/drivers/vfio/Kconfig
+++ b/drivers/vfio/Kconfig
@@ -12,6 +12,17 @@ menuconfig VFIO
 	  If you don't know what to do here, say N.
 
 if VFIO
+config VFIO_DEVICE_CDEV
+	bool "Support for the VFIO cdev /dev/vfio/devices/vfioX"
+	depends on IOMMUFD
+	help
+	  The VFIO device cdev is another way for userspace to get device
+	  access. Userspace gets device fd by opening device cdev under
+	  /dev/vfio/devices/vfioX, and then bind the device fd with an iommufd
+	  to set up secure context for device access.
+
+	  If you don't know what to do here, say N.
+
 config VFIO_CONTAINER
 	bool "Support for the VFIO container /dev/vfio/vfio"
 	select VFIO_IOMMU_TYPE1 if MMU && (X86 || S390 || ARM || ARM64)
diff --git a/drivers/vfio/Makefile b/drivers/vfio/Makefile
index 70e7dcb302ef..245394aeb94b 100644
--- a/drivers/vfio/Makefile
+++ b/drivers/vfio/Makefile
@@ -4,6 +4,7 @@ obj-$(CONFIG_VFIO) += vfio.o
 vfio-y += vfio_main.o \
 	  group.o \
 	  iova_bitmap.o
+vfio-$(CONFIG_VFIO_DEVICE_CDEV) += device_cdev.o
 vfio-$(CONFIG_IOMMUFD) += iommufd.o
 vfio-$(CONFIG_VFIO_CONTAINER) += container.o
 vfio-$(CONFIG_VFIO_VIRQFD) += virqfd.o
diff --git a/drivers/vfio/device_cdev.c b/drivers/vfio/device_cdev.c
new file mode 100644
index 000000000000..f024833c9e2c
--- /dev/null
+++ b/drivers/vfio/device_cdev.c
@@ -0,0 +1,64 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (c) 2023 Intel Corporation.
+ */
+#include
+
+#include "vfio.h"
+
+static struct vfio {
+	dev_t device_devt;
+} vfio;
+
+void vfio_init_device_cdev(struct vfio_device *device)
+{
+	device->device.devt = MKDEV(MAJOR(vfio.device_devt), device->index);
+	cdev_init(&device->cdev, &vfio_device_fops);
+	device->cdev.owner = THIS_MODULE;
+}
+
+int vfio_device_fops_open(struct inode *inode, struct file *filep)
+{
+	struct vfio_device *device = container_of(inode->i_cdev,
+						  struct vfio_device, cdev);
+	struct vfio_device_file *df;
+	int ret;
+
+	if (!vfio_device_try_get_registration(device))
+		return -ENODEV;
+
+	/*
+	 * device access is blocked until .open_device() is called
+	 * in BIND_IOMMUFD.
+	 */
+	df = vfio_allocate_device_file(device, true);
+	if (IS_ERR(df)) {
+		ret = PTR_ERR(df);
+		goto err_put_registration;
+	}
+
+	filep->private_data = df;
+
+	return 0;
+
+err_put_registration:
+	vfio_device_put_registration(device);
+	return ret;
+}
+
+static char *vfio_device_devnode(const struct device *dev, umode_t *mode)
+{
+	return kasprintf(GFP_KERNEL, "vfio/devices/%s", dev_name(dev));
+}
+
+int vfio_cdev_init(struct class *device_class)
+{
+	device_class->devnode = vfio_device_devnode;
+	return alloc_chrdev_region(&vfio.device_devt, 0,
+				   MINORMASK + 1, "vfio-dev");
+}
+
+void vfio_cdev_cleanup(void)
+{
+	unregister_chrdev_region(vfio.device_devt, MINORMASK + 1);
+}
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index 2debf0173861..c7c75865afec 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -250,6 +250,32 @@ static inline void vfio_iommufd_unbind(struct vfio_device *device)
 }
 #endif
 
+#if IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV)
+void vfio_init_device_cdev(struct vfio_device *device);
+int vfio_device_fops_open(struct inode *inode, struct file *filep);
+int vfio_cdev_init(struct class *device_class);
+void vfio_cdev_cleanup(void);
+#else
+static inline void vfio_init_device_cdev(struct vfio_device *device)
+{
+}
+
+static inline int vfio_device_fops_open(struct inode *inode,
+					struct file *filep)
+{
+	return 0;
+}
+
+static inline int vfio_cdev_init(struct class *device_class)
+{
+	return 0;
+}
+
+static inline void vfio_cdev_cleanup(void)
+{
+}
+#endif /* CONFIG_VFIO_DEVICE_CDEV */
+
 #if IS_ENABLED(CONFIG_VFIO_VIRQFD)
 int __init vfio_virqfd_init(void);
 void vfio_virqfd_exit(void);
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index e07b185f9820..035730dc6ad4 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -235,6 +235,7 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev,
 	device->device.release = vfio_device_release;
 	device->device.class = vfio.device_class;
 	device->device.parent = device->dev;
+	vfio_init_device_cdev(device);
 	return 0;
 
 out_uninit:
@@ -243,6 +244,25 @@ static int vfio_init_device(struct vfio_device *device, struct device *dev,
 	return ret;
 }
 
+static int vfio_device_add(struct vfio_device *device)
+{
+	int ret;
+
+	if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))
+		ret = cdev_device_add(&device->cdev, &device->device);
+	else
+		ret = device_add(&device->device);
+	return ret;
+}
+
+static void vfio_device_del(struct vfio_device *device)
+{
+	if (IS_ENABLED(CONFIG_VFIO_DEVICE_CDEV))
+		cdev_device_del(&device->cdev, &device->device);
+	else
+		device_del(&device->device);
+}
+
 static int __vfio_register_dev(struct vfio_device *device,
 			       enum vfio_group_type type)
 {
@@ -269,7 +289,7 @@ static int __vfio_register_dev(struct vfio_device *device,
 	if (ret)
 		return ret;
 
-	ret = device_add(&device->device);
+	ret = vfio_device_add(device);
 	if (ret)
 		goto err_out;
 
@@ -309,6 +329,13 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 	bool interrupted = false;
 	long rc;
 
+	/*
+	 * Balances vfio_device_add in register path. Putting it as the
+	 * first operation in unregister to prevent registration refcount
+	 * from incrementing per cdev open.
+	 */
+	vfio_device_del(device);
+
 	vfio_device_put_registration(device);
 	rc = try_wait_for_completion(&device->comp);
 	while (rc <= 0) {
@@ -334,9 +361,6 @@ void vfio_unregister_group_dev(struct vfio_device *device)
 
 	vfio_device_group_unregister(device);
 
-	/* Balances device_add in register path */
-	device_del(&device->device);
-
 	/* Balances vfio_device_set_group in register path */
 	vfio_device_remove_group(device);
 }
@@ -1220,6 +1244,7 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
 
 const struct file_operations vfio_device_fops = {
 	.owner		= THIS_MODULE,
+	.open		= vfio_device_fops_open,
 	.release	= vfio_device_fops_release,
 	.read		= vfio_device_fops_read,
 	.write		= vfio_device_fops_write,
@@ -1593,9 +1618,16 @@ static int __init vfio_init(void)
 		goto err_dev_class;
 	}
 
+	ret = vfio_cdev_init(vfio.device_class);
+	if (ret)
+		goto err_alloc_dev_chrdev;
+
 	pr_info(DRIVER_DESC " version: " DRIVER_VERSION "\n");
 	return 0;
 
+err_alloc_dev_chrdev:
+	class_destroy(vfio.device_class);
+	vfio.device_class = NULL;
 err_dev_class:
 	vfio_virqfd_exit();
 err_virqfd:
@@ -1606,6 +1638,7 @@ static int __init vfio_init(void)
 static void __exit vfio_cleanup(void)
 {
 	ida_destroy(&vfio.device_ida);
+	vfio_cdev_cleanup();
 	class_destroy(vfio.device_class);
 	vfio.device_class = NULL;
 	vfio_virqfd_exit();
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 83d1e0af0a70..fd4bf9c21ffe 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -13,6 +13,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -51,6 +52,7 @@ struct vfio_device {
 	/* Members below here are private, not for driver use */
 	unsigned int index;
 	struct device device;	/* device.kref covers object life circle */
+	struct cdev cdev;
 	refcount_t refcount;	/* user count on registered device*/
 	unsigned int open_count;
 	struct completion comp;
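
For illustration, the userspace lookup described in the commit message
(read the sysfs dev attribute, open the cdev, compare with fstat) could
be sketched roughly as below. This is only a sketch under assumptions:
parse_devt() and vfio_cdev_open_checked() are hypothetical helper names
that are not part of this series or of any library, and both paths are
supplied by the caller. Per the commit message, an fd obtained this way
cannot access the device until BIND_IOMMUFD, which this patch does not
add.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <unistd.h>

/* Hypothetical helper: parse "major:minor" as found in a sysfs dev file. */
static int parse_devt(const char *s, unsigned int *maj, unsigned int *min)
{
	return sscanf(s, "%u:%u", maj, min) == 2 ? 0 : -1;
}

/*
 * Hypothetical helper: open cdev_path and return its fd only if it is a
 * character device whose major:minor equals the pair read from
 * sysfs_dev_path. Returns -1 on any failure or mismatch.
 */
static int vfio_cdev_open_checked(const char *sysfs_dev_path,
				  const char *cdev_path)
{
	unsigned int maj, min;
	char buf[32];
	struct stat st;
	FILE *f;
	int fd;

	/* Read the expected major:minor from sysfs. */
	f = fopen(sysfs_dev_path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);
	if (parse_devt(buf, &maj, &min))
		return -1;

	fd = open(cdev_path, O_RDWR);
	if (fd < 0)
		return -1;

	/* Check the node that was actually opened, not the path. */
	if (fstat(fd, &st) || !S_ISCHR(st.st_mode) ||
	    major(st.st_rdev) != maj || minor(st.st_rdev) != min) {
		close(fd);
		return -1;
	}
	return fd;
}
```

For the example device in the commit message, a caller would pass
/sys/bus/pci/devices/0000:6a:01.0/vfio-dev/vfio0/dev and
/dev/vfio/devices/vfio0. The check uses fstat on the already-open fd
rather than stat on the path, so the comparison applies to the node that
was actually opened.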