From patchwork Mon Feb 6 09:05:25 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yi Liu
X-Patchwork-Id: 53093
From: Yi Liu
To: alex.williamson@redhat.com, jgg@nvidia.com, kevin.tian@intel.com
Cc: cohuck@redhat.com, eric.auger@redhat.com, nicolinc@nvidia.com,
 kvm@vger.kernel.org, mjrosato@linux.ibm.com, chao.p.peng@linux.intel.com,
 yi.l.liu@intel.com, yi.y.sun@linux.intel.com, peterx@redhat.com,
 jasowang@redhat.com, shameerali.kolothum.thodi@huawei.com, lulu@redhat.com,
 suravee.suthikulpanit@amd.com, intel-gvt-dev@lists.freedesktop.org,
 intel-gfx@lists.freedesktop.org, linux-kernel@vger.kernel.org,
 linux-s390@vger.kernel.org
Subject: [PATCH v2 07/14] vfio: Block device access via device fd until
 device is opened
Date: Mon, 6 Feb 2023 01:05:25 -0800
Message-Id: <20230206090532.95598-8-yi.l.liu@intel.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230206090532.95598-1-yi.l.liu@intel.com>
References: <20230206090532.95598-1-yi.l.liu@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Allow the vfio_device file to be in a state where the device FD is opened
but the device cannot be used by userspace (i.e. its .open_device() hasn't
been called).
This in-between state is not used when the device FD is spawned from the
group FD; however, when the device FD is created directly by opening a
cdev, it is opened in the blocked state.

The reason for the in-between state is that userspace only gets an FD but
is not granted access to the device until it binds the FD to an iommufd.
So in the blocked state, only the bind operation is allowed; all other
device accesses are rejected. Completing the bind allows userspace to
access the device.

This is implemented by adding a flag in struct vfio_device_file to mark
the blocked state, and by using a simple smp_load_acquire() to read the
flag value and serialize all the device setup with the thread accessing
the device.

This lockless scheme can safely handle the device FD going from unbound to
bound, but it cannot handle bound to unbound. Supporting that would require
adding a lock to all the vfio ioctls, which seems costly. So once a device
FD is bound, it remains bound until the FD is closed.

Suggested-by: Jason Gunthorpe
Signed-off-by: Yi Liu
Reviewed-by: Kevin Tian
---
 drivers/vfio/vfio.h      |  1 +
 drivers/vfio/vfio_main.c | 34 +++++++++++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index d8275881c1f1..802e13f1256e 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -18,6 +18,7 @@ struct vfio_container;
 
 struct vfio_device_file {
 	struct vfio_device *device;
+	bool access_granted;
 	spinlock_t kvm_ref_lock; /* protect kvm field */
 	struct kvm *kvm;
 	struct iommufd_ctx *iommufd; /* protected by struct vfio_device_set::lock */
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index c517252aba19..2267057240bd 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -476,7 +476,15 @@ int vfio_device_open(struct vfio_device_file *df)
 			device->open_count--;
 	}
 
-	return ret;
+	if (ret)
+		return ret;
+
+	/*
+	 * Paired with smp_load_acquire() in vfio_device_fops::ioctl/
+	 * read/write/mmap
+	 */
+	smp_store_release(&df->access_granted, true);
+	return 0;
 }
 
 void vfio_device_close(struct vfio_device_file *df)
@@ -1104,8 +1112,14 @@ static long vfio_device_fops_unl_ioctl(struct file *filep,
 {
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
+	bool access;
 	int ret;
 
+	/* Paired with smp_store_release() in vfio_device_open() */
+	access = smp_load_acquire(&df->access_granted);
+	if (!access)
+		return -EINVAL;
+
 	ret = vfio_device_pm_runtime_get(device);
 	if (ret)
 		return ret;
@@ -1132,6 +1146,12 @@ static ssize_t vfio_device_fops_read(struct file *filep, char __user *buf,
 {
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
+	bool access;
+
+	/* Paired with smp_store_release() in vfio_device_open() */
+	access = smp_load_acquire(&df->access_granted);
+	if (!access)
+		return -EINVAL;
 
 	if (unlikely(!device->ops->read))
 		return -EINVAL;
@@ -1145,6 +1165,12 @@ static ssize_t vfio_device_fops_write(struct file *filep,
 {
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
+	bool access;
+
+	/* Paired with smp_store_release() in vfio_device_open() */
+	access = smp_load_acquire(&df->access_granted);
+	if (!access)
+		return -EINVAL;
 
 	if (unlikely(!device->ops->write))
 		return -EINVAL;
@@ -1156,6 +1182,12 @@ static int vfio_device_fops_mmap(struct file *filep, struct vm_area_struct *vma)
 {
 	struct vfio_device_file *df = filep->private_data;
 	struct vfio_device *device = df->device;
+	bool access;
+
+	/* Paired with smp_store_release() in vfio_device_open() */
+	access = smp_load_acquire(&df->access_granted);
+	if (!access)
+		return -EINVAL;
 
 	if (unlikely(!device->ops->mmap))
 		return -EINVAL;