From patchwork Fri Jul 14 10:25:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Saurabh Singh Sengar
X-Patchwork-Id: 120408
From: Saurabh Sengar
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com, mikelley@microsoft.com, gregkh@linuxfoundation.org, corbet@lwn.net, linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [PATCH v3 1/3] uio: Add hv_vmbus_client driver
Date: Fri, 14 Jul 2023 03:25:44 -0700
Message-Id: <1689330346-5374-2-git-send-email-ssengar@linux.microsoft.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1689330346-5374-1-git-send-email-ssengar@linux.microsoft.com>
References: <1689330346-5374-1-git-send-email-ssengar@linux.microsoft.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Add a new UIO-based driver that generically supports low speed Hyper-V
VMBus devices. This driver can be bound to VMBus devices by user space
drivers that provide device-specific management.
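
As a rough illustration of what "bound by user space" can look like from a
program rather than a shell, here is a minimal sketch. It assumes the sysfs
path and the fcopy class GUID quoted in the documentation part of this patch.
The helper names write_attr() and bind_fcopy_class() are hypothetical and are
not part of this series.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Write a NUL-terminated string to a sysfs-style attribute file.
 * Returns 0 on success or a negative errno value on failure.
 * Hypothetical helper, for illustration only.
 */
int write_attr(const char *path, const char *value)
{
	size_t len = strlen(value);
	ssize_t n;
	int err = 0;
	int fd;

	fd = open(path, O_WRONLY);
	if (fd < 0)
		return -errno;

	n = write(fd, value, len);
	if (n < 0)
		err = -errno;
	else if ((size_t)n != len)
		err = -EIO;	/* short write: report it as an I/O error */

	close(fd);
	return err;
}

/*
 * Register the fcopy device class GUID with the driver. The path and
 * GUID are the ones quoted in this patch; the module is assumed to be
 * loaded already.
 */
int bind_fcopy_class(void)
{
	return write_attr("/sys/bus/vmbus/drivers/uio_hv_vmbus_client/new_id",
			  "34d14be3-dee4-41c8-9ae7-6b174977c192");
}
```

A caller would typically run this as root after loading the module and treat
a negative return value as the errno reported by sysfs.
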
The new driver provides the following core functionality, which is
suitable for low speed devices:

* A single VMBus channel for each device
* Ability to specify the VMBus channel ring buffer size for each device
* Host notification via a hypercall instead of monitor bits

Signed-off-by: Saurabh Sengar
Reviewed-by: Michael Kelley
---
[V3]
- Removed ringbuffer sysfs entry and used uio framework for mmap
- Remove ".id_table = NULL"
- kasprintf -> devm_kasprintf
- Change global variable ring_size to per device
- More checks on value which can be set for ring_size
- Remove driverctl, and used echo command instead for driver documentation
- Remove unnecessary one time use macros
- Change kernel version and date for sysfs documentation
- Update documentation
- Better commit message

[V2]
- Update driver info in Documentation/driver-api/uio-howto.rst
- Update ring_size sysfs info in Documentation/ABI/stable/sysfs-bus-vmbus
- Remove DRIVER_VERSION
- Remove refcnt
- scnprintf -> sysfs_emit
- sysfs_create_file -> ATTRIBUTE_GROUPS + ".driver.groups";
- sysfs_create_bin_file -> device_create_bin_file
- dev_notice -> dev_err
- remove MODULE_VERSION

 Documentation/ABI/stable/sysfs-bus-vmbus |  10 ++
 Documentation/driver-api/uio-howto.rst   |  54 ++++++
 drivers/uio/Kconfig                      |  12 ++
 drivers/uio/Makefile                     |   1 +
 drivers/uio/uio_hv_vmbus_client.c        | 218 +++++++++++++++++++++++
 5 files changed, 295 insertions(+)
 create mode 100644 drivers/uio/uio_hv_vmbus_client.c

diff --git a/Documentation/ABI/stable/sysfs-bus-vmbus b/Documentation/ABI/stable/sysfs-bus-vmbus
index 3066feae1d8d..7e77eda77be3 100644
--- a/Documentation/ABI/stable/sysfs-bus-vmbus
+++ b/Documentation/ABI/stable/sysfs-bus-vmbus
@@ -153,6 +153,16 @@ Contact:	Stephen Hemminger
 Description:	Binary file created by uio_hv_generic for ring buffer
 Users:		Userspace drivers
 
+What:		/sys/bus/vmbus/devices//ring_size
+Date:		September 2023
+KernelVersion:	6.6
+Contact:	Saurabh Sengar
+Description:	File created by uio_hv_vmbus_client for setting device ring
+		buffer size. The value specified within the file denotes the
+		total memory allocation for one complete ring buffer, which
+		includes the ring buffer header, of size PAGE_SIZE.
+Users:		Userspace drivers
+
 What:		/sys/bus/vmbus/devices//channels//intr_in_full
 Date:		February 2019
 KernelVersion:	5.0
diff --git a/Documentation/driver-api/uio-howto.rst b/Documentation/driver-api/uio-howto.rst
index 907ffa3b38f5..625c2bda369f 100644
--- a/Documentation/driver-api/uio-howto.rst
+++ b/Documentation/driver-api/uio-howto.rst
@@ -722,6 +722,60 @@ For example::
 
 	/sys/bus/vmbus/devices/3811fe4d-0fa0-4b62-981a-74fc1084c757/channels/21/ring
 
+Generic Hyper-V driver for low speed devices
+============================================
+
+The generic driver is a kernel module named uio_hv_vmbus_client. It
+supports slow devices on the Hyper-V VMBus similar to uio_hv_generic
+for faster devices. This driver also allows the ring buffer size to be
+customized per device.
+
+Making the driver recognize the device
+--------------------------------------
+
+Since the driver does not declare any device GUIDs, it will not get
+loaded automatically and will not automatically bind to any devices. You
+must load it and allocate an id to the driver yourself. For example, to use
+the fcopy device class GUID::
+
+	modprobe uio_hv_vmbus_client
+	echo "34d14be3-dee4-41c8-9ae7-6b174977c192" > /sys/bus/vmbus/drivers/uio_hv_vmbus_client/new_id
+
+If there already is a hardware specific kernel driver for the device,
+the generic driver still won't bind to it. In this case if you want to
+use the generic driver for a userspace library you'll have to manually unbind
+the hardware specific driver and bind the generic driver, using the device
+instance GUID like this::
+
+	echo "eb765408-105f-49b6-b4aa-c123b64d17d4" > /sys/bus/vmbus/drivers/hv_utils/unbind
+	echo "eb765408-105f-49b6-b4aa-c123b64d17d4" > /sys/bus/vmbus/drivers/uio_hv_vmbus_client/bind
+
+You can verify that the device has been bound to the driver by looking
+for it in sysfs, for example like the following::
+
+	ls -l /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/driver
+
+Which if successful should print::
+
+	.../eb765408-105f-49b6-b4aa-c123b64d17d4/driver -> ../../../bus/vmbus/drivers/uio_hv_vmbus_client
+
+Things to know about uio_hv_vmbus_client
+----------------------------------------
+
+The uio_hv_vmbus_client driver maps the Hyper-V device ring buffer to userspace
+and offers an interface to manage it.
+
+The userspace API for mapping and performing read/write operations on the device
+ring buffer is implemented in tools/hv/vmbus_bufring.c. Userspace applications
+should use this file as a library and build their logic on top of it.
+
+Additionally, the uio_hv_vmbus_client driver offers the "ring_size" sysfs entry
+for setting the device ring buffer size before opening the device.
+
+For example::
+
+	/sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/ring_size
+
 Further information
 ===================
 
diff --git a/drivers/uio/Kconfig b/drivers/uio/Kconfig
index 2e16c5338e5b..bd4d27ecfc9a 100644
--- a/drivers/uio/Kconfig
+++ b/drivers/uio/Kconfig
@@ -166,6 +166,18 @@ config UIO_HV_GENERIC
 
 	  If you compile this as a module, it will be called uio_hv_generic.
 
+config UIO_HV_SLOW_DEVICES
+	tristate "Generic driver for low speed VMBus devices"
+	depends on HYPERV
+	help
+	  Generic driver that you can dynamically bind to low speed Hyper-V
+	  VMBus devices to allow a user space driver to manage the device.
+	  The driver provides a single VMBus channel and uses a hypercall
+	  instead of monitor bits to interrupt the host. The driver provides
+	  a configurable per-device ring buffer size.
+
+	  If you compile this as a module, it will be called uio_hv_vmbus_client.
+
 config UIO_DFL
 	tristate "Generic driver for DFL (Device Feature List) bus"
 	depends on FPGA_DFL
diff --git a/drivers/uio/Makefile b/drivers/uio/Makefile
index f2f416a14228..44be0f96da34 100644
--- a/drivers/uio/Makefile
+++ b/drivers/uio/Makefile
@@ -11,4 +11,5 @@ obj-$(CONFIG_UIO_PRUSS)	+= uio_pruss.o
 obj-$(CONFIG_UIO_MF624)	+= uio_mf624.o
 obj-$(CONFIG_UIO_FSL_ELBC_GPCM)	+= uio_fsl_elbc_gpcm.o
 obj-$(CONFIG_UIO_HV_GENERIC)	+= uio_hv_generic.o
+obj-$(CONFIG_UIO_HV_SLOW_DEVICES)	+= uio_hv_vmbus_client.o
 obj-$(CONFIG_UIO_DFL)	+= uio_dfl.o
diff --git a/drivers/uio/uio_hv_vmbus_client.c b/drivers/uio/uio_hv_vmbus_client.c
new file mode 100644
index 000000000000..778f43b3701d
--- /dev/null
+++ b/drivers/uio/uio_hv_vmbus_client.c
@@ -0,0 +1,218 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * uio_hv_vmbus_client - UIO driver for low speed VMBus devices
+ *
+ * Copyright (c) 2023, Microsoft Corporation.
+ *
+ * Authors:
+ *   Saurabh Sengar
+ *
+ * Since the driver does not declare any device ids, userspace code must
+ * allocate an id and bind the device to the driver.
+ *
+ * For example, to associate the fcopy service with this driver:
+ * # echo "34d14be3-dee4-41c8-9ae7-6b174977c192" > /sys/bus/vmbus/drivers/uio_hv_vmbus_client/new_id
+ *
+ * If there already is a hardware specific kernel driver for the device,
+ * the generic driver still won't bind to it. In this case if you want to
+ * use the generic driver for a userspace library you'll have to manually unbind
+ * the hardware specific driver and bind the generic driver, using the device
+ * instance GUID like this:
+ * # echo "eb765408-105f-49b6-b4aa-c123b64d17d4" > /sys/bus/vmbus/drivers/hv_utils/unbind
+ * # echo "eb765408-105f-49b6-b4aa-c123b64d17d4" > /sys/bus/vmbus/drivers/uio_hv_vmbus_client/bind
+ */
+
+#include
+#include
+#include
+#include
+#include
+
+struct uio_hv_vmbus_dev {
+	struct uio_info info;
+	struct hv_device *device;
+	int ring_size;
+};
+
+/*
+ * This is the irqcontrol callback to be registered to uio_info.
+ * User space processes use it to signal the host through the channel.
+ *
+ * @param info
+ *  pointer to uio_info.
+ * @param irq_state
+ *  state value. 1 to signal the host.
+ */
+static int uio_hv_vmbus_irqcontrol(struct uio_info *info, s32 irq_state)
+{
+	struct uio_hv_vmbus_dev *pdata = info->priv;
+	struct hv_device *hv_dev = pdata->device;
+
+	/* Issue a full memory barrier before triggering the notification */
+	virt_mb();
+
+	if (irq_state == 1)
+		vmbus_setevent(hv_dev->channel);
+
+	return 0;
+}
+
+/*
+ * Callback from vmbus_event when something is in inbound ring.
+ */
+static void uio_hv_vmbus_channel_cb(void *context)
+{
+	struct uio_hv_vmbus_dev *pdata = context;
+
+	/* Issue a full memory barrier before sending the event to userspace */
+	virt_mb();
+
+	uio_event_notify(&pdata->info);
+}
+
+static int uio_hv_vmbus_open(struct uio_info *info, struct inode *inode)
+{
+	struct uio_hv_vmbus_dev *pdata = container_of(info, struct uio_hv_vmbus_dev, info);
+	struct hv_device *hv_dev = pdata->device;
+	struct vmbus_channel *channel = hv_dev->channel;
+	void *ring_buffer;
+	int ret;
+
+	ret = vmbus_open(channel, pdata->ring_size, pdata->ring_size, NULL, 0,
+			 uio_hv_vmbus_channel_cb, pdata);
+	if (ret) {
+		dev_err(&hv_dev->device, "error %d when opening the channel\n", ret);
+		return ret;
+	}
+	channel->inbound.ring_buffer->interrupt_mask = 0;
+	set_channel_read_mode(channel, HV_CALL_ISR);
+
+	/* set the mem pointer */
+	info->mem[0].name = "txrx_rings";
+	ring_buffer = page_address(channel->ringbuffer_page);
+	info->mem[0].addr = (uintptr_t)virt_to_phys(ring_buffer);
+	info->mem[0].size = channel->ringbuffer_pagecount << PAGE_SHIFT;
+	info->mem[0].memtype = UIO_MEM_IOVA;
+
+	return ret;
+}
+
+static int uio_hv_vmbus_release(struct uio_info *info, struct inode *inode)
+{
+	struct uio_hv_vmbus_dev *pdata = container_of(info, struct uio_hv_vmbus_dev, info);
+	struct hv_device *hv_dev = pdata->device;
+
+	vmbus_close(hv_dev->channel);
+
+	/* restore the mem pointer to its original state */
+	info->mem[0].name = NULL;
+	info->mem[0].addr = 0;
+	info->mem[0].size = 1;
+	info->mem[0].memtype = UIO_MEM_NONE;
+
+	return 0;
+}
+
+static ssize_t ring_size_show(struct device *dev, struct device_attribute *attr,
+			      char *buf)
+{
+	struct uio_info *info = dev_get_drvdata(dev);
+	struct uio_hv_vmbus_dev *pdata = container_of(info, struct uio_hv_vmbus_dev, info);
+
+	return sysfs_emit(buf, "%d\n", pdata->ring_size);
+}
+
+static ssize_t ring_size_store(struct device *dev, struct device_attribute *attr,
+			       const char *buf, size_t count)
+{
+	unsigned int val;
+	struct uio_info *info = dev_get_drvdata(dev);
+	struct uio_hv_vmbus_dev *pdata = container_of(info, struct uio_hv_vmbus_dev, info);
+
+	if (kstrtouint(buf, 0, &val) < 0)
+		return -EINVAL;
+
+	if (val < 2 * PAGE_SIZE || val % PAGE_SIZE)
+		return -EINVAL;
+
+	pdata->ring_size = val;
+
+	return count;
+}
+
+static DEVICE_ATTR_RW(ring_size);
+
+static struct attribute *uio_hv_vmbus_client_attrs[] = {
+	&dev_attr_ring_size.attr,
+	NULL,
+};
+ATTRIBUTE_GROUPS(uio_hv_vmbus_client);
+
+static int uio_hv_vmbus_probe(struct hv_device *dev, const struct hv_vmbus_device_id *dev_id)
+{
+	struct uio_hv_vmbus_dev *pdata;
+	int ret;
+	char *name = NULL;
+
+	pdata = devm_kzalloc(&dev->device, sizeof(*pdata), GFP_KERNEL);
+	if (!pdata)
+		return -ENOMEM;
+
+	name = devm_kasprintf(&dev->device, GFP_KERNEL, "%pUl", &dev->dev_instance);
+
+	/* Fill general uio info */
+	pdata->info.name = name; /* /sys/class/uio/uioX/name */
+	pdata->info.version = "1";
+	pdata->info.irqcontrol = uio_hv_vmbus_irqcontrol;
+	pdata->info.open = uio_hv_vmbus_open;
+	pdata->info.release = uio_hv_vmbus_release;
+	pdata->info.irq = UIO_IRQ_CUSTOM;
+	pdata->info.priv = pdata;
+	pdata->ring_size = VMBUS_RING_SIZE(3 * HV_HYP_PAGE_SIZE); /* Default ringbuffer size */
+	pdata->device = dev;
+
+	/* dummy value to register the mem pointers which will be updated by open */
+	pdata->info.mem[0].size = 1;
+
+	ret = uio_register_device(&dev->device, &pdata->info);
+	if (ret) {
+		dev_err(&dev->device, "uio_hv_vmbus register failed\n");
+		return ret;
+	}
+
+	hv_set_drvdata(dev, pdata);
+
+	return 0;
+}
+
+static void uio_hv_vmbus_remove(struct hv_device *dev)
+{
+	struct uio_hv_vmbus_dev *pdata = hv_get_drvdata(dev);
+
+	if (pdata)
+		uio_unregister_device(&pdata->info);
+}
+
+static struct hv_driver uio_hv_vmbus_drv = {
+	.driver.dev_groups = uio_hv_vmbus_client_groups,
+	.name = "uio_hv_vmbus_client",
+	.probe = uio_hv_vmbus_probe,
+	.remove = uio_hv_vmbus_remove,
+};
+
+static int __init uio_hv_vmbus_init(void)
+{
+	return vmbus_driver_register(&uio_hv_vmbus_drv);
+}
+
+static void __exit uio_hv_vmbus_exit(void)
+{
+	vmbus_driver_unregister(&uio_hv_vmbus_drv);
+}
+
+module_init(uio_hv_vmbus_init);
+module_exit(uio_hv_vmbus_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Saurabh Sengar ");
+MODULE_DESCRIPTION("Generic UIO driver for low speed VMBus devices");

From patchwork Fri Jul 14 10:25:45 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Saurabh Singh Sengar
X-Patchwork-Id: 120418
From: Saurabh Sengar
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com, mikelley@microsoft.com, gregkh@linuxfoundation.org, corbet@lwn.net, linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org, linux-doc@vger.kernel.org
Subject: [PATCH v3 2/3] tools: hv: Add vmbus_bufring
Date: Fri, 14 Jul 2023 03:25:45 -0700
Message-Id: <1689330346-5374-3-git-send-email-ssengar@linux.microsoft.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1689330346-5374-1-git-send-email-ssengar@linux.microsoft.com>
References: <1689330346-5374-1-git-send-email-ssengar@linux.microsoft.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Provide a userspace interface for userspace drivers or applications to
read/write a VMBus ringbuffer. A significant part of this code is
borrowed from DPDK[1]. The library currently supports only the x86
architecture.

To build this library:
make -C tools/hv libvmbus_bufring.a

Applications using this library can include the vmbus_bufring.h header
file and link statically against libvmbus_bufring.a.
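
As a rough, standalone illustration of the ring arithmetic that the library's
copy helpers rely on (an index increment with wraparound, and a copy that is
split in two when it crosses the end of the data area), here is a minimal
sketch. The names ring_idxinc() and ring_copyto() are hypothetical and exist
only in this sketch; the library's own helpers are vmbus_br_idxinc() and
vmbus_txbr_copyto(). The sketch assumes the increment and the copy length
never exceed the size of the data area.

```c
#include <stdint.h>
#include <string.h>

/* Advance a ring index by inc, wrapping at sz (assumes inc <= sz). */
uint32_t ring_idxinc(uint32_t idx, uint32_t inc, uint32_t sz)
{
	idx += inc;
	if (idx >= sz)
		idx -= sz;

	return idx;
}

/*
 * Copy len bytes into a ring data area of size sz, starting at windex.
 * The copy is split in two when it crosses the end of the data area.
 * Returns the write index after the copy (assumes len <= sz).
 */
uint32_t ring_copyto(uint8_t *ring, uint32_t sz, uint32_t windex,
		     const void *src, uint32_t len)
{
	const uint8_t *p = src;

	if (len > sz - windex) {
		uint32_t frag = sz - windex;

		memcpy(ring + windex, p, frag);		/* up to the end of the area */
		memcpy(ring, p + frag, len - frag);	/* wrapped remainder */
	} else {
		memcpy(ring + windex, p, len);
	}

	return ring_idxinc(windex, len, sz);
}
```

With an 8-byte data area and a write index of 6, a 4-byte copy puts two bytes
at offsets 6 and 7, two bytes at offsets 0 and 1, and returns 2 as the new
write index.
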
[1] https://github.com/DPDK/dpdk/

Signed-off-by: Mary Hardy
Signed-off-by: Saurabh Sengar
---
[V3]
- Made ring buffer data offset depend on page size
- remove rte_smp_rwmb macro and reused rte_compiler_barrier instead
- Added legal counsel sign-off
- Removed "Link:" tag
- Improve commit messages
- new library compilation dependent on x86
- simplify mmap

[V2]
- simpler sysfs path, less parsing

 tools/hv/Build           |   1 +
 tools/hv/Makefile        |  13 +-
 tools/hv/vmbus_bufring.c | 297 +++++++++++++++++++++++++++++++++++++++
 tools/hv/vmbus_bufring.h | 154 ++++++++++++++++++++
 4 files changed, 464 insertions(+), 1 deletion(-)
 create mode 100644 tools/hv/vmbus_bufring.c
 create mode 100644 tools/hv/vmbus_bufring.h

diff --git a/tools/hv/Build b/tools/hv/Build
index 6cf51fa4b306..2a667d3d94cb 100644
--- a/tools/hv/Build
+++ b/tools/hv/Build
@@ -1,3 +1,4 @@
 hv_kvp_daemon-y += hv_kvp_daemon.o
 hv_vss_daemon-y += hv_vss_daemon.o
 hv_fcopy_daemon-y += hv_fcopy_daemon.o
+vmbus_bufring-y += vmbus_bufring.o
diff --git a/tools/hv/Makefile b/tools/hv/Makefile
index fe770e679ae8..33cf488fd20f 100644
--- a/tools/hv/Makefile
+++ b/tools/hv/Makefile
@@ -11,14 +11,19 @@ srctree := $(patsubst %/,%,$(dir $(CURDIR)))
 srctree := $(patsubst %/,%,$(dir $(srctree)))
 endif
 
+include $(srctree)/tools/scripts/Makefile.arch
+
 # Do not use make's built-in rules
 # (this improves performance and avoids hard-to-debug behaviour);
 MAKEFLAGS += -r
 
 override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include
 
+ifeq ($(SRCARCH),x86)
+ALL_LIBS := libvmbus_bufring.a
+endif
 ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon
-ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS))
+ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS)) $(patsubst %,$(OUTPUT)%,$(ALL_LIBS))
 
 ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh
 
@@ -27,6 +32,12 @@ all: $(ALL_PROGRAMS)
 export srctree OUTPUT CC LD CFLAGS
 include $(srctree)/tools/build/Makefile.include
 
+HV_VMBUS_BUFRING_IN := $(OUTPUT)vmbus_bufring.o
+$(HV_VMBUS_BUFRING_IN): FORCE
+	$(Q)$(MAKE) $(build)=vmbus_bufring
+$(OUTPUT)libvmbus_bufring.a : vmbus_bufring.o
+	$(AR) rcs $@ $^
+
 HV_KVP_DAEMON_IN := $(OUTPUT)hv_kvp_daemon-in.o
 $(HV_KVP_DAEMON_IN): FORCE
 	$(Q)$(MAKE) $(build)=hv_kvp_daemon
diff --git a/tools/hv/vmbus_bufring.c b/tools/hv/vmbus_bufring.c
new file mode 100644
index 000000000000..fb1f0489c625
--- /dev/null
+++ b/tools/hv/vmbus_bufring.c
@@ -0,0 +1,297 @@
+// SPDX-License-Identifier: BSD-3-Clause
+/*
+ * Copyright (c) 2009-2012,2016,2023 Microsoft Corp.
+ * Copyright (c) 2012 NetApp Inc.
+ * Copyright (c) 2012 Citrix Inc.
+ * All rights reserved.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include "vmbus_bufring.h"
+
+#define rte_compiler_barrier()	({ asm volatile ("" : : : "memory"); })
+#define RINGDATA_START_OFFSET	(getpagesize())
+#define VMBUS_RQST_ERROR	0xFFFFFFFFFFFFFFFF
+#define ALIGN(val, align)	((typeof(val))(((val) + (align) - 1) & (~((typeof(val))((align) - 1)))))
+
+/* Increase bufring index by inc with wraparound */
+static inline uint32_t vmbus_br_idxinc(uint32_t idx, uint32_t inc, uint32_t sz)
+{
+	idx += inc;
+	if (idx >= sz)
+		idx -= sz;
+
+	return idx;
+}
+
+void vmbus_br_setup(struct vmbus_br *br, void *buf, unsigned int blen)
+{
+	br->vbr = buf;
+	br->windex = br->vbr->windex;
+	br->dsize = blen - RINGDATA_START_OFFSET;
+}
+
+static inline __always_inline void
+rte_smp_mb(void)
+{
+	asm volatile("lock addl $0, -128(%%rsp); " ::: "memory");
+}
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	uint8_t res;
+
+	asm volatile("lock ; "
+		     "cmpxchgl %[src], %[dst];"
+		     "sete %[res];"
+		     : [res] "=a" (res),	/* output */
+		       [dst] "=m" (*dst)
+		     : [src] "r" (src),	/* input */
+		       "a" (exp),
+		       "m" (*dst)
+		     : "memory");	/* no-clobber list */
+	return res;
+}
+
+static inline uint32_t
+vmbus_txbr_copyto(const struct vmbus_br *tbr, uint32_t windex,
+		  const void *src0, uint32_t cplen)
+{
+	uint8_t *br_data = (uint8_t *)tbr->vbr + RINGDATA_START_OFFSET;
+	uint32_t br_dsize = tbr->dsize;
+	const uint8_t *src = src0;
+
+	if (cplen > br_dsize - windex) {
+		uint32_t fraglen = br_dsize - windex;
+
+		/* Wrap-around detected */
+		memcpy(br_data + windex, src, fraglen);
+		memcpy(br_data, src + fraglen, cplen - fraglen);
+	} else {
+		memcpy(br_data + windex, src, cplen);
+	}
+
+	return vmbus_br_idxinc(windex, cplen, br_dsize);
+}
+
+/*
+ * Write scattered channel packet to TX bufring.
+ *
+ * The offset of this channel packet is written as a 64bits value
+ * immediately after this channel packet.
+ *
+ * The write goes through three stages:
+ *  1. Reserve space in ring buffer for the new data.
+ *     Writer atomically moves priv_write_index.
+ *  2. Copy the new data into the ring.
+ *  3. Update the tail of the ring (visible to host) that indicates
+ *     next read location. Writer updates write_index
+ */
+static int
+vmbus_txbr_write(struct vmbus_br *tbr, const struct iovec iov[], int iovlen,
+		 bool *need_sig)
+{
+	struct vmbus_bufring *vbr = tbr->vbr;
+	uint32_t ring_size = tbr->dsize;
+	uint32_t old_windex, next_windex, windex, total;
+	uint64_t save_windex;
+	int i;
+
+	total = 0;
+	for (i = 0; i < iovlen; i++)
+		total += iov[i].iov_len;
+	total += sizeof(save_windex);
+
+	/* Reserve space in ring */
+	do {
+		uint32_t avail;
+
+		/* Get current free location */
+		old_windex = tbr->windex;
+
+		/* Prevent compiler reordering this with calculation */
+		rte_compiler_barrier();
+
+		avail = vmbus_br_availwrite(tbr, old_windex);
+
+		/* If not enough space in ring, then tell caller.
+		 */
+		if (avail <= total)
+			return -EAGAIN;
+
+		next_windex = vmbus_br_idxinc(old_windex, total, ring_size);
+
+		/* Atomic update of next write_index for other threads */
+	} while (!rte_atomic32_cmpset(&tbr->windex, old_windex, next_windex));
+
+	/* Space from old..new is now reserved */
+	windex = old_windex;
+	for (i = 0; i < iovlen; i++)
+		windex = vmbus_txbr_copyto(tbr, windex, iov[i].iov_base, iov[i].iov_len);
+
+	/* Set the offset of the current channel packet. */
+	save_windex = ((uint64_t)old_windex) << 32;
+	windex = vmbus_txbr_copyto(tbr, windex, &save_windex,
+				   sizeof(save_windex));
+
+	/* The region reserved should match region used */
+	if (windex != next_windex)
+		return -EINVAL;
+
+	/* Ensure that data is available before updating host index */
+	rte_compiler_barrier();
+
+	/* Checkin for our reservation. wait for our turn to update host */
+	while (!rte_atomic32_cmpset(&vbr->windex, old_windex, next_windex))
+		_mm_pause();
+
+	return 0;
+}
+
+int rte_vmbus_chan_send(struct vmbus_br *txbr, uint16_t type, void *data,
+			uint32_t dlen, uint32_t flags)
+{
+	struct vmbus_chanpkt pkt;
+	unsigned int pktlen, pad_pktlen;
+	const uint32_t hlen = sizeof(pkt);
+	bool send_evt = false;
+	uint64_t pad = 0;
+	struct iovec iov[3];
+	int error;
+
+	pktlen = hlen + dlen;
+	pad_pktlen = ALIGN(pktlen, sizeof(uint64_t));
+
+	pkt.hdr.type = type;
+	pkt.hdr.flags = flags;
+	pkt.hdr.hlen = hlen >> VMBUS_CHANPKT_SIZE_SHIFT;
+	pkt.hdr.tlen = pad_pktlen >> VMBUS_CHANPKT_SIZE_SHIFT;
+	pkt.hdr.xactid = VMBUS_RQST_ERROR; /* doesn't support multiple requests at same time */
+
+	iov[0].iov_base = &pkt;
+	iov[0].iov_len = hlen;
+	iov[1].iov_base = data;
+	iov[1].iov_len = dlen;
+	iov[2].iov_base = &pad;
+	iov[2].iov_len = pad_pktlen - pktlen;
+
+	error = vmbus_txbr_write(txbr, iov, 3, &send_evt);
+
+	return error;
+}
+
+static inline uint32_t
+vmbus_rxbr_copyfrom(const struct vmbus_br *rbr, uint32_t rindex,
+		    void *dst0, size_t cplen)
+{
+	const uint8_t *br_data = (uint8_t *)rbr->vbr + RINGDATA_START_OFFSET;
+	uint32_t br_dsize = rbr->dsize;
+	uint8_t *dst = dst0;
+
+	if (cplen > br_dsize - rindex) {
+		uint32_t fraglen = br_dsize - rindex;
+
+		/* Wrap-around detected. */
+		memcpy(dst, br_data + rindex, fraglen);
+		memcpy(dst + fraglen, br_data, cplen - fraglen);
+	} else {
+		memcpy(dst, br_data + rindex, cplen);
+	}
+
+	return vmbus_br_idxinc(rindex, cplen, br_dsize);
+}
+
+/* Copy data from receive ring but don't change index */
+static int
+vmbus_rxbr_peek(const struct vmbus_br *rbr, void *data, size_t dlen)
+{
+	uint32_t avail;
+
+	/*
+	 * The requested data and the 64bits channel packet
+	 * offset should be there at least.
+	 */
+	avail = vmbus_br_availread(rbr);
+	if (avail < dlen + sizeof(uint64_t))
+		return -EAGAIN;
+
+	vmbus_rxbr_copyfrom(rbr, rbr->vbr->rindex, data, dlen);
+	return 0;
+}
+
+/*
+ * Copy data from receive ring and change index
+ * NOTE:
+ * We assume (dlen + skip) == sizeof(channel packet).
+ */
+static int
+vmbus_rxbr_read(struct vmbus_br *rbr, void *data, size_t dlen, size_t skip)
+{
+	struct vmbus_bufring *vbr = rbr->vbr;
+	uint32_t br_dsize = rbr->dsize;
+	uint32_t rindex;
+
+	if (vmbus_br_availread(rbr) < dlen + skip + sizeof(uint64_t))
+		return -EAGAIN;
+
+	/* Record where host was when we started read (for debug) */
+	rbr->windex = rbr->vbr->windex;
+
+	/*
+	 * Copy channel packet from RX bufring.
+	 */
+	rindex = vmbus_br_idxinc(rbr->vbr->rindex, skip, br_dsize);
+	rindex = vmbus_rxbr_copyfrom(rbr, rindex, data, dlen);
+
+	/*
+	 * Discard this channel packet's 64bits offset, which is useless to us.
+	 */
+	rindex = vmbus_br_idxinc(rindex, sizeof(uint64_t), br_dsize);
+
+	/* Update the read index _after_ the channel packet is fetched.
*/ + rte_compiler_barrier(); + + vbr->rindex = rindex; + + return 0; +} + +int rte_vmbus_chan_recv_raw(struct vmbus_br *rxbr, + void *data, uint32_t *len) +{ + struct vmbus_chanpkt_hdr pkt; + uint32_t dlen, bufferlen = *len; + int error; + + error = vmbus_rxbr_peek(rxbr, &pkt, sizeof(pkt)); + if (error) + return error; + + if (unlikely(pkt.hlen < VMBUS_CHANPKT_HLEN_MIN)) + /* XXX this channel is dead actually. */ + return -EIO; + + if (unlikely(pkt.hlen > pkt.tlen)) + return -EIO; + + /* Length are in quad words */ + dlen = pkt.tlen << VMBUS_CHANPKT_SIZE_SHIFT; + *len = dlen; + + /* If caller buffer is not large enough */ + if (unlikely(dlen > bufferlen)) + return -ENOBUFS; + + /* Read data and skip packet header */ + error = vmbus_rxbr_read(rxbr, data, dlen, 0); + if (error) + return error; + + /* Return the number of bytes read */ + return dlen + sizeof(uint64_t); +} diff --git a/tools/hv/vmbus_bufring.h b/tools/hv/vmbus_bufring.h new file mode 100644 index 000000000000..45ecc48e517f --- /dev/null +++ b/tools/hv/vmbus_bufring.h @@ -0,0 +1,154 @@ +/* SPDX-License-Identifier: BSD-3-Clause */ + +#ifndef _VMBUS_BUF_H_ +#define _VMBUS_BUF_H_ + +#include +#include + +#define __packed __attribute__((__packed__)) +#define unlikely(x) __builtin_expect(!!(x), 0) + +#define ICMSGHDRFLAG_TRANSACTION 1 +#define ICMSGHDRFLAG_REQUEST 2 +#define ICMSGHDRFLAG_RESPONSE 4 + +#define IC_VERSION_NEGOTIATION_MAX_VER_COUNT 100 +#define ICMSG_HDR (sizeof(struct vmbuspipe_hdr) + sizeof(struct icmsg_hdr)) +#define ICMSG_NEGOTIATE_PKT_SIZE(icframe_vercnt, icmsg_vercnt) \ + (ICMSG_HDR + sizeof(struct icmsg_negotiate) + \ + (((icframe_vercnt) + (icmsg_vercnt)) * sizeof(struct ic_version))) + +/* + * Channel packets + */ + +/* Channel packet flags */ +#define VMBUS_CHANPKT_TYPE_INBAND 0x0006 +#define VMBUS_CHANPKT_TYPE_RXBUF 0x0007 +#define VMBUS_CHANPKT_TYPE_GPA 0x0009 +#define VMBUS_CHANPKT_TYPE_COMP 0x000b + +#define VMBUS_CHANPKT_FLAG_NONE 0 +#define VMBUS_CHANPKT_FLAG_RC 0x0001 /* report 
completion */ + +#define VMBUS_CHANPKT_SIZE_SHIFT 3 +#define VMBUS_CHANPKT_SIZE_ALIGN BIT(VMBUS_CHANPKT_SIZE_SHIFT) +#define VMBUS_CHANPKT_HLEN_MIN \ + (sizeof(struct vmbus_chanpkt_hdr) >> VMBUS_CHANPKT_SIZE_SHIFT) + +/* + * Buffer ring + */ +struct vmbus_bufring { + volatile uint32_t windex; + volatile uint32_t rindex; + + /* + * Interrupt mask {0,1} + * + * For TX bufring, host set this to 1, when it is processing + * the TX bufring, so that we can safely skip the TX event + * notification to host. + * + * For RX bufring, once this is set to 1 by us, host will not + * further dispatch interrupts to us, even if there are data + * pending on the RX bufring. This effectively disables the + * interrupt of the channel to which this RX bufring is attached. + */ + volatile uint32_t imask; + + /* + * Win8 uses some of the reserved bits to implement + * interrupt driven flow management. On the send side + * we can request that the receiver interrupt the sender + * when the ring transitions from being full to being able + * to handle a message of size "pending_send_sz". + * + * Add necessary state for this enhancement. + */ + volatile uint32_t pending_send; + uint32_t reserved1[12]; + + union { + struct { + uint32_t feat_pending_send_sz:1; + }; + uint32_t value; + } feature_bits; + + /* + * Ring data starts here + RingDataStartOffset + * !!! DO NOT place any fields below this !!! 
+ */ + uint8_t data[]; +} __packed; + +struct vmbus_br { + struct vmbus_bufring *vbr; + uint32_t dsize; + uint32_t windex; /* next available location */ +}; + +struct vmbus_chanpkt_hdr { + uint16_t type; /* VMBUS_CHANPKT_TYPE_ */ + uint16_t hlen; /* header len, in 8 bytes */ + uint16_t tlen; /* total len, in 8 bytes */ + uint16_t flags; /* VMBUS_CHANPKT_FLAG_ */ + uint64_t xactid; +} __packed; + +struct vmbus_chanpkt { + struct vmbus_chanpkt_hdr hdr; +} __packed; + +struct vmbuspipe_hdr { + unsigned int flags; + unsigned int msgsize; +} __packed; + +struct ic_version { + unsigned short major; + unsigned short minor; +} __packed; + +struct icmsg_negotiate { + unsigned short icframe_vercnt; + unsigned short icmsg_vercnt; + unsigned int reserved; + struct ic_version icversion_data[]; /* any size array */ +} __packed; + +struct icmsg_hdr { + struct ic_version icverframe; + unsigned short icmsgtype; + struct ic_version icvermsg; + unsigned short icmsgsize; + unsigned int status; + unsigned char ictransaction_id; + unsigned char icflags; + unsigned char reserved[2]; +} __packed; + +int rte_vmbus_chan_recv_raw(struct vmbus_br *rxbr, void *data, uint32_t *len); +int rte_vmbus_chan_send(struct vmbus_br *txbr, uint16_t type, void *data, + uint32_t dlen, uint32_t flags); +void vmbus_br_setup(struct vmbus_br *br, void *buf, unsigned int blen); + +/* Amount of space available for write */ +static inline uint32_t vmbus_br_availwrite(const struct vmbus_br *br, uint32_t windex) +{ + uint32_t rindex = br->vbr->rindex; + + if (windex >= rindex) + return br->dsize - (windex - rindex); + else + return rindex - windex; +} + +static inline uint32_t vmbus_br_availread(const struct vmbus_br *br) +{ + return br->dsize - vmbus_br_availwrite(br, br->vbr->windex); +} + +#endif /* !_VMBUS_BUF_H_ */ From patchwork Fri Jul 14 10:25:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saurabh Singh Sengar X-Patchwork-Id: 120409 
From: Saurabh Sengar To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org, decui@microsoft.com,
mikelley@microsoft.com, gregkh@linuxfoundation.org, corbet@lwn.net, linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org, linux-doc@vger.kernel.org Subject: [PATCH v3 3/3] tools: hv: Add new fcopy application based on uio driver Date: Fri, 14 Jul 2023 03:25:46 -0700 Message-Id: <1689330346-5374-4-git-send-email-ssengar@linux.microsoft.com> X-Mailer: git-send-email 1.8.3.1 In-Reply-To: <1689330346-5374-1-git-send-email-ssengar@linux.microsoft.com> References: <1689330346-5374-1-git-send-email-ssengar@linux.microsoft.com> X-Spam-Status: No, score=-19.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL, USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771392629810100636 X-GMAIL-MSGID: 1771392629810100636 Implement the file copy service for Linux guests on Hyper-V. This permits the host to copy a file (over VMBus) into the guest. This facility is part of "guest integration services" supported on the Hyper-V platform. Here is a link that provides additional details on this functionality: http://technet.microsoft.com/en-us/library/dn464282.aspx This new fcopy application uses uio_hv_vmbus_client driver which makes the earlier hv_util based driver and application obsolete. 
Signed-off-by: Saurabh Sengar --- [V3] - Improve cover letter and commit messages - Improve debug prints - Instead of hardcoded instance id, query from class id sysfs - Set the ring_size value from application - Update the application to mmap /dev/uio instead of sysfs - new application compilation dependent on x86 [V2] - simpler sysfs path tools/hv/Build | 1 + tools/hv/Makefile | 10 +- tools/hv/hv_fcopy_uio_daemon.c | 578 +++++++++++++++++++++++++++++++++ 3 files changed, 588 insertions(+), 1 deletion(-) create mode 100644 tools/hv/hv_fcopy_uio_daemon.c diff --git a/tools/hv/Build b/tools/hv/Build index 2a667d3d94cb..efcbb74a0d23 100644 --- a/tools/hv/Build +++ b/tools/hv/Build @@ -2,3 +2,4 @@ hv_kvp_daemon-y += hv_kvp_daemon.o hv_vss_daemon-y += hv_vss_daemon.o hv_fcopy_daemon-y += hv_fcopy_daemon.o vmbus_bufring-y += vmbus_bufring.o +hv_fcopy_uio_daemon-y += hv_fcopy_uio_daemon.o diff --git a/tools/hv/Makefile b/tools/hv/Makefile index 33cf488fd20f..678c6c450a53 100644 --- a/tools/hv/Makefile +++ b/tools/hv/Makefile @@ -21,8 +21,10 @@ override CFLAGS += -O2 -Wall -g -D_GNU_SOURCE -I$(OUTPUT)include ifeq ($(SRCARCH),x86) ALL_LIBS := libvmbus_bufring.a -endif +ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon hv_fcopy_uio_daemon +else ALL_TARGETS := hv_kvp_daemon hv_vss_daemon hv_fcopy_daemon +endif ALL_PROGRAMS := $(patsubst %,$(OUTPUT)%,$(ALL_TARGETS)) $(patsubst %,$(OUTPUT)%,$(ALL_LIBS)) ALL_SCRIPTS := hv_get_dhcp_info.sh hv_get_dns_info.sh hv_set_ifconfig.sh @@ -56,6 +58,12 @@ $(HV_FCOPY_DAEMON_IN): FORCE $(OUTPUT)hv_fcopy_daemon: $(HV_FCOPY_DAEMON_IN) $(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $< -o $@ +HV_FCOPY_UIO_DAEMON_IN := $(OUTPUT)hv_fcopy_uio_daemon-in.o +$(HV_FCOPY_UIO_DAEMON_IN): FORCE + $(Q)$(MAKE) $(build)=hv_fcopy_uio_daemon +$(OUTPUT)hv_fcopy_uio_daemon: $(HV_FCOPY_UIO_DAEMON_IN) libvmbus_bufring.a + $(QUIET_LINK)$(CC) -lm $< -L. -lvmbus_bufring -o $@ + clean: rm -f $(ALL_PROGRAMS) find $(or $(OUTPUT),.) 
-name '*.o' -delete -o -name '\.*.d' -delete diff --git a/tools/hv/hv_fcopy_uio_daemon.c b/tools/hv/hv_fcopy_uio_daemon.c new file mode 100644 index 000000000000..e8618a30dc7e --- /dev/null +++ b/tools/hv/hv_fcopy_uio_daemon.c @@ -0,0 +1,578 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * An implementation of host to guest copy functionality for Linux. + * + * Copyright (C) 2023, Microsoft, Inc. + * + * Author : K. Y. Srinivasan + * Author : Saurabh Sengar + * + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include "vmbus_bufring.h" + +#define ICMSGTYPE_NEGOTIATE 0 +#define ICMSGTYPE_FCOPY 7 + +#define WIN8_SRV_MAJOR 1 +#define WIN8_SRV_MINOR 1 +#define WIN8_SRV_VERSION (WIN8_SRV_MAJOR << 16 | WIN8_SRV_MINOR) + +#define MAX_PATH_LEN 300 +#define MAX_LINE_LEN 40 +#define DEVICES_SYSFS "/sys/bus/vmbus/devices" +#define FCOPY_CLASS_ID "34d14be3-dee4-41c8-9ae7-6b174977c192" + +#define FCOPY_VER_COUNT 1 +static const int fcopy_versions[] = { + WIN8_SRV_VERSION +}; + +#define FW_VER_COUNT 1 +static const int fw_versions[] = { + UTIL_FW_VERSION +}; + +#define HV_RING_SIZE (4 * 4096) + +unsigned char desc[HV_RING_SIZE]; + +static int target_fd; +static char target_fname[PATH_MAX]; +static unsigned long long filesize; + +static int hv_fcopy_create_file(char *file_name, char *path_name, __u32 flags) +{ + int error = HV_E_FAIL; + char *q, *p; + + filesize = 0; + p = (char *)path_name; + snprintf(target_fname, sizeof(target_fname), "%s/%s", + (char *)path_name, (char *)file_name); + + /* + * Check to see if the path is already in place; if not, + * create if required. 
+ */ + while ((q = strchr(p, '/')) != NULL) { + if (q == p) { + p++; + continue; + } + *q = '\0'; + if (access(path_name, F_OK)) { + if (flags & CREATE_PATH) { + if (mkdir(path_name, 0755)) { + syslog(LOG_ERR, "Failed to create %s", + path_name); + goto done; + } + } else { + syslog(LOG_ERR, "Invalid path: %s", path_name); + goto done; + } + } + p = q + 1; + *q = '/'; + } + + if (!access(target_fname, F_OK)) { + syslog(LOG_INFO, "File: %s exists", target_fname); + if (!(flags & OVER_WRITE)) { + error = HV_ERROR_ALREADY_EXISTS; + goto done; + } + } + + target_fd = open(target_fname, + O_RDWR | O_CREAT | O_TRUNC | O_CLOEXEC, 0744); + if (target_fd == -1) { + syslog(LOG_INFO, "Open Failed: %s", strerror(errno)); + goto done; + } + + error = 0; +done: + if (error) + target_fname[0] = '\0'; + return error; +} + +static int hv_copy_data(struct hv_do_fcopy *cpmsg) +{ + ssize_t bytes_written; + int ret = 0; + + bytes_written = pwrite(target_fd, cpmsg->data, cpmsg->size, + cpmsg->offset); + + filesize += cpmsg->size; + if (bytes_written != cpmsg->size) { + switch (errno) { + case ENOSPC: + ret = HV_ERROR_DISK_FULL; + break; + default: + ret = HV_E_FAIL; + break; + } + syslog(LOG_ERR, "pwrite failed to write %llu bytes: %ld (%s)", + filesize, (long)bytes_written, strerror(errno)); + } + + return ret; +} + +/* + * Reset target_fname to "" in the two below functions for hibernation: if + * the fcopy operation is aborted by hibernation, the daemon should remove the + * partially-copied file; to achieve this, the hv_utils driver always fakes a + * CANCEL_FCOPY message upon suspend, and later when the VM resumes back, + * the daemon calls hv_copy_cancel() to remove the file; if a file is copied + * successfully before suspend, hv_copy_finished() must reset target_fname to + * avoid that the file can be incorrectly removed upon resume, since the faked + * CANCEL_FCOPY message is spurious in this case. 
+ */ +static int hv_copy_finished(void) +{ + close(target_fd); + target_fname[0] = '\0'; + return 0; +} + +static void print_usage(char *argv[]) +{ + fprintf(stderr, "Usage: %s [options]\n" + "Options are:\n" + " -n, --no-daemon stay in foreground, don't daemonize\n" + " -h, --help print this help\n", argv[0]); +} + +static bool vmbus_prep_negotiate_resp(struct icmsg_hdr *icmsghdrp, unsigned char *buf, + unsigned int buflen, const int *fw_version, int fw_vercnt, + const int *srv_version, int srv_vercnt, + int *nego_fw_version, int *nego_srv_version) +{ + int icframe_major, icframe_minor; + int icmsg_major, icmsg_minor; + int fw_major, fw_minor; + int srv_major, srv_minor; + int i, j; + bool found_match = false; + struct icmsg_negotiate *negop; + + /* Check that there's enough space for icframe_vercnt, icmsg_vercnt */ + if (buflen < ICMSG_HDR + offsetof(struct icmsg_negotiate, reserved)) { + syslog(LOG_ERR, "Invalid icmsg negotiate"); + return false; + } + + icmsghdrp->icmsgsize = 0x10; + negop = (struct icmsg_negotiate *)&buf[ICMSG_HDR]; + + icframe_major = negop->icframe_vercnt; + icframe_minor = 0; + + icmsg_major = negop->icmsg_vercnt; + icmsg_minor = 0; + + /* Validate negop packet */ + if (icframe_major > IC_VERSION_NEGOTIATION_MAX_VER_COUNT || + icmsg_major > IC_VERSION_NEGOTIATION_MAX_VER_COUNT || + ICMSG_NEGOTIATE_PKT_SIZE(icframe_major, icmsg_major) > buflen) { + syslog(LOG_ERR, "Invalid icmsg negotiate - icframe_major: %u, icmsg_major: %u\n", + icframe_major, icmsg_major); + goto fw_error; + } + + /* + * Select the framework version number we will + * support. 
+ */ + + for (i = 0; i < fw_vercnt; i++) { + fw_major = (fw_version[i] >> 16); + fw_minor = (fw_version[i] & 0xFFFF); + + for (j = 0; j < negop->icframe_vercnt; j++) { + if (negop->icversion_data[j].major == fw_major && + negop->icversion_data[j].minor == fw_minor) { + icframe_major = negop->icversion_data[j].major; + icframe_minor = negop->icversion_data[j].minor; + found_match = true; + break; + } + } + + if (found_match) + break; + } + + if (!found_match) + goto fw_error; + + found_match = false; + + for (i = 0; i < srv_vercnt; i++) { + srv_major = (srv_version[i] >> 16); + srv_minor = (srv_version[i] & 0xFFFF); + + for (j = negop->icframe_vercnt; + (j < negop->icframe_vercnt + negop->icmsg_vercnt); + j++) { + if (negop->icversion_data[j].major == srv_major && + negop->icversion_data[j].minor == srv_minor) { + icmsg_major = negop->icversion_data[j].major; + icmsg_minor = negop->icversion_data[j].minor; + found_match = true; + break; + } + } + + if (found_match) + break; + } + + /* + * Respond with the framework and service + * version numbers we can support. 
+ */ +fw_error: + if (!found_match) { + negop->icframe_vercnt = 0; + negop->icmsg_vercnt = 0; + } else { + negop->icframe_vercnt = 1; + negop->icmsg_vercnt = 1; + } + + if (nego_fw_version) + *nego_fw_version = (icframe_major << 16) | icframe_minor; + + if (nego_srv_version) + *nego_srv_version = (icmsg_major << 16) | icmsg_minor; + + negop->icversion_data[0].major = icframe_major; + negop->icversion_data[0].minor = icframe_minor; + negop->icversion_data[1].major = icmsg_major; + negop->icversion_data[1].minor = icmsg_minor; + + return found_match; +} + +static void wcstoutf8(char *dest, const __u16 *src, size_t dest_size) +{ + size_t len = 0; + + if (!dest_size) + return; + + /* Copy ASCII code units, replace the rest with 'X', keep room for the NUL */ + while (len < dest_size - 1 && src[len]) { + if (src[len] < 0x80) + dest[len] = (char)src[len]; + else + dest[len] = 'X'; + len++; + } + + dest[len] = '\0'; +} + +static int hv_fcopy_start(struct hv_start_fcopy *smsg_in) +{ + size_t file_size, path_size; + char *file_name, *path_name; + char *in_file_name = (char *)smsg_in->file_name; + char *in_path_name = (char *)smsg_in->path_name; + int ret; + + setlocale(LC_ALL, "en_US.utf8"); + + file_size = wcstombs(NULL, (const wchar_t *restrict)in_file_name, 0) + 1; + path_size = wcstombs(NULL, (const wchar_t *restrict)in_path_name, 0) + 1; + + file_name = (char *)malloc(file_size * sizeof(char)); + path_name = (char *)malloc(path_size * sizeof(char)); + + if (!file_name || !path_name) { + free(file_name); + free(path_name); + syslog(LOG_ERR, "Can't allocate memory for file name and/or path name"); + return HV_E_FAIL; + } + + wcstoutf8(file_name, (__u16 *)in_file_name, file_size); + wcstoutf8(path_name, (__u16 *)in_path_name, path_size); + + /* hv_fcopy_create_file() copies what it needs into target_fname */ + ret = hv_fcopy_create_file(file_name, path_name, smsg_in->copy_flags); + + free(file_name); + free(path_name); + + return ret; +} + +static int hv_fcopy_send_data(struct hv_fcopy_hdr *fcopy_msg, int recvlen) +{ + int operation = fcopy_msg->operation; + + /* + * The strings sent from the host are encoded in + * utf16; convert it to utf8 strings. + * The host assures us that the utf16 strings will not exceed + * the max lengths specified. 
We will however, reserve room + * for the string terminating character - in the utf16s_utf8s() + * function we limit the size of the buffer where the converted + * string is placed to W_MAX_PATH -1 to guarantee + * that the strings can be properly terminated! + */ + + switch (operation) { + case START_FILE_COPY: + return hv_fcopy_start((struct hv_start_fcopy *)fcopy_msg); + case WRITE_TO_FILE: + return hv_copy_data((struct hv_do_fcopy *)fcopy_msg); + case COMPLETE_FCOPY: + return hv_copy_finished(); + } + + return HV_E_FAIL; +} + +/* process the packet recv from host */ +static int fcopy_pkt_process(struct vmbus_br *txbr) +{ + int ret, offset, pktlen; + int fcopy_srv_version; + const struct vmbus_chanpkt_hdr *pkt; + struct hv_fcopy_hdr *fcopy_msg; + struct icmsg_hdr *icmsghdr; + + pkt = (const struct vmbus_chanpkt_hdr *)desc; + offset = pkt->hlen << 3; + pktlen = (pkt->tlen << 3) - offset; + icmsghdr = (struct icmsg_hdr *)&desc[offset + sizeof(struct vmbuspipe_hdr)]; + icmsghdr->status = HV_E_FAIL; + + if (icmsghdr->icmsgtype == ICMSGTYPE_NEGOTIATE) { + if (vmbus_prep_negotiate_resp(icmsghdr, desc + offset, pktlen, fw_versions, + FW_VER_COUNT, fcopy_versions, FCOPY_VER_COUNT, + NULL, &fcopy_srv_version)) { + syslog(LOG_INFO, "FCopy IC version %d.%d", + fcopy_srv_version >> 16, fcopy_srv_version & 0xFFFF); + icmsghdr->status = 0; + } + } else if (icmsghdr->icmsgtype == ICMSGTYPE_FCOPY) { + /* Ensure recvlen is big enough to contain hv_fcopy_hdr */ + if (pktlen < ICMSG_HDR + sizeof(struct hv_fcopy_hdr)) { + syslog(LOG_ERR, "Invalid Fcopy hdr. 
Packet length too small: %u", + pktlen); + return -ENOBUFS; + } + + fcopy_msg = (struct hv_fcopy_hdr *)&desc[offset + ICMSG_HDR]; + icmsghdr->status = hv_fcopy_send_data(fcopy_msg, pktlen); + } + + icmsghdr->icflags = ICMSGHDRFLAG_TRANSACTION | ICMSGHDRFLAG_RESPONSE; + ret = rte_vmbus_chan_send(txbr, 0x6, desc + offset, pktlen, 0); + if (ret) { + syslog(LOG_ERR, "Write to ringbuffer failed err: %d", ret); + return ret; + } + + return 0; +} + +static void fcopy_get_first_folder(char *path, char *chan_no) +{ + DIR *dir = opendir(path); + struct dirent *entry; + + if (!dir) { + syslog(LOG_ERR, "Failed to open directory (errno=%s).\n", strerror(errno)); + return; + } + + while ((entry = readdir(dir)) != NULL) { + if (entry->d_type == DT_DIR && strcmp(entry->d_name, ".") != 0 && + strcmp(entry->d_name, "..") != 0) { + strcpy(chan_no, entry->d_name); + break; + } + } + + closedir(dir); +} + +static void fcopy_set_ring_size(char *path, char *inst, int size) +{ + char ring_size_path[MAX_PATH_LEN] = {0}; + FILE *fd; + + snprintf(ring_size_path, sizeof(ring_size_path), "%s/%s/%s", path, inst, "ring_size"); + fd = fopen(ring_size_path, "w"); + if (!fd) { + syslog(LOG_WARNING, "Failed to open ring_size file (errno=%s).\n", strerror(errno)); + return; + } + fprintf(fd, "%d", size); + fclose(fd); +} + +static char *fcopy_read_sysfs(char *path, char *buf, int len) +{ + FILE *fd; + char *ret; + + fd = fopen(path, "r"); + if (!fd) + return NULL; + + ret = fgets(buf, len, fd); + fclose(fd); + + return ret; +} + +static int fcopy_get_instance_id(char *path, char *class_id, char *inst) +{ + DIR *dir = opendir(path); + struct dirent *entry; + char tmp_path[MAX_PATH_LEN] = {0}; + char line[MAX_LINE_LEN]; + + if (!dir) { + syslog(LOG_ERR, "Failed to open directory (errno=%s).\n", strerror(errno)); + return -EINVAL; + } + + while ((entry = readdir(dir)) != NULL) { + if (entry->d_type == DT_LNK && strcmp(entry->d_name, ".") != 0 && + strcmp(entry->d_name, "..") != 0) { + /* search for the 
sysfs path with matching class_id */ + snprintf(tmp_path, sizeof(tmp_path), "%s/%s/%s", + path, entry->d_name, "class_id"); + if (!fcopy_read_sysfs(tmp_path, line, MAX_LINE_LEN)) + continue; + + /* class id matches, now fetch the instance id from device_id */ + if (strstr(line, class_id)) { + snprintf(tmp_path, sizeof(tmp_path), "%s/%s/%s", + path, entry->d_name, "device_id"); + if (!fcopy_read_sysfs(tmp_path, line, MAX_LINE_LEN)) + continue; + /* remove braces */ + strncpy(inst, line + 1, strlen(line) - 3); + break; + } + } + } + + closedir(dir); + return 0; +} + +int main(int argc, char *argv[]) +{ + int fcopy_fd = -1, tmp = 1; + int daemonize = 1, long_index = 0, opt, ret = -EINVAL; + struct vmbus_br txbr, rxbr; + void *ring; + uint32_t len = HV_RING_SIZE; + char uio_name[10] = {0}; + char uio_dev_path[15] = {0}; + char uio_path[MAX_PATH_LEN] = {0}; + char inst[MAX_LINE_LEN] = {0}; + + static struct option long_options[] = { + {"help", no_argument, 0, 'h' }, + {"no-daemon", no_argument, 0, 'n' }, + {0, 0, 0, 0 } + }; + + while ((opt = getopt_long(argc, argv, "hn", long_options, + &long_index)) != -1) { + switch (opt) { + case 'n': + daemonize = 0; + break; + case 'h': + default: + print_usage(argv); + exit(EXIT_FAILURE); + } + } + + if (daemonize && daemon(1, 0)) { + syslog(LOG_ERR, "daemon() failed; error: %s", strerror(errno)); + exit(EXIT_FAILURE); + } + + openlog("HV_UIO_FCOPY", 0, LOG_USER); + syslog(LOG_INFO, "starting; pid is:%d", getpid()); + + /* get instance id */ + if (fcopy_get_instance_id(DEVICES_SYSFS, FCOPY_CLASS_ID, inst)) + exit(EXIT_FAILURE); + + /* set ring_size value */ + fcopy_set_ring_size(DEVICES_SYSFS, inst, HV_RING_SIZE); + + /* get /dev/uioX dev path and open it */ + snprintf(uio_path, sizeof(uio_path), "%s/%s/%s", DEVICES_SYSFS, inst, "uio"); + fcopy_get_first_folder(uio_path, uio_name); + snprintf(uio_dev_path, sizeof(uio_dev_path), "/dev/%s", uio_name); + fcopy_fd = open(uio_dev_path, O_RDWR); + + if (fcopy_fd < 0) { + 
syslog(LOG_ERR, "open %s failed; error: %d %s", + uio_dev_path, errno, strerror(errno)); + syslog(LOG_ERR, "Please make sure module uio_hv_vmbus_client is loaded and" \ + " device is not used by any other application\n"); + ret = fcopy_fd; + exit(EXIT_FAILURE); + } + + ring = mmap(NULL, 2 * HV_RING_SIZE, PROT_READ | PROT_WRITE, MAP_SHARED, fcopy_fd, 0); + if (ring == MAP_FAILED) { + ret = errno; + syslog(LOG_ERR, "mmap ringbuffer failed; error: %d %s", ret, strerror(ret)); + goto close; + } + vmbus_br_setup(&txbr, ring, HV_RING_SIZE); + vmbus_br_setup(&rxbr, (char *)ring + HV_RING_SIZE, HV_RING_SIZE); + + while (1) { + /* + * In this loop we process fcopy messages after the + * handshake is complete. + */ + ret = pread(fcopy_fd, &tmp, sizeof(int), 0); + if (ret < 0) { + syslog(LOG_ERR, "pread failed: %s", strerror(errno)); + continue; + } + + len = HV_RING_SIZE; + ret = rte_vmbus_chan_recv_raw(&rxbr, desc, &len); + if (unlikely(ret <= 0)) { + /* This indicates a failure to communicate (or worse) */ + syslog(LOG_ERR, "VMBus channel recv error: %d", ret); + } else { + ret = fcopy_pkt_process(&txbr); + if (ret < 0) + goto close; + + /* Signal host */ + tmp = 1; + if ((write(fcopy_fd, &tmp, sizeof(int))) != sizeof(int)) { + ret = errno; + syslog(LOG_ERR, "Registration failed: %s\n", strerror(ret)); + goto close; + } + } + } +close: + close(fcopy_fd); + return ret; +}
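The ring-index arithmetic in vmbus_br_availwrite(), vmbus_br_availread() and vmbus_rxbr_copyfrom() can be tried outside the daemon. The following is only a sketch of that arithmetic, not code from the patch. The toy_* names and struct toy_ring are hypothetical stand-ins for struct vmbus_br and its shared indices. toy_idxinc() assumes that vmbus_br_idxinc(), which is not shown in this part of the series, advances an index and wraps it at the end of the data area.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for struct vmbus_br plus the shared indices. */
struct toy_ring {
	uint32_t windex;	/* next byte the producer will write */
	uint32_t rindex;	/* next byte the consumer will read */
	uint32_t dsize;		/* size of the data area in bytes */
	const uint8_t *data;	/* start of the data area */
};

/* Assumed behaviour of vmbus_br_idxinc(): advance and wrap once. */
uint32_t toy_idxinc(uint32_t idx, uint32_t inc, uint32_t dsize)
{
	idx += inc;
	if (idx >= dsize)
		idx -= dsize;
	return idx;
}

/* Bytes the producer may still write; equal indices mean "empty". */
uint32_t toy_availwrite(const struct toy_ring *r)
{
	if (r->windex >= r->rindex)
		return r->dsize - (r->windex - r->rindex);
	return r->rindex - r->windex;
}

/* Bytes the consumer may read: whatever is not writable. */
uint32_t toy_availread(const struct toy_ring *r)
{
	return r->dsize - toy_availwrite(r);
}

/* Copy cplen bytes starting at rindex, splitting the copy at the wrap. */
uint32_t toy_copyfrom(const struct toy_ring *r, uint32_t rindex,
		      void *dst0, uint32_t cplen)
{
	uint8_t *dst = dst0;

	assert(rindex < r->dsize && cplen <= r->dsize);

	if (cplen > r->dsize - rindex) {
		uint32_t fraglen = r->dsize - rindex;

		/* Tail of the data area first, then the start. */
		memcpy(dst, r->data + rindex, fraglen);
		memcpy(dst + fraglen, r->data, cplen - fraglen);
	} else {
		memcpy(dst, r->data + rindex, cplen);
	}

	return toy_idxinc(rindex, cplen, r->dsize);
}
```

With equal indices the ring reads as empty (all of dsize writable, nothing readable). This is presumably why the writer in vmbus_txbr_write() rejects a request when avail <= total rather than avail < total: filling the ring completely would make a full ring look the same as an empty one.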