From patchwork Wed Oct 18 08:19:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Saeed Mahameed
X-Patchwork-Id: 154749
From: Saeed Mahameed
To: Arnd Bergmann, Greg Kroah-Hartman
Cc: linux-kernel@vger.kernel.org, Leon Romanovsky, Jason Gunthorpe,
 Jiri Pirko, Saeed Mahameed
Subject: [PATCH 1/5] mlx5: Add aux dev for ctl interface
Date: Wed, 18 Oct 2023 01:19:37 -0700
Message-ID: <20231018081941.475277-2-saeed@kernel.org>
X-Mailer: git-send-email 2.41.0
In-Reply-To: <20231018081941.475277-1-saeed@kernel.org>
References: <20231018081941.475277-1-saeed@kernel.org>

From: Saeed Mahameed

Allow ctl protocol interface auxiliary driver in mlx5.

Reviewed-by: Leon Romanovsky
Reviewed-by: Jason Gunthorpe
Signed-off-by: Saeed Mahameed
---
 drivers/net/ethernet/mellanox/mlx5/core/dev.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/dev.c b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
index 7909f378dc93..f0e91793f4ad 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/dev.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/dev.c
@@ -215,8 +215,14 @@ enum {
 	MLX5_INTERFACE_PROTOCOL_MPIB,
 
 	MLX5_INTERFACE_PROTOCOL_VNET,
+	MLX5_INTERFACE_PROTOCOL_CTL,
 };
 
+static bool is_ctl_supported(struct mlx5_core_dev *dev)
+{
+	return MLX5_CAP_GEN(dev, uctx_cap);
+}
+
 static const struct mlx5_adev_device {
 	const char *suffix;
 	bool (*is_supported)(struct mlx5_core_dev *dev);
@@ -237,6 +243,8 @@ static const struct mlx5_adev_device {
 					     .is_supported = &is_ib_rep_supported },
 	[MLX5_INTERFACE_PROTOCOL_MPIB] = { .suffix = "multiport",
 					   .is_supported = &is_mp_supported },
+	[MLX5_INTERFACE_PROTOCOL_CTL] = { .suffix = "ctl",
+					  .is_supported = &is_ctl_supported },
 };
 
 int mlx5_adev_idx_alloc(void)

From patchwork Wed Oct 18 08:19:38 2023
X-Patchwork-Submitter: Saeed Mahameed
X-Patchwork-Id: 154746
From: Saeed Mahameed
To: Arnd Bergmann, Greg Kroah-Hartman
Cc: linux-kernel@vger.kernel.org, Leon Romanovsky, Jason Gunthorpe,
 Jiri Pirko, Saeed Mahameed
Subject: [PATCH 2/5] misc: mlx5ctl: Add mlx5ctl misc driver
Date: Wed, 18 Oct 2023 01:19:38 -0700
Message-ID: <20231018081941.475277-3-saeed@kernel.org>
In-Reply-To: <20231018081941.475277-1-saeed@kernel.org>
References: <20231018081941.475277-1-saeed@kernel.org>

From: Saeed Mahameed

The ConnectX HW family supported by the mlx5 drivers uses an architecture
where a FW component executes "mailbox RPCs" issued by the driver to make
changes to the device. This results in a complex debugging environment
where the FW component holds information and low-level configuration that
needs to be accessible to userspace for debugging purposes.

Historically, a userspace program was used that accessed the PCI register
and config space directly through /sys/bus/pci/.../XXX and could operate
these debugging interfaces in parallel with the running driver. This
approach is incompatible with secure boot and kernel lockdown, so this
driver provides a secure and restricted interface to that same data.

On open, the driver allocates a special FW UID (user context ID)
restricted to debug RPCs only. Later in this series, all user RPCs will
be stamped with this UID.

Reviewed-by: Leon Romanovsky
Reviewed-by: Jason Gunthorpe
Signed-off-by: Saeed Mahameed
---
 drivers/misc/Kconfig          |   1 +
 drivers/misc/Makefile         |   1 +
 drivers/misc/mlx5ctl/Kconfig  |  14 ++
 drivers/misc/mlx5ctl/Makefile |   4 +
 drivers/misc/mlx5ctl/main.c   | 314 ++++++++++++++++++++++++++++++++++
 5 files changed, 334 insertions(+)
 create mode 100644 drivers/misc/mlx5ctl/Kconfig
 create mode 100644 drivers/misc/mlx5ctl/Makefile
 create mode 100644 drivers/misc/mlx5ctl/main.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index cadd4a820c03..b46bd8edc348 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -579,4 +579,5 @@ source "drivers/misc/cardreader/Kconfig"
 source "drivers/misc/uacce/Kconfig"
 source "drivers/misc/pvpanic/Kconfig"
 source "drivers/misc/mchp_pci1xxxx/Kconfig"
+source "drivers/misc/mlx5ctl/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index f2a4d1ff65d4..49bc4697f498 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -67,3 +67,4 @@ obj-$(CONFIG_TMR_MANAGER) += xilinx_tmr_manager.o
 obj-$(CONFIG_TMR_INJECT) += xilinx_tmr_inject.o
 obj-$(CONFIG_TPS6594_ESM) += tps6594-esm.o
 obj-$(CONFIG_TPS6594_PFSM) += tps6594-pfsm.o
+obj-$(CONFIG_MLX5CTL) += mlx5ctl/
diff --git a/drivers/misc/mlx5ctl/Kconfig b/drivers/misc/mlx5ctl/Kconfig
new file mode 100644
index 000000000000..faaa1dba2cc2
--- /dev/null
+++ b/drivers/misc/mlx5ctl/Kconfig
@@ -0,0 +1,14 @@
+# SPDX-License-Identifier: GPL-2.0
+#
+
+config MLX5CTL
+	tristate "mlx5 ConnectX control misc driver"
+	depends on MLX5_CORE
+	help
+	  MLX5CTL provides interface for the user process to access the debug and
+	  configuration registers of the ConnectX hardware family
+	  (NICs, PCI switches and SmartNIC SoCs).
+	  This will allow configuration and debug tools to work out of the box on
+	  mainstream kernel.
+
+	  If you don't know what to do here, say N.
diff --git a/drivers/misc/mlx5ctl/Makefile b/drivers/misc/mlx5ctl/Makefile
new file mode 100644
index 000000000000..b5c7f99e0ab6
--- /dev/null
+++ b/drivers/misc/mlx5ctl/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_MLX5CTL) += mlx5ctl.o
+mlx5ctl-y := main.o
diff --git a/drivers/misc/mlx5ctl/main.c b/drivers/misc/mlx5ctl/main.c
new file mode 100644
index 000000000000..de8d6129432c
--- /dev/null
+++ b/drivers/misc/mlx5ctl/main.c
@@ -0,0 +1,314 @@
+// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+MODULE_DESCRIPTION("mlx5 ConnectX control misc driver");
+MODULE_AUTHOR("Saeed Mahameed ");
+MODULE_LICENSE("Dual BSD/GPL");
+
+struct mlx5ctl_dev {
+	struct mlx5_core_dev *mdev;
+	struct miscdevice miscdev;
+	struct auxiliary_device *adev;
+	struct list_head fd_list;
+	spinlock_t fd_list_lock; /* protect list add/del */
+	struct rw_semaphore rw_lock;
+	struct kref refcount;
+};
+
+struct mlx5ctl_fd {
+	u16 uctx_uid;
+	u32 uctx_cap;
+	u32 ucap; /* user cap */
+	struct mlx5ctl_dev *mcdev;
+	struct list_head list;
+};
+
+#define mlx5ctl_err(mcdev, format, ...) \
+	dev_err(mcdev->miscdev.parent, format, ##__VA_ARGS__)
+
+#define mlx5ctl_dbg(mcdev, format, ...) \
+	dev_dbg(mcdev->miscdev.parent, "PID %d: " format, \
+		current->pid, ##__VA_ARGS__)
+
+enum {
+	MLX5_UCTX_OBJECT_CAP_RAW_TX = 0x1,
+	MLX5_UCTX_OBJECT_CAP_INTERNAL_DEVICE_RESOURCES = 0x2,
+	MLX5_UCTX_OBJECT_CAP_TOOLS_RESOURCES = 0x4,
+};
+
+static int mlx5ctl_alloc_uid(struct mlx5ctl_dev *mcdev, u32 cap)
+{
+	u32 out[MLX5_ST_SZ_DW(create_uctx_out)] = {};
+	u32 in[MLX5_ST_SZ_DW(create_uctx_in)] = {};
+	void *uctx;
+	int err;
+	u16 uid;
+
+	uctx = MLX5_ADDR_OF(create_uctx_in, in, uctx);
+
+	mlx5ctl_dbg(mcdev, "MLX5_CMD_OP_CREATE_UCTX: caps 0x%x\n", cap);
+	MLX5_SET(create_uctx_in, in, opcode, MLX5_CMD_OP_CREATE_UCTX);
+	MLX5_SET(uctx, uctx, cap, cap);
+
+	err = mlx5_cmd_exec(mcdev->mdev, in, sizeof(in), out, sizeof(out));
+	if (err)
+		return err;
+
+	uid = MLX5_GET(create_uctx_out, out, uid);
+	mlx5ctl_dbg(mcdev, "allocated uid %d with caps 0x%x\n", uid, cap);
+	return uid;
+}
+
+static void mlx5ctl_release_uid(struct mlx5ctl_dev *mcdev, u16 uid)
+{
+	u32 in[MLX5_ST_SZ_DW(destroy_uctx_in)] = {};
+	struct mlx5_core_dev *mdev = mcdev->mdev;
+	int err;
+
+	MLX5_SET(destroy_uctx_in, in, opcode, MLX5_CMD_OP_DESTROY_UCTX);
+	MLX5_SET(destroy_uctx_in, in, uid, uid);
+
+	err = mlx5_cmd_exec_in(mdev, destroy_uctx, in);
+	mlx5ctl_dbg(mcdev, "released uid %d err(%d)\n", uid, err);
+}
+
+static void mcdev_get(struct mlx5ctl_dev *mcdev);
+static void mcdev_put(struct mlx5ctl_dev *mcdev);
+
+static int mlx5ctl_open_mfd(struct mlx5ctl_fd *mfd)
+{
+	struct mlx5_core_dev *mdev = mfd->mcdev->mdev;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+	u32 ucap = 0, cap = 0;
+	int uid;
+
+#define MLX5_UCTX_CAP(mdev, cap) \
+	(MLX5_CAP_GEN(mdev, uctx_cap) & MLX5_UCTX_OBJECT_CAP_##cap)
+
+	if (capable(CAP_NET_RAW) && MLX5_UCTX_CAP(mdev, RAW_TX)) {
+		ucap |= CAP_NET_RAW;
+		cap |= MLX5_UCTX_OBJECT_CAP_RAW_TX;
+	}
+
+	if (capable(CAP_SYS_RAWIO) && MLX5_UCTX_CAP(mdev, INTERNAL_DEVICE_RESOURCES)) {
+		ucap |= CAP_SYS_RAWIO;
+		cap |= MLX5_UCTX_OBJECT_CAP_INTERNAL_DEVICE_RESOURCES;
+	}
+
+	if (capable(CAP_SYS_ADMIN) && MLX5_UCTX_CAP(mdev, TOOLS_RESOURCES)) {
+		ucap |= CAP_SYS_ADMIN;
+		cap |= MLX5_UCTX_OBJECT_CAP_TOOLS_RESOURCES;
+	}
+
+	uid = mlx5ctl_alloc_uid(mcdev, cap);
+	if (uid < 0)
+		return uid;
+
+	mfd->uctx_uid = uid;
+	mfd->uctx_cap = cap;
+	mfd->ucap = ucap;
+	mfd->mcdev = mcdev;
+
+	mlx5ctl_dbg(mcdev, "allocated uid %d with uctx caps 0x%x, user cap 0x%x\n",
+		    uid, cap, ucap);
+	return 0;
+}
+
+static void mlx5ctl_release_mfd(struct mlx5ctl_fd *mfd)
+{
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+
+	mlx5ctl_release_uid(mcdev, mfd->uctx_uid);
+}
+
+static int mlx5ctl_open(struct inode *inode, struct file *file)
+{
+	struct mlx5_core_dev *mdev;
+	struct mlx5ctl_dev *mcdev;
+	struct mlx5ctl_fd *mfd;
+	int err = 0;
+
+	mcdev = container_of(file->private_data, struct mlx5ctl_dev, miscdev);
+	mcdev_get(mcdev);
+	down_read(&mcdev->rw_lock);
+	mdev = mcdev->mdev;
+	if (!mdev) {
+		err = -ENODEV;
+		goto unlock;
+	}
+
+	mfd = kzalloc(sizeof(*mfd), GFP_KERNEL_ACCOUNT);
+	if (!mfd)
+		return -ENOMEM;
+
+	mfd->mcdev = mcdev;
+	err = mlx5ctl_open_mfd(mfd);
+	if (err)
+		goto unlock;
+
+	spin_lock(&mcdev->fd_list_lock);
+	list_add_tail(&mfd->list, &mcdev->fd_list);
+	spin_unlock(&mcdev->fd_list_lock);
+
+	file->private_data = mfd;
+
+unlock:
+	up_read(&mcdev->rw_lock);
+	if (err) {
+		mcdev_put(mcdev);
+		kfree(mfd);
+	}
+	return err;
+}
+
+static int mlx5ctl_release(struct inode *inode, struct file *file)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+
+	down_read(&mcdev->rw_lock);
+	if (!mcdev->mdev) {
+		pr_debug("[%d] UID %d mlx5ctl: mdev is already released\n",
+			 current->pid, mfd->uctx_uid);
+		/* All mfds are already released, skip ... */
+		goto unlock;
+	}
+
+	spin_lock(&mcdev->fd_list_lock);
+	list_del(&mfd->list);
+	spin_unlock(&mcdev->fd_list_lock);
+
+	mlx5ctl_release_mfd(mfd);
+
+unlock:
+	kfree(mfd);
+	up_read(&mcdev->rw_lock);
+	mcdev_put(mcdev);
+	file->private_data = NULL;
+	return 0;
+}
+
+static const struct file_operations mlx5ctl_fops = {
+	.owner = THIS_MODULE,
+	.open = mlx5ctl_open,
+	.release = mlx5ctl_release,
+};
+
+static int mlx5ctl_probe(struct auxiliary_device *adev,
+			 const struct auxiliary_device_id *id)
+
+{
+	struct mlx5_adev *madev = container_of(adev, struct mlx5_adev, adev);
+	struct mlx5_core_dev *mdev = madev->mdev;
+	struct mlx5ctl_dev *mcdev;
+	char *devname = NULL;
+	int err;
+
+	mcdev = kzalloc(sizeof(*mcdev), GFP_KERNEL_ACCOUNT);
+	if (!mcdev)
+		return -ENOMEM;
+
+	kref_init(&mcdev->refcount);
+	INIT_LIST_HEAD(&mcdev->fd_list);
+	spin_lock_init(&mcdev->fd_list_lock);
+	init_rwsem(&mcdev->rw_lock);
+	mcdev->mdev = mdev;
+	mcdev->adev = adev;
+	devname = kasprintf(GFP_KERNEL_ACCOUNT, "mlx5ctl-%s",
+			    dev_name(&adev->dev));
+	if (!devname) {
+		err = -ENOMEM;
+		goto abort;
+	}
+
+	mcdev->miscdev = (struct miscdevice) {
+		.minor = MISC_DYNAMIC_MINOR,
+		.name = devname,
+		.fops = &mlx5ctl_fops,
+		.parent = &adev->dev,
+	};
+
+	err = misc_register(&mcdev->miscdev);
+	if (err) {
+		mlx5ctl_err(mcdev, "mlx5ctl: failed to register misc device err %d\n", err);
+		goto abort;
+	}
+
+	mlx5ctl_dbg(mcdev, "probe mdev@%s %s\n", dev_driver_string(mdev->device), dev_name(mdev->device));
+
+	auxiliary_set_drvdata(adev, mcdev);
+
+	return 0;
+
+abort:
+	kfree(devname);
+	kfree(mcdev);
+	return err;
+}
+
+static void mlx5ctl_remove(struct auxiliary_device *adev)
+{
+	struct mlx5ctl_dev *mcdev = auxiliary_get_drvdata(adev);
+	struct mlx5_core_dev *mdev = mcdev->mdev;
+	struct mlx5ctl_fd *mfd, *n;
+
+	misc_deregister(&mcdev->miscdev);
+	down_write(&mcdev->rw_lock);
+
+	list_for_each_entry_safe(mfd, n, &mcdev->fd_list, list) {
+		mlx5ctl_dbg(mcdev, "UID %d still has open FDs\n", mfd->uctx_uid);
+		list_del(&mfd->list);
+		mlx5ctl_release_mfd(mfd);
+	}
+
+	mlx5ctl_dbg(mcdev, "removed mdev %s %s\n",
+		    dev_driver_string(mdev->device), dev_name(mdev->device));
+
+	mcdev->mdev = NULL; /* prevent already open fds from accessing the device */
+	up_write(&mcdev->rw_lock);
+	mcdev_put(mcdev);
+}
+
+static void mcdev_free(struct kref *ref)
+{
+	struct mlx5ctl_dev *mcdev = container_of(ref, struct mlx5ctl_dev, refcount);
+
+	kfree(mcdev->miscdev.name);
+	kfree(mcdev);
+}
+
+static void mcdev_get(struct mlx5ctl_dev *mcdev)
+{
+	kref_get(&mcdev->refcount);
+}
+
+static void mcdev_put(struct mlx5ctl_dev *mcdev)
+{
+	kref_put(&mcdev->refcount, mcdev_free);
+}
+
+static const struct auxiliary_device_id mlx5ctl_id_table[] = {
+	{ .name = MLX5_ADEV_NAME ".ctl", },
+	{},
+};
+
+MODULE_DEVICE_TABLE(auxiliary, mlx5ctl_id_table);
+
+static struct auxiliary_driver mlx5ctl_driver = {
+	.name = "ctl",
+	.probe = mlx5ctl_probe,
+	.remove = mlx5ctl_remove,
+	.id_table = mlx5ctl_id_table,
+};
+
+module_auxiliary_driver(mlx5ctl_driver);

From patchwork Wed Oct 18 08:19:39 2023
X-Patchwork-Submitter: Saeed Mahameed
X-Patchwork-Id: 154750
From: Saeed Mahameed
To: Arnd Bergmann, Greg Kroah-Hartman
Cc: linux-kernel@vger.kernel.org, Leon Romanovsky, Jason Gunthorpe,
 Jiri Pirko, Saeed Mahameed
Subject: [PATCH 3/5] misc: mlx5ctl: Add info ioctl
Date: Wed, 18 Oct 2023 01:19:39 -0700
Message-ID: <20231018081941.475277-4-saeed@kernel.org>
In-Reply-To: <20231018081941.475277-1-saeed@kernel.org>
References: <20231018081941.475277-1-saeed@kernel.org>

From: Saeed Mahameed

Implement the INFO ioctl to return the allocated UID, the capability
flags, and other useful device information such as the underlying
ConnectX device.

Example:
$ mlx5ctl mlx5_core.ctl.0
mlx5dev: 0000:00:04.0
UCTX UID: 1
UCTX CAP: 0x3
DEV UCTX CAP: 0x3
USER CAP: 0x1d

Reviewed-by: Leon Romanovsky
Reviewed-by: Jason Gunthorpe
Signed-off-by: Saeed Mahameed
---
 .../userspace-api/ioctl/ioctl-number.rst |  1 +
 drivers/misc/mlx5ctl/main.c              | 72 +++++++++++++++++++
 include/uapi/misc/mlx5ctl.h              | 24 +++++++
 3 files changed, 97 insertions(+)
 create mode 100644 include/uapi/misc/mlx5ctl.h

diff --git a/Documentation/userspace-api/ioctl/ioctl-number.rst b/Documentation/userspace-api/ioctl/ioctl-number.rst
index 4ea5b837399a..9faf91ffefff 100644
--- a/Documentation/userspace-api/ioctl/ioctl-number.rst
+++ b/Documentation/userspace-api/ioctl/ioctl-number.rst
@@ -89,6 +89,7 @@ Code  Seq#    Include File                                           Comments
 0x20  all    drivers/cdrom/cm206.h
 0x22  all    scsi/sg.h
 0x3E  00-0F  linux/counter.h
+0x5c  all    uapi/misc/mlx5ctl.h                                     Nvidia ConnectX control
 '!'   00-1F  uapi/linux/seccomp.h
 '#'   00-3F                                                          IEEE 1394 Subsystem
                                                                      Block for the entire subsystem
diff --git a/drivers/misc/mlx5ctl/main.c b/drivers/misc/mlx5ctl/main.c
index de8d6129432c..008ad3a12d97 100644
--- a/drivers/misc/mlx5ctl/main.c
+++ b/drivers/misc/mlx5ctl/main.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -198,10 +199,81 @@ static int mlx5ctl_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
+static int mlx5ctl_info_ioctl(struct file *file, void __user *arg, size_t usize)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+	struct mlx5_core_dev *mdev = mcdev->mdev;
+	struct mlx5ctl_info *info;
+	void *kbuff = NULL;
+	size_t ksize = 0;
+	int err = 0;
+
+	ksize = max(sizeof(struct mlx5ctl_info), usize);
+	kbuff = kzalloc(ksize, GFP_KERNEL_ACCOUNT);
+	if (!kbuff)
+		return -ENOMEM;
+
+	info = kbuff;
+
+	info->size = sizeof(struct mlx5ctl_info);
+	info->flags = 0;
+
+	info->dev_uctx_cap = MLX5_CAP_GEN(mdev, uctx_cap);
+	info->uctx_cap = mfd->uctx_cap;
+	info->uctx_uid = mfd->uctx_uid;
+	info->ucap = mfd->ucap;
+
+	strscpy(info->devname, dev_name(&mdev->pdev->dev), sizeof(info->devname));
+
+	if (copy_to_user(arg, kbuff, usize))
+		err = -EFAULT;
+
+	kfree(kbuff);
+	return err;
+}
+
+static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd,
+			     unsigned long arg)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+	void __user *argp = (void __user *)arg;
+	size_t size = _IOC_SIZE(cmd);
+	int err = 0;
+
+	if (!capable(CAP_SYS_ADMIN))
+		return -EPERM;
+
+	mlx5ctl_dbg(mcdev, "ioctl 0x%x type/nr: %d/%d size: %d DIR:%d\n", cmd,
+		    _IOC_TYPE(cmd), _IOC_NR(cmd), _IOC_SIZE(cmd), _IOC_DIR(cmd));
+
+	down_read(&mcdev->rw_lock);
+	if (!mcdev->mdev) {
+		err = -ENODEV;
+		goto unlock;
+	}
+
+	switch (cmd) {
+	case MLX5CTL_IOCTL_INFO:
+		err = mlx5ctl_info_ioctl(file, argp, size);
+		break;
+
+	default:
+		mlx5ctl_dbg(mcdev, "Unknown ioctl %x\n", cmd);
+		err = -ENOIOCTLCMD;
+		break;
+	}
+unlock:
+	up_read(&mcdev->rw_lock);
+	return err;
+}
+
 static const struct file_operations mlx5ctl_fops = {
 	.owner = THIS_MODULE,
 	.open = mlx5ctl_open,
 	.release = mlx5ctl_release,
+	.unlocked_ioctl = mlx5ctl_ioctl,
 };
 
 static int mlx5ctl_probe(struct auxiliary_device *adev,
diff --git a/include/uapi/misc/mlx5ctl.h b/include/uapi/misc/mlx5ctl.h
new file mode 100644
index 000000000000..81d89cd285fc
--- /dev/null
+++ b/include/uapi/misc/mlx5ctl.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB WITH Linux-syscall-note */
+/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */
+
+#ifndef __MLX5CTL_IOCTL_H__
+#define __MLX5CTL_IOCTL_H__
+
+struct mlx5ctl_info {
+	__aligned_u64 flags;
+	__u32 size;
+	__u8 devname[64]; /* underlying ConnectX device */
+	__u16 uctx_uid; /* current process allocated UCTX UID */
+	__u16 reserved1;
+	__u32 uctx_cap; /* current process effective UCTX cap */
+	__u32 dev_uctx_cap; /* device's UCTX capabilities */
+	__u32 ucap; /* process user capability */
+	__u32 reserved2[4];
+};
+
+#define MLX5CTL_IOCTL_MAGIC 0x5c
+
+#define MLX5CTL_IOCTL_INFO \
+	_IOR(MLX5CTL_IOCTL_MAGIC, 0x0, struct mlx5ctl_info)
+
+#endif /* __MLX5CTL_IOCTL_H__ */

From patchwork Wed Oct 18 08:19:40 2023
X-Patchwork-Submitter: Saeed Mahameed
X-Patchwork-Id: 154747
From: Saeed Mahameed
To: Arnd Bergmann, Greg Kroah-Hartman
Cc: linux-kernel@vger.kernel.org, Leon Romanovsky, Jason Gunthorpe,
 Jiri Pirko, Saeed Mahameed
Subject: [PATCH 4/5] misc: mlx5ctl: Add command rpc ioctl
Date: Wed, 18 Oct 2023 01:19:40 -0700
Message-ID: <20231018081941.475277-5-saeed@kernel.org>
In-Reply-To: <20231018081941.475277-1-saeed@kernel.org>
References: <20231018081941.475277-1-saeed@kernel.org>
X-GMAIL-THRID: 1780080720741913581
X-GMAIL-MSGID: 
1780080720741913581

From: Saeed Mahameed

Add a new ioctl that lets user space send device debug rpcs; the driver
attaches the user's uctx UID to each rpc.

In the mlx5 architecture, each FW RPC command consists of an inbox buffer
and an outbox buffer. The inbox buffer contains the command rpc layout as
described in the ConnectX Programmers Reference Manual (PRM) document and
as defined in include/linux/mlx5/mlx5_ifc.h.

On success, the user outbox buffer is filled with the device's rpc
response.

For example, to query device capabilities, a user fills out an inbox
buffer with the inbox layout:
  struct mlx5_ifc_query_hca_cap_in_bits
and expects an outbox buffer with the layout:
  struct mlx5_ifc_cmd_hca_cap_bits

Reviewed-by: Leon Romanovsky
Reviewed-by: Jason Gunthorpe
Signed-off-by: Saeed Mahameed
---
 drivers/misc/mlx5ctl/main.c | 93 +++++++++++++++++++++++++++++++++++++
 include/uapi/misc/mlx5ctl.h | 13 ++++++
 2 files changed, 106 insertions(+)

diff --git a/drivers/misc/mlx5ctl/main.c b/drivers/misc/mlx5ctl/main.c
index 008ad3a12d97..5f4edcc3e112 100644
--- a/drivers/misc/mlx5ctl/main.c
+++ b/drivers/misc/mlx5ctl/main.c
@@ -233,6 +233,95 @@ static int mlx5ctl_info_ioctl(struct file *file, void __user *arg, size_t usize)
 	return err;
 }
 
+struct mlx5_ifc_mbox_in_hdr_bits {
+	u8 opcode[0x10];
+	u8 uid[0x10];
+
+	u8 reserved_at_20[0x10];
+	u8 op_mod[0x10];
+
+	u8 reserved_at_40[0x40];
+};
+
+struct mlx5_ifc_mbox_out_hdr_bits {
+	u8 status[0x8];
+	u8 reserved_at_8[0x18];
+
+	u8 syndrome[0x20];
+
+	u8 reserved_at_40[0x40];
+};
+
+static int mlx5ctl_cmdrpc_ioctl(struct file *file, void __user *arg, size_t usize)
+{
+	struct mlx5ctl_fd *mfd = file->private_data;
+	struct mlx5ctl_dev *mcdev = mfd->mcdev;
+	struct mlx5ctl_cmdrpc *rpc = NULL;
+	void *in = NULL, *out = NULL;
+	size_t ksize = 0;
+	int err;
+
+	ksize = max(sizeof(struct mlx5ctl_cmdrpc), usize);
+	rpc = kzalloc(ksize, GFP_KERNEL_ACCOUNT);
+	if (!rpc)
+		return -ENOMEM;
+
+	err = copy_from_user(rpc, arg, usize) ? -EFAULT : 0;
+	if (err)
+		goto
out; + + mlx5ctl_dbg(mcdev, "[UID %d] cmdrpc: rpc->inlen %d rpc->outlen %d\n", + mfd->uctx_uid, rpc->inlen, rpc->outlen); + + if (rpc->inlen < MLX5_ST_SZ_BYTES(mbox_in_hdr) || + rpc->outlen < MLX5_ST_SZ_BYTES(mbox_out_hdr) || + rpc->inlen > MLX5CTL_MAX_RPC_SIZE || + rpc->outlen > MLX5CTL_MAX_RPC_SIZE) { + err = -EINVAL; + goto out; + } + + if (rpc->flags) { + err = -EOPNOTSUPP; + goto out; + } + + in = memdup_user(u64_to_user_ptr(rpc->in), rpc->inlen); + if (IS_ERR(in)) { + err = PTR_ERR(in); + goto out; + } + + out = kvzalloc(rpc->outlen, GFP_KERNEL_ACCOUNT); + if (!out) { + err = -ENOMEM; + goto out; + } + + mlx5ctl_dbg(mcdev, "[UID %d] cmdif: opcode 0x%x inlen %d outlen %d\n", + mfd->uctx_uid, + MLX5_GET(mbox_in_hdr, in, opcode), rpc->inlen, rpc->outlen); + + MLX5_SET(mbox_in_hdr, in, uid, mfd->uctx_uid); + err = mlx5_cmd_do(mcdev->mdev, in, rpc->inlen, out, rpc->outlen); + mlx5ctl_dbg(mcdev, "[UID %d] cmdif: opcode 0x%x retval %d\n", + mfd->uctx_uid, + MLX5_GET(mbox_in_hdr, in, opcode), err); + + /* -EREMOTEIO means outbox is valid, but out.status is not */ + if (!err || err == -EREMOTEIO) { + err = 0; + if (copy_to_user(u64_to_user_ptr(rpc->out), out, rpc->outlen)) + err = -EFAULT; + } + +out: + kvfree(out); + kfree(in); + kfree(rpc); + return err; +} + static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -259,6 +348,10 @@ static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd, err = mlx5ctl_info_ioctl(file, argp, size); break; + case MLX5CTL_IOCTL_CMDRPC: + err = mlx5ctl_cmdrpc_ioctl(file, argp, size); + break; + default: mlx5ctl_dbg(mcdev, "Unknown ioctl %x\n", cmd); err = -ENOIOCTLCMD; diff --git a/include/uapi/misc/mlx5ctl.h b/include/uapi/misc/mlx5ctl.h index 81d89cd285fc..49c26ccc2d21 100644 --- a/include/uapi/misc/mlx5ctl.h +++ b/include/uapi/misc/mlx5ctl.h @@ -16,9 +16,22 @@ struct mlx5ctl_info { __u32 reserved2[4]; }; +struct mlx5ctl_cmdrpc { + __aligned_u64 in; /* RPC inbox buffer user address */ + 
__aligned_u64 out; /* RPC outbox buffer user address */ + __u32 inlen; /* inbox buffer length */ + __u32 outlen; /* outbox buffer length */ + __aligned_u64 flags; +}; + +#define MLX5CTL_MAX_RPC_SIZE 8192 + #define MLX5CTL_IOCTL_MAGIC 0x5c #define MLX5CTL_IOCTL_INFO \ _IOR(MLX5CTL_IOCTL_MAGIC, 0x0, struct mlx5ctl_info) +#define MLX5CTL_IOCTL_CMDRPC \ + _IOWR(MLX5CTL_IOCTL_MAGIC, 0x1, struct mlx5ctl_cmdrpc) + #endif /* __MLX5CTL_IOCTL_H__ */ From patchwork Wed Oct 18 08:19:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Saeed Mahameed X-Patchwork-Id: 154748 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2908:b0:403:3b70:6f57 with SMTP id ib8csp4640129vqb; Wed, 18 Oct 2023 01:21:05 -0700 (PDT) X-Google-Smtp-Source: AGHT+IHukY2fAcvYB3Q31QTA1FIIY4bPFAMXfoziy2IX+7YFxPi2d0G/+tgZKkMSSToyRK9KvvXa X-Received: by 2002:a05:6358:72a6:b0:166:d9c9:dbe with SMTP id w38-20020a05635872a600b00166d9c90dbemr5057189rwf.3.1697617265498; Wed, 18 Oct 2023 01:21:05 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1697617265; cv=none; d=google.com; s=arc-20160816; b=kJhIaQPwYUIYQO4BCRE/HR5u7P9nhjB91f/wdPY9q7RGJg5s43JFfIXUz8elZu21wy eZ7A1FD24uu/nP/phrEoSCqVkMMQALkTCYuozC7zD7/epXi/ikAlBRUpn+/KolN6+dJB izYAabAXE/hlKBtP3PKf7kH7sBmsLpUhPLx6rehhXewd2+HJby8twHYQGhoUefHddR3k 8ye7aXaHmv/A51Sj8tClujglQY06Ug6/yuuHKsAVls2DJuG6lBGHfY7wkj3LafnkzZeU X5uPz1izumPsyaYdpVqh9YnRWEUEDe7zms09oqEwjcSJpwLu3oJM/lTfM+A/CjzrZvSG bV2A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=zM5AMfPNvLGtWFEFqn8w/VPpZsgagOzp5P/6bLPjUPs=; fh=OTS644v5rY1g/BUg247H5CRHtQNgRTQkhSPfJvy2RGE=; b=nLNzQa3kfOTcyeaq1zzRO5EGaPpsKA0BIkTlt8n3t/UQPWQO3xUscxJJ43TYZqXAv6 LjaI9QiRewIi6CvxqWsOuueLolnk1MUu+VgoK7NvJkhBwJXKaD7bN2P3kg0pit94/p1u 
Qg662QhfC5+IzprqFoTooIqhzLGVKfxPYWDYMOjPPoxXKbBVOclHNvPTYpemKQvbMu3U 3kQ3cE+rW+eStjmuC+E+15AcwHLxg6uFIO1/xwHld1LT62OIuLUBD+3LsxK6W9L9zk5N Q+tP7Zb1h4p0yQKoO60vtOi2UvgKOFP/QA3Vc+cC8IxE0PXeZwfpPNktuY3EiHR77pNq PPvQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=un3d2nsj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from snail.vger.email (snail.vger.email. [23.128.96.37]) by mx.google.com with ESMTPS id t4-20020a63f344000000b005778df5647dsi1583512pgj.401.2023.10.18.01.21.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 18 Oct 2023 01:21:05 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) client-ip=23.128.96.37; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=un3d2nsj; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.37 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by snail.vger.email (Postfix) with ESMTP id 72B34812D23D; Wed, 18 Oct 2023 01:21:04 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at snail.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235190AbjJRIUt (ORCPT + 24 others); Wed, 18 Oct 2023 04:20:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235118AbjJRIU2 (ORCPT ); Wed, 18 Oct 2023 04:20:28 -0400 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1FB2FA for ; Wed, 18 Oct 2023 01:20:25 -0700 (PDT) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9D91CC433C8; Wed, 18 Oct 2023 08:20:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1697617225; bh=3i3+doLFvzlAfHeafN/PxeF54arnKvsliH5HQmE+QCc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=un3d2nsjBx7ZW3me7tYYwktCtIGqEr0RKSuvdNmT/eKIamkNJMNwkK8cqBAGJNtyH UREoncbnVmu8hZ5cbncWQSe2Lf41HWuNCZqTitbFc2dPATfkjJj6GEGWVX5F0PeiVf MyJsc+1pqEDDnVNulSs/XIa44jl8g+BOYDoIWAadqQtQ3KtLr7Iqj85FtHvBNHIQaE lFH63q5zUVjEFmICkTZovlAOdlS2PFSH3WY9sSZT2KzZ3vLIRcbCm2N55vD4uq5aE5 zudQhVSC46UJax8Zso78FNIkaACzsySUzEohH96ifYuJ0mOUnJ+GnrUdYJuxxF1i+E rzybnLITLKb4g== From: Saeed Mahameed To: Arnd Bergmann , Greg Kroah-Hartman Cc: linux-kernel@vger.kernel.org, Leon Romanovsky , Jason Gunthorpe , Jiri Pirko , Saeed Mahameed Subject: [PATCH 5/5] misc: mlx5ctl: Add umem reg/unreg ioctl Date: Wed, 18 Oct 2023 01:19:41 -0700 Message-ID: <20231018081941.475277-6-saeed@kernel.org> X-Mailer: git-send-email 2.41.0 In-Reply-To: <20231018081941.475277-1-saeed@kernel.org> References: <20231018081941.475277-1-saeed@kernel.org> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (snail.vger.email [0.0.0.0]); Wed, 18 Oct 2023 01:21:04 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1780080721890516957 X-GMAIL-MSGID: 1780080721890516957 From: Saeed Mahameed Command rpc outbox buffer is limited in size, which can be very annoying when trying to pull large traces out of the device. 
Many device rpcs offer the ability to scatter output traces, contexts
and logs directly into user space buffers in a single shot.

Allow the user to register user memory, so the device can dump
information directly into it.

The registered memory is described by a device UMEM object which has a
unique umem_id. This umem_id can later be used in the rpc inbox to tell
the device where to place the response output, e.g. HW traces and other
debug object queries.

To do so, this patch introduces two ioctls:

MLX5CTL_IOCTL_UMEM_REG(va_address, size):
 - calculate page fragments from the user-provided virtual address
 - pin the pages, and allocate an sg list
 - dma map the sg list
 - create a UMEM device object that points to the dma addresses
 - add a driver umem object to an xarray database for bookkeeping
 - return the UMEM ID to the user so it can be used in subsequent rpcs

MLX5CTL_IOCTL_UMEM_UNREG(umem_id):
 - the user provides a previously allocated umem ID
 - unwinds the above

Example use case: a ConnectX device coredump can be as large as 2MB.
Using inline rpcs, it takes thousands of rpcs to get the full coredump,
which can take multiple seconds.

With UMEM, it can be done in a single rpc, using 2MB of umem user buffer.

$ mlx5ctl mlx5_core.ctl.0 coredump --umem_size=$(( 2 ** 20 ))

00 00 00 00 01 00 20 00 00 00 00 04 00 00 48 ec
00 00 00 08 00 00 00 00 00 00 00 0c 00 00 00 03
00 00 00 10 00 00 00 00 00 00 00 14 00 00 00 00
....
00 50 0b 3c 00 00 00 00 00 50 0b 40 00 00 00 00
00 50 0b 44 00 00 00 00 00 50 0b 48 00 00 00 00
00 50 0c 00 00 00 00 00

INFO : Core dump done
INFO : Core dump size 831304
INFO : Core dump address 0x0
INFO : Core dump cookie 0x500c04
INFO : More Dump 0

Other use cases are: dynamic HW and FW trace monitoring, high-frequency
diagnostic counter sampling, and batched object and resource dumps.
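
The first MLX5CTL_IOCTL_UMEM_REG step listed above, working out which pages
a user buffer covers, comes down to rounding the start address down and the
end address up to page boundaries. The following is a minimal user-space
sketch of that arithmetic, not the driver code itself. It assumes 4 KiB
pages (the driver uses the kernel's PAGE_SIZE), and the sketch_* names are
illustrative only and do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The kernel uses the architecture's PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096ULL

/*
 * Number of pages touched by the byte range [addr, addr + len).
 * Overflow of addr + len is not handled here; the caller is assumed to
 * have rejected such ranges beforehand.
 */
static size_t sketch_num_pages(uint64_t addr, uint64_t len)
{
	/* Round the start of the range down to a page boundary. */
	uint64_t first = addr & ~(SKETCH_PAGE_SIZE - 1);
	/* Round the end of the range up to a page boundary. */
	uint64_t last = (addr + len + SKETCH_PAGE_SIZE - 1) &
			~(SKETCH_PAGE_SIZE - 1);

	return (size_t)((last - first) / SKETCH_PAGE_SIZE);
}

/* Byte offset of addr inside its first page. */
static size_t sketch_page_offset(uint64_t addr)
{
	return (size_t)(addr & (SKETCH_PAGE_SIZE - 1));
}
```

A buffer that is not page aligned can cover one page more than its length
alone suggests, which is why both the page count and the offset into the
first page are needed to describe it.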
Reviewed-by: Leon Romanovsky Reviewed-by: Jason Gunthorpe Signed-off-by: Saeed Mahameed --- drivers/misc/mlx5ctl/Makefile | 1 + drivers/misc/mlx5ctl/main.c | 49 +++++ drivers/misc/mlx5ctl/umem.c | 325 ++++++++++++++++++++++++++++++++++ drivers/misc/mlx5ctl/umem.h | 17 ++ include/uapi/misc/mlx5ctl.h | 14 ++ 5 files changed, 406 insertions(+) create mode 100644 drivers/misc/mlx5ctl/umem.c create mode 100644 drivers/misc/mlx5ctl/umem.h diff --git a/drivers/misc/mlx5ctl/Makefile b/drivers/misc/mlx5ctl/Makefile index b5c7f99e0ab6..f35234e931a8 100644 --- a/drivers/misc/mlx5ctl/Makefile +++ b/drivers/misc/mlx5ctl/Makefile @@ -2,3 +2,4 @@ obj-$(CONFIG_MLX5CTL) += mlx5ctl.o mlx5ctl-y := main.o +mlx5ctl-y += umem.o diff --git a/drivers/misc/mlx5ctl/main.c b/drivers/misc/mlx5ctl/main.c index 5f4edcc3e112..d4d72689f6e9 100644 --- a/drivers/misc/mlx5ctl/main.c +++ b/drivers/misc/mlx5ctl/main.c @@ -12,6 +12,8 @@ #include #include +#include "umem.h" + MODULE_DESCRIPTION("mlx5 ConnectX control misc driver"); MODULE_AUTHOR("Saeed Mahameed "); MODULE_LICENSE("Dual BSD/GPL"); @@ -30,6 +32,8 @@ struct mlx5ctl_fd { u16 uctx_uid; u32 uctx_cap; u32 ucap; /* user cap */ + + struct mlx5ctl_umem_db *umem_db; struct mlx5ctl_dev *mcdev; struct list_head list; }; @@ -115,6 +119,12 @@ static int mlx5ctl_open_mfd(struct mlx5ctl_fd *mfd) if (uid < 0) return uid; + mfd->umem_db = mlx5ctl_umem_db_create(mdev, uid); + if (IS_ERR(mfd->umem_db)) { + mlx5ctl_release_uid(mcdev, uid); + return PTR_ERR(mfd->umem_db); + } + mfd->uctx_uid = uid; mfd->uctx_cap = cap; mfd->ucap = ucap; @@ -129,6 +139,7 @@ static void mlx5ctl_release_mfd(struct mlx5ctl_fd *mfd) { struct mlx5ctl_dev *mcdev = mfd->mcdev; + mlx5ctl_umem_db_destroy(mfd->umem_db); mlx5ctl_release_uid(mcdev, mfd->uctx_uid); } @@ -322,6 +333,36 @@ static int mlx5ctl_cmdrpc_ioctl(struct file *file, void __user *arg, size_t usiz return err; } +static ssize_t mlx5ctl_ioctl_umem_reg(struct file *file, unsigned long arg) +{ + struct mlx5ctl_fd *mfd = 
file->private_data; + struct mlx5ctl_umem_reg umem_reg; + int umem_id; + + if (copy_from_user(&umem_reg, (void __user *)arg, sizeof(umem_reg))) + return -EFAULT; + + umem_id = mlx5ctl_umem_reg(mfd->umem_db, (unsigned long)umem_reg.addr, umem_reg.len); + if (umem_id < 0) + return umem_id; + + umem_reg.umem_id = umem_id; + + if (copy_to_user((void __user *)arg, &umem_reg, sizeof(umem_reg))) { + mlx5ctl_umem_unreg(mfd->umem_db, umem_id); + return -EFAULT; + } + + return 0; +} + +static size_t mlx5ctl_ioctl_umem_unreg(struct file *file, unsigned long arg) +{ + struct mlx5ctl_fd *mfd = file->private_data; + + return mlx5ctl_umem_unreg(mfd->umem_db, (u32)arg); +} + static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { @@ -352,6 +393,14 @@ static ssize_t mlx5ctl_ioctl(struct file *file, unsigned int cmd, err = mlx5ctl_cmdrpc_ioctl(file, argp, size); break; + case MLX5CTL_IOCTL_UMEM_REG: + err = mlx5ctl_ioctl_umem_reg(file, arg); + break; + + case MLX5CTL_IOCTL_UMEM_UNREG: + err = mlx5ctl_ioctl_umem_unreg(file, arg); + break; + default: mlx5ctl_dbg(mcdev, "Unknown ioctl %x\n", cmd); err = -ENOIOCTLCMD; diff --git a/drivers/misc/mlx5ctl/umem.c b/drivers/misc/mlx5ctl/umem.c new file mode 100644 index 000000000000..c21b54d24762 --- /dev/null +++ b/drivers/misc/mlx5ctl/umem.c @@ -0,0 +1,325 @@ +// SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */ + +#include +#include +#include + +#include "umem.h" + +#define umem_dbg(__mdev, fmt, ...) 
\ + dev_dbg((__mdev)->device, "mlx5ctl_umem: " fmt, ##__VA_ARGS__) + +#define MLX5CTL_UMEM_MAX_MB 64 + +static size_t umem_num_pages(u64 addr, size_t len) +{ + return (size_t)((ALIGN(addr + len, PAGE_SIZE) - + ALIGN_DOWN(addr, PAGE_SIZE))) / + PAGE_SIZE; +} + +struct mlx5ctl_umem { + struct sg_table sgt; + unsigned long addr; + size_t size; + size_t offset; + size_t npages; + struct task_struct *source_task; + struct mm_struct *source_mm; + struct user_struct *source_user; + u32 umem_id; + struct page **page_list; +}; + +struct mlx5ctl_umem_db { + struct xarray xarray; + struct mlx5_core_dev *mdev; + u32 uctx_uid; +}; + +static int inc_user_locked_vm(struct mlx5ctl_umem *umem, unsigned long npages) +{ + unsigned long lock_limit; + unsigned long cur_pages; + unsigned long new_pages; + + lock_limit = task_rlimit(umem->source_task, RLIMIT_MEMLOCK) >> + PAGE_SHIFT; + do { + cur_pages = atomic_long_read(&umem->source_user->locked_vm); + new_pages = cur_pages + npages; + if (new_pages > lock_limit) + return -ENOMEM; + } while (atomic_long_cmpxchg(&umem->source_user->locked_vm, cur_pages, + new_pages) != cur_pages); + return 0; +} + +static void dec_user_locked_vm(struct mlx5ctl_umem *umem, unsigned long npages) +{ + if (WARN_ON(atomic_long_read(&umem->source_user->locked_vm) < npages)) + return; + atomic_long_sub(npages, &umem->source_user->locked_vm); +} + +static struct mlx5ctl_umem *mlx5ctl_umem_pin(struct mlx5ctl_umem_db *umem_db, + unsigned long addr, size_t size) +{ + size_t npages = umem_num_pages(addr, size); + struct mlx5_core_dev *mdev = umem_db->mdev; + unsigned long endaddr = addr + size; + struct mlx5ctl_umem *umem; + struct page **page_list; + int err = -EINVAL; + int pinned = 0; + + umem_dbg(mdev, "%s: addr %p size %zu npages %zu\n", + __func__, (void *)addr, size, npages); + + /* Avoid integer overflow */ + if (endaddr < addr || PAGE_ALIGN(endaddr) < endaddr) + return ERR_PTR(-EINVAL); + + if (npages == 0 || pages_to_mb(npages) > MLX5CTL_UMEM_MAX_MB) + 
return ERR_PTR(-EINVAL); + + page_list = kvmalloc_array(npages, sizeof(struct page *), GFP_KERNEL_ACCOUNT); + if (!page_list) + return ERR_PTR(-ENOMEM); + + umem = kzalloc(sizeof(*umem), GFP_KERNEL_ACCOUNT); + if (!umem) { + kvfree(page_list); + return ERR_PTR(-ENOMEM); + } + + umem->addr = addr; + umem->size = size; + umem->offset = addr & ~PAGE_MASK; + umem->npages = npages; + + umem->page_list = page_list; + umem->source_mm = current->mm; + umem->source_task = current->group_leader; + get_task_struct(current->group_leader); + umem->source_user = get_uid(current_user()); + + /* mm and RLIMIT_MEMLOCK user task accounting similar to what is + * being done in iopt_alloc_pages() and do_update_pinned() + * for IOPT_PAGES_ACCOUNT_USER @drivers/iommu/iommufd/pages.c + */ + mmgrab(umem->source_mm); + + pinned = pin_user_pages_fast(addr, npages, FOLL_WRITE, page_list); + if (pinned != npages) { + umem_dbg(mdev, "pin_user_pages_fast failed %d\n", pinned); + err = pinned < 0 ? pinned : -ENOMEM; + goto pin_failed; + } + + err = inc_user_locked_vm(umem, npages); + if (err) + goto pin_failed; + + atomic64_add(npages, &umem->source_mm->pinned_vm); + + err = sg_alloc_table_from_pages(&umem->sgt, page_list, npages, 0, + npages << PAGE_SHIFT, GFP_KERNEL_ACCOUNT); + if (err) { + umem_dbg(mdev, "sg_alloc_table failed: %d\n", err); + goto sgt_failed; + } + + umem_dbg(mdev, "\tsgt: size %zu npages %zu sgt.nents (%d)\n", + size, npages, umem->sgt.nents); + + err = dma_map_sgtable(mdev->device, &umem->sgt, DMA_BIDIRECTIONAL, 0); + if (err) { + umem_dbg(mdev, "dma_map_sgtable failed: %d\n", err); + goto dma_failed; + } + + umem_dbg(mdev, "\tsgt: dma_nents %d\n", umem->sgt.nents); + return umem; + +dma_failed: +sgt_failed: + sg_free_table(&umem->sgt); + atomic64_sub(npages, &umem->source_mm->pinned_vm); + dec_user_locked_vm(umem, npages); +pin_failed: + if (pinned > 0) + unpin_user_pages(page_list, pinned); + mmdrop(umem->source_mm); + free_uid(umem->source_user); + 
put_task_struct(umem->source_task); + + kfree(umem); + kvfree(page_list); + return ERR_PTR(err); +} + +static void mlx5ctl_umem_unpin(struct mlx5ctl_umem_db *umem_db, + struct mlx5ctl_umem *umem) +{ + struct mlx5_core_dev *mdev = umem_db->mdev; + + umem_dbg(mdev, "%s: addr %p size %zu npages %zu dma_nents %d\n", + __func__, (void *)umem->addr, umem->size, umem->npages, + umem->sgt.nents); + + dma_unmap_sgtable(mdev->device, &umem->sgt, DMA_BIDIRECTIONAL, 0); + sg_free_table(&umem->sgt); + + atomic64_sub(umem->npages, &umem->source_mm->pinned_vm); + dec_user_locked_vm(umem, umem->npages); + unpin_user_pages(umem->page_list, umem->npages); + mmdrop(umem->source_mm); + free_uid(umem->source_user); + put_task_struct(umem->source_task); + + kvfree(umem->page_list); + kfree(umem); +} + +static int mlx5ctl_umem_create(struct mlx5_core_dev *mdev, + struct mlx5ctl_umem *umem, u32 uid) +{ + u32 out[MLX5_ST_SZ_DW(create_umem_out)] = {}; + int err, inlen, i, n = 0; + struct scatterlist *sg; + void *in, *umemptr; + __be64 *mtt; + + inlen = MLX5_ST_SZ_BYTES(create_umem_in) + + umem->npages * MLX5_ST_SZ_BYTES(mtt); + + in = kzalloc(inlen, GFP_KERNEL_ACCOUNT); + if (!in) + return -ENOMEM; + + MLX5_SET(create_umem_in, in, opcode, MLX5_CMD_OP_CREATE_UMEM); + MLX5_SET(create_umem_in, in, uid, uid); + + umemptr = MLX5_ADDR_OF(create_umem_in, in, umem); + + MLX5_SET(umem, umemptr, log_page_size, + PAGE_SHIFT - MLX5_ADAPTER_PAGE_SHIFT); + MLX5_SET64(umem, umemptr, num_of_mtt, umem->npages); + MLX5_SET(umem, umemptr, page_offset, umem->offset); + + umem_dbg(mdev, + "UMEM CREATE: log_page_size %d num_of_mtt %lld page_offset %d\n", + MLX5_GET(umem, umemptr, log_page_size), + MLX5_GET64(umem, umemptr, num_of_mtt), + MLX5_GET(umem, umemptr, page_offset)); + + mtt = MLX5_ADDR_OF(create_umem_in, in, umem.mtt); + for_each_sgtable_dma_sg(&umem->sgt, sg, i) { + u64 dma_addr = sg_dma_address(sg); + ssize_t len = sg_dma_len(sg); + + for (; n < umem->npages && len > 0; n++, mtt++) { + *mtt = 
cpu_to_be64(dma_addr); + MLX5_SET(mtt, mtt, wr_en, 1); + MLX5_SET(mtt, mtt, rd_en, 1); + dma_addr += PAGE_SIZE; + len -= PAGE_SIZE; + } + WARN_ON_ONCE(n == umem->npages && len > 0); + } + + err = mlx5_cmd_exec(mdev, in, inlen, out, sizeof(out)); + if (err) + goto out; + + umem->umem_id = MLX5_GET(create_umem_out, out, umem_id); + umem_dbg(mdev, "\tUMEM CREATED: umem_id %d\n", umem->umem_id); +out: + kfree(in); + return err; +} + +static void mlx5ctl_umem_destroy(struct mlx5_core_dev *mdev, + struct mlx5ctl_umem *umem) +{ + u32 in[MLX5_ST_SZ_DW(destroy_umem_in)] = {}; + + MLX5_SET(destroy_umem_in, in, opcode, MLX5_CMD_OP_DESTROY_UMEM); + MLX5_SET(destroy_umem_in, in, umem_id, umem->umem_id); + + umem_dbg(mdev, "UMEM DESTROY: umem_id %d\n", umem->umem_id); + mlx5_cmd_exec_in(mdev, destroy_umem, in); +} + +int mlx5ctl_umem_reg(struct mlx5ctl_umem_db *umem_db, unsigned long addr, + size_t size) +{ + struct mlx5ctl_umem *umem; + void *ret; + int err; + + umem = mlx5ctl_umem_pin(umem_db, addr, size); + if (IS_ERR(umem)) + return PTR_ERR(umem); + + err = mlx5ctl_umem_create(umem_db->mdev, umem, umem_db->uctx_uid); + if (err) + goto umem_create_err; + + ret = xa_store(&umem_db->xarray, umem->umem_id, umem, GFP_KERNEL_ACCOUNT); + if (WARN(xa_is_err(ret), "Failed to store UMEM")) { + err = xa_err(ret); + goto xa_store_err; + } + + return umem->umem_id; + +xa_store_err: + mlx5ctl_umem_destroy(umem_db->mdev, umem); +umem_create_err: + mlx5ctl_umem_unpin(umem_db, umem); + return err; +} + +int mlx5ctl_umem_unreg(struct mlx5ctl_umem_db *umem_db, u32 umem_id) +{ + struct mlx5ctl_umem *umem; + + umem = xa_erase(&umem_db->xarray, umem_id); + if (!umem) + return -ENOENT; + + mlx5ctl_umem_destroy(umem_db->mdev, umem); + mlx5ctl_umem_unpin(umem_db, umem); + return 0; +} + +struct mlx5ctl_umem_db *mlx5ctl_umem_db_create(struct mlx5_core_dev *mdev, + u32 uctx_uid) +{ + struct mlx5ctl_umem_db *umem_db; + + umem_db = kzalloc(sizeof(*umem_db), GFP_KERNEL_ACCOUNT); + if (!umem_db) + return 
ERR_PTR(-ENOMEM); + + xa_init(&umem_db->xarray); + umem_db->mdev = mdev; + umem_db->uctx_uid = uctx_uid; + + return umem_db; +} + +void mlx5ctl_umem_db_destroy(struct mlx5ctl_umem_db *umem_db) +{ + struct mlx5ctl_umem *umem; + unsigned long index; + + xa_for_each(&umem_db->xarray, index, umem) + mlx5ctl_umem_unreg(umem_db, umem->umem_id); + + xa_destroy(&umem_db->xarray); + kfree(umem_db); +} diff --git a/drivers/misc/mlx5ctl/umem.h b/drivers/misc/mlx5ctl/umem.h new file mode 100644 index 000000000000..880bf66e600d --- /dev/null +++ b/drivers/misc/mlx5ctl/umem.h @@ -0,0 +1,17 @@ +/* SPDX-License-Identifier: GPL-2.0 OR Linux-OpenIB */ +/* Copyright (c) 2023, NVIDIA CORPORATION & AFFILIATES. All rights reserved. */ + +#ifndef __MLX5CTL_UMEM_H__ +#define __MLX5CTL_UMEM_H__ + +#include +#include + +struct mlx5ctl_umem_db; + +struct mlx5ctl_umem_db *mlx5ctl_umem_db_create(struct mlx5_core_dev *mdev, u32 uctx_uid); +void mlx5ctl_umem_db_destroy(struct mlx5ctl_umem_db *umem_db); +int mlx5ctl_umem_reg(struct mlx5ctl_umem_db *umem_db, unsigned long addr, size_t size); +int mlx5ctl_umem_unreg(struct mlx5ctl_umem_db *umem_db, u32 umem_id); + +#endif /* __MLX5CTL_UMEM_H__ */ diff --git a/include/uapi/misc/mlx5ctl.h b/include/uapi/misc/mlx5ctl.h index 49c26ccc2d21..c0960ad1a8f0 100644 --- a/include/uapi/misc/mlx5ctl.h +++ b/include/uapi/misc/mlx5ctl.h @@ -24,6 +24,14 @@ struct mlx5ctl_cmdrpc { __aligned_u64 flags; }; +struct mlx5ctl_umem_reg { + __aligned_u64 addr; /* user address */ + __aligned_u64 len; /* user buffer length */ + __aligned_u64 flags; + __u32 umem_id; /* returned device's umem ID */ + __u32 reserved[7]; +}; + #define MLX5CTL_MAX_RPC_SIZE 8192 #define MLX5CTL_IOCTL_MAGIC 0x5c @@ -34,4 +42,10 @@ struct mlx5ctl_cmdrpc { #define MLX5CTL_IOCTL_CMDRPC \ _IOWR(MLX5CTL_IOCTL_MAGIC, 0x1, struct mlx5ctl_cmdrpc) +#define MLX5CTL_IOCTL_UMEM_REG \ + _IOWR(MLX5CTL_IOCTL_MAGIC, 0x2, struct mlx5ctl_umem_reg) + +#define MLX5CTL_IOCTL_UMEM_UNREG \ + _IOWR(MLX5CTL_IOCTL_MAGIC, 
0x3, unsigned long) + #endif /* __MLX5CTL_IOCTL_H__ */
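
For readers who want to exercise the interface proposed in this series, the
sketch below shows how user space might wrap the three new ioctls. It is an
illustration under stated assumptions, not code from the series: the structs
and ioctl numbers are copied locally from include/uapi/misc/mlx5ctl.h as
posted here (the header is not assumed to be installed), the sketch_*
helpers are hypothetical, and the caller is expected to have opened the
mlx5ctl device node and built the inbox according to the PRM layouts.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Local mirror of the uapi layout proposed in this series. */
typedef uint64_t sketch_aligned_u64 __attribute__((aligned(8)));

struct mlx5ctl_cmdrpc {
	sketch_aligned_u64 in;    /* RPC inbox buffer user address */
	sketch_aligned_u64 out;   /* RPC outbox buffer user address */
	uint32_t inlen;           /* inbox buffer length */
	uint32_t outlen;          /* outbox buffer length */
	sketch_aligned_u64 flags; /* must be 0, else the driver returns EOPNOTSUPP */
};

struct mlx5ctl_umem_reg {
	sketch_aligned_u64 addr;  /* user address */
	sketch_aligned_u64 len;   /* user buffer length */
	sketch_aligned_u64 flags;
	uint32_t umem_id;         /* returned device's umem ID */
	uint32_t reserved[7];
};

#define MLX5CTL_IOCTL_MAGIC 0x5c
#define MLX5CTL_IOCTL_CMDRPC \
	_IOWR(MLX5CTL_IOCTL_MAGIC, 0x1, struct mlx5ctl_cmdrpc)
#define MLX5CTL_IOCTL_UMEM_REG \
	_IOWR(MLX5CTL_IOCTL_MAGIC, 0x2, struct mlx5ctl_umem_reg)
#define MLX5CTL_IOCTL_UMEM_UNREG \
	_IOWR(MLX5CTL_IOCTL_MAGIC, 0x3, unsigned long)

/* Catch accidental layout drift in this local copy at compile time. */
_Static_assert(sizeof(struct mlx5ctl_cmdrpc) == 32, "cmdrpc layout");
_Static_assert(sizeof(struct mlx5ctl_umem_reg) == 56, "umem_reg layout");

/* Hypothetical helper: send one command rpc. Returns 0, or -1 with errno set. */
static int sketch_cmdrpc(int fd, const void *in, uint32_t inlen,
			 void *out, uint32_t outlen)
{
	struct mlx5ctl_cmdrpc rpc;

	memset(&rpc, 0, sizeof(rpc));
	rpc.in = (uint64_t)(uintptr_t)in;
	rpc.out = (uint64_t)(uintptr_t)out;
	rpc.inlen = inlen;
	rpc.outlen = outlen;
	return ioctl(fd, MLX5CTL_IOCTL_CMDRPC, &rpc);
}

/* Hypothetical helper: register buf and hand back the device umem ID. */
static int sketch_umem_reg(int fd, void *buf, size_t len, uint32_t *umem_id)
{
	struct mlx5ctl_umem_reg reg;

	memset(&reg, 0, sizeof(reg));
	reg.addr = (uint64_t)(uintptr_t)buf;
	reg.len = len;
	if (ioctl(fd, MLX5CTL_IOCTL_UMEM_REG, &reg) < 0)
		return -1;
	*umem_id = reg.umem_id;
	return 0;
}

/* Hypothetical helper: the umem ID is passed by value in the ioctl argument. */
static int sketch_umem_unreg(int fd, uint32_t umem_id)
{
	return ioctl(fd, MLX5CTL_IOCTL_UMEM_UNREG, (unsigned long)umem_id);
}
```

Note that, per the cmdrpc handler in patch 4/5, a return value of 0 from the
CMDRPC ioctl only means the outbox was copied back to the user. The firmware
status and syndrome still have to be read from the outbox header.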