[v3,5/9] cxl/pci: Only register RCDs with device 0, function 0 as CXL memory device
Message ID | 20221109104059.766720-6-rrichter@amd.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp261479wru; Wed, 9 Nov 2022 02:42:51 -0800 (PST) X-Google-Smtp-Source: AMsMyM7of6aBpY2XzaVI/d+VRCq4BUA/7NF784NhVe9nG07fp/YumYxJ9VzhrEWuIdZdPuy1vyO4 X-Received: by 2002:a05:6a00:1781:b0:561:7f7f:dc38 with SMTP id s1-20020a056a00178100b005617f7fdc38mr61225219pfg.42.1667990571122; Wed, 09 Nov 2022 02:42:51 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1667990571; cv=pass; d=google.com; s=arc-20160816; b=IE477PX12t9MtcQcxPa7q+7wP1BxbXiAONpJlWqEgYslzDgpDSlEXl9kIvwpKgpzRD HmBf8r5b7/FR/01fXjDL0WVoJDQJ7/t9TT8aM7Z7oelNAL1No4fe75uod53EupdSSzAV 8oZ7Xu8V1j1IId0vomdcyN2Yp1p8Sj7q0r8Ra+NtuQBNa862Y5tarwr7ofV2TokITK+q TB/RhsU3aDIMPuzdaRT4m81f6rRCkix40s3wy5bR2BP6xbUjWcVqJjIPl0rz3YK0dp55 2le8VL42tVn4Z6h2bTKU8zGzwwfwZNIh+hH5+PDlZ8vI9+mhS7cvziz39M6b9yeGPvii YmFA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=nS3RJuVhOl31ZYeV432y068pJtOlAuTnnT5e2/RDVFE=; b=fmF37LlfIAnBmCw7HpQ5KtsmPGxqZA4+7KAsqzvL6Dnn+7cwR/HtQt+6SLS0VsKmY+ jf7kycMAKTebIEOj9Tw5Q6Us/r1uSfVJzF3zme9k90RBZkVzAP4sjBe6RmJPFlHkfN4Y UIAobqgQOfmRWT93v/ERoO0hevZQ8QL2U8Mw11Z44P0TL4S0DtHJNM+3Pz73DCVfGts8 vBR6Fwc2wxil/gHIMJyASFrgoilw7RACNmFW5b5vNOIQNaxTUQnUPh0gqG9ZvvAhW48N aDroaMfBgaWCzfWF5zfVSpXA31WDy80YLgzY8Iwg1VUDqOC4T45Vn/LWem1J8RNstero qztw== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@amd.com header.s=selector1 header.b=KUpyQ9Fu; arc=pass (i=1 spf=pass spfdomain=amd.com dmarc=pass fromdomain=amd.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amd.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z5-20020a655a45000000b0044d6999624asi19061565pgs.339.2022.11.09.02.42.37; Wed, 09 Nov 2022 02:42:51 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@amd.com header.s=selector1 header.b=KUpyQ9Fu; arc=pass (i=1 spf=pass spfdomain=amd.com dmarc=pass fromdomain=amd.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amd.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231148AbiKIKmC (ORCPT <rfc822;dexuan.linux@gmail.com> + 99 others); Wed, 9 Nov 2022 05:42:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:55978 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231127AbiKIKlq (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 9 Nov 2022 05:41:46 -0500 Received: from NAM10-DM6-obe.outbound.protection.outlook.com (mail-dm6nam10on2045.outbound.protection.outlook.com [40.107.93.45]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 588D02655B; Wed, 9 Nov 2022 02:41:43 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=YlQ711kwq5wBQwi7ZyQDAjnZC/bZmCFyCpGAPP6VKtdBiGlWOoM7CwUIp1mJ4nJjsMXgEYoptTFJi5jt9MEX6KgXLXdOnSH09SR9YwTcC/+TL1oGynW4USk97XrVvB1pPwLIVaSbhJOaDsbS+8kJUyN/4pI99QCApbZ4mbiL7Yro8Ud9/2MyHzAGrAKDumbcDjpiMNP5upfCU4fymbcuEfX31Zk2wI+V+EieS+ObIVQI2ZCNgnQZbM9OkXqzRmiS0zo/U7djgLn1+LlU4iXdhGWa436DsVniJDeTNOh2VP7GJ8Ka3DmutqVEbRVnF8OVgQqIy8UsxPWUBPHfxuZlQw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=nS3RJuVhOl31ZYeV432y068pJtOlAuTnnT5e2/RDVFE=; b=gMssLs4XiWU0SrJNyWiFVtutHi5c8Lb/uaejiQpQ8fzs4diiOh1WvOoi0CFJlxX9CdfEjZ8XF3lTCXhU1qD/0owhYQyguOwuaRHCsw0XXhy36nChb/VA8p37qRME95a8NIMRJo+hgKNC4mjJfIuubS3IUNsyVOCUyO0pBZm3vUv+LKj0Uy30/aNrVFR6WcLnT/c7hHRy8G+QL29f9PhMYablbOswz1c0y6QtooKbMgrwc6ggM7IqVaObgm014eLiodmv2z6Cgi6SqwgEur7X3UONWMPus5TO7MstBjIx2mjPHDIb/glYbuIz0+9MvmttPxyx3Bp0ncNIEfVBvJSlCQ== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=intel.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=nS3RJuVhOl31ZYeV432y068pJtOlAuTnnT5e2/RDVFE=; b=KUpyQ9FuArG6hySGCLlGJnOeM57od8hunbRTaA2ynHtQRkAlFqLzAsaGBbny+AvF+ztUt8bLHW95VVe/kHvriCbtRcBjLja7Bizj0MU8HsUKGldhYOG+zhZ8TspbMFQexpr63VsqQSdUW/LU9TOzB0gNQ9IymrNcYrZFhVtBeIQ= Received: from BN0PR04CA0062.namprd04.prod.outlook.com (2603:10b6:408:ea::7) by MW4PR12MB7239.namprd12.prod.outlook.com (2603:10b6:303:228::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5791.27; Wed, 9 Nov 2022 10:41:41 +0000 Received: from BN8NAM11FT031.eop-nam11.prod.protection.outlook.com (2603:10b6:408:ea:cafe::22) by BN0PR04CA0062.outlook.office365.com (2603:10b6:408:ea::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.5791.27 via Frontend Transport; Wed, 9 Nov 2022 10:41:41 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by BN8NAM11FT031.mail.protection.outlook.com (10.13.177.25) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.5813.12 via Frontend Transport; Wed, 9 Nov 2022 10:41:41 +0000 Received: from rric.localdomain (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.31; Wed, 9 Nov 2022 04:41:32 -0600 From: Robert Richter <rrichter@amd.com> To: Alison Schofield <alison.schofield@intel.com>, Vishal Verma <vishal.l.verma@intel.com>, Ira Weiny <ira.weiny@intel.com>, Ben Widawsky <bwidawsk@kernel.org>, Dan Williams <dan.j.williams@intel.com> CC: <linux-cxl@vger.kernel.org>, <linux-kernel@vger.kernel.org>, Bjorn Helgaas <bhelgaas@google.com>, "Rafael J. Wysocki" <rafael@kernel.org>, Len Brown <lenb@kernel.org>, Jonathan Cameron <Jonathan.Cameron@huawei.com>, "Davidlohr Bueso" <dave@stgolabs.net>, Dave Jiang <dave.jiang@intel.com>, Robert Richter <rrichter@amd.com> Subject: [PATCH v3 5/9] cxl/pci: Only register RCDs with device 0, function 0 as CXL memory device Date: Wed, 9 Nov 2022 11:40:55 +0100 Message-ID: <20221109104059.766720-6-rrichter@amd.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221109104059.766720-1-rrichter@amd.com> References: <20221109104059.766720-1-rrichter@amd.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Content-Type: text/plain X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: BN8NAM11FT031:EE_|MW4PR12MB7239:EE_ X-MS-Office365-Filtering-Correlation-Id: ddd8ab70-2fa1-40b1-f37c-08dac23efccc X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: y6/d7+PK4NJVWpH1W0AwtNNJLHO1LEtagbmHg2WGeJsfb0cmXvpPqSDWMUhkKt7yiszXCCNZAoTZrCVbaMR9N5COoMz2MkrkokyyHSEge4Ik0PpdKjDbiniGkrhLoCHAh5aBS4yqndaTRSzKIU8W32pv10mq/b/gvX8wmOv1Am6VpGLPkjtaPzw2OGDLyfeZPRABE2PxspIruvKHFT3O1vCSaMDc08GU0jB8sQfGKqHDd13NLJaWFSBJoAdra83UW6UU9SdUUSCVD3muDAL2bdDlDZ6wQhvlEirt1WhsSpEkZpi/atFyaOY91liHWid158PmC9QaGzsoOma1Z/fYVzltPfNgYsW8vRBPAbwt7aoKAZ5g9uDk4XFDG/Q88yb8MSZgW3T+ZC0/Su9kezd0JoXKKtZhS1o6ZS9MxQDTw0Nw7Lrg8PHxWaoN3GXFfAXEsQiYE1m1Fdy/yQWvHRgWo/jLfaKckt7YwwojVmgLWMdi041I4J/dREQWKsLYOGrgDB2T2nko1CTMFQoP3dWOOR6t2HPx4MqbTv8veepGua0m0FPq40Qpb2q75V8btEOalq8eu1Nv+UnV4uanQyigI8nLyaL3bYjuP7DQmb/9W7WHgXW2nB6lgvZfZPKnalt0uNslvzbTVur3f16MhIEJkuu7wevzumX/HMzLFSBVpo0YmepMcZhS0NUwqOCT32xakXnocvE2QuBdhUZuQyzxdJS5ior21GMd6iajssBb8wNKvCs0qgZa6cGnljkcPpsevTHpjLj7pBFANsT+JD72I3xwnMyVmTH/m/23oTanktUf0sk9fxaVyQjpPrVA0c/+ X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230022)(4636009)(346002)(39860400002)(396003)(376002)(136003)(451199015)(40470700004)(46966006)(36840700001)(356005)(81166007)(36756003)(26005)(2616005)(40460700003)(82740400003)(82310400005)(7416002)(8936002)(83380400001)(2906002)(5660300002)(8676002)(70206006)(4326008)(41300700001)(54906003)(70586007)(40480700001)(316002)(186003)(1076003)(478600001)(16526019)(47076005)(426003)(336012)(6666004)(36860700001)(110136005)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 09 Nov 2022 10:41:41.0928 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: ddd8ab70-2fa1-40b1-f37c-08dac23efccc X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: BN8NAM11FT031.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW4PR12MB7239 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1749014881208379795?= X-GMAIL-MSGID: =?utf-8?q?1749014881208379795?= |
Series |
cxl: Add support for Restricted CXL hosts (RCD mode)
|
|
Commit Message
Robert Richter
Nov. 9, 2022, 10:40 a.m. UTC
The Device 0, Function 0 DVSEC controls the CXL functionality of the
entire device. Add a check to prevent registration of any other PCI
device on the bus as a CXL memory device.
Signed-off-by: Robert Richter <rrichter@amd.com>
---
drivers/cxl/pci.c | 25 +++++++++++++++++++++++--
1 file changed, 23 insertions(+), 2 deletions(-)
Comments
Robert Richter wrote: > The Device 0, Function 0 DVSEC controls the CXL functionality of the > entire device. Add a check to prevent registration of any other PCI > device on the bus as a CXL memory device. Can you reference the specification wording that indicates that the OS needs to actively avoid these situations, or otherwise point to the real world scenario where this filtering is needed? > > Signed-off-by: Robert Richter <rrichter@amd.com> > --- > drivers/cxl/pci.c | 25 +++++++++++++++++++++++-- > 1 file changed, 23 insertions(+), 2 deletions(-) > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > index faeb5d9d7a7a..cc4f206f24b3 100644 > --- a/drivers/cxl/pci.c > +++ b/drivers/cxl/pci.c > @@ -428,11 +428,26 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) > } > } > > +static int check_restricted_device(struct pci_dev *pdev, u16 pcie_dvsec) > +{ > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END) > + return 0; /* no RCD */ > + > + if (pdev->devfn == PCI_DEVFN(0, 0) && pcie_dvsec) > + return 0; /* ok */ > + > + dev_warn(&pdev->dev, "Skipping RCD: devfn=0x%02x dvsec=%u\n", s/0x%02x/%#02x/ > + pdev->devfn, pcie_dvsec); This looks like a dev_dbg() to me. Otherwise a warning will always fire on a benign condition. > + > + return -ENODEV; > +} > + > static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) > { > struct cxl_register_map map; > struct cxl_memdev *cxlmd; > struct cxl_dev_state *cxlds; > + u16 pcie_dvsec; > int rc; > > /* > @@ -442,6 +457,13 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) > BUILD_BUG_ON(offsetof(struct cxl_regs, memdev) != > offsetof(struct cxl_regs, device_regs.memdev)); > > + pcie_dvsec = pci_find_dvsec_capability( > + pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE); > + > + rc = check_restricted_device(pdev, pcie_dvsec); > + if (rc) > + return rc; > + > rc = pcim_enable_device(pdev); > if (rc) > return rc; > @@ -451,8 +473,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) > return PTR_ERR(cxlds); > > cxlds->serial = pci_get_dsn(pdev); > - cxlds->cxl_dvsec = pci_find_dvsec_capability( > - pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE); > + cxlds->cxl_dvsec = pcie_dvsec; > if (!cxlds->cxl_dvsec) > dev_warn(&pdev->dev, > "Device DVSEC not present, skip CXL.mem init\n"); > -- > 2.30.2 >
On 16.11.22 11:24:48, Dan Williams wrote: > Robert Richter wrote: > > The Device 0, Function 0 DVSEC controls the CXL functionality of the > > entire device. Add a check to prevent registration of any other PCI > > device on the bus as a CXL memory device. > > Can you reference the specification wording that indicates that the OS > needs to actively avoid these situations, or otherwise point to the real > world scenario where this filtering is needed? CXL 3.0 8.1.3 PCIe DVSEC for CXL Device """ An RCD creates a new PCIe enumeration hierarchy. As such, it spawns a new Root Bus and can expose one or more PCIe device numbers and function numbers at this bus number. These are exposed as Root Complex Integrated Endpoints (RCiEP). The PCIe Configuration Space of Device 0, Function 0 shall include the CXL PCIe DVSEC as shown in Figure 8-1. """ """ In either case, the capability, status, and control fields in Device 0, Function 0 DVSEC control the CXL functionality of the entire device. """ There are some other occurrences. I think this is even true for VH mode, as multiple CXL devices on the bus are exposed through multiple DSPs or Root Ports. Anyway, I limited this to an RCD only, esp. because its counterpart would be missing and thus port mapping would fail otherwise. See restricted_host_enumerate_dport() of this series. > > > > > Signed-off-by: Robert Richter <rrichter@amd.com> > > --- > > drivers/cxl/pci.c | 25 +++++++++++++++++++++++-- > > 1 file changed, 23 insertions(+), 2 deletions(-) > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > > index faeb5d9d7a7a..cc4f206f24b3 100644 > > --- a/drivers/cxl/pci.c > > +++ b/drivers/cxl/pci.c > > @@ -428,11 +428,26 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) > > } > > } > > > > +static int check_restricted_device(struct pci_dev *pdev, u16 pcie_dvsec) > > +{ > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END) > > + return 0; /* no RCD */ > > + > > + if (pdev->devfn == PCI_DEVFN(0, 0) && pcie_dvsec) > > + return 0; /* ok */ > > + > > + dev_warn(&pdev->dev, "Skipping RCD: devfn=0x%02x dvsec=%u\n", > > s/0x%02x/%#02x/ > > > + pdev->devfn, pcie_dvsec); Ok. > This looks like a dev_dbg() to me. Otherwise a warning will always fire > on a benign condition. I have chosen dev_warn() here as this is a non-compliant unexpected behavior of the device. There are no (legal) cases this may happen. I suppose you are worried about spamming the console here, but that error should be reported somewhere and thus being visible. Note: There can be multiple devices on the bus, but those shouldn't have a CXL mem dev class code and the devices shouldn't being probed by cxl_pci_probe() which contains the check and later creates a cxlmd dev. -Robert
Robert Richter wrote: > On 16.11.22 11:24:48, Dan Williams wrote: > > Robert Richter wrote: > > > The Device 0, Function 0 DVSEC controls the CXL functionality of the > > > entire device. Add a check to prevent registration of any other PCI > > > device on the bus as a CXL memory device. > > > > Can you reference the specification wording that indicates that the OS > > needs to actively avoid these situations, or otherwise point to the real > > world scenario where this filtering is needed? > > CXL 3.0 > > 8.1.3 PCIe DVSEC for CXL Device > > """ > An RCD creates a new PCIe enumeration hierarchy. As such, it spawns a new Root Bus > and can expose one or more PCIe device numbers and function numbers at this bus > number. These are exposed as Root Complex Integrated Endpoints (RCiEP). The PCIe > Configuration Space of Device 0, Function 0 shall include the CXL PCIe DVSEC as shown > in Figure 8-1. > """ > > """ > In either case, the capability, status, and control fields in Device 0, Function 0 DVSEC > control the CXL functionality of the entire device. > """ > > There are some other occurrences. I think this is even true for VH > mode, as multiple CXL devices on the bus are exposed through multiple > DSPs or Root Ports. > > Anyway, I limited this to an RCD only, esp. because its counterpart > would be missing and thus port mapping would fail otherwise. See > restricted_host_enumerate_dport() of this series. > > > > > > > > > Signed-off-by: Robert Richter <rrichter@amd.com> > > > --- > > > drivers/cxl/pci.c | 25 +++++++++++++++++++++++-- > > > 1 file changed, 23 insertions(+), 2 deletions(-) > > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > > > index faeb5d9d7a7a..cc4f206f24b3 100644 > > > --- a/drivers/cxl/pci.c > > > +++ b/drivers/cxl/pci.c > > > @@ -428,11 +428,26 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) > > > } > > > } > > > > > > +static int check_restricted_device(struct pci_dev *pdev, u16 pcie_dvsec) > > > +{ > > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END) > > > + return 0; /* no RCD */ > > > + > > > + if (pdev->devfn == PCI_DEVFN(0, 0) && pcie_dvsec) > > > + return 0; /* ok */ > > > + > > > + dev_warn(&pdev->dev, "Skipping RCD: devfn=0x%02x dvsec=%u\n", > > > > s/0x%02x/%#02x/ > > > > > + pdev->devfn, pcie_dvsec); > > Ok. > > > This looks like a dev_dbg() to me. Otherwise a warning will always fire > > on a benign condition. > > I have chosen dev_warn() here as this is a non-compliant unexpected > behavior of the device. There are no (legal) cases this may happen. I > suppose you are worried about spamming the console here, but that > error should be reported somewhere and thus being visible. There are so many spec illegal values and conditions that the driver could checki, but does not. The reason I am poking here is why does the driver need to be explicit about *this* illegal condition versus all the other potential conditions? What is the practical end user impact if Linux does not include this change? For example, if it is just one vendor that made this mistake that can be an explicit quirk. A dev_warn() is not necessary for simple quirks.
On 17.11.22 09:27:23, Dan Williams wrote: > Robert Richter wrote: > > On 16.11.22 11:24:48, Dan Williams wrote: > > > Robert Richter wrote: > > > > The Device 0, Function 0 DVSEC controls the CXL functionality of the > > > > entire device. Add a check to prevent registration of any other PCI > > > > device on the bus as a CXL memory device. > > > > > > Can you reference the specification wording that indicates that the OS > > > needs to actively avoid these situations, or otherwise point to the real > > > world scenario where this filtering is needed? > > > > CXL 3.0 > > > > 8.1.3 PCIe DVSEC for CXL Device > > > > """ > > An RCD creates a new PCIe enumeration hierarchy. As such, it spawns a new Root Bus > > and can expose one or more PCIe device numbers and function numbers at this bus > > number. These are exposed as Root Complex Integrated Endpoints (RCiEP). The PCIe > > Configuration Space of Device 0, Function 0 shall include the CXL PCIe DVSEC as shown > > in Figure 8-1. > > """ > > > > """ > > In either case, the capability, status, and control fields in Device 0, Function 0 DVSEC > > control the CXL functionality of the entire device. > > """ > > > > There are some other occurrences. I think this is even true for VH > > mode, as multiple CXL devices on the bus are exposed through multiple > > DSPs or Root Ports. > > > > Anyway, I limited this to an RCD only, esp. because its counterpart > > would be missing and thus port mapping would fail otherwise. See > > restricted_host_enumerate_dport() of this series. > > > > > > > > > > > > > Signed-off-by: Robert Richter <rrichter@amd.com> > > > > --- > > > > drivers/cxl/pci.c | 25 +++++++++++++++++++++++-- > > > > 1 file changed, 23 insertions(+), 2 deletions(-) > > > > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > > > > index faeb5d9d7a7a..cc4f206f24b3 100644 > > > > --- a/drivers/cxl/pci.c > > > > +++ b/drivers/cxl/pci.c > > > > @@ -428,11 +428,26 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) > > > > } > > > > } > > > > > > > > +static int check_restricted_device(struct pci_dev *pdev, u16 pcie_dvsec) > > > > +{ > > > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END) > > > > + return 0; /* no RCD */ > > > > + > > > > + if (pdev->devfn == PCI_DEVFN(0, 0) && pcie_dvsec) > > > > + return 0; /* ok */ > > > > + > > > > + dev_warn(&pdev->dev, "Skipping RCD: devfn=0x%02x dvsec=%u\n", > > > > > > s/0x%02x/%#02x/ > > > > > > > + pdev->devfn, pcie_dvsec); > > > > Ok. > > > > > This looks like a dev_dbg() to me. Otherwise a warning will always fire > > > on a benign condition. > > > > I have chosen dev_warn() here as this is a non-compliant unexpected > > behavior of the device. There are no (legal) cases this may happen. I > > suppose you are worried about spamming the console here, but that > > error should be reported somewhere and thus being visible. > > There are so many spec illegal values and conditions that the driver > could checki, but does not. The reason I am poking here is why does the > driver need to be explicit about *this* illegal condition versus all the > other potential conditions? What is the practical end user impact if > Linux does not include this change? For example, if it is just one > vendor that made this mistake that can be an explicit quirk. > > A dev_warn() is not necessary for simple quirks. This is not simply a cross check, the driver prevents enablement of CXL mem devs other than PCI_DEVFN(0, 0). It shouldn't silently drop out then. It's a device malfunction which should appropriate reported and not only visible if dbg is enabled. As written above, the check is necessary as the counterpart is missing otherwise and init would fail later with a more obfuscating error message. -Robert
Robert Richter wrote: > On 17.11.22 09:27:23, Dan Williams wrote: > > Robert Richter wrote: > > > On 16.11.22 11:24:48, Dan Williams wrote: > > > > Robert Richter wrote: > > > > > The Device 0, Function 0 DVSEC controls the CXL functionality of the > > > > > entire device. Add a check to prevent registration of any other PCI > > > > > device on the bus as a CXL memory device. > > > > > > > > Can you reference the specification wording that indicates that the OS > > > > needs to actively avoid these situations, or otherwise point to the real > > > > world scenario where this filtering is needed? > > > > > > CXL 3.0 > > > > > > 8.1.3 PCIe DVSEC for CXL Device > > > > > > """ > > > An RCD creates a new PCIe enumeration hierarchy. As such, it spawns a new Root Bus > > > and can expose one or more PCIe device numbers and function numbers at this bus > > > number. These are exposed as Root Complex Integrated Endpoints (RCiEP). The PCIe > > > Configuration Space of Device 0, Function 0 shall include the CXL PCIe DVSEC as shown > > > in Figure 8-1. > > > """ > > > > > > """ > > > In either case, the capability, status, and control fields in Device 0, Function 0 DVSEC > > > control the CXL functionality of the entire device. > > > """ > > > > > > There are some other occurrences. I think this is even true for VH > > > mode, as multiple CXL devices on the bus are exposed through multiple > > > DSPs or Root Ports. > > > > > > Anyway, I limited this to an RCD only, esp. because its counterpart > > > would be missing and thus port mapping would fail otherwise. See > > > restricted_host_enumerate_dport() of this series. > > > > > > > > > > > > > > > > > Signed-off-by: Robert Richter <rrichter@amd.com> > > > > > --- > > > > > drivers/cxl/pci.c | 25 +++++++++++++++++++++++-- > > > > > 1 file changed, 23 insertions(+), 2 deletions(-) > > > > > > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > > > > > index faeb5d9d7a7a..cc4f206f24b3 100644 > > > > > --- a/drivers/cxl/pci.c > > > > > +++ b/drivers/cxl/pci.c > > > > > @@ -428,11 +428,26 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) > > > > > } > > > > > } > > > > > > > > > > +static int check_restricted_device(struct pci_dev *pdev, u16 pcie_dvsec) > > > > > +{ > > > > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END) > > > > > + return 0; /* no RCD */ > > > > > + > > > > > + if (pdev->devfn == PCI_DEVFN(0, 0) && pcie_dvsec) > > > > > + return 0; /* ok */ > > > > > + > > > > > + dev_warn(&pdev->dev, "Skipping RCD: devfn=0x%02x dvsec=%u\n", > > > > > > > > s/0x%02x/%#02x/ > > > > > > > > > + pdev->devfn, pcie_dvsec); > > > > > > Ok. > > > > > > > This looks like a dev_dbg() to me. Otherwise a warning will always fire > > > > on a benign condition. > > > > > > I have chosen dev_warn() here as this is a non-compliant unexpected > > > behavior of the device. There are no (legal) cases this may happen. I > > > suppose you are worried about spamming the console here, but that > > > error should be reported somewhere and thus being visible. > > > > There are so many spec illegal values and conditions that the driver > > could checki, but does not. The reason I am poking here is why does the > > driver need to be explicit about *this* illegal condition versus all the > > other potential conditions? What is the practical end user impact if > > Linux does not include this change? For example, if it is just one > > vendor that made this mistake that can be an explicit quirk. > > > > A dev_warn() is not necessary for simple quirks. > > This is not simply a cross check, the driver prevents enablement of > CXL mem devs other than PCI_DEVFN(0, 0). It shouldn't silently drop > out then. It's a device malfunction which should appropriate reported > and not only visible if dbg is enabled. > > As written above, the check is necessary as the counterpart is missing It is only necessary if this condition happens in practice, not a theoretically. So I am asking, are you seeing this with an actual device that someone will use in production? If so, that's what pci quirks are for to keep those workarounds organized in a common location.
On 18.11.22 08:55:13, Dan Williams wrote: > Robert Richter wrote: > > On 17.11.22 09:27:23, Dan Williams wrote: > > > Robert Richter wrote: > > > > On 16.11.22 11:24:48, Dan Williams wrote: > > > > > Robert Richter wrote: > > > > > > The Device 0, Function 0 DVSEC controls the CXL functionality of the > > > > > > entire device. Add a check to prevent registration of any other PCI > > > > > > device on the bus as a CXL memory device. > > > > > > > > > > Can you reference the specification wording that indicates that the OS > > > > > needs to actively avoid these situations, or otherwise point to the real > > > > > world scenario where this filtering is needed? > > > > > > > > CXL 3.0 > > > > > > > > 8.1.3 PCIe DVSEC for CXL Device > > > > > > > > """ > > > > An RCD creates a new PCIe enumeration hierarchy. As such, it spawns a new Root Bus > > > > and can expose one or more PCIe device numbers and function numbers at this bus > > > > number. These are exposed as Root Complex Integrated Endpoints (RCiEP). The PCIe > > > > Configuration Space of Device 0, Function 0 shall include the CXL PCIe DVSEC as shown > > > > in Figure 8-1. > > > > """ > > > > > > > > """ > > > > In either case, the capability, status, and control fields in Device 0, Function 0 DVSEC > > > > control the CXL functionality of the entire device. > > > > """ > > > > > > > > There are some other occurrences. I think this is even true for VH > > > > mode, as multiple CXL devices on the bus are exposed through multiple > > > > DSPs or Root Ports. > > > > > > > > Anyway, I limited this to an RCD only, esp. because its counterpart > > > > would be missing and thus port mapping would fail otherwise. See > > > > restricted_host_enumerate_dport() of this series. > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Robert Richter <rrichter@amd.com> > > > > > > --- > > > > > > drivers/cxl/pci.c | 25 +++++++++++++++++++++++-- > > > > > > 1 file changed, 23 insertions(+), 2 deletions(-) > > > > > > > > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > > > > > > index faeb5d9d7a7a..cc4f206f24b3 100644 > > > > > > --- a/drivers/cxl/pci.c > > > > > > +++ b/drivers/cxl/pci.c > > > > > > @@ -428,11 +428,26 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) > > > > > > } > > > > > > } > > > > > > > > > > > > +static int check_restricted_device(struct pci_dev *pdev, u16 pcie_dvsec) > > > > > > +{ > > > > > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END) > > > > > > + return 0; /* no RCD */ > > > > > > + > > > > > > + if (pdev->devfn == PCI_DEVFN(0, 0) && pcie_dvsec) > > > > > > + return 0; /* ok */ > > > > > > + > > > > > > + dev_warn(&pdev->dev, "Skipping RCD: devfn=0x%02x dvsec=%u\n", > > > > > > > > > > s/0x%02x/%#02x/ > > > > > > > > > > > + pdev->devfn, pcie_dvsec); > > > > > > > > Ok. > > > > > > > > > This looks like a dev_dbg() to me. Otherwise a warning will always fire > > > > > on a benign condition. > > > > > > > > I have chosen dev_warn() here as this is a non-compliant unexpected > > > > behavior of the device. There are no (legal) cases this may happen. I > > > > suppose you are worried about spamming the console here, but that > > > > error should be reported somewhere and thus being visible. > > > > > > There are so many spec illegal values and conditions that the driver > > > could checki, but does not. The reason I am poking here is why does the > > > driver need to be explicit about *this* illegal condition versus all the > > > other potential conditions? What is the practical end user impact if > > > Linux does not include this change? For example, if it is just one > > > vendor that made this mistake that can be an explicit quirk. > > > > > > A dev_warn() is not necessary for simple quirks. > > > > This is not simply a cross check, the driver prevents enablement of > > CXL mem devs other than PCI_DEVFN(0, 0). It shouldn't silently drop > > out then. It's a device malfunction which should appropriate reported > > and not only visible if dbg is enabled. > > > > As written above, the check is necessary as the counterpart is missing > > It is only necessary if this condition happens in practice, not a > theoretically. So I am asking, are you seeing this with an actual device > that someone will use in production? If so, that's what pci quirks are > for to keep those workarounds organized in a common location. I can make it a dev_dbg() message. But I do not understand the ratio behind this. This is not a quirk nor a workaround or a fix for something. The likely paths are the conditions checked that return 0. Only if the unlikely case happens where a CXL mem dev is not a dev 0, func 0, a warning is shown to inform the user that this dev is not enabled. So yes, this might be theoretical similar to that a driver cannot allocate memory. But why not print this as a warning message then? Anyway, let's make it a dev_dbg(). Thanks, -Robert
Robert Richter wrote: > On 18.11.22 08:55:13, Dan Williams wrote: > > Robert Richter wrote: > > > On 17.11.22 09:27:23, Dan Williams wrote: > > > > Robert Richter wrote: > > > > > On 16.11.22 11:24:48, Dan Williams wrote: > > > > > > Robert Richter wrote: > > > > > > > The Device 0, Function 0 DVSEC controls the CXL functionality of the > > > > > > > entire device. Add a check to prevent registration of any other PCI > > > > > > > device on the bus as a CXL memory device. > > > > > > > > > > > > Can you reference the specification wording that indicates that the OS > > > > > > needs to actively avoid these situations, or otherwise point to the real > > > > > > world scenario where this filtering is needed? > > > > > > > > > > CXL 3.0 > > > > > > > > > > 8.1.3 PCIe DVSEC for CXL Device > > > > > > > > > > """ > > > > > An RCD creates a new PCIe enumeration hierarchy. As such, it spawns a new Root Bus > > > > > and can expose one or more PCIe device numbers and function numbers at this bus > > > > > number. These are exposed as Root Complex Integrated Endpoints (RCiEP). The PCIe > > > > > Configuration Space of Device 0, Function 0 shall include the CXL PCIe DVSEC as shown > > > > > in Figure 8-1. > > > > > """ > > > > > > > > > > """ > > > > > In either case, the capability, status, and control fields in Device 0, Function 0 DVSEC > > > > > control the CXL functionality of the entire device. > > > > > """ > > > > > > > > > > There are some other occurrences. I think this is even true for VH > > > > > mode, as multiple CXL devices on the bus are exposed through multiple > > > > > DSPs or Root Ports. > > > > > > > > > > Anyway, I limited this to an RCD only, esp. because its counterpart > > > > > would be missing and thus port mapping would fail otherwise. See > > > > > restricted_host_enumerate_dport() of this series. > > > > > > > > > > > > > > > > > > > > > > > > > Signed-off-by: Robert Richter <rrichter@amd.com> > > > > > > > --- > > > > > > > drivers/cxl/pci.c | 25 +++++++++++++++++++++++-- > > > > > > > 1 file changed, 23 insertions(+), 2 deletions(-) > > > > > > > > > > > > > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > > > > > > > index faeb5d9d7a7a..cc4f206f24b3 100644 > > > > > > > --- a/drivers/cxl/pci.c > > > > > > > +++ b/drivers/cxl/pci.c > > > > > > > @@ -428,11 +428,26 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) > > > > > > > } > > > > > > > } > > > > > > > > > > > > > > +static int check_restricted_device(struct pci_dev *pdev, u16 pcie_dvsec) > > > > > > > +{ > > > > > > > + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END) > > > > > > > + return 0; /* no RCD */ > > > > > > > + > > > > > > > + if (pdev->devfn == PCI_DEVFN(0, 0) && pcie_dvsec) > > > > > > > + return 0; /* ok */ > > > > > > > + > > > > > > > + dev_warn(&pdev->dev, "Skipping RCD: devfn=0x%02x dvsec=%u\n", > > > > > > > > > > > > s/0x%02x/%#02x/ > > > > > > > > > > > > > + pdev->devfn, pcie_dvsec); > > > > > > > > > > Ok. > > > > > > > > > > > This looks like a dev_dbg() to me. Otherwise a warning will always fire > > > > > > on a benign condition. > > > > > > > > > > I have chosen dev_warn() here as this is a non-compliant unexpected > > > > > behavior of the device. There are no (legal) cases this may happen. I > > > > > suppose you are worried about spamming the console here, but that > > > > > error should be reported somewhere and thus being visible. > > > > > > > > There are so many spec illegal values and conditions that the driver > > > > could checki, but does not. The reason I am poking here is why does the > > > > driver need to be explicit about *this* illegal condition versus all the > > > > other potential conditions? What is the practical end user impact if > > > > Linux does not include this change? For example, if it is just one > > > > vendor that made this mistake that can be an explicit quirk. > > > > > > > > A dev_warn() is not necessary for simple quirks. > > > > > > This is not simply a cross check, the driver prevents enablement of > > > CXL mem devs other than PCI_DEVFN(0, 0). It shouldn't silently drop > > > out then. It's a device malfunction which should appropriate reported > > > and not only visible if dbg is enabled. > > > > > > As written above, the check is necessary as the counterpart is missing > > > > It is only necessary if this condition happens in practice, not a > > theoretically. So I am asking, are you seeing this with an actual device > > that someone will use in production? If so, that's what pci quirks are > > for to keep those workarounds organized in a common location. > > I can make it a dev_dbg() message. But I do not understand the ratio > behind this. This is not a quirk nor a workaround or a fix for > something. The likely paths are the conditions checked that return 0. > Only if the unlikely case happens where a CXL mem dev is not a dev 0, > func 0, a warning is shown to inform the user that this dev is not > enabled. So yes, this might be theoretical similar to that a driver > cannot allocate memory. But why not print this as a warning message > then? > > Anyway, let's make it a dev_dbg(). Sorry for the thrash, lets set aside the the dev_dbg() vs dev_warn() issue. It is minor compared to *why* this patch needs to be applied. I would expect all production devices to be spec compliant and not advertise the CXL memory device class code on anything but function0. So either, there is a real threat that someone will build such a mistake and Linux needs to take this action to protect itself, or no one will ever build such a device and this patch is not needed. Basically I read the changelog and it answered the "What?" question, but it did not answer the "Why?" question.
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index faeb5d9d7a7a..cc4f206f24b3 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -428,11 +428,26 @@ static void devm_cxl_pci_create_doe(struct cxl_dev_state *cxlds) } } +static int check_restricted_device(struct pci_dev *pdev, u16 pcie_dvsec) +{ + if (pci_pcie_type(pdev) != PCI_EXP_TYPE_RC_END) + return 0; /* no RCD */ + + if (pdev->devfn == PCI_DEVFN(0, 0) && pcie_dvsec) + return 0; /* ok */ + + dev_warn(&pdev->dev, "Skipping RCD: devfn=0x%02x dvsec=%u\n", + pdev->devfn, pcie_dvsec); + + return -ENODEV; +} + static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) { struct cxl_register_map map; struct cxl_memdev *cxlmd; struct cxl_dev_state *cxlds; + u16 pcie_dvsec; int rc; /* @@ -442,6 +457,13 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) BUILD_BUG_ON(offsetof(struct cxl_regs, memdev) != offsetof(struct cxl_regs, device_regs.memdev)); + pcie_dvsec = pci_find_dvsec_capability( + pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE); + + rc = check_restricted_device(pdev, pcie_dvsec); + if (rc) + return rc; + rc = pcim_enable_device(pdev); if (rc) return rc; @@ -451,8 +473,7 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) return PTR_ERR(cxlds); cxlds->serial = pci_get_dsn(pdev); - cxlds->cxl_dvsec = pci_find_dvsec_capability( - pdev, PCI_DVSEC_VENDOR_ID_CXL, CXL_DVSEC_PCIE_DEVICE); + cxlds->cxl_dvsec = pcie_dvsec; if (!cxlds->cxl_dvsec) dev_warn(&pdev->dev, "Device DVSEC not present, skip CXL.mem init\n");