[v2,1/3] cxl/pci: Fix appropriate checking for _OSC while handling CXL RAS registers
Message ID | 20230721214740.256602-2-Smita.KoralahalliChannabasappa@amd.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9010:0:b0:3e4:2afc:c1 with SMTP id l16csp489102vqg; Fri, 21 Jul 2023 15:04:37 -0700 (PDT) X-Google-Smtp-Source: APBJJlHygm0WhaMeNm0vu4m66OMkX+Ud1vwHPWYetNB127P3o50Tupih0GGm4JMRXXJeo3dmiHaW X-Received: by 2002:a17:903:2445:b0:1b8:a7ec:38c5 with SMTP id l5-20020a170903244500b001b8a7ec38c5mr3093005pls.57.1689977076720; Fri, 21 Jul 2023 15:04:36 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1689977076; cv=pass; d=google.com; s=arc-20160816; b=lMW+o5ccEcoRuPB82dpP7xRu9corYotLS5sPAmDQRhktKiaHT40I7Hf+icVyKJu342 ypiFlkmlGiqgSR8skN8u50XBlgexBWtOeqYf12jAX8xP2uo//YGUEnekiRZh9lGHa0Ph hIdleYOcMEMMjk9pvEuYnn6GR+QvulKjx4dTsdnMWM/6llD4yYNXFSPm9S15MTVxEnwz EsBYCZaQEItTRNYQ7NcX/zGHibM1hYeDyLNrOieCzM2WiihuOLd1gJ7SrqTJNNUEUoHG 4drGAt7MFyjG3ZmedU4OLe10V9RbIId8dm4b4eZhaTf7pjI7jR0CWuX8rAlZrCvgtaQg VDQA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:mime-version:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=ImFtej2hD+NQ+WEh9U58ju2LXmLakH5vglo5gPxX1+Y=; fh=XsftW3UEsKVDpg34R8O86oa1udFTwHG6Ldr3ac/pbNM=; b=YIxTV+9piiUJzM90EXGqzjBa4KaSMzFUyMpyEd5RqjSoCQ/4h0mjAXDiLPOPazZfv+ 3f9QTX6edjB4vlMLWRaUuk+5X56Av/nzcPn9p+xPkRA2lw0dyXTNoy7iEYB18IZasxn0 kiU069zaYMTeUmBEGJjuX6h5Cq3h7vvCrKp+lazLPdrLQ+0/7b4PTqZGYuXIQeIM09+x hPFIi3T3Cb0qQvLl6d4qNKdpEC+QGUSr3qTe8ERFASGt1GS83ylIdCtTILOzJ73+SIoA NwMzaOaEbWdqaMo63KKh1BA/g4p9p/gpDT5+BDQeAV0sqyAdLHCsNRbnpL7NktV2ual3 GMjA== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@amd.com header.s=selector1 header.b=JzOXFj8h; arc=pass (i=1 spf=pass spfdomain=amd.com dmarc=pass fromdomain=amd.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amd.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id d6-20020a170902cec600b001b3d6c68bd1si4066382plg.643.2023.07.21.15.04.23; Fri, 21 Jul 2023 15:04:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@amd.com header.s=selector1 header.b=JzOXFj8h; arc=pass (i=1 spf=pass spfdomain=amd.com dmarc=pass fromdomain=amd.com); spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=amd.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230210AbjGUVr7 (ORCPT <rfc822;assdfgzxcv4@gmail.com> + 99 others); Fri, 21 Jul 2023 17:47:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52546 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229554AbjGUVr4 (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 21 Jul 2023 17:47:56 -0400 Received: from NAM02-SN1-obe.outbound.protection.outlook.com (mail-sn1nam02on2064.outbound.protection.outlook.com [40.107.96.64]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1667E269F; Fri, 21 Jul 2023 14:47:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=nR6ZZU3X/pMQhFg4r4cjVSBDlsKlafZYHeEdS3yI+WC2GZnBlXdZFghhJ6+RCDN+MGE0yYb4lJIUOoWDf08uOhfbOsZ97dIdSE1F7xpNGB37ATf9WcO4Y90/kcjXwmvQzQnTcntn8A+VhBdCKvF9KTt+Az+uFGEzBTfO6W4USF0Y3Lo+44jLVl9mswhxVDurx5upwFOhyEQf+nNEBygOzRhWe37VQFa8IXJ2KGcr9PNIYpMiWCbgklE+lj5VPsI8+ZZAKOIs4cD41Bnw6QWkywyFWyB1AVxTqyW+Yz6fJd5YZg12apucYAUgDmY5zc2qAfkjq4wDCvNoJZ35WBEtoA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=ImFtej2hD+NQ+WEh9U58ju2LXmLakH5vglo5gPxX1+Y=; b=MXmCpu+OD4QqA1gVGIZpMkZVWcJotsMCqetz3JviXYeHESFhPqa6RoWmuUntql9tNk8TXWHcnfDd0eSlOlgsBLj5iRDHzbPiTurtQBCdaJ/JR2xNl2AANGQuFpJ5IOcs8RqQFO6gOHKIEB6QRVRTqxXt3qJ9JpdpTL4bZThHjN9hLSLvCk6f+T25M7a7pJ5ccKpOZ2H31Yrj8qNKem/wppl3qRLW1f/pCp5ULFjc6Kv1hYeFEpelvxPdxbxCnCYUhgjqHW/DfIjOjLTsf7dqs9DLMbSFABPGsbF9PtpWKaCsUQuvjy5xLUXQ9r1U4YYa6c2pU/XlCet7kEyuL4eTjw== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=vger.kernel.org smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amd.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=ImFtej2hD+NQ+WEh9U58ju2LXmLakH5vglo5gPxX1+Y=; b=JzOXFj8h8AdIaUeFpeLWo8c4wpF5nGzlfBxUVdsFdwxUWlEZNRvlEn+fdwNTxrlRDx/T4LCL0Ya1CENuUrCki++EsCIi8sgOhLrR3oDV4/wMHep/AqBjsIE7ea7v39QQbOPvxSI02bSf7f0p1hT2CKdYlUxr0ha9Vxy9IYCz4EE= Received: from DM6PR06CA0001.namprd06.prod.outlook.com (2603:10b6:5:120::14) by MW4PR12MB6778.namprd12.prod.outlook.com (2603:10b6:303:1e8::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6609.25; Fri, 21 Jul 2023 21:47:53 +0000 Received: from DM6NAM11FT003.eop-nam11.prod.protection.outlook.com (2603:10b6:5:120:cafe::c) by DM6PR06CA0001.outlook.office365.com (2603:10b6:5:120::14) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.6609.28 via Frontend Transport; Fri, 21 Jul 2023 21:47:53 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 165.204.84.17) smtp.mailfrom=amd.com; dkim=none (message not signed) header.d=none;dmarc=pass action=none header.from=amd.com; Received-SPF: Pass (protection.outlook.com: domain of amd.com designates 165.204.84.17 as permitted sender) receiver=protection.outlook.com; client-ip=165.204.84.17; helo=SATLEXMB04.amd.com; pr=C Received: from SATLEXMB04.amd.com (165.204.84.17) by DM6NAM11FT003.mail.protection.outlook.com (10.13.173.162) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.20.6609.28 via Frontend Transport; Fri, 21 Jul 2023 21:47:53 +0000 Received: from ethanolx50f7host.amd.com (10.180.168.240) by SATLEXMB04.amd.com (10.181.40.145) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.27; Fri, 21 Jul 2023 16:47:52 -0500 From: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> To: <linux-pci@vger.kernel.org>, <linux-kernel@vger.kernel.org>, <linux-cxl@vger.kernel.org> CC: Bjorn Helgaas <bhelgaas@google.com>, <oohall@gmail.com>, Lukas Wunner <lukas@wunner.de>, Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@linux.intel.com>, Mahesh J Salgaonkar <mahesh@linux.ibm.com>, Alison Schofield <alison.schofield@intel.com>, "Vishal Verma" <vishal.l.verma@intel.com>, Ira Weiny <ira.weiny@intel.com>, "Ben Widawsky" <bwidawsk@kernel.org>, Dan Williams <dan.j.williams@intel.com>, Jonathan Cameron <Jonathan.Cameron@Huawei.com>, Yazen Ghannam <yazen.ghannam@amd.com>, Terry Bowman <terry.bowman@amd.com>, Robert Richter <rrichter@amd.com>, Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Subject: [PATCH v2 1/3] cxl/pci: Fix appropriate checking for _OSC while handling CXL RAS registers Date: Fri, 21 Jul 2023 21:47:38 +0000 Message-ID: <20230721214740.256602-2-Smita.KoralahalliChannabasappa@amd.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20230721214740.256602-1-Smita.KoralahalliChannabasappa@amd.com> References: <20230721214740.256602-1-Smita.KoralahalliChannabasappa@amd.com> MIME-Version: 1.0 Content-Type: text/plain X-Originating-IP: [10.180.168.240] X-ClientProxiedBy: SATLEXMB04.amd.com (10.181.40.145) To SATLEXMB04.amd.com (10.181.40.145) X-EOPAttributedMessage: 0 X-MS-PublicTrafficType: Email X-MS-TrafficTypeDiagnostic: DM6NAM11FT003:EE_|MW4PR12MB6778:EE_ X-MS-Office365-Filtering-Correlation-Id: 40f10c06-8b07-4f54-f42a-08db8a34230f X-MS-Exchange-SenderADCheck: 1 X-MS-Exchange-AntiSpam-Relay: 0 X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: 4T3g/ZHtE9doqUvtrz2aSKiManJ6R/g4svxmjG+CflHDG5VZMNnkIcw1DpGlTX6XKhLCCpLjiYQQMY5eXKI0ae5hA++WmbCshu9CFv3XpR36u9fEq1FUDIqHkBWnjvfWdCTUB/jeriCA/3durlW0NC2I+TWOfcbAUZeeUtJyG9heENSJaiLVAKwJYYQyu9m6QB4URFL9bvalJgTwiwz68RXycosPMzSWtfkKx+T9cUVz3026NJYkhghra1LDw6eUD7CRCJcOp+mDCCLVOq3y0zKLm9zt5Hilr2N6Yxo3Qlyo4b8fMZRKHNG0DTB7A09Pn9awa0eomXItT+xVyzWxTWq7Np1tC1b4pNesVJSWGSDiGq8evk9i5uoyFbsc2UJ1+HYCSut4gj6TSnTIoAm0K5yUQASc05/m7ghy1GKeP8B7zR9Jd1jUAemYCpVa3/3QK9s7dUwpMTbtSDA+0BFllYOiCaTYEFf2feaOyvX4HKzGYRJtwWCZNOuKkMNk8cCSrr7ZyWUd9NKQAVFUIhMwYgWCDzrT4YVdtU9ekW41kD/nu9fRoK1ZWSfKGQCgj/qvORq6n4yHWovXjtqO8RM5+uEPsJs+0Vp3Gt+zp8zAgyNpFb7DjKzoPV/T6tuYePSP8bsfmfme0Hnn/+HCThL2Nr9IEvmBnx/qn0rdov7BjAMtoxkp3bnh17Rb8raF16lqCQTDKluJvSWe+cbf9PI8bM94hiX9UHMXOc5lCT3nWbwV8h9gf/g2joma74Fyw/9Ez89Lpdfypg8T1ofCtOoO3A== X-Forefront-Antispam-Report: CIP:165.204.84.17;CTRY:US;LANG:en;SCL:1;SRV:;IPV:CAL;SFV:NSPM;H:SATLEXMB04.amd.com;PTR:InfoDomainNonexistent;CAT:NONE;SFS:(13230028)(4636009)(396003)(376002)(346002)(39860400002)(136003)(82310400008)(451199021)(46966006)(36840700001)(40470700004)(2906002)(36860700001)(83380400001)(40460700003)(2616005)(426003)(47076005)(356005)(81166007)(82740400003)(36756003)(86362001)(40480700001)(1076003)(16526019)(41300700001)(70206006)(336012)(7696005)(70586007)(4326008)(316002)(26005)(6666004)(186003)(8936002)(8676002)(7416002)(478600001)(54906003)(5660300002)(110136005)(36900700001);DIR:OUT;SFP:1101; X-OriginatorOrg: amd.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 21 Jul 2023 21:47:53.2675 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: 40f10c06-8b07-4f54-f42a-08db8a34230f X-MS-Exchange-CrossTenant-Id: 3dd8961f-e488-4e60-8e11-a82d994e183d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=3dd8961f-e488-4e60-8e11-a82d994e183d;Ip=[165.204.84.17];Helo=[SATLEXMB04.amd.com] X-MS-Exchange-CrossTenant-AuthSource: DM6NAM11FT003.eop-nam11.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: MW4PR12MB6778 X-Spam-Status: No, score=-1.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FORGED_SPF_HELO, RCVD_IN_DNSWL_BLOCKED,RCVD_IN_MSPIKE_H2,SPF_HELO_PASS,SPF_NONE, T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1772069403214388671 X-GMAIL-MSGID: 1772069403214388671 |
Series |
PCI/AER, CXL: Fix appropriate _OSC check for CXL RAS Cap
|
|
Commit Message
Smita Koralahalli
July 21, 2023, 9:47 p.m. UTC
According to Section 9.17.2, Table 9-26 of CXL Specification [1], owner
of AER should also own CXL Protocol Error Management as there is no
explicit control of CXL Protocol error. And the CXL RAS Cap registers
reported on Protocol errors should check for AER _OSC rather than CXL
Memory Error Reporting Control _OSC.
The CXL Memory Error Reporting Control _OSC specifically highlights
handling Memory Error Logging and Signaling Enhancements. These kinds of
errors are reported through a device's mailbox and can be managed
independently from CXL Protocol Errors.
This change fixes handling and reporting CXL Protocol Errors and RAS
registers natively with native AER and FW-First CXL Memory Error Reporting
Control.
[1] Compute Express Link (CXL) Specification, Revision 3.1, Aug 1 2022.
Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL")
Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com>
---
v2:
Added fixes tag.
Included what the patch fixes in commit message.
---
drivers/cxl/pci.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
Comments
On 7/21/23 2:47 PM, Smita Koralahalli wrote: > According to Section 9.17.2, Table 9-26 of CXL Specification [1], owner > of AER should also own CXL Protocol Error Management as there is no > explicit control of CXL Protocol error. And the CXL RAS Cap registers > reported on Protocol errors should check for AER _OSC rather than CXL > Memory Error Reporting Control _OSC. > > The CXL Memory Error Reporting Control _OSC specifically highlights > handling Memory Error Logging and Signaling Enhancements. These kinds of > errors are reported through a device's mailbox and can be managed > independently from CXL Protocol Errors. > > This change fixes handling and reporting CXL Protocol Errors and RAS > registers natively with native AER and FW-First CXL Memory Error Reporting > Control. > > [1] Compute Express Link (CXL) Specification, Revision 3.1, Aug 1 2022. > > Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL") > Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> > --- > v2: > Added fixes tag. > Included what the patch fixes in commit message. > --- > drivers/cxl/pci.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > index 1cb1494c28fe..2323169b6e5f 100644 > --- a/drivers/cxl/pci.c > +++ b/drivers/cxl/pci.c > @@ -541,9 +541,9 @@ static int cxl_pci_ras_unmask(struct pci_dev *pdev) > return 0; > } > > - /* BIOS has CXL error control */ > - if (!host_bridge->native_cxl_error) > - return -ENXIO; > + /* BIOS has PCIe AER error control */ > + if (!host_bridge->native_aer) > + return 0; Why not directly use pcie_aer_is_native() here? > > rc = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap); > if (rc)
On 21.07.23 21:47:38, Smita Koralahalli wrote: > According to Section 9.17.2, Table 9-26 of CXL Specification [1], owner > of AER should also own CXL Protocol Error Management as there is no > explicit control of CXL Protocol error. And the CXL RAS Cap registers > reported on Protocol errors should check for AER _OSC rather than CXL > Memory Error Reporting Control _OSC. > > The CXL Memory Error Reporting Control _OSC specifically highlights > handling Memory Error Logging and Signaling Enhancements. These kinds of > errors are reported through a device's mailbox and can be managed > independently from CXL Protocol Errors. > > This change fixes handling and reporting CXL Protocol Errors and RAS > registers natively with native AER and FW-First CXL Memory Error Reporting > Control. > > [1] Compute Express Link (CXL) Specification, Revision 3.1, Aug 1 2022. > > Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL") > Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> Reviewed-by: Robert Richter <rrichter@amd.com> > --- > v2: > Added fixes tag. > Included what the patch fixes in commit message. > --- > drivers/cxl/pci.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > index 1cb1494c28fe..2323169b6e5f 100644 > --- a/drivers/cxl/pci.c > +++ b/drivers/cxl/pci.c > @@ -541,9 +541,9 @@ static int cxl_pci_ras_unmask(struct pci_dev *pdev) > return 0; > } > > - /* BIOS has CXL error control */ > - if (!host_bridge->native_cxl_error) > - return -ENXIO; > + /* BIOS has PCIe AER error control */ > + if (!host_bridge->native_aer) > + return 0; > > rc = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap); > if (rc) > -- > 2.17.1 >
>> - /* BIOS has CXL error control */ >> - if (!host_bridge->native_cxl_error) >> - return -ENXIO; >> + /* BIOS has PCIe AER error control */ >> + if (!host_bridge->native_aer) >> + return 0; > > Why not directly use pcie_aer_is_native() here? Yeah, this was in my v1. But changed as per Robert's comments, to be applicable for automated backports.. https://lore.kernel.org/all/ZLkxiZv3lWfazwVH@rric.localdomain/ Please advice. Thanks, Smita > >> >> rc = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap); >> if (rc) >
Smita Koralahalli wrote: > According to Section 9.17.2, Table 9-26 of CXL Specification [1], owner > of AER should also own CXL Protocol Error Management as there is no > explicit control of CXL Protocol error. And the CXL RAS Cap registers > reported on Protocol errors should check for AER _OSC rather than CXL > Memory Error Reporting Control _OSC. > > The CXL Memory Error Reporting Control _OSC specifically highlights > handling Memory Error Logging and Signaling Enhancements. These kinds of > errors are reported through a device's mailbox and can be managed > independently from CXL Protocol Errors. > > This change fixes handling and reporting CXL Protocol Errors and RAS > registers natively with native AER and FW-First CXL Memory Error Reporting > Control. I feel like this could be said more succinctly and with an indication of what the end user should expect to see. Something like: "cxl_pci fails to unmask CXL protocol errors when CXL memory error reporting is not granted native control. Given that CXL memory error reporting uses the event interface and protocol errors use AER, unmask protocol errors based only on the native AER setting. Without this change end user deployments will fail to report protocol errors in the case where native memory error handling is not granted to Linux." > > [1] Compute Express Link (CXL) Specification, Revision 3.1, Aug 1 2022. > > Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL") > Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> > --- > v2: > Added fixes tag. > Included what the patch fixes in commit message. > --- > drivers/cxl/pci.c | 6 +++--- > 1 file changed, 3 insertions(+), 3 deletions(-) > > diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c > index 1cb1494c28fe..2323169b6e5f 100644 > --- a/drivers/cxl/pci.c > +++ b/drivers/cxl/pci.c > @@ -541,9 +541,9 @@ static int cxl_pci_ras_unmask(struct pci_dev *pdev) > return 0; > } > > - /* BIOS has CXL error control */ > - if (!host_bridge->native_cxl_error) > - return -ENXIO; > + /* BIOS has PCIe AER error control */ > + if (!host_bridge->native_aer) > + return 0; The error code does not matter here and changing it makes the patch that bit much more noisier than it needs to be. So just leave it as: return -ENXIO; > > rc = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap); > if (rc) > -- > 2.17.1 >
Smita Koralahalli wrote: > > >> - /* BIOS has CXL error control */ > >> - if (!host_bridge->native_cxl_error) > >> - return -ENXIO; > >> + /* BIOS has PCIe AER error control */ > >> + if (!host_bridge->native_aer) > >> + return 0; > > > > Why not directly use pcie_aer_is_native() here? > Yeah, this was in my v1. But changed as per Robert's comments, to be > applicable for automated backports.. > > https://lore.kernel.org/all/ZLkxiZv3lWfazwVH@rric.localdomain/ > > Please advice. Keep it the way you have it. Minimizing the backport is the right call.
On 8/7/2023 8:17 PM, Dan Williams wrote: > Smita Koralahalli wrote: >> According to Section 9.17.2, Table 9-26 of CXL Specification [1], owner >> of AER should also own CXL Protocol Error Management as there is no >> explicit control of CXL Protocol error. And the CXL RAS Cap registers >> reported on Protocol errors should check for AER _OSC rather than CXL >> Memory Error Reporting Control _OSC. >> >> The CXL Memory Error Reporting Control _OSC specifically highlights >> handling Memory Error Logging and Signaling Enhancements. These kinds of >> errors are reported through a device's mailbox and can be managed >> independently from CXL Protocol Errors. >> >> This change fixes handling and reporting CXL Protocol Errors and RAS >> registers natively with native AER and FW-First CXL Memory Error Reporting >> Control. > > I feel like this could be said more succinctly and with an indication of > what the end user should expect to see. Something like: > > "cxl_pci fails to unmask CXL protocol errors when CXL memory error > reporting is not granted native control. Given that CXL memory error > reporting uses the event interface and protocol errors use AER, unmask > protocol errors based only on the native AER setting. Without this > change end user deployments will fail to report protocol errors in the > case where native memory error handling is not granted to Linux." Sure, will make the change for a more clearer description. Thanks! > >> >> [1] Compute Express Link (CXL) Specification, Revision 3.1, Aug 1 2022. >> >> Fixes: 248529edc86f ("cxl: add RAS status unmasking for CXL") >> Signed-off-by: Smita Koralahalli <Smita.KoralahalliChannabasappa@amd.com> >> --- >> v2: >> Added fixes tag. >> Included what the patch fixes in commit message. >> --- >> drivers/cxl/pci.c | 6 +++--- >> 1 file changed, 3 insertions(+), 3 deletions(-) >> >> diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c >> index 1cb1494c28fe..2323169b6e5f 100644 >> --- a/drivers/cxl/pci.c >> +++ b/drivers/cxl/pci.c >> @@ -541,9 +541,9 @@ static int cxl_pci_ras_unmask(struct pci_dev *pdev) >> return 0; >> } >> >> - /* BIOS has CXL error control */ >> - if (!host_bridge->native_cxl_error) >> - return -ENXIO; >> + /* BIOS has PCIe AER error control */ >> + if (!host_bridge->native_aer) >> + return 0; > > The error code does not matter here and changing it makes the patch that > bit much more noisier than it needs to be. So just leave it as: Doing this will return an error from cxl_pci probe thereby failing the device node creation in FW-First AER/DPC. I cannot think of other places where we reference the device node in FW-First mode but I have a place where this could potentially be a roadblock. I'm trying to add trace events support for FW-First Protocol Errors. https://lore.kernel.org/linux-cxl/D9381C12-A585-4089-873B-3707C17823D3@fb.com/T/#mcaf8a78c1295372ab811be7e1ccb6a8a4d99f3e9 And we already have an existing trace_cxl_aer_correctable_error() and similarly for uncorrectable error for native protocol error reporting. I was trying to reuse the same function for fw-first as well. This function references cxl memory device node which will be NULL in FW-First if this returns an error. I don't mind having a separate trace event function for FW-First mode as it would simplify things especially when dealing with RCH DP.. But there may be other potential places where we might reference this device node in FW-First. Please advice. Thanks, Smita > > return -ENXIO; > >> >> rc = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap); >> if (rc) >> -- >> 2.17.1 >> > > >
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c index 1cb1494c28fe..2323169b6e5f 100644 --- a/drivers/cxl/pci.c +++ b/drivers/cxl/pci.c @@ -541,9 +541,9 @@ static int cxl_pci_ras_unmask(struct pci_dev *pdev) return 0; } - /* BIOS has CXL error control */ - if (!host_bridge->native_cxl_error) - return -ENXIO; + /* BIOS has PCIe AER error control */ + if (!host_bridge->native_aer) + return 0; rc = pcie_capability_read_word(pdev, PCI_EXP_DEVCTL, &cap); if (rc)