From patchwork Sat Feb 18 02:18:49 2023
X-Patchwork-Submitter: Alison Schofield
X-Patchwork-Id: 58877
From: alison.schofield@intel.com
To: Dan Williams, Ira Weiny, Vishal Verma, Dave Jiang, Ben Widawsky,
 Steven Rostedt
Cc: Alison Schofield, linux-cxl@vger.kernel.org,
 linux-kernel@vger.kernel.org, Jonathan Cameron
Subject: [PATCH v7 1/6] cxl/mbox: Add GET_POISON_LIST mailbox command
Date: Fri, 17 Feb 2023 18:18:49 -0800
Message-Id: <69c1a7dcefcd559c61372e1b15fbeac6e86af24e.1676685180.git.alison.schofield@intel.com>
X-Mailer: git-send-email 2.37.3

From: Alison Schofield

CXL devices maintain a list of locations that are poisoned or result
in poison if the addresses are accessed by the host.

Per the spec (CXL 3.0 8.2.9.8.4.1), the device returns this Poison
list as a set of Media Error Records that include the source of the
error, the starting device physical address and length. The length is
the number of adjacent DPAs in the record and is in units of 64 bytes.

Retrieve the poison list.

Reviewed-by: Jonathan Cameron
Signed-off-by: Alison Schofield
---
 drivers/cxl/core/mbox.c | 72 +++++++++++++++++++++++++++++++++++++++++
 drivers/cxl/cxlmem.h    | 67 ++++++++++++++++++++++++++++++++++++++
 drivers/cxl/pci.c       |  4 +++
 3 files changed, 143 insertions(+)

diff --git a/drivers/cxl/core/mbox.c b/drivers/cxl/core/mbox.c
index fc7631bb1c24..faa61e5f541f 100644
--- a/drivers/cxl/core/mbox.c
+++ b/drivers/cxl/core/mbox.c
@@ -5,6 +5,8 @@
 #include
 #include
 #include
+#include
+#include
 #include
 #include
 
@@ -994,6 +996,7 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
 	/* See CXL 2.0 Table 175 Identify Memory Device Output Payload */
 	struct cxl_mbox_identify id;
 	struct cxl_mbox_cmd mbox_cmd;
+	u32 val;
 	int rc;
 
 	mbox_cmd = (struct cxl_mbox_cmd) {
@@ -1017,6 +1020,11 @@ int cxl_dev_state_identify(struct cxl_dev_state *cxlds)
 	cxlds->lsa_size = le32_to_cpu(id.lsa_size);
 	memcpy(cxlds->firmware_version, id.fw_revision, sizeof(id.fw_revision));
 
+	if (test_bit(CXL_MEM_COMMAND_ID_GET_POISON, cxlds->enabled_cmds)) {
+		val = get_unaligned_le24(id.poison_list_max_mer);
+		cxlds->poison.max_errors = min_t(u32, val, CXL_POISON_LIST_MAX);
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL_NS_GPL(cxl_dev_state_identify, CXL);
@@ -1107,6 +1115,70 @@ int cxl_set_timestamp(struct cxl_dev_state *cxlds)
 }
 EXPORT_SYMBOL_NS_GPL(cxl_set_timestamp, CXL);
 
+int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
+		       struct cxl_region *cxlr)
+{
+	struct cxl_dev_state *cxlds = cxlmd->cxlds;
+	struct cxl_mbox_poison_payload_out *po;
+	struct cxl_mbox_poison_payload_in pi;
+	struct cxl_mbox_cmd mbox_cmd;
+	int nr_records = 0;
+	int rc;
+
+	rc = mutex_lock_interruptible(&cxlds->poison.lock);
+	if (rc)
+		return rc;
+
+	po = cxlds->poison.payload_out;
+	pi.offset = cpu_to_le64(offset);
+	pi.length = cpu_to_le64(len / CXL_POISON_LEN_MULT);
+
+	mbox_cmd = (struct cxl_mbox_cmd) {
+		.opcode = CXL_MBOX_OP_GET_POISON,
+		.size_in = sizeof(pi),
+		.payload_in = &pi,
+		.size_out = cxlds->payload_size,
+		.payload_out = po,
+		.min_out = struct_size(po, record, 0),
+	};
+
+	do {
+		rc = cxl_internal_send_cmd(cxlds, &mbox_cmd);
+		if (rc)
+			break;
+
+		/* TODO TRACE the media error records */
+
+		/* Protect against an uncleared _FLAG_MORE */
+		nr_records = nr_records + le16_to_cpu(po->count);
+		if (nr_records >= cxlds->poison.max_errors) {
+			dev_dbg(&cxlmd->dev, "Max Error Records reached: %d\n",
+				nr_records);
+			break;
+		}
+	} while (po->flags & CXL_POISON_FLAG_MORE);
+
+	mutex_unlock(&cxlds->poison.lock);
+	return rc;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_mem_get_poison, CXL);
+
+int cxl_poison_state_init(struct cxl_dev_state *cxlds)
+{
+	if (!test_bit(CXL_MEM_COMMAND_ID_GET_POISON, cxlds->enabled_cmds))
+		return 0;
+
+	cxlds->poison.payload_out = devm_kzalloc(cxlds->dev,
+						 cxlds->payload_size,
+						 GFP_KERNEL);
+	if (!cxlds->poison.payload_out)
+		return -ENOMEM;
+
+	mutex_init(&cxlds->poison.lock);
+	return 0;
+}
+EXPORT_SYMBOL_NS_GPL(cxl_poison_state_init, CXL);
+
 struct cxl_dev_state *cxl_dev_state_create(struct device *dev)
 {
 	struct cxl_dev_state *cxlds;
diff --git a/drivers/cxl/cxlmem.h b/drivers/cxl/cxlmem.h
index c6c560c67a8a..c0feb43b585d 100644
--- a/drivers/cxl/cxlmem.h
+++ b/drivers/cxl/cxlmem.h
@@ -215,6 +215,24 @@ struct cxl_event_state {
 	struct mutex log_lock;
 };
 
+/**
+ * struct cxl_poison_state - Poison list retrieval
+ *
+ * @max_errors: Maximum media error records held in device cache
+ * @payload_out: The poison list payload returned by device
+ * @lock: Protect reads of the poison list
+ *
+ * Reads of the poison list are synchronized to ensure that a reader
+ * does not get an incomplete list because their request overlapped
+ * (was interrupted or preceded by) another read request of the same
+ * DPA range. CXL Spec 3.0 Section 8.2.9.8.4.1
+ */
+struct cxl_poison_state {
+	u32 max_errors;
+	struct cxl_mbox_poison_payload_out *payload_out;
+	struct mutex lock;  /* Protect reads of poison list */
+};
+
 /**
  * struct cxl_dev_state - The driver device state
  *
@@ -250,6 +268,7 @@ struct cxl_event_state {
  * @info: Cached DVSEC information about the device.
  * @serial: PCIe Device Serial Number
  * @doe_mbs: PCI DOE mailbox array
+ * @poison: poison list retrieval info
  * @mbox_send: @dev specific transport for transmitting mailbox commands
  *
  * See section 8.2.9.5.2 Capacity Configuration and Label Storage for
@@ -289,6 +308,7 @@ struct cxl_dev_state {
 	struct xarray doe_mbs;
 
 	struct cxl_event_state event;
+	struct cxl_poison_state poison;
 
 	int (*mbox_send)(struct cxl_dev_state *cxlds, struct cxl_mbox_cmd *cmd);
 };
@@ -537,6 +557,50 @@ struct cxl_mbox_set_timestamp_in {
 
 } __packed;
 
+/* Get Poison List  CXL 3.0 Spec 8.2.9.8.4.1 */
+struct cxl_mbox_poison_payload_in {
+	__le64 offset;
+	__le64 length;
+} __packed;
+
+struct cxl_mbox_poison_payload_out {
+	u8 flags;
+	u8 rsvd1;
+	__le64 overflow_t;
+	__le16 count;
+	u8 rsvd2[20];
+	struct cxl_poison_record {
+		__le64 address;
+		__le32 length;
+		__le32 rsvd;
+	} __packed record[];
+} __packed;
+
+/*
+ * Get Poison List address field encodes the starting
+ * address of poison, and the source of the poison.
+ */
+#define CXL_POISON_START_MASK	GENMASK_ULL(63, 6)
+#define CXL_POISON_SOURCE_MASK	GENMASK(2, 0)
+
+/* Get Poison List record length is in units of 64 bytes */
+#define CXL_POISON_LEN_MULT	64
+
+/* Kernel defined maximum for a list of poison errors */
+#define CXL_POISON_LIST_MAX	1024
+
+/* Get Poison List: Payload out flags */
+#define CXL_POISON_FLAG_MORE		BIT(0)
+#define CXL_POISON_FLAG_OVERFLOW	BIT(1)
+#define CXL_POISON_FLAG_SCANNING	BIT(2)
+
+/* Get Poison List: Poison Source */
+#define CXL_POISON_SOURCE_UNKNOWN	0
+#define CXL_POISON_SOURCE_EXTERNAL	1
+#define CXL_POISON_SOURCE_INTERNAL	2
+#define CXL_POISON_SOURCE_INJECTED	3
+#define CXL_POISON_SOURCE_VENDOR	7
+
 /**
  * struct cxl_mem_command - Driver representation of a memory device command
  * @info: Command information as it exists for the UAPI
@@ -607,6 +671,9 @@ void set_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds
 void clear_exclusive_cxl_commands(struct cxl_dev_state *cxlds, unsigned long *cmds);
 void cxl_mem_get_event_records(struct cxl_dev_state *cxlds, u32 status);
 int cxl_set_timestamp(struct cxl_dev_state *cxlds);
+int cxl_poison_state_init(struct cxl_dev_state *cxlds);
+int cxl_mem_get_poison(struct cxl_memdev *cxlmd, u64 offset, u64 len,
+		       struct cxl_region *cxlr);
 
 #ifdef CONFIG_CXL_SUSPEND
 void cxl_mem_active_inc(void);
diff --git a/drivers/cxl/pci.c b/drivers/cxl/pci.c
index 60b23624d167..827ea0895778 100644
--- a/drivers/cxl/pci.c
+++ b/drivers/cxl/pci.c
@@ -769,6 +769,10 @@ static int cxl_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (rc)
 		return rc;
 
+	rc = cxl_poison_state_init(cxlds);
+	if (rc)
+		return rc;
+
 	rc = cxl_dev_state_identify(cxlds);
 	if (rc)
 		return rc;
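
For illustration only, and not part of this patch: a minimal userspace sketch of how one Media Error Record might be decoded, given the layout the patch describes (start DPA in bits 63:6 of the address field, poison source in bits 2:0, length in units of 64 bytes). The names `poison_extent` and `decode_record` are hypothetical. The mask macros are plain C stand-ins for the kernel's GENMASK_ULL(63, 6) and GENMASK(2, 0), and the inputs are assumed to be already converted from little-endian to host byte order.

```c
#include <stdint.h>

/* Assumed stand-ins for the kernel masks defined in cxlmem.h */
#define POISON_START_MASK	(~0ULL << 6)	/* bits 63:6, start DPA */
#define POISON_SOURCE_MASK	0x7ULL		/* bits 2:0, poison source */
#define POISON_LEN_MULT		64ULL		/* record length unit in bytes */

/* Hypothetical decoded view of a single media error record */
struct poison_extent {
	uint64_t dpa;		/* starting device physical address */
	uint64_t bytes;		/* extent size in bytes */
	unsigned int source;	/* poison source code, e.g. 2 = internal */
};

/*
 * Split the record's address field into start DPA and source, and
 * scale the record length from 64-byte units to bytes.
 */
struct poison_extent decode_record(uint64_t address, uint32_t length)
{
	struct poison_extent e = {
		.dpa = address & POISON_START_MASK,
		.bytes = (uint64_t)length * POISON_LEN_MULT,
		.source = (unsigned int)(address & POISON_SOURCE_MASK),
	};

	return e;
}
```

Under these assumptions, decode_record(0x1042, 2) gives a start DPA of 0x1040, a source of 2 (CXL_POISON_SOURCE_INTERNAL) and a 128-byte extent.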