[4/4] ras/events: Trace CXL CPER events even without the CXL stack loaded
Commit Message
If CXL is solely managed by firmware (including HDM configuration and
event processing via firmware first) it is possible to run the system
without the CXL software loaded. In this case no CXL callback will be
loaded and CXL CPER errors will not be processed at all.
In this case memory device and region (HPA) information is missing but
omitting the error completely is not friendly for such a user. Some
device information is available in the generic event which could prove
useful to a user.
Utilize the local work item to trace a generic CXL CPER event.
Duplicate the pattern of decoding the CXL event header to aid in adding
future trace points if needed. This was an easy lift from the CXL trace
points. But stop at header decoding only because this is an unlikely
configuration for the system. Further decoding can be obtained with
user space tools or added later if needed.
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Ira Weiny <ira.weiny@intel.com>
---
drivers/acpi/apei/ghes.c | 5 ++-
include/ras/ras_event.h | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 94 insertions(+), 1 deletion(-)
Comments
Ira Weiny wrote:
> If CXL is solely managed by firmware (including HDM configuration and
> event processing via firmware first) it is possible to run the system
> without the CXL software loaded. In this case no CXL callback will be
> loaded and CXL CPER errors will not be processed at all.
>
> In this case memory device and region (HPA) information is missing but
> omitting the error completely is not friendly for such a user. Some
> device information is available in the generic event which could prove
> useful to a user.
>
> Utilize the local work item to trace a generic CXL CPER event.
>
> Duplicate the pattern of decoding the CXL event header to aid in adding
> future trace points if needed. This was an easy lift from the CXL trace
> points. But stop at header decoding only because this is an unlikely
> configuration for the system. Further decoding can be obtained with
> user space tools or added later if needed.
>
> Cc: Ard Biesheuvel <ardb@kernel.org>
> Cc: "Rafael J. Wysocki" <rafael@kernel.org>
> Cc: Tony Luck <tony.luck@intel.com>
> Cc: Borislav Petkov <bp@alien8.de>
> Suggested-by: Dan Williams <dan.j.williams@intel.com>
> Signed-off-by: Ira Weiny <ira.weiny@intel.com>
> ---
> drivers/acpi/apei/ghes.c | 5 ++-
> include/ras/ras_event.h | 90 ++++++++++++++++++++++++++++++++++++++++++++++++
> 2 files changed, 94 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> index f433f4eae888..9ac323cbf195 100644
> --- a/drivers/acpi/apei/ghes.c
> +++ b/drivers/acpi/apei/ghes.c
> @@ -729,7 +729,10 @@ static void cxl_cper_local_fn(struct work_struct *work)
>
> while (kfifo_out_spinlocked(&cxl_cper_fifo, &wd, 1,
> &cxl_cper_read_lock)) {
> - /* drop msg */
> + struct cxl_cper_event_rec *rec = &wd.rec;
> + union cxl_event *evt = &rec->event;
> +
> + trace_cper_cxl_gen_event(rec, &evt->generic);
So it was confusing to read the empty stub function 2 patches back when this
change was coming, and basic reporting of CXL event does not need the
workqueue indirection. Note that EDAC triggers trace events directly in
the atomic notifier chain, so CXL could do the same.
> static DECLARE_WORK(cxl_local_work, cxl_cper_local_fn);
> diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
> index cbd3ddd7c33d..319faf552b65 100644
> --- a/include/ras/ras_event.h
> +++ b/include/ras/ras_event.h
This is more heavywieght than I was expecting and defeats the purpose of
centralizing advanced decode in the CXL driver itself.
I would expect this to be just the tracing equivalent of the
ignore_section logic in cper_estatus_print_section().
Dan Williams wrote:
> Ira Weiny wrote:
> >
> > diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
> > index f433f4eae888..9ac323cbf195 100644
> > --- a/drivers/acpi/apei/ghes.c
> > +++ b/drivers/acpi/apei/ghes.c
> > @@ -729,7 +729,10 @@ static void cxl_cper_local_fn(struct work_struct *work)
> >
> > while (kfifo_out_spinlocked(&cxl_cper_fifo, &wd, 1,
> > &cxl_cper_read_lock)) {
> > - /* drop msg */
> > + struct cxl_cper_event_rec *rec = &wd.rec;
> > + union cxl_event *evt = &rec->event;
> > +
> > + trace_cper_cxl_gen_event(rec, &evt->generic);
>
> So it was confusing to read the empty stub function 2 patches back when this
> change was coming, and basic reporting of CXL event does not need the
> workqueue indirection. Note that EDAC triggers trace events directly in
> the atomic notifier chain, so CXL could do the same.
>
> > static DECLARE_WORK(cxl_local_work, cxl_cper_local_fn);
> > diff --git a/include/ras/ras_event.h b/include/ras/ras_event.h
> > index cbd3ddd7c33d..319faf552b65 100644
> > --- a/include/ras/ras_event.h
> > +++ b/include/ras/ras_event.h
>
> This is more heavywieght than I was expecting and defeats the purpose of
> centralizing advanced decode in the CXL driver itself.
Fair enough but it is not that heavy IMO. It was mostly lifted from the
CXL traces.
>
> I would expect this to be just the tracing equivalent of the
> ignore_section logic in cper_estatus_print_section().
I think it is fair to give the user some information. I'll redo this
patch to be first and drop the extra work item.
Ira
@@ -729,7 +729,10 @@ static void cxl_cper_local_fn(struct work_struct *work)
while (kfifo_out_spinlocked(&cxl_cper_fifo, &wd, 1,
&cxl_cper_read_lock)) {
- /* drop msg */
+ struct cxl_cper_event_rec *rec = &wd.rec;
+ union cxl_event *evt = &rec->event;
+
+ trace_cper_cxl_gen_event(rec, &evt->generic);
}
}
static DECLARE_WORK(cxl_local_work, cxl_cper_local_fn);
@@ -422,6 +422,96 @@ TRACE_EVENT(memory_failure_event,
)
);
#endif /* CONFIG_MEMORY_FAILURE */
+
+#include <linux/cxl-event.h>
+#include <asm-generic/unaligned.h>
+
+/*
+ * Common Event Record Format
+ * CXL 3.0 section 8.2.9.2.1; Table 8-42
+ */
+#define CXL_EVENT_RECORD_FLAG_PERMANENT BIT(2)
+#define CXL_EVENT_RECORD_FLAG_MAINT_NEEDED BIT(3)
+#define CXL_EVENT_RECORD_FLAG_PERF_DEGRADED BIT(4)
+#define CXL_EVENT_RECORD_FLAG_HW_REPLACE BIT(5)
+#define show_hdr_flags(flags) __print_flags(flags, " | ", \
+ { CXL_EVENT_RECORD_FLAG_PERMANENT, "PERMANENT_CONDITION" }, \
+ { CXL_EVENT_RECORD_FLAG_MAINT_NEEDED, "MAINTENANCE_NEEDED" }, \
+ { CXL_EVENT_RECORD_FLAG_PERF_DEGRADED, "PERFORMANCE_DEGRADED" }, \
+ { CXL_EVENT_RECORD_FLAG_HW_REPLACE, "HARDWARE_REPLACEMENT_NEEDED" } \
+)
+
+/*
+ * Define macros for the common header of each CPER CXL event.
+ *
+ * Tracepoints using these macros must do 3 things:
+ *
+ * 1) Add CPER_CXL_EVT_TP_entry to TP_STRUCT__entry
+ * 2) Use CPER_CXL_EVT_TP_fast_assign within TP_fast_assign;
+ * pass the serial number and CXL event header
+ * 3) Use CPER_CXL_EVT_TP_printk() instead of TP_printk()
+ *
+ * See the generic_event tracepoint as an example.
+ */
+#define CPER_CXL_EVT_TP_entry \
+ __field(u16, segment) \
+ __field(u8, bus) \
+ __field(u8, device) \
+ __field(u8, func) \
+ __field(u64, serial) \
+ __field(u32, hdr_flags) \
+ __field(u16, hdr_handle) \
+ __field(u16, hdr_related_handle) \
+ __field(u64, hdr_timestamp) \
+ __field(u8, hdr_length) \
+ __field(u8, hdr_maint_op_class)
+
+#define CPER_CXL_EVT_TP_fast_assign(cper_rec, evt_hdr) \
+ __entry->segment = cper_rec->hdr.device_id.segment_num; \
+ __entry->bus = cper_rec->hdr.device_id.bus_num; \
+ __entry->device = cper_rec->hdr.device_id.device_num; \
+ __entry->func = cper_rec->hdr.device_id.func_num; \
+ __entry->serial = (((u64)cper_rec->hdr.dev_serial_num.upper_dw) << 32) |\
+ cper_rec->hdr.dev_serial_num.lower_dw; \
+ __entry->hdr_length = (evt_hdr).length; \
+ __entry->hdr_flags = get_unaligned_le24((evt_hdr).flags); \
+ __entry->hdr_handle = le16_to_cpu((evt_hdr).handle); \
+ __entry->hdr_related_handle = le16_to_cpu((evt_hdr).related_handle); \
+ __entry->hdr_timestamp = le64_to_cpu((evt_hdr).timestamp); \
+ __entry->hdr_maint_op_class = (evt_hdr).maint_op_class
+
+#define CPER_CXL_EVT_TP_printk(fmt, ...) \
+ TP_printk("device=%04x:%02x:%02x.%02x serial=%lld : time=%llu " \
+ "len=%d flags='%s' handle=%x related_handle=%x " \
+ "maint_op_class=%u : " fmt, \
+ __entry->segment, __entry->bus, __entry->device, __entry->func, \
+ __entry->serial, \
+ __entry->hdr_timestamp, __entry->hdr_length, \
+ show_hdr_flags(__entry->hdr_flags), __entry->hdr_handle, \
+ __entry->hdr_related_handle, __entry->hdr_maint_op_class, \
+ ##__VA_ARGS__)
+
+TRACE_EVENT(cper_cxl_gen_event,
+
+ TP_PROTO(struct cxl_cper_event_rec *cper_rec,
+ struct cxl_event_generic *gen_rec),
+
+ TP_ARGS(cper_rec, gen_rec),
+
+ TP_STRUCT__entry(
+ CPER_CXL_EVT_TP_entry
+ __array(u8, data, CXL_EVENT_RECORD_DATA_LENGTH)
+ ),
+
+ TP_fast_assign(
+ CPER_CXL_EVT_TP_fast_assign(cper_rec, gen_rec->hdr);
+ memcpy(__entry->data, gen_rec->data, CXL_EVENT_RECORD_DATA_LENGTH);
+ ),
+
+ CPER_CXL_EVT_TP_printk("%s",
+ __print_hex(__entry->data, CXL_EVENT_RECORD_DATA_LENGTH))
+);
+
#endif /* _TRACE_HW_EVENT_MC_H */
/* This part must be outside protection */