[v1,3/3] PCI: qcom: Add support for detecting controller level PCIe errors

Message ID 20240221140405.28532-4-root@hu-msarkar-hyd.qualcomm.com
State New
Headers
Series Add support for detecting Controller Level PCIe Errors |

Commit Message

root Feb. 21, 2024, 2:04 p.m. UTC
  From: Nitesh Gupta <nitegupt@quicinc.com>

Synopsys Controllers provide capabilities to detect various controller
level errors. These can range from controller interface error to random
PCIe configuration errors. This patch intends to add support to detect
these errors and report it to userspace entity via sysfs, which can take
appropriate actions to mitigate the errors.

Signed-off-by: Nitesh Gupta <nitegupt@quicinc.com>
Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com>
---
 drivers/pci/controller/dwc/pcie-designware.h |  26 ++
 drivers/pci/controller/dwc/pcie-qcom.c       | 350 +++++++++++++++++++
 2 files changed, 376 insertions(+)
  

Comments

Krzysztof Kozlowski Feb. 21, 2024, 2:35 p.m. UTC | #1
On 21/02/2024 15:04, root wrote:
> From: Nitesh Gupta <nitegupt@quicinc.com>
> 
> Synopsys Controllers provide capabilities to detect various controller
> level errors. These can range from controller interface error to random
> PCIe configuration errors. This patch intends to add support to detect
> these errors and report it to userspace entity via sysfs, which can take
> appropriate actions to mitigate the errors.
> 

..

> +
>  static void qcom_pcie_init_debugfs(struct qcom_pcie *pcie)
>  {
>  	struct dw_pcie *pci = pcie->pci;
> @@ -1496,6 +1829,21 @@ static int qcom_pcie_probe(struct platform_device *pdev)
>  		goto err_pm_runtime_put;
>  	}
>  
> +	pcie->global_irq = platform_get_irq_byname(pdev, "global");
> +	if (pcie->global_irq < 0) {
> +		ret = pcie->global_irq;
> +		goto err_pm_runtime_put;

How does this work with old DTS and with all other platforms? Was it tested?

Best regards,
Krzysztof
  
Bjorn Helgaas Feb. 21, 2024, 6:50 p.m. UTC | #2
On Wed, Feb 21, 2024 at 07:34:04PM +0530, root wrote:
> From: Nitesh Gupta <nitegupt@quicinc.com>
> 
> Synopsys Controllers provide capabilities to detect various controller

"Synopsys controllers"?  "Synopsys" refers to the DesignWare core, but
most of this code is in the qcom driver.  If it's qcom-specific, this
should say "Qualcomm controllers".

> level errors. These can range from controller interface error to random
> PCIe configuration errors. This patch intends to add support to detect
> these errors and report it to userspace entity via sysfs, which can take
> appropriate actions to mitigate the errors.

s/This patch intends to add/Add/, so the commit log says what the
patch *does*, not "what it intends to do".

> +
> +/*
> + * Error Reporting DBI register
> + */

Typical style in this file (granted, it's not 100% consistent) is to
make these single-line comments, i.e.,

  /* Error Reporting DBI register */

> +#define DBI_DEVICE_CONTROL_DEVICE_STATUS	0x78
> +#define DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG	0x8c

Most other #defines in this file use upper-case hex.

> +#define PCIE_AER_EXT_CAP_ID			0x01

Why not the existing PCI_EXT_CAP_ID_ERR?  If this is the standard PCIe
AER stuff, we shouldn't make it needlessly device-specific.

> +#define PCI_EXT_CAP_RASDP_ID			0x0b

Looks like possibly PCI_EXT_CAP_ID_VNDR?  Capability IDs are
definitely not device-specific.  The fact that a PCI_EXT_CAP_ID_VNDR
capability in a device with Vendor ID PCI_VENDOR_ID_QCOM has a
qcom-specific meaning is obviously specific to qcom, but the
Capability ID itself is not.

> +/* DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG register fields */
> +#define PCIE_CAP_SYS_ERR_ON_CORR_ERR_EN		BIT(0)
> +#define PCIE_CAP_SYS_ERR_ON_NON_FATAL_ERR_EN	BIT(1)
> +#define PCIE_CAP_SYS_ERR_ON_FATAL_ERR_EN	BIT(2)
> +
> +/* DBI_DEVICE_CONTROL_DEVICE_STATUS register fields */
> +#define PCIE_CAP_UNSUPPORT_REQ_REP_EN		BIT(3)
> +#define PCIE_CAP_FATAL_ERR_REPORT_EN		BIT(2)
> +#define PCIE_CAP_NON_FATAL_ERR_REPORT_EN	BIT(1)
> +#define PCIE_CAP_CORR_ERR_REPORT_EN		BIT(0)

These look like alternate ways to access the generic PCIe Capability.
If that's the case, either use the existing PCI_EXP_RTCTL_SECEE,
PCI_EXP_DEVCTL_CERE, etc., or at least match the "RTCTL_SECEE" parts
of the names so we can see the connection.

> +/* DBI_ADV_ERR_CAP_CTRL_OFF register fields */
> +#define ECRC_GEN_EN				BIT(6)
> +#define ECRC_CHECK_EN				BIT(8)

Do these correspond to PCI_ERR_CAP_ECRC_GENE, PCI_ERR_CAP_ECRC_CHKE?

> +/* DBI_ROOT_ERR_CMD_OFF register fields */
> +#define CORR_ERR_REPORTING_EN			BIT(0)
> +#define NON_FATAL_ERR_REPORTING_EN		BIT(1)
> +#define FATAL_ERR_REPORTING_EN			BIT(2)

PCI_ERR_ROOT_CMD_COR_EN, etc?

> +static void qcom_pcie_enable_error_reporting_2_7_0(struct qcom_pcie *pcie)
> +{
> + ...

> +	val = readl(pci->dbi_base + DBI_DEVICE_CONTROL_DEVICE_STATUS);
> +	val |= (PCIE_CAP_CORR_ERR_REPORT_EN | PCIE_CAP_NON_FATAL_ERR_REPORT_EN |
> +			PCIE_CAP_FATAL_ERR_REPORT_EN | PCIE_CAP_UNSUPPORT_REQ_REP_EN);
> +	writel(val, pci->dbi_base + DBI_DEVICE_CONTROL_DEVICE_STATUS);

Is there any way to split the AER part (specified by the PCIe spec)
from the qcom-specific (or dwc-specific) part?  This looks an awful
lot like pci_enable_pcie_error_reporting(), and we should do this in
the PCI core in a generic way if possible.

> +	val = readl(pci->dbi_base + DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG);
> +	val |= (PCIE_CAP_SYS_ERR_ON_CORR_ERR_EN | PCIE_CAP_SYS_ERR_ON_NON_FATAL_ERR_EN |
> +			PCIE_CAP_SYS_ERR_ON_FATAL_ERR_EN);
> +	writel(val, pci->dbi_base + DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG);

Bjorn
  
Frank Li Feb. 21, 2024, 9 p.m. UTC | #3
On Wed, Feb 21, 2024 at 07:34:04PM +0530, root wrote:
> From: Nitesh Gupta <nitegupt@quicinc.com>
> 
> Synopsys Controllers provide capabilities to detect various controller
> level errors. These can range from controller interface error to random
> PCIe configuration errors. This patch intends to add support to detect
> these errors and report it to userspace entity via sysfs, which can take
> appropriate actions to mitigate the errors.
> 
> Signed-off-by: Nitesh Gupta <nitegupt@quicinc.com>
> Signed-off-by: Mrinmay Sarkar <quic_msarkar@quicinc.com>
> ---
>  drivers/pci/controller/dwc/pcie-designware.h |  26 ++
>  drivers/pci/controller/dwc/pcie-qcom.c       | 350 +++++++++++++++++++
>  2 files changed, 376 insertions(+)
> 
> diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
> index 26dae4837462..cd45f9a2f9bc 100644
> --- a/drivers/pci/controller/dwc/pcie-designware.h
> +++ b/drivers/pci/controller/dwc/pcie-designware.h
> @@ -223,6 +223,32 @@
>  
>  #define PCIE_RAS_DES_EVENT_COUNTER_DATA		0xc
>  
> +
> +/*
> + * Error Reporting DBI register
> + */
> +#define DBI_DEVICE_CONTROL_DEVICE_STATUS	0x78
> +#define DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG	0x8c
> +#define DBI_INTERFACE_TIMER_STATUS		0x938
> +#define DBI_SAFETY_MASK_OFF			0x960
> +#define DBI_SAFETY_STATUS			0x964
> +
> +#define DBI_ADV_ERR_CAP_CTRL_OFF		0x18
> +#define DBI_ROOT_ERR_CMD_OFF			0x2c
> +
> +/*
> + * RAS-DP register
> + */
> +#define PCIE_RASDP_ERROR_MODE_EN_REG		0x28
> +#define RASDP_ERROR_MODE_EN			BIT(0)
> +
> +/*
> + * Interface Timer register
> + */
> +#define PCIE_INTERFACE_TIMER_CONTROL		0x930
> +#define INTERFACE_TIMER_EN			BIT(0)
> +#define INTERFACE_TIMER_AER_EN			BIT(1)
> +
>  /*
>   * The default address offset between dbi_base and atu_base. Root controller
>   * drivers are not required to initialize atu_base if the offset matches this
> diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
> index 10f2d0bb86be..138e3b08d4b9 100644
> --- a/drivers/pci/controller/dwc/pcie-qcom.c
> +++ b/drivers/pci/controller/dwc/pcie-qcom.c
> @@ -20,6 +20,7 @@
>  #include <linux/kernel.h>
>  #include <linux/init.h>
>  #include <linux/of.h>
> +#include <linux/device.h>
>  #include <linux/of_gpio.h>
>  #include <linux/pci.h>
>  #include <linux/pm_runtime.h>
> @@ -68,6 +69,73 @@
>  #define PARF_DEBUG_CNT_AUX_CLK_IN_L1SUB_L1	0xc84
>  #define PARF_DEBUG_CNT_AUX_CLK_IN_L1SUB_L2	0xc88
>  
> +/* Error Reporting Parf Registers */
> +#define PARF_INT_ALL_STATUS			0x224
> +#define PARF_INT_ALL_CLEAR			0x228
> +#define PARF_INT_CLEAR				0x21c
> +#define PARF_INT_STATUS			0x220
> +#define PARF_INT_ALL_MASK			0x22c
> +#define PARF_INT_ALL_2_CLEAR			0x504
> +#define PARF_INT_ALL_2_STATUS			0x500
> +#define PARF_INT_ALL_3_CLEAR			0x2e14
> +#define PARF_INT_ALL_3_STATUS			0x2e10
> +#define PARF_INT_ALL_4_CLEAR			0x2dd8
> +#define PARF_INT_ALL_4_STATUS			0x2dd0
> +#define PARF_INT_ALL_5_CLEAR			0x2ddc
> +#define PARF_INT_ALL_5_STATUS			0x2dd4
> +#define PARF_CFG_SAFETY_INT_MASK_CTRL		0x2c60
> +
> +
> +#define PCIE_AER_EXT_CAP_ID			0x01
> +#define PCI_EXT_CAP_RASDP_ID			0x0b
> +
> +/* Interrupt Masks */
> +#define CFGPCIE_INT_ALL_STATUS_MASK		0x3ff3e
> +#define CFGPCIE_PARF_INT_STATUS_MASK		0x1b
> +#define CFGPCIE_INTERFACE_TIMER_STATUS_MASK	0xe7b
> +#define CFGPCIE_INT_ALL_2_STATUS_MASK		GENMASK(24, 0)
> +#define CFGPCIE_INT_ALL_3_STATUS_MASK		GENMASK(31, 0)
> +#define CFGPCIE_INT_ALL_4_STATUS_MASK		GENMASK(31, 0)
> +#define CFGPCIE_INT_ALL_5_STATUS_MASK		GENMASK(31, 0)
> +
> +/* PCI_INTERRUPT_LINE register field */
> +#define SERR_EN					BIT(17)
> +
> +/* DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG register fields */
> +#define PCIE_CAP_SYS_ERR_ON_CORR_ERR_EN		BIT(0)
> +#define PCIE_CAP_SYS_ERR_ON_NON_FATAL_ERR_EN	BIT(1)
> +#define PCIE_CAP_SYS_ERR_ON_FATAL_ERR_EN	BIT(2)
> +
> +/* DBI_DEVICE_CONTROL_DEVICE_STATUS register fields */
> +#define PCIE_CAP_UNSUPPORT_REQ_REP_EN		BIT(3)
> +#define PCIE_CAP_FATAL_ERR_REPORT_EN		BIT(2)
> +#define PCIE_CAP_NON_FATAL_ERR_REPORT_EN	BIT(1)
> +#define PCIE_CAP_CORR_ERR_REPORT_EN		BIT(0)
> +
> +/* PARF_CFG_SAFETY_INT_MASK_CTRL register fields */
> +#define CFG_SAFETY_UNCORR_INT_MASK		BIT(0)
> +#define CFG_SAFETY_CORR_INT_MASK		BIT(1)
> +
> +/* DBI_ADV_ERR_CAP_CTRL_OFF register fields */
> +#define ECRC_GEN_EN				BIT(6)
> +#define ECRC_CHECK_EN				BIT(8)
> +
> +/* DBI_ROOT_ERR_CMD_OFF register fields */
> +#define CORR_ERR_REPORTING_EN			BIT(0)
> +#define NON_FATAL_ERR_REPORTING_EN		BIT(1)
> +#define FATAL_ERR_REPORTING_EN			BIT(2)
> +
> +/* DBI_SAFETY_MASK_OFF register fields */
> +#define SAFETY_INT_MASK			GENMASK(5, 0)
> +
> +/* DBI_SAFETY_STATUS register fields */
> +#define PCIE_RASDP_UNCORR_ERR			BIT(0)
> +#define PCIE_IFACE_TMR_ERR			BIT(1)
> +#define PCIE_CDM_CHK_ERR			BIT(2)
> +#define PCIE_AER_UNCORR_ERR			BIT(3)
> +#define PCIE_AER_CORR_ERR			BIT(4)
> +#define PCIE_RASDP_CORR_ERR			BIT(5)
> +
>  /* PARF_SYS_CTRL register fields */
>  #define MAC_PHY_POWERDOWN_IN_P2_D_MUX_EN	BIT(29)
>  #define MST_WAKEUP_EN				BIT(13)
> @@ -231,6 +299,24 @@ struct qcom_pcie_cfg {
>  	const struct qcom_pcie_ops *ops;
>  };
>  
> +enum qcom_pcie_fault_code {
> +	RASDP_UNCORR_ERROR,	/* RASDP uncorrectable error */
> +	RASDP_CORR_ERROR,	/* RASDP correctable error */
> +	CDM_REG_CHK_ERROR,	/* CDM register check error */
> +	INTERFACE_TIMER_ERROR,	/* PCIe local bus interface timer error */
> +	PCIE_SPURIOUS_INT,	/* Spurious Interrupt received */
> +	MAX_PCIE_SAFETY_FAULT	/* Maximum PCIe fault source code supported */
> +};
> +
> +static const char * const pcie_fault_string[] = {
> +	"RASDP_Uncorr_Error",
> +	"RASDP_Corr_Error",
> +	"CDM_Reg_Chk_Error",
> +	"Interface_Timer_Error",
> +	"PCIe_Spurious_Interrupt",
> +	"TOTAL_PCIE_FAULTS",
> +};
> +
>  struct qcom_pcie {
>  	struct dw_pcie *pci;
>  	void __iomem *parf;			/* DT parf */
> @@ -243,6 +329,10 @@ struct qcom_pcie {
>  	const struct qcom_pcie_cfg *cfg;
>  	struct dentry *debugfs;
>  	bool suspended;
> +	int global_irq;
> +	spinlock_t safety_lock;
> +	u32 pcie_fault[MAX_PCIE_SAFETY_FAULT];
> +	u32 pcie_fault_total;
>  };
>  
>  #define to_qcom_pcie(x)		dev_get_drvdata((x)->dev)
> @@ -959,9 +1049,94 @@ static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
>  	return ret;
>  }
>  
> +static void qcom_pcie_enable_error_reporting_2_7_0(struct qcom_pcie *pcie)
> +{
> +	struct dw_pcie *pci = pcie->pci;
> +	u32 val, offset;
> +
> +	/* Clear all the interrupts before we enable it */
> +	writel(0, pci->dbi_base + DBI_SAFETY_STATUS);
> +	writel(0, pci->dbi_base + DBI_INTERFACE_TIMER_STATUS);
> +
> +	/* Enable interrupts which are aggregated using GLOBAL_INT */
> +	writel(CFGPCIE_INT_ALL_STATUS_MASK, pcie->parf + PARF_INT_ALL_CLEAR);
> +	writel(CFGPCIE_PARF_INT_STATUS_MASK, pcie->parf + PARF_INT_CLEAR);
> +	writel(CFGPCIE_INT_ALL_2_STATUS_MASK, pcie->parf + PARF_INT_ALL_2_CLEAR);
> +	writel(CFGPCIE_INT_ALL_3_STATUS_MASK, pcie->parf + PARF_INT_ALL_3_CLEAR);
> +	writel(CFGPCIE_INT_ALL_4_STATUS_MASK, pcie->parf + PARF_INT_ALL_4_CLEAR);
> +	writel(CFGPCIE_INT_ALL_5_STATUS_MASK, pcie->parf + PARF_INT_ALL_5_CLEAR);
> +
> +	dw_pcie_dbi_ro_wr_en(pci);
> +
> +	val = readl(pci->dbi_base + PCI_INTERRUPT_LINE);
> +	val |= SERR_EN;
> +	writel(val, pci->dbi_base + PCI_INTERRUPT_LINE);
> +
> +	val = readl(pci->dbi_base + DBI_DEVICE_CONTROL_DEVICE_STATUS);
> +	val |= (PCIE_CAP_CORR_ERR_REPORT_EN | PCIE_CAP_NON_FATAL_ERR_REPORT_EN |
> +			PCIE_CAP_FATAL_ERR_REPORT_EN | PCIE_CAP_UNSUPPORT_REQ_REP_EN);
> +	writel(val, pci->dbi_base + DBI_DEVICE_CONTROL_DEVICE_STATUS);
> +
> +	val = readl(pci->dbi_base + DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG);
> +	val |= (PCIE_CAP_SYS_ERR_ON_CORR_ERR_EN | PCIE_CAP_SYS_ERR_ON_NON_FATAL_ERR_EN |
> +			PCIE_CAP_SYS_ERR_ON_FATAL_ERR_EN);
> +	writel(val, pci->dbi_base + DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG);
> +
> +	dw_pcie_dbi_ro_wr_dis(pci);
> +
> +	/* Enable RAS-DP Interrupts */
> +	offset = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_RASDP_ID);
> +	val = readl(pci->dbi_base + offset + PCIE_RASDP_ERROR_MODE_EN_REG);
> +	val |= RASDP_ERROR_MODE_EN;
> +	writel(val, pci->dbi_base + PCIE_RASDP_ERROR_MODE_EN_REG);
> +
> +	/* Enable CDM Check */
> +	val = readl(pci->dbi_base + PCIE_PL_CHK_REG_CONTROL_STATUS);
> +	/* Enable continuous CMD register check mode */
> +	val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS;
> +	/* Start the CDM register check */
> +	val |= PCIE_PL_CHK_REG_CHK_REG_START;
> +	/* Enable comparison CDM register check mode */
> +	val |= PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR;
> +	/* Enable logic CDM register check mode */
> +	val |= PCIE_PL_CHK_REG_CHK_REG_LOGIC_ERROR;
> +	writel(val, pci->dbi_base + PCIE_PL_CHK_REG_CONTROL_STATUS);
> +
> +	/* Interface Timer Enable */
> +	val = readl(pci->dbi_base + PCIE_INTERFACE_TIMER_CONTROL);
> +	val |=  (INTERFACE_TIMER_EN | INTERFACE_TIMER_AER_EN);
> +	writel(val, pci->dbi_base + PCIE_INTERFACE_TIMER_CONTROL);

You defined DBI_* at pcie-designware.h. Supposed it should be same for
all chips that use dwc controller. So I think you should put common code
into pcie-designware.c to avoid duplicate the same logic at every chip
vendor's driver.

Frank


> +
> +	/* Enable safety correctable and uncorrectable error reporting */
> +	val = readl(pcie->parf + PARF_CFG_SAFETY_INT_MASK_CTRL);
> +	val |= (CFG_SAFETY_UNCORR_INT_MASK | CFG_SAFETY_CORR_INT_MASK);
> +	writel(val, pcie->parf + PARF_CFG_SAFETY_INT_MASK_CTRL);
> +
> +	/* Enable CRC check and generation */
> +	offset = dw_pcie_find_ext_capability(pci, PCIE_AER_EXT_CAP_ID);
> +	val = readl(pci->dbi_base + offset + DBI_ADV_ERR_CAP_CTRL_OFF);
> +	val |= (ECRC_GEN_EN | ECRC_CHECK_EN);
> +	writel(val, pci->dbi_base + offset + DBI_ADV_ERR_CAP_CTRL_OFF);
> +
> +	/* Enable AER */
> +	val = readl(pci->dbi_base + offset + DBI_ROOT_ERR_CMD_OFF);
> +	val |= (CORR_ERR_REPORTING_EN | NON_FATAL_ERR_REPORTING_EN
> +		| FATAL_ERR_REPORTING_EN);
> +	writel(val, pci->dbi_base + offset + DBI_ROOT_ERR_CMD_OFF);
> +
> +	/* Enable interrupts */
> +	val = readl(pci->dbi_base + DBI_SAFETY_MASK_OFF);
> +	val &= ~(SAFETY_INT_MASK);
> +	writel(val, pci->dbi_base + DBI_SAFETY_MASK_OFF);
> +
> +	/* Disable Legacy Interrupts */
> +	writel(0, pcie->parf + PARF_INT_ALL_MASK);
> +}
> +
>  static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
>  {
>  	qcom_pcie_clear_hpc(pcie->pci);
> +	qcom_pcie_enable_error_reporting_2_7_0(pcie);
>  
>  	return 0;
>  }
> @@ -1416,6 +1591,130 @@ static void qcom_pcie_icc_update(struct qcom_pcie *pcie)
>  	}
>  }
>  
> +static void qcom_pcie_check_spurious_int(struct qcom_pcie *pcie)
> +{
> +	struct dw_pcie *pci = pcie->pci;
> +	u32 *pcie_fault = pcie->pcie_fault;
> +	struct device *dev = pci->dev;
> +	struct kobject *kobj_ref = &dev->kobj;
> +	u32 val;
> +
> +	val = readl(pci->dbi_base + DBI_INTERFACE_TIMER_STATUS);
> +	if (val & CFGPCIE_INTERFACE_TIMER_STATUS_MASK)
> +		return;
> +
> +	val = readl(pcie->parf + PARF_INT_ALL_STATUS);
> +	if (val & CFGPCIE_INT_ALL_STATUS_MASK)
> +		return;
> +
> +	val = readl(pcie->parf + PARF_INT_STATUS);
> +	if (val & CFGPCIE_PARF_INT_STATUS_MASK)
> +		return;
> +
> +	val = readl(pcie->parf + PARF_INT_ALL_2_STATUS);
> +	if (val & CFGPCIE_INT_ALL_2_STATUS_MASK)
> +		return;
> +
> +	val = readl(pcie->parf + PARF_INT_ALL_3_STATUS);
> +	if (val & CFGPCIE_INT_ALL_3_STATUS_MASK)
> +		return;
> +
> +	val = readl(pcie->parf + PARF_INT_ALL_4_STATUS);
> +	if (val & CFGPCIE_INT_ALL_4_STATUS_MASK)
> +		return;
> +
> +	val = readl(pcie->parf + PARF_INT_ALL_5_STATUS);
> +	if (val & CFGPCIE_INT_ALL_5_STATUS_MASK)
> +		return;
> +
> +	dev_err(pci->dev, "PCIe Spurious Interrupt");
> +	pcie_fault[PCIE_SPURIOUS_INT]++;
> +	pcie->pcie_fault_total++;
> +	sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
> +}
> +
> +static irqreturn_t qcom_pcie_global_irq_thread(int irq, void *data)
> +{
> +	struct qcom_pcie *pcie = data;
> +	struct dw_pcie *pci = pcie->pci;
> +	u32 *pcie_fault = pcie->pcie_fault;
> +	struct device *dev = pci->dev;
> +	struct kobject *kobj_ref = &dev->kobj;
> +	unsigned long irqsave_flags;
> +	u32 val, int_status;
> +
> +	spin_lock_irqsave(&pcie->safety_lock, irqsave_flags);
> +
> +	int_status = readl(pci->dbi_base + DBI_SAFETY_STATUS);
> +	writel(0, pci->dbi_base + DBI_SAFETY_STATUS);
> +
> +	if (int_status) {
> +		dev_err(pci->dev, "global interrupt fired status: %u", int_status);
> +
> +		if (int_status & PCIE_RASDP_UNCORR_ERR) {
> +			dev_err(pci->dev, "RASDP uncorrectable error triggered");
> +			pcie_fault[RASDP_UNCORR_ERROR]++;
> +			pcie->pcie_fault_total++;
> +			sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
> +
> +			/*
> +			 * rasdp_uncorr_err ends up triggering a
> +			 * pcie_uncorr error continuously. So masking
> +			 * pcie_uncorr interrupts .
> +			 */
> +			val = readl(pci->dbi_base + DBI_SAFETY_MASK_OFF);
> +			val |= PCIE_AER_UNCORR_ERR;
> +			writel(val, pci->dbi_base + DBI_SAFETY_MASK_OFF);
> +		}
> +
> +		if (int_status & PCIE_CDM_CHK_ERR) {
> +			dev_err(pci->dev, "CDM error triggered");
> +			val = readl(pci->dbi_base + PCIE_PL_CHK_REG_CONTROL_STATUS);
> +
> +			if (val & PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR) {
> +				pcie_fault[CDM_REG_CHK_ERROR]++;
> +				pcie->pcie_fault_total++;
> +				sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
> +
> +				/*
> +				 * cdm_chk_err injection results in a continuous
> +				 * interrupt storm on certain targets, so masking it.
> +				 */
> +				val = readl(pci->dbi_base + DBI_SAFETY_MASK_OFF);
> +				val |= (PCIE_CDM_CHK_ERR | PCIE_AER_UNCORR_ERR);
> +				writel(val, pci->dbi_base + DBI_SAFETY_MASK_OFF);
> +			}
> +		}
> +
> +		if (int_status & PCIE_IFACE_TMR_ERR) {
> +			dev_err(pci->dev, "Iface Timeout error triggered");
> +			pcie_fault[INTERFACE_TIMER_ERROR]++;
> +			pcie->pcie_fault_total++;
> +			sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
> +
> +			/*
> +			 * interface_timer_err injection results in a continuous
> +			 * interrupt storm on certain targets, so masking it.
> +			 */
> +			val = readl(pci->dbi_base + DBI_SAFETY_MASK_OFF);
> +			val |= (PCIE_IFACE_TMR_ERR | PCIE_AER_UNCORR_ERR);
> +			writel(val, pci->dbi_base + DBI_SAFETY_MASK_OFF);
> +		}
> +
> +		if (int_status & PCIE_RASDP_CORR_ERR) {
> +			dev_err(pci->dev, "RASDP correctable error triggered");
> +			pcie_fault[RASDP_CORR_ERROR]++;
> +			pcie->pcie_fault_total++;
> +			sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
> +		}
> +	} else {
> +		qcom_pcie_check_spurious_int(pcie);
> +	}
> +
> +	spin_unlock_irqrestore(&pcie->safety_lock, irqsave_flags);
> +	return IRQ_HANDLED;
> +}
> +
>  static int qcom_pcie_link_transition_count(struct seq_file *s, void *data)
>  {
>  	struct qcom_pcie *pcie = (struct qcom_pcie *)dev_get_drvdata(s->private);
> @@ -1438,6 +1737,40 @@ static int qcom_pcie_link_transition_count(struct seq_file *s, void *data)
>  	return 0;
>  }
>  
> +static ssize_t qcom_pcie_error_report_show(struct device *dev,
> +		struct device_attribute *attr,
> +		char *buf)
> +{
> +	unsigned int i;
> +	struct qcom_pcie *pcie = (struct qcom_pcie *)dev_get_drvdata(dev);
> +	u32 *pcie_fault = pcie->pcie_fault;
> +	size_t len = 0;
> +
> +	for (i = 0; i < MAX_PCIE_SAFETY_FAULT; i++) {
> +		if (pcie_fault_string[i])
> +			len += sysfs_emit_at(buf, len, "%s: %lu\n",
> +					pcie_fault_string[i],
> +					pcie_fault[i]);
> +	}
> +
> +	len += sysfs_emit_at(buf, len, "%s: %lu\n",
> +					pcie_fault_string[i],
> +					pcie->pcie_fault_total);
> +
> +	return len;
> +}
> +static DEVICE_ATTR_RO(qcom_pcie_error_report);
> +
> +static struct attribute *qcom_pcie_attrs[] = {
> +	&dev_attr_qcom_pcie_error_report.attr,
> +	NULL,
> +};
> +
> +static const struct attribute_group qcom_pcie_attribute_group = {
> +	.attrs = qcom_pcie_attrs,
> +	.name = "qcom_pcie"
> +};
> +
>  static void qcom_pcie_init_debugfs(struct qcom_pcie *pcie)
>  {
>  	struct dw_pcie *pci = pcie->pci;
> @@ -1496,6 +1829,21 @@ static int qcom_pcie_probe(struct platform_device *pdev)
>  		goto err_pm_runtime_put;
>  	}
>  
> +	pcie->global_irq = platform_get_irq_byname(pdev, "global");
> +	if (pcie->global_irq < 0) {
> +		ret = pcie->global_irq;
> +		goto err_pm_runtime_put;
> +	}
> +
> +	ret = devm_request_threaded_irq(dev, pcie->global_irq, NULL,
> +				qcom_pcie_global_irq_thread,
> +				IRQF_ONESHOT,
> +				"global_irq", pcie);
> +	if (ret) {
> +		dev_err(dev, "Failed to request Global IRQ\n");
> +		goto err_pm_runtime_put;
> +	}
> +
>  	pcie->parf = devm_platform_ioremap_resource_byname(pdev, "parf");
>  	if (IS_ERR(pcie->parf)) {
>  		ret = PTR_ERR(pcie->parf);
> @@ -1551,6 +1899,8 @@ static int qcom_pcie_probe(struct platform_device *pdev)
>  	if (pcie->mhi)
>  		qcom_pcie_init_debugfs(pcie);
>  
> +	sysfs_create_group(&pdev->dev.kobj, &qcom_pcie_attribute_group);
> +
>  	return 0;
>  
>  err_phy_exit:
> -- 
> 2.40.1
>
  
kernel test robot Feb. 22, 2024, 11:01 a.m. UTC | #4
Hi root,

kernel test robot noticed the following build warnings:

[auto build test WARNING on pci/next]
[also build test WARNING on pci/for-linus robh/for-next mani-mhi/mhi-next linus/master v6.8-rc5 next-20240221]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/root/dt-bindings-PCI-qcom-Add-global-irq-support-for-SA8775p/20240221-220722
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link:    https://lore.kernel.org/r/20240221140405.28532-4-root%40hu-msarkar-hyd.qualcomm.com
patch subject: [PATCH v1 3/3] PCI: qcom: Add support for detecting controller level PCIe errors
config: microblaze-allmodconfig (https://download.01.org/0day-ci/archive/20240222/202402221838.5n7vo0Jo-lkp@intel.com/config)
compiler: microblaze-linux-gcc (GCC) 13.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240222/202402221838.5n7vo0Jo-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202402221838.5n7vo0Jo-lkp@intel.com/

All warnings (new ones prefixed by >>):

   drivers/pci/controller/dwc/pcie-qcom.c: In function 'qcom_pcie_error_report_show':
>> drivers/pci/controller/dwc/pcie-qcom.c:1751:63: warning: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'u32' {aka 'unsigned int'} [-Wformat=]
    1751 |                         len += sysfs_emit_at(buf, len, "%s: %lu\n",
         |                                                             ~~^
         |                                                               |
         |                                                               long unsigned int
         |                                                             %u
    1752 |                                         pcie_fault_string[i],
    1753 |                                         pcie_fault[i]);
         |                                         ~~~~~~~~~~~~~          
         |                                                   |
         |                                                   u32 {aka unsigned int}
   drivers/pci/controller/dwc/pcie-qcom.c:1756:47: warning: format '%lu' expects argument of type 'long unsigned int', but argument 5 has type 'u32' {aka 'unsigned int'} [-Wformat=]
    1756 |         len += sysfs_emit_at(buf, len, "%s: %lu\n",
         |                                             ~~^
         |                                               |
         |                                               long unsigned int
         |                                             %u
    1757 |                                         pcie_fault_string[i],
    1758 |                                         pcie->pcie_fault_total);
         |                                         ~~~~~~~~~~~~~~~~~~~~~~
         |                                             |
         |                                             u32 {aka unsigned int}
   drivers/pci/controller/dwc/pcie-qcom.c: In function 'qcom_pcie_probe':
>> drivers/pci/controller/dwc/pcie-qcom.c:1902:9: warning: ignoring return value of 'sysfs_create_group' declared with attribute 'warn_unused_result' [-Wunused-result]
    1902 |         sysfs_create_group(&pdev->dev.kobj, &qcom_pcie_attribute_group);
         |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~


vim +1751 drivers/pci/controller/dwc/pcie-qcom.c

  1739	
  1740	static ssize_t qcom_pcie_error_report_show(struct device *dev,
  1741			struct device_attribute *attr,
  1742			char *buf)
  1743	{
  1744		unsigned int i;
  1745		struct qcom_pcie *pcie = (struct qcom_pcie *)dev_get_drvdata(dev);
  1746		u32 *pcie_fault = pcie->pcie_fault;
  1747		size_t len = 0;
  1748	
  1749		for (i = 0; i < MAX_PCIE_SAFETY_FAULT; i++) {
  1750			if (pcie_fault_string[i])
> 1751				len += sysfs_emit_at(buf, len, "%s: %lu\n",
  1752						pcie_fault_string[i],
  1753						pcie_fault[i]);
  1754		}
  1755	
  1756		len += sysfs_emit_at(buf, len, "%s: %lu\n",
  1757						pcie_fault_string[i],
  1758						pcie->pcie_fault_total);
  1759	
  1760		return len;
  1761	}
  1762	static DEVICE_ATTR_RO(qcom_pcie_error_report);
  1763	
  1764	static struct attribute *qcom_pcie_attrs[] = {
  1765		&dev_attr_qcom_pcie_error_report.attr,
  1766		NULL,
  1767	};
  1768	
  1769	static const struct attribute_group qcom_pcie_attribute_group = {
  1770		.attrs = qcom_pcie_attrs,
  1771		.name = "qcom_pcie"
  1772	};
  1773	
  1774	static void qcom_pcie_init_debugfs(struct qcom_pcie *pcie)
  1775	{
  1776		struct dw_pcie *pci = pcie->pci;
  1777		struct device *dev = pci->dev;
  1778		char *name;
  1779	
  1780		name = devm_kasprintf(dev, GFP_KERNEL, "%pOFP", dev->of_node);
  1781		if (!name)
  1782			return;
  1783	
  1784		pcie->debugfs = debugfs_create_dir(name, NULL);
  1785		debugfs_create_devm_seqfile(dev, "link_transition_count", pcie->debugfs,
  1786					    qcom_pcie_link_transition_count);
  1787	}
  1788	
  1789	static int qcom_pcie_probe(struct platform_device *pdev)
  1790	{
  1791		const struct qcom_pcie_cfg *pcie_cfg;
  1792		struct device *dev = &pdev->dev;
  1793		struct qcom_pcie *pcie;
  1794		struct dw_pcie_rp *pp;
  1795		struct resource *res;
  1796		struct dw_pcie *pci;
  1797		int ret;
  1798	
  1799		pcie_cfg = of_device_get_match_data(dev);
  1800		if (!pcie_cfg || !pcie_cfg->ops) {
  1801			dev_err(dev, "Invalid platform data\n");
  1802			return -EINVAL;
  1803		}
  1804	
  1805		pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
  1806		if (!pcie)
  1807			return -ENOMEM;
  1808	
  1809		pci = devm_kzalloc(dev, sizeof(*pci), GFP_KERNEL);
  1810		if (!pci)
  1811			return -ENOMEM;
  1812	
  1813		pm_runtime_enable(dev);
  1814		ret = pm_runtime_get_sync(dev);
  1815		if (ret < 0)
  1816			goto err_pm_runtime_put;
  1817	
  1818		pci->dev = dev;
  1819		pci->ops = &dw_pcie_ops;
  1820		pp = &pci->pp;
  1821	
  1822		pcie->pci = pci;
  1823	
  1824		pcie->cfg = pcie_cfg;
  1825	
  1826		pcie->reset = devm_gpiod_get_optional(dev, "perst", GPIOD_OUT_HIGH);
  1827		if (IS_ERR(pcie->reset)) {
  1828			ret = PTR_ERR(pcie->reset);
  1829			goto err_pm_runtime_put;
  1830		}
  1831	
  1832		pcie->global_irq = platform_get_irq_byname(pdev, "global");
  1833		if (pcie->global_irq < 0) {
  1834			ret = pcie->global_irq;
  1835			goto err_pm_runtime_put;
  1836		}
  1837	
  1838		ret = devm_request_threaded_irq(dev, pcie->global_irq, NULL,
  1839					qcom_pcie_global_irq_thread,
  1840					IRQF_ONESHOT,
  1841					"global_irq", pcie);
  1842		if (ret) {
  1843			dev_err(dev, "Failed to request Global IRQ\n");
  1844			goto err_pm_runtime_put;
  1845		}
  1846	
  1847		pcie->parf = devm_platform_ioremap_resource_byname(pdev, "parf");
  1848		if (IS_ERR(pcie->parf)) {
  1849			ret = PTR_ERR(pcie->parf);
  1850			goto err_pm_runtime_put;
  1851		}
  1852	
  1853		pcie->elbi = devm_platform_ioremap_resource_byname(pdev, "elbi");
  1854		if (IS_ERR(pcie->elbi)) {
  1855			ret = PTR_ERR(pcie->elbi);
  1856			goto err_pm_runtime_put;
  1857		}
  1858	
  1859		/* MHI region is optional */
  1860		res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "mhi");
  1861		if (res) {
  1862			pcie->mhi = devm_ioremap_resource(dev, res);
  1863			if (IS_ERR(pcie->mhi)) {
  1864				ret = PTR_ERR(pcie->mhi);
  1865				goto err_pm_runtime_put;
  1866			}
  1867		}
  1868	
  1869		pcie->phy = devm_phy_optional_get(dev, "pciephy");
  1870		if (IS_ERR(pcie->phy)) {
  1871			ret = PTR_ERR(pcie->phy);
  1872			goto err_pm_runtime_put;
  1873		}
  1874	
  1875		ret = qcom_pcie_icc_init(pcie);
  1876		if (ret)
  1877			goto err_pm_runtime_put;
  1878	
  1879		ret = pcie->cfg->ops->get_resources(pcie);
  1880		if (ret)
  1881			goto err_pm_runtime_put;
  1882	
  1883		pp->ops = &qcom_pcie_dw_ops;
  1884	
  1885		ret = phy_init(pcie->phy);
  1886		if (ret)
  1887			goto err_pm_runtime_put;
  1888	
  1889		platform_set_drvdata(pdev, pcie);
  1890	
  1891		ret = dw_pcie_host_init(pp);
  1892		if (ret) {
  1893			dev_err(dev, "cannot initialize host\n");
  1894			goto err_phy_exit;
  1895		}
  1896	
  1897		qcom_pcie_icc_update(pcie);
  1898	
  1899		if (pcie->mhi)
  1900			qcom_pcie_init_debugfs(pcie);
  1901	
> 1902		sysfs_create_group(&pdev->dev.kobj, &qcom_pcie_attribute_group);
  1903	
  1904		return 0;
  1905	
  1906	err_phy_exit:
  1907		phy_exit(pcie->phy);
  1908	err_pm_runtime_put:
  1909		pm_runtime_put(dev);
  1910		pm_runtime_disable(dev);
  1911	
  1912		return ret;
  1913	}
  1914
  
Bjorn Helgaas Feb. 22, 2024, 12:38 p.m. UTC | #5
On Wed, Feb 21, 2024 at 12:50:17PM -0600, Bjorn Helgaas wrote:
> On Wed, Feb 21, 2024 at 07:34:04PM +0530, root wrote:
> > From: Nitesh Gupta <nitegupt@quicinc.com>
> > 
> > Synopsys Controllers provide capabilities to detect various controller
> > level errors. These can range from controller interface error to random
> > PCIe configuration errors. This patch intends to add support to detect
> > these errors and report it to userspace entity via sysfs, which can take
> > appropriate actions to mitigate the errors.

> > +static void qcom_pcie_enable_error_reporting_2_7_0(struct qcom_pcie *pcie)
> > +{
> > + ...
> 
> > +	val = readl(pci->dbi_base + DBI_DEVICE_CONTROL_DEVICE_STATUS);
> > +	val |= (PCIE_CAP_CORR_ERR_REPORT_EN | PCIE_CAP_NON_FATAL_ERR_REPORT_EN |
> > +			PCIE_CAP_FATAL_ERR_REPORT_EN | PCIE_CAP_UNSUPPORT_REQ_REP_EN);
> > +	writel(val, pci->dbi_base + DBI_DEVICE_CONTROL_DEVICE_STATUS);
> 
> Is there any way to split the AER part (specified by the PCIe spec)
> from the qcom-specific (or dwc-specific) part?  This looks an awful
> lot like pci_enable_pcie_error_reporting(), and we should do this in
> the PCI core in a generic way if possible.
> 
> > +	val = readl(pci->dbi_base + DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG);
> > +	val |= (PCIE_CAP_SYS_ERR_ON_CORR_ERR_EN | PCIE_CAP_SYS_ERR_ON_NON_FATAL_ERR_EN |
> > +			PCIE_CAP_SYS_ERR_ON_FATAL_ERR_EN);
> > +	writel(val, pci->dbi_base + DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG);

More to the point: why do we need to do this in the qcom driver at
all?  Why is pci_enable_pcie_error_reporting() not enough?

Bjorn
  
kernel test robot Feb. 22, 2024, 12:39 p.m. UTC | #6
Hi root,

kernel test robot noticed the following build warnings:

[auto build test WARNING on pci/next]
[also build test WARNING on pci/for-linus robh/for-next mani-mhi/mhi-next linus/master v6.8-rc5 next-20240221]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/root/dt-bindings-PCI-qcom-Add-global-irq-support-for-SA8775p/20240221-220722
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pci/pci.git next
patch link:    https://lore.kernel.org/r/20240221140405.28532-4-root%40hu-msarkar-hyd.qualcomm.com
patch subject: [PATCH v1 3/3] PCI: qcom: Add support for detecting controller level PCIe errors
config: i386-buildonly-randconfig-005-20240222 (https://download.01.org/0day-ci/archive/20240222/202402222044.JXTN6nr0-lkp@intel.com/config)
compiler: clang version 17.0.6 (https://github.com/llvm/llvm-project 6009708b4367171ccdbf4b5905cb6a803753fe18)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240222/202402222044.JXTN6nr0-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202402222044.JXTN6nr0-lkp@intel.com/

All warnings (new ones prefixed by >>):

>> drivers/pci/controller/dwc/pcie-qcom.c:1753:6: warning: format specifies type 'unsigned long' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
    1751 |                         len += sysfs_emit_at(buf, len, "%s: %lu\n",
         |                                                             ~~~
         |                                                             %u
    1752 |                                         pcie_fault_string[i],
    1753 |                                         pcie_fault[i]);
         |                                         ^~~~~~~~~~~~~
   drivers/pci/controller/dwc/pcie-qcom.c:1758:6: warning: format specifies type 'unsigned long' but the argument has type 'u32' (aka 'unsigned int') [-Wformat]
    1756 |         len += sysfs_emit_at(buf, len, "%s: %lu\n",
         |                                             ~~~
         |                                             %u
    1757 |                                         pcie_fault_string[i],
    1758 |                                         pcie->pcie_fault_total);
         |                                         ^~~~~~~~~~~~~~~~~~~~~~
>> drivers/pci/controller/dwc/pcie-qcom.c:1902:2: warning: ignoring return value of function declared with 'warn_unused_result' attribute [-Wunused-result]
    1902 |         sysfs_create_group(&pdev->dev.kobj, &qcom_pcie_attribute_group);
         |         ^~~~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   3 warnings generated.


vim +1753 drivers/pci/controller/dwc/pcie-qcom.c

  1739	
  1740	static ssize_t qcom_pcie_error_report_show(struct device *dev,
  1741			struct device_attribute *attr,
  1742			char *buf)
  1743	{
  1744		unsigned int i;
  1745		struct qcom_pcie *pcie = (struct qcom_pcie *)dev_get_drvdata(dev);
  1746		u32 *pcie_fault = pcie->pcie_fault;
  1747		size_t len = 0;
  1748	
  1749		for (i = 0; i < MAX_PCIE_SAFETY_FAULT; i++) {
  1750			if (pcie_fault_string[i])
  1751				len += sysfs_emit_at(buf, len, "%s: %lu\n",
  1752						pcie_fault_string[i],
> 1753						pcie_fault[i]);
  1754		}
  1755	
  1756		len += sysfs_emit_at(buf, len, "%s: %lu\n",
  1757						pcie_fault_string[i],
> 1758						pcie->pcie_fault_total);
  1759	
  1760		return len;
  1761	}
  1762	static DEVICE_ATTR_RO(qcom_pcie_error_report);
  1763	
  1764	static struct attribute *qcom_pcie_attrs[] = {
  1765		&dev_attr_qcom_pcie_error_report.attr,
  1766		NULL,
  1767	};
  1768	
  1769	static const struct attribute_group qcom_pcie_attribute_group = {
  1770		.attrs = qcom_pcie_attrs,
  1771		.name = "qcom_pcie"
  1772	};
  1773	
  1774	static void qcom_pcie_init_debugfs(struct qcom_pcie *pcie)
  1775	{
  1776		struct dw_pcie *pci = pcie->pci;
  1777		struct device *dev = pci->dev;
  1778		char *name;
  1779	
  1780		name = devm_kasprintf(dev, GFP_KERNEL, "%pOFP", dev->of_node);
  1781		if (!name)
  1782			return;
  1783	
  1784		pcie->debugfs = debugfs_create_dir(name, NULL);
  1785		debugfs_create_devm_seqfile(dev, "link_transition_count", pcie->debugfs,
  1786					    qcom_pcie_link_transition_count);
  1787	}
  1788	
  1789	static int qcom_pcie_probe(struct platform_device *pdev)
  1790	{
  1791		const struct qcom_pcie_cfg *pcie_cfg;
  1792		struct device *dev = &pdev->dev;
  1793		struct qcom_pcie *pcie;
  1794		struct dw_pcie_rp *pp;
  1795		struct resource *res;
  1796		struct dw_pcie *pci;
  1797		int ret;
  1798	
  1799		pcie_cfg = of_device_get_match_data(dev);
  1800		if (!pcie_cfg || !pcie_cfg->ops) {
  1801			dev_err(dev, "Invalid platform data\n");
  1802			return -EINVAL;
  1803		}
  1804	
  1805		pcie = devm_kzalloc(dev, sizeof(*pcie), GFP_KERNEL);
  1806		if (!pcie)
  1807			return -ENOMEM;
  1808	
  1809		pci = devm_kzalloc(dev, sizeof(*pci), GFP_KERNEL);
  1810		if (!pci)
  1811			return -ENOMEM;
  1812	
  1813		pm_runtime_enable(dev);
  1814		ret = pm_runtime_get_sync(dev);
  1815		if (ret < 0)
  1816			goto err_pm_runtime_put;
  1817	
  1818		pci->dev = dev;
  1819		pci->ops = &dw_pcie_ops;
  1820		pp = &pci->pp;
  1821	
  1822		pcie->pci = pci;
  1823	
  1824		pcie->cfg = pcie_cfg;
  1825	
  1826		pcie->reset = devm_gpiod_get_optional(dev, "perst", GPIOD_OUT_HIGH);
  1827		if (IS_ERR(pcie->reset)) {
  1828			ret = PTR_ERR(pcie->reset);
  1829			goto err_pm_runtime_put;
  1830		}
  1831	
  1832		pcie->global_irq = platform_get_irq_byname(pdev, "global");
  1833		if (pcie->global_irq < 0) {
  1834			ret = pcie->global_irq;
  1835			goto err_pm_runtime_put;
  1836		}
  1837	
  1838		ret = devm_request_threaded_irq(dev, pcie->global_irq, NULL,
  1839					qcom_pcie_global_irq_thread,
  1840					IRQF_ONESHOT,
  1841					"global_irq", pcie);
  1842		if (ret) {
  1843			dev_err(dev, "Failed to request Global IRQ\n");
  1844			goto err_pm_runtime_put;
  1845		}
  1846	
  1847		pcie->parf = devm_platform_ioremap_resource_byname(pdev, "parf");
  1848		if (IS_ERR(pcie->parf)) {
  1849			ret = PTR_ERR(pcie->parf);
  1850			goto err_pm_runtime_put;
  1851		}
  1852	
  1853		pcie->elbi = devm_platform_ioremap_resource_byname(pdev, "elbi");
  1854		if (IS_ERR(pcie->elbi)) {
  1855			ret = PTR_ERR(pcie->elbi);
  1856			goto err_pm_runtime_put;
  1857		}
  1858	
  1859		/* MHI region is optional */
  1860		res = platform_get_resource_byname(pdev, IORESOURCE_MEM, "mhi");
  1861		if (res) {
  1862			pcie->mhi = devm_ioremap_resource(dev, res);
  1863			if (IS_ERR(pcie->mhi)) {
  1864				ret = PTR_ERR(pcie->mhi);
  1865				goto err_pm_runtime_put;
  1866			}
  1867		}
  1868	
  1869		pcie->phy = devm_phy_optional_get(dev, "pciephy");
  1870		if (IS_ERR(pcie->phy)) {
  1871			ret = PTR_ERR(pcie->phy);
  1872			goto err_pm_runtime_put;
  1873		}
  1874	
  1875		ret = qcom_pcie_icc_init(pcie);
  1876		if (ret)
  1877			goto err_pm_runtime_put;
  1878	
  1879		ret = pcie->cfg->ops->get_resources(pcie);
  1880		if (ret)
  1881			goto err_pm_runtime_put;
  1882	
  1883		pp->ops = &qcom_pcie_dw_ops;
  1884	
  1885		ret = phy_init(pcie->phy);
  1886		if (ret)
  1887			goto err_pm_runtime_put;
  1888	
  1889		platform_set_drvdata(pdev, pcie);
  1890	
  1891		ret = dw_pcie_host_init(pp);
  1892		if (ret) {
  1893			dev_err(dev, "cannot initialize host\n");
  1894			goto err_phy_exit;
  1895		}
  1896	
  1897		qcom_pcie_icc_update(pcie);
  1898	
  1899		if (pcie->mhi)
  1900			qcom_pcie_init_debugfs(pcie);
  1901	
> 1902		sysfs_create_group(&pdev->dev.kobj, &qcom_pcie_attribute_group);
  1903	
  1904		return 0;
  1905	
  1906	err_phy_exit:
  1907		phy_exit(pcie->phy);
  1908	err_pm_runtime_put:
  1909		pm_runtime_put(dev);
  1910		pm_runtime_disable(dev);
  1911	
  1912		return ret;
  1913	}
  1914
  

Patch

diff --git a/drivers/pci/controller/dwc/pcie-designware.h b/drivers/pci/controller/dwc/pcie-designware.h
index 26dae4837462..cd45f9a2f9bc 100644
--- a/drivers/pci/controller/dwc/pcie-designware.h
+++ b/drivers/pci/controller/dwc/pcie-designware.h
@@ -223,6 +223,32 @@ 
 
 #define PCIE_RAS_DES_EVENT_COUNTER_DATA		0xc
 
+
+/*
+ * Error Reporting DBI register
+ */
+#define DBI_DEVICE_CONTROL_DEVICE_STATUS	0x78
+#define DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG	0x8c
+#define DBI_INTERFACE_TIMER_STATUS		0x938
+#define DBI_SAFETY_MASK_OFF			0x960
+#define DBI_SAFETY_STATUS			0x964
+
+#define DBI_ADV_ERR_CAP_CTRL_OFF		0x18
+#define DBI_ROOT_ERR_CMD_OFF			0x2c
+
+/*
+ * RAS-DP register
+ */
+#define PCIE_RASDP_ERROR_MODE_EN_REG		0x28
+#define RASDP_ERROR_MODE_EN			BIT(0)
+
+/*
+ * Interface Timer register
+ */
+#define PCIE_INTERFACE_TIMER_CONTROL		0x930
+#define INTERFACE_TIMER_EN			BIT(0)
+#define INTERFACE_TIMER_AER_EN			BIT(1)
+
 /*
  * The default address offset between dbi_base and atu_base. Root controller
  * drivers are not required to initialize atu_base if the offset matches this
diff --git a/drivers/pci/controller/dwc/pcie-qcom.c b/drivers/pci/controller/dwc/pcie-qcom.c
index 10f2d0bb86be..138e3b08d4b9 100644
--- a/drivers/pci/controller/dwc/pcie-qcom.c
+++ b/drivers/pci/controller/dwc/pcie-qcom.c
@@ -20,6 +20,7 @@ 
 #include <linux/kernel.h>
 #include <linux/init.h>
 #include <linux/of.h>
+#include <linux/device.h>
 #include <linux/of_gpio.h>
 #include <linux/pci.h>
 #include <linux/pm_runtime.h>
@@ -68,6 +69,73 @@ 
 #define PARF_DEBUG_CNT_AUX_CLK_IN_L1SUB_L1	0xc84
 #define PARF_DEBUG_CNT_AUX_CLK_IN_L1SUB_L2	0xc88
 
+/* Error Reporting Parf Registers */
+#define PARF_INT_ALL_STATUS			0x224
+#define PARF_INT_ALL_CLEAR			0x228
+#define PARF_INT_CLEAR				0x21c
+#define PARF_INT_STATUS			0x220
+#define PARF_INT_ALL_MASK			0x22c
+#define PARF_INT_ALL_2_CLEAR			0x504
+#define PARF_INT_ALL_2_STATUS			0x500
+#define PARF_INT_ALL_3_CLEAR			0x2e14
+#define PARF_INT_ALL_3_STATUS			0x2e10
+#define PARF_INT_ALL_4_CLEAR			0x2dd8
+#define PARF_INT_ALL_4_STATUS			0x2dd0
+#define PARF_INT_ALL_5_CLEAR			0x2ddc
+#define PARF_INT_ALL_5_STATUS			0x2dd4
+#define PARF_CFG_SAFETY_INT_MASK_CTRL		0x2c60
+
+
+#define PCIE_AER_EXT_CAP_ID			0x01
+#define PCI_EXT_CAP_RASDP_ID			0x0b
+
+/* Interrupt Masks */
+#define CFGPCIE_INT_ALL_STATUS_MASK		0x3ff3e
+#define CFGPCIE_PARF_INT_STATUS_MASK		0x1b
+#define CFGPCIE_INTERFACE_TIMER_STATUS_MASK	0xe7b
+#define CFGPCIE_INT_ALL_2_STATUS_MASK		GENMASK(24, 0)
+#define CFGPCIE_INT_ALL_3_STATUS_MASK		GENMASK(31, 0)
+#define CFGPCIE_INT_ALL_4_STATUS_MASK		GENMASK(31, 0)
+#define CFGPCIE_INT_ALL_5_STATUS_MASK		GENMASK(31, 0)
+
+/* PCI_INTERRUPT_LINE register field */
+#define SERR_EN					BIT(17)
+
+/* DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG register fields */
+#define PCIE_CAP_SYS_ERR_ON_CORR_ERR_EN		BIT(0)
+#define PCIE_CAP_SYS_ERR_ON_NON_FATAL_ERR_EN	BIT(1)
+#define PCIE_CAP_SYS_ERR_ON_FATAL_ERR_EN	BIT(2)
+
+/* DBI_DEVICE_CONTROL_DEVICE_STATUS register fields */
+#define PCIE_CAP_UNSUPPORT_REQ_REP_EN		BIT(3)
+#define PCIE_CAP_FATAL_ERR_REPORT_EN		BIT(2)
+#define PCIE_CAP_NON_FATAL_ERR_REPORT_EN	BIT(1)
+#define PCIE_CAP_CORR_ERR_REPORT_EN		BIT(0)
+
+/* PARF_CFG_SAFETY_INT_MASK_CTRL register fields */
+#define CFG_SAFETY_UNCORR_INT_MASK		BIT(0)
+#define CFG_SAFETY_CORR_INT_MASK		BIT(1)
+
+/* DBI_ADV_ERR_CAP_CTRL_OFF register fields */
+#define ECRC_GEN_EN				BIT(6)
+#define ECRC_CHECK_EN				BIT(8)
+
+/* DBI_ROOT_ERR_CMD_OFF register fields */
+#define CORR_ERR_REPORTING_EN			BIT(0)
+#define NON_FATAL_ERR_REPORTING_EN		BIT(1)
+#define FATAL_ERR_REPORTING_EN			BIT(2)
+
+/* DBI_SAFETY_MASK_OFF register fields */
+#define SAFETY_INT_MASK			GENMASK(5, 0)
+
+/* DBI_SAFETY_STATUS register fields */
+#define PCIE_RASDP_UNCORR_ERR			BIT(0)
+#define PCIE_IFACE_TMR_ERR			BIT(1)
+#define PCIE_CDM_CHK_ERR			BIT(2)
+#define PCIE_AER_UNCORR_ERR			BIT(3)
+#define PCIE_AER_CORR_ERR			BIT(4)
+#define PCIE_RASDP_CORR_ERR			BIT(5)
+
 /* PARF_SYS_CTRL register fields */
 #define MAC_PHY_POWERDOWN_IN_P2_D_MUX_EN	BIT(29)
 #define MST_WAKEUP_EN				BIT(13)
@@ -231,6 +299,24 @@  struct qcom_pcie_cfg {
 	const struct qcom_pcie_ops *ops;
 };
 
+enum qcom_pcie_fault_code {
+	RASDP_UNCORR_ERROR,	/* RASDP uncorrectable error */
+	RASDP_CORR_ERROR,	/* RASDP correctable error */
+	CDM_REG_CHK_ERROR,	/* CDM register check error */
+	INTERFACE_TIMER_ERROR,	/* PCIe local bus interface timer error */
+	PCIE_SPURIOUS_INT,	/* Spurious Interrupt received */
+	MAX_PCIE_SAFETY_FAULT	/* Maximum PCIe fault source code supported */
+};
+
+static const char * const pcie_fault_string[] = {
+	"RASDP_Uncorr_Error",
+	"RASDP_Corr_Error",
+	"CDM_Reg_Chk_Error",
+	"Interface_Timer_Error",
+	"PCIe_Spurious_Interrupt",
+	"TOTAL_PCIE_FAULTS",
+};
+
 struct qcom_pcie {
 	struct dw_pcie *pci;
 	void __iomem *parf;			/* DT parf */
@@ -243,6 +329,10 @@  struct qcom_pcie {
 	const struct qcom_pcie_cfg *cfg;
 	struct dentry *debugfs;
 	bool suspended;
+	int global_irq;
+	spinlock_t safety_lock;
+	u32 pcie_fault[MAX_PCIE_SAFETY_FAULT];
+	u32 pcie_fault_total;
 };
 
 #define to_qcom_pcie(x)		dev_get_drvdata((x)->dev)
@@ -959,9 +1049,94 @@  static int qcom_pcie_init_2_7_0(struct qcom_pcie *pcie)
 	return ret;
 }
 
+static void qcom_pcie_enable_error_reporting_2_7_0(struct qcom_pcie *pcie)
+{
+	struct dw_pcie *pci = pcie->pci;
+	u32 val, offset;
+
+	/* Clear all the interrupts before we enable it */
+	writel(0, pci->dbi_base + DBI_SAFETY_STATUS);
+	writel(0, pci->dbi_base + DBI_INTERFACE_TIMER_STATUS);
+
+	/* Enable interrupts which are aggregated using GLOBAL_INT */
+	writel(CFGPCIE_INT_ALL_STATUS_MASK, pcie->parf + PARF_INT_ALL_CLEAR);
+	writel(CFGPCIE_PARF_INT_STATUS_MASK, pcie->parf + PARF_INT_CLEAR);
+	writel(CFGPCIE_INT_ALL_2_STATUS_MASK, pcie->parf + PARF_INT_ALL_2_CLEAR);
+	writel(CFGPCIE_INT_ALL_3_STATUS_MASK, pcie->parf + PARF_INT_ALL_3_CLEAR);
+	writel(CFGPCIE_INT_ALL_4_STATUS_MASK, pcie->parf + PARF_INT_ALL_4_CLEAR);
+	writel(CFGPCIE_INT_ALL_5_STATUS_MASK, pcie->parf + PARF_INT_ALL_5_CLEAR);
+
+	dw_pcie_dbi_ro_wr_en(pci);
+
+	val = readl(pci->dbi_base + PCI_INTERRUPT_LINE);
+	val |= SERR_EN;
+	writel(val, pci->dbi_base + PCI_INTERRUPT_LINE);
+
+	val = readl(pci->dbi_base + DBI_DEVICE_CONTROL_DEVICE_STATUS);
+	val |= (PCIE_CAP_CORR_ERR_REPORT_EN | PCIE_CAP_NON_FATAL_ERR_REPORT_EN |
+			PCIE_CAP_FATAL_ERR_REPORT_EN | PCIE_CAP_UNSUPPORT_REQ_REP_EN);
+	writel(val, pci->dbi_base + DBI_DEVICE_CONTROL_DEVICE_STATUS);
+
+	val = readl(pci->dbi_base + DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG);
+	val |= (PCIE_CAP_SYS_ERR_ON_CORR_ERR_EN | PCIE_CAP_SYS_ERR_ON_NON_FATAL_ERR_EN |
+			PCIE_CAP_SYS_ERR_ON_FATAL_ERR_EN);
+	writel(val, pci->dbi_base + DBI_ROOT_CONTROL_ROOT_CAPABILITIES_REG);
+
+	dw_pcie_dbi_ro_wr_dis(pci);
+
+	/* Enable RAS-DP Interrupts */
+	offset = dw_pcie_find_ext_capability(pci, PCI_EXT_CAP_RASDP_ID);
+	val = readl(pci->dbi_base + offset + PCIE_RASDP_ERROR_MODE_EN_REG);
+	val |= RASDP_ERROR_MODE_EN;
+	writel(val, pci->dbi_base + PCIE_RASDP_ERROR_MODE_EN_REG);
+
+	/* Enable CDM Check */
+	val = readl(pci->dbi_base + PCIE_PL_CHK_REG_CONTROL_STATUS);
+	/* Enable continuous CMD register check mode */
+	val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS;
+	/* Start the CDM register check */
+	val |= PCIE_PL_CHK_REG_CHK_REG_START;
+	/* Enable comparison CDM register check mode */
+	val |= PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR;
+	/* Enable logic CDM register check mode */
+	val |= PCIE_PL_CHK_REG_CHK_REG_LOGIC_ERROR;
+	writel(val, pci->dbi_base + PCIE_PL_CHK_REG_CONTROL_STATUS);
+
+	/* Interface Timer Enable */
+	val = readl(pci->dbi_base + PCIE_INTERFACE_TIMER_CONTROL);
+	val |=  (INTERFACE_TIMER_EN | INTERFACE_TIMER_AER_EN);
+	writel(val, pci->dbi_base + PCIE_INTERFACE_TIMER_CONTROL);
+
+	/* Enable safety correctable and uncorrectable error reporting */
+	val = readl(pcie->parf + PARF_CFG_SAFETY_INT_MASK_CTRL);
+	val |= (CFG_SAFETY_UNCORR_INT_MASK | CFG_SAFETY_CORR_INT_MASK);
+	writel(val, pcie->parf + PARF_CFG_SAFETY_INT_MASK_CTRL);
+
+	/* Enable CRC check and generation */
+	offset = dw_pcie_find_ext_capability(pci, PCIE_AER_EXT_CAP_ID);
+	val = readl(pci->dbi_base + offset + DBI_ADV_ERR_CAP_CTRL_OFF);
+	val |= (ECRC_GEN_EN | ECRC_CHECK_EN);
+	writel(val, pci->dbi_base + offset + DBI_ADV_ERR_CAP_CTRL_OFF);
+
+	/* Enable AER */
+	val = readl(pci->dbi_base + offset + DBI_ROOT_ERR_CMD_OFF);
+	val |= (CORR_ERR_REPORTING_EN | NON_FATAL_ERR_REPORTING_EN
+		| FATAL_ERR_REPORTING_EN);
+	writel(val, pci->dbi_base + offset + DBI_ROOT_ERR_CMD_OFF);
+
+	/* Enable interrupts */
+	val = readl(pci->dbi_base + DBI_SAFETY_MASK_OFF);
+	val &= ~(SAFETY_INT_MASK);
+	writel(val, pci->dbi_base + DBI_SAFETY_MASK_OFF);
+
+	/* Disable Legacy Interrupts */
+	writel(0, pcie->parf + PARF_INT_ALL_MASK);
+}
+
 static int qcom_pcie_post_init_2_7_0(struct qcom_pcie *pcie)
 {
 	qcom_pcie_clear_hpc(pcie->pci);
+	qcom_pcie_enable_error_reporting_2_7_0(pcie);
 
 	return 0;
 }
@@ -1416,6 +1591,130 @@  static void qcom_pcie_icc_update(struct qcom_pcie *pcie)
 	}
 }
 
+static void qcom_pcie_check_spurious_int(struct qcom_pcie *pcie)
+{
+	struct dw_pcie *pci = pcie->pci;
+	u32 *pcie_fault = pcie->pcie_fault;
+	struct device *dev = pci->dev;
+	struct kobject *kobj_ref = &dev->kobj;
+	u32 val;
+
+	val = readl(pci->dbi_base + DBI_INTERFACE_TIMER_STATUS);
+	if (val & CFGPCIE_INTERFACE_TIMER_STATUS_MASK)
+		return;
+
+	val = readl(pcie->parf + PARF_INT_ALL_STATUS);
+	if (val & CFGPCIE_INT_ALL_STATUS_MASK)
+		return;
+
+	val = readl(pcie->parf + PARF_INT_STATUS);
+	if (val & CFGPCIE_PARF_INT_STATUS_MASK)
+		return;
+
+	val = readl(pcie->parf + PARF_INT_ALL_2_STATUS);
+	if (val & CFGPCIE_INT_ALL_2_STATUS_MASK)
+		return;
+
+	val = readl(pcie->parf + PARF_INT_ALL_3_STATUS);
+	if (val & CFGPCIE_INT_ALL_3_STATUS_MASK)
+		return;
+
+	val = readl(pcie->parf + PARF_INT_ALL_4_STATUS);
+	if (val & CFGPCIE_INT_ALL_4_STATUS_MASK)
+		return;
+
+	val = readl(pcie->parf + PARF_INT_ALL_5_STATUS);
+	if (val & CFGPCIE_INT_ALL_5_STATUS_MASK)
+		return;
+
+	dev_err(pci->dev, "PCIe Spurious Interrupt");
+	pcie_fault[PCIE_SPURIOUS_INT]++;
+	pcie->pcie_fault_total++;
+	sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
+}
+
+static irqreturn_t qcom_pcie_global_irq_thread(int irq, void *data)
+{
+	struct qcom_pcie *pcie = data;
+	struct dw_pcie *pci = pcie->pci;
+	u32 *pcie_fault = pcie->pcie_fault;
+	struct device *dev = pci->dev;
+	struct kobject *kobj_ref = &dev->kobj;
+	unsigned long irqsave_flags;
+	u32 val, int_status;
+
+	spin_lock_irqsave(&pcie->safety_lock, irqsave_flags);
+
+	int_status = readl(pci->dbi_base + DBI_SAFETY_STATUS);
+	writel(0, pci->dbi_base + DBI_SAFETY_STATUS);
+
+	if (int_status) {
+		dev_err(pci->dev, "global interrupt fired status: %u", int_status);
+
+		if (int_status & PCIE_RASDP_UNCORR_ERR) {
+			dev_err(pci->dev, "RASDP uncorrectable error triggered");
+			pcie_fault[RASDP_UNCORR_ERROR]++;
+			pcie->pcie_fault_total++;
+			sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
+
+			/*
+			 * rasdp_uncorr_err ends up triggering a
+			 * pcie_uncorr error continuously. So masking
+			 * pcie_uncorr interrupts .
+			 */
+			val = readl(pci->dbi_base + DBI_SAFETY_MASK_OFF);
+			val |= PCIE_AER_UNCORR_ERR;
+			writel(val, pci->dbi_base + DBI_SAFETY_MASK_OFF);
+		}
+
+		if (int_status & PCIE_CDM_CHK_ERR) {
+			dev_err(pci->dev, "CDM error triggered");
+			val = readl(pci->dbi_base + PCIE_PL_CHK_REG_CONTROL_STATUS);
+
+			if (val & PCIE_PL_CHK_REG_CHK_REG_COMPARISON_ERROR) {
+				pcie_fault[CDM_REG_CHK_ERROR]++;
+				pcie->pcie_fault_total++;
+				sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
+
+				/*
+				 * cdm_chk_err injection results in a continuous
+				 * interrupt storm on certain targets, so masking it.
+				 */
+				val = readl(pci->dbi_base + DBI_SAFETY_MASK_OFF);
+				val |= (PCIE_CDM_CHK_ERR | PCIE_AER_UNCORR_ERR);
+				writel(val, pci->dbi_base + DBI_SAFETY_MASK_OFF);
+			}
+		}
+
+		if (int_status & PCIE_IFACE_TMR_ERR) {
+			dev_err(pci->dev, "Iface Timeout error triggered");
+			pcie_fault[INTERFACE_TIMER_ERROR]++;
+			pcie->pcie_fault_total++;
+			sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
+
+			/*
+			 * interface_timer_err injection results in a continuous
+			 * interrupt storm on certain targets, so masking it.
+			 */
+			val = readl(pci->dbi_base + DBI_SAFETY_MASK_OFF);
+			val |= (PCIE_IFACE_TMR_ERR | PCIE_AER_UNCORR_ERR);
+			writel(val, pci->dbi_base + DBI_SAFETY_MASK_OFF);
+		}
+
+		if (int_status & PCIE_RASDP_CORR_ERR) {
+			dev_err(pci->dev, "RASDP correctable error triggered");
+			pcie_fault[RASDP_CORR_ERROR]++;
+			pcie->pcie_fault_total++;
+			sysfs_notify(kobj_ref, NULL, "qcom_pcie_error_report");
+		}
+	} else {
+		qcom_pcie_check_spurious_int(pcie);
+	}
+
+	spin_unlock_irqrestore(&pcie->safety_lock, irqsave_flags);
+	return IRQ_HANDLED;
+}
+
 static int qcom_pcie_link_transition_count(struct seq_file *s, void *data)
 {
 	struct qcom_pcie *pcie = (struct qcom_pcie *)dev_get_drvdata(s->private);
@@ -1438,6 +1737,40 @@  static int qcom_pcie_link_transition_count(struct seq_file *s, void *data)
 	return 0;
 }
 
+static ssize_t qcom_pcie_error_report_show(struct device *dev,
+		struct device_attribute *attr,
+		char *buf)
+{
+	unsigned int i;
+	struct qcom_pcie *pcie = (struct qcom_pcie *)dev_get_drvdata(dev);
+	u32 *pcie_fault = pcie->pcie_fault;
+	size_t len = 0;
+
+	for (i = 0; i < MAX_PCIE_SAFETY_FAULT; i++) {
+		if (pcie_fault_string[i])
+			len += sysfs_emit_at(buf, len, "%s: %lu\n",
+					pcie_fault_string[i],
+					pcie_fault[i]);
+	}
+
+	len += sysfs_emit_at(buf, len, "%s: %lu\n",
+					pcie_fault_string[i],
+					pcie->pcie_fault_total);
+
+	return len;
+}
+static DEVICE_ATTR_RO(qcom_pcie_error_report);
+
+static struct attribute *qcom_pcie_attrs[] = {
+	&dev_attr_qcom_pcie_error_report.attr,
+	NULL,
+};
+
+static const struct attribute_group qcom_pcie_attribute_group = {
+	.attrs = qcom_pcie_attrs,
+	.name = "qcom_pcie"
+};
+
 static void qcom_pcie_init_debugfs(struct qcom_pcie *pcie)
 {
 	struct dw_pcie *pci = pcie->pci;
@@ -1496,6 +1829,21 @@  static int qcom_pcie_probe(struct platform_device *pdev)
 		goto err_pm_runtime_put;
 	}
 
+	pcie->global_irq = platform_get_irq_byname(pdev, "global");
+	if (pcie->global_irq < 0) {
+		ret = pcie->global_irq;
+		goto err_pm_runtime_put;
+	}
+
+	ret = devm_request_threaded_irq(dev, pcie->global_irq, NULL,
+				qcom_pcie_global_irq_thread,
+				IRQF_ONESHOT,
+				"global_irq", pcie);
+	if (ret) {
+		dev_err(dev, "Failed to request Global IRQ\n");
+		goto err_pm_runtime_put;
+	}
+
 	pcie->parf = devm_platform_ioremap_resource_byname(pdev, "parf");
 	if (IS_ERR(pcie->parf)) {
 		ret = PTR_ERR(pcie->parf);
@@ -1551,6 +1899,8 @@  static int qcom_pcie_probe(struct platform_device *pdev)
 	if (pcie->mhi)
 		qcom_pcie_init_debugfs(pcie);
 
+	sysfs_create_group(&pdev->dev.kobj, &qcom_pcie_attribute_group);
+
 	return 0;
 
 err_phy_exit: