Message ID | 20230918122411.237635-4-biju.das.jz@bp.renesas.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:172:b0:3f2:4152:657d with SMTP id h50csp2622854vqi; Mon, 18 Sep 2023 05:36:29 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGPQlPZUct6pthcCyDIQixpsIbAFt+KLvVBqo1YNjjwWVsCFsPVAMVSpMW+K323LDuYWuaC X-Received: by 2002:a05:6808:641:b0:3a7:4987:d44 with SMTP id z1-20020a056808064100b003a749870d44mr9085966oih.20.1695040588870; Mon, 18 Sep 2023 05:36:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1695040588; cv=none; d=google.com; s=arc-20160816; b=sSqd8DKmGYe0jFLioq9nO9SMUGr9139LyyEme4+lmY01eeOdd9Sl07kC1DCQIucied Dhyn6qvYDyV1iSnt3flnKKZ/JIYAwJt4P5e/aC+VfRs789kDp9kTSsfnd5nhpkrAGi5A HJkl84UKaVzbcCkS7hzjZ9WmW/bqW+VM91kWPgnOnHcqx1yHDq1vXj2J5PRfWeNLlRXF cYa33WkvMdMeaUe54ZbimEDcoqTuPJZM8tDK8z0poynm3VB5NaO1kJuLlLVsTon3OQGT ihiWswp+JgHGcTK92eKC551bKlhUbyO9cxBXlR+pU2FvY1ZN2Ur7eV3uWxNosRjA59r1 SBAw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=gxNYRPFBnG7Nax8nI6ldnmakAVjJBO0VuhXl8arQcfw=; fh=sWH11FnHMfCkHSU9kXlYjtdyZVAjuuR7mMUGHavtm3Q=; b=Nw9WHa1lPzkzgvtm82HfPQKRkUYsDfFs+bCg2vD+fOVas7cfWZ2nC+JNLB/nIAW6ZD pMbRnDWMc7N5d0/newKNuCGBR4qwJcsO05x/V8p0P0nqMemF3IoSZGm9a2QVdpBr6lVd GSLHakGiu2gCeg6DQicHT1DyszPNQj2YFbjkxGiMW6CSFXwHKDqcZlIqzqAmNzldLRk3 hQuyv/qLEdOnCwRur+b4YligjOfIHJB6EWF2ehsaerWVM0Kxx8mJB1D6/XE3bwKNejAF VdsD4GEG8jyCf9XsnMxTkKoqNbVHuyKt1N9lcyGXTDwCMFcuiO4NS7jnnEnbIedyL4BD kSHw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=renesas.com Received: from agentk.vger.email (agentk.vger.email. [23.128.96.32]) by mx.google.com with ESMTPS id t17-20020a63d251000000b00573f8a5337esi7922220pgi.461.2023.09.18.05.36.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 18 Sep 2023 05:36:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) client-ip=23.128.96.32; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=renesas.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id AB18A81BC499; Mon, 18 Sep 2023 05:25:37 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241965AbjIRMYv (ORCPT <rfc822;kernel.ruili@gmail.com> + 27 others); Mon, 18 Sep 2023 08:24:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59458 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S242017AbjIRMYi (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 18 Sep 2023 08:24:38 -0400 Received: from relmlie6.idc.renesas.com (relmlor2.renesas.com [210.160.252.172]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 1DB869F; Mon, 18 Sep 2023 05:24:28 -0700 (PDT) X-IronPort-AV: E=Sophos;i="6.02,156,1688396400"; d="scan'208";a="180128606" Received: from unknown (HELO relmlir6.idc.renesas.com) ([10.200.68.152]) by relmlie6.idc.renesas.com with ESMTP; 18 Sep 2023 21:24:28 +0900 Received: from localhost.localdomain (unknown [10.226.92.107]) by relmlir6.idc.renesas.com (Postfix) with ESMTP id 9B45C41F6AEF; Mon, 18 Sep 2023 21:24:25 +0900 (JST) From: Biju Das <biju.das.jz@bp.renesas.com> To: Thomas Gleixner <tglx@linutronix.de>, Marc Zyngier <maz@kernel.org> Cc: Biju Das <biju.das.jz@bp.renesas.com>, Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>, Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>, Geert Uytterhoeven <geert+renesas@glider.be>, Biju Das <biju.das.au@gmail.com>, linux-kernel@vger.kernel.org, linux-renesas-soc@vger.kernel.org Subject: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge trigger detection for TINT Date: Mon, 18 Sep 2023 13:24:11 +0100 Message-Id: <20230918122411.237635-4-biju.das.jz@bp.renesas.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20230918122411.237635-1-biju.das.jz@bp.renesas.com> References: <20230918122411.237635-1-biju.das.jz@bp.renesas.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.8 required=5.0 tests=HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Mon, 18 Sep 2023 05:25:37 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1777378880942260617 X-GMAIL-MSGID: 1777378880942260617 |
Series |
Fix IRQ storm with GPIO interrupts
|
|
Commit Message
Biju Das
Sept. 18, 2023, 12:24 p.m. UTC
In case of edge trigger detection, enabling the TINT source causes a phantum interrupt that leads to irq storm. So clear the phantum interrupt in rzg2l_irqc_irq_enable(). This issue is observed when the irq handler disables the interrupts using disable_irq_nosync() and scheduling a work queue and in the work queue, re-enabling the interrupt with enable_irq(). Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver") Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> --- drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++ 1 file changed, 6 insertions(+)
Comments
On Mon, 18 Sep 2023 13:24:11 +0100, Biju Das <biju.das.jz@bp.renesas.com> wrote: > > In case of edge trigger detection, enabling the TINT source causes a > phantum interrupt that leads to irq storm. So clear the phantum interrupt > in rzg2l_irqc_irq_enable(). > > This issue is observed when the irq handler disables the interrupts using > disable_irq_nosync() and scheduling a work queue and in the work queue, > re-enabling the interrupt with enable_irq(). > > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller driver") > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> > Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > --- > drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++ > 1 file changed, 6 insertions(+) > > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c > index 33a22bafedcd..78a9e90512a6 100644 > --- a/drivers/irqchip/irq-renesas-rzg2l.c > +++ b/drivers/irqchip/irq-renesas-rzg2l.c > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data *d) > reg = readl_relaxed(priv->base + TSSR(tssr_index)); > reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset); > writel_relaxed(reg, priv->base + TSSR(tssr_index)); > + /* > + * In case of edge trigger detection, enabling the TINT source > + * cause a phantum interrupt that leads to irq storm. So clear > + * the phantum interrupt. > + */ > + rzg2l_tint_eoi(d); This looks incredibly unsafe. disable_irq()+enable_irq() with an interrupt being made pending in the middle, and you've lost that interrupt. What prevents this scenario? M.
Hi Marc Zyngier, Thanks for the feedback. > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge > trigger detection for TINT > > On Mon, 18 Sep 2023 13:24:11 +0100, > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > > > In case of edge trigger detection, enabling the TINT source causes a > > phantum interrupt that leads to irq storm. So clear the phantum > > interrupt in rzg2l_irqc_irq_enable(). > > > > This issue is observed when the irq handler disables the interrupts > > using > > disable_irq_nosync() and scheduling a work queue and in the work > > queue, re-enabling the interrupt with enable_irq(). > > > > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller > > driver") > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> > > Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > > --- > > drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c > > b/drivers/irqchip/irq-renesas-rzg2l.c > > index 33a22bafedcd..78a9e90512a6 100644 > > --- a/drivers/irqchip/irq-renesas-rzg2l.c > > +++ b/drivers/irqchip/irq-renesas-rzg2l.c > > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data > *d) > > reg = readl_relaxed(priv->base + TSSR(tssr_index)); > > reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset); > > writel_relaxed(reg, priv->base + TSSR(tssr_index)); > > + /* > > + * In case of edge trigger detection, enabling the TINT source > > + * cause a phantum interrupt that leads to irq storm. So clear > > + * the phantum interrupt. > > + */ > > + rzg2l_tint_eoi(d); > > This looks incredibly unsafe. disable_irq()+enable_irq() with an interrupt > being made pending in the middle, and you've lost that interrupt. In this driver that will never happen as it clears the TINT source during disable(), so there won't be any TINT source for interrupt detection after disable(). Cheers, Biju > What prevents this scenario?
On Tue, 19 Sep 2023 16:24:53 +0100, Biju Das <biju.das.jz@bp.renesas.com> wrote: > > Hi Marc Zyngier, > > Thanks for the feedback. > > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge > > trigger detection for TINT > > > > On Mon, 18 Sep 2023 13:24:11 +0100, > > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > > > > > In case of edge trigger detection, enabling the TINT source causes a > > > phantum interrupt that leads to irq storm. So clear the phantum > > > interrupt in rzg2l_irqc_irq_enable(). > > > > > > This issue is observed when the irq handler disables the interrupts > > > using > > > disable_irq_nosync() and scheduling a work queue and in the work > > > queue, re-enabling the interrupt with enable_irq(). > > > > > > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt Controller > > > driver") > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> > > > Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > > > --- > > > drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++ > > > 1 file changed, 6 insertions(+) > > > > > > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c > > > b/drivers/irqchip/irq-renesas-rzg2l.c > > > index 33a22bafedcd..78a9e90512a6 100644 > > > --- a/drivers/irqchip/irq-renesas-rzg2l.c > > > +++ b/drivers/irqchip/irq-renesas-rzg2l.c > > > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data > > *d) > > > reg = readl_relaxed(priv->base + TSSR(tssr_index)); > > > reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset); > > > writel_relaxed(reg, priv->base + TSSR(tssr_index)); > > > + /* > > > + * In case of edge trigger detection, enabling the TINT source > > > + * cause a phantum interrupt that leads to irq storm. So clear > > > + * the phantum interrupt. > > > + */ > > > + rzg2l_tint_eoi(d); > > > > This looks incredibly unsafe. disable_irq()+enable_irq() with an interrupt > > being made pending in the middle, and you've lost that interrupt. > > In this driver that will never happen as it clears the TINT source > during disable(), so there won't be any TINT source for interrupt > detection after disable(). So you mean that you *already* lose interrupts across a disable followed by an enable? I'm slightly puzzled... M.
Hi Marc Zyngier, > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge > trigger detection for TINT > > On Tue, 19 Sep 2023 16:24:53 +0100, > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > > > Hi Marc Zyngier, > > > > Thanks for the feedback. > > > > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with > > > edge trigger detection for TINT > > > > > > On Mon, 18 Sep 2023 13:24:11 +0100, > > > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > > > > > > > In case of edge trigger detection, enabling the TINT source causes > > > > a phantum interrupt that leads to irq storm. So clear the phantum > > > > interrupt in rzg2l_irqc_irq_enable(). > > > > > > > > This issue is observed when the irq handler disables the > > > > interrupts using > > > > disable_irq_nosync() and scheduling a work queue and in the work > > > > queue, re-enabling the interrupt with enable_irq(). > > > > > > > > Fixes: 3fed09559cd8 ("irqchip: Add RZ/G2L IA55 Interrupt > > > > Controller > > > > driver") > > > > Signed-off-by: Biju Das <biju.das.jz@bp.renesas.com> > > > > Tested-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com> > > > > --- > > > > drivers/irqchip/irq-renesas-rzg2l.c | 6 ++++++ > > > > 1 file changed, 6 insertions(+) > > > > > > > > diff --git a/drivers/irqchip/irq-renesas-rzg2l.c > > > > b/drivers/irqchip/irq-renesas-rzg2l.c > > > > index 33a22bafedcd..78a9e90512a6 100644 > > > > --- a/drivers/irqchip/irq-renesas-rzg2l.c > > > > +++ b/drivers/irqchip/irq-renesas-rzg2l.c > > > > @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct > > > > irq_data > > > *d) > > > > reg = readl_relaxed(priv->base + TSSR(tssr_index)); > > > > reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset); > > > > writel_relaxed(reg, priv->base + TSSR(tssr_index)); > > > > + /* > > > > + * In case of edge trigger detection, enabling the TINT > source > > > > + * cause a phantum interrupt that leads to irq storm. So > clear > > > > + * the phantum interrupt. > > > > + */ > > > > + rzg2l_tint_eoi(d); > > > > > > This looks incredibly unsafe. disable_irq()+enable_irq() with an > > > interrupt being made pending in the middle, and you've lost that > interrupt. > > > > In this driver that will never happen as it clears the TINT source > > during disable(), so there won't be any TINT source for interrupt > > detection after disable(). > > So you mean that you *already* lose interrupts across a disable followed by > an enable? I'm slightly puzzled... There is no interrupt lost at all. Currently this patch addresses 2 issues. Scenario 1: Extra interrupt when we select TINT source on enable_irq() Getting an extra interrupt, when client drivers calls enable_irq() during probe()/resume(). In this case, the irq handler on the Client driver just clear the interrupt status bit. Issue 2: IRQ storm when we select TINT source on enable_irq() Here as well, we are getting an extra interrupt, when client drivers calls enable_irq() during probe() and this Interrupts getting generated infinitely, when the client driver calls disable_irq() in irq handler and in in work queue calling enable_irq(). Currently we are not loosing interrupts, but we are getting additional Interrupt(phantom) which is causing the issue. Cheers, Biju
On Tue, 19 Sep 2023 17:32:05 +0100, Biju Das <biju.das.jz@bp.renesas.com> wrote: [...] > > So you mean that you *already* lose interrupts across a disable followed by > > an enable? I'm slightly puzzled... > > There is no interrupt lost at all. > > Currently this patch addresses 2 issues. > > Scenario 1: Extra interrupt when we select TINT source on enable_irq() > > Getting an extra interrupt, when client drivers calls enable_irq() > during probe()/resume(). In this case, the irq handler on the Client > driver just clear the interrupt status bit. > > Issue 2: IRQ storm when we select TINT source on enable_irq() > > Here as well, we are getting an extra interrupt, when client drivers > calls enable_irq() during probe() and this Interrupts getting > generated infinitely, when the client driver calls disable_irq() in > irq handler and in in work queue calling enable_irq(). How do you know this is a spurious interrupt? For all you can tell, you are just consuming an edge. I absolutely don't buy this workaround, because you have no context that allows you to discriminate between a real spurious interrupt and a normal interrupt that lands while the interrupt line was masked. > Currently we are not loosing interrupts, but we are getting additional > Interrupt(phantom) which is causing the issue. If you get an interrupt at probe time in the endpoint driver, that's probably because the device is not in a quiescent state when the interrupt is requested. And it is probably this that needs addressing. M.
Hi Marc Zyngier, > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge > trigger detection for TINT > > On Tue, 19 Sep 2023 17:32:05 +0100, > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > [...] > > > > So you mean that you *already* lose interrupts across a disable > > > followed by an enable? I'm slightly puzzled... > > > > There is no interrupt lost at all. > > > > Currently this patch addresses 2 issues. > > > > Scenario 1: Extra interrupt when we select TINT source on enable_irq() > > > > Getting an extra interrupt, when client drivers calls enable_irq() > > during probe()/resume(). In this case, the irq handler on the Client > > driver just clear the interrupt status bit. > > > > Issue 2: IRQ storm when we select TINT source on enable_irq() > > > > Here as well, we are getting an extra interrupt, when client drivers > > calls enable_irq() during probe() and this Interrupts getting > > generated infinitely, when the client driver calls disable_irq() in > > irq handler and in in work queue calling enable_irq(). > > How do you know this is a spurious interrupt? We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin and other end to ground. During the boot, I get an interrupt even though there is no high to low transition, when the IRQ is setup in the probe(). From this it is a spurious interrupt. > For all you can tell, you are > just consuming an edge. I absolutely don't buy this workaround, because you > have no context that allows you to discriminate between a real spurious > interrupt and a normal interrupt that lands while the interrupt line was > masked. > > > Currently we are not loosing interrupts, but we are getting additional > > Interrupt(phantom) which is causing the issue. > > If you get an interrupt at probe time in the endpoint driver, that's > probably because the device is not in a quiescent state when the interrupt > is requested. And it is probably this that needs addressing. Any pointer for addressing this issue? Thanks for your help. Cheers, Biju
On Tue, 19 Sep 2023 18:06:54 +0100, Biju Das <biju.das.jz@bp.renesas.com> wrote: > > Hi Marc Zyngier, > > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge > > trigger detection for TINT > > > > On Tue, 19 Sep 2023 17:32:05 +0100, > > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > > > [...] > > > > > > So you mean that you *already* lose interrupts across a disable > > > > followed by an enable? I'm slightly puzzled... > > > > > > There is no interrupt lost at all. > > > > > > Currently this patch addresses 2 issues. > > > > > > Scenario 1: Extra interrupt when we select TINT source on enable_irq() > > > > > > Getting an extra interrupt, when client drivers calls enable_irq() > > > during probe()/resume(). In this case, the irq handler on the Client > > > driver just clear the interrupt status bit. > > > > > > Issue 2: IRQ storm when we select TINT source on enable_irq() > > > > > > Here as well, we are getting an extra interrupt, when client drivers > > > calls enable_irq() during probe() and this Interrupts getting > > > generated infinitely, when the client driver calls disable_irq() in > > > irq handler and in in work queue calling enable_irq(). > > > > How do you know this is a spurious interrupt? > > We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin > and other end to ground. During the boot, I get an interrupt > even though there is no high to low transition, when the IRQ is setup > in the probe(). From this it is a spurious interrupt. That doesn't really handle my question. At the point of enabling the interrupt and consuming the edge (which is what this patch does), how do you know you can readily discard this signal? This is a genuine question. Spurious interrupts at boot are common. The HW resets in a funky, unspecified state, and it's SW's job to initialise it before letting other agents in the system use interrupts. > > > For all you can tell, you are > > just consuming an edge. I absolutely don't buy this workaround, because you > > have no context that allows you to discriminate between a real spurious > > interrupt and a normal interrupt that lands while the interrupt line was > > masked. > > > > > Currently we are not loosing interrupts, but we are getting additional > > > Interrupt(phantom) which is causing the issue. > > > > If you get an interrupt at probe time in the endpoint driver, that's > > probably because the device is not in a quiescent state when the interrupt > > is requested. And it is probably this that needs addressing. > > Any pointer for addressing this issue? Nothing but the most basic stuff: you should make sure that the interrupt isn't enabled before you can actually handle it, and triage it as spurious. M.
Hi Marc Zyngier, Thanks for the feedback. > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge > trigger detection for TINT > > On Tue, 19 Sep 2023 18:06:54 +0100, > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > > > Hi Marc Zyngier, > > > > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with > > > edge trigger detection for TINT > > > > > > On Tue, 19 Sep 2023 17:32:05 +0100, > > > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > > > > > [...] > > > > > > > > So you mean that you *already* lose interrupts across a disable > > > > > followed by an enable? I'm slightly puzzled... > > > > > > > > There is no interrupt lost at all. > > > > > > > > Currently this patch addresses 2 issues. > > > > > > > > Scenario 1: Extra interrupt when we select TINT source on > > > > enable_irq() > > > > > > > > Getting an extra interrupt, when client drivers calls enable_irq() > > > > during probe()/resume(). In this case, the irq handler on the > > > > Client driver just clear the interrupt status bit. > > > > > > > > Issue 2: IRQ storm when we select TINT source on enable_irq() > > > > > > > > Here as well, we are getting an extra interrupt, when client > > > > drivers calls enable_irq() during probe() and this Interrupts > > > > getting generated infinitely, when the client driver calls > > > > disable_irq() in irq handler and in in work queue calling > enable_irq(). > > > > > > How do you know this is a spurious interrupt? > > > > We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin and > > other end to ground. During the boot, I get an interrupt even though > > there is no high to low transition, when the IRQ is setup in the > > probe(). From this it is a spurious interrupt. > > That doesn't really handle my question. At the point of enabling the > interrupt and consuming the edge (which is what this patch does), how do > you know you can readily discard this signal? This is a genuine question. > > Spurious interrupts at boot are common. The HW resets in a funky, > unspecified state, and it's SW's job to initialise it before letting other > agents in the system use interrupts. I got your point related to loosing interrupts. Now I can detect spurious interrupts for edge trigger. Pin controller driver has a read-only register to monitor input values of GPIO input pins, use that register values before/after rzg2l_irq_enable() with TINT Status Control Register (TSCR) in IRQ controller to detect the spurious interrupt. Eg: 1) Check PIN_43_0 value (ex: low)in pinctrl driver 2) Enable the IRQ using rzg2l_irq_enable()/ irq_chip_enable_parent()in pinctrl driver 3) Check PIN_43_0 value (ex: low) in pinctrl driver 4) Check the TINT Status Control Register(TSCR) in IRQ controller driver If the values in 1 and 3 are same and the status in 4 is set, then there is a spurious interrupt. > > > > > > For all you can tell, you are > > > just consuming an edge. I absolutely don't buy this workaround, > > > because you have no context that allows you to discriminate between > > > a real spurious interrupt and a normal interrupt that lands while > > > the interrupt line was masked. > > > > > > > Currently we are not loosing interrupts, but we are getting > > > > additional > > > > Interrupt(phantom) which is causing the issue. > > > > > > If you get an interrupt at probe time in the endpoint driver, that's > > > probably because the device is not in a quiescent state when the > > > interrupt is requested. And it is probably this that needs addressing. > > > > Any pointer for addressing this issue? > > Nothing but the most basic stuff: you should make sure that the interrupt > isn't enabled before you can actually handle it, and triage it as spurious. For the GPIO interrupt case I have, RTC driver(endpoint)--> Pin controller driver -->IRQ controller driver-->GIC controller. 1) I have configured the pin as GPIO interrupts in pin controller driver 2) Set the IRQ detection in IRQ controller for edge trigger 3) The moment I set the IRQ source in IRQ controller I get an interrupt, even though there is no voltage transition. Here the system is setup properly, but there is a spurious interrupt. Currently don't know how to handle it? Any pointers for handling this issue? Note: Currently the pin controller driver is not configuring GPIO as GPIO input in Port Mode Register for the GPIO interrupts instead it is using reset value which is "Hi-Z". I will send a patch to fix it. Cheers, Biju
Hi Marc, > Subject: RE: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with edge > trigger detection for TINT > > Hi Marc Zyngier, > > Thanks for the feedback. > > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm with > > edge trigger detection for TINT > > > > On Tue, 19 Sep 2023 18:06:54 +0100, > > Biju Das <biju.das.jz@bp.renesas.com> wrote: > > > > > > Hi Marc Zyngier, > > > > > > > Subject: Re: [PATCH 3/3] irqchip: renesas-rzg2l: Fix irq storm > > > > with edge trigger detection for TINT > > > > > > > > On Tue, 19 Sep 2023 17:32:05 +0100, Biju Das > > > > <biju.das.jz@bp.renesas.com> wrote: > > > > > > > > [...] > > > > > > > > > > So you mean that you *already* lose interrupts across a > > > > > > disable followed by an enable? I'm slightly puzzled... > > > > > > > > > > There is no interrupt lost at all. > > > > > > > > > > Currently this patch addresses 2 issues. > > > > > > > > > > Scenario 1: Extra interrupt when we select TINT source on > > > > > enable_irq() > > > > > > > > > > Getting an extra interrupt, when client drivers calls > > > > > enable_irq() during probe()/resume(). In this case, the irq > > > > > handler on the Client driver just clear the interrupt status bit. > > > > > > > > > > Issue 2: IRQ storm when we select TINT source on enable_irq() > > > > > > > > > > Here as well, we are getting an extra interrupt, when client > > > > > drivers calls enable_irq() during probe() and this Interrupts > > > > > getting generated infinitely, when the client driver calls > > > > > disable_irq() in irq handler and in in work queue calling > > enable_irq(). > > > > > > > > How do you know this is a spurious interrupt? > > > > > > We have PMOD on RZ/G2L SMARC EVK. So I connected it to GPIO pin and > > > other end to ground. During the boot, I get an interrupt even though > > > there is no high to low transition, when the IRQ is setup in the > > > probe(). From this it is a spurious interrupt. > > > > That doesn't really handle my question. At the point of enabling the > > interrupt and consuming the edge (which is what this patch does), how > > do you know you can readily discard this signal? This is a genuine > question. > > > > Spurious interrupts at boot are common. The HW resets in a funky, > > unspecified state, and it's SW's job to initialise it before letting > > other agents in the system use interrupts. > > I got your point related to loosing interrupts. > > Now I can detect spurious interrupts for edge trigger. > > Pin controller driver has a read-only register to monitor input values of > GPIO input pins, use that register values before/after rzg2l_irq_enable() > with TINT Status Control Register (TSCR) in IRQ controller to detect the > spurious interrupt. > > Eg: > 1) Check PIN_43_0 value (ex: low)in pinctrl driver > 2) Enable the IRQ using rzg2l_irq_enable()/ irq_chip_enable_parent()in > pinctrl driver > 3) Check PIN_43_0 value (ex: low) in pinctrl driver > 4) Check the TINT Status Control Register(TSCR) in IRQ controller driver > > If the values in 1 and 3 are same and the status in 4 is set, then > there is a spurious interrupt. > > > > > > > > > > For all you can tell, you are > > > > just consuming an edge. I absolutely don't buy this workaround, > > > > because you have no context that allows you to discriminate > > > > between a real spurious interrupt and a normal interrupt that > > > > lands while the interrupt line was masked. > > > > > > > > > Currently we are not loosing interrupts, but we are getting > > > > > additional > > > > > Interrupt(phantom) which is causing the issue. > > > > > > > > If you get an interrupt at probe time in the endpoint driver, > > > > that's probably because the device is not in a quiescent state > > > > when the interrupt is requested. And it is probably this that needs > addressing. > > > > > > Any pointer for addressing this issue? > > > > Nothing but the most basic stuff: you should make sure that the > > interrupt isn't enabled before you can actually handle it, and triage it > as spurious. > > For the GPIO interrupt case I have, > > RTC driver(endpoint)--> Pin controller driver -->IRQ controller driver-- > >GIC controller. > > 1) I have configured the pin as GPIO interrupts in pin controller driver > 2) Set the IRQ detection in IRQ controller for edge trigger > 3) The moment I set the IRQ source in IRQ controller > I get an interrupt, even though there is no voltage transition. > > Here the system is setup properly, but there is a spurious interrupt. > Currently don't know how to handle it? > > Any pointers for handling this issue? > > Note: > Currently the pin controller driver is not configuring GPIO as GPIO input > in Port Mode Register for the GPIO interrupts instead it is using reset > value which is "Hi-Z". I will send a patch to fix it. An update, I have found a way to fix the spurious interrupt issue. Spurious interrupt is generated if we do simultaneous writing of TINT Source selection and TINT Source enable in TSSRx register. If we write the register in correct order, then there is no issue. i.e., first set the TINT Source selection and after that enable it. Looks like it is a HW race condition. I am checking this issue with HW team. Cheers, Biju
diff --git a/drivers/irqchip/irq-renesas-rzg2l.c b/drivers/irqchip/irq-renesas-rzg2l.c index 33a22bafedcd..78a9e90512a6 100644 --- a/drivers/irqchip/irq-renesas-rzg2l.c +++ b/drivers/irqchip/irq-renesas-rzg2l.c @@ -144,6 +144,12 @@ static void rzg2l_irqc_irq_enable(struct irq_data *d) reg = readl_relaxed(priv->base + TSSR(tssr_index)); reg |= (TIEN | tint) << TSSEL_SHIFT(tssr_offset); writel_relaxed(reg, priv->base + TSSR(tssr_index)); + /* + * In case of edge trigger detection, enabling the TINT source + * cause a phantum interrupt that leads to irq storm. So clear + * the phantum interrupt. + */ + rzg2l_tint_eoi(d); raw_spin_unlock(&priv->lock); irq_chip_unmask_parent(d); }