[v2,2/7] mmc: sdhci-of-arasan: Fix SDHCI_RESET_ALL for CQHCI

Message ID 20221019145246.v2.2.I29f6a2189e84e35ad89c1833793dca9e36c64297@changeid
State New
Series mmc: sdhci controllers: Fix SDHCI_RESET_ALL for CQHCI

Commit Message

Brian Norris Oct. 19, 2022, 9:54 p.m. UTC
SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't
tracking that properly in software. When out of sync, we may trigger
various timeouts.

It's not typical to perform resets while CQE is enabled, but one
particular case I hit commonly enough: mmc_suspend() -> mmc_power_off().
Typically we will eventually deactivate CQE (cqhci_suspend() ->
cqhci_deactivate()), but that's not guaranteed -- in particular, if
we perform a partial (e.g., interrupted) system suspend.

The same bug was already found and fixed for two other drivers, in v5.7
and v5.9:

5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers

The latter is especially prescient, saying "other drivers using CQHCI
might benefit from a similar change, if they also have CQHCI reset by
SDHCI_RESET_ALL."

So like these other patches, deactivate CQHCI when resetting the
controller.

Fixes: 84362d79f436 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
Cc: <stable@vger.kernel.org>
Signed-off-by: Brian Norris <briannorris@chromium.org>
---

Changes in v2:
 - Rely on cqhci_deactivate() to safely handle (ignore)
   not-yet-initialized CQE support

 drivers/mmc/host/sdhci-of-arasan.c | 3 +++
 1 file changed, 3 insertions(+)
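
The failure mode described in the commit message can be sketched as a small userspace model. This is not kernel code: struct toy_host and every toy_* function are hypothetical names made up for the illustration. The model assumes only what the message states, namely that a full reset clears the hardware CQE state while the software flag stays set unless the driver deactivates CQE first.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only; none of these names exist in the kernel. */
struct toy_host {
	bool hw_cqe_enabled;	/* state held by the controller */
	bool sw_cqe_activated;	/* what the driver believes */
};

/* Software and hardware agree when both flags match. */
static bool toy_in_sync(const struct toy_host *h)
{
	return h->hw_cqe_enabled == h->sw_cqe_activated;
}

static void toy_cqe_activate(struct toy_host *h)
{
	h->hw_cqe_enabled = true;
	h->sw_cqe_activated = true;
}

/* Stand-in for cqhci_deactivate(), modelled as a no-op when CQE is not active. */
static void toy_cqe_deactivate(struct toy_host *h)
{
	if (!h->sw_cqe_activated)
		return;
	h->hw_cqe_enabled = false;
	h->sw_cqe_activated = false;
}

/* Unpatched reset: the hardware forgets CQE, the software is not told. */
static void toy_reset_all_unpatched(struct toy_host *h)
{
	h->hw_cqe_enabled = false;
}

/* Patched reset: bring the software state down before resetting. */
static void toy_reset_all_patched(struct toy_host *h)
{
	toy_cqe_deactivate(h);
	h->hw_cqe_enabled = false;
}

/* Walk both paths, assert the expected outcome of each, then return 0. */
int toy_demo(void)
{
	struct toy_host a = { false, false };
	struct toy_host b = { false, false };

	toy_cqe_activate(&a);
	toy_reset_all_unpatched(&a);
	assert(!toy_in_sync(&a));	/* driver still thinks CQE is on */

	toy_cqe_activate(&b);
	toy_reset_all_patched(&b);
	assert(toy_in_sync(&b));	/* both sides agree CQE is off */

	return 0;
}
```

In the unpatched path the driver would go on issuing requests to a queue engine that the hardware no longer has enabled, which is one plausible route to the timeouts mentioned above.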
  

Comments

Florian Fainelli Oct. 19, 2022, 9:59 p.m. UTC | #1
On 10/19/22 14:54, Brian Norris wrote:
> SDHCI_RESET_ALL resets will reset the hardware CQE state, but we aren't
> tracking that properly in software. When out of sync, we may trigger
> various timeouts.
> 
> It's not typical to perform resets while CQE is enabled, but one
> particular case I hit commonly enough: mmc_suspend() -> mmc_power_off().
> Typically we will eventually deactivate CQE (cqhci_suspend() ->
> cqhci_deactivate()), but that's not guaranteed -- in particular, if
> we perform a partial (e.g., interrupted) system suspend.
> 
> The same bug was already found and fixed for two other drivers, in v5.7
> and v5.9:
> 
> 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
> df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
> 
> The latter is especially prescient, saying "other drivers using CQHCI
> might benefit from a similar change, if they also have CQHCI reset by
> SDHCI_RESET_ALL."
> 
> So like these other patches, deactivate CQHCI when resetting the
> controller.
> 
> Fixes: 84362d79f436 ("mmc: sdhci-of-arasan: Add CQHCI support for arasan,sdhci-5.1")
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Brian Norris <briannorris@chromium.org>
> ---
> 
> Changes in v2:
>   - Rely on cqhci_deactivate() to safely handle (ignore)
>     not-yet-initialized CQE support
> 
>   drivers/mmc/host/sdhci-of-arasan.c | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/drivers/mmc/host/sdhci-of-arasan.c b/drivers/mmc/host/sdhci-of-arasan.c
> index 3997cad1f793..b30f0d6baf5b 100644
> --- a/drivers/mmc/host/sdhci-of-arasan.c
> +++ b/drivers/mmc/host/sdhci-of-arasan.c
> @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
>   	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>   	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
>   
> +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
> +		cqhci_deactivate(host->mmc);
> +
>   	sdhci_reset(host, mask);

Cannot this be absorbed by sdhci_reset() that all of these drivers 
appear to be utilizing since you have access to the host and the mask to 
make that decision?
  
Brian Norris Oct. 19, 2022, 10:19 p.m. UTC | #2
On Wed, Oct 19, 2022 at 02:59:39PM -0700, Florian Fainelli wrote:
> On 10/19/22 14:54, Brian Norris wrote:
> > The same bug was already found and fixed for two other drivers, in v5.7
> > and v5.9:
> > 
> > 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
> > df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
> > 
> > The latter is especially prescient, saying "other drivers using CQHCI
> > might benefit from a similar change, if they also have CQHCI reset by
> > SDHCI_RESET_ALL."

> > --- a/drivers/mmc/host/sdhci-of-arasan.c
> > +++ b/drivers/mmc/host/sdhci-of-arasan.c
> > @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
> >   	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
> >   	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
> > +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
> > +		cqhci_deactivate(host->mmc);
> > +
> >   	sdhci_reset(host, mask);
> 
> Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
> be utilizing since you have access to the host and the mask to make that
> decision?

It potentially could.

I don't know if this is a specified SDHCI behavior that really belongs
in the common helper, or if this is just a commonly-shared behavior. Per
the comments I quote above ("if they also have CQHCI reset by
SDHCI_RESET_ALL"), I chose to leave that as an implementation-specific
behavior.

I suppose it's not all that harmful to do this even if some SDHCI
controller doesn't have the same behavior/quirk.

I guess I also don't know if any SDHCI controllers will support command
queueing (MMC_CAP2_CQE) via something *besides* CQHCI. I see
CQE support in sdhci-sprd.c without CQHCI, although that driver doesn't
set MMC_CAP2_CQE.

Brian
  
Adrian Hunter Oct. 20, 2022, 6:29 a.m. UTC | #3
On 20/10/22 01:19, Brian Norris wrote:
> On Wed, Oct 19, 2022 at 02:59:39PM -0700, Florian Fainelli wrote:
>> On 10/19/22 14:54, Brian Norris wrote:
>>> The same bug was already found and fixed for two other drivers, in v5.7
>>> and v5.9:
>>>
>>> 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
>>> df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
>>>
>>> The latter is especially prescient, saying "other drivers using CQHCI
>>> might benefit from a similar change, if they also have CQHCI reset by
>>> SDHCI_RESET_ALL."
> 
>>> --- a/drivers/mmc/host/sdhci-of-arasan.c
>>> +++ b/drivers/mmc/host/sdhci-of-arasan.c
>>> @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
>>>   	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>>>   	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
>>> +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
>>> +		cqhci_deactivate(host->mmc);
>>> +
>>>   	sdhci_reset(host, mask);
>>
>> Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
>> be utilizing since you have access to the host and the mask to make that
>> decision?
> 
> It potentially could.
> 
> I don't know if this is a specified SDHCI behavior that really belongs
> in the common helper, or if this is just a commonly-shared behavior. Per
> the comments I quote above ("if they also have CQHCI reset by
> SDHCI_RESET_ALL"), I chose to leave that as an implementation-specific
> behavior.
> 
> I suppose it's not all that harmful to do this even if some SDHCI
> controller doesn't have the same behavior/quirk.
> 
> I guess I also don't know if any SDHCI controllers will support command
> queueing (MMC_CAP2_CQE) via something *besides* CQHCI. I see
> CQE support in sdhci-sprd.c without CQHCI, although that driver doesn't
> set MMC_CAP2_CQE.

SDHCI and CQHCI are separate modules and are not dependent, so they cannot
call into each other directly (and should not).  A new CQE API would be
needed in mmc_cqe_ops e.g. (*cqe_notify_reset)(struct mmc_host *host),
and wrapped in mmc/host.h:

static inline void mmc_cqe_notify_reset(struct mmc_host *host)
{
	if (host->cqe_ops->cqe_notify_reset)
		host->cqe_ops->cqe_notify_reset(host);
}
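
A userspace sketch of how such an optional callback might be dispatched, for illustration only. The toy_* names are hypothetical stand-ins for struct mmc_host and struct mmc_cqe_ops, and the extra NULL check on the ops table itself is a precaution of this sketch, not something the wrapper above includes.

```c
#include <stdbool.h>
#include <stddef.h>

struct toy_mmc_host;

/* Stand-in for mmc_cqe_ops with the proposed optional hook. */
struct toy_cqe_ops {
	void (*cqe_notify_reset)(struct toy_mmc_host *host);
};

struct toy_mmc_host {
	const struct toy_cqe_ops *cqe_ops;	/* NULL when no CQE is attached */
	bool cqe_active;
};

/* Core-side wrapper: call the hook only if the CQE driver provides one. */
static inline void toy_cqe_notify_reset(struct toy_mmc_host *host)
{
	if (host->cqe_ops && host->cqe_ops->cqe_notify_reset)
		host->cqe_ops->cqe_notify_reset(host);
}

/* What a CQE driver might do in the hook: drop its "active" state. */
static void toy_cqhci_notify_reset(struct toy_mmc_host *host)
{
	host->cqe_active = false;
}

static const struct toy_cqe_ops toy_cqhci_ops = {
	.cqe_notify_reset = toy_cqhci_notify_reset,
};

/* Host-controller side: notify through the core, never call CQHCI directly. */
void toy_sdhci_reset_all(struct toy_mmc_host *host)
{
	toy_cqe_notify_reset(host);
	/* ... the reset register write would follow here ... */
}
```

The indirection is what keeps the two modules independent: the host controller code only knows about the ops table, and a host without a CQE simply has nothing to call.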

Alternatively, you could make a new module for SDHCI/CQHCI helper functions,
although in this case there is so little code it could be static inline and
added in a new include file instead, say sdhci-cqhci.h e.g.

#include "cqhci.h"
#include "sdhci.h"

static inline void sdhci_cqhci_reset(struct sdhci_host *host, u8 mask)
{
	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL) &&
	    host->mmc->cqe_private)
		cqhci_deactivate(host->mmc);
	sdhci_reset(host, mask);
}
  
Florian Fainelli Oct. 21, 2022, 5:45 p.m. UTC | #4
On 10/19/22 23:29, Adrian Hunter wrote:
> On 20/10/22 01:19, Brian Norris wrote:
>> On Wed, Oct 19, 2022 at 02:59:39PM -0700, Florian Fainelli wrote:
>>> On 10/19/22 14:54, Brian Norris wrote:
>>>> The same bug was already found and fixed for two other drivers, in v5.7
>>>> and v5.9:
>>>>
>>>> 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
>>>> df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
>>>>
>>>> The latter is especially prescient, saying "other drivers using CQHCI
>>>> might benefit from a similar change, if they also have CQHCI reset by
>>>> SDHCI_RESET_ALL."
>>
>>>> --- a/drivers/mmc/host/sdhci-of-arasan.c
>>>> +++ b/drivers/mmc/host/sdhci-of-arasan.c
>>>> @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
>>>>    	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>>>>    	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
>>>> +	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
>>>> +		cqhci_deactivate(host->mmc);
>>>> +
>>>>    	sdhci_reset(host, mask);
>>>
>>> Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
>>> be utilizing since you have access to the host and the mask to make that
>>> decision?
>>
>> It potentially could.
>>
>> I don't know if this is a specified SDHCI behavior that really belongs
>> in the common helper, or if this is just a commonly-shared behavior. Per
>> the comments I quote above ("if they also have CQHCI reset by
>> SDHCI_RESET_ALL"), I chose to leave that as an implementation-specific
>> behavior.
>>
>> I suppose it's not all that harmful to do this even if some SDHCI
>> controller doesn't have the same behavior/quirk.
>>
>> I guess I also don't know if any SDHCI controllers will support command
>> queueing (MMC_CAP2_CQE) via something *besides* CQHCI. I see
>> CQE support in sdhci-sprd.c without CQHCI, although that driver doesn't
>> set MMC_CAP2_CQE.
> 
> SDHCI and CQHCI are separate modules and are not dependent, so they cannot
> call into each other directly (and should not).  A new CQE API would be
> needed in mmc_cqe_ops e.g. (*cqe_notify_reset)(struct mmc_host *host),
> and wrapped in mmc/host.h:
> 
> static inline void mmc_cqe_notify_reset(struct mmc_host *host)
> {
> 	if (host->cqe_ops->cqe_notify_reset)
> 		host->cqe_ops->cqe_notify_reset(host);
> }
> 
> Alternatively, you could make a new module for SDHCI/CQHCI helper functions,
> although in this case there is so little code it could be static inline and
> added in a new include file instead, say sdhci-cqhci.h e.g.
> 
> #include "cqhci.h"
> #include "sdhci.h"
> 
> static inline void sdhci_cqhci_reset(struct sdhci_host *host, u8 mask)
> {
> 	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL) &&
> 	    host->mmc->cqe_private)
> 		cqhci_deactivate(host->mmc);
> 	sdhci_reset(host, mask);
> }
> 

I like the simplicity of the inline helper, especially for backports.
May I suggest naming it sdhci_and_cqhci_reset(), to show that it resets
both, rather than suggesting a reset specific to a CQHCI that is
"embedded" in SDHCI? It's your call, though.
  
Adrian Hunter Oct. 23, 2022, 4:47 p.m. UTC | #5
On 21/10/22 20:45, Florian Fainelli wrote:
> On 10/19/22 23:29, Adrian Hunter wrote:
>> On 20/10/22 01:19, Brian Norris wrote:
>>> On Wed, Oct 19, 2022 at 02:59:39PM -0700, Florian Fainelli wrote:
>>>> On 10/19/22 14:54, Brian Norris wrote:
>>>>> The same bug was already found and fixed for two other drivers, in v5.7
>>>>> and v5.9:
>>>>>
>>>>> 5cf583f1fb9c mmc: sdhci-msm: Deactivate CQE during SDHC reset
>>>>> df57d73276b8 mmc: sdhci-pci: Fix SDHCI_RESET_ALL for CQHCI for Intel GLK-based controllers
>>>>>
>>>>> The latter is especially prescient, saying "other drivers using CQHCI
>>>>> might benefit from a similar change, if they also have CQHCI reset by
>>>>> SDHCI_RESET_ALL."
>>>
>>>>> --- a/drivers/mmc/host/sdhci-of-arasan.c
>>>>> +++ b/drivers/mmc/host/sdhci-of-arasan.c
>>>>> @@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
>>>>>        struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
>>>>>        struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
>>>>> +    if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
>>>>> +        cqhci_deactivate(host->mmc);
>>>>> +
>>>>>        sdhci_reset(host, mask);
>>>>
>>>> Cannot this be absorbed by sdhci_reset() that all of these drivers appear to
>>>> be utilizing since you have access to the host and the mask to make that
>>>> decision?
>>>
>>> It potentially could.
>>>
>>> I don't know if this is a specified SDHCI behavior that really belongs
>>> in the common helper, or if this is just a commonly-shared behavior. Per
>>> the comments I quote above ("if they also have CQHCI reset by
>>> SDHCI_RESET_ALL"), I chose to leave that as an implementation-specific
>>> behavior.
>>>
>>> I suppose it's not all that harmful to do this even if some SDHCI
>>> controller doesn't have the same behavior/quirk.
>>>
>>> I guess I also don't know if any SDHCI controllers will support command
>>> queueing (MMC_CAP2_CQE) via something *besides* CQHCI. I see
>>> CQE support in sdhci-sprd.c without CQHCI, although that driver doesn't
>>> set MMC_CAP2_CQE.
>>
>> SDHCI and CQHCI are separate modules and are not dependent, so they cannot
>> call into each other directly (and should not).  A new CQE API would be
>> needed in mmc_cqe_ops e.g. (*cqe_notify_reset)(struct mmc_host *host),
>> and wrapped in mmc/host.h:
>>
>> static inline void mmc_cqe_notify_reset(struct mmc_host *host)
>> {
>>     if (host->cqe_ops->cqe_notify_reset)
>>         host->cqe_ops->cqe_notify_reset(host);
>> }
>>
>> Alternatively, you could make a new module for SDHCI/CQHCI helper functions,
>> although in this case there is so little code it could be static inline and
>> added in a new include file instead, say sdhci-cqhci.h e.g.
>>
>> #include "cqhci.h"
>> #include "sdhci.h"
>>
>> static inline void sdhci_cqhci_reset(struct sdhci_host *host, u8 mask)
>> {
>>     if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL) &&
>>         host->mmc->cqe_private)
>>         cqhci_deactivate(host->mmc);
>>     sdhci_reset(host, mask);
>> }
>>
> 
> I like the simplicity of the inline helper, especially for backports. May I suggest naming it sdhci_and_cqhci_reset(), to show that it resets both, rather than suggesting a reset specific to a CQHCI that is "embedded" in SDHCI? It's your call, though.

sdhci_and_cqhci_reset() is fine by me
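
For illustration, a userspace sketch of the helper under the agreed name, together with a driver reset callback that uses it. Everything prefixed toy_ or TOY_ is a hypothetical stand-in (including the flag values) so that the example compiles outside the kernel; only the shape of the condition is taken from the sdhci_cqhci_reset() proposal quoted above.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CAP2_CQE	(1u << 0)	/* stand-in for MMC_CAP2_CQE */
#define TOY_RESET_ALL	0x01		/* stand-in for SDHCI_RESET_ALL */
#define TOY_RESET_CMD	0x02		/* stand-in for a partial reset */

struct toy_host {
	uint32_t caps2;
	void *cqe_private;	/* non-NULL once CQE has been set up */
	bool cqe_active;
	unsigned int resets;	/* number of controller resets issued */
};

/* Stand-in for cqhci_deactivate(). */
static void toy_cqhci_deactivate(struct toy_host *host)
{
	host->cqe_active = false;
}

/* Stand-in for sdhci_reset(); it only counts the call. */
static void toy_sdhci_reset(struct toy_host *host, uint8_t mask)
{
	(void)mask;
	host->resets++;
}

/* Shared helper: deactivate CQE only for a full reset on a CQE-capable host. */
static inline void toy_sdhci_and_cqhci_reset(struct toy_host *host, uint8_t mask)
{
	if ((host->caps2 & TOY_CAP2_CQE) && (mask & TOY_RESET_ALL) &&
	    host->cqe_private)
		toy_cqhci_deactivate(host);
	toy_sdhci_reset(host, mask);
}

/* A driver's reset callback then reduces to one call plus its own quirks. */
void toy_arasan_reset(struct toy_host *host, uint8_t mask)
{
	toy_sdhci_and_cqhci_reset(host, mask);
	/* ... driver-specific quirk handling would follow here ... */
}
```

Keeping the check in a header-only helper means neither module has to call into the other, which is the constraint raised earlier in the thread, and each affected driver only needs a one-line change at its reset call site.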
  

Patch

diff --git a/drivers/mmc/host/sdhci-of-arasan.c b/drivers/mmc/host/sdhci-of-arasan.c
index 3997cad1f793..b30f0d6baf5b 100644
--- a/drivers/mmc/host/sdhci-of-arasan.c
+++ b/drivers/mmc/host/sdhci-of-arasan.c
@@ -366,6 +366,9 @@ static void sdhci_arasan_reset(struct sdhci_host *host, u8 mask)
 	struct sdhci_pltfm_host *pltfm_host = sdhci_priv(host);
 	struct sdhci_arasan_data *sdhci_arasan = sdhci_pltfm_priv(pltfm_host);
 
+	if ((host->mmc->caps2 & MMC_CAP2_CQE) && (mask & SDHCI_RESET_ALL))
+		cqhci_deactivate(host->mmc);
+
 	sdhci_reset(host, mask);
 
 	if (sdhci_arasan->quirks & SDHCI_ARASAN_QUIRK_FORCE_CDTEST) {