drm/panfrost: Ignore core_mask for poweroff and sync interrupts

Message ID 20231123095320.41433-1-angelogioacchino.delregno@collabora.com
State New
Headers
Series drm/panfrost: Ignore core_mask for poweroff and sync interrupts |

Commit Message

AngeloGioacchino Del Regno Nov. 23, 2023, 9:53 a.m. UTC
  Some SoCs may be equipped with a GPU containing two core groups
and this is exactly the case of Samsung's Exynos 5422 featuring
an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
is partial, as this driver currently supports using only one
core group and that's reflected on all parts of it, including
the power on (and power off, previously to this patch) function.

The issue with this is that even though executing the soft reset
operation should power off all cores unconditionally, on at least
one platform we're seeing a crash that seems to be happening due
to an interrupt firing which may be because we are calling power
transition only on the first core group, leaving the second one
unchanged, or because ISR execution was pending before entering
the panfrost_gpu_power_off() function and executed after powering
off the GPU cores, or all of the above.

Finally, solve this by changing the power off flow to
 1. Mask and clear all interrupts: we don't need nor want any, as
    we are polling PWRTRANS anyway;
 2. Call synchronize_irq() after that to make sure that any pending
    ISR is executed before powering off the GPU Shaders/Tilers/L2
    hence avoiding unpowered registers R/W; and
 3. Ignore the core_mask and ask the GPU to poweroff both core groups

Of course it was also necessary to add a `irq` variable to `struct
panfrost_device` as we need to get that in panfrost_gpu_power_off()
for calling synchronize_irq() on it.

Fixes: 123b431f8a5c ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
[Regression detected on Odroid HC1, Exynos 5422, Mali-T628 MP6]
Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_device.h |  1 +
 drivers/gpu/drm/panfrost/panfrost_gpu.c    | 26 +++++++++++++++-------
 2 files changed, 19 insertions(+), 8 deletions(-)
  

Comments

Boris Brezillon Nov. 23, 2023, 10:35 a.m. UTC | #1
On Thu, 23 Nov 2023 10:53:20 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> Some SoCs may be equipped with a GPU containing two core groups
> and this is exactly the case of Samsung's Exynos 5422 featuring
> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
> is partial, as this driver currently supports using only one
> core group and that's reflected on all parts of it, including
> the power on (and power off, previously to this patch) function.
> 
> The issue with this is that even though executing the soft reset
> operation should power off all cores unconditionally, on at least
> one platform we're seeing a crash that seems to be happening due
> to an interrupt firing which may be because we are calling power
> transition only on the first core group, leaving the second one
> unchanged, or because ISR execution was pending before entering
> the panfrost_gpu_power_off() function and executed after powering
> off the GPU cores, or all of the above.
> 
> Finally, solve this by changing the power off flow to
>  1. Mask and clear all interrupts: we don't need nor want any, as
>     we are polling PWRTRANS anyway;
>  2. Call synchronize_irq() after that to make sure that any pending
>     ISR is executed before powering off the GPU Shaders/Tilers/L2
>     hence avoiding unpowered registers R/W; and
>  3. Ignore the core_mask and ask the GPU to poweroff both core groups

Could we split that in two patches? 1+2 in one patch, and 3 in another.
These are two orthogonal fixes IMO.

> 
> Of course it was also necessary to add a `irq` variable to `struct
> panfrost_device` as we need to get that in panfrost_gpu_power_off()
> for calling synchronize_irq() on it.
> 
> Fixes: 123b431f8a5c ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
> [Regression detected on Odroid HC1, Exynos 5422, Mali-T628 MP6]
> Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_device.h |  1 +
>  drivers/gpu/drm/panfrost/panfrost_gpu.c    | 26 +++++++++++++++-------
>  2 files changed, 19 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
> index 0fc558db6bfd..b4feaa99e34f 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
> @@ -94,6 +94,7 @@ struct panfrost_device {
>  	struct device *dev;
>  	struct drm_device *ddev;
>  	struct platform_device *pdev;
> +	int irq;

I know it's the only irq being stored at the panfrost_device level, but
I think it's clearer if we explicitly prefix it with gpu_.

>  
>  	void __iomem *iomem;
>  	struct clk *clock;
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> index 1cc55fb9c45b..30b395125155 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> @@ -425,11 +425,21 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
>  
>  void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>  {
> -	u64 core_mask = panfrost_get_core_mask(pfdev);
>  	int ret;
>  	u32 val;
>  
> -	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
> +	/* We are polling PWRTRANS and we don't need nor want interrupts */

I kinda agree with that, but that's not exactly why we're
masking+syncing IRQs here. If that was the only reason, the right fix
would be to modify GPU_IRQ_MASK_ALL so it doesn't include the PWRTRANS
bits.

This fix should cover more than just the power transition IRQ use case:
we want all IRQs to be disabled before the device is suspended.

> +	gpu_write(pfdev, GPU_INT_MASK, 0);
> +	gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
> +
> +	/*
> +	 * Make sure that we don't have pending ISRs, otherwise we'll be
> +	 * reading and/or writing registers while the GPU is powered off
> +	 */
> +	synchronize_irq(pfdev->irq);

Could we move that to a panfrost_gpu_suspend_irq() helper? I'm also not
sure making it part of panfrost_gpu_power_off() is a good idea. I'd
rather have this panfrost_gpu_suspend_irq() helper called from
panfrost_device_[runtime_]suspend(), along with
panfrost_{mmu,job}_suspend_irq().

> +
> +	/* Now it's safe to request poweroff for Shaders, Tilers and L2 */

It was safe before too, it's just that we probably don't want to be
interrupted, if all we do is ignore the interrupts we receive, hence
the suggestion to not use GPU_IRQ_MASK_ALL, and only enable the
IRQs we care about instead.

> +	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>  	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>  					 val, !val, 1, 1000);
>  	if (ret)
> @@ -441,7 +451,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>  	if (ret)
>  		dev_err(pfdev->dev, "tiler power transition timeout");
>  
> -	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
> +	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);

I really think we should have a helper doing the 'write(PWROFF_{LO,HI} +
poll(PWRTRANS_{LO,HI})' sequence, similar to what's done here [1][2],
such that, once we got it right for one block, we have it working for
all of them. And if there's a fix to apply, it automatically applies
to all blocks without having to fix the same bug in each copy.

Note that these panthor_gpu_block_power_{on,off}() helpers also handle
the case where power transitions are already in progress when you ask a
new power transition, which I don't think is checked in
panfrost_gpu_power_{off,on}().

>  	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
>  					 val, !val, 0, 1000);

Not introduced by the patch, but I noticed args passed on the second
line are no longer aligned on the open parens.

>  	if (ret)
> @@ -451,7 +461,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>  
>  int panfrost_gpu_init(struct panfrost_device *pfdev)
>  {
> -	int err, irq;
> +	int err;
>  
>  	err = panfrost_gpu_soft_reset(pfdev);
>  	if (err)
> @@ -466,11 +476,11 @@ int panfrost_gpu_init(struct panfrost_device *pfdev)
>  
>  	dma_set_max_seg_size(pfdev->dev, UINT_MAX);
>  
> -	irq = platform_get_irq_byname(to_platform_device(pfdev->dev), "gpu");
> -	if (irq < 0)
> -		return irq;
> +	pfdev->irq = platform_get_irq_byname(to_platform_device(pfdev->dev), "gpu");
> +	if (pfdev->irq < 0)
> +		return pfdev->irq;
>  
> -	err = devm_request_irq(pfdev->dev, irq, panfrost_gpu_irq_handler,
> +	err = devm_request_irq(pfdev->dev, pfdev->irq, panfrost_gpu_irq_handler,
>  			       IRQF_SHARED, KBUILD_MODNAME "-gpu", pfdev);
>  	if (err) {
>  		dev_err(pfdev->dev, "failed to request gpu irq");

[1]https://gitlab.freedesktop.org/bbrezillon/linux/-/blob/panthor/drivers/gpu/drm/panthor/panthor_gpu.h#L22
[2]https://gitlab.freedesktop.org/bbrezillon/linux/-/blob/panthor/drivers/gpu/drm/panthor/panthor_gpu.c#L193
  
Krzysztof Kozlowski Nov. 23, 2023, 10:39 a.m. UTC | #2
On 23/11/2023 10:53, AngeloGioacchino Del Regno wrote:
> Some SoCs may be equipped with a GPU containing two core groups
> and this is exactly the case of Samsung's Exynos 5422 featuring
> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
> is partial, as this driver currently supports using only one
> core group and that's reflected on all parts of it, including
> the power on (and power off, previously to this patch) function.
> 
> The issue with this is that even though executing the soft reset
> operation should power off all cores unconditionally, on at least
> one platform we're seeing a crash that seems to be happening due
> to an interrupt firing which may be because we are calling power
> transition only on the first core group, leaving the second one
> unchanged, or because ISR execution was pending before entering
> the panfrost_gpu_power_off() function and executed after powering
> off the GPU cores, or all of the above.

Does not apply - I tried next 20231117/21/22/23.

Best regards,
Krzysztof
  
AngeloGioacchino Del Regno Nov. 23, 2023, 10:40 a.m. UTC | #3
Il 23/11/23 11:39, Krzysztof Kozlowski ha scritto:
> On 23/11/2023 10:53, AngeloGioacchino Del Regno wrote:
>> Some SoCs may be equipped with a GPU containing two core groups
>> and this is exactly the case of Samsung's Exynos 5422 featuring
>> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
>> is partial, as this driver currently supports using only one
>> core group and that's reflected on all parts of it, including
>> the power on (and power off, previously to this patch) function.
>>
>> The issue with this is that even though executing the soft reset
>> operation should power off all cores unconditionally, on at least
>> one platform we're seeing a crash that seems to be happening due
>> to an interrupt firing which may be because we are calling power
>> transition only on the first core group, leaving the second one
>> unchanged, or because ISR execution was pending before entering
>> the panfrost_gpu_power_off() function and executed after powering
>> off the GPU cores, or all of the above.
> 
> Does not apply - I tried next 20231117/21/22/23.
> 

Sorry about that, I'll send a v2 soon.

Thanks,
Angelo

> Best regards,
> Krzysztof
> 
> _______________________________________________
> Kernel mailing list -- kernel@mailman.collabora.com
> To unsubscribe send an email to kernel-leave@mailman.collabora.com
  
AngeloGioacchino Del Regno Nov. 23, 2023, 11:15 a.m. UTC | #4
Il 23/11/23 11:35, Boris Brezillon ha scritto:
> On Thu, 23 Nov 2023 10:53:20 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>> Some SoCs may be equipped with a GPU containing two core groups
>> and this is exactly the case of Samsung's Exynos 5422 featuring
>> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
>> is partial, as this driver currently supports using only one
>> core group and that's reflected on all parts of it, including
>> the power on (and power off, previously to this patch) function.
>>
>> The issue with this is that even though executing the soft reset
>> operation should power off all cores unconditionally, on at least
>> one platform we're seeing a crash that seems to be happening due
>> to an interrupt firing which may be because we are calling power
>> transition only on the first core group, leaving the second one
>> unchanged, or because ISR execution was pending before entering
>> the panfrost_gpu_power_off() function and executed after powering
>> off the GPU cores, or all of the above.
>>
>> Finally, solve this by changing the power off flow to
>>   1. Mask and clear all interrupts: we don't need nor want any, as
>>      we are polling PWRTRANS anyway;
>>   2. Call synchronize_irq() after that to make sure that any pending
>>      ISR is executed before powering off the GPU Shaders/Tilers/L2
>>      hence avoiding unpowered registers R/W; and
>>   3. Ignore the core_mask and ask the GPU to poweroff both core groups
> 
> Could we split that in two patches? 1+2 in one patch, and 3 in another.
> These are two orthogonal fixes IMO.
> 

My initial idea was exactly that, but I opted for one patch doing 'em all
because a "full fix" comprises all of 1+2+3: the third one without the
first two and vice-versa may not fully resolve the issue that was seen
on the HC1 board.

So, while I agree that it'd be slightly more readable as a diff if those
were two different commits I do have reasons against splitting.....

>>
>> Of course it was also necessary to add a `irq` variable to `struct
>> panfrost_device` as we need to get that in panfrost_gpu_power_off()
>> for calling synchronize_irq() on it.
>>
>> Fixes: 123b431f8a5c ("drm/panfrost: Really power off GPU cores in panfrost_gpu_power_off()")
>> [Regression detected on Odroid HC1, Exynos 5422, Mali-T628 MP6]
>> Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>> ---
>>   drivers/gpu/drm/panfrost/panfrost_device.h |  1 +
>>   drivers/gpu/drm/panfrost/panfrost_gpu.c    | 26 +++++++++++++++-------
>>   2 files changed, 19 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
>> index 0fc558db6bfd..b4feaa99e34f 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>> @@ -94,6 +94,7 @@ struct panfrost_device {
>>   	struct device *dev;
>>   	struct drm_device *ddev;
>>   	struct platform_device *pdev;
>> +	int irq;
> 
> I know it's the only irq being stored at the panfrost_device level, but
> I think it's clearer if we explicitly prefix it with gpu_.
> 

Makes sense, agreed.

>>   
>>   	void __iomem *iomem;
>>   	struct clk *clock;
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> index 1cc55fb9c45b..30b395125155 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>> @@ -425,11 +425,21 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
>>   
>>   void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>   {
>> -	u64 core_mask = panfrost_get_core_mask(pfdev);
>>   	int ret;
>>   	u32 val;
>>   
>> -	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
>> +	/* We are polling PWRTRANS and we don't need nor want interrupts */
> 
> I kinda agree with that, but that's not exactly why we're
> masking+syncing IRQs here. If that was the only reason, the right fix
> would be to modify GPU_IRQ_MASK_ALL so it doesn't include the PWRTRANS
> bits.
> 
> This fix should cover more than just the power transition IRQ use case:
> we want all IRQs to be disabled before the device is suspended.
> 

Eh I should reword that, effectively, because what I wrote as comments make
sense (as in, the logic of it completes) only if you read both of them "as one".

I'll do that in the new suspend irq helper :-)

>> +	gpu_write(pfdev, GPU_INT_MASK, 0);
>> +	gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
>> +
>> +	/*
>> +	 * Make sure that we don't have pending ISRs, otherwise we'll be
>> +	 * reading and/or writing registers while the GPU is powered off
>> +	 */
>> +	synchronize_irq(pfdev->irq);
> 
> Could we move that to a panfrost_gpu_suspend_irq() helper? I'm also not
> sure making it part of panfrost_gpu_power_off() is a good idea. I'd
> rather have this panfrost_gpu_suspend_irq() helper called from
> panfrost_device_[runtime_]suspend(), along with
> panfrost_{mmu,job}_suspend_irq().
> 

Okay I will move that to a helper, but I still want to clarify:
  - For JOB, we're checking if panfrost_job_is_idle() before trying
    to runtime_suspend() (hence before trying to power off cores),
    so implicitly no interrupt can fire I guess? Though there could
    still be a pending ISR there too.... mmh. Brain ticking :-)
  - For MMU, we're not checking anything, but I guess that if there
    is no job, the mmu can't be doing anything at all?
    ...but then you also gave me the doubt about that one as well.

What I think that would be sensible to do is to get this commit as
a "clear" fix for the "Really power off" one, then have one or more
additional commit(s) without any fixes tag to improve the IRQ suspend
with the new mmu/job irq suspend helpers.

Of course *this* commit would introduce the panfrost_gpu_suspend_irq()
helper directly, instead of moving the logic to a helper in a later one.

Any reason against? :-)

>> +
>> +	/* Now it's safe to request poweroff for Shaders, Tilers and L2 */
> 
> It was safe before too, it's just that we probably don't want to be

In theory yes, in practice the Odroid HC1 board crashed :-P

P.S.: Joking! I understand what you're saying :-)

> interrupted, if all we do is ignore the interrupts we receive, hence
> the suggestion to not use GPU_IRQ_MASK_ALL, and only enable the
> IRQs we care about instead.
> 
>> +	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>>   	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>>   					 val, !val, 1, 1000);
>>   	if (ret)
>> @@ -441,7 +451,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>   	if (ret)
>>   		dev_err(pfdev->dev, "tiler power transition timeout");
>>   
>> -	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
>> +	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
> 
> I really think we should have a helper doing the 'write(PWROFF_{LO,HI} +
> poll(PWRTRANS_{LO,HI})' sequence, similar to what's done here [1][2],
> such that, once we got it right for one block, we have it working for
> all of them. And if there's a fix to apply, it automatically applies
> to all blocks without having to fix the same bug in each copy.
> 

...technically yes, but practically this driver doesn't currently support any
GPU that even fills the _LO registers.

I guess that we can do that later?

That's just to (paranoidly) avoid any risk to introduce other regressions in
this merge window, since we're fixing one that shouldn't have happened in the
first place...

> Note that these panthor_gpu_block_power_{on,off}() helpers also handle
> the case where power transitions are already in progress when you ask a
> new power transition, which I don't think is checked in
> panfrost_gpu_power_{off,on}().
> 

I admit I didn't analyze the panthor driver - but here, the only power transitions
that may happen are either because of panfrost_gpu_power_on(), or because of
panfrost_gpu_power_off(), and both are polling and blocking... so from what I
understand, there's no possibility to have "another" power transition happening
while calling poweron, or poweroff.

That would change if we start to selectively turn on and off a number of shaders
and/or a number of tilers (not all of them) depending on the workload, but we're
not doing that...

...yet?

:-)

>>   	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
>>   					 val, !val, 0, 1000);
> 
> Not introduced by the patch, but I noticed args passed on the second
> line are no longer aligned on the open parens.
> 

Yeah, fixing that for v2 :-)

Cheers,
Angelo
  
AngeloGioacchino Del Regno Nov. 23, 2023, 11:43 a.m. UTC | #5
Il 23/11/23 12:15, AngeloGioacchino Del Regno ha scritto:
> Il 23/11/23 11:35, Boris Brezillon ha scritto:
>> On Thu, 23 Nov 2023 10:53:20 +0100
>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>> wrote:
>>
>>> Some SoCs may be equipped with a GPU containing two core groups
>>> and this is exactly the case of Samsung's Exynos 5422 featuring
>>> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
>>> is partial, as this driver currently supports using only one
>>> core group and that's reflected on all parts of it, including
>>> the power on (and power off, previously to this patch) function.
>>>
>>> The issue with this is that even though executing the soft reset
>>> operation should power off all cores unconditionally, on at least
>>> one platform we're seeing a crash that seems to be happening due
>>> to an interrupt firing which may be because we are calling power
>>> transition only on the first core group, leaving the second one
>>> unchanged, or because ISR execution was pending before entering
>>> the panfrost_gpu_power_off() function and executed after powering
>>> off the GPU cores, or all of the above.
>>>
>>> Finally, solve this by changing the power off flow to
>>>   1. Mask and clear all interrupts: we don't need nor want any, as
>>>      we are polling PWRTRANS anyway;
>>>   2. Call synchronize_irq() after that to make sure that any pending
>>>      ISR is executed before powering off the GPU Shaders/Tilers/L2
>>>      hence avoiding unpowered registers R/W; and
>>>   3. Ignore the core_mask and ask the GPU to poweroff both core groups
>>
>> Could we split that in two patches? 1+2 in one patch, and 3 in another.
>> These are two orthogonal fixes IMO.
>>
> 
> My initial idea was exactly that, but I opted for one patch doing 'em all
> because a "full fix" comprises all of 1+2+3: the third one without the
> first two and vice-versa may not fully resolve the issue that was seen
> on the HC1 board.
> 
> So, while I agree that it'd be slightly more readable as a diff if those
> were two different commits I do have reasons against splitting.....
> 
>>>
>>> Of course it was also necessary to add a `irq` variable to `struct
>>> panfrost_device` as we need to get that in panfrost_gpu_power_off()
>>> for calling synchronize_irq() on it.
>>>
>>> Fixes: 123b431f8a5c ("drm/panfrost: Really power off GPU cores in 
>>> panfrost_gpu_power_off()")
>>> [Regression detected on Odroid HC1, Exynos 5422, Mali-T628 MP6]
>>> Reported-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
>>> Signed-off-by: AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>> ---
>>>   drivers/gpu/drm/panfrost/panfrost_device.h |  1 +
>>>   drivers/gpu/drm/panfrost/panfrost_gpu.c    | 26 +++++++++++++++-------
>>>   2 files changed, 19 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h 
>>> b/drivers/gpu/drm/panfrost/panfrost_device.h
>>> index 0fc558db6bfd..b4feaa99e34f 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_device.h
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_device.h
>>> @@ -94,6 +94,7 @@ struct panfrost_device {
>>>       struct device *dev;
>>>       struct drm_device *ddev;
>>>       struct platform_device *pdev;
>>> +    int irq;
>>
>> I know it's the only irq being stored at the panfrost_device level, but
>> I think it's clearer if we explicitly prefix it with gpu_.
>>
> 
> Makes sense, agreed.
> 
>>>       void __iomem *iomem;
>>>       struct clk *clock;
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c 
>>> b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> index 1cc55fb9c45b..30b395125155 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> @@ -425,11 +425,21 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
>>>   void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>   {
>>> -    u64 core_mask = panfrost_get_core_mask(pfdev);
>>>       int ret;
>>>       u32 val;
>>> -    gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & 
>>> core_mask);
>>> +    /* We are polling PWRTRANS and we don't need nor want interrupts */
>>
>> I kinda agree with that, but that's not exactly why we're
>> masking+syncing IRQs here. If that was the only reason, the right fix
>> would be to modify GPU_IRQ_MASK_ALL so it doesn't include the PWRTRANS
>> bits.
>>
>> This fix should cover more than just the power transition IRQ use case:
>> we want all IRQs to be disabled before the device is suspended.
>>
> 
> Eh I should reword that, effectively, because what I wrote as comments make
> sense (as in, the logic of it completes) only if you read both of them "as one".
> 
> I'll do that in the new suspend irq helper :-)
> 
>>> +    gpu_write(pfdev, GPU_INT_MASK, 0);
>>> +    gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
>>> +
>>> +    /*
>>> +     * Make sure that we don't have pending ISRs, otherwise we'll be
>>> +     * reading and/or writing registers while the GPU is powered off
>>> +     */
>>> +    synchronize_irq(pfdev->irq);
>>
>> Could we move that to a panfrost_gpu_suspend_irq() helper? I'm also not
>> sure making it part of panfrost_gpu_power_off() is a good idea. I'd
>> rather have this panfrost_gpu_suspend_irq() helper called from
>> panfrost_device_[runtime_]suspend(), along with
>> panfrost_{mmu,job}_suspend_irq().
>>
> 
> Okay I will move that to a helper, but I still want to clarify:
>   - For JOB, we're checking if panfrost_job_is_idle() before trying
>     to runtime_suspend() (hence before trying to power off cores),
>     so implicitly no interrupt can fire I guess? Though there could
>     still be a pending ISR there too.... mmh. Brain ticking :-)
>   - For MMU, we're not checking anything, but I guess that if there
>     is no job, the mmu can't be doing anything at all?
>     ...but then you also gave me the doubt about that one as well.
> 
> What I think that would be sensible to do is to get this commit as
> a "clear" fix for the "Really power off" one, then have one or more
> additional commit(s) without any fixes tag to improve the IRQ suspend
> with the new mmu/job irq suspend helpers.
> 
> Of course *this* commit would introduce the panfrost_gpu_suspend_irq()
> helper directly, instead of moving the logic to a helper in a later one.
> 
> Any reason against? :-)
> 
>>> +
>>> +    /* Now it's safe to request poweroff for Shaders, Tilers and L2 */
>>
>> It was safe before too, it's just that we probably don't want to be
> 
> In theory yes, in practice the Odroid HC1 board crashed :-P
> 
> P.S.: Joking! I understand what you're saying :-)
> 
>> interrupted, if all we do is ignore the interrupts we receive, hence
>> the suggestion to not use GPU_IRQ_MASK_ALL, and only enable the
>> IRQs we care about instead.
>>
>>> +    gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>>>       ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>>>                        val, !val, 1, 1000);
>>>       if (ret)
>>> @@ -441,7 +451,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>       if (ret)
>>>           dev_err(pfdev->dev, "tiler power transition timeout");
>>> -    gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
>>> +    gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>>
>> I really think we should have a helper doing the 'write(PWROFF_{LO,HI} +
>> poll(PWRTRANS_{LO,HI})' sequence, similar to what's done here [1][2],
>> such that, once we got it right for one block, we have it working for
>> all of them. And if there's a fix to apply, it automatically applies
>> to all blocks without having to fix the same bug in each copy.
>>
> 
> ...technically yes, but practically this driver doesn't currently support any
> GPU that even fills the _LO registers.
> 
> I guess that we can do that later?
> 
> That's just to (paranoidly) avoid any risk to introduce other regressions in
> this merge window, since we're fixing one that shouldn't have happened in the
> first place...
> 
>> Note that these panthor_gpu_block_power_{on,off}() helpers also handle
>> the case where power transitions are already in progress when you ask a
>> new power transition, which I don't think is checked in
>> panfrost_gpu_power_{off,on}().
>>
> 
> I admit I didn't analyze the panthor driver - but here, the only power transitions
> that may happen are either because of panfrost_gpu_power_on(), or because of
> panfrost_gpu_power_off(), and both are polling and blocking... so from what I
> understand, there's no possibility to have "another" power transition happening
> while calling poweron, or poweroff.
> 
> That would change if we start to selectively turn on and off a number of shaders
> and/or a number of tilers (not all of them) depending on the workload, but we're
> not doing that...
> 
> ...yet?
> 
> :-)
> 
>>>       ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
>>>                        val, !val, 0, 1000);
>>
>> Not introduced by the patch, but I noticed args passed on the second
>> line are no longer aligned on the open parens.
>>
> 
> Yeah, fixing that for v2 :-)
> 

Sorry for the double reply - I just noticed that you're seeing this misalignment
because I had a *local* commit introducing that but, on linux-next, this is not
present, so there's no misalignment to fix........ :-)

Cheers,
Angelo
  
Boris Brezillon Nov. 23, 2023, 12:59 p.m. UTC | #6
On Thu, 23 Nov 2023 12:15:01 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> Il 23/11/23 11:35, Boris Brezillon ha scritto:
> > On Thu, 23 Nov 2023 10:53:20 +0100
> > AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> > wrote:
> >   
> >> Some SoCs may be equipped with a GPU containing two core groups
> >> and this is exactly the case of Samsung's Exynos 5422 featuring
> >> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
> >> is partial, as this driver currently supports using only one
> >> core group and that's reflected on all parts of it, including
> >> the power on (and power off, previously to this patch) function.
> >>
> >> The issue with this is that even though executing the soft reset
> >> operation should power off all cores unconditionally, on at least
> >> one platform we're seeing a crash that seems to be happening due
> >> to an interrupt firing which may be because we are calling power
> >> transition only on the first core group, leaving the second one
> >> unchanged, or because ISR execution was pending before entering
> >> the panfrost_gpu_power_off() function and executed after powering
> >> off the GPU cores, or all of the above.
> >>
> >> Finally, solve this by changing the power off flow to
> >>   1. Mask and clear all interrupts: we don't need nor want any, as
> >>      we are polling PWRTRANS anyway;
> >>   2. Call synchronize_irq() after that to make sure that any pending
> >>      ISR is executed before powering off the GPU Shaders/Tilers/L2
> >>      hence avoiding unpowered registers R/W; and
> >>   3. Ignore the core_mask and ask the GPU to poweroff both core groups  
> > 
> > Could we split that in two patches? 1+2 in one patch, and 3 in another.
> > These are two orthogonal fixes IMO.
> >   
> 
> My initial idea was exactly that, but I opted for one patch doing 'em all
> because a "full fix" comprises all of 1+2+3: the third one without the
> first two and vice-versa may not fully resolve the issue that was seen
> on the HC1 board.

Guess it depends how you see it. I'd argue that these are 2 orthogonal
bugs, and the suspend fix might be worth backporting to older versions.

> 
> So, while I agree that it'd be slightly more readable as a diff if those
> were two different commits I do have reasons against splitting.....

If we just need a quick fix to avoid PWRTRANS interrupts from kicking
in when we power-off the cores, I think we'd be better off dropping 
GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
at [re]initialization time, and then have a separate series that fixes
the problem more generically.

> >> +	gpu_write(pfdev, GPU_INT_MASK, 0);
> >> +	gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
> >> +
> >> +	/*
> >> +	 * Make sure that we don't have pending ISRs, otherwise we'll be
> >> +	 * reading and/or writing registers while the GPU is powered off
> >> +	 */
> >> +	synchronize_irq(pfdev->irq);  
> > 
> > Could we move that to a panfrost_gpu_suspend_irq() helper? I'm also not
> > sure making it part of panfrost_gpu_power_off() is a good idea. I'd
> > rather have this panfrost_gpu_suspend_irq() helper called from
> > panfrost_device_[runtime_]suspend(), along with
> > panfrost_{mmu,job}_suspend_irq().
> >   
> 
> Okay I will move that to a helper, but I still want to clarify:
>   - For JOB, we're checking if panfrost_job_is_idle() before trying
>     to runtime_suspend() (hence before trying to power off cores),
>     so implicitly no interrupt can fire I guess? Though there could
>     still be a pending ISR there too.... mmh. Brain ticking :-)

There's indeed no reason to see job interrupts if we're asked to enter
suspend, but it's mostly a matter of safety/correctness. If, as
expected, there's no pending interrupt, the write(_INT_MASK) +
synchronize_irq() should be relatively cheap.

>   - For MMU, we're not checking anything, but I guess that if there
>     is no job, the mmu can't be doing anything at all?
>     ...but then you also gave me the doubt about that one as well.

Same here, if we've properly flushed all jobs, and handled all pending
interrupts, we shouldn't end up with pending MMU irqs when we're asked
to suspend. But the extra mask+synchronize_irq() buys us extra safety.

> 
> What I think that would be sensible to do is to get this commit as
> a "clear" fix for the "Really power off" one, then have one or more
> additional commit(s) without any fixes tag to improve the IRQ suspend
> with the new mmu/job irq suspend helpers.

If you need a self-contained fix to avoid power transition interrupts
messing up with suspend, then, as I suggested, it might make more sense
to drop GPU_IRQ_POWER_CHANGED[_ALL] when writing GPU_INT_MASK, which
you want anyway, to avoid being interrupted when you do power
transitions.

> 
> Of course *this* commit would introduce the panfrost_gpu_suspend_irq()
> helper directly, instead of moving the logic to a helper in a later one.
> 
> Any reason against? :-)
> 
> >> +
> >> +	/* Now it's safe to request poweroff for Shaders, Tilers and L2 */  
> > 
> > It was safe before too, it's just that we probably don't want to be  
> 
> In theory yes, in practice the Odroid HC1 board crashed :-P

Just to be clear, it's not the accesses to the PWROFF/PWRTRANS
registers in this function that were causing the crash, it's the fact we
were writing to those regs, and leaving the corresponding interrupt
unprocessed before returning from panfrost_device[_runtime]_suspend().

> 
> P.S.: Joking! I understand what you're saying :-)

Okay, but the comment was still inaccurate :P.

> 
> > interrupted, if all we do is ignore the interrupts we receive, hence
> > the suggestion to not use GPU_IRQ_MASK_ALL, and only enable the
> > IRQs we care about instead.
> >   
> >> +	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
> >>   	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
> >>   					 val, !val, 1, 1000);
> >>   	if (ret)
> >> @@ -441,7 +451,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
> >>   	if (ret)
> >>   		dev_err(pfdev->dev, "tiler power transition timeout");
> >>   
> >> -	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
> >> +	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);  
> > 
> > I really think we should have a helper doing the 'write(PWROFF_{LO,HI} +
> > poll(PWRTRANS_{LO,HI})' sequence, similar to what's done here [1][2],
> > such that, once we got it right for one block, we have it working for
> > all of them. And if there's a fix to apply, it automatically applies
> > to all blocks without having to fix the same bug in each copy.
> >   
> 
> ...technically yes, but practically this driver doesn't currently support any
> GPU that even fills the _LO registers.

Once we have a solution that works for all use cases, it can work for
the limited set we support at the moment :P.

> 
> I guess that we can do that later?

Sure, that was more of a follow-up patch suggestion.

> 
> That's just to (paranoidly) avoid any risk to introduce other regressions in
> this merge window, since we're fixing one that shouldn't have happened in the
> first place...

Agreed, but if that's the goal, then I'd go for the simpler change I
suggested above (dropping GPU_IRQ_POWER_CHANGED[_ALL] from the
interrupt mask altogether). This way you don't have to worry about
receiving power transition interrupts in the first place, and you also
save interrupt context switching time when you power on the various
blocks.

> 
> > Note that these panthor_gpu_block_power_{on,off}() helpers also handle
> > the case where power transitions are already in progress when you ask a
> > new power transition, which I don't think is checked in
> > panfrost_gpu_power_{off,on}().
> >   
> 
> I admit I didn't analyze the panthor driver - but here, the only power transitions
> that may happen are either because of panfrost_gpu_power_on(), or because of
> panfrost_gpu_power_off(), and both are polling and blocking... so from what I
> understand, there's no possibility to have "another" power transition happening
> while calling poweron, or poweroff.

Well yes, in theory there's no reason to have more than one transition
happening at a given time (that's assuming power transition never time
out, or if they do, the system gets back to an idle state before we try
to do the next power transition). It's just that, if it ever happens,
for any reason, we'd be safe against this unexpected situation, for a
cost that's relatively low (just an extra register read if things are in
the expected idle state).

Definitely not saying we should do that in this patch, but I think we
should do anything we can do to improve robustness, assuming the cost
of these extra checks is acceptable (we're not really in a fastpath
here).
  
AngeloGioacchino Del Regno Nov. 23, 2023, 1:24 p.m. UTC | #7
Il 23/11/23 13:59, Boris Brezillon ha scritto:
> On Thu, 23 Nov 2023 12:15:01 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>> Il 23/11/23 11:35, Boris Brezillon ha scritto:
>>> On Thu, 23 Nov 2023 10:53:20 +0100
>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>> wrote:
>>>    
>>>> Some SoCs may be equipped with a GPU containing two core groups
>>>> and this is exactly the case of Samsung's Exynos 5422 featuring
>>>> an ARM Mali-T628 MP6 GPU: the support for this GPU in Panfrost
>>>> is partial, as this driver currently supports using only one
>>>> core group and that's reflected on all parts of it, including
>>>> the power on (and power off, previously to this patch) function.
>>>>
>>>> The issue with this is that even though executing the soft reset
>>>> operation should power off all cores unconditionally, on at least
>>>> one platform we're seeing a crash that seems to be happening due
>>>> to an interrupt firing which may be because we are calling power
>>>> transition only on the first core group, leaving the second one
>>>> unchanged, or because ISR execution was pending before entering
>>>> the panfrost_gpu_power_off() function and executed after powering
>>>> off the GPU cores, or all of the above.
>>>>
>>>> Finally, solve this by changing the power off flow to
>>>>    1. Mask and clear all interrupts: we don't need nor want any, as
>>>>       we are polling PWRTRANS anyway;
>>>>    2. Call synchronize_irq() after that to make sure that any pending
>>>>       ISR is executed before powering off the GPU Shaders/Tilers/L2
>>>>       hence avoiding unpowered registers R/W; and
>>>>    3. Ignore the core_mask and ask the GPU to poweroff both core groups
>>>
>>> Could we split that in two patches? 1+2 in one patch, and 3 in another.
>>> These are two orthogonal fixes IMO.
>>>    
>>
>> My initial idea was exactly that, but I opted for one patch doing 'em all
>> because a "full fix" comprises all of 1+2+3: the third one without the
>> first two and vice-versa may not fully resolve the issue that was seen
>> on the HC1 board.
> 
> Guess it depends how you see it. I'd argue that these are 2 orthogonal
> bugs, and the suspend fix might be worth backporting to older versions.
> 

Yes, but older versions are not affected by this regression because the GPU
was never turned off for real... so this commit is really fixing just the
issues that came out *because* of the "Really power off" commit, which is
not worth backporting as we already saw one regression with that and doing
such thing would be mostly dangerous.

If your suggestion was to backport 1+2 instead, I disagree: there are no
suspend instabilities in older kernel versions, so we'd be fixing something
that anyway always worked fine for years.

This is to say: the "Really power off" code should be treated like a commit
that implements a new feature, rather than fixing an old one, *only* because
there is a possibility to create regressions somehow (again, we already had
one).

At least IMO...

>>
>> So, while I agree that it'd be slightly more readable as a diff if those
>> were two different commits I do have reasons against splitting.....
> 
> If we just need a quick fix to avoid PWRTRANS interrupts from kicking
> in when we power-off the cores, I think we'd be better off dropping
> GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
> at [re]initialization time, and then have a separate series that fixes
> the problem more generically.
> 

But that didn't work:
https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/


...while this "full" solution worked:
https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@linaro.org/

https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@linaro.org/


...so this *is* a "quick fix" already... :-)

>>>> +	gpu_write(pfdev, GPU_INT_MASK, 0);
>>>> +	gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
>>>> +
>>>> +	/*
>>>> +	 * Make sure that we don't have pending ISRs, otherwise we'll be
>>>> +	 * reading and/or writing registers while the GPU is powered off
>>>> +	 */
>>>> +	synchronize_irq(pfdev->irq);
>>>
>>> Could we move that to a panfrost_gpu_suspend_irq() helper? I'm also not
>>> sure making it part of panfrost_gpu_power_off() is a good idea. I'd
>>> rather have this panfrost_gpu_suspend_irq() helper called from
>>> panfrost_device_[runtime_]suspend(), along with
>>> panfrost_{mmu,job}_suspend_irq().
>>>    
>>
>> Okay I will move that to a helper, but I still want to clarify:
>>    - For JOB, we're checking if panfrost_job_is_idle() before trying
>>      to runtime_suspend() (hence before trying to power off cores),
>>      so implicitly no interrupt can fire I guess? Though there could
>>      still be a pending ISR there too.... mmh. Brain ticking :-)
> 
> There's indeed no reason to see job interrupts if we're asked to enter
> suspend, but it's mostly a matter of safety/correctness. If, as
> expected, there's no pending interrupt, the write(_INT_MASK) +
> synchronize_irq() should be relatively cheap.
> 
>>    - For MMU, we're not checking anything, but I guess that if there
>>      is no job, the mmu can't be doing anything at all?
>>      ...but then you also gave me the doubt about that one as well.
> 
> Same here, if we've properly flushed all jobs, and handled all pending
> interrupts, we shouldn't end up with pending MMU irqs when we're asked
> to suspend. But the extra mask+synchronize_irq() buys us extra safety.
> 
>>
>> What I think that would be sensible to do is to get this commit as
>> a "clear" fix for the "Really power off" one, then have one or more
>> additional commit(s) without any fixes tag to improve the IRQ suspend
>> with the new mmu/job irq suspend helpers.
> 
> If you need a self-contained fix to avoid power transition interrupts
> messing up with suspend, then, as I suggested, it might make more sense
> to drop GPU_IRQ_POWER_CHANGED[_ALL] when writing GPU_INT_MASK, which
> you want anyway, to avoid being interrupted when you do power
> transitions.
> 
>>
>> Of course *this* commit would introduce the panfrost_gpu_suspend_irq()
>> helper directly, instead of moving the logic to a helper in a later one.
>>
>> Any reason against? :-)
>>
>>>> +
>>>> +	/* Now it's safe to request poweroff for Shaders, Tilers and L2 */
>>>
>>> It was safe before too, it's just that we probably don't want to be
>>
>> In theory yes, in practice the Odroid HC1 board crashed :-P
> 
> Just to be clear, it's not the accesses to the PWROFF/PWRTRANS
> registers in this function that were causing the crash, it's the fact we
> were writing to those regs, and leaving the corresponding interrupt
> unprocessed before returning from panfrost_device[_runtime]_suspend().
> 
>>
>> P.S.: Joking! I understand what you're saying :-)
> 
> Okay, but the comment was still inaccurate :P.
> 

Hahaha! Fair! :-D

>>
>>> interrupted, if all we do is ignore the interrupts we receive, hence
>>> the suggestion to not use GPU_IRQ_MASK_ALL, and only enable the
>>> IRQs we care about instead.
>>>    
>>>> +	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>>>>    	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>>>>    					 val, !val, 1, 1000);
>>>>    	if (ret)
>>>> @@ -441,7 +451,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>>    	if (ret)
>>>>    		dev_err(pfdev->dev, "tiler power transition timeout");
>>>>    
>>>> -	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
>>>> +	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>>>
>>> I really think we should have a helper doing the 'write(PWROFF_{LO,HI} +
>>> poll(PWRTRANS_{LO,HI})' sequence, similar to what's done here [1][2],
>>> such that, once we got it right for one block, we have it working for
>>> all of them. And if there's a fix to apply, it automatically applies
>>> to all blocks without having to fix the same bug in each copy.
>>>    
>>
>> ...technically yes, but practically this driver doesn't currently support any
>> GPU that even fills the _LO registers.
> 
> Once we have a solution that works for all use cases, it can work for
> the limited set we support at the moment :P.
> 
>>
>> I guess that we can do that later?
> 
> Sure, that was more of a follow-up patch suggestion.
> 

Thanks for that.

>>
>> That's just to (paranoidly) avoid any risk to introduce other regressions in
>> this merge window, since we're fixing one that shouldn't have happened in the
>> first place...
> 
> Agreed, but if that's the goal, then I'd go for the simpler change I
> suggested above (dropping GPU_IRQ_POWER_CHANGED[_ALL] from the
> interrupt mask altogether). This way you don't have to worry about
> receiving power transition interrupts in the first place, and you also
> save interrupt context switching time when you power on the various
> blocks.
> 

...but again, that wouldn't fully work, as per krzk's test.

>>
>>> Note that these panthor_gpu_block_power_{on,off}() helpers also handle
>>> the case where power transitions are already in progress when you ask a
>>> new power transition, which I don't think is checked in
>>> panfrost_gpu_power_{off,on}().
>>>    
>>
>> I admit I didn't analyze the panthor driver - but here, the only power transitions
>> that may happen are either because of panfrost_gpu_power_on(), or because of
>> panfrost_gpu_power_off(), and both are polling and blocking... so from what I
>> understand, there's no possibility to have "another" power transition happening
>> while calling poweron, or poweroff.
> 
> Well yes, in theory there's no reason to have more than one transition
> happening at a given time (that's assuming power transition never time
> out, or if they do, the system gets back to an idle state before we try
> to do the next power transition). It's just that, if it ever happens,
> for any reason, we'd be safe against this unexpected situation, for a
> cost that's relatively low (just an extra register read if things are in
> the expected idle state).
> 
> Definitely not saying we should do that in this patch, but I think we
> should do anything we can do to improve robustness, assuming the cost
> of these extra checks is acceptable (we're not really in a fastpath
> here).

There's a lot of room for improvement in Panfrost, on that I agree.

Cheers,
Angelo
  
Boris Brezillon Nov. 23, 2023, 1:51 p.m. UTC | #8
On Thu, 23 Nov 2023 14:24:57 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> >>
> >> So, while I agree that it'd be slightly more readable as a diff if those
> >> were two different commits I do have reasons against splitting.....  
> > 
> > If we just need a quick fix to avoid PWRTRANS interrupts from kicking
> > in when we power-off the cores, I think we'd be better off dropping
> > GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
> > at [re]initialization time, and then have a separate series that fixes
> > the problem more generically.
> >   
> 
> But that didn't work:
> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/

I meant, your 'ignore-core_mask' fix + the
'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.

So,

https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@collabora.com/
+
https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/

> 
> 
> ...while this "full" solution worked:
> https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@linaro.org/
> 
> https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@linaro.org/
> 
> 
> ...so this *is* a "quick fix" already... :-)

It's a half-baked solution for the missing irq-synchronization-on-suspend
issue IMHO. I understand why you want it all in one patch that can serve
as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
even simpler diff (see below), and then fully address the
irq-synhronization-on-suspend issue in a follow-up patchset.

--->8---
diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
index 09f5e1563ebd..6e2d7650cc2b 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
@@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
        }
 
        gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
-       gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);
+       gpu_write(pfdev, GPU_INT_MASK,
+                 GPU_IRQ_MASK_ERROR |
+                 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
+                 GPU_IRQ_CLEAN_CACHES_COMPLETED);
 
        /*
         * All in-flight jobs should have released their cycle
@@ -425,11 +428,10 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
 
 void panfrost_gpu_power_off(struct panfrost_device *pfdev)
 {
-       u64 core_mask = panfrost_get_core_mask(pfdev);
        int ret;
        u32 val;
 
-       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
+       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
        ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
                                         val, !val, 1, 1000);
        if (ret)
@@ -441,7 +443,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
        if (ret)
                dev_err(pfdev->dev, "tiler power transition timeout");
 
-       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
+       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
        ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
                                 val, !val, 0, 1000);
        if (ret)
  
AngeloGioacchino Del Regno Nov. 23, 2023, 3:14 p.m. UTC | #9
Il 23/11/23 14:51, Boris Brezillon ha scritto:
> On Thu, 23 Nov 2023 14:24:57 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>>>>
>>>> So, while I agree that it'd be slightly more readable as a diff if those
>>>> were two different commits I do have reasons against splitting.....
>>>
>>> If we just need a quick fix to avoid PWRTRANS interrupts from kicking
>>> in when we power-off the cores, I think we'd be better off dropping
>>> GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
>>> at [re]initialization time, and then have a separate series that fixes
>>> the problem more generically.
>>>    
>>
>> But that didn't work:
>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
> 
> I meant, your 'ignore-core_mask' fix + the
> 'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.
> 
> So,
> 
> https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@collabora.com/
> +
> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
> 
>>
>>
>> ...while this "full" solution worked:
>> https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@linaro.org/
>>
>> https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@linaro.org/
>>
>>
>> ...so this *is* a "quick fix" already... :-)
> 
> It's a half-baked solution for the missing irq-synchronization-on-suspend
> issue IMHO. I understand why you want it all in one patch that can serve
> as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
> panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
> even simpler diff (see below), and then fully address the
> irq-synhronization-on-suspend issue in a follow-up patchset.
> 
> --->8---
> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> index 09f5e1563ebd..6e2d7650cc2b 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> @@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
>          }
>   
>          gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
> -       gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);
> +       gpu_write(pfdev, GPU_INT_MASK,
> +                 GPU_IRQ_MASK_ERROR |
> +                 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
> +                 GPU_IRQ_CLEAN_CACHES_COMPLETED);
>   

...but if we do that, the next patch(es) will contain a partial revert of this
commit, putting back this to gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL)...

I'm not sure that it's worth changing this like that, then changing it back right
after :-\

Anyway, if anyone else agrees with doing it and then partially revert, I have no
issues going with this one instead; what I care about ultimately is resolving the
regression ASAP :-)

Cheers,
Angelo

>          /*
>           * All in-flight jobs should have released their cycle
> @@ -425,11 +428,10 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
>   
>   void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>   {
> -       u64 core_mask = panfrost_get_core_mask(pfdev);
>          int ret;
>          u32 val;
>   
> -       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
> +       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>          ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>                                           val, !val, 1, 1000);
>          if (ret)
> @@ -441,7 +443,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>          if (ret)
>                  dev_err(pfdev->dev, "tiler power transition timeout");
>   
> -       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
> +       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>          ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
>                                   val, !val, 0, 1000);
>          if (ret)
> 
>
  
Boris Brezillon Nov. 23, 2023, 3:40 p.m. UTC | #10
On Thu, 23 Nov 2023 16:14:12 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> Il 23/11/23 14:51, Boris Brezillon ha scritto:
> > On Thu, 23 Nov 2023 14:24:57 +0100
> > AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> > wrote:
> >   
> >>>>
> >>>> So, while I agree that it'd be slightly more readable as a diff if those
> >>>> were two different commits I do have reasons against splitting.....  
> >>>
> >>> If we just need a quick fix to avoid PWRTRANS interrupts from kicking
> >>> in when we power-off the cores, I think we'd be better off dropping
> >>> GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
> >>> at [re]initialization time, and then have a separate series that fixes
> >>> the problem more generically.
> >>>      
> >>
> >> But that didn't work:
> >> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/  
> > 
> > I meant, your 'ignore-core_mask' fix + the
> > 'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.
> > 
> > So,
> > 
> > https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@collabora.com/
> > +
> > https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
> >   
> >>
> >>
> >> ...while this "full" solution worked:
> >> https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@linaro.org/
> >>
> >> https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@linaro.org/
> >>
> >>
> >> ...so this *is* a "quick fix" already... :-)  
> > 
> > It's a half-baked solution for the missing irq-synchronization-on-suspend
> > issue IMHO. I understand why you want it all in one patch that can serve
> > as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
> > panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
> > even simpler diff (see below), and then fully address the
> > irq-synhronization-on-suspend issue in a follow-up patchset.
> >   
> > --->8---  
> > diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> > index 09f5e1563ebd..6e2d7650cc2b 100644
> > --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> > +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> > @@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
> >          }
> >   
> >          gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
> > -       gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);

We probably want a comment here:

	/* Only enable the interrupts we care about. */

> > +       gpu_write(pfdev, GPU_INT_MASK,
> > +                 GPU_IRQ_MASK_ERROR |
> > +                 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
> > +                 GPU_IRQ_CLEAN_CACHES_COMPLETED);
> >     
> 
> ...but if we do that, the next patch(es) will contain a partial revert of this
> commit, putting back this to gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL)...

Why should we revert it? We're not processing the PWRTRANS interrupts
in the interrupt handler, those should never have been enabled in the
first place. The only reason we'd want to revert that change is if we
decide to do have interrupt-based waits in the poweron/off
implementation, which, as far as I'm aware, is not something we intend
to do any time soon.

> 
> I'm not sure that it's worth changing this like that, then changing it back right
> after :-\
> 
> Anyway, if anyone else agrees with doing it and then partially revert, I have no
> issues going with this one instead; what I care about ultimately is resolving the
> regression ASAP :-)
> 
> Cheers,
> Angelo
> 
> >          /*
> >           * All in-flight jobs should have released their cycle
> > @@ -425,11 +428,10 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
> >   
> >   void panfrost_gpu_power_off(struct panfrost_device *pfdev)
> >   {
> > -       u64 core_mask = panfrost_get_core_mask(pfdev);
> >          int ret;
> >          u32 val;
> >   
> > -       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
> > +       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
> >          ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
> >                                           val, !val, 1, 1000);
> >          if (ret)
> > @@ -441,7 +443,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
> >          if (ret)
> >                  dev_err(pfdev->dev, "tiler power transition timeout");
> >   
> > -       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
> > +       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
> >          ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
> >                                   val, !val, 0, 1000);
> >          if (ret)
> > 
> >   
>
  
AngeloGioacchino Del Regno Nov. 24, 2023, 9:17 a.m. UTC | #11
Il 23/11/23 16:40, Boris Brezillon ha scritto:
> On Thu, 23 Nov 2023 16:14:12 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>> Il 23/11/23 14:51, Boris Brezillon ha scritto:
>>> On Thu, 23 Nov 2023 14:24:57 +0100
>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>> wrote:
>>>    
>>>>>>
>>>>>> So, while I agree that it'd be slightly more readable as a diff if those
>>>>>> were two different commits I do have reasons against splitting.....
>>>>>
>>>>> If we just need a quick fix to avoid PWRTRANS interrupts from kicking
>>>>> in when we power-off the cores, I think we'd be better off dropping
>>>>> GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
>>>>> at [re]initialization time, and then have a separate series that fixes
>>>>> the problem more generically.
>>>>>       
>>>>
>>>> But that didn't work:
>>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
>>>
>>> I meant, your 'ignore-core_mask' fix + the
>>> 'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.
>>>
>>> So,
>>>
>>> https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@collabora.com/
>>> +
>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
>>>    
>>>>
>>>>
>>>> ...while this "full" solution worked:
>>>> https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@linaro.org/
>>>>
>>>> https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@linaro.org/
>>>>
>>>>
>>>> ...so this *is* a "quick fix" already... :-)
>>>
>>> It's a half-baked solution for the missing irq-synchronization-on-suspend
>>> issue IMHO. I understand why you want it all in one patch that can serve
>>> as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
>>> panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
>>> even simpler diff (see below), and then fully address the
>>> irq-synhronization-on-suspend issue in a follow-up patchset.
>>>    
>>> --->8---
>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> index 09f5e1563ebd..6e2d7650cc2b 100644
>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>> @@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
>>>           }
>>>    
>>>           gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
>>> -       gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);
> 
> We probably want a comment here:
> 
> 	/* Only enable the interrupts we care about. */
> 
>>> +       gpu_write(pfdev, GPU_INT_MASK,
>>> +                 GPU_IRQ_MASK_ERROR |
>>> +                 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
>>> +                 GPU_IRQ_CLEAN_CACHES_COMPLETED);
>>>      
>>
>> ...but if we do that, the next patch(es) will contain a partial revert of this
>> commit, putting back this to gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL)...
> 
> Why should we revert it? We're not processing the PWRTRANS interrupts
> in the interrupt handler, those should never have been enabled in the
> first place. The only reason we'd want to revert that change is if we
> decide to do have interrupt-based waits in the poweron/off
> implementation, which, as far as I'm aware, is not something we intend
> to do any time soon.
> 

You're right, yes. Okay, I'll push the new code soon.

Cheers!

>>
>> I'm not sure that it's worth changing this like that, then changing it back right
>> after :-\
>>
>> Anyway, if anyone else agrees with doing it and then partially revert, I have no
>> issues going with this one instead; what I care about ultimately is resolving the
>> regression ASAP :-)
>>
>> Cheers,
>> Angelo
>>
>>>           /*
>>>            * All in-flight jobs should have released their cycle
>>> @@ -425,11 +428,10 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
>>>    
>>>    void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>    {
>>> -       u64 core_mask = panfrost_get_core_mask(pfdev);
>>>           int ret;
>>>           u32 val;
>>>    
>>> -       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
>>> +       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>>>           ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>>>                                            val, !val, 1, 1000);
>>>           if (ret)
>>> @@ -441,7 +443,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>           if (ret)
>>>                   dev_err(pfdev->dev, "tiler power transition timeout");
>>>    
>>> -       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
>>> +       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>>>           ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
>>>                                    val, !val, 0, 1000);
>>>           if (ret)
>>>
>>>    
>>
>
  
AngeloGioacchino Del Regno Nov. 24, 2023, 10:12 a.m. UTC | #12
Il 24/11/23 10:17, AngeloGioacchino Del Regno ha scritto:
> Il 23/11/23 16:40, Boris Brezillon ha scritto:
>> On Thu, 23 Nov 2023 16:14:12 +0100
>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>> wrote:
>>
>>> Il 23/11/23 14:51, Boris Brezillon ha scritto:
>>>> On Thu, 23 Nov 2023 14:24:57 +0100
>>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>>> wrote:
>>>>>>>
>>>>>>> So, while I agree that it'd be slightly more readable as a diff if those
>>>>>>> were two different commits I do have reasons against splitting.....
>>>>>>
>>>>>> If we just need a quick fix to avoid PWRTRANS interrupts from kicking
>>>>>> in when we power-off the cores, I think we'd be better off dropping
>>>>>> GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
>>>>>> at [re]initialization time, and then have a separate series that fixes
>>>>>> the problem more generically.
>>>>>
>>>>> But that didn't work:
>>>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
>>>>
>>>> I meant, your 'ignore-core_mask' fix + the
>>>> 'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.
>>>>
>>>> So,
>>>>
>>>> https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@collabora.com/
>>>> +
>>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
>>>>>
>>>>>
>>>>> ...while this "full" solution worked:
>>>>> https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@linaro.org/
>>>>>
>>>>> https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@linaro.org/
>>>>>
>>>>>
>>>>> ...so this *is* a "quick fix" already... :-)
>>>>
>>>> It's a half-baked solution for the missing irq-synchronization-on-suspend
>>>> issue IMHO. I understand why you want it all in one patch that can serve
>>>> as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
>>>> panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
>>>> even simpler diff (see below), and then fully address the
>>>> irq-synhronization-on-suspend issue in a follow-up patchset.
>>>> --->8---
>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c 
>>>> b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>>> index 09f5e1563ebd..6e2d7650cc2b 100644
>>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>>> @@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
>>>>           }
>>>>           gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
>>>> -       gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);
>>
>> We probably want a comment here:
>>
>>     /* Only enable the interrupts we care about. */
>>
>>>> +       gpu_write(pfdev, GPU_INT_MASK,
>>>> +                 GPU_IRQ_MASK_ERROR |
>>>> +                 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
>>>> +                 GPU_IRQ_CLEAN_CACHES_COMPLETED);
>>>
>>> ...but if we do that, the next patch(es) will contain a partial revert of this
>>> commit, putting back this to gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL)...
>>
>> Why should we revert it? We're not processing the PWRTRANS interrupts
>> in the interrupt handler, those should never have been enabled in the
>> first place. The only reason we'd want to revert that change is if we
>> decide to do have interrupt-based waits in the poweron/off
>> implementation, which, as far as I'm aware, is not something we intend
>> to do any time soon.
>>
> 
> You're right, yes. Okay, I'll push the new code soon.
> 
> Cheers!
> 

Update: I was running some (rather fast) tests here because I ... felt like playing
with it, basically :-)

So, I had an issue with MediaTek platforms being unable to cut power to the GPU or
disable clocks aggressively... and after trying "this and that" I couldn't get it
working (in runtime suspend).

Long story short - after implementing `panfrost_{job,mmu,gpu}_suspend_irq()` (only
gpu irq, as you said, is a half solution), I can not only turn off clocks, but even
turn off GPU power supplies entirely, bringing the power consumption of the GPU
itself during *runtime* suspend to ... zero.

The result of this test makes me truly happy, even though complete powercut during
runtime suspend may not be feasible for other reasons (takes ~200000ns on AVG,
MIN ~160000ns, but the MAX is ~475000ns - and beware that I haven't run that for
long, I'd suspect to get up to 1-1.5ms as max time, so that's a big no).

This means that I will take a day or two and I'll push both the "simple" fix for
the Really-power-off and also some more commits to add the full irq sync.

Cheers!
Angelo

>>>
>>> I'm not sure that it's worth changing this like that, then changing it back right
>>> after :-\
>>>
>>> Anyway, if anyone else agrees with doing it and then partially revert, I have no
>>> issues going with this one instead; what I care about ultimately is resolving the
>>> regression ASAP :-)
>>>
>>> Cheers,
>>> Angelo
>>>
>>>>           /*
>>>>            * All in-flight jobs should have released their cycle
>>>> @@ -425,11 +428,10 @@ void panfrost_gpu_power_on(struct panfrost_device *pfdev)
>>>>    void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>>    {
>>>> -       u64 core_mask = panfrost_get_core_mask(pfdev);
>>>>           int ret;
>>>>           u32 val;
>>>> -       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & 
>>>> core_mask);
>>>> +       gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
>>>>           ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
>>>>                                            val, !val, 1, 1000);
>>>>           if (ret)
>>>> @@ -441,7 +443,7 @@ void panfrost_gpu_power_off(struct panfrost_device *pfdev)
>>>>           if (ret)
>>>>                   dev_err(pfdev->dev, "tiler power transition timeout");
>>>> -       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
>>>> +       gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
>>>>           ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
>>>>                                    val, !val, 0, 1000);
>>>>           if (ret)
>>>>
>>>
>>
> 
> 
>
  
Boris Brezillon Nov. 24, 2023, 10:21 a.m. UTC | #13
On Fri, 24 Nov 2023 11:12:57 +0100
AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
wrote:

> Il 24/11/23 10:17, AngeloGioacchino Del Regno ha scritto:
> > Il 23/11/23 16:40, Boris Brezillon ha scritto:  
> >> On Thu, 23 Nov 2023 16:14:12 +0100
> >> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> >> wrote:
> >>  
> >>> Il 23/11/23 14:51, Boris Brezillon ha scritto:  
> >>>> On Thu, 23 Nov 2023 14:24:57 +0100
> >>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> >>>> wrote:  
> >>>>>>>
> >>>>>>> So, while I agree that it'd be slightly more readable as a diff if those
> >>>>>>> were two different commits I do have reasons against splitting.....  
> >>>>>>
> >>>>>> If we just need a quick fix to avoid PWRTRANS interrupts from kicking
> >>>>>> in when we power-off the cores, I think we'd be better off dropping
> >>>>>> GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
> >>>>>> at [re]initialization time, and then have a separate series that fixes
> >>>>>> the problem more generically.  
> >>>>>
> >>>>> But that didn't work:
> >>>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/  
> >>>>
> >>>> I meant, your 'ignore-core_mask' fix + the
> >>>> 'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.
> >>>>
> >>>> So,
> >>>>
> >>>> https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@collabora.com/
> >>>> +
> >>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/  
> >>>>>
> >>>>>
> >>>>> ...while this "full" solution worked:
> >>>>> https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@linaro.org/
> >>>>>
> >>>>> https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@linaro.org/
> >>>>>
> >>>>>
> >>>>> ...so this *is* a "quick fix" already... :-)  
> >>>>
> >>>> It's a half-baked solution for the missing irq-synchronization-on-suspend
> >>>> issue IMHO. I understand why you want it all in one patch that can serve
> >>>> as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
> >>>> panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
> >>>> even simpler diff (see below), and then fully address the
> >>>> irq-synhronization-on-suspend issue in a follow-up patchset.  
> >>>> --->8---  
> >>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c 
> >>>> b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>> index 09f5e1563ebd..6e2d7650cc2b 100644
> >>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
> >>>> @@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
> >>>>           }
> >>>>           gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
> >>>> -       gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);  
> >>
> >> We probably want a comment here:
> >>
> >>     /* Only enable the interrupts we care about. */
> >>  
> >>>> +       gpu_write(pfdev, GPU_INT_MASK,
> >>>> +                 GPU_IRQ_MASK_ERROR |
> >>>> +                 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
> >>>> +                 GPU_IRQ_CLEAN_CACHES_COMPLETED);  
> >>>
> >>> ...but if we do that, the next patch(es) will contain a partial revert of this
> >>> commit, putting back this to gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL)...  
> >>
> >> Why should we revert it? We're not processing the PWRTRANS interrupts
> >> in the interrupt handler, those should never have been enabled in the
> >> first place. The only reason we'd want to revert that change is if we
> >> decide to do have interrupt-based waits in the poweron/off
> >> implementation, which, as far as I'm aware, is not something we intend
> >> to do any time soon.
> >>  
> > 
> > You're right, yes. Okay, I'll push the new code soon.
> > 
> > Cheers!
> >   
> 
> Update: I was running some (rather fast) tests here because I ... felt like playing
> with it, basically :-)
> 
> So, I had an issue with MediaTek platforms being unable to cut power to the GPU or
> disable clocks aggressively... and after trying "this and that" I couldn't get it
> working (in runtime suspend).
> 
> Long story short - after implementing `panfrost_{job,mmu,gpu}_suspend_irq()` (only
> gpu irq, as you said, is a half solution), I can not only turn off clocks, but even
> turn off GPU power supplies entirely, bringing the power consumption of the GPU
> itself during *runtime* suspend to ... zero.

Very nice!

> 
> The result of this test makes me truly happy, even though complete powercut during
> runtime suspend may not be feasible for other reasons (takes ~200000ns on AVG,
> MIN ~160000ns, but the MAX is ~475000ns - and beware that I haven't run that for
> long, I'd suspect to get up to 1-1.5ms as max time, so that's a big no).

Do you know what's taking so long? I'm disabling clks + the main power
domain in panthor (I leave the regulators enabled), but I didn't get to
measure the time it takes to enter/exit suspend. I might have to do
what you did in panfrost and have different paths for system and RPM
suspend.

> 
> This means that I will take a day or two and I'll push both the "simple" fix for
> the Really-power-off and also some more commits to add the full irq sync.

Thanks for working on that, and sorry if I've been picky in my previous
reviews. Looking forward to review these new patches.

Regards,

Boris
  
AngeloGioacchino Del Regno Nov. 24, 2023, 10:48 a.m. UTC | #14
Il 24/11/23 11:21, Boris Brezillon ha scritto:
> On Fri, 24 Nov 2023 11:12:57 +0100
> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
> wrote:
> 
>> Il 24/11/23 10:17, AngeloGioacchino Del Regno ha scritto:
>>> Il 23/11/23 16:40, Boris Brezillon ha scritto:
>>>> On Thu, 23 Nov 2023 16:14:12 +0100
>>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>>> wrote:
>>>>   
>>>>> Il 23/11/23 14:51, Boris Brezillon ha scritto:
>>>>>> On Thu, 23 Nov 2023 14:24:57 +0100
>>>>>> AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>
>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> So, while I agree that it'd be slightly more readable as a diff if those
>>>>>>>>> were two different commits I do have reasons against splitting.....
>>>>>>>>
>>>>>>>> If we just need a quick fix to avoid PWRTRANS interrupts from kicking
>>>>>>>> in when we power-off the cores, I think we'd be better off dropping
>>>>>>>> GPU_IRQ_POWER_CHANGED[_ALL] from the value we write to GPU_INT_MASK
>>>>>>>> at [re]initialization time, and then have a separate series that fixes
>>>>>>>> the problem more generically.
>>>>>>>
>>>>>>> But that didn't work:
>>>>>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
>>>>>>
>>>>>> I meant, your 'ignore-core_mask' fix + the
>>>>>> 'drop GPU_IRQ_POWER_CHANGED[_ALL] in GPU_INT_MASK' one.
>>>>>>
>>>>>> So,
>>>>>>
>>>>>> https://lore.kernel.org/all/4c73f67e-174c-497e-85a5-cb053ce657cb@collabora.com/
>>>>>> +
>>>>>> https://lore.kernel.org/all/d95259b8-10cf-4ded-866c-47cbd2a44f84@linaro.org/
>>>>>>>
>>>>>>>
>>>>>>> ...while this "full" solution worked:
>>>>>>> https://lore.kernel.org/all/39e9514b-087c-42eb-8d0e-f75dc620e954@linaro.org/
>>>>>>>
>>>>>>> https://lore.kernel.org/all/5b24cc73-23aa-4837-abb9-b6d138b46426@linaro.org/
>>>>>>>
>>>>>>>
>>>>>>> ...so this *is* a "quick fix" already... :-)
>>>>>>
>>>>>> It's a half-baked solution for the missing irq-synchronization-on-suspend
>>>>>> issue IMHO. I understand why you want it all in one patch that can serve
>>>>>> as a fix for 123b431f8a5c ("drm/panfrost: Really power off GPU cores in
>>>>>> panfrost_gpu_power_off()"), which is why I'm suggesting to go for an
>>>>>> even simpler diff (see below), and then fully address the
>>>>>> irq-synhronization-on-suspend issue in a follow-up patchset.
>>>>>> --->8---
>>>>>> diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>>>>> b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>>>>> index 09f5e1563ebd..6e2d7650cc2b 100644
>>>>>> --- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>>>>> +++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
>>>>>> @@ -78,7 +78,10 @@ int panfrost_gpu_soft_reset(struct panfrost_device *pfdev)
>>>>>>            }
>>>>>>            gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
>>>>>> -       gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL);
>>>>
>>>> We probably want a comment here:
>>>>
>>>>      /* Only enable the interrupts we care about. */
>>>>   
>>>>>> +       gpu_write(pfdev, GPU_INT_MASK,
>>>>>> +                 GPU_IRQ_MASK_ERROR |
>>>>>> +                 GPU_IRQ_PERFCNT_SAMPLE_COMPLETED |
>>>>>> +                 GPU_IRQ_CLEAN_CACHES_COMPLETED);
>>>>>
>>>>> ...but if we do that, the next patch(es) will contain a partial revert of this
>>>>> commit, putting back this to gpu_write(pfdev, GPU_INT_MASK, GPU_IRQ_MASK_ALL)...
>>>>
>>>> Why should we revert it? We're not processing the PWRTRANS interrupts
>>>> in the interrupt handler, those should never have been enabled in the
>>>> first place. The only reason we'd want to revert that change is if we
>>>> decide to do have interrupt-based waits in the poweron/off
>>>> implementation, which, as far as I'm aware, is not something we intend
>>>> to do any time soon.
>>>>   
>>>
>>> You're right, yes. Okay, I'll push the new code soon.
>>>
>>> Cheers!
>>>    
>>
>> Update: I was running some (rather fast) tests here because I ... felt like playing
>> with it, basically :-)
>>
>> So, I had an issue with MediaTek platforms being unable to cut power to the GPU or
>> disable clocks aggressively... and after trying "this and that" I couldn't get it
>> working (in runtime suspend).
>>
>> Long story short - after implementing `panfrost_{job,mmu,gpu}_suspend_irq()` (only
>> gpu irq, as you said, is a half solution), I can not only turn off clocks, but even
>> turn off GPU power supplies entirely, bringing the power consumption of the GPU
>> itself during *runtime* suspend to ... zero.
> 
> Very nice!
> 
>>
>> The result of this test makes me truly happy, even though complete powercut during
>> runtime suspend may not be feasible for other reasons (takes ~200000ns on AVG,
>> MIN ~160000ns, but the MAX is ~475000ns - and beware that I haven't run that for
>> long, I'd suspect to get up to 1-1.5ms as max time, so that's a big no).
> 
> Do you know what's taking so long? I'm disabling clks + the main power
> domain in panthor (I leave the regulators enabled), but I didn't get to
> measure the time it takes to enter/exit suspend. I might have to do
> what you did in panfrost and have different paths for system and RPM
> suspend.
> 

That's SoC dependant... there's only one way to get runtime suspend right in terms
of timing, and that is to select what to do there on a per-SoC basis: this is why
some of them will take lots of time to turn off (or on!) clocks, because clock
controllers are not all equal: this is not only in relation to different vendors
(as in, rockchip vs nxp vs mediatek vs qcom vs...) but also for different parts
from the same vendor (as in, MSM8953 uses different clock controllers compared to
SM8350 and MT6795 different compared to MT6985 and MT8195).

Some of them will require a long time to turn on a PLL mainly because "locking the
PLL" (read: waiting until the PLL outputs a certain clock rate in a stable manner)
may take time, or some could lose time by waiting for ungating... though, usually
clock controllers take no time to *turn off* a clock, but needs a bit more time to
turn them *on* (which matters in runtime resume, but not much in system pm resume).

The situation is exactly the same in regard to power supplies: some may need a bit
of time to turn on/off a power *domain* (not MTK's case), some lose time turning
on/off a regulator instead.

Related to the case of my MediaTek platforms (MT8192, MT8195, MT8186), the power
domains go on/off in nanoseconds, same for clocks (as the clock tree uses a generic
PLL for init, then switches to a GPU-dedicate with "you being picky" :-)d PLL which 
anyway doesn't require a
lot of time to reach lock state), but turning off the *Vgpu* (gpu core supply)
regulator takes "all that much".

So the only solution to do that on a per-SoC basis would be to have the code to
turn on/off clocks and regulators in a barrier, and have it in both runtime PM
*and* system PM, and to select - for each SoC - what to do such that the driver is
able to do *either of*:

  - Runtime PM only PWRTRANS (beware: power domains off implicit, in all cases!)
  - Runtime PM PWRTRANS + clocks off
  - Runtime PM PWRTRANS + clocks off + regulators off

  - System PM RPM() only
  - System PM RPM() + clocks off (only if not done already in RPM)
  - System PM RPM() + clocks off + regulators off (only if not done already in RPM)

...at least, that's to reach an "almost perfect" power save tuning on *all*, but
the shortest path is obviously to leave the "clocks off + regulators off" code in
System PM only, as that's something that 99.9% of the SoCs would support.

I'd be happy to find a (clean) solution to add what I described to Panfrost (and
would be reusable in Panthor, I believe) but I don't know if I'd be able to put
something on the table for this merge window anyway.

>>
>> This means that I will take a day or two and I'll push both the "simple" fix for
>> the Really-power-off and also some more commits to add the full irq sync.
> 
> Thanks for working on that, and sorry if I've been picky in my previous
> reviews. Looking forward to review these new patches.
> 

Many times, being picky means reaching a better solution, and this is one of those
cases: there's no need to be sorry (not to me at least).

I appreciate when others are giving out valuable *and constructive* feedback, which
again, this was the case - so thank you for "being picky"! :-)

Cheers,
Angelo
  

Patch

diff --git a/drivers/gpu/drm/panfrost/panfrost_device.h b/drivers/gpu/drm/panfrost/panfrost_device.h
index 0fc558db6bfd..b4feaa99e34f 100644
--- a/drivers/gpu/drm/panfrost/panfrost_device.h
+++ b/drivers/gpu/drm/panfrost/panfrost_device.h
@@ -94,6 +94,7 @@  struct panfrost_device {
 	struct device *dev;
 	struct drm_device *ddev;
 	struct platform_device *pdev;
+	int irq;
 
 	void __iomem *iomem;
 	struct clk *clock;
diff --git a/drivers/gpu/drm/panfrost/panfrost_gpu.c b/drivers/gpu/drm/panfrost/panfrost_gpu.c
index 1cc55fb9c45b..30b395125155 100644
--- a/drivers/gpu/drm/panfrost/panfrost_gpu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_gpu.c
@@ -425,11 +425,21 @@  void panfrost_gpu_power_on(struct panfrost_device *pfdev)
 
 void panfrost_gpu_power_off(struct panfrost_device *pfdev)
 {
-	u64 core_mask = panfrost_get_core_mask(pfdev);
 	int ret;
 	u32 val;
 
-	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present & core_mask);
+	/* We are polling PWRTRANS and we don't need nor want interrupts */
+	gpu_write(pfdev, GPU_INT_MASK, 0);
+	gpu_write(pfdev, GPU_INT_CLEAR, GPU_IRQ_MASK_ALL);
+
+	/*
+	 * Make sure that we don't have pending ISRs, otherwise we'll be
+	 * reading and/or writing registers while the GPU is powered off
+	 */
+	synchronize_irq(pfdev->irq);
+
+	/* Now it's safe to request poweroff for Shaders, Tilers and L2 */
+	gpu_write(pfdev, SHADER_PWROFF_LO, pfdev->features.shader_present);
 	ret = readl_relaxed_poll_timeout(pfdev->iomem + SHADER_PWRTRANS_LO,
 					 val, !val, 1, 1000);
 	if (ret)
@@ -441,7 +451,7 @@  void panfrost_gpu_power_off(struct panfrost_device *pfdev)
 	if (ret)
 		dev_err(pfdev->dev, "tiler power transition timeout");
 
-	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present & core_mask);
+	gpu_write(pfdev, L2_PWROFF_LO, pfdev->features.l2_present);
 	ret = readl_poll_timeout(pfdev->iomem + L2_PWRTRANS_LO,
 					 val, !val, 0, 1000);
 	if (ret)
@@ -451,7 +461,7 @@  void panfrost_gpu_power_off(struct panfrost_device *pfdev)
 
 int panfrost_gpu_init(struct panfrost_device *pfdev)
 {
-	int err, irq;
+	int err;
 
 	err = panfrost_gpu_soft_reset(pfdev);
 	if (err)
@@ -466,11 +476,11 @@  int panfrost_gpu_init(struct panfrost_device *pfdev)
 
 	dma_set_max_seg_size(pfdev->dev, UINT_MAX);
 
-	irq = platform_get_irq_byname(to_platform_device(pfdev->dev), "gpu");
-	if (irq < 0)
-		return irq;
+	pfdev->irq = platform_get_irq_byname(to_platform_device(pfdev->dev), "gpu");
+	if (pfdev->irq < 0)
+		return pfdev->irq;
 
-	err = devm_request_irq(pfdev->dev, irq, panfrost_gpu_irq_handler,
+	err = devm_request_irq(pfdev->dev, pfdev->irq, panfrost_gpu_irq_handler,
 			       IRQF_SHARED, KBUILD_MODNAME "-gpu", pfdev);
 	if (err) {
 		dev_err(pfdev->dev, "failed to request gpu irq");