drm/lima: fix sched context destroy

Message ID 20230606143247.433018-1-nunes.erico@gmail.com
State New
Headers
Series drm/lima: fix sched context destroy |

Commit Message

Erico Nunes June 6, 2023, 2:32 p.m. UTC
  The drm sched entity must be flushed before finishing, to account for
jobs potentially still in flight at that time.
Lima did not do this flush until now, so switch the destroy call to the
drm_sched_entity_destroy() wrapper which will take care of that.

This fixes a regression on lima which started since the rework in
commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
where some specific types of applications may hang indefinitely.

Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
---
 drivers/gpu/drm/lima/lima_sched.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
  

Comments

Vasily Khoruzhick June 7, 2023, 1:18 a.m. UTC | #1
On Tue, Jun 6, 2023 at 7:33 AM Erico Nunes <nunes.erico@gmail.com> wrote:
>
> The drm sched entity must be flushed before finishing, to account for
> jobs potentially still in flight at that time.
> Lima did not do this flush until now, so switch the destroy call to the
> drm_sched_entity_destroy() wrapper which will take care of that.
>
> This fixes a regression on lima which started since the rework in
> commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> where some specific types of applications may hang indefinitely.
>
> Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> Signed-off-by: Erico Nunes <nunes.erico@gmail.com>

Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>

> ---
>  drivers/gpu/drm/lima/lima_sched.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> index ff003403fbbc..ffd91a5ee299 100644
> --- a/drivers/gpu/drm/lima/lima_sched.c
> +++ b/drivers/gpu/drm/lima/lima_sched.c
> @@ -165,7 +165,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
>  void lima_sched_context_fini(struct lima_sched_pipe *pipe,
>                              struct lima_sched_context *context)
>  {
> -       drm_sched_entity_fini(&context->base);
> +       drm_sched_entity_destroy(&context->base);
>  }
>
>  struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
> --
> 2.40.1
>
  
Qiang Yu June 7, 2023, 4:04 a.m. UTC | #2
Reviewed-by: Qiang Yu <yuq825@gmail.com>

Applied to drm-misc-fixes.

On Wed, Jun 7, 2023 at 9:18 AM Vasily Khoruzhick <anarsoul@gmail.com> wrote:
>
> On Tue, Jun 6, 2023 at 7:33 AM Erico Nunes <nunes.erico@gmail.com> wrote:
> >
> > The drm sched entity must be flushed before finishing, to account for
> > jobs potentially still in flight at that time.
> > Lima did not do this flush until now, so switch the destroy call to the
> > drm_sched_entity_destroy() wrapper which will take care of that.
> >
> > This fixes a regression on lima which started since the rework in
> > commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> > where some specific types of applications may hang indefinitely.
> >
> > Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
> > Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
>
> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
>
> > ---
> >  drivers/gpu/drm/lima/lima_sched.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
> > index ff003403fbbc..ffd91a5ee299 100644
> > --- a/drivers/gpu/drm/lima/lima_sched.c
> > +++ b/drivers/gpu/drm/lima/lima_sched.c
> > @@ -165,7 +165,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
> >  void lima_sched_context_fini(struct lima_sched_pipe *pipe,
> >                              struct lima_sched_context *context)
> >  {
> > -       drm_sched_entity_fini(&context->base);
> > +       drm_sched_entity_destroy(&context->base);
> >  }
> >
> >  struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
> > --
> > 2.40.1
> >
  
Christian König June 7, 2023, 9:20 a.m. UTC | #3
Acked-by: Christian König <christian.koenig@amd.com>

While you are at it: It's beneficial for drivers to implement the flush 
callback on the file descriptor.

This way you can still send a SIGKILL when a terminating application 
waits for the entity to be flushed out to the hardware and all the 
pending jobs are canceled.

Regards,
Christian.

Am 07.06.23 um 06:04 schrieb Qiang Yu:
> Reviewed-by: Qiang Yu <yuq825@gmail.com>
>
> Applied to drm-misc-fixes.
>
> On Wed, Jun 7, 2023 at 9:18 AM Vasily Khoruzhick <anarsoul@gmail.com> wrote:
>> On Tue, Jun 6, 2023 at 7:33 AM Erico Nunes <nunes.erico@gmail.com> wrote:
>>> The drm sched entity must be flushed before finishing, to account for
>>> jobs potentially still in flight at that time.
>>> Lima did not do this flush until now, so switch the destroy call to the
>>> drm_sched_entity_destroy() wrapper which will take care of that.
>>>
>>> This fixes a regression on lima which started since the rework in
>>> commit 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
>>> where some specific types of applications may hang indefinitely.
>>>
>>> Fixes: 2fdb8a8f07c2 ("drm/scheduler: rework entity flush, kill and fini")
>>> Signed-off-by: Erico Nunes <nunes.erico@gmail.com>
>> Reviewed-by: Vasily Khoruzhick <anarsoul@gmail.com>
>>
>>> ---
>>>   drivers/gpu/drm/lima/lima_sched.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
>>> index ff003403fbbc..ffd91a5ee299 100644
>>> --- a/drivers/gpu/drm/lima/lima_sched.c
>>> +++ b/drivers/gpu/drm/lima/lima_sched.c
>>> @@ -165,7 +165,7 @@ int lima_sched_context_init(struct lima_sched_pipe *pipe,
>>>   void lima_sched_context_fini(struct lima_sched_pipe *pipe,
>>>                               struct lima_sched_context *context)
>>>   {
>>> -       drm_sched_entity_fini(&context->base);
>>> +       drm_sched_entity_destroy(&context->base);
>>>   }
>>>
>>>   struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)
>>> --
>>> 2.40.1
>>>
  

Patch

diff --git a/drivers/gpu/drm/lima/lima_sched.c b/drivers/gpu/drm/lima/lima_sched.c
index ff003403fbbc..ffd91a5ee299 100644
--- a/drivers/gpu/drm/lima/lima_sched.c
+++ b/drivers/gpu/drm/lima/lima_sched.c
@@ -165,7 +165,7 @@  int lima_sched_context_init(struct lima_sched_pipe *pipe,
 void lima_sched_context_fini(struct lima_sched_pipe *pipe,
 			     struct lima_sched_context *context)
 {
-	drm_sched_entity_fini(&context->base);
+	drm_sched_entity_destroy(&context->base);
 }
 
 struct dma_fence *lima_sched_context_queue_task(struct lima_sched_task *task)