[v2,4/8] drm/msm/dpu: Disallow unallocated resources to be returned

Message ID 20221221231943.1961117-5-marijn.suijten@somainline.org
State New
Headers
Series drm/msm: DSC Electric Boogaloo for sm8[12]50 |

Commit Message

Marijn Suijten Dec. 21, 2022, 11:19 p.m. UTC
  In the event that the topology requests resources that have not been
created by the system (because they are typically not represented in
dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
blocks) remain NULL but will still be returned out of
dpu_rm_get_assigned_resources, where the caller expects to get an array
containing num_blks valid pointers (but instead gets these NULLs).

To prevent this from happening, where null-pointer dereferences
typically result in a hard-to-debug platform lockup, num_blks shouldn't
increase past NULL blocks and will print an error and break instead.
After all, max_blks represents the static size of the maximum number of
blocks whereas the actual amount varies per platform.

^1: which can happen after a git rebase ended up moving additions to
_dpu_cfg to a different struct which has the same patch context.

Fixes: bb00a452d6f7 ("drm/msm/dpu: Refactor resource manager")
Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 5 +++++
 1 file changed, 5 insertions(+)
  

Comments

Dmitry Baryshkov Jan. 8, 2023, 11:28 p.m. UTC | #1
On 22/12/2022 01:19, Marijn Suijten wrote:
> In the event that the topology requests resources that have not been
> created by the system (because they are typically not represented in
> dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
> blocks) remain NULL but will still be returned out of
> dpu_rm_get_assigned_resources, where the caller expects to get an array
> containing num_blks valid pointers (but instead gets these NULLs).
> 
> To prevent this from happening, where null-pointer dereferences
> typically result in a hard-to-debug platform lockup, num_blks shouldn't
> increase past NULL blocks and will print an error and break instead.
> After all, max_blks represents the static size of the maximum number of
> blocks whereas the actual amount varies per platform.
> 
> ^1: which can happen after a git rebase ended up moving additions to
> _dpu_cfg to a different struct which has the same patch context.
> 
> Fixes: bb00a452d6f7 ("drm/msm/dpu: Refactor resource manager")
> Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
> ---
>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 5 +++++
>   1 file changed, 5 insertions(+)

I think the patch is not fully correct. Please check resource 
availability during allocation. I wouldn't expect an error from 
get_assigned_resources because of resource exhaustion.
  
Dmitry Baryshkov Jan. 8, 2023, 11:30 p.m. UTC | #2
On 09/01/2023 01:28, Dmitry Baryshkov wrote:
> On 22/12/2022 01:19, Marijn Suijten wrote:
>> In the event that the topology requests resources that have not been
>> created by the system (because they are typically not represented in
>> dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
>> blocks) remain NULL but will still be returned out of
>> dpu_rm_get_assigned_resources, where the caller expects to get an array
>> containing num_blks valid pointers (but instead gets these NULLs).
>>
>> To prevent this from happening, where null-pointer dereferences
>> typically result in a hard-to-debug platform lockup, num_blks shouldn't
>> increase past NULL blocks and will print an error and break instead.
>> After all, max_blks represents the static size of the maximum number of
>> blocks whereas the actual amount varies per platform.
>>
>> ^1: which can happen after a git rebase ended up moving additions to
>> _dpu_cfg to a different struct which has the same patch context.
>>
>> Fixes: bb00a452d6f7 ("drm/msm/dpu: Refactor resource manager")
>> Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
>> ---
>>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 5 +++++
>>   1 file changed, 5 insertions(+)
> 
> I think the patch is not fully correct. Please check resource 
> availability during allocation. I wouldn't expect an error from 
> get_assigned_resources because of resource exhaustion.
> 

Another option, since allocation functions (except DSC) already have 
these safety checks: check error message to mention internal 
inconstency: allocated resource doesn't exist.
  
Marijn Suijten Jan. 9, 2023, 8:23 a.m. UTC | #3
On 2023-01-09 01:30:29, Dmitry Baryshkov wrote:
> On 09/01/2023 01:28, Dmitry Baryshkov wrote:
> > On 22/12/2022 01:19, Marijn Suijten wrote:
> >> In the event that the topology requests resources that have not been
> >> created by the system (because they are typically not represented in
> >> dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
> >> blocks) remain NULL but will still be returned out of
> >> dpu_rm_get_assigned_resources, where the caller expects to get an array
> >> containing num_blks valid pointers (but instead gets these NULLs).
> >>
> >> To prevent this from happening, where null-pointer dereferences
> >> typically result in a hard-to-debug platform lockup, num_blks shouldn't
> >> increase past NULL blocks and will print an error and break instead.
> >> After all, max_blks represents the static size of the maximum number of
> >> blocks whereas the actual amount varies per platform.
> >>
> >> ^1: which can happen after a git rebase ended up moving additions to
> >> _dpu_cfg to a different struct which has the same patch context.
> >>
> >> Fixes: bb00a452d6f7 ("drm/msm/dpu: Refactor resource manager")
> >> Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
> >> ---
> >>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 5 +++++
> >>   1 file changed, 5 insertions(+)
> > 
> > I think the patch is not fully correct. Please check resource 
> > availability during allocation. I wouldn't expect an error from 
> > get_assigned_resources because of resource exhaustion.

Theoretically patch 5/8 should take care of this, and we should never
reach this failure condition.  Emphasis on /should/, this may happen
again if/when another block type is added with sub-par resource
allocation and assignment implementation.

> Another option, since allocation functions (except DSC) already have 
> these safety checks: check error message to mention internal 
> inconstency: allocated resource doesn't exist.

Is this a suggestion for the wording of the error message?

- Marijn
  
Dmitry Baryshkov Jan. 9, 2023, 9:06 a.m. UTC | #4
On Mon, 9 Jan 2023 at 10:24, Marijn Suijten
<marijn.suijten@somainline.org> wrote:
>
> On 2023-01-09 01:30:29, Dmitry Baryshkov wrote:
> > On 09/01/2023 01:28, Dmitry Baryshkov wrote:
> > > On 22/12/2022 01:19, Marijn Suijten wrote:
> > >> In the event that the topology requests resources that have not been
> > >> created by the system (because they are typically not represented in
> > >> dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
> > >> blocks) remain NULL but will still be returned out of
> > >> dpu_rm_get_assigned_resources, where the caller expects to get an array
> > >> containing num_blks valid pointers (but instead gets these NULLs).
> > >>
> > >> To prevent this from happening, where null-pointer dereferences
> > >> typically result in a hard-to-debug platform lockup, num_blks shouldn't
> > >> increase past NULL blocks and will print an error and break instead.
> > >> After all, max_blks represents the static size of the maximum number of
> > >> blocks whereas the actual amount varies per platform.
> > >>
> > >> ^1: which can happen after a git rebase ended up moving additions to
> > >> _dpu_cfg to a different struct which has the same patch context.
> > >>
> > >> Fixes: bb00a452d6f7 ("drm/msm/dpu: Refactor resource manager")
> > >> Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
> > >> ---
> > >>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 5 +++++
> > >>   1 file changed, 5 insertions(+)
> > >
> > > I think the patch is not fully correct. Please check resource
> > > availability during allocation. I wouldn't expect an error from
> > > get_assigned_resources because of resource exhaustion.
>
> Theoretically patch 5/8 should take care of this, and we should never
> reach this failure condition.  Emphasis on /should/, this may happen
> again if/when another block type is added with sub-par resource
> allocation and assignment implementation.

Yeah. Maybe swapping 4/8 and 5/8 makes sense.

>
> > Another option, since allocation functions (except DSC) already have
> > these safety checks: check error message to mention internal
> > inconstency: allocated resource doesn't exist.
>
> Is this a suggestion for the wording of the error message?

Yes. Because the current message makes one think that it is output
during allocation / assignment to encoder, while this is a safety net.

>
> - Marijn




--
With best wishes
Dmitry
  
Marijn Suijten Jan. 9, 2023, 5:12 p.m. UTC | #5
On 2023-01-09 11:06:45, Dmitry Baryshkov wrote:
> On Mon, 9 Jan 2023 at 10:24, Marijn Suijten
> <marijn.suijten@somainline.org> wrote:
> >
> > On 2023-01-09 01:30:29, Dmitry Baryshkov wrote:
> > > On 09/01/2023 01:28, Dmitry Baryshkov wrote:
> > > > On 22/12/2022 01:19, Marijn Suijten wrote:
> > > >> In the event that the topology requests resources that have not been
> > > >> created by the system (because they are typically not represented in
> > > >> dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
> > > >> blocks) remain NULL but will still be returned out of
> > > >> dpu_rm_get_assigned_resources, where the caller expects to get an array
> > > >> containing num_blks valid pointers (but instead gets these NULLs).
> > > >>
> > > >> To prevent this from happening, where null-pointer dereferences
> > > >> typically result in a hard-to-debug platform lockup, num_blks shouldn't
> > > >> increase past NULL blocks and will print an error and break instead.
> > > >> After all, max_blks represents the static size of the maximum number of
> > > >> blocks whereas the actual amount varies per platform.
> > > >>
> > > >> ^1: which can happen after a git rebase ended up moving additions to
> > > >> _dpu_cfg to a different struct which has the same patch context.
> > > >>
> > > >> Fixes: bb00a452d6f7 ("drm/msm/dpu: Refactor resource manager")
> > > >> Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
> > > >> ---
> > > >>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 5 +++++
> > > >>   1 file changed, 5 insertions(+)
> > > >
> > > > I think the patch is not fully correct. Please check resource
> > > > availability during allocation. I wouldn't expect an error from
> > > > get_assigned_resources because of resource exhaustion.
> >
> > Theoretically patch 5/8 should take care of this, and we should never
> > reach this failure condition.  Emphasis on /should/, this may happen
> > again if/when another block type is added with sub-par resource
> > allocation and assignment implementation.
> 
> Yeah. Maybe swapping 4/8 and 5/8 makes sense.

Ack.

> > > Another option, since allocation functions (except DSC) already have
> > > these safety checks: check error message to mention internal
> > > inconstency: allocated resource doesn't exist.
> >
> > Is this a suggestion for the wording of the error message?
> 
> Yes. Because the current message makes one think that it is output
> during allocation / assignment to encoder, while this is a safety net.

Good.  So the patch is correct, just the wording is off, which I fully
agree on.  This isn't allocating anything, just handing out what was
previously allocated (and is a safety net).

- Marijn
  
Dmitry Baryshkov Jan. 9, 2023, 6:24 p.m. UTC | #6
On 09/01/2023 19:12, Marijn Suijten wrote:
> On 2023-01-09 11:06:45, Dmitry Baryshkov wrote:
>> On Mon, 9 Jan 2023 at 10:24, Marijn Suijten
>> <marijn.suijten@somainline.org> wrote:
>>>
>>> On 2023-01-09 01:30:29, Dmitry Baryshkov wrote:
>>>> On 09/01/2023 01:28, Dmitry Baryshkov wrote:
>>>>> On 22/12/2022 01:19, Marijn Suijten wrote:
>>>>>> In the event that the topology requests resources that have not been
>>>>>> created by the system (because they are typically not represented in
>>>>>> dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
>>>>>> blocks) remain NULL but will still be returned out of
>>>>>> dpu_rm_get_assigned_resources, where the caller expects to get an array
>>>>>> containing num_blks valid pointers (but instead gets these NULLs).
>>>>>>
>>>>>> To prevent this from happening, where null-pointer dereferences
>>>>>> typically result in a hard-to-debug platform lockup, num_blks shouldn't
>>>>>> increase past NULL blocks and will print an error and break instead.
>>>>>> After all, max_blks represents the static size of the maximum number of
>>>>>> blocks whereas the actual amount varies per platform.
>>>>>>
>>>>>> ^1: which can happen after a git rebase ended up moving additions to
>>>>>> _dpu_cfg to a different struct which has the same patch context.
>>>>>>
>>>>>> Fixes: bb00a452d6f7 ("drm/msm/dpu: Refactor resource manager")
>>>>>> Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
>>>>>> ---
>>>>>>    drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 5 +++++
>>>>>>    1 file changed, 5 insertions(+)
>>>>>
>>>>> I think the patch is not fully correct. Please check resource
>>>>> availability during allocation. I wouldn't expect an error from
>>>>> get_assigned_resources because of resource exhaustion.
>>>
>>> Theoretically patch 5/8 should take care of this, and we should never
>>> reach this failure condition.  Emphasis on /should/, this may happen
>>> again if/when another block type is added with sub-par resource
>>> allocation and assignment implementation.
>>
>> Yeah. Maybe swapping 4/8 and 5/8 makes sense.
> 
> Ack.
> 
>>>> Another option, since allocation functions (except DSC) already have
>>>> these safety checks: check error message to mention internal
>>>> inconstency: allocated resource doesn't exist.
>>>
>>> Is this a suggestion for the wording of the error message?
>>
>> Yes. Because the current message makes one think that it is output
>> during allocation / assignment to encoder, while this is a safety net.
> 
> Good.  So the patch is correct, just the wording is off, which I fully
> agree on.  This isn't allocating anything, just handing out what was
> previously allocated (and is a safety net).

Yes. Please excuse me if my original message was not 100% clear.

> 
> - Marijn
  

Patch

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index 73b3442e7467..8471d04bff50 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -660,6 +660,11 @@  int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
 				  blks_size, enc_id);
 			break;
 		}
+		if (!hw_blks[i]) {
+			DPU_ERROR("No more resource %d available to assign to enc %d\n",
+				  type, enc_id);
+			break;
+		}
 		blks[num_blks++] = hw_blks[i];
 	}