[RFC,6/6] drm/msm/dpu: Disallow unallocated (DSC) resources to be returned

Message ID 20221213232207.113607-7-marijn.suijten@somainline.org
State New
Headers
Series drm/msm: DSC Electric Boogaloo for sm8[12]50 |

Commit Message

Marijn Suijten Dec. 13, 2022, 11:22 p.m. UTC
  In the event that the topology requests resources that have not been
created by the system (because they are typically not represented in
dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
blocks) remain NULL but will still be returned out of
dpu_rm_get_assigned_resources, where the caller expects to get an array
containing num_blks valid pointers (but instead gets these NULLs).

To prevent this from happening, where null-pointer dereferences
typically result in a hard-to-debug platform lockup, num_blks shouldn't
increase past NULL blocks and will print an error and break instead.
After all, max_blks represents the static size of the maximum number of
blocks whereas the actual amount varies per platform.

In the specific case of DSC initial resource allocation should behave
more like LMs and CTLs where NULL resources are skipped.  The current
hardcoded mapping of DSC blocks should be loosened separately as DPU
5.0.0 introduced a crossbar where DSC blocks can be "somewhat" freely
bound to any PP and CTL, but that hardcoding currently means that we
will return an error when the topology reserves a DSC that isn't
available, instead of looking for the next free one.

^1: which can happen after a git rebase ended up moving additions to
_dpu_cfg to a different struct which has the same patch context.

Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
---
 drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 10 ++++++++++
 1 file changed, 10 insertions(+)
  

Comments

Dmitry Baryshkov Dec. 14, 2022, 6:56 p.m. UTC | #1
On 14/12/2022 01:22, Marijn Suijten wrote:
> In the event that the topology requests resources that have not been
> created by the system (because they are typically not represented in
> dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
> blocks) remain NULL but will still be returned out of
> dpu_rm_get_assigned_resources, where the caller expects to get an array
> containing num_blks valid pointers (but instead gets these NULLs).
> 
> To prevent this from happening, where null-pointer dereferences
> typically result in a hard-to-debug platform lockup, num_blks shouldn't
> increase past NULL blocks and will print an error and break instead.
> After all, max_blks represents the static size of the maximum number of
> blocks whereas the actual amount varies per platform.
> 
> In the specific case of DSC initial resource allocation should behave
> more like LMs and CTLs where NULL resources are skipped.  The current
> hardcoded mapping of DSC blocks should be loosened separately as DPU
> 5.0.0 introduced a crossbar where DSC blocks can be "somewhat" freely
> bound to any PP and CTL, but that hardcoding currently means that we
> will return an error when the topology reserves a DSC that isn't
> available, instead of looking for the next free one.
> 
> ^1: which can happen after a git rebase ended up moving additions to
> _dpu_cfg to a different struct which has the same patch context.
> 
> Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
> ---
>   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 10 ++++++++++
>   1 file changed, 10 insertions(+)
> 
> diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> index 73b3442e7467..dcbf03d2940a 100644
> --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> @@ -496,6 +496,11 @@ static int _dpu_rm_reserve_dsc(struct dpu_rm *rm,
>   
>   	/* check if DSC required are allocated or not */
>   	for (i = 0; i < num_dsc; i++) {
> +		if (!rm->dsc_blks[i]) {
> +			DPU_ERROR("DSC %d does not exist\n", i);
> +			return -EIO;
> +		}
> +
>   		if (global_state->dsc_to_enc_id[i]) {
>   			DPU_ERROR("DSC %d is already allocated\n", i);
>   			return -EIO;
> @@ -660,6 +665,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
>   				  blks_size, enc_id);
>   			break;
>   		}
> +		if (!hw_blks[i]) {
> +			DPU_ERROR("No more resource %d available to assign to enc %d\n",
> +				  type, enc_id);
> +			break;
> +		}
>   		blks[num_blks++] = hw_blks[i];
>   	}
>  

These two chunks should come as two separate patches, each having it's 
own Fixes tag.
  
Marijn Suijten Dec. 14, 2022, 7:31 p.m. UTC | #2
On 2022-12-14 20:56:30, Dmitry Baryshkov wrote:
> On 14/12/2022 01:22, Marijn Suijten wrote:
> > In the event that the topology requests resources that have not been
> > created by the system (because they are typically not represented in
> > dpu_mdss_cfg ^1), the resource(s) in global_state (in this case DSC
> > blocks) remain NULL but will still be returned out of
> > dpu_rm_get_assigned_resources, where the caller expects to get an array
> > containing num_blks valid pointers (but instead gets these NULLs).
> > 
> > To prevent this from happening, where null-pointer dereferences
> > typically result in a hard-to-debug platform lockup, num_blks shouldn't
> > increase past NULL blocks and will print an error and break instead.
> > After all, max_blks represents the static size of the maximum number of
> > blocks whereas the actual amount varies per platform.
> > 
> > In the specific case of DSC initial resource allocation should behave
> > more like LMs and CTLs where NULL resources are skipped.  The current
> > hardcoded mapping of DSC blocks should be loosened separately as DPU
> > 5.0.0 introduced a crossbar where DSC blocks can be "somewhat" freely
> > bound to any PP and CTL, but that hardcoding currently means that we
> > will return an error when the topology reserves a DSC that isn't
> > available, instead of looking for the next free one.
> > 
> > ^1: which can happen after a git rebase ended up moving additions to
> > _dpu_cfg to a different struct which has the same patch context.
> > 
> > Signed-off-by: Marijn Suijten <marijn.suijten@somainline.org>
> > ---
> >   drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c | 10 ++++++++++
> >   1 file changed, 10 insertions(+)
> > 
> > diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > index 73b3442e7467..dcbf03d2940a 100644
> > --- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > +++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
> > @@ -496,6 +496,11 @@ static int _dpu_rm_reserve_dsc(struct dpu_rm *rm,
> >   
> >   	/* check if DSC required are allocated or not */
> >   	for (i = 0; i < num_dsc; i++) {
> > +		if (!rm->dsc_blks[i]) {
> > +			DPU_ERROR("DSC %d does not exist\n", i);
> > +			return -EIO;
> > +		}
> > +
> >   		if (global_state->dsc_to_enc_id[i]) {
> >   			DPU_ERROR("DSC %d is already allocated\n", i);
> >   			return -EIO;
> > @@ -660,6 +665,11 @@ int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
> >   				  blks_size, enc_id);
> >   			break;
> >   		}
> > +		if (!hw_blks[i]) {
> > +			DPU_ERROR("No more resource %d available to assign to enc %d\n",
> > +				  type, enc_id);
> > +			break;
> > +		}
> >   		blks[num_blks++] = hw_blks[i];
> >   	}
> >  
> 
> These two chunks should come as two separate patches, each having it's 
> own Fixes tag.

Ack.  They are indeed addressing different issues (with the same
outcome) with differing "backportability".  Will address in v2, thanks
for pointing it out (and missing a Fixes: in the first place, of which
we already have so many...).

- Marijn
  

Patch

diff --git a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
index 73b3442e7467..dcbf03d2940a 100644
--- a/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
+++ b/drivers/gpu/drm/msm/disp/dpu1/dpu_rm.c
@@ -496,6 +496,11 @@  static int _dpu_rm_reserve_dsc(struct dpu_rm *rm,
 
 	/* check if DSC required are allocated or not */
 	for (i = 0; i < num_dsc; i++) {
+		if (!rm->dsc_blks[i]) {
+			DPU_ERROR("DSC %d does not exist\n", i);
+			return -EIO;
+		}
+
 		if (global_state->dsc_to_enc_id[i]) {
 			DPU_ERROR("DSC %d is already allocated\n", i);
 			return -EIO;
@@ -660,6 +665,11 @@  int dpu_rm_get_assigned_resources(struct dpu_rm *rm,
 				  blks_size, enc_id);
 			break;
 		}
+		if (!hw_blks[i]) {
+			DPU_ERROR("No more resource %d available to assign to enc %d\n",
+				  type, enc_id);
+			break;
+		}
 		blks[num_blks++] = hw_blks[i];
 	}