[5.10,384/390] Revert "drm/amdgpu: move nbio sdma_doorbell_range() into sdma code for vega"

Message ID 20221024113039.334437223@linuxfoundation.org
State New

Commit Message

Greg KH Oct. 24, 2022, 11:33 a.m. UTC
  From: Shuah Khan <skhan@linuxfoundation.org>

This reverts commit 9f55f36f749a7608eeef57d7d72991a9bd557341 which is
commit e3163bc8ffdfdb405e10530b140135b2ee487f89 upstream.

This commit causes repeated WARN_ONs from

drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7391 amdgpu_dm_atomic_commit_tail+0x23b9/0x2430 [amdgpu]

dmesg fills up with these warnings, and drm initialization takes
a very long time.

Cc: <stable@vger.kernel.org>    # 5.10
Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |    5 -----
 drivers/gpu/drm/amd/amdgpu/soc15.c     |   25 +++++++++++++++++++++++++
 2 files changed, 25 insertions(+), 5 deletions(-)
  

Comments

Salvatore Bonaccorso Oct. 25, 2022, 9:02 a.m. UTC | #1
Hi Greg,

On Mon, Oct 24, 2022 at 01:33:01PM +0200, Greg Kroah-Hartman wrote:
> From: Shuah Khan <skhan@linuxfoundation.org>
> 
> This reverts commit 9f55f36f749a7608eeef57d7d72991a9bd557341 which is
> commit e3163bc8ffdfdb405e10530b140135b2ee487f89 upstream.
> 
> This commit causes repeated WARN_ONs from
> 
> drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7391 amdgpu_dm_atomic_commit_tail+0x23b9/0x2430 [amdgpu]
> 
> dmesg fills up with these warnings, and drm initialization takes
> a very long time.
> 
> Cc: <stable@vger.kernel.org>    # 5.10
> Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
>  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |    5 -----
>  drivers/gpu/drm/amd/amdgpu/soc15.c     |   25 +++++++++++++++++++++++++
>  2 files changed, 25 insertions(+), 5 deletions(-)
> 
> --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> @@ -1475,11 +1475,6 @@ static int sdma_v4_0_start(struct amdgpu
>  		WREG32_SDMA(i, mmSDMA0_CNTL, temp);
>  
>  		if (!amdgpu_sriov_vf(adev)) {
> -			ring = &adev->sdma.instance[i].ring;
> -			adev->nbio.funcs->sdma_doorbell_range(adev, i,
> -				ring->use_doorbell, ring->doorbell_index,
> -				adev->doorbell_index.sdma_doorbell_range);
> -
>  			/* unhalt engine */
>  			temp = RREG32_SDMA(i, mmSDMA0_F32_CNTL);
>  			temp = REG_SET_FIELD(temp, SDMA0_F32_CNTL, HALT, 0);
> --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> @@ -1332,6 +1332,25 @@ static int soc15_common_sw_fini(void *ha
>  	return 0;
>  }
>  
> +static void soc15_doorbell_range_init(struct amdgpu_device *adev)
> +{
> +	int i;
> +	struct amdgpu_ring *ring;
> +
> +	/* sdma/ih doorbell range are programed by hypervisor */
> +	if (!amdgpu_sriov_vf(adev)) {
> +		for (i = 0; i < adev->sdma.num_instances; i++) {
> +			ring = &adev->sdma.instance[i].ring;
> +			adev->nbio.funcs->sdma_doorbell_range(adev, i,
> +				ring->use_doorbell, ring->doorbell_index,
> +				adev->doorbell_index.sdma_doorbell_range);
> +		}
> +
> +		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> +						adev->irq.ih.doorbell_index);
> +	}
> +}
> +
>  static int soc15_common_hw_init(void *handle)
>  {
>  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> @@ -1351,6 +1370,12 @@ static int soc15_common_hw_init(void *ha
>  
>  	/* enable the doorbell aperture */
>  	soc15_enable_doorbell_aperture(adev, true);
> +	/* HW doorbell routing policy: doorbell writing not
> +	 * in SDMA/IH/MM/ACV range will be routed to CP. So
> +	 * we need to init SDMA/IH/MM/ACV doorbell range prior
> +	 * to CP ip block init and ring test.
> +	 */
> +	soc15_doorbell_range_init(adev);
>  
>  	return 0;
>  }

Can you please also revert 7b0db849ea030a70b8fb9c9afec67c81f955482e
on top?

See https://lore.kernel.org/stable/BL1PR12MB5144F3CC640A18DF0C36E414F72E9@BL1PR12MB5144.namprd12.prod.outlook.com/

Both of these reverts need to be applied to fix regressions which were
reported in https://gitlab.freedesktop.org/drm/amd/-/issues/2216 and
downstream in Debian (https://bugs.debian.org/1022025).

If that is no longer possible for 5.10.150, can you pick up the revert
for 5.10.151?

Regards,
Salvatore
  
Greg KH Oct. 25, 2022, 2:20 p.m. UTC | #2
On Tue, Oct 25, 2022 at 11:02:33AM +0200, Salvatore Bonaccorso wrote:
> Hi Greg,
> 
> On Mon, Oct 24, 2022 at 01:33:01PM +0200, Greg Kroah-Hartman wrote:
> > From: Shuah Khan <skhan@linuxfoundation.org>
> > 
> > This reverts commit 9f55f36f749a7608eeef57d7d72991a9bd557341 which is
> > commit e3163bc8ffdfdb405e10530b140135b2ee487f89 upstream.
> > 
> > This commit causes repeated WARN_ONs from
> > 
> > drivers/gpu/drm/amd/amdgpu/../display/amdgpu_dm/amdgpu_dm.c:7391 amdgpu_dm_atomic_commit_tail+0x23b9/0x2430 [amdgpu]
> > 
> > dmesg fills up with these warnings, and drm initialization takes
> > a very long time.
> > 
> > Cc: <stable@vger.kernel.org>    # 5.10
> > Signed-off-by: Shuah Khan <skhan@linuxfoundation.org>
> > Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > ---
> >  drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c |    5 -----
> >  drivers/gpu/drm/amd/amdgpu/soc15.c     |   25 +++++++++++++++++++++++++
> >  2 files changed, 25 insertions(+), 5 deletions(-)
> > 
> > --- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
> > @@ -1475,11 +1475,6 @@ static int sdma_v4_0_start(struct amdgpu
> >  		WREG32_SDMA(i, mmSDMA0_CNTL, temp);
> >  
> >  		if (!amdgpu_sriov_vf(adev)) {
> > -			ring = &adev->sdma.instance[i].ring;
> > -			adev->nbio.funcs->sdma_doorbell_range(adev, i,
> > -				ring->use_doorbell, ring->doorbell_index,
> > -				adev->doorbell_index.sdma_doorbell_range);
> > -
> >  			/* unhalt engine */
> >  			temp = RREG32_SDMA(i, mmSDMA0_F32_CNTL);
> >  			temp = REG_SET_FIELD(temp, SDMA0_F32_CNTL, HALT, 0);
> > --- a/drivers/gpu/drm/amd/amdgpu/soc15.c
> > +++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
> > @@ -1332,6 +1332,25 @@ static int soc15_common_sw_fini(void *ha
> >  	return 0;
> >  }
> >  
> > +static void soc15_doorbell_range_init(struct amdgpu_device *adev)
> > +{
> > +	int i;
> > +	struct amdgpu_ring *ring;
> > +
> > +	/* sdma/ih doorbell range are programed by hypervisor */
> > +	if (!amdgpu_sriov_vf(adev)) {
> > +		for (i = 0; i < adev->sdma.num_instances; i++) {
> > +			ring = &adev->sdma.instance[i].ring;
> > +			adev->nbio.funcs->sdma_doorbell_range(adev, i,
> > +				ring->use_doorbell, ring->doorbell_index,
> > +				adev->doorbell_index.sdma_doorbell_range);
> > +		}
> > +
> > +		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
> > +						adev->irq.ih.doorbell_index);
> > +	}
> > +}
> > +
> >  static int soc15_common_hw_init(void *handle)
> >  {
> >  	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
> > @@ -1351,6 +1370,12 @@ static int soc15_common_hw_init(void *ha
> >  
> >  	/* enable the doorbell aperture */
> >  	soc15_enable_doorbell_aperture(adev, true);
> > +	/* HW doorbell routing policy: doorbell writing not
> > +	 * in SDMA/IH/MM/ACV range will be routed to CP. So
> > +	 * we need to init SDMA/IH/MM/ACV doorbell range prior
> > +	 * to CP ip block init and ring test.
> > +	 */
> > +	soc15_doorbell_range_init(adev);
> >  
> >  	return 0;
> >  }
> 
> Can you please also revert 7b0db849ea030a70b8fb9c9afec67c81f955482e
> on top?
> 
> See https://lore.kernel.org/stable/BL1PR12MB5144F3CC640A18DF0C36E414F72E9@BL1PR12MB5144.namprd12.prod.outlook.com/
> 
> Both of these reverts need to be applied to fix regressions which were
> reported in https://gitlab.freedesktop.org/drm/amd/-/issues/2216 and
> downstream in Debian (https://bugs.debian.org/1022025).
> 
> If that is no longer possible for 5.10.150, can you pick up the revert
> for 5.10.151?

Now queued up.

greg k-h
  

Patch

--- a/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/sdma_v4_0.c
@@ -1475,11 +1475,6 @@  static int sdma_v4_0_start(struct amdgpu
 		WREG32_SDMA(i, mmSDMA0_CNTL, temp);
 
 		if (!amdgpu_sriov_vf(adev)) {
-			ring = &adev->sdma.instance[i].ring;
-			adev->nbio.funcs->sdma_doorbell_range(adev, i,
-				ring->use_doorbell, ring->doorbell_index,
-				adev->doorbell_index.sdma_doorbell_range);
-
 			/* unhalt engine */
 			temp = RREG32_SDMA(i, mmSDMA0_F32_CNTL);
 			temp = REG_SET_FIELD(temp, SDMA0_F32_CNTL, HALT, 0);
--- a/drivers/gpu/drm/amd/amdgpu/soc15.c
+++ b/drivers/gpu/drm/amd/amdgpu/soc15.c
@@ -1332,6 +1332,25 @@  static int soc15_common_sw_fini(void *ha
 	return 0;
 }
 
+static void soc15_doorbell_range_init(struct amdgpu_device *adev)
+{
+	int i;
+	struct amdgpu_ring *ring;
+
+	/* sdma/ih doorbell range are programed by hypervisor */
+	if (!amdgpu_sriov_vf(adev)) {
+		for (i = 0; i < adev->sdma.num_instances; i++) {
+			ring = &adev->sdma.instance[i].ring;
+			adev->nbio.funcs->sdma_doorbell_range(adev, i,
+				ring->use_doorbell, ring->doorbell_index,
+				adev->doorbell_index.sdma_doorbell_range);
+		}
+
+		adev->nbio.funcs->ih_doorbell_range(adev, adev->irq.ih.use_doorbell,
+						adev->irq.ih.doorbell_index);
+	}
+}
+
 static int soc15_common_hw_init(void *handle)
 {
 	struct amdgpu_device *adev = (struct amdgpu_device *)handle;
@@ -1351,6 +1370,12 @@  static int soc15_common_hw_init(void *ha
 
 	/* enable the doorbell aperture */
 	soc15_enable_doorbell_aperture(adev, true);
+	/* HW doorbell routing policy: doorbell writing not
+	 * in SDMA/IH/MM/ACV range will be routed to CP. So
+	 * we need to init SDMA/IH/MM/ACV doorbell range prior
+	 * to CP ip block init and ring test.
+	 */
+	soc15_doorbell_range_init(adev);
 
 	return 0;
 }
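
The routing policy described in the soc15_common_hw_init() comment can be
illustrated with a small user-space model. This is only a sketch under stated
assumptions, not driver code: the types and functions (db_range,
route_doorbell, program_ranges, doorbell_model_selfcheck) and the doorbell
index values are hypothetical and do not exist in amdgpu. It models one idea
from the comment: a doorbell write that does not fall inside a programmed
SDMA/IH range is routed to CP, so the ranges have to be programmed before
anything rings those doorbells.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the doorbell clients named in the patch comment. */
enum db_client { DB_CLIENT_CP, DB_CLIENT_SDMA, DB_CLIENT_IH };

/* One doorbell range; "programmed" stands in for the NBIO range register write. */
struct db_range {
	enum db_client client;
	bool programmed;
	unsigned int start;	/* first doorbell index of the range (made-up value) */
	unsigned int size;	/* number of doorbell indices in the range */
};

/* Route a doorbell write: anything outside a programmed range goes to CP. */
enum db_client route_doorbell(const struct db_range *ranges, size_t n,
			      unsigned int index)
{
	for (size_t i = 0; i < n; i++) {
		const struct db_range *r = &ranges[i];

		if (r->programmed && index >= r->start &&
		    index - r->start < r->size)
			return r->client;
	}
	return DB_CLIENT_CP;
}

/* Stand-in for what soc15_doorbell_range_init() achieves: mark ranges as set up. */
void program_ranges(struct db_range *ranges, size_t n)
{
	for (size_t i = 0; i < n; i++)
		ranges[i].programmed = true;
}

/* Walk through the before/after behaviour with illustrative index values. */
void doorbell_model_selfcheck(void)
{
	struct db_range ranges[] = {
		{ DB_CLIENT_SDMA, false, 0x100, 4 },
		{ DB_CLIENT_IH,   false, 0x200, 2 },
	};
	size_t n = sizeof(ranges) / sizeof(ranges[0]);

	/* Before the ranges are programmed, an SDMA doorbell lands at CP. */
	assert(route_doorbell(ranges, n, 0x100) == DB_CLIENT_CP);

	program_ranges(ranges, n);

	/* After programming, each doorbell reaches its own client. */
	assert(route_doorbell(ranges, n, 0x100) == DB_CLIENT_SDMA);
	assert(route_doorbell(ranges, n, 0x201) == DB_CLIENT_IH);
	/* Indices outside every range still fall through to CP. */
	assert(route_doorbell(ranges, n, 0x300) == DB_CLIENT_CP);
}
```

Calling doorbell_model_selfcheck() from a main() exercises the model. In this
sketch, ordering matters only because the first lookup happens before
program_ranges() runs, which mirrors the comment's requirement that the ranges
be initialized prior to CP ip block init and ring test.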