[v2,06/11] firmware: qcom: scm: make qcom_scm_pas_init_image() use the SCM allocator

Message ID 20230928092040.9420-7-brgl@bgdev.pl
State New
Headers
Series arm64: qcom: add and enable SHM Bridge support |

Commit Message

Bartosz Golaszewski Sept. 28, 2023, 9:20 a.m. UTC
  From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>

Let's use the new SCM memory allocator to obtain a buffer for this call
instead of using dma_alloc_coherent().

Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
---
 drivers/firmware/qcom/qcom_scm.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)
  

Comments

Andrew Halaney Sept. 29, 2023, 7:16 p.m. UTC | #1
On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote:
> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> 
> Let's use the new SCM memory allocator to obtain a buffer for this call
> instead of using dma_alloc_coherent().
> 
> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> ---
>  drivers/firmware/qcom/qcom_scm.c | 16 +++++-----------
>  1 file changed, 5 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> index 02a773ba1383..c0eb81069847 100644
> --- a/drivers/firmware/qcom/qcom_scm.c
> +++ b/drivers/firmware/qcom/qcom_scm.c
> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable)
>  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>  			    struct qcom_scm_pas_metadata *ctx)
>  {
> -	dma_addr_t mdata_phys;
> +	phys_addr_t mdata_phys;

>  	void *mdata_buf;
>  	int ret;
>  	struct qcom_scm_desc desc = {
> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>  	};
>  	struct qcom_scm_res res;
>  
> -	/*
> -	 * During the scm call memory protection will be enabled for the meta
> -	 * data blob, so make sure it's physically contiguous, 4K aligned and
> -	 * non-cachable to avoid XPU violations.
> -	 */
> -	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
> -				       GFP_KERNEL);
> +	mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL);

mdata_phys is never initialized now, and its what's being shoved into
desc.args[1] later, which I believe is what triggered the -EINVAL
with qcom_scm_call() that I reported in my cover letter reply this
morning.

Prior with the DMA API that would have been the device view of the buffer.

>  	if (!mdata_buf) {
>  		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
>  		return -ENOMEM;
> @@ -574,10 +568,10 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>  
>  out:
>  	if (ret < 0 || !ctx) {
> -		dma_free_coherent(__scm->dev, size, mdata_buf, mdata_phys);
> +		qcom_scm_mem_free(mdata_buf);
>  	} else if (ctx) {
>  		ctx->ptr = mdata_buf;
> -		ctx->phys = mdata_phys;
> +		ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
>  		ctx->size = size;
>  	}
>  
> @@ -594,7 +588,7 @@ void qcom_scm_pas_metadata_release(struct qcom_scm_pas_metadata *ctx)
>  	if (!ctx->ptr)
>  		return;
>  
> -	dma_free_coherent(__scm->dev, ctx->size, ctx->ptr, ctx->phys);
> +	qcom_scm_mem_free(ctx->ptr);
>  
>  	ctx->ptr = NULL;
>  	ctx->phys = 0;
> -- 
> 2.39.2
>
  
Bartosz Golaszewski Sept. 29, 2023, 7:22 p.m. UTC | #2
On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said:
> On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote:
>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>
>> Let's use the new SCM memory allocator to obtain a buffer for this call
>> instead of using dma_alloc_coherent().
>>
>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>> ---
>>  drivers/firmware/qcom/qcom_scm.c | 16 +++++-----------
>>  1 file changed, 5 insertions(+), 11 deletions(-)
>>
>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
>> index 02a773ba1383..c0eb81069847 100644
>> --- a/drivers/firmware/qcom/qcom_scm.c
>> +++ b/drivers/firmware/qcom/qcom_scm.c
>> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable)
>>  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>>  			    struct qcom_scm_pas_metadata *ctx)
>>  {
>> -	dma_addr_t mdata_phys;
>> +	phys_addr_t mdata_phys;
>
>>  	void *mdata_buf;
>>  	int ret;
>>  	struct qcom_scm_desc desc = {
>> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>>  	};
>>  	struct qcom_scm_res res;
>>
>> -	/*
>> -	 * During the scm call memory protection will be enabled for the meta
>> -	 * data blob, so make sure it's physically contiguous, 4K aligned and
>> -	 * non-cachable to avoid XPU violations.
>> -	 */
>> -	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
>> -				       GFP_KERNEL);
>> +	mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL);
>
> mdata_phys is never initialized now, and its what's being shoved into
> desc.args[1] later, which I believe is what triggered the -EINVAL
> with qcom_scm_call() that I reported in my cover letter reply this
> morning.
>
> Prior with the DMA API that would have been the device view of the buffer.
>

Gah! Thanks for finding this.

Can you try the following diff?

diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
index 794388c3212f..b0d4ea237034 100644
--- a/drivers/firmware/qcom/qcom_scm.c
+++ b/drivers/firmware/qcom/qcom_scm.c
@@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
void *metadata, size_t size,
 		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
 		return -ENOMEM;
 	}
+	mdata_phys = qcom_scm_mem_to_phys(mdata_buf);
 	memcpy(mdata_buf, metadata, size);

 	ret = qcom_scm_clk_enable();
@@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
void *metadata, size_t size,
 		qcom_scm_mem_free(mdata_buf);
 	} else if (ctx) {
 		ctx->ptr = mdata_buf;
-		ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
+		ctx->phys = mdata_phys;
 		ctx->size = size;
 	}

Bart

>>  	if (!mdata_buf) {
>>  		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
>>  		return -ENOMEM;
>> @@ -574,10 +568,10 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>>
>>  out:
>>  	if (ret < 0 || !ctx) {
>> -		dma_free_coherent(__scm->dev, size, mdata_buf, mdata_phys);
>> +		qcom_scm_mem_free(mdata_buf);
>>  	} else if (ctx) {
>>  		ctx->ptr = mdata_buf;
>> -		ctx->phys = mdata_phys;
>> +		ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
>>  		ctx->size = size;
>>  	}
>>
>> @@ -594,7 +588,7 @@ void qcom_scm_pas_metadata_release(struct qcom_scm_pas_metadata *ctx)
>>  	if (!ctx->ptr)
>>  		return;
>>
>> -	dma_free_coherent(__scm->dev, ctx->size, ctx->ptr, ctx->phys);
>> +	qcom_scm_mem_free(ctx->ptr);
>>
>>  	ctx->ptr = NULL;
>>  	ctx->phys = 0;
>> --
>> 2.39.2
>>
>
>
  
Andrew Halaney Sept. 29, 2023, 8:44 p.m. UTC | #3
On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote:
> On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said:
> > On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote:
> >> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >>
> >> Let's use the new SCM memory allocator to obtain a buffer for this call
> >> instead of using dma_alloc_coherent().
> >>
> >> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >> ---
> >>  drivers/firmware/qcom/qcom_scm.c | 16 +++++-----------
> >>  1 file changed, 5 insertions(+), 11 deletions(-)
> >>
> >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> >> index 02a773ba1383..c0eb81069847 100644
> >> --- a/drivers/firmware/qcom/qcom_scm.c
> >> +++ b/drivers/firmware/qcom/qcom_scm.c
> >> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable)
> >>  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
> >>  			    struct qcom_scm_pas_metadata *ctx)
> >>  {
> >> -	dma_addr_t mdata_phys;
> >> +	phys_addr_t mdata_phys;
> >
> >>  	void *mdata_buf;
> >>  	int ret;
> >>  	struct qcom_scm_desc desc = {
> >> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
> >>  	};
> >>  	struct qcom_scm_res res;
> >>
> >> -	/*
> >> -	 * During the scm call memory protection will be enabled for the meta
> >> -	 * data blob, so make sure it's physically contiguous, 4K aligned and
> >> -	 * non-cachable to avoid XPU violations.
> >> -	 */
> >> -	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
> >> -				       GFP_KERNEL);
> >> +	mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL);
> >
> > mdata_phys is never initialized now, and its what's being shoved into
> > desc.args[1] later, which I believe is what triggered the -EINVAL
> > with qcom_scm_call() that I reported in my cover letter reply this
> > morning.
> >
> > Prior with the DMA API that would have been the device view of the buffer.
> >
> 
> Gah! Thanks for finding this.
> 
> Can you try the following diff?
> 
> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> index 794388c3212f..b0d4ea237034 100644
> --- a/drivers/firmware/qcom/qcom_scm.c
> +++ b/drivers/firmware/qcom/qcom_scm.c
> @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
> void *metadata, size_t size,
>  		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
>  		return -ENOMEM;
>  	}
> +	mdata_phys = qcom_scm_mem_to_phys(mdata_buf);
>  	memcpy(mdata_buf, metadata, size);
> 
>  	ret = qcom_scm_clk_enable();
> @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
> void *metadata, size_t size,
>  		qcom_scm_mem_free(mdata_buf);
>  	} else if (ctx) {
>  		ctx->ptr = mdata_buf;
> -		ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
> +		ctx->phys = mdata_phys;
>  		ctx->size = size;
>  	}
> 
> Bart
> 

For some reason that I can't explain that is still not working. It seems
the SMC call is returning !0 and then we return -EINVAL from there
with qcom_scm_remap_error().

Here's a really crummy diff of what I hacked in during lunch to debug (don't
judge my primitive debug skills):

diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c
index 0d5554df1321..56eab0ae5f3a 100644
--- a/drivers/firmware/qcom/qcom_scm-smc.c
+++ b/drivers/firmware/qcom/qcom_scm-smc.c
@@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
        struct arm_smccc_res smc_res;
        struct arm_smccc_args smc = {0};
 
+       dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__);
+
        smc.args[0] = ARM_SMCCC_CALL_VAL(
                smccc_call_type,
                qcom_smccc_convention,
@@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
        if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) {
                alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64);
                args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag);
+               dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__);
 
                if (!args_virt)
                        return -ENOMEM;
@@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
 
        /* ret error check follows after args_virt cleanup*/
        ret = __scm_smc_do(dev, &smc, &smc_res, atomic);
+       dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret);
 
        if (ret)
                return ret;
@@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
                res->result[0] = smc_res.a1;
                res->result[1] = smc_res.a2;
                res->result[2] = smc_res.a3;
+               dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]);
        }
 
+       dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0);
        return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0;


And that all spams dmesg successfully for most cases, but the
pas_init_image calls log this out:

[   16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc
[   16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808
[   16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function
[   16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0
[   16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0
[   16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558

At the moment I am unsure why...
  
Elliot Berman Sept. 29, 2023, 10:48 p.m. UTC | #4
Hi Andrew,

On 9/29/2023 1:44 PM, Andrew Halaney wrote:
> On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote:
>> On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said:
>>> On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote:
>>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>>>
>>>> Let's use the new SCM memory allocator to obtain a buffer for this call
>>>> instead of using dma_alloc_coherent().
>>>>
>>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
>>>> ---
>>>>  drivers/firmware/qcom/qcom_scm.c | 16 +++++-----------
>>>>  1 file changed, 5 insertions(+), 11 deletions(-)
>>>>
>>>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
>>>> index 02a773ba1383..c0eb81069847 100644
>>>> --- a/drivers/firmware/qcom/qcom_scm.c
>>>> +++ b/drivers/firmware/qcom/qcom_scm.c
>>>> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable)
>>>>  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>>>>  			    struct qcom_scm_pas_metadata *ctx)
>>>>  {
>>>> -	dma_addr_t mdata_phys;
>>>> +	phys_addr_t mdata_phys;
>>>
>>>>  	void *mdata_buf;
>>>>  	int ret;
>>>>  	struct qcom_scm_desc desc = {
>>>> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
>>>>  	};
>>>>  	struct qcom_scm_res res;
>>>>
>>>> -	/*
>>>> -	 * During the scm call memory protection will be enabled for the meta
>>>> -	 * data blob, so make sure it's physically contiguous, 4K aligned and
>>>> -	 * non-cachable to avoid XPU violations.
>>>> -	 */
>>>> -	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
>>>> -				       GFP_KERNEL);
>>>> +	mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL);
>>>
>>> mdata_phys is never initialized now, and its what's being shoved into
>>> desc.args[1] later, which I believe is what triggered the -EINVAL
>>> with qcom_scm_call() that I reported in my cover letter reply this
>>> morning.
>>>
>>> Prior with the DMA API that would have been the device view of the buffer.
>>>
>>
>> Gah! Thanks for finding this.
>>
>> Can you try the following diff?
>>
>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
>> index 794388c3212f..b0d4ea237034 100644
>> --- a/drivers/firmware/qcom/qcom_scm.c
>> +++ b/drivers/firmware/qcom/qcom_scm.c
>> @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
>> void *metadata, size_t size,
>>  		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
>>  		return -ENOMEM;
>>  	}
>> +	mdata_phys = qcom_scm_mem_to_phys(mdata_buf);
>>  	memcpy(mdata_buf, metadata, size);
>>
>>  	ret = qcom_scm_clk_enable();
>> @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
>> void *metadata, size_t size,
>>  		qcom_scm_mem_free(mdata_buf);
>>  	} else if (ctx) {
>>  		ctx->ptr = mdata_buf;
>> -		ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
>> +		ctx->phys = mdata_phys;
>>  		ctx->size = size;
>>  	}
>>
>> Bart
>>
> 
> For some reason that I can't explain that is still not working. It seems
> the SMC call is returning !0 and then we return -EINVAL from there
> with qcom_scm_remap_error().
> 
> Here's a really crummy diff of what I hacked in during lunch to debug (don't
> judge my primitive debug skills):
> 

I don't know what you're talking about :-)

> diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c
> index 0d5554df1321..56eab0ae5f3a 100644
> --- a/drivers/firmware/qcom/qcom_scm-smc.c
> +++ b/drivers/firmware/qcom/qcom_scm-smc.c
> @@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
>         struct arm_smccc_res smc_res;
>         struct arm_smccc_args smc = {0};
>  
> +       dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__);
> +
>         smc.args[0] = ARM_SMCCC_CALL_VAL(
>                 smccc_call_type,
>                 qcom_smccc_convention,
> @@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
>         if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) {
>                 alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64);
>                 args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag);
> +               dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__);
>  
>                 if (!args_virt)
>                         return -ENOMEM;
> @@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
>  
>         /* ret error check follows after args_virt cleanup*/
>         ret = __scm_smc_do(dev, &smc, &smc_res, atomic);
> +       dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret);
>  
>         if (ret)
>                 return ret;
> @@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
>                 res->result[0] = smc_res.a1;
>                 res->result[1] = smc_res.a2;
>                 res->result[2] = smc_res.a3;
> +               dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]);
>         }
>  
> +       dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0);
>         return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0;
> 
> 
> And that all spams dmesg successfully for most cases, but the
> pas_init_image calls log this out:
> 
> [   16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc
> [   16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808
> [   16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function
> [   16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0
> [   16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0
> [   16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558
> 
> At the moment I am unsure why...
> 
Does the issue appear right after taking patch 6 or does it only appear after taking
the whole series? If it's just to this patch, then maybe something wrong with
the refactor: shm bridge isn't enabled at this point in the series.
  
Andrew Halaney Oct. 2, 2023, 1:24 p.m. UTC | #5
On Fri, Sep 29, 2023 at 03:48:37PM -0700, Elliot Berman wrote:
> Hi Andrew,
> 
> On 9/29/2023 1:44 PM, Andrew Halaney wrote:
> > On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote:
> >> On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said:
> >>> On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote:
> >>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >>>>
> >>>> Let's use the new SCM memory allocator to obtain a buffer for this call
> >>>> instead of using dma_alloc_coherent().
> >>>>
> >>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> >>>> ---
> >>>>  drivers/firmware/qcom/qcom_scm.c | 16 +++++-----------
> >>>>  1 file changed, 5 insertions(+), 11 deletions(-)
> >>>>
> >>>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> >>>> index 02a773ba1383..c0eb81069847 100644
> >>>> --- a/drivers/firmware/qcom/qcom_scm.c
> >>>> +++ b/drivers/firmware/qcom/qcom_scm.c
> >>>> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable)
> >>>>  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
> >>>>  			    struct qcom_scm_pas_metadata *ctx)
> >>>>  {
> >>>> -	dma_addr_t mdata_phys;
> >>>> +	phys_addr_t mdata_phys;
> >>>
> >>>>  	void *mdata_buf;
> >>>>  	int ret;
> >>>>  	struct qcom_scm_desc desc = {
> >>>> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
> >>>>  	};
> >>>>  	struct qcom_scm_res res;
> >>>>
> >>>> -	/*
> >>>> -	 * During the scm call memory protection will be enabled for the meta
> >>>> -	 * data blob, so make sure it's physically contiguous, 4K aligned and
> >>>> -	 * non-cachable to avoid XPU violations.
> >>>> -	 */
> >>>> -	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
> >>>> -				       GFP_KERNEL);
> >>>> +	mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL);
> >>>
> >>> mdata_phys is never initialized now, and its what's being shoved into
> >>> desc.args[1] later, which I believe is what triggered the -EINVAL
> >>> with qcom_scm_call() that I reported in my cover letter reply this
> >>> morning.
> >>>
> >>> Prior with the DMA API that would have been the device view of the buffer.
> >>>
> >>
> >> Gah! Thanks for finding this.
> >>
> >> Can you try the following diff?
> >>
> >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> >> index 794388c3212f..b0d4ea237034 100644
> >> --- a/drivers/firmware/qcom/qcom_scm.c
> >> +++ b/drivers/firmware/qcom/qcom_scm.c
> >> @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
> >> void *metadata, size_t size,
> >>  		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
> >>  		return -ENOMEM;
> >>  	}
> >> +	mdata_phys = qcom_scm_mem_to_phys(mdata_buf);
> >>  	memcpy(mdata_buf, metadata, size);
> >>
> >>  	ret = qcom_scm_clk_enable();
> >> @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
> >> void *metadata, size_t size,
> >>  		qcom_scm_mem_free(mdata_buf);
> >>  	} else if (ctx) {
> >>  		ctx->ptr = mdata_buf;
> >> -		ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
> >> +		ctx->phys = mdata_phys;
> >>  		ctx->size = size;
> >>  	}
> >>
> >> Bart
> >>
> > 
> > For some reason that I can't explain that is still not working. It seems
> > the SMC call is returning !0 and then we return -EINVAL from there
> > with qcom_scm_remap_error().
> > 
> > Here's a really crummy diff of what I hacked in during lunch to debug (don't
> > judge my primitive debug skills):
> > 
> 
> I don't know what you're talking about :-)
> 
> > diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c
> > index 0d5554df1321..56eab0ae5f3a 100644
> > --- a/drivers/firmware/qcom/qcom_scm-smc.c
> > +++ b/drivers/firmware/qcom/qcom_scm-smc.c
> > @@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> >         struct arm_smccc_res smc_res;
> >         struct arm_smccc_args smc = {0};
> >  
> > +       dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__);
> > +
> >         smc.args[0] = ARM_SMCCC_CALL_VAL(
> >                 smccc_call_type,
> >                 qcom_smccc_convention,
> > @@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> >         if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) {
> >                 alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64);
> >                 args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag);
> > +               dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__);
> >  
> >                 if (!args_virt)
> >                         return -ENOMEM;
> > @@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> >  
> >         /* ret error check follows after args_virt cleanup*/
> >         ret = __scm_smc_do(dev, &smc, &smc_res, atomic);
> > +       dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret);
> >  
> >         if (ret)
> >                 return ret;
> > @@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> >                 res->result[0] = smc_res.a1;
> >                 res->result[1] = smc_res.a2;
> >                 res->result[2] = smc_res.a3;
> > +               dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]);
> >         }
> >  
> > +       dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0);
> >         return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0;
> > 
> > 
> > And that all spams dmesg successfully for most cases, but the
> > pas_init_image calls log this out:
> > 
> > [   16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc
> > [   16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808
> > [   16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function
> > [   16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0
> > [   16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0
> > [   16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558
> > 
> > At the moment I am unsure why...
> > 
> Does the issue appear right after taking patch 6 or does it only appear after taking
> the whole series? If it's just to this patch, then maybe something wrong with
> the refactor: shm bridge isn't enabled at this point in the series.
> 

I've only been testing the series as a whole on top of a 6.6 based
branch, I'm going to try and test some more today to see if just the
allocator bits (but not the SHM bridge enablement) works alright for
me.

Thanks,
Andrew
  
Andrew Halaney Oct. 2, 2023, 2:15 p.m. UTC | #6
On Mon, Oct 02, 2023 at 08:24:09AM -0500, Andrew Halaney wrote:
> On Fri, Sep 29, 2023 at 03:48:37PM -0700, Elliot Berman wrote:
> > Hi Andrew,
> > 
> > On 9/29/2023 1:44 PM, Andrew Halaney wrote:
> > > On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote:
> > >> On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said:
> > >>> On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote:
> > >>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > >>>>
> > >>>> Let's use the new SCM memory allocator to obtain a buffer for this call
> > >>>> instead of using dma_alloc_coherent().
> > >>>>
> > >>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > >>>> ---
> > >>>>  drivers/firmware/qcom/qcom_scm.c | 16 +++++-----------
> > >>>>  1 file changed, 5 insertions(+), 11 deletions(-)
> > >>>>
> > >>>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> > >>>> index 02a773ba1383..c0eb81069847 100644
> > >>>> --- a/drivers/firmware/qcom/qcom_scm.c
> > >>>> +++ b/drivers/firmware/qcom/qcom_scm.c
> > >>>> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable)
> > >>>>  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
> > >>>>  			    struct qcom_scm_pas_metadata *ctx)
> > >>>>  {
> > >>>> -	dma_addr_t mdata_phys;
> > >>>> +	phys_addr_t mdata_phys;
> > >>>
> > >>>>  	void *mdata_buf;
> > >>>>  	int ret;
> > >>>>  	struct qcom_scm_desc desc = {
> > >>>> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
> > >>>>  	};
> > >>>>  	struct qcom_scm_res res;
> > >>>>
> > >>>> -	/*
> > >>>> -	 * During the scm call memory protection will be enabled for the meta
> > >>>> -	 * data blob, so make sure it's physically contiguous, 4K aligned and
> > >>>> -	 * non-cachable to avoid XPU violations.
> > >>>> -	 */
> > >>>> -	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
> > >>>> -				       GFP_KERNEL);
> > >>>> +	mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL);
> > >>>
> > >>> mdata_phys is never initialized now, and its what's being shoved into
> > >>> desc.args[1] later, which I believe is what triggered the -EINVAL
> > >>> with qcom_scm_call() that I reported in my cover letter reply this
> > >>> morning.
> > >>>
> > >>> Prior with the DMA API that would have been the device view of the buffer.
> > >>>
> > >>
> > >> Gah! Thanks for finding this.
> > >>
> > >> Can you try the following diff?
> > >>
> > >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> > >> index 794388c3212f..b0d4ea237034 100644
> > >> --- a/drivers/firmware/qcom/qcom_scm.c
> > >> +++ b/drivers/firmware/qcom/qcom_scm.c
> > >> @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
> > >> void *metadata, size_t size,
> > >>  		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
> > >>  		return -ENOMEM;
> > >>  	}
> > >> +	mdata_phys = qcom_scm_mem_to_phys(mdata_buf);
> > >>  	memcpy(mdata_buf, metadata, size);
> > >>
> > >>  	ret = qcom_scm_clk_enable();
> > >> @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
> > >> void *metadata, size_t size,
> > >>  		qcom_scm_mem_free(mdata_buf);
> > >>  	} else if (ctx) {
> > >>  		ctx->ptr = mdata_buf;
> > >> -		ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
> > >> +		ctx->phys = mdata_phys;
> > >>  		ctx->size = size;
> > >>  	}
> > >>
> > >> Bart
> > >>
> > > 
> > > For some reason that I can't explain that is still not working. It seems
> > > the SMC call is returning !0 and then we return -EINVAL from there
> > > with qcom_scm_remap_error().
> > > 
> > > Here's a really crummy diff of what I hacked in during lunch to debug (don't
> > > judge my primitive debug skills):
> > > 
> > 
> > I don't know what you're talking about :-)
> > 
> > > diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c
> > > index 0d5554df1321..56eab0ae5f3a 100644
> > > --- a/drivers/firmware/qcom/qcom_scm-smc.c
> > > +++ b/drivers/firmware/qcom/qcom_scm-smc.c
> > > @@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> > >         struct arm_smccc_res smc_res;
> > >         struct arm_smccc_args smc = {0};
> > >  
> > > +       dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__);
> > > +
> > >         smc.args[0] = ARM_SMCCC_CALL_VAL(
> > >                 smccc_call_type,
> > >                 qcom_smccc_convention,
> > > @@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> > >         if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) {
> > >                 alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64);
> > >                 args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag);
> > > +               dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__);
> > >  
> > >                 if (!args_virt)
> > >                         return -ENOMEM;
> > > @@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> > >  
> > >         /* ret error check follows after args_virt cleanup*/
> > >         ret = __scm_smc_do(dev, &smc, &smc_res, atomic);
> > > +       dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret);
> > >  
> > >         if (ret)
> > >                 return ret;
> > > @@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> > >                 res->result[0] = smc_res.a1;
> > >                 res->result[1] = smc_res.a2;
> > >                 res->result[2] = smc_res.a3;
> > > +               dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]);
> > >         }
> > >  
> > > +       dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0);
> > >         return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0;
> > > 
> > > 
> > > And that all spams dmesg successfully for most cases, but the
> > > pas_init_image calls log this out:
> > > 
> > > [   16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc
> > > [   16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808
> > > [   16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function
> > > [   16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0
> > > [   16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0
> > > [   16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558
> > > 
> > > At the moment I am unsure why...
> > > 
> > Does the issue appear right after taking patch 6 or does it only appear after taking
> > the whole series? If it's just to this patch, then maybe something wrong with
> > the refactor: shm bridge isn't enabled at this point in the series.
> > 
> 
> I've only been testing the series as a whole on top of a 6.6 based
> branch, I'm going to try and test some more today to see if just the
> allocator bits (but not the SHM bridge enablement) works alright for
> me.
> 

After testing a little more with the fix Bart sent above,
the allocator refactor seems to work well.
With "firmware: qcom: scm: enable SHM bridge" applied is when I see the
errors I pointed out above. All prior patches cause no problems on boot
for me.

For patches 1-9 (with the fix sent here for patch 6) please feel free
to add:

    Tested-by: Andrew Halaney <ahalaney@redhat.com> # sc8280xp-lenovo-thinkpad-x13s

If anyone has suggestions for debugging why the switch to using
SHM bridge is breaking firmware loading for me, I'm willing to give
that a try.

Thanks,
Andrew
  
Bartosz Golaszewski Oct. 2, 2023, 2:23 p.m. UTC | #7
On Mon, Oct 2, 2023 at 4:15 PM Andrew Halaney <ahalaney@redhat.com> wrote:
>
> On Mon, Oct 02, 2023 at 08:24:09AM -0500, Andrew Halaney wrote:
> > On Fri, Sep 29, 2023 at 03:48:37PM -0700, Elliot Berman wrote:
> > > Hi Andrew,
> > >
> > > On 9/29/2023 1:44 PM, Andrew Halaney wrote:
> > > > On Fri, Sep 29, 2023 at 12:22:16PM -0700, Bartosz Golaszewski wrote:
> > > >> On Fri, 29 Sep 2023 21:16:51 +0200, Andrew Halaney <ahalaney@redhat.com> said:
> > > >>> On Thu, Sep 28, 2023 at 11:20:35AM +0200, Bartosz Golaszewski wrote:
> > > >>>> From: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > >>>>
> > > >>>> Let's use the new SCM memory allocator to obtain a buffer for this call
> > > >>>> instead of using dma_alloc_coherent().
> > > >>>>
> > > >>>> Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski@linaro.org>
> > > >>>> ---
> > > >>>>  drivers/firmware/qcom/qcom_scm.c | 16 +++++-----------
> > > >>>>  1 file changed, 5 insertions(+), 11 deletions(-)
> > > >>>>
> > > >>>> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> > > >>>> index 02a773ba1383..c0eb81069847 100644
> > > >>>> --- a/drivers/firmware/qcom/qcom_scm.c
> > > >>>> +++ b/drivers/firmware/qcom/qcom_scm.c
> > > >>>> @@ -532,7 +532,7 @@ static void qcom_scm_set_download_mode(bool enable)
> > > >>>>  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
> > > >>>>                            struct qcom_scm_pas_metadata *ctx)
> > > >>>>  {
> > > >>>> -      dma_addr_t mdata_phys;
> > > >>>> +      phys_addr_t mdata_phys;
> > > >>>
> > > >>>>        void *mdata_buf;
> > > >>>>        int ret;
> > > >>>>        struct qcom_scm_desc desc = {
> > > >>>> @@ -544,13 +544,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
> > > >>>>        };
> > > >>>>        struct qcom_scm_res res;
> > > >>>>
> > > >>>> -      /*
> > > >>>> -       * During the scm call memory protection will be enabled for the meta
> > > >>>> -       * data blob, so make sure it's physically contiguous, 4K aligned and
> > > >>>> -       * non-cachable to avoid XPU violations.
> > > >>>> -       */
> > > >>>> -      mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
> > > >>>> -                                     GFP_KERNEL);
> > > >>>> +      mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL);
> > > >>>
> > > >>> mdata_phys is never initialized now, and its what's being shoved into
> > > >>> desc.args[1] later, which I believe is what triggered the -EINVAL
> > > >>> with qcom_scm_call() that I reported in my cover letter reply this
> > > >>> morning.
> > > >>>
> > > >>> Prior with the DMA API that would have been the device view of the buffer.
> > > >>>
> > > >>
> > > >> Gah! Thanks for finding this.
> > > >>
> > > >> Can you try the following diff?
> > > >>
> > > >> diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
> > > >> index 794388c3212f..b0d4ea237034 100644
> > > >> --- a/drivers/firmware/qcom/qcom_scm.c
> > > >> +++ b/drivers/firmware/qcom/qcom_scm.c
> > > >> @@ -556,6 +556,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
> > > >> void *metadata, size_t size,
> > > >>                  dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
> > > >>                  return -ENOMEM;
> > > >>          }
> > > >> +        mdata_phys = qcom_scm_mem_to_phys(mdata_buf);
> > > >>          memcpy(mdata_buf, metadata, size);
> > > >>
> > > >>          ret = qcom_scm_clk_enable();
> > > >> @@ -578,7 +579,7 @@ int qcom_scm_pas_init_image(u32 peripheral, const
> > > >> void *metadata, size_t size,
> > > >>                  qcom_scm_mem_free(mdata_buf);
> > > >>          } else if (ctx) {
> > > >>                  ctx->ptr = mdata_buf;
> > > >> -                ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
> > > >> +                ctx->phys = mdata_phys;
> > > >>                  ctx->size = size;
> > > >>          }
> > > >>
> > > >> Bart
> > > >>
> > > >
> > > > For some reason that I can't explain that is still not working. It seems
> > > > the SMC call is returning !0 and then we return -EINVAL from there
> > > > with qcom_scm_remap_error().
> > > >
> > > > Here's a really crummy diff of what I hacked in during lunch to debug (don't
> > > > judge my primitive debug skills):
> > > >
> > >
> > > I don't know what you're talking about :-)
> > >
> > > > diff --git a/drivers/firmware/qcom/qcom_scm-smc.c b/drivers/firmware/qcom/qcom_scm-smc.c
> > > > index 0d5554df1321..56eab0ae5f3a 100644
> > > > --- a/drivers/firmware/qcom/qcom_scm-smc.c
> > > > +++ b/drivers/firmware/qcom/qcom_scm-smc.c
> > > > @@ -162,6 +162,8 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> > > >         struct arm_smccc_res smc_res;
> > > >         struct arm_smccc_args smc = {0};
> > > >
> > > > +       dev_err(dev, "%s: %d: We are in this function\n", __func__, __LINE__);
> > > > +
> > > >         smc.args[0] = ARM_SMCCC_CALL_VAL(
> > > >                 smccc_call_type,
> > > >                 qcom_smccc_convention,
> > > > @@ -174,6 +176,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> > > >         if (unlikely(arglen > SCM_SMC_N_REG_ARGS)) {
> > > >                 alloc_len = SCM_SMC_N_EXT_ARGS * sizeof(u64);
> > > >                 args_virt = qcom_scm_mem_alloc(PAGE_ALIGN(alloc_len), flag);
> > > > +               dev_err(dev, "%s: %d: Hit the unlikely case!\n", __func__, __LINE__);
> > > >
> > > >                 if (!args_virt)
> > > >                         return -ENOMEM;
> > > > @@ -197,6 +200,7 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> > > >
> > > >         /* ret error check follows after args_virt cleanup*/
> > > >         ret = __scm_smc_do(dev, &smc, &smc_res, atomic);
> > > > +       dev_err(dev, "%s: %d: ret: %d\n", __func__, __LINE__, ret);
> > > >
> > > >         if (ret)
> > > >                 return ret;
> > > > @@ -205,8 +209,10 @@ int __scm_smc_call(struct device *dev, const struct qcom_scm_desc *desc,
> > > >                 res->result[0] = smc_res.a1;
> > > >                 res->result[1] = smc_res.a2;
> > > >                 res->result[2] = smc_res.a3;
> > > > +               dev_err(dev, "%s: %d: 0: %llu, 1: %llu: 2: %llu\n", __func__, __LINE__, res->result[0], res->result[1], res->result[2]);
> > > >         }
> > > >
> > > > +       dev_err(dev, "%s: %d: smc_res.a0: %lu\n", __func__, __LINE__, smc_res.a0);
> > > >         return (long)smc_res.a0 ? qcom_scm_remap_error(smc_res.a0) : 0;
> > > >
> > > >
> > > > And that all spams dmesg successfully for most cases, but the
> > > > pas_init_image calls log this out:
> > > >
> > > > [   16.362965] remoteproc remoteproc1: powering up 1b300000.remoteproc
> > > > [   16.364897] remoteproc remoteproc1: Booting fw image qcom/sc8280xp/LENOVO/21BX/qccdsp8280.mbn, size 3575808
> > > > [   16.365009] qcom_scm firmware:scm: __scm_smc_call: 165: We are in this function
> > > > [   16.365251] qcom_scm firmware:scm: __scm_smc_call: 203: ret: 0
> > > > [   16.365256] qcom_scm firmware:scm: __scm_smc_call: 212: 0: 0, 1: 0: 2: 0
> > > > [   16.365261] qcom_scm firmware:scm: __scm_smc_call: 215: smc_res.a0: 4291821558
> > > >
> > > > At the moment I am unsure why...
> > > >
> > > Does the issue appear right after taking patch 6 or does it only appear after taking
> > > the whole series? If it's just to this patch, then maybe something wrong with
> > > the refactor: shm bridge isn't enabled at this point in the series.
> > >
> >
> > I've only been testing the series as a whole on top of a 6.6 based
> > branch, I'm going to try and test some more today to see if just the
> > allocator bits (but not the SHM bridge enablement) works alright for
> > me.
> >
>
> After testing a little more with the fix Bart sent above,
> the allocator refactor seems to work well.
> With "firmware: qcom: scm: enable SHM bridge" applied is when I see the
> errors I pointed out above. All prior patches cause no problems on boot
> for me.
>
> For patches 1-9 (with the fix sent here for patch 6) please feel free
> to add:
>
>     Tested-by: Andrew Halaney <ahalaney@redhat.com> # sc8280xp-lenovo-thinkpad-x13s
>
> If anyone has suggestions for debugging why the switch to using
> SHM bridge is breaking firmware loading for me, I'm willing to give
> that a try.
>

Is it possible that sc8280xp doesn't support SHM bridge but for some
reason __qcom_scm_is_call_available() returns true when checking if it
does?

That's the only thing that comes to my mind ATM.

Bart
  

Patch

diff --git a/drivers/firmware/qcom/qcom_scm.c b/drivers/firmware/qcom/qcom_scm.c
index 02a773ba1383..c0eb81069847 100644
--- a/drivers/firmware/qcom/qcom_scm.c
+++ b/drivers/firmware/qcom/qcom_scm.c
@@ -532,7 +532,7 @@  static void qcom_scm_set_download_mode(bool enable)
 int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
 			    struct qcom_scm_pas_metadata *ctx)
 {
-	dma_addr_t mdata_phys;
+	phys_addr_t mdata_phys;
 	void *mdata_buf;
 	int ret;
 	struct qcom_scm_desc desc = {
@@ -544,13 +544,7 @@  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
 	};
 	struct qcom_scm_res res;
 
-	/*
-	 * During the scm call memory protection will be enabled for the meta
-	 * data blob, so make sure it's physically contiguous, 4K aligned and
-	 * non-cachable to avoid XPU violations.
-	 */
-	mdata_buf = dma_alloc_coherent(__scm->dev, size, &mdata_phys,
-				       GFP_KERNEL);
+	mdata_buf = qcom_scm_mem_alloc(size, GFP_KERNEL);
 	if (!mdata_buf) {
 		dev_err(__scm->dev, "Allocation of metadata buffer failed.\n");
 		return -ENOMEM;
@@ -574,10 +568,10 @@  int qcom_scm_pas_init_image(u32 peripheral, const void *metadata, size_t size,
 
 out:
 	if (ret < 0 || !ctx) {
-		dma_free_coherent(__scm->dev, size, mdata_buf, mdata_phys);
+		qcom_scm_mem_free(mdata_buf);
 	} else if (ctx) {
 		ctx->ptr = mdata_buf;
-		ctx->phys = mdata_phys;
+		ctx->phys = qcom_scm_mem_to_phys(mdata_buf);
 		ctx->size = size;
 	}
 
@@ -594,7 +588,7 @@  void qcom_scm_pas_metadata_release(struct qcom_scm_pas_metadata *ctx)
 	if (!ctx->ptr)
 		return;
 
-	dma_free_coherent(__scm->dev, ctx->size, ctx->ptr, ctx->phys);
+	qcom_scm_mem_free(ctx->ptr);
 
 	ctx->ptr = NULL;
 	ctx->phys = 0;