[1/2] spi: spi-qcom-qspi: Fallback to PIO for xfers that aren't multiples of 4 bytes

Message ID 20230725110226.1.Ia2f980fc7cd0b831e633391f0bb1272914d8f381@changeid
State New
Headers
Series [1/2] spi: spi-qcom-qspi: Fallback to PIO for xfers that aren't multiples of 4 bytes |

Commit Message

Doug Anderson July 25, 2023, 6:02 p.m. UTC
  The Qualcomm QSPI driver appears to require that any reads using DMA
are a mutliple of 4 bytes. If this isn't true then the controller will
clobber any extra bytes in memory following the last word. Let's
detect this and falback to PIO.

This fixes problems reported by slub_debug=FZPUA, which would complain
about "kmalloc Redzone overwritten". One such instance said:

  0xffffff80c29d541a-0xffffff80c29d541b @offset=21530. First byte 0x0 instead of 0xcc
  Allocated in mtd_kmalloc_up_to+0x98/0xac age=36 cpu=3 pid=6658

Tracing through what was happening I saw that, while we often did DMA
tranfers of 0x1000 bytes, sometimes we'd end up doing ones of 0x41a
bytes. Those 0x41a byte transfers were the problem.

NOTE: a future change will enable the SPI "mem ops" to help avoid this
case, but it still seems good to add the extra check in the transfer.

Fixes: b5762d95607e ("spi: spi-qcom-qspi: Add DMA mode support")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

 drivers/spi/spi-qcom-qspi.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
  

Comments

Mark Brown July 25, 2023, 6:24 p.m. UTC | #1
On Tue, Jul 25, 2023 at 11:02:27AM -0700, Douglas Anderson wrote:
> In the patch ("spi: spi-qcom-qspi: Fallback to PIO for xfers that
> aren't multiples of 4 bytes") we detect reads that we can't handle
> properly and fallback to PIO mode. While that's correct behavior, we
> can do better by adding "spi_controller_mem_ops" for our
> controller. Once we do this then the caller will give us a transfer
> that's a multiple of 4-bytes so we can DMA.

This is more of an optimisation for the case where we're using flash -
if someone has hung some other hardware off the controller (which seems
reasonable enough if they don't need it for flash) then we'll not use
the mem_ops.
  
Doug Anderson July 25, 2023, 6:30 p.m. UTC | #2
Hi,

On Tue, Jul 25, 2023 at 11:24 AM Mark Brown <broonie@kernel.org> wrote:
>
> On Tue, Jul 25, 2023 at 11:02:27AM -0700, Douglas Anderson wrote:
> > In the patch ("spi: spi-qcom-qspi: Fallback to PIO for xfers that
> > aren't multiples of 4 bytes") we detect reads that we can't handle
> > properly and fallback to PIO mode. While that's correct behavior, we
> > can do better by adding "spi_controller_mem_ops" for our
> > controller. Once we do this then the caller will give us a transfer
> > that's a multiple of 4-bytes so we can DMA.
>
> This is more of an optimisation for the case where we're using flash -
> if someone has hung some other hardware off the controller (which seems
> reasonable enough if they don't need it for flash) then we'll not use
> the mem_ops.

Right. That's why I also have the first patch in the series too. The
first patch ensures correctness and the second patch makes things more
optimal for when we're using flash. Do you want me to re-submit the
patch with wording that makes this more obvious?

Note that it's pretty likely someone wouldn't use this SPI controller
for anything other than SPI flash. While it's theoretically possible,
the controller is intended for SPI flash (supports dual and quad SPI
modes) and is only half duplex.

-Doug
  
Mark Brown July 25, 2023, 6:48 p.m. UTC | #3
On Tue, Jul 25, 2023 at 11:30:30AM -0700, Doug Anderson wrote:

> Note that it's pretty likely someone wouldn't use this SPI controller
> for anything other than SPI flash. While it's theoretically possible,
> the controller is intended for SPI flash (supports dual and quad SPI
> modes) and is only half duplex.

TBH most devices are half duplex so it's not *that* big a restriction,
and dual/quad mode are obviously attractive if you need to transfer
large quantities of data.
  
Bjorn Andersson July 26, 2023, 6:01 a.m. UTC | #4
On Tue, Jul 25, 2023 at 11:02:26AM -0700, Douglas Anderson wrote:
> The Qualcomm QSPI driver appears to require that any reads using DMA
> are a mutliple of 4 bytes. If this isn't true then the controller will
> clobber any extra bytes in memory following the last word. Let's
> detect this and falback to PIO.
> 
> This fixes problems reported by slub_debug=FZPUA, which would complain
> about "kmalloc Redzone overwritten". One such instance said:
> 
>   0xffffff80c29d541a-0xffffff80c29d541b @offset=21530. First byte 0x0 instead of 0xcc
>   Allocated in mtd_kmalloc_up_to+0x98/0xac age=36 cpu=3 pid=6658
> 
> Tracing through what was happening I saw that, while we often did DMA
> tranfers of 0x1000 bytes, sometimes we'd end up doing ones of 0x41a
> bytes. Those 0x41a byte transfers were the problem.
> 
> NOTE: a future change will enable the SPI "mem ops" to help avoid this
> case, but it still seems good to add the extra check in the transfer.
> 
> Fixes: b5762d95607e ("spi: spi-qcom-qspi: Add DMA mode support")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

Reviewed-by: Bjorn Andersson <andersson@kernel.org>

Regards,
Bjorn
  
Vijaya Krishna Nivarthi July 26, 2023, 6:06 a.m. UTC | #5
On 7/25/2023 11:32 PM, Douglas Anderson wrote:
> The Qualcomm QSPI driver appears to require that any reads using DMA
> are a mutliple of 4 bytes. If this isn't true then the controller will
> clobber any extra bytes in memory following the last word. Let's
> detect this and falback to PIO.
>
> This fixes problems reported by slub_debug=FZPUA, which would complain
> about "kmalloc Redzone overwritten". One such instance said:
>
>    0xffffff80c29d541a-0xffffff80c29d541b @offset=21530. First byte 0x0 instead of 0xcc
>    Allocated in mtd_kmalloc_up_to+0x98/0xac age=36 cpu=3 pid=6658
>
> Tracing through what was happening I saw that, while we often did DMA
> tranfers of 0x1000 bytes, sometimes we'd end up doing ones of 0x41a
> bytes. Those 0x41a byte transfers were the problem.
>
> NOTE: a future change will enable the SPI "mem ops" to help avoid this
> case, but it still seems good to add the extra check in the transfer.
>
> Fixes: b5762d95607e ("spi: spi-qcom-qspi: Add DMA mode support")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>

Reviewed-by: Vijaya Krishna Nivarthi <quic_vnivarth@quicinc.com>


Thank you for the fix,

Vijay/


> ---
>
>   drivers/spi/spi-qcom-qspi.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
>
> diff --git a/drivers/spi/spi-qcom-qspi.c b/drivers/spi/spi-qcom-qspi.c
> index a0ad9802b606..39b4d8a8107a 100644
> --- a/drivers/spi/spi-qcom-qspi.c
> +++ b/drivers/spi/spi-qcom-qspi.c
> @@ -355,10 +355,22 @@ static int qcom_qspi_setup_dma_desc(struct qcom_qspi *ctrl,
>   
>   	for (i = 0; i < sgt->nents; i++) {
>   		dma_ptr_sg = sg_dma_address(sgt->sgl + i);
> +		dma_len_sg = sg_dma_len(sgt->sgl + i);
>   		if (!IS_ALIGNED(dma_ptr_sg, QSPI_ALIGN_REQ)) {
>   			dev_warn_once(ctrl->dev, "dma_address not aligned to %d\n", QSPI_ALIGN_REQ);
>   			return -EAGAIN;
>   		}
> +		/*
> +		 * When reading with DMA the controller writes to memory 1 word
> +		 * at a time. If the length isn't a multiple of 4 bytes then
> +		 * the controller can clobber the things later in memory.
> +		 * Fallback to PIO to be safe.
> +		 */
> +		if (ctrl->xfer.dir == QSPI_READ && (dma_len_sg & 0x03)) {
> +			dev_warn_once(ctrl->dev, "fallback to PIO for read of size %#010x\n",
> +				      dma_len_sg);
> +			return -EAGAIN;
> +		}
>   	}
>   
>   	for (i = 0; i < sgt->nents; i++) {
  
Mark Brown July 26, 2023, 3:50 p.m. UTC | #6
On Tue, 25 Jul 2023 11:02:26 -0700, Douglas Anderson wrote:
> The Qualcomm QSPI driver appears to require that any reads using DMA
> are a mutliple of 4 bytes. If this isn't true then the controller will
> clobber any extra bytes in memory following the last word. Let's
> detect this and falback to PIO.
> 
> This fixes problems reported by slub_debug=FZPUA, which would complain
> about "kmalloc Redzone overwritten". One such instance said:
> 
> [...]

Applied to

   https://git.kernel.org/pub/scm/linux/kernel/git/broonie/spi.git for-next

Thanks!

[1/2] spi: spi-qcom-qspi: Fallback to PIO for xfers that aren't multiples of 4 bytes
      commit: 138d73b627c71bf2b2f61502dc6c1137b9656434
[2/2] spi: spi-qcom-qspi: Add mem_ops to avoid PIO for badly sized reads
      commit: cc71c42b3dc1085d3e72dfa5603e827b9eb59da1

All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.

You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.

If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.

Please add any relevant lists and maintainers to the CCs when replying
to this mail.

Thanks,
Mark
  

Patch

diff --git a/drivers/spi/spi-qcom-qspi.c b/drivers/spi/spi-qcom-qspi.c
index a0ad9802b606..39b4d8a8107a 100644
--- a/drivers/spi/spi-qcom-qspi.c
+++ b/drivers/spi/spi-qcom-qspi.c
@@ -355,10 +355,22 @@  static int qcom_qspi_setup_dma_desc(struct qcom_qspi *ctrl,
 
 	for (i = 0; i < sgt->nents; i++) {
 		dma_ptr_sg = sg_dma_address(sgt->sgl + i);
+		dma_len_sg = sg_dma_len(sgt->sgl + i);
 		if (!IS_ALIGNED(dma_ptr_sg, QSPI_ALIGN_REQ)) {
 			dev_warn_once(ctrl->dev, "dma_address not aligned to %d\n", QSPI_ALIGN_REQ);
 			return -EAGAIN;
 		}
+		/*
+		 * When reading with DMA the controller writes to memory 1 word
+		 * at a time. If the length isn't a multiple of 4 bytes then
+		 * the controller can clobber the things later in memory.
+		 * Fallback to PIO to be safe.
+		 */
+		if (ctrl->xfer.dir == QSPI_READ && (dma_len_sg & 0x03)) {
+			dev_warn_once(ctrl->dev, "fallback to PIO for read of size %#010x\n",
+				      dma_len_sg);
+			return -EAGAIN;
+		}
 	}
 
 	for (i = 0; i < sgt->nents; i++) {