[v7,0/6] RISC-V non-coherent function pointer based CMO + non-coherent DMA support for AX45MP

Message ID 20230330204217.47666-1-prabhakar.mahadev-lad.rj@bp.renesas.com

Lad, Prabhakar March 30, 2023, 8:42 p.m. UTC
  From: Lad Prabhakar <prabhakar.mahadev-lad.rj@bp.renesas.com>

Hi All,

non-coherent DMA support for AX45MP
====================================

On the Andes AX45MP core, cache coherency is a specification option, so it
may not be supported. In that case DMA will fail. To work around this
issue, this patch series does the following:

1] The Andes alternative port is implemented as an erratum that checks
whether the IOCP is missing and only then applies the CMO errata. One
vendor-specific SBI extension (ANDES_SBI_EXT_IOCP_SW_WORKAROUND) is
implemented as part of the errata.

Below are the configs which the Andes port provides (and which are selected
by RZ/Five):
      - ERRATA_ANDES
      - ERRATA_ANDES_CMO

The OpenSBI patch supporting the ANDES_SBI_EXT_IOCP_SW_WORKAROUND SBI call can be found here:
https://patchwork.ozlabs.org/project/opensbi/patch/20230317140357.14819-1-prabhakar.mahadev-lad.rj@bp.renesas.com/
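As a rough illustration of the gating described in 1], here is a minimal
user-space sketch. It is not the code from this series: the probe callback
stands in for the vendor SBI call, and the names iocp_probe_fn, cmo_mode and
andes_cmo_decide() are hypothetical. It also assumes the probe returns true
when the software workaround is needed (i.e. the IOCP is missing).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vendor SBI query; not the real SBI interface. */
typedef bool (*iocp_probe_fn)(void);

enum cmo_mode {
    CMO_NOT_NEEDED,    /* IOCP present: no software workaround applied */
    CMO_SW_WORKAROUND  /* IOCP missing: software cache maintenance required */
};

/* Decide whether the CMO errata should be applied, based on the probe result. */
static enum cmo_mode andes_cmo_decide(iocp_probe_fn sw_workaround_needed)
{
    assert(sw_workaround_needed != NULL);
    return sw_workaround_needed() ? CMO_SW_WORKAROUND : CMO_NOT_NEEDED;
}

/* Mock probes used only to exercise the decision above. */
static bool probe_iocp_missing(void) { return true; }
static bool probe_iocp_present(void) { return false; }
```

Calling andes_cmo_decide(probe_iocp_missing) selects the software workaround,
while andes_cmo_decide(probe_iocp_present) leaves the errata unapplied.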

2] The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows dynamic adjustment of memory attributes at runtime.
It contains a configurable number of PMA entries, implemented as CSR
registers, to control the attributes of the memory locations of interest.
OpenSBI configures the PMA regions as required, creates a reserved-memory
node, and propagates it up the boot stack.

Currently OpenSBI (upstream) configures the required PMA region and passes
it as a shared DMA pool to Linux.

    reserved-memory {
        #address-cells = <2>;
        #size-cells = <2>;
        ranges;

        pma_resv0@58000000 {
            compatible = "shared-dma-pool";
            reg = <0x0 0x58000000 0x0 0x08000000>;
            no-map;
            linux,dma-default;
        };
    };

The above shared DMA pool gets appended to the Linux DTB so that DMA memory
requests go through this region.
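To make the reg property above concrete: with #address-cells = <2> and
#size-cells = <2>, each value is built from two 32-bit cells. The sketch
below decodes the pma_resv0 entry in plain C; the helper names (mem_region,
cells_to_u64(), decode_reg(), region_last()) are made up for this
illustration and are not kernel APIs.

```c
#include <assert.h>
#include <stdint.h>

struct mem_region {
    uint64_t base;
    uint64_t size;
};

/* Combine two 32-bit devicetree cells (high, low) into one 64-bit value. */
static uint64_t cells_to_u64(uint32_t hi, uint32_t lo)
{
    return ((uint64_t)hi << 32) | lo;
}

/* Decode a reg entry laid out as <addr-hi addr-lo size-hi size-lo>. */
static struct mem_region decode_reg(const uint32_t reg[4])
{
    struct mem_region r = {
        .base = cells_to_u64(reg[0], reg[1]),
        .size = cells_to_u64(reg[2], reg[3]),
    };
    assert(r.size != 0); /* an empty region would make region_last() wrap */
    return r;
}

/* Address of the last byte covered by the region. */
static uint64_t region_last(struct mem_region r)
{
    return r.base + r.size - 1;
}

/* reg = <0x0 0x58000000 0x0 0x08000000> from the pma_resv0 node. */
static const uint32_t pma_resv0_reg[4] = { 0x0, 0x58000000, 0x0, 0x08000000 };
```

Decoding pma_resv0_reg gives a base of 0x58000000 and a size of 0x08000000
(128 MiB), so the pool covers 0x58000000 through 0x5fffffff.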

3] We provide callbacks to synchronize specific content between memory and
cache, and register them using riscv_noncoherent_register_cache_ops().
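The general shape of function pointer based CMO can be sketched in user
space as below. This is only an illustration of the pattern, not the code in
this series: struct cache_ops, register_cache_ops() and the cache_*_range()
helpers are hypothetical names, and the real ops structure and signature of
riscv_noncoherent_register_cache_ops() may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ops table; the kernel's actual structure may differ. */
struct cache_ops {
    void (*clean)(unsigned long paddr, size_t size);      /* write back dirty lines */
    void (*invalidate)(unsigned long paddr, size_t size); /* discard cached lines */
};

/* NULL until a cache driver registers its callbacks. */
static const struct cache_ops *registered_ops;

/* Hypothetical registration hook, called once by the cache driver. */
static void register_cache_ops(const struct cache_ops *ops)
{
    assert(ops != NULL);
    registered_ops = ops;
}

/* Generic entry points: dispatch through the table, no-op if none is set. */
static void cache_clean_range(unsigned long paddr, size_t size)
{
    if (registered_ops && registered_ops->clean)
        registered_ops->clean(paddr, size);
}

static void cache_inv_range(unsigned long paddr, size_t size)
{
    if (registered_ops && registered_ops->invalidate)
        registered_ops->invalidate(paddr, size);
}

/* Mock driver callbacks that only count how often they are invoked. */
static int clean_calls, inv_calls;

static void mock_clean(unsigned long paddr, size_t size)
{
    (void)paddr; (void)size;
    clean_calls++;
}

static void mock_inv(unsigned long paddr, size_t size)
{
    (void)paddr; (void)size;
    inv_calls++;
}

static const struct cache_ops mock_ops = {
    .clean = mock_clean,
    .invalidate = mock_inv,
};
```

Before register_cache_ops(&mock_ops) is called the generic helpers do
nothing; afterwards each call is routed to the driver's callback, so the
generic DMA code does not need to know which cache controller is present.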

4] RZ/Five SoC selects the below configs
        - AX45MP_L2_CACHE
        - DMA_GLOBAL_POOL
        - ERRATA_ANDES
        - ERRATA_ANDES_CMO

----------x---------------------x--------------------x---------------x--------------

Note,
- This series requires testing on cores with Zicbom and on T-Head SoCs
- I've used GCC 12.2.0 for compilation
- Tested all the IP blocks on RZ/Five which use DMA
- This patch series depends on the series from Arnd,
  https://patchwork.kernel.org/project/linux-riscv/cover/20230327121317.4081816-1-arnd@kernel.org/
- Patches apply on top of palmer/for-next (d34a6b715a23)

v6 -> v7
* Reworked the code based on Arnd's work
* Fixed review comments pointed out by Arnd
* Fixed review comments pointed out by Conor

v5.1 -> v6
* Dropped use of ALTERNATIVE_x() macro
* Now switched to using function pointers for CMO
* Moved driver to drivers/cache folder

v5 -> v5.1
* https://patchwork.kernel.org/project/linux-riscv/list/?series=708610&state=%2A&archive=both

v4 -> v5
* Rebased ALTERNATIVE_3() macro on top of Andrew's patches
* Rebased the changes on top of Heiko's alternative call patches
* Dropped configuring the PMA from Linux
* Dropped configuring the L2 cache from Linux and dropped the binding for same
* Now using runtime patching mechanism instead of compile time config

RFC v3 -> v4
* Implemented ALTERNATIVE_3() macro
* Now using runtime patching mechanism instead of compile time config
* Added Andes CMO as an erratum
* Fixed comments pointed out by Geert

RFC v2 -> RFC v3
* Fixed review comments pointed out by Conor
* Moved DT binding into cache folder
* Fixed DT binding check issue
* Added andestech,ax45mp-cache.h header file
* Now passing the flags for the PMA setup as part of andestech,pma-regions
  property.
* Added andestech,inst/data-prefetch and andestech,tag/data-ram-ctl
  properties to configure the L2 cache.
* Registered the cache driver as platform driver

RFC v1 -> RFC v2
* Moved the code out of arch/riscv to drivers/soc/renesas
* Now handling the PMA setup as part of the L2 cache
* Now making use of dma-noncoherent.c instead of a SoC-specific implementation.
* Dropped arch_dma_alloc() and arch_dma_free()
* Switched to RISCV_DMA_NONCOHERENT
* Included DT binding doc

RFC v2: https://patchwork.kernel.org/project/linux-renesas-soc/cover/20221003223222.448551-1-prabhakar.mahadev-lad.rj@bp.renesas.com/
RFC v1: https://patchwork.kernel.org/project/linux-renesas-soc/cover/20220906102154.32526-1-prabhakar.mahadev-lad.rj@bp.renesas.com/

Cheers,
Prabhakar

Lad Prabhakar (6):
  riscv: mm: dma-noncoherent: Switch using function pointers for cache
    management
  riscv: asm: vendorid_list: Add Andes Technology to the vendors list
  riscv: errata: Add Andes alternative ports
  dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation
    for L2 cache controller
  cache: Add L2 cache management for Andes AX45MP RISC-V core
  soc: renesas: Kconfig: Select the required configs for RZ/Five SoC

 .../cache/andestech,ax45mp-cache.yaml         |  81 +++++++
 MAINTAINERS                                   |   8 +
 arch/riscv/Kconfig.errata                     |  21 ++
 arch/riscv/errata/Makefile                    |   1 +
 arch/riscv/errata/andes/Makefile              |   1 +
 arch/riscv/errata/andes/errata.c              |  71 ++++++
 arch/riscv/errata/thead/errata.c              |  76 ++++++
 arch/riscv/include/asm/alternative.h          |   3 +
 arch/riscv/include/asm/dma-noncoherent.h      |  73 ++++++
 arch/riscv/include/asm/errata_list.h          |  53 ----
 arch/riscv/include/asm/vendorid_list.h        |   1 +
 arch/riscv/kernel/alternative.c               |   5 +
 arch/riscv/kernel/setup.c                     |  49 +++-
 arch/riscv/mm/dma-noncoherent.c               |  25 +-
 arch/riscv/mm/pmem.c                          |   6 +-
 drivers/Kconfig                               |   2 +
 drivers/Makefile                              |   1 +
 drivers/cache/Kconfig                         |  10 +
 drivers/cache/Makefile                        |   3 +
 drivers/cache/ax45mp_cache.c                  | 229 ++++++++++++++++++
 drivers/soc/renesas/Kconfig                   |   4 +
 21 files changed, 662 insertions(+), 61 deletions(-)
 create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
 create mode 100644 arch/riscv/errata/andes/Makefile
 create mode 100644 arch/riscv/errata/andes/errata.c
 create mode 100644 arch/riscv/include/asm/dma-noncoherent.h
 create mode 100644 drivers/cache/Kconfig
 create mode 100644 drivers/cache/Makefile
 create mode 100644 drivers/cache/ax45mp_cache.c
  

Comments

Conor Dooley March 31, 2023, 6:05 p.m. UTC | #1
On Thu, Mar 30, 2023 at 09:42:11PM +0100, Prabhakar wrote:

> - This series requires testing on Cores with zicbom and T-Head SoCs

I don't actually know if there are Zicbom parts, may need to test that
on QEMU.
I had to revert unrelated content to boot, but my D1 NFS setup seems to
work fine with these changes, so where it is relevant:
Tested-by: Conor Dooley <conor.dooley@microchip.com> # tyre-kicking on D1

Cheers,
Conor.
  
Lad, Prabhakar March 31, 2023, 8:09 p.m. UTC | #2
Hi Conor,

On Fri, Mar 31, 2023 at 7:05 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Thu, Mar 30, 2023 at 09:42:11PM +0100, Prabhakar wrote:
>
> > - This series requires testing on Cores with zicbom and T-Head SoCs
>
> I don't actually know if there are Zicbom parts, may need to test that
> on QEMU.
> I had to revert unrelated content to boot, but my D1 NFS setup seems to
> work fine with these changes, so where it is relevant:
> Tested-by: Conor Dooley <conor.dooley@microchip.com> # tyre-kicking on D1
>
Thank you for testing this. By any chance did you compare the performance?

Cheers,
Prabhakar
  
Conor Dooley March 31, 2023, 8:15 p.m. UTC | #3
On Fri, Mar 31, 2023 at 08:09:16PM +0000, Lad, Prabhakar wrote:
> Hi Conor,
> 
> On Fri, Mar 31, 2023 at 7:05 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > On Thu, Mar 30, 2023 at 09:42:11PM +0100, Prabhakar wrote:
> >
> > > - This series requires testing on Cores with zicbom and T-Head SoCs
> >
> > I don't actually know if there are Zicbom parts, may need to test that
> > on QEMU.
> > I had to revert unrelated content to boot, but my D1 NFS setup seems to
> > work fine with these changes, so where it is relevant:
> > Tested-by: Conor Dooley <conor.dooley@microchip.com> # tyre-kicking on D1
> >
> Thank you for testing this. By any chance did you compare the performance?

No, just tyre kicking. Icenowy had some benchmark for it IIRC, I think
mining some coin or w/e. +CC them.
  
Icenowy Zheng April 1, 2023, 1:47 a.m. UTC | #4
On Fri, 2023-03-31 at 21:15 +0100, Conor Dooley wrote:
> On Fri, Mar 31, 2023 at 08:09:16PM +0000, Lad, Prabhakar wrote:
> > Hi Conor,
> > 
> > On Fri, Mar 31, 2023 at 7:05 PM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > On Thu, Mar 30, 2023 at 09:42:11PM +0100, Prabhakar wrote:
> > >
> > > > - This series requires testing on Cores with zicbom and T-Head SoCs
> > >
> > > I don't actually know if there are Zicbom parts, may need to test that
> > > on QEMU.
> > > I had to revert unrelated content to boot, but my D1 NFS setup seems to
> > > work fine with these changes, so where it is relevant:
> > > Tested-by: Conor Dooley <conor.dooley@microchip.com> # tyre-kicking on D1
> > >
> > Thank you for testing this. By any chance did you compare the performance?
>
> No, just tyre kicking. Icenowy had some benchmark for it IIRC, I think
> mining some coin or w/e. +CC them.

I previously tested the function pointer based CMO; it does not affect
performance beyond measurement error. Maybe that's because CMO
operations are only done at the start and end of DMA operations.

My previous test system was LiteX + OpenC906.