dma: DMA_ATTR_SKIP_CPU_SYNC documentation tweaks

Message ID 98ef4f76d7a5f90b0878e649a70b101402b8889d.1689761699.git.mst@redhat.com
State New
Series dma: DMA_ATTR_SKIP_CPU_SYNC documentation tweaks

Commit Message

Michael S. Tsirkin July 19, 2023, 10:15 a.m. UTC
A recent patchset highlighted to me that DMA_ATTR_SKIP_CPU_SYNC
might be easily misunderstood.

This attempts to improve documentation in several ways:

- When used with dma_map_*, DMA_ATTR_SKIP_CPU_SYNC does not
  really assume that the buffer has been transferred previously:
  the buffer often does not even exist in the device domain
  before it is mapped, so it normally has to be transferred later.
  Add a hint on how the buffer can be transferred.

- Code comments near DMA_ATTR_SKIP_CPU_SYNC focus on
  the use case of CPU cache synchronization, while in
  practice this flag isn't limited to that.
  Make them more generic.

A couple of things I'm considering leaving for a follow-up patch:
- rename DMA_ATTR_SKIP_CPU_SYNC to DMA_ATTR_SKIP_SYNC:
  I can see nothing that makes it especially related to the CPU.
- drop mentions of CPU cache from documentation completely
  and talk about CPU domain exclusively, or maybe mention
  CPU cache as an example: CPU domain (e.g. CPU cache).

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 Documentation/core-api/dma-attributes.rst | 5 +++--
 include/linux/dma-mapping.h               | 5 ++---
 2 files changed, 5 insertions(+), 5 deletions(-)
  

Comments

Christoph Hellwig July 20, 2023, 6:07 a.m. UTC | #1
On Wed, Jul 19, 2023 at 06:15:59AM -0400, Michael S. Tsirkin wrote:
> A recent patchset highlighted to me that DMA_ATTR_SKIP_CPU_SYNC
> might be easily misunderstood.

.. just curious: what patchset is that?  DMA_ATTR_SKIP_CPU_SYNC is
often a bad idea and all users probably could use a really good
audit..

>  #define DMA_ATTR_NO_KERNEL_MAPPING	(1UL << 4)
>  /*
> - * DMA_ATTR_SKIP_CPU_SYNC: Allows platform code to skip synchronization of
> - * the CPU cache for the given buffer assuming that it has been already
> - * transferred to 'device' domain.
> + * DMA_ATTR_SKIP_CPU_SYNC: Allows platform code to skip synchronization of the
> + * CPU and device domains for the given buffer.

While we're at it, I think "allows" is the wrong word here, we really
must skip the synchronization or else we're in trouble.
  
Michael S. Tsirkin July 20, 2023, 6:21 a.m. UTC | #2
On Thu, Jul 20, 2023 at 08:07:42AM +0200, Christoph Hellwig wrote:
> On Wed, Jul 19, 2023 at 06:15:59AM -0400, Michael S. Tsirkin wrote:
> > A recent patchset highlighted to me that DMA_ATTR_SKIP_CPU_SYNC
> > might be easily misunderstood.
> 
> .. just curious: what patchset is that?  DMA_ATTR_SKIP_CPU_SYNC is
> often a bad idea and all users probably could use a really good
> audit..

Message-Id: <20230710034237.12391-1-xuanzhuo@linux.alibaba.com>


Looks like there's really little else can be done: there's a
shared page we allow DMA into, so we sync periodically.
Then when we unmap we really do not need that data
synced again.

What exactly is wrong with this?


> >  #define DMA_ATTR_NO_KERNEL_MAPPING	(1UL << 4)
> >  /*
> > - * DMA_ATTR_SKIP_CPU_SYNC: Allows platform code to skip synchronization of
> > - * the CPU cache for the given buffer assuming that it has been already
> > - * transferred to 'device' domain.
> > + * DMA_ATTR_SKIP_CPU_SYNC: Allows platform code to skip synchronization of the
> > + * CPU and device domains for the given buffer.
> 
> While we're at it, I think "allows" is the wrong word here, we really
> must skip the synchronization or else we're in trouble.

Hmm could you explain? I thought multiple sync operations are harmless.
  
Christoph Hellwig July 20, 2023, 6:25 a.m. UTC | #3
On Thu, Jul 20, 2023 at 02:21:05AM -0400, Michael S. Tsirkin wrote:
> On Thu, Jul 20, 2023 at 08:07:42AM +0200, Christoph Hellwig wrote:
> > On Wed, Jul 19, 2023 at 06:15:59AM -0400, Michael S. Tsirkin wrote:
> > > A recent patchset highlighted to me that DMA_ATTR_SKIP_CPU_SYNC
> > > might be easily misunderstood.
> > 
> > .. just curious: what patchset is that?  DMA_ATTR_SKIP_CPU_SYNC is
> > often a bad idea and all users probably could use a really good
> > audit..
> 
> Message-Id: <20230710034237.12391-1-xuanzhuo@linux.alibaba.com>

Do you have an actual link?

> 
> 
> Looks like there's really little else can be done: there's a
> shared page we allow DMA into, so we sync periodically.
> Then when we unmap we really do not need that data
> synced again.
> 
> What exactly is wrong with this?

A "shared" page without ownership can't work with the streaming
DMA API (dma_map_*) at all.  You need to use dma_alloc_coherent
so that it is mapped uncached.
  
Michael S. Tsirkin July 20, 2023, 6:30 a.m. UTC | #4
On Thu, Jul 20, 2023 at 08:25:25AM +0200, Christoph Hellwig wrote:
> On Thu, Jul 20, 2023 at 02:21:05AM -0400, Michael S. Tsirkin wrote:
> > On Thu, Jul 20, 2023 at 08:07:42AM +0200, Christoph Hellwig wrote:
> > > On Wed, Jul 19, 2023 at 06:15:59AM -0400, Michael S. Tsirkin wrote:
> > > > A recent patchset highlighted to me that DMA_ATTR_SKIP_CPU_SYNC
> > > > might be easily misunderstood.
> > > 
> > > .. just curious: what patchset is that?  DMA_ATTR_SKIP_CPU_SYNC is
> > > often a bad idea and all users probably could use a really good
> > > audit..
> > 
> > Message-Id: <20230710034237.12391-1-xuanzhuo@linux.alibaba.com>
> 
> Do you have an actual link?

sure, they are not hard to generate ;)

https://lore.kernel.org/all/20230710034237.12391-11-xuanzhuo%40linux.alibaba.com

> > 
> > 
> > Looks like there's really little else can be done: there's a
> > shared page we allow DMA into, so we sync periodically.
> > Then when we unmap we really do not need that data
> > synced again.
> > 
> > What exactly is wrong with this?
> 
> A "shared" page without ownership can't work with the streaming
> DMA API (dma_map_*) at all.  You need to use dma_alloc_coherent
> so that it is mapped uncached.

Hmm confused.  Based on both documentation and code I think this works:

	dma_map
	dma_sync
	dma_sync
	dma_sync
	dma_sync
	dma_unmap(DMA_ATTR_SKIP_CPU_SYNC)

right?
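
The sequence above can be mimicked in a small userspace toy model. This is
only a sketch, not kernel code: toy_map(), toy_sync_for_cpu(), toy_unmap(),
toy_run() and TOY_ATTR_SKIP_CPU_SYNC are hypothetical stand-ins for the
dma_map_*/dma_sync_*/dma_unmap_* family and the attribute, and the model
assumes a mapping where unmap would otherwise perform one more sync for
the CPU. It only counts sync operations.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for DMA_ATTR_SKIP_CPU_SYNC; not the kernel macro. */
#define TOY_ATTR_SKIP_CPU_SYNC (1UL << 5)

/* Toy bookkeeping for one streaming mapping (illustration only). */
struct toy_mapping {
	bool mapped;		/* true between toy_map() and toy_unmap() */
	unsigned int syncs;	/* number of for-CPU syncs performed so far */
};

static void toy_map(struct toy_mapping *m)
{
	m->mapped = true;
	m->syncs = 0;
}

/* Models an explicit sync for the CPU on a live mapping. */
static void toy_sync_for_cpu(struct toy_mapping *m)
{
	assert(m->mapped);
	m->syncs++;
}

/* Models unmap: syncs once more unless the skip attribute is passed. */
static void toy_unmap(struct toy_mapping *m, unsigned long attrs)
{
	assert(m->mapped);
	if (!(attrs & TOY_ATTR_SKIP_CPU_SYNC))
		m->syncs++;
	m->mapped = false;
}

/* map, nsyncs explicit syncs, unmap; returns the total number of syncs. */
static unsigned int toy_run(unsigned int nsyncs, unsigned long attrs)
{
	struct toy_mapping m;
	unsigned int i;

	toy_map(&m);
	for (i = 0; i < nsyncs; i++)
		toy_sync_for_cpu(&m);
	toy_unmap(&m, attrs);
	return m.syncs;
}
```

In this model toy_run(4, TOY_ATTR_SKIP_CPU_SYNC) counts four syncs, while
toy_run(4, 0) counts five; the extra one is the sync done at unmap time.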
  
Michael S. Tsirkin July 20, 2023, 6:34 a.m. UTC | #5
On Thu, Jul 20, 2023 at 02:30:08AM -0400, Michael S. Tsirkin wrote:
> On Thu, Jul 20, 2023 at 08:25:25AM +0200, Christoph Hellwig wrote:
> > On Thu, Jul 20, 2023 at 02:21:05AM -0400, Michael S. Tsirkin wrote:
> > > On Thu, Jul 20, 2023 at 08:07:42AM +0200, Christoph Hellwig wrote:
> > > > On Wed, Jul 19, 2023 at 06:15:59AM -0400, Michael S. Tsirkin wrote:
> > > > > A recent patchset highlighted to me that DMA_ATTR_SKIP_CPU_SYNC
> > > > > might be easily misunderstood.
> > > > 
> > > > .. just curious: what patchset is that?  DMA_ATTR_SKIP_CPU_SYNC is
> > > > often a bad idea and all users probably could use a really good
> > > > audit..
> > > 
> > > Message-Id: <20230710034237.12391-1-xuanzhuo@linux.alibaba.com>
> > 
> > Do you have an actual link?
> 
> sure, they are not hard to generate ;)
> 
> https://lore.kernel.org/all/20230710034237.12391-11-xuanzhuo%40linux.alibaba.com


actually there's a new version

https://lore.kernel.org/all/20230719040422.126357-11-xuanzhuo%40linux.alibaba.com

you can see it does map, sync, unmap

unmap immediately after sync seems to be exactly the use case
for DMA_ATTR_SKIP_CPU_SYNC.


> > > 
> > > 
> > > Looks like there's really little else can be done: there's a
> > > shared page we allow DMA into, so we sync periodically.
> > > Then when we unmap we really do not need that data
> > > synced again.
> > > 
> > > What exactly is wrong with this?
> > 
> > A "shared" page without ownership can't work with the streaming
> > DMA API (dma_map_*) at all.  You need to use dma_alloc_coherent
> > so that it is mapped uncached.
> 
> Hmm confused.  Based on both documentation and code I think this works:
> 
> 	dma_map
> 	dma_sync
> 	dma_sync
> 	dma_sync
> 	dma_sync
> 	dma_unmap(DMA_ATTR_SKIP_CPU_SYNC)
> 
> right?
> 
> -- 
> MST
  
Christoph Hellwig July 20, 2023, 6:43 a.m. UTC | #6
On Thu, Jul 20, 2023 at 02:30:04AM -0400, Michael S. Tsirkin wrote:
> sure, they are not hard to generate ;)
> 
> https://lore.kernel.org/all/20230710034237.12391-11-xuanzhuo%40linux.alibaba.com

Thanks, I'll chime in there.

> > > Looks like there's really little else can be done: there's a
> > > shared page we allow DMA into, so we sync periodically.
> > > Then when we unmap we really do not need that data
> > > synced again.
> > > 
> > > What exactly is wrong with this?
> > 
> > A "shared" page without ownership can't work with the streaming
> > DMA API (dma_map_*) at all.  You need to use dma_alloc_coherent
> > so that it is mapped uncached.
> 
> Hmm confused.  Based on both documentation and code I think this works:
> 
> 	dma_map
> 	dma_sync
> 	dma_sync
> 	dma_sync
> 	dma_sync
> 	dma_unmap(DMA_ATTR_SKIP_CPU_SYNC)
> 
> right?

Depends on your definition of "shared".  If there is always a clear
owner at a given time you can play games with lots of syncs that transfer
ownership.  If there is no clear ownership, and the "device" just
DMAs into the buffer at random times and the host checks bits in
there we need to map the buffer uncached.

I'll chime in in the thread.

> 
> -- 
> MST
---end quoted text---
  
Michael S. Tsirkin July 20, 2023, 7:06 a.m. UTC | #7
On Thu, Jul 20, 2023 at 08:43:18AM +0200, Christoph Hellwig wrote:
> On Thu, Jul 20, 2023 at 02:30:04AM -0400, Michael S. Tsirkin wrote:
> > sure, they are not hard to generate ;)
> > 
> > https://lore.kernel.org/all/20230710034237.12391-11-xuanzhuo%40linux.alibaba.com
> 
> Thanks, I'll chime in there.
> 
> > > > Looks like there's really little else can be done: there's a
> > > > shared page we allow DMA into, so we sync periodically.
> > > > Then when we unmap we really do not need that data
> > > > synced again.
> > > > 
> > > > What exactly is wrong with this?
> > > 
> > > A "shared" page without ownership can't work with the streaming
> > > DMA API (dma_map_*) at all.  You need to use dma_alloc_coherent
> > > so that it is mapped uncached.
> > 
> > Hmm confused.  Based on both documentation and code I think this works:
> > 
> > 	dma_map
> > 	dma_sync
> > 	dma_sync
> > 	dma_sync
> > 	dma_sync
> > 	dma_unmap(DMA_ATTR_SKIP_CPU_SYNC)
> > 
> > right?
> 
> Depends on your definition of "shared".  If there is always a clear
> owner at a given time you can play games with lots of syncs that transfer
> ownership.  If there is no clear ownership, and the "device" just
> DMAs into the buffer at random times and the host checks bits in
> there we need to map the buffer uncached.
> 
> I'll chime in in the thread.

Each chunk of that buffer is DMA'd into separately and then sync'd
afterwards.
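
As a rough illustration of the per-chunk ownership idea, here is a userspace
toy model. It is a sketch under stated assumptions, not kernel code: the
toy_* names and TOY_NCHUNKS are hypothetical, each chunk is assumed to have
exactly one owner at any time, and a sync for the CPU is modeled as handing
that one chunk over to the CPU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NCHUNKS 4	/* hypothetical number of chunks in the page */

enum toy_owner { TOY_OWNER_DEVICE, TOY_OWNER_CPU };

/* One owner per chunk of the mapped page (illustration only). */
struct toy_page {
	enum toy_owner owner[TOY_NCHUNKS];
};

/* After mapping, every chunk belongs to the device. */
static void toy_page_map(struct toy_page *p)
{
	size_t i;

	for (i = 0; i < TOY_NCHUNKS; i++)
		p->owner[i] = TOY_OWNER_DEVICE;
}

/* Models a ranged sync for the CPU: hands one chunk over to the CPU. */
static void toy_chunk_sync_for_cpu(struct toy_page *p, size_t chunk)
{
	assert(chunk < TOY_NCHUNKS);
	p->owner[chunk] = TOY_OWNER_CPU;
}

/* The CPU may only look at a chunk it currently owns. */
static bool toy_cpu_may_read(const struct toy_page *p, size_t chunk)
{
	assert(chunk < TOY_NCHUNKS);
	return p->owner[chunk] == TOY_OWNER_CPU;
}

/* In this model, unmap needs no extra sync once the CPU owns all chunks. */
static bool toy_unmap_can_skip_sync(const struct toy_page *p)
{
	size_t i;

	for (i = 0; i < TOY_NCHUNKS; i++)
		if (p->owner[i] != TOY_OWNER_CPU)
			return false;
	return true;
}
```

In this model a chunk only becomes readable by the CPU after its own sync,
and skipping the sync at unmap is only reasonable once every chunk has been
handed over that way.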


> > 
> > -- 
> > MST
> ---end quoted text---
  

Patch

diff --git a/Documentation/core-api/dma-attributes.rst b/Documentation/core-api/dma-attributes.rst
index 1887d92e8e92..782734666790 100644
--- a/Documentation/core-api/dma-attributes.rst
+++ b/Documentation/core-api/dma-attributes.rst
@@ -61,8 +61,9 @@  same synchronization operation on the CPU cache. CPU cache synchronization
 might be a time consuming operation, especially if the buffers are
 large, so it is highly recommended to avoid it if possible.
 DMA_ATTR_SKIP_CPU_SYNC allows platform code to skip synchronization of
-the CPU cache for the given buffer assuming that it has been already
-transferred to 'device' domain. This attribute can be also used for
+the CPU cache for the given buffer assuming that it is
+transferred to 'device' domain separately, e.g. using
+dma_sync_{single,sg}_for_{cpu,device}. This attribute can be also used for
 dma_unmap_{single,page,sg} functions family to force buffer to stay in
 device domain after releasing a mapping for it. Use this attribute with
 care!
diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 0ee20b764000..13295ae4385a 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -32,9 +32,8 @@ 
  */
 #define DMA_ATTR_NO_KERNEL_MAPPING	(1UL << 4)
 /*
- * DMA_ATTR_SKIP_CPU_SYNC: Allows platform code to skip synchronization of
- * the CPU cache for the given buffer assuming that it has been already
- * transferred to 'device' domain.
+ * DMA_ATTR_SKIP_CPU_SYNC: Allows platform code to skip synchronization of the
+ * CPU and device domains for the given buffer.
  */
 #define DMA_ATTR_SKIP_CPU_SYNC		(1UL << 5)
 /*