[03/13] scatterlist: Add sg_set_folio()

Message ID 20230621164557.3510324-4-willy@infradead.org
State New
Headers
Series Remove pagevecs |

Commit Message

Matthew Wilcox June 21, 2023, 4:45 p.m. UTC
  This wrapper for sg_set_page() lets drivers add folios to a scatterlist
more easily.  We could, perhaps, do better by using a different page
in the folio if offset is larger than UINT_MAX, but let's hope we get
a better data structure than this before we need to care about such
large folios.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 include/linux/scatterlist.h | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)
  

Comments

Zhu Yanjun July 30, 2023, 11:01 a.m. UTC | #1
在 2023/6/22 0:45, Matthew Wilcox (Oracle) 写道:
> This wrapper for sg_set_page() lets drivers add folios to a scatterlist
> more easily.  We could, perhaps, do better by using a different page
> in the folio if offset is larger than UINT_MAX, but let's hope we get
> a better data structure than this before we need to care about such
> large folios.
> 
> Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> ---
>   include/linux/scatterlist.h | 24 ++++++++++++++++++++++++
>   1 file changed, 24 insertions(+)
> 
> diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
> index ec46d8e8e49d..77df3d7b18a6 100644
> --- a/include/linux/scatterlist.h
> +++ b/include/linux/scatterlist.h
> @@ -141,6 +141,30 @@ static inline void sg_set_page(struct scatterlist *sg, struct page *page,
>   	sg->length = len;
>   }
>   
> +/**
> + * sg_set_folio - Set sg entry to point at given folio
> + * @sg:		 SG entry
> + * @folio:	 The folio
> + * @len:	 Length of data
> + * @offset:	 Offset into folio
> + *
> + * Description:
> + *   Use this function to set an sg entry pointing at a folio, never assign
> + *   the folio directly. We encode sg table information in the lower bits
> + *   of the folio pointer. See sg_page() for looking up the page belonging
> + *   to an sg entry.
> + *
> + **/
> +static inline void sg_set_folio(struct scatterlist *sg, struct folio *folio,
> +			       size_t len, size_t offset)
> +{
> +	WARN_ON_ONCE(len > UINT_MAX);
> +	WARN_ON_ONCE(offset > UINT_MAX);
> +	sg_assign_page(sg, &folio->page);
> +	sg->offset = offset;
> +	sg->length = len;
> +}
> +

https://elixir.bootlin.com/linux/latest/source/lib/scatterlist.c#L451

Does the following function have folio version?

"
int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
		struct page **pages, unsigned int n_pages, unsigned int offset,
		unsigned long size, unsigned int max_segment,
		unsigned int left_pages, gfp_t gfp_mask)
"

Thanks a lot.
Zhu Yanjun

>   static inline struct page *sg_page(struct scatterlist *sg)
>   {
>   #ifdef CONFIG_DEBUG_SG
  
Matthew Wilcox July 30, 2023, 11:18 a.m. UTC | #2
On Sun, Jul 30, 2023 at 07:01:26PM +0800, Zhu Yanjun wrote:
> Does the following function have folio version?
> 
> "
> int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
> 		struct page **pages, unsigned int n_pages, unsigned int offset,
> 		unsigned long size, unsigned int max_segment,
> 		unsigned int left_pages, gfp_t gfp_mask)
> "

No -- I haven't needed to convert anything that uses
sg_alloc_append_table_from_pages() yet.  It doesn't look like it should
be _too_ hard to add a folio version.
  
Zhu Yanjun July 30, 2023, 1:57 p.m. UTC | #3
在 2023/7/30 19:18, Matthew Wilcox 写道:
> On Sun, Jul 30, 2023 at 07:01:26PM +0800, Zhu Yanjun wrote:
>> Does the following function have folio version?
>>
>> "
>> int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
>> 		struct page **pages, unsigned int n_pages, unsigned int offset,
>> 		unsigned long size, unsigned int max_segment,
>> 		unsigned int left_pages, gfp_t gfp_mask)
>> "
> No -- I haven't needed to convert anything that uses
> sg_alloc_append_table_from_pages() yet.  It doesn't look like it should
> be _too_ hard to add a folio version.

In many places, this function is used. So this function needs the folio 
version.

Another problem, after folio is used, I want to know the performance 
after folio is implemented.

How to make tests to get the performance?

Thanks a lot.

Zhu Yanjun
  
Matthew Wilcox July 30, 2023, 9:42 p.m. UTC | #4
On Sun, Jul 30, 2023 at 09:57:06PM +0800, Zhu Yanjun wrote:
> 
> 在 2023/7/30 19:18, Matthew Wilcox 写道:
> > On Sun, Jul 30, 2023 at 07:01:26PM +0800, Zhu Yanjun wrote:
> > > Does the following function have folio version?
> > > 
> > > "
> > > int sg_alloc_append_table_from_pages(struct sg_append_table *sgt_append,
> > > 		struct page **pages, unsigned int n_pages, unsigned int offset,
> > > 		unsigned long size, unsigned int max_segment,
> > > 		unsigned int left_pages, gfp_t gfp_mask)
> > > "
> > No -- I haven't needed to convert anything that uses
> > sg_alloc_append_table_from_pages() yet.  It doesn't look like it should
> > be _too_ hard to add a folio version.
> 
> In many places, this function is used. So this function needs the folio
> version.

It's not used in very many places.  But the first one that I see it used
(drivers/infiniband/core/umem.c), you can't do a straightforward folio
conversion:

                pinned = pin_user_pages_fast(cur_base,
                                          min_t(unsigned long, npages,
                                                PAGE_SIZE /
                                                sizeof(struct page *)),
                                          gup_flags, page_list);
...
                ret = sg_alloc_append_table_from_pages(
                        &umem->sgt_append, page_list, pinned, 0,
                        pinned << PAGE_SHIFT, ib_dma_max_seg_size(device),
                        npages, GFP_KERNEL);

That can't be converted to folios.  The GUP might start in the middle of
the folio, and we have no way to communicate that.

This particular usage really needs the phyr work that Jason is doing so
we can efficiently communicate physically contiguous ranges from GUP
to sg.

> Another problem, after folio is used, I want to know the performance after
> folio is implemented.
> 
> How to make tests to get the performance?

You know what you're working on ... I wouldn't know how best to test
your code.
  

Patch

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index ec46d8e8e49d..77df3d7b18a6 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -141,6 +141,30 @@  static inline void sg_set_page(struct scatterlist *sg, struct page *page,
 	sg->length = len;
 }
 
+/**
+ * sg_set_folio - Set sg entry to point at given folio
+ * @sg:		 SG entry
+ * @folio:	 The folio
+ * @len:	 Length of data
+ * @offset:	 Offset into folio
+ *
+ * Description:
+ *   Use this function to set an sg entry pointing at a folio, never assign
+ *   the folio directly. We encode sg table information in the lower bits
+ *   of the folio pointer. See sg_page() for looking up the page belonging
+ *   to an sg entry.
+ *
+ **/
+static inline void sg_set_folio(struct scatterlist *sg, struct folio *folio,
+			       size_t len, size_t offset)
+{
+	WARN_ON_ONCE(len > UINT_MAX);
+	WARN_ON_ONCE(offset > UINT_MAX);
+	sg_assign_page(sg, &folio->page);
+	sg->offset = offset;
+	sg->length = len;
+}
+
 static inline struct page *sg_page(struct scatterlist *sg)
 {
 #ifdef CONFIG_DEBUG_SG