[mm-unstable] mm: clarify folio_set_compound_order() zero support

Message ID 20221207223731.32784-1-sidhartha.kumar@oracle.com
State New
Headers
Series [mm-unstable] mm: clarify folio_set_compound_order() zero support |

Commit Message

Sidhartha Kumar Dec. 7, 2022, 10:37 p.m. UTC
  Document hugetlb's use of a zero compound order so support for zero
orders is not removed from folio_set_compound_order().

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
Suggested-by: Muchun Song <songmuchun@bytedance.com>
---
This can be folded into f2b67a51d0ef6871d4fb0c3e8199f278112bd108
mm: add folio dtor and order setter functions

 include/linux/mm.h | 7 +++++++
 1 file changed, 7 insertions(+)
  

Comments

John Hubbard Dec. 8, 2022, 12:38 a.m. UTC | #1
On 12/7/22 14:37, Sidhartha Kumar wrote:
> Document hugetlb's use of a zero compound order so support for zero
> orders is not removed from folio_set_compound_order().
> 
> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
> Suggested-by: Muchun Song <songmuchun@bytedance.com>
> ---
> This can be folded into f2b67a51d0ef6871d4fb0c3e8199f278112bd108
> mm: add folio dtor and order setter functions
> 
>   include/linux/mm.h | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 443d496949a8..cd8508d728f1 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -999,9 +999,16 @@ static inline void set_compound_order(struct page *page, unsigned int order)
>   #endif
>   }
>   
> +/*
> + * folio_set_compound_order is generally passed a non-zero order to
> + * initialize a large folio.  However, hugetlb code abuses this by
> + * passing in zero when 'dissolving' a large folio.
> + */

Wouldn't it be better to instead just create a new function for that
case, such as:

     dissolve_large_folio()

?

>   static inline void folio_set_compound_order(struct folio *folio,
>   		unsigned int order)
>   {
> +	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> +
>   	folio->_folio_order = order;
>   #ifdef CONFIG_64BIT
>   	folio->_folio_nr_pages = order ? 1U << order : 0;

thanks,
  
Sidhartha Kumar Dec. 8, 2022, 1:42 a.m. UTC | #2
On 12/7/22 4:38 PM, John Hubbard wrote:
> On 12/7/22 14:37, Sidhartha Kumar wrote:
>> Document hugetlb's use of a zero compound order so support for zero
>> orders is not removed from folio_set_compound_order().
>>
>> Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
>> Suggested-by: Mike Kravetz <mike.kravetz@oracle.com>
>> Suggested-by: Muchun Song <songmuchun@bytedance.com>
>> ---
>> This can be folded into f2b67a51d0ef6871d4fb0c3e8199f278112bd108
>> mm: add folio dtor and order setter functions
>>
>>   include/linux/mm.h | 7 +++++++
>>   1 file changed, 7 insertions(+)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 443d496949a8..cd8508d728f1 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -999,9 +999,16 @@ static inline void set_compound_order(struct page 
>> *page, unsigned int order)
>>   #endif
>>   }
>> +/*
>> + * folio_set_compound_order is generally passed a non-zero order to
>> + * initialize a large folio.  However, hugetlb code abuses this by
>> + * passing in zero when 'dissolving' a large folio.
>> + */
> 
> Wouldn't it be better to instead just create a new function for that
> case, such as:
> 
>      dissolve_large_folio()
> 

Prior to the folio conversion, the helper function 
__destroy_compound_gigantic_page() did:

	set_compound_order(page, 0);
#ifdef CONFIG_64BIT
	page[1].compound_nr = 0;
#endif

as part of dissolving the page. My goal for this patch was to create a 
function that would encapsulate that segment of code with a single call 
of folio_set_compound_order(folio, 0). set_compound_order() does not set 
compound_nr to 0 when 0 is passed in to the order argument so explicitly 
setting it is required. I don't think a separate dissolve_large_folio() 
function for the hugetlb case is needed as 
__destroy_compound_gigantic_folio() is pretty concise as it is.

> ?
> 
>>   static inline void folio_set_compound_order(struct folio *folio,
>>           unsigned int order)
>>   {
>> +    VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>> +
>>       folio->_folio_order = order;
>>   #ifdef CONFIG_64BIT
>>       folio->_folio_nr_pages = order ? 1U << order : 0;
> 
> thanks,
  
John Hubbard Dec. 8, 2022, 2:27 a.m. UTC | #3
On 12/7/22 17:42, Sidhartha Kumar wrote:
>> Wouldn't it be better to instead just create a new function for that
>> case, such as:
>>
>>      dissolve_large_folio()
>>
> 
> Prior to the folio conversion, the helper function __destroy_compound_gigantic_page() did:
> 
>      set_compound_order(page, 0);
> #ifdef CONFIG_64BIT
>      page[1].compound_nr = 0;
> #endif
> 
> as part of dissolving the page. My goal for this patch was to create a function that would encapsulate that segment of code with a single call of folio_set_compound_order(folio, 0). set_compound_order() does not set compound_nr to 0 when 0 is passed in to the order argument so explicitly setting it is required. I don't think a separate dissolve_large_folio() function for the hugetlb case is needed as __destroy_compound_gigantic_folio() is pretty concise as it is.
> 

Instead of "this is abusing function X()" comments, we should prefer
well-named functions that do something understandable. And you can get
that by noticing that folio_set_compound_order() collapses down to
nearly nothing in the special "order 0" case. So just inline that code
directly into __destroy_compound_gigantic_folio(), taking a moment to
fill in and consolidate the CONFIG_64BIT missing parts in mm.h.

And now you can get rid of this cruft and "abuse" comment, and instead
just end up with two simple lines of code that are crystal clear--as
they should be, in a "__destroy" function. Like this:


diff --git a/include/linux/mm.h b/include/linux/mm.h
index 105878936485..cf227ed00945 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1754,6 +1754,7 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
  #endif
  }
  
+#ifdef CONFIG_64BIT
  /**
   * folio_nr_pages - The number of pages in the folio.
   * @folio: The folio.
@@ -1764,13 +1765,32 @@ static inline long folio_nr_pages(struct folio *folio)
  {
  	if (!folio_test_large(folio))
  		return 1;
-#ifdef CONFIG_64BIT
  	return folio->_folio_nr_pages;
+}
+
+static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
+{
+	folio->_folio_nr_pages = nr_pages;
+}
  #else
+/**
+ * folio_nr_pages - The number of pages in the folio.
+ * @folio: The folio.
+ *
+ * Return: A positive power of two.
+ */
+static inline long folio_nr_pages(struct folio *folio)
+{
+	if (!folio_test_large(folio))
+		return 1;
  	return 1L << folio->_folio_order;
-#endif
  }
  
+static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
+{
+}
+#endif
+
  /**
   * folio_next - Move to the next physical folio.
   * @folio: The folio we're currently operating on.
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index e3500c087893..b507a98063e6 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1344,7 +1344,8 @@ static void __destroy_compound_gigantic_folio(struct folio *folio,
  			set_page_refcounted(p);
  	}
  
-	folio_set_compound_order(folio, 0);
+	folio->_folio_order = 0;
+	folio_set_nr_pages(folio, 0);
  	__folio_clear_head(folio);
  }
  

Yes?

thanks,
  
Muchun Song Dec. 8, 2022, 4:41 a.m. UTC | #4
On Thu, Dec 8, 2022 at 10:27 AM John Hubbard <jhubbard@nvidia.com> wrote:
>
> On 12/7/22 17:42, Sidhartha Kumar wrote:
> >> Wouldn't it be better to instead just create a new function for that
> >> case, such as:
> >>
> >>      dissolve_large_folio()
> >>
> >
> > Prior to the folio conversion, the helper function __destroy_compound_gigantic_page() did:
> >
> >      set_compound_order(page, 0);
> > #ifdef CONFIG_64BIT
> >      page[1].compound_nr = 0;
> > #endif
> >
> > as part of dissolving the page. My goal for this patch was to create a function that would encapsulate that segment of code with a single call of folio_set_compound_order(folio, 0). set_compound_order() does not set compound_nr to 0 when 0 is passed in to the order argument so explicitly setting it is required. I don't think a separate dissolve_large_folio() function for the hugetlb case is needed as __destroy_compound_gigantic_folio() is pretty concise as it is.
> >
>
> Instead of "this is abusing function X()" comments, we should prefer
> well-named functions that do something understandable. And you can get
> that by noticing that folio_set_compound_order() collapses down to
> nearly nothing in the special "order 0" case. So just inline that code
> directly into __destroy_compound_gigantic_folio(), taking a moment to
> fill in and consolidate the CONFIG_64BIT missing parts in mm.h.
>
> And now you can get rid of this cruft and "abuse" comment, and instead
> just end up with two simple lines of code that are crystal clear--as
> they should be, in a "__destroy" function. Like this:
>
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 105878936485..cf227ed00945 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1754,6 +1754,7 @@ static inline void set_page_links(struct page *page, enum zone_type zone,
>   #endif
>   }
>
> +#ifdef CONFIG_64BIT
>   /**
>    * folio_nr_pages - The number of pages in the folio.
>    * @folio: The folio.
> @@ -1764,13 +1765,32 @@ static inline long folio_nr_pages(struct folio *folio)
>   {
>         if (!folio_test_large(folio))
>                 return 1;
> -#ifdef CONFIG_64BIT
>         return folio->_folio_nr_pages;
> +}
> +
> +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)

I like this approach and helper name since it is more consistent
with folio_nr_pages.

> +{
> +       folio->_folio_nr_pages = nr_pages;
> +}
>   #else
> +/**
> + * folio_nr_pages - The number of pages in the folio.
> + * @folio: The folio.
> + *
> + * Return: A positive power of two.
> + */
> +static inline long folio_nr_pages(struct folio *folio)
> +{
> +       if (!folio_test_large(folio))
> +               return 1;
>         return 1L << folio->_folio_order;
> -#endif
>   }
>
> +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> +{
> +}
> +#endif
> +
>   /**
>    * folio_next - Move to the next physical folio.
>    * @folio: The folio we're currently operating on.
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index e3500c087893..b507a98063e6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1344,7 +1344,8 @@ static void __destroy_compound_gigantic_folio(struct folio *folio,
>                         set_page_refcounted(p);
>         }
>
> -       folio_set_compound_order(folio, 0);
> +       folio->_folio_order = 0;

I suggest not touch _folio_order directly, we can introduce another helper like
folio_sert_order to set -> _folio_order pairing with folio_order.

Thanks.

> +       folio_set_nr_pages(folio, 0);
>         __folio_clear_head(folio);
>   }
>
>
> Yes?
>
> thanks,
> --
> John Hubbard
> NVIDIA
  
Sidhartha Kumar Dec. 8, 2022, 6:06 p.m. UTC | #5
On 12/7/22 6:27 PM, John Hubbard wrote:
> On 12/7/22 17:42, Sidhartha Kumar wrote:
>>> Wouldn't it be better to instead just create a new function for that
>>> case, such as:
>>>
>>>      dissolve_large_folio()
>>>
>>
>> Prior to the folio conversion, the helper function 
>> __destroy_compound_gigantic_page() did:
>>
>>      set_compound_order(page, 0);
>> #ifdef CONFIG_64BIT
>>      page[1].compound_nr = 0;
>> #endif
>>
>> as part of dissolving the page. My goal for this patch was to create a 
>> function that would encapsulate that segment of code with a single 
>> call of folio_set_compound_order(folio, 0). set_compound_order() does 
>> not set compound_nr to 0 when 0 is passed in to the order argument so 
>> explicitly setting it is required. I don't think a separate 
>> dissolve_large_folio() function for the hugetlb case is needed as 
>> __destroy_compound_gigantic_folio() is pretty concise as it is.
>>
> 
> Instead of "this is abusing function X()" comments, we should prefer
> well-named functions that do something understandable. And you can get
> that by noticing that folio_set_compound_order() collapses down to
> nearly nothing in the special "order 0" case. So just inline that code
> directly into __destroy_compound_gigantic_folio(), taking a moment to
> fill in and consolidate the CONFIG_64BIT missing parts in mm.h.
> 
> And now you can get rid of this cruft and "abuse" comment, and instead
> just end up with two simple lines of code that are crystal clear--as
> they should be, in a "__destroy" function. Like this:
> 
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 105878936485..cf227ed00945 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1754,6 +1754,7 @@ static inline void set_page_links(struct page 
> *page, enum zone_type zone,
>   #endif
>   }
> 
> +#ifdef CONFIG_64BIT
>   /**
>    * folio_nr_pages - The number of pages in the folio.
>    * @folio: The folio.
> @@ -1764,13 +1765,32 @@ static inline long folio_nr_pages(struct folio 
> *folio)
>   {
>       if (!folio_test_large(folio))
>           return 1;
> -#ifdef CONFIG_64BIT
>       return folio->_folio_nr_pages;
> +}
> +
> +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> +{
> +    folio->_folio_nr_pages = nr_pages;
> +}
>   #else
> +/**
> + * folio_nr_pages - The number of pages in the folio.
> + * @folio: The folio.
> + *
> + * Return: A positive power of two.
> + */
> +static inline long folio_nr_pages(struct folio *folio)
> +{
> +    if (!folio_test_large(folio))
> +        return 1;
>       return 1L << folio->_folio_order;
> -#endif
>   }
> 
> +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> +{
> +}
> +#endif
> +
>   /**
>    * folio_next - Move to the next physical folio.
>    * @folio: The folio we're currently operating on.
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index e3500c087893..b507a98063e6 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -1344,7 +1344,8 @@ static void 
> __destroy_compound_gigantic_folio(struct folio *folio,
>               set_page_refcounted(p);
>       }
> 
> -    folio_set_compound_order(folio, 0);
> +    folio->_folio_order = 0;
> +    folio_set_nr_pages(folio, 0);
>       __folio_clear_head(folio);
>   }
> 
> 
> Yes? 

This works for me, I will take this approach along with Muchun's 
feedback about a wrapper function so as not to touch _folio_order 
directly and send out a new version.

One question I have is if I should then get rid of 
folio_set_compound_order() as hugetlb is the only compound page user 
I've converted to folios so far and its use can be replaced by the 
suggested folio_set_nr_pages() and folio_set_order().

Hugetlb also has one has one call to folio_set_compound_order() with a 
non-zero order, should I replace this with a call to folio_set_order() 
and folio_set_nr_pages() as well, or keep folio_set_compound_order() and 
remove zero order support and the comment. Please let me know which 
approach you would prefer.

Thanks,
Sidhartha Kumar


> thanks,
  
Mike Kravetz Dec. 8, 2022, 7:32 p.m. UTC | #6
On 12/08/22 10:06, Sidhartha Kumar wrote:
> On 12/7/22 6:27 PM, John Hubbard wrote:
> > On 12/7/22 17:42, Sidhartha Kumar wrote:
> > > > Wouldn't it be better to instead just create a new function for that
> > > > case, such as:
> > > > 
> > > >      dissolve_large_folio()
> > > > 
> > > 
> > > Prior to the folio conversion, the helper function
> > > __destroy_compound_gigantic_page() did:
> > > 
> > >      set_compound_order(page, 0);
> > > #ifdef CONFIG_64BIT
> > >      page[1].compound_nr = 0;
> > > #endif
> > > 
> > > as part of dissolving the page. My goal for this patch was to create
> > > a function that would encapsulate that segment of code with a single
> > > call of folio_set_compound_order(folio, 0). set_compound_order()
> > > does not set compound_nr to 0 when 0 is passed in to the order
> > > argument so explicitly setting it is required. I don't think a
> > > separate dissolve_large_folio() function for the hugetlb case is
> > > needed as __destroy_compound_gigantic_folio() is pretty concise as
> > > it is.
> > > 
> > 
> > Instead of "this is abusing function X()" comments, we should prefer
> > well-named functions that do something understandable. And you can get
> > that by noticing that folio_set_compound_order() collapses down to
> > nearly nothing in the special "order 0" case. So just inline that code
> > directly into __destroy_compound_gigantic_folio(), taking a moment to
> > fill in and consolidate the CONFIG_64BIT missing parts in mm.h.
> > 
> > And now you can get rid of this cruft and "abuse" comment, and instead
> > just end up with two simple lines of code that are crystal clear--as
> > they should be, in a "__destroy" function. Like this:
> > 
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 105878936485..cf227ed00945 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1754,6 +1754,7 @@ static inline void set_page_links(struct page
> > *page, enum zone_type zone,
> >   #endif
> >   }
> > 
> > +#ifdef CONFIG_64BIT
> >   /**
> >    * folio_nr_pages - The number of pages in the folio.
> >    * @folio: The folio.
> > @@ -1764,13 +1765,32 @@ static inline long folio_nr_pages(struct folio
> > *folio)
> >   {
> >       if (!folio_test_large(folio))
> >           return 1;
> > -#ifdef CONFIG_64BIT
> >       return folio->_folio_nr_pages;
> > +}
> > +
> > +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> > +{
> > +    folio->_folio_nr_pages = nr_pages;
> > +}
> >   #else
> > +/**
> > + * folio_nr_pages - The number of pages in the folio.
> > + * @folio: The folio.
> > + *
> > + * Return: A positive power of two.
> > + */
> > +static inline long folio_nr_pages(struct folio *folio)
> > +{
> > +    if (!folio_test_large(folio))
> > +        return 1;
> >       return 1L << folio->_folio_order;
> > -#endif
> >   }
> > 
> > +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> > +{
> > +}
> > +#endif
> > +
> >   /**
> >    * folio_next - Move to the next physical folio.
> >    * @folio: The folio we're currently operating on.
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index e3500c087893..b507a98063e6 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1344,7 +1344,8 @@ static void
> > __destroy_compound_gigantic_folio(struct folio *folio,
> >               set_page_refcounted(p);
> >       }
> > 
> > -    folio_set_compound_order(folio, 0);
> > +    folio->_folio_order = 0;
> > +    folio_set_nr_pages(folio, 0);
> >       __folio_clear_head(folio);
> >   }
> > 
> >
> > Yes?
> 
> This works for me, I will take this approach along with Muchun's feedback
> about a wrapper function so as not to touch _folio_order directly and send
> out a new version.
> 
> One question I have is if I should then get rid of
> folio_set_compound_order() as hugetlb is the only compound page user I've
> converted to folios so far and its use can be replaced by the suggested
> folio_set_nr_pages() and folio_set_order().
> 
> Hugetlb also has one has one call to folio_set_compound_order() with a
> non-zero order, should I replace this with a call to folio_set_order() and
> folio_set_nr_pages() as well, or keep folio_set_compound_order() and remove
> zero order support and the comment. Please let me know which approach you
> would prefer.

My opinion ... which is often wrong :)

I agree 100% with John about these being clear and understandable routines,
as well as Muchun's comment about the need for an interface for setting
_folio_order instead of accessing directly.

However, I do have some concern about two independent interfaces for setting
_folio_order and _folio_nr_pages.  These two fields really do need to be "in
sync" and having two interfaces to modify independently may lead to errors.

Any thoughts Matthew?  You introduced folios as well as the somewhat
redundant compound_nr/_folio_nr_pages fields.

Here is another thought.  Since this discussion started with a question
about "Can/should you really pass a zero order to folio_set_compound_order?",
what about a separate "folio_clear_order" interface?

We would then have:
folio_set_order()	/* passed non-zero order when creating large folio */
folio_clear_order()	/* zero order is implied */

Note that I did drop 'compound' from the names.  The more I think about the
folio direction we really should not use compound in the same.

Both of these could wrap a renamed version of Sidhartha's routine
folio_set_compound_order().  In this way the two fields are automatically
kept in sync.

Again, just my thoughts/opinion.
  
Matthew Wilcox Dec. 8, 2022, 7:33 p.m. UTC | #7
On Thu, Dec 08, 2022 at 10:06:07AM -0800, Sidhartha Kumar wrote:
> On 12/7/22 6:27 PM, John Hubbard wrote:
> > On 12/7/22 17:42, Sidhartha Kumar wrote:
> > > > Wouldn't it be better to instead just create a new function for that
> > > > case, such as:
> > > > 
> > > >      dissolve_large_folio()
> > > > 
> > > 
> > > Prior to the folio conversion, the helper function
> > > __destroy_compound_gigantic_page() did:
> > > 
> > >      set_compound_order(page, 0);
> > > #ifdef CONFIG_64BIT
> > >      page[1].compound_nr = 0;
> > > #endif
> > > 
> > > as part of dissolving the page. My goal for this patch was to create
> > > a function that would encapsulate that segment of code with a single
> > > call of folio_set_compound_order(folio, 0). set_compound_order()
> > > does not set compound_nr to 0 when 0 is passed in to the order
> > > argument so explicitly setting it is required. I don't think a
> > > separate dissolve_large_folio() function for the hugetlb case is
> > > needed as __destroy_compound_gigantic_folio() is pretty concise as
> > > it is.
> > > 
> > 
> > Instead of "this is abusing function X()" comments, we should prefer
> > well-named functions that do something understandable. And you can get
> > that by noticing that folio_set_compound_order() collapses down to
> > nearly nothing in the special "order 0" case. So just inline that code
> > directly into __destroy_compound_gigantic_folio(), taking a moment to
> > fill in and consolidate the CONFIG_64BIT missing parts in mm.h.
> > 
> > And now you can get rid of this cruft and "abuse" comment, and instead
> > just end up with two simple lines of code that are crystal clear--as
> > they should be, in a "__destroy" function. Like this:
> > 
> > 
> > diff --git a/include/linux/mm.h b/include/linux/mm.h
> > index 105878936485..cf227ed00945 100644
> > --- a/include/linux/mm.h
> > +++ b/include/linux/mm.h
> > @@ -1754,6 +1754,7 @@ static inline void set_page_links(struct page
> > *page, enum zone_type zone,
> >   #endif
> >   }
> > 
> > +#ifdef CONFIG_64BIT
> >   /**
> >    * folio_nr_pages - The number of pages in the folio.
> >    * @folio: The folio.
> > @@ -1764,13 +1765,32 @@ static inline long folio_nr_pages(struct folio
> > *folio)
> >   {
> >       if (!folio_test_large(folio))
> >           return 1;
> > -#ifdef CONFIG_64BIT
> >       return folio->_folio_nr_pages;
> > +}
> > +
> > +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> > +{
> > +    folio->_folio_nr_pages = nr_pages;
> > +}
> >   #else
> > +/**
> > + * folio_nr_pages - The number of pages in the folio.
> > + * @folio: The folio.
> > + *
> > + * Return: A positive power of two.
> > + */
> > +static inline long folio_nr_pages(struct folio *folio)
> > +{
> > +    if (!folio_test_large(folio))
> > +        return 1;
> >       return 1L << folio->_folio_order;
> > -#endif
> >   }
> > 
> > +static inline void folio_set_nr_pages(struct folio *folio, long nr_pages)
> > +{
> > +}
> > +#endif
> > +
> >   /**
> >    * folio_next - Move to the next physical folio.
> >    * @folio: The folio we're currently operating on.
> > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > index e3500c087893..b507a98063e6 100644
> > --- a/mm/hugetlb.c
> > +++ b/mm/hugetlb.c
> > @@ -1344,7 +1344,8 @@ static void
> > __destroy_compound_gigantic_folio(struct folio *folio,
> >               set_page_refcounted(p);
> >       }
> > 
> > -    folio_set_compound_order(folio, 0);
> > +    folio->_folio_order = 0;
> > +    folio_set_nr_pages(folio, 0);
> >       __folio_clear_head(folio);
> >   }
> > 
> > 
> > Yes?
> 
> This works for me, I will take this approach along with Muchun's feedback
> about a wrapper function so as not to touch _folio_order directly and send
> out a new version.
> 
> One question I have is if I should then get rid of
> folio_set_compound_order() as hugetlb is the only compound page user I've
> converted to folios so far and its use can be replaced by the suggested
> folio_set_nr_pages() and folio_set_order().
> 
> Hugetlb also has one has one call to folio_set_compound_order() with a
> non-zero order, should I replace this with a call to folio_set_order() and
> folio_set_nr_pages() as well, or keep folio_set_compound_order() and remove
> zero order support and the comment. Please let me know which approach you
> would prefer.

None of the above!

Whatever we're calling this function *it does not belong* in mm.h.
Anything outside the MM calling it is going to be a disaster -- can you
imagine what will happen if a filesystem or device driver is handed a
folio and decides "Oh, I'll just change the size of this folio"?  It is
an attractive nuisance and should be confined to mm/internal.h *at best*.

Equally, we *must not have* separate folio_set_order() and
folio_set_nr_pages().  These are the same thing!  They must be kept
in sync!  If we are to have a folio_set_order() instead of open-coding
it, then it should also update nr_pages.

So, given that this is now an internal-to-mm, if not internal-to-hugetlb
function, I see no reason that it should not handle the case of 0.
I haven't studied what hugetlb_dissolve does, or why it can't use the
standard split_folio(), but I'm sure there's a good reason.
  
John Hubbard Dec. 8, 2022, 7:56 p.m. UTC | #8
On 12/8/22 11:33, Matthew Wilcox wrote:
> None of the above!
> 
> Whatever we're calling this function *it does not belong* in mm.h.
> Anything outside the MM calling it is going to be a disaster -- can you
> imagine what will happen if a filesystem or device driver is handed a
> folio and decides "Oh, I'll just change the size of this folio"?  It is
> an attractive nuisance and should be confined to mm/internal.h *at best*.
> 
> Equally, we *must not have* separate folio_set_order() and
> folio_set_nr_pages().  These are the same thing!  They must be kept
> in sync!  If we are to have a folio_set_order() instead of open-coding
> it, then it should also update nr_pages.
> 
> So, given that this is now an internal-to-mm, if not internal-to-hugetlb
> function, I see no reason that it should not handle the case of 0.
> I haven't studied what hugetlb_dissolve does, or why it can't use the
> standard split_folio(), but I'm sure there's a good reason.

Sure, perhaps I overreacted to the "abuse of this function" comment, and
the whole thing could be fixed up with a clean name, hiding it from
the public mm.h API, and...removing comments about abuse! haha :)


thanks,
  
Mike Kravetz Dec. 8, 2022, 8:01 p.m. UTC | #9
On 12/08/22 19:33, Matthew Wilcox wrote:
> On Thu, Dec 08, 2022 at 10:06:07AM -0800, Sidhartha Kumar wrote:
> > On 12/7/22 6:27 PM, John Hubbard wrote:
> > > On 12/7/22 17:42, Sidhartha Kumar wrote:
> > This works for me, I will take this approach along with Muchun's feedback
> > about a wrapper function so as not to touch _folio_order directly and send
> > out a new version.
> > 
> > One question I have is if I should then get rid of
> > folio_set_compound_order() as hugetlb is the only compound page user I've
> > converted to folios so far and its use can be replaced by the suggested
> > folio_set_nr_pages() and folio_set_order().
> > 
> > Hugetlb also has one has one call to folio_set_compound_order() with a
> > non-zero order, should I replace this with a call to folio_set_order() and
> > folio_set_nr_pages() as well, or keep folio_set_compound_order() and remove
> > zero order support and the comment. Please let me know which approach you
> > would prefer.
> 
> None of the above!
> 
> Whatever we're calling this function *it does not belong* in mm.h.
> Anything outside the MM calling it is going to be a disaster -- can you
> imagine what will happen if a filesystem or device driver is handed a
> folio and decides "Oh, I'll just change the size of this folio"?  It is
> an attractive nuisance and should be confined to mm/internal.h *at best*.

I suspect it was placed in mm.h as it is the 'folio version' of
set_compound_order which resides in mm.h.  But, no need to repeat that
unfortunate placement.

> 
> Equally, we *must not have* separate folio_set_order() and
> folio_set_nr_pages().  These are the same thing!  They must be kept
> in sync!  If we are to have a folio_set_order() instead of open-coding
> it, then it should also update nr_pages.

Ok.  Agree.

> So, given that this is now an internal-to-mm, if not internal-to-hugetlb
> function, I see no reason that it should not handle the case of 0.
> I haven't studied what hugetlb_dissolve does, or why it can't use the
> standard split_folio(), but I'm sure there's a good reason.

The hugetlb code is changing the compound page/folio it created from a set of
individual pages back to individual pages so they can be returned to the
low level allocator.  Somewhat like what page_alloc/page_free do.  split_folio
is overkill.  split_page would be a closer match.

It makes perfect sense to put the function in mm internal.h.

Thanks,
  
Sidhartha Kumar Dec. 8, 2022, 9:58 p.m. UTC | #10
On 12/8/22 12:01 PM, Mike Kravetz wrote:
> On 12/08/22 19:33, Matthew Wilcox wrote:
>> On Thu, Dec 08, 2022 at 10:06:07AM -0800, Sidhartha Kumar wrote:
>>> On 12/7/22 6:27 PM, John Hubbard wrote:
>>>> On 12/7/22 17:42, Sidhartha Kumar wrote:
>>> This works for me, I will take this approach along with Muchun's feedback
>>> about a wrapper function so as not to touch _folio_order directly and send
>>> out a new version.
>>>
>>> One question I have is if I should then get rid of
>>> folio_set_compound_order() as hugetlb is the only compound page user I've
>>> converted to folios so far and its use can be replaced by the suggested
>>> folio_set_nr_pages() and folio_set_order().
>>>
>>> Hugetlb also has one has one call to folio_set_compound_order() with a
>>> non-zero order, should I replace this with a call to folio_set_order() and
>>> folio_set_nr_pages() as well, or keep folio_set_compound_order() and remove
>>> zero order support and the comment. Please let me know which approach you
>>> would prefer.
>>
>> None of the above!
>>
>> Whatever we're calling this function *it does not belong* in mm.h.
>> Anything outside the MM calling it is going to be a disaster -- can you
>> imagine what will happen if a filesystem or device driver is handed a
>> folio and decides "Oh, I'll just change the size of this folio"?  It is
>> an attractive nuisance and should be confined to mm/internal.h *at best*.
> 
> I suspect it was placed in mm.h as it is the 'folio version' of
> set_compound_order which resides in mm.h.  But, no need to repeat that
> unfortunate placement.
> 
>>
>> Equally, we *must not have* separate folio_set_order() and
>> folio_set_nr_pages().  These are the same thing!  They must be kept
>> in sync!  If we are to have a folio_set_order() instead of open-coding
>> it, then it should also update nr_pages.
> 
> Ok.  Agree.
> 
>> So, given that this is now an internal-to-mm, if not internal-to-hugetlb
>> function, I see no reason that it should not handle the case of 0.
>> I haven't studied what hugetlb_dissolve does, or why it can't use the
>> standard split_folio(), but I'm sure there's a good reason.
> 
> The hugetlb code is changing the compound page/folio it created from a set of
> individual pages back to individual pages so they can be returned to the
> low level allocator.  Somewhat like what page_alloc/page_free do.  split_folio
> is overkill.  split_page would be a closer match.
> 
> It makes perfect sense to put the function in mm internal.h.
> 

Thanks John, Mike, Matthew, and Muchun for the feedback.

To summarize this discussion and outline the next version of this patch, 
the changes I'll make include:

1) change the name of folio_set_compound_order() to folio_set_order()
2) change the placement of this function from mm.h to mm/internal.h
3) folio_set_order() will set both _folio_order and _folio_nr_pages and 
handle the zero order case correctly.
4) remove the comment about hugetlb's specific use for zero orders
5) improve the style of folio_set_order() by removing ifdefs from inside 
the function to doing

#ifdef CONFIG_64BIT
  static inline void folio_set_order(struct folio *folio,
                  unsigned int order)
  {
	 VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);

	 folio->_folio_order = order;
          folio->_folio_nr_pages = order ? 1U << order : 0;
}
#else
static inline void folio_set_order(struct folio *folio,
                  unsigned int order)
  {
	 VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);

	 folio->_folio_order = order;
}
#endif

Please let me know if I missing something.
Thanks,
Sidhartha Kumar
> Thanks,
  
John Hubbard Dec. 8, 2022, 10:01 p.m. UTC | #11
On 12/8/22 13:58, Sidhartha Kumar wrote:
> Thanks John, Mike, Matthew, and Muchun for the feedback.
> 
> To summarize this discussion and outline the next version of this patch, the changes I'll make include:
> 
> 1) change the name of folio_set_compound_order() to folio_set_order()
> 2) change the placement of this function from mm.h to mm/internal.h
> 3) folio_set_order() will set both _folio_order and _folio_nr_pages and handle the zero order case correctly.
> 4) remove the comment about hugetlb's specific use for zero orders
> 5) improve the style of folio_set_order() by removing ifdefs from inside the function to doing
> 
> #ifdef CONFIG_64BIT
>   static inline void folio_set_order(struct folio *folio,
>                   unsigned int order)
>   {
>       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);

Sounds good, except for this part: why is a function named
folio_set_order() BUG-ing on a non-large folio? The naming
is still wrong, perhaps?

> 
>       folio->_folio_order = order;
>           folio->_folio_nr_pages = order ? 1U << order : 0;
> }
> #else
> static inline void folio_set_order(struct folio *folio,
>                   unsigned int order)
>   {
>       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> 
>       folio->_folio_order = order;
> }
> #endif
> 
> Please let me know if I missing something.
> Thanks,
> Sidhartha Kumar
>> Thanks,
> 

thanks,
  
Matthew Wilcox Dec. 8, 2022, 10:04 p.m. UTC | #12
On Thu, Dec 08, 2022 at 01:58:20PM -0800, Sidhartha Kumar wrote:
> 5) improve the style of folio_set_order() by removing ifdefs from inside the
> function to doing
> 
> #ifdef CONFIG_64BIT
>  static inline void folio_set_order(struct folio *folio,
>                  unsigned int order)
>  {
> 	 VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> 
> 	 folio->_folio_order = order;
>          folio->_folio_nr_pages = order ? 1U << order : 0;
> }
> #else
> static inline void folio_set_order(struct folio *folio,
>                  unsigned int order)
>  {
> 	 VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> 
> 	 folio->_folio_order = order;
> }
> #endif

While we usually prefer to put ifdefs outside the function, I don't
think that's justified in this case.  I'd rather see a comment inside
the function like:

static inline void folio_set_order(struct folio *folio,
                unsigned int order)
{
	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);

	folio->_folio_order = order;
#ifdef CONFIG_64BIT
	/*
	 * When hugetlb dissolves a folio, we need to clear the tail
	 * page, rather than setting nr_pages to 1.
	 */
	folio->_folio_nr_pages = order ? 1U << order : 0;
#endif
}
  
Sidhartha Kumar Dec. 8, 2022, 10:12 p.m. UTC | #13
On 12/8/22 2:01 PM, John Hubbard wrote:
> On 12/8/22 13:58, Sidhartha Kumar wrote:
>> Thanks John, Mike, Matthew, and Muchun for the feedback.
>>
>> To summarize this discussion and outline the next version of this 
>> patch, the changes I'll make include:
>>
>> 1) change the name of folio_set_compound_order() to folio_set_order()
>> 2) change the placement of this function from mm.h to mm/internal.h
>> 3) folio_set_order() will set both _folio_order and _folio_nr_pages 
>> and handle the zero order case correctly.
>> 4) remove the comment about hugetlb's specific use for zero orders
>> 5) improve the style of folio_set_order() by removing ifdefs from 
>> inside the function to doing
>>
>> #ifdef CONFIG_64BIT
>>   static inline void folio_set_order(struct folio *folio,
>>                   unsigned int order)
>>   {
>>       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
> 
> Sounds good, except for this part: why is a function named
> folio_set_order() BUG-ing on a non-large folio? The naming
> is still wrong, perhaps?
> 

This is because the _folio_nr_pages and _folio_order fields are part of 
the first tail page in the folio. folio_test_large returns if the folio 
is larger than one page which would be required for setting the fields.

>>
>>       folio->_folio_order = order;
>>           folio->_folio_nr_pages = order ? 1U << order : 0;
>> }
>> #else
>> static inline void folio_set_order(struct folio *folio,
>>                   unsigned int order)
>>   {
>>       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>
>>       folio->_folio_order = order;
>> }
>> #endif
>>
>> Please let me know if I missing something.
>> Thanks,
>> Sidhartha Kumar
>>> Thanks,
>>
> 
> thanks,
  
John Hubbard Dec. 8, 2022, 10:14 p.m. UTC | #14
On 12/8/22 14:12, Sidhartha Kumar wrote:
> On 12/8/22 2:01 PM, John Hubbard wrote:
>> On 12/8/22 13:58, Sidhartha Kumar wrote:
>>> Thanks John, Mike, Matthew, and Muchun for the feedback.
>>>
>>> To summarize this discussion and outline the next version of this patch, the changes I'll make include:
>>>
>>> 1) change the name of folio_set_compound_order() to folio_set_order()
>>> 2) change the placement of this function from mm.h to mm/internal.h
>>> 3) folio_set_order() will set both _folio_order and _folio_nr_pages and handle the zero order case correctly.
>>> 4) remove the comment about hugetlb's specific use for zero orders
>>> 5) improve the style of folio_set_order() by removing ifdefs from inside the function to doing
>>>
>>> #ifdef CONFIG_64BIT
>>>   static inline void folio_set_order(struct folio *folio,
>>>                   unsigned int order)
>>>   {
>>>       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>
>> Sounds good, except for this part: why is a function named
>> folio_set_order() BUG-ing on a non-large folio? The naming
>> is still wrong, perhaps?
>>
> 
> This is because the _folio_nr_pages and _folio_order fields are part of the first tail page in the folio. folio_test_large returns if the folio is larger than one page which would be required for setting the fields.

OK, but then as I said, the name is wrong. One can either:

a) handle the non-large case, or

b) rename the function to indicate that it only works on large folios.


thanks,
  
Sidhartha Kumar Dec. 8, 2022, 10:33 p.m. UTC | #15
On 12/8/22 2:14 PM, John Hubbard wrote:
> On 12/8/22 14:12, Sidhartha Kumar wrote:
>> On 12/8/22 2:01 PM, John Hubbard wrote:
>>> On 12/8/22 13:58, Sidhartha Kumar wrote:
>>>> Thanks John, Mike, Matthew, and Muchun for the feedback.
>>>>
>>>> To summarize this discussion and outline the next version of this 
>>>> patch, the changes I'll make include:
>>>>
>>>> 1) change the name of folio_set_compound_order() to folio_set_order()
>>>> 2) change the placement of this function from mm.h to mm/internal.h
>>>> 3) folio_set_order() will set both _folio_order and _folio_nr_pages 
>>>> and handle the zero order case correctly.
>>>> 4) remove the comment about hugetlb's specific use for zero orders
>>>> 5) improve the style of folio_set_order() by removing ifdefs from 
>>>> inside the function to doing
>>>>
>>>> #ifdef CONFIG_64BIT
>>>>   static inline void folio_set_order(struct folio *folio,
>>>>                   unsigned int order)
>>>>   {
>>>>       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>>
>>> Sounds good, except for this part: why is a function named
>>> folio_set_order() BUG-ing on a non-large folio? The naming
>>> is still wrong, perhaps?
>>>
>>
>> This is because the _folio_nr_pages and _folio_order fields are part 
>> of the first tail page in the folio. folio_test_large returns if the 
>> folio is larger than one page which would be required for setting the 
>> fields.
> 
> OK, but then as I said, the name is wrong. One can either:
> 
> a) handle the non-large case, or
> 
> b) rename the function to indicate that it only works on large folios.
> 

Discussed here[1], the BUG_ON line seemed more appropriate over
  	
if (!folio_test_large(folio))
	return;

as the misuse would not be silent. I think I would be against renaming 
the function as I don't see any large folio specific function names for 
other accessors of tail page fields. Would both the BUG_ON and return on 
non-large folio be included then?


[1]: 
https://lore.kernel.org/linux-mm/20221129225039.82257-1-sidhartha.kumar@oracle.com/T/#m98cf80bb21ae533b7385f2e363c602e2c9e2802d
> 
> thanks,
  
John Hubbard Dec. 8, 2022, 10:39 p.m. UTC | #16
On 12/8/22 14:33, Sidhartha Kumar wrote:
> On 12/8/22 2:14 PM, John Hubbard wrote:
>> On 12/8/22 14:12, Sidhartha Kumar wrote:
>>> On 12/8/22 2:01 PM, John Hubbard wrote:
>>>> On 12/8/22 13:58, Sidhartha Kumar wrote:
>>>>> Thanks John, Mike, Matthew, and Muchun for the feedback.
>>>>>
>>>>> To summarize this discussion and outline the next version of this patch, the changes I'll make include:
>>>>>
>>>>> 1) change the name of folio_set_compound_order() to folio_set_order()
>>>>> 2) change the placement of this function from mm.h to mm/internal.h
>>>>> 3) folio_set_order() will set both _folio_order and _folio_nr_pages and handle the zero order case correctly.
>>>>> 4) remove the comment about hugetlb's specific use for zero orders
>>>>> 5) improve the style of folio_set_order() by removing ifdefs from inside the function to doing
>>>>>
>>>>> #ifdef CONFIG_64BIT
>>>>>   static inline void folio_set_order(struct folio *folio,
>>>>>                   unsigned int order)
>>>>>   {
>>>>>       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>>>
>>>> Sounds good, except for this part: why is a function named
>>>> folio_set_order() BUG-ing on a non-large folio? The naming
>>>> is still wrong, perhaps?
>>>>
>>>
>>> This is because the _folio_nr_pages and _folio_order fields are part of the first tail page in the folio. folio_test_large returns if the folio is larger than one page which would be required for setting the fields.
>>
>> OK, but then as I said, the name is wrong. One can either:
>>
>> a) handle the non-large case, or
>>
>> b) rename the function to indicate that it only works on large folios.
>>
> 
> Discussed here[1], the BUG_ON line seemed more appropriate over
> 
> if (!folio_test_large(folio))
>      return;
> 
> as the misuse would not be silent. I think I would be against renaming the function as I don't see any large folio specific function names for other accessors of tail page fields. Would both the BUG_ON and return on non-large folio be included then?

Actually, if you want the "misuse to not be silent", today's guidelines
would point more toward WARN and return, instead of BUG, btw.

I don't think that a survey of existing names is really the final word on what
to name this. Names should be accurate, even if other names are less so. How
about something like:

     large_folio_set_order()

?

> 
> 
> [1]: https://lore.kernel.org/linux-mm/20221129225039.82257-1-sidhartha.kumar@oracle.com/T/#m98cf80bb21ae533b7385f2e363c602e2c9e2802d
>>
>> thanks,
> 
> 

thanks,
  
Muchun Song Dec. 9, 2022, 2:27 p.m. UTC | #17
> On Dec 9, 2022, at 06:39, John Hubbard <jhubbard@nvidia.com> wrote:
> 
> On 12/8/22 14:33, Sidhartha Kumar wrote:
>> On 12/8/22 2:14 PM, John Hubbard wrote:
>>> On 12/8/22 14:12, Sidhartha Kumar wrote:
>>>> On 12/8/22 2:01 PM, John Hubbard wrote:
>>>>> On 12/8/22 13:58, Sidhartha Kumar wrote:
>>>>>> Thanks John, Mike, Matthew, and Muchun for the feedback.
>>>>>> 
>>>>>> To summarize this discussion and outline the next version of this patch, the changes I'll make include:
>>>>>> 
>>>>>> 1) change the name of folio_set_compound_order() to folio_set_order()
>>>>>> 2) change the placement of this function from mm.h to mm/internal.h
>>>>>> 3) folio_set_order() will set both _folio_order and _folio_nr_pages and handle the zero order case correctly.
>>>>>> 4) remove the comment about hugetlb's specific use for zero orders
>>>>>> 5) improve the style of folio_set_order() by removing ifdefs from inside the function to doing
>>>>>> 
>>>>>> #ifdef CONFIG_64BIT
>>>>>>   static inline void folio_set_order(struct folio *folio,
>>>>>>                   unsigned int order)
>>>>>>   {
>>>>>>       VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
>>>>> 
>>>>> Sounds good, except for this part: why is a function named
>>>>> folio_set_order() BUG-ing on a non-large folio? The naming
>>>>> is still wrong, perhaps?
>>>>> 
>>>> 
>>>> This is because the _folio_nr_pages and _folio_order fields are part of the first tail page in the folio. folio_test_large returns if the folio is larger than one page which would be required for setting the fields.
>>> 
>>> OK, but then as I said, the name is wrong. One can either:
>>> 
>>> a) handle the non-large case, or
>>> 
>>> b) rename the function to indicate that it only works on large folios.
>>> 
>> Discussed here[1], the BUG_ON line seemed more appropriate over
>> if (!folio_test_large(folio))
>>     return;
>> as the misuse would not be silent. I think I would be against renaming the function as I don't see any large folio specific function names for other accessors of tail page fields. Would both the BUG_ON and return on non-large folio be included then?
> 
> Actually, if you want the "misuse to not be silent", today's guidelines
> would point more toward WARN and return, instead of BUG, btw.

From you advise, I think we can remove VM_BUG_ON and handle non-zero
order page, something like:

static inline void folio_set_order(struct folio *folio,
		                   unsigned int order)
{
	if (!folio_test_large(folio)) {
		WARN_ON(order);
		return;
	}

	folio->_folio_order = order;
#ifdef CONFIG_64BIT
	folio->_folio_nr_pages = order ? 1U << order : 0;
#endif
}

In this case,

  1) we can handle both non-zero and zero (folio_order() works as well
     for this case) order page.
  2) it can prevent OOB for non-large folio and warn unexpected users.
  3) Do not BUG.
  4) No need to rename folio_set_order.

What do you think?

Thanks.

> 
> I don't think that a survey of existing names is really the final word on what
> to name this. Names should be accurate, even if other names are less so. How
> about something like:
> 
>    large_folio_set_order()
> 
> ?
> 
>> [1]: https://lore.kernel.org/linux-mm/20221129225039.82257-1-sidhartha.kumar@oracle.com/T/#m98cf80bb21ae533b7385f2e363c602e2c9e2802d
>>> 
>>> thanks,
> 
> thanks,
> -- 
> John Hubbard
> NVIDIA
  
John Hubbard Dec. 9, 2022, 9:10 p.m. UTC | #18
On 12/9/22 06:27, Muchun Song wrote:
>  From you advise, I think we can remove VM_BUG_ON and handle non-zero
> order page, something like:

Yes, and thanks for summarizing all the individual feedback into a
proposed solution.

If we go this route, then I'd suggest a little note above the function,
such as:

/*
  * For non-large folios, this will have no effect, other than possibly
  * generating a warning, if the caller attempts to set a non-zero folio order
  * for a non-large folio.
  */

> static inline void folio_set_order(struct folio *folio,
> 		                   unsigned int order)
> {
> 	if (!folio_test_large(folio)) {
> 		WARN_ON(order);

Better make that a WARN_ON_ONCE(), to avoid taking the machine down
with excessive warnings in the log.

> 		return;
> 	}
> 
> 	folio->_folio_order = order;
> #ifdef CONFIG_64BIT
> 	folio->_folio_nr_pages = order ? 1U << order : 0;
> #endif
> }
> 
> In this case,
> 
>    1) we can handle both non-zero and zero (folio_order() works as well
>       for this case) order page.
>    2) it can prevent OOB for non-large folio and warn unexpected users.
>    3) Do not BUG.
>    4) No need to rename folio_set_order.
> 
> What do you think?

If the new behavior is OK with everyone, it seems good to me.

thanks,
  
John Hubbard Dec. 9, 2022, 9:20 p.m. UTC | #19
On 12/9/22 13:10, John Hubbard wrote:
> On 12/9/22 06:27, Muchun Song wrote:
>>  From you advise, I think we can remove VM_BUG_ON and handle non-zero
>> order page, something like:
> 
> Yes, and thanks for summarizing all the individual feedback into a
> proposed solution.
> 
> If we go this route, then I'd suggest a little note above the function,
> such as:
> 
> /*
>   * For non-large folios, this will have no effect, other than possibly
>   * generating a warning, if the caller attempts to set a non-zero folio order
>   * for a non-large folio.
>   */
> 
>> static inline void folio_set_order(struct folio *folio,
>>                            unsigned int order)
>> {
>>     if (!folio_test_large(folio)) {
>>         WARN_ON(order);

Although, on second thought...I'm still a little confused about why
keeping the same name is so important?

A very direct approach that has more accurate naming (and therefore no
need for a strange comment explaining the behavior) would be:


static inline void large_folio_set_order(struct folio *folio,
					 unsigned int order)
{
	if (WARN_ON_ONCE(!folio_test_large(folio)))
		return;

	folio->_folio_order = order;
#ifdef CONFIG_64BIT
	folio->_folio_nr_pages = order ? 1U << order : 0;
#endif
}


thanks,
  
Muchun Song Dec. 14, 2022, 3 a.m. UTC | #20
> On Dec 10, 2022, at 05:20, John Hubbard <jhubbard@nvidia.com> wrote:
> 
> On 12/9/22 13:10, John Hubbard wrote:
>> On 12/9/22 06:27, Muchun Song wrote:
>>>  From you advise, I think we can remove VM_BUG_ON and handle non-zero
>>> order page, something like:
>> Yes, and thanks for summarizing all the individual feedback into a
>> proposed solution.
>> If we go this route, then I'd suggest a little note above the function,
>> such as:
>> /*
>>  * For non-large folios, this will have no effect, other than possibly
>>  * generating a warning, if the caller attempts to set a non-zero folio order
>>  * for a non-large folio.
>>  */
>>> static inline void folio_set_order(struct folio *folio,
>>>                            unsigned int order)
>>> {
>>>     if (!folio_test_large(folio)) {
>>>         WARN_ON(order);
> 
> Although, on second thought...I'm still a little confused about why
> keeping the same name is so important?

Just my personal preference. I like its simplicity. I'm not against
large_folio_set_order, but suggest folio_set_order.

Thanks.

> 
> A very direct approach that has more accurate naming (and therefore no
> need for a strange comment explaining the behavior) would be:
> 
> 
> static inline void large_folio_set_order(struct folio *folio,
> unsigned int order)
> {
> 	if (WARN_ON_ONCE(!folio_test_large(folio)))
> 		return;
> 
> 	folio->_folio_order = order;
> #ifdef CONFIG_64BIT
> 	folio->_folio_nr_pages = order ? 1U << order : 0;
> #endif
> }
> 
> 
> thanks,
> -- 
> John Hubbard
> NVIDIA
  

Patch

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 443d496949a8..cd8508d728f1 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -999,9 +999,16 @@  static inline void set_compound_order(struct page *page, unsigned int order)
 #endif
 }
 
+/*
+ * folio_set_compound_order is generally passed a non-zero order to
+ * initialize a large folio.  However, hugetlb code abuses this by
+ * passing in zero when 'dissolving' a large folio.
+ */
 static inline void folio_set_compound_order(struct folio *folio,
 		unsigned int order)
 {
+	VM_BUG_ON_FOLIO(!folio_test_large(folio), folio);
+
 	folio->_folio_order = order;
 #ifdef CONFIG_64BIT
 	folio->_folio_nr_pages = order ? 1U << order : 0;