[mm-unstable,v2,8/8] mm/hugetlb: convert demote_free_huge_page to folios

Message ID 20230110212821.984047-9-sidhartha.kumar@oracle.com
State New
Series continue hugetlb folio conversion

Commit Message

Sidhartha Kumar Jan. 10, 2023, 9:28 p.m. UTC
  Change demote_free_huge_page() to demote_free_hugetlb_folio() and have
demote_pool_huge_page() pass in a folio.

Signed-off-by: Sidhartha Kumar <sidhartha.kumar@oracle.com>
---
 mm/hugetlb.c | 31 ++++++++++++++++---------------
 1 file changed, 16 insertions(+), 15 deletions(-)
  

Comments

Matthew Wilcox Jan. 10, 2023, 9:40 p.m. UTC | #1
On Tue, Jan 10, 2023 at 03:28:21PM -0600, Sidhartha Kumar wrote:
> @@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>  {
>  	int nr_nodes, node;
>  	struct page *page;
> +	struct folio *folio;
>  
>  	lockdep_assert_held(&hugetlb_lock);
>  
> @@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>  		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
>  			if (PageHWPoison(page))
>  				continue;
> -
> -			return demote_free_huge_page(h, page);
> +			folio = page_folio(page);
> +			return demote_free_hugetlb_folio(h, folio);
>  		}
>  	}

Can't this be
		list_for_each_entry(folio, &h->hugepage_freelists[node], lru)

which avoids the call to page_folio() here.

I think the call to PageHWPoison is actually wrong here.  That would
only check the hwpoison bit on the first page, whereas we want to know
about the hwpoison bit on any page (don't we?)  So this should be
folio_test_has_hwpoisoned()?

Or is that a THP-thing that is different for hugetlb pages?
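
(For reference, an illustrative sketch of the two checks being contrasted
here; folio_unusable_for_demote() is a hypothetical helper, not existing
kernel code.)

/*
 * folio_test_hwpoison() tests PG_hwpoison on the head page only, while
 * folio_test_has_hwpoisoned() tests the "some subpage is poisoned" flag
 * that THP keeps on the second page.  Which of the two is appropriate
 * for hugetlb is the question raised above.
 */
static bool folio_unusable_for_demote(struct folio *folio)
{
	if (folio_test_hwpoison(folio))		/* head page itself poisoned */
		return true;
	if (folio_test_has_hwpoisoned(folio))	/* some subpage poisoned (THP) */
		return true;
	return false;
}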
  
Mike Kravetz Jan. 11, 2023, midnight UTC | #2
On 01/10/23 21:40, Matthew Wilcox wrote:
> On Tue, Jan 10, 2023 at 03:28:21PM -0600, Sidhartha Kumar wrote:
> > @@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> >  {
> >  	int nr_nodes, node;
> >  	struct page *page;
> > +	struct folio *folio;
> >  
> >  	lockdep_assert_held(&hugetlb_lock);
> >  
> > @@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> >  		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
> >  			if (PageHWPoison(page))
> >  				continue;
> > -
> > -			return demote_free_huge_page(h, page);
> > +			folio = page_folio(page);
> > +			return demote_free_hugetlb_folio(h, folio);
> >  		}
> >  	}
> 
> Can't this be
> 		list_for_each_entry(folio, &h->hugepage_freelists[node], lru)
> 
> which avoids the call to page_folio() here.
> 
> I think the call to PageHWPoison is actually wrong here.  That would
> only check the hwpoison bit on the first page, whereas we want to know
> about the hwpoison bit on any page (don't we?)  So this should be
> folio_test_has_hwpoisoned()?
> 
> Or is that a THP-thing that is different for hugetlb pages?

I believe it is different for hugetlb pages.  See hugetlb_set_page_hwpoison()
where it sets PageHWPoison on head page as well as allocating a raw_hwp_page
to track the actual page with poison.  Note that we can not directly flag
hugetlb 'subpages' because we may not have the struct pages due to vmemmap
optimization.  Adding Naoya just to be sure.

Do agree that this could be list_for_each_entry(folio ...
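
(For reference, a simplified sketch of the scheme described above, not the
actual mm/memory-failure.c code; record_hugetlb_hwpoison() and the static
list head are illustrative stand-ins.)

/*
 * Only the head page's PG_hwpoison flag is set; the raw page that
 * actually took the error is remembered out of line, because subpage
 * struct pages may not exist when the vmemmap optimization is active.
 */
struct raw_hwp_page {
	struct llist_node node;
	struct page *page;	/* the raw page with the error */
};

static LLIST_HEAD(hwp_raw_pages);	/* stand-in; the real list hangs off the folio */

static int record_hugetlb_hwpoison(struct folio *folio, struct page *page)
{
	/* -EHWPOISON if the head page was already marked */
	int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;
	struct raw_hwp_page *raw_hwp = kmalloc(sizeof(*raw_hwp), GFP_ATOMIC);

	if (raw_hwp) {
		raw_hwp->page = page;
		llist_add(&raw_hwp->node, &hwp_raw_pages);
	}
	return ret;
}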
  
Sidhartha Kumar Jan. 13, 2023, 3:16 p.m. UTC | #3
On 1/10/23 6:00 PM, Mike Kravetz wrote:
> On 01/10/23 21:40, Matthew Wilcox wrote:
>> On Tue, Jan 10, 2023 at 03:28:21PM -0600, Sidhartha Kumar wrote:
>>> @@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>>>   {
>>>   	int nr_nodes, node;
>>>   	struct page *page;
>>> +	struct folio *folio;
>>>   
>>>   	lockdep_assert_held(&hugetlb_lock);
>>>   
>>> @@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
>>>   		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
>>>   			if (PageHWPoison(page))
>>>   				continue;
>>> -
>>> -			return demote_free_huge_page(h, page);
>>> +			folio = page_folio(page);
>>> +			return demote_free_hugetlb_folio(h, folio);
>>>   		}
>>>   	}
>>
>> Can't this be
>> 		list_for_each_entry(folio, &h->hugepage_freelists[node], lru)
>>
>> which avoids the call to page_folio() here.
>>
>> I think the call to PageHWPoison is actually wrong here.  That would
>> only check the hwpoison bit on the first page, whereas we want to know
>> about the hwpoison bit on any page (don't we?)  So this should be
>> folio_test_has_hwpoisoned()?
>>
>> Or is that a THP-thing that is different for hugetlb pages?
> 
> I believe it is different for hugetlb pages.  See hugetlb_set_page_hwpoison()
> where it sets PageHWPoison on head page as well as allocating a raw_hwp_page

I agree, this line in hugetlb_set_page_hwpoison (which is now 
folio_set_hugetlb_hwpoison) sets the HWPoison flag on the head page

int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;

so the correct code in demote_pool_huge_page() would be:

	list_for_each_entry(folio, &h->hugepage_freelists[node], lru) {
		if (folio_test_hwpoison(folio))
			continue;
		return demote_free_hugetlb_folio(h, folio);
	}


> to track the actual page with poison.  Note that we can not directly flag
> hugetlb 'subpages' because we may not have the struct pages due to vmemmap
> optimization.  Adding Naoya just to be sure.
> 
> Do agree that this could be list_for_each_entry(folio ...

Will make this change in v3.

Thanks,
Sidhartha Kumar
  
Naoya Horiguchi Jan. 18, 2023, 1:56 a.m. UTC | #4
On Fri, Jan 13, 2023 at 09:16:23AM -0600, Sidhartha Kumar wrote:
> On 1/10/23 6:00 PM, Mike Kravetz wrote:
> > On 01/10/23 21:40, Matthew Wilcox wrote:
> > > On Tue, Jan 10, 2023 at 03:28:21PM -0600, Sidhartha Kumar wrote:
> > > > @@ -3505,6 +3505,7 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> > > >   {
> > > >   	int nr_nodes, node;
> > > >   	struct page *page;
> > > > +	struct folio *folio;
> > > >   	lockdep_assert_held(&hugetlb_lock);
> > > > @@ -3518,8 +3519,8 @@ static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
> > > >   		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
> > > >   			if (PageHWPoison(page))
> > > >   				continue;
> > > > -
> > > > -			return demote_free_huge_page(h, page);
> > > > +			folio = page_folio(page);
> > > > +			return demote_free_hugetlb_folio(h, folio);
> > > >   		}
> > > >   	}
> > > 
> > > Can't this be
> > > 		list_for_each_entry(folio, &h->hugepage_freelists[node], lru)
> > > 
> > > which avoids the call to page_folio() here.
> > > 
> > > I think the call to PageHWPoison is actually wrong here.  That would
> > > only check the hwpoison bit on the first page, whereas we want to know
> > > about the hwpoison bit on any page (don't we?)  So this should be
> > > folio_test_has_hwpoisoned()?
> > > 
> > > Or is that a THP-thing that is different for hugetlb pages?
> > 
> > I believe it is different for hugetlb pages.  See hugetlb_set_page_hwpoison()
> > where it sets PageHWPoison on head page as well as allocating a raw_hwp_page
> 
> I agree, this line in hugetlb_set_page_hwpoison (which is now
> folio_set_hugetlb_hwpoison) sets the HWPoison flag on the head page
> 
> int ret = folio_test_set_hwpoison(folio) ? -EHWPOISON : 0;
> 
> so the correct code in demote_pool_huge_page() would be:
> 
> 	list_for_each_entry(folio, &h->hugepage_freelists[node], lru) {
> 		if (folio_test_hwpoison(folio))
> 			continue;
> 		return demote_free_hugetlb_folio(h, folio);
> 	}
> 
> 
> > to track the actual page with poison.  Note that we can not directly flag
> > hugetlb 'subpages' because we may not have the struct pages due to vmemmap
> > optimization.  Adding Naoya just to be sure.

Yes, the above code and the reason totally make sense to me.
Thanks for the patch and the update.

- Naoya Horiguchi
  

Patch

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f8cd0c694fe9..21ffab744c8f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3435,12 +3435,12 @@  static int set_max_huge_pages(struct hstate *h, unsigned long count, int nid,
 	return 0;
 }
 
-static int demote_free_huge_page(struct hstate *h, struct page *page)
+static int demote_free_hugetlb_folio(struct hstate *h, struct folio *folio)
 {
-	int i, nid = page_to_nid(page);
+	int i, nid = folio_nid(folio);
 	struct hstate *target_hstate;
-	struct folio *folio = page_folio(page);
 	struct page *subpage;
+	struct folio *inner_folio;
 	int rc = 0;
 
 	target_hstate = size_to_hstate(PAGE_SIZE << h->demote_order);
@@ -3448,18 +3448,18 @@  static int demote_free_huge_page(struct hstate *h, struct page *page)
 	remove_hugetlb_folio_for_demote(h, folio, false);
 	spin_unlock_irq(&hugetlb_lock);
 
-	rc = hugetlb_vmemmap_restore(h, page);
+	rc = hugetlb_vmemmap_restore(h, &folio->page);
 	if (rc) {
-		/* Allocation of vmemmmap failed, we can not demote page */
+		/* Allocation of vmemmmap failed, we can not demote folio */
 		spin_lock_irq(&hugetlb_lock);
-		set_page_refcounted(page);
-		add_hugetlb_folio(h, page_folio(page), false);
+		folio_ref_unfreeze(folio, 1);
+		add_hugetlb_folio(h, folio, false);
 		return rc;
 	}
 
 	/*
 	 * Use destroy_compound_hugetlb_folio_for_demote for all huge page
-	 * sizes as it will not ref count pages.
+	 * sizes as it will not ref count folios.
 	 */
 	destroy_compound_hugetlb_folio_for_demote(folio, huge_page_order(h));
 
@@ -3474,15 +3474,15 @@  static int demote_free_huge_page(struct hstate *h, struct page *page)
 	mutex_lock(&target_hstate->resize_lock);
 	for (i = 0; i < pages_per_huge_page(h);
 				i += pages_per_huge_page(target_hstate)) {
-		subpage = nth_page(page, i);
-		folio = page_folio(subpage);
+		subpage = folio_page(folio, i);
+		inner_folio = page_folio(subpage);
 		if (hstate_is_gigantic(target_hstate))
-			prep_compound_gigantic_folio_for_demote(folio,
+			prep_compound_gigantic_folio_for_demote(inner_folio,
 							target_hstate->order);
 		else
 			prep_compound_page(subpage, target_hstate->order);
-		set_page_private(subpage, 0);
-		prep_new_hugetlb_folio(target_hstate, folio, nid);
+		folio_change_private(inner_folio, NULL);
+		prep_new_hugetlb_folio(target_hstate, inner_folio, nid);
 		free_huge_page(subpage);
 	}
 	mutex_unlock(&target_hstate->resize_lock);
@@ -3505,6 +3505,7 @@  static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
 {
 	int nr_nodes, node;
 	struct page *page;
+	struct folio *folio;
 
 	lockdep_assert_held(&hugetlb_lock);
 
@@ -3518,8 +3519,8 @@  static int demote_pool_huge_page(struct hstate *h, nodemask_t *nodes_allowed)
 		list_for_each_entry(page, &h->hugepage_freelists[node], lru) {
 			if (PageHWPoison(page))
 				continue;
-
-			return demote_free_huge_page(h, page);
+			folio = page_folio(page);
+			return demote_free_hugetlb_folio(h, folio);
 		}
 	}