[v2,1/4] mm/hwpoison: delete all entries before traversal in __folio_free_raw_hwp

Message ID 20230623164015.3431990-2-jiaqiyan@google.com
State New
Series Improve hugetlbfs read on HWPOISON hugepages

Commit Message

Jiaqi Yan June 23, 2023, 4:40 p.m. UTC
Traversal on an llist (e.g. llist_for_each_safe) is only safe AFTER the
entries have been deleted from the llist.

llist_del_all is lock-free with respect to itself, so the
folio_clear_hugetlb_hwpoison() calls from __update_and_free_hugetlb_folio
and memory_failure won't need explicit locking when freeing the raw_hwp_list.

Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
---
 mm/memory-failure.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)
  

Comments

Naoya Horiguchi June 30, 2023, 2:52 p.m. UTC | #1
On Fri, Jun 23, 2023 at 04:40:12PM +0000, Jiaqi Yan wrote:
> Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> are deleted from the llist.
> 
> llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> from __update_and_free_hugetlb_folio and memory_failure won't need
> explicit locking when freeing the raw_hwp_list.
> 
> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>

(Sorry if this is a stupid question...) folio_set_hugetlb_hwpoison() also
calls llist_for_each_safe(), but it traverses the list without first calling
llist_del_all().  Does this convention apply only when removing item(s)?

Thanks,
Naoya Horiguchi

  
Jiaqi Yan June 30, 2023, 8:59 p.m. UTC | #2
On Fri, Jun 30, 2023 at 7:52 AM Naoya Horiguchi
<naoya.horiguchi@linux.dev> wrote:
>
> On Fri, Jun 23, 2023 at 04:40:12PM +0000, Jiaqi Yan wrote:
> > Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> > are deleted from the llist.
> >
> > llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> > from __update_and_free_hugetlb_folio and memory_failure won't need
> > explicit locking when freeing the raw_hwp_list.
> >
> > Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
>
> (Sorry if stupid question...) folio_set_hugetlb_hwpoison() also calls
> llist_for_each_safe() but it still traverses the list without calling
> llist_del_all().  This convention applies only when removing item(s)?

I think in our previous discussion, Mike and I agreed that, as of today's
code in hugetlb.c and memory-failure.c, concurrent adding, deleting, and
traversing are fine with each other and with themselves [1], but new
code needs to be careful wrt ops on raw_hwp_list.

This patch is low-hanging fruit: it ensures that any caller of
__folio_free_raw_hwp won't introduce a problem, by correcting one
thing in __folio_free_raw_hwp. Since it wants to delete the raw_hwp_page
entries in the list, it should first call llist_del_all and then
kfree the entries with llist_for_each_safe.
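
To illustrate the ordering described above, here is a minimal userspace
sketch, not kernel code. It assumes C11 atomics as a stand-in for the
kernel's llist; struct node, struct list_head, struct entry, list_add,
list_del_all and free_all are hypothetical names chosen for this example
and only mimic the shape of llist_add, llist_del_all and
__folio_free_raw_hwp.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins for the kernel's llist_node / llist_head (assumption). */
struct node {
	struct node *next;
};

struct list_head {
	_Atomic(struct node *) first;
};

/* Hypothetical payload, playing the role of struct raw_hwp_page. */
struct entry {
	struct node node;	/* must stay the first member, see free_all() */
	int value;
};

/* Push one node onto the front; retry the compare-and-swap if a race is lost. */
static void list_add(struct list_head *head, struct node *n)
{
	struct node *old = atomic_load(&head->first);

	assert(n != NULL);
	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&head->first, &old, n));
}

/* Atomically detach the whole chain; the caller becomes its only owner. */
static struct node *list_del_all(struct list_head *head)
{
	return atomic_exchange(&head->first, (struct node *)NULL);
}

/* Detach first, then walk and free the private chain. Returns the count freed. */
static unsigned long free_all(struct list_head *head)
{
	struct node *pos = list_del_all(head);
	unsigned long count = 0;

	while (pos) {
		/* Save the successor before freeing, as llist_for_each_safe does. */
		struct node *next = pos->next;

		/* Valid cast because node is the first member of struct entry. */
		free((struct entry *)pos);
		count++;
		pos = next;
	}
	return count;
}
```

Because the exchange hands the whole chain to exactly one caller, two
concurrent free_all() calls in this sketch cannot both walk and free the
same nodes; the loser of the race simply gets an empty chain.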

As for folio_set_hugetlb_hwpoison, I am not very comfortable fixing
it. One way I can imagine fixing it is llist_del_all() =>
llist_for_each_safe{...} => llist_add_batch(), or llist_add() within
llist_for_each_safe{...}. I haven't really thought through whether this
is a correct fix.
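
For reference, the detach => traverse => re-add sequence could look
roughly like the userspace sketch below. This is not a proposed kernel
change and, as said above, it is not verified to be a correct fix. It
assumes C11 atomics in place of the kernel's llist; list_add_batch only
mimics the shape of llist_add_batch, and list_contains_detached and
struct entry are hypothetical names for this example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's llist_node / llist_head (assumption). */
struct node {
	struct node *next;
};

struct list_head {
	_Atomic(struct node *) first;
};

/* Hypothetical payload; node must stay the first member for the cast below. */
struct entry {
	struct node node;
	int value;
};

/* Splice an already-linked chain first..last onto the front of the list. */
static void list_add_batch(struct list_head *head, struct node *first,
			   struct node *last)
{
	struct node *old = atomic_load(&head->first);

	assert(first != NULL && last != NULL);
	do {
		last->next = old;
	} while (!atomic_compare_exchange_weak(&head->first, &old, first));
}

/*
 * Detach the whole chain, search it while it is private to this caller,
 * then put it back in one batch. Returns true if value was found.
 */
static bool list_contains_detached(struct list_head *head, int value)
{
	struct node *first = atomic_exchange(&head->first, (struct node *)NULL);
	struct node *last = NULL;
	bool found = false;

	for (struct node *pos = first; pos; pos = pos->next) {
		if (((struct entry *)pos)->value == value)
			found = true;
		last = pos;	/* remember the tail for the batch re-add */
	}
	if (first)
		list_add_batch(head, first, last);
	return found;
}
```

One thing the sketch makes visible: between the exchange and the batch
re-add, any other reader of the list sees it as empty, which is part of
why this needs more thought before being applied to
folio_set_hugetlb_hwpoison.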

[1] https://lore.kernel.org/lkml/CACw3F51o1ZFSYZa+XLnk4Wwjy2w_q=Kn+aOQs0=qpfG-ZYDFKg@mail.gmail.com/#t


  
Naoya Horiguchi July 2, 2023, 11:50 p.m. UTC | #3
On Fri, Jun 30, 2023 at 01:59:23PM -0700, Jiaqi Yan wrote:
> On Fri, Jun 30, 2023 at 7:52 AM Naoya Horiguchi
> <naoya.horiguchi@linux.dev> wrote:
> >
> > On Fri, Jun 23, 2023 at 04:40:12PM +0000, Jiaqi Yan wrote:
> > > Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> > > are deleted from the llist.
> > >
> > > llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> > > from __update_and_free_hugetlb_folio and memory_failure won't need
> > > explicit locking when freeing the raw_hwp_list.
> > >
> > > Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> >
> > (Sorry if stupid question...) folio_set_hugetlb_hwpoison() also calls
> > llist_for_each_safe() but it still traverses the list without calling
> > llist_del_all().  This convention applies only when removing item(s)?
> 
> I think in our previous discussion, Mike and I agree as of today's
> code in hugetlb.c and memory-failure.c, concurrent adding, deleting,
> traversing are fine with each other and with themselves [1], but new
> code need to be careful wrt ops on raw_hwp_list.
> 
> This patch is a low-hanging fruit to ensure any caller of
> __folio_free_raw_hwp won't introduce any problem by correcting one
> thing in __folio_free_raw_hwp: since it wants to delete raw_hwp_page
> entries in the list, it should do it by first llist_del_all, and then
> kfree with a llist_for_each_safe.

Thanks for the explanation. This is worth adding to the patch description
so that future developers can understand the background.

> 
> As for folio_set_hugetlb_hwpoison, I am not very comfortable fixing
> it. I imagine a way to fix it is llist_del_all() =>
> llist_for_each_safe{...} => llist_add_batch(), or llist_add() within
> llist_for_each_safe{...}. I haven't really thought through if this is
> a correct fix.

I see. Changing folio_set_hugetlb_hwpoison() like this is a little too complex
considering that this fix is only a precaution.
So making no change there for now is fine with me.

Anyway this patch looks fine to me.

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

  
Mike Kravetz July 5, 2023, 11:35 p.m. UTC | #4
On 06/23/23 16:40, Jiaqi Yan wrote:
> Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> are deleted from the llist.
> 
> llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> from __update_and_free_hugetlb_folio and memory_failure won't need
> explicit locking when freeing the raw_hwp_list.
> 
> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> ---
>  mm/memory-failure.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)

After updating the reason for the patch in the commit message, as suggested by Naoya,

Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
  
Jiaqi Yan July 6, 2023, 6:11 p.m. UTC | #5
On Wed, Jul 5, 2023 at 4:36 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 06/23/23 16:40, Jiaqi Yan wrote:
> > Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> > are deleted from the llist.
> >
> > llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> > from __update_and_free_hugetlb_folio and memory_failure won't need
> > explicit locking when freeing the raw_hwp_list.
> >
> > Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> > ---
> >  mm/memory-failure.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
>
> After updating the reason for patch in commit message as suggested by Naoya,

Thank you both, Mike and Naoya! I will add the explanation in the next version.

  

Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 004a02f44271..c415c3c462a3 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1825,12 +1825,11 @@  static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
 
 static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 {
-	struct llist_head *head;
-	struct llist_node *t, *tnode;
+	struct llist_node *t, *tnode, *head;
 	unsigned long count = 0;
 
-	head = raw_hwp_list_head(folio);
-	llist_for_each_safe(tnode, t, head->first) {
+	head = llist_del_all(raw_hwp_list_head(folio));
+	llist_for_each_safe(tnode, t, head) {
 		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
 
 		if (move_flag)
@@ -1840,7 +1839,6 @@  static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 		kfree(p);
 		count++;
 	}
-	llist_del_all(head);
 	return count;
 }