Message ID | 20230623164015.3431990-2-jiaqiyan@google.com |
---|---|
State | New |
Headers | Subject: [PATCH v2 1/4] mm/hwpoison: delete all entries before traversal in __folio_free_raw_hwp; From: Jiaqi Yan <jiaqiyan@google.com>; To: mike.kravetz@oracle.com, naoya.horiguchi@nec.com; Cc: songmuchun@bytedance.com, shy828301@gmail.com, linmiaohe@huawei.com, akpm@linux-foundation.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, duenwen@google.com, axelrasmussen@google.com, jthoughton@google.com, Jiaqi Yan <jiaqiyan@google.com>; Date: Fri, 23 Jun 2023 16:40:12 +0000; In-Reply-To: <20230623164015.3431990-1-jiaqiyan@google.com>; List-ID: <linux-kernel.vger.kernel.org> |
Series | Improve hugetlbfs read on HWPOISON hugepages |
Commit Message
Jiaqi Yan
June 23, 2023, 4:40 p.m. UTC
Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
are deleted from the llist.

llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
from __update_and_free_hugetlb_folio and memory_failure won't need
explicit locking when freeing the raw_hwp_list.

Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
---
mm/memory-failure.c | 8 +++-----
1 file changed, 3 insertions(+), 5 deletions(-)
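
To make the pattern in the commit message concrete, here is a minimal user-space sketch in C11. It is not the kernel code: the `demo_*` types and functions are hypothetical stand-ins for `struct llist_head`, `llist_add()` and `llist_del_all()`, and the sketch assumes only that detaching the list can be modeled as a single atomic exchange of the head pointer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for the kernel's llist types. */
struct demo_node {
	struct demo_node *next;
	unsigned long pfn;
};

struct demo_head {
	_Atomic(struct demo_node *) first;
};

/* Analogue of llist_add(): lock-free push onto the front of the list. */
static void demo_add(struct demo_head *head, struct demo_node *node)
{
	struct demo_node *old = atomic_load(&head->first);

	do {
		node->next = old;
	} while (!atomic_compare_exchange_weak(&head->first, &old, node));
}

/* Analogue of llist_del_all(): atomically detach the whole chain. */
static struct demo_node *demo_del_all(struct demo_head *head)
{
	return atomic_exchange(&head->first, (struct demo_node *)NULL);
}

/* Detach first, then walk and free the now-private chain. */
static unsigned long demo_free_all(struct demo_head *head)
{
	struct demo_node *node = demo_del_all(head);
	unsigned long count = 0;

	while (node) {
		struct demo_node *next = node->next; /* read before free */

		free(node);
		node = next;
		count++;
	}
	return count;
}

/* Push three entries, free them, and check a second call finds nothing. */
unsigned long demo_run(void)
{
	struct demo_head head;
	unsigned long freed;

	atomic_init(&head.first, (struct demo_node *)NULL);
	for (unsigned long pfn = 0; pfn < 3; pfn++) {
		struct demo_node *node = malloc(sizeof(*node));

		assert(node);
		node->pfn = pfn;
		demo_add(&head, node);
	}
	freed = demo_free_all(&head);
	assert(demo_free_all(&head) == 0);
	return freed;
}
```

Calling `demo_run()` returns 3. In this sketch, once the chain has been detached it can no longer be reached through the head, so the walk-and-free loop touches only private nodes, and two concurrent callers of `demo_free_all()` would each get a disjoint chain (or an empty one).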
Comments
On Fri, Jun 23, 2023 at 04:40:12PM +0000, Jiaqi Yan wrote:
> Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> are deleted from the llist.
>
> llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> from __update_and_free_hugetlb_folio and memory_failure won't need
> explicit locking when freeing the raw_hwp_list.
>
> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>

(Sorry if stupid question...) folio_set_hugetlb_hwpoison() also calls
llist_for_each_safe() but it still traverses the list without calling
llist_del_all(). This convention applies only when removing item(s)?

Thanks,
Naoya Horiguchi

> ---
>  mm/memory-failure.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
>
> diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> index 004a02f44271..c415c3c462a3 100644
> --- a/mm/memory-failure.c
> +++ b/mm/memory-failure.c
> @@ -1825,12 +1825,11 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
>
>  static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
>  {
> -	struct llist_head *head;
> -	struct llist_node *t, *tnode;
> +	struct llist_node *t, *tnode, *head;
>  	unsigned long count = 0;
>
> -	head = raw_hwp_list_head(folio);
> -	llist_for_each_safe(tnode, t, head->first) {
> +	head = llist_del_all(raw_hwp_list_head(folio));
> +	llist_for_each_safe(tnode, t, head) {
>  		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
>
>  		if (move_flag)
> @@ -1840,7 +1839,6 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
>  		kfree(p);
>  		count++;
>  	}
> -	llist_del_all(head);
>  	return count;
>  }
>
> --
> 2.41.0.162.gfafddb0af9-goog
>

On Fri, Jun 30, 2023 at 7:52 AM Naoya Horiguchi <naoya.horiguchi@linux.dev> wrote:
>
> On Fri, Jun 23, 2023 at 04:40:12PM +0000, Jiaqi Yan wrote:
> > Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> > are deleted from the llist.
> >
> > llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> > from __update_and_free_hugetlb_folio and memory_failure won't need
> > explicit locking when freeing the raw_hwp_list.
> >
> > Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
>
> (Sorry if stupid question...) folio_set_hugetlb_hwpoison() also calls
> llist_for_each_safe() but it still traverses the list without calling
> llist_del_all(). This convention applies only when removing item(s)?

I think in our previous discussion, Mike and I agree as of today's
code in hugetlb.c and memory-failure.c, concurrent adding, deleting,
traversing are fine with each other and with themselves [1], but new
code need to be careful wrt ops on raw_hwp_list.

This patch is a low-hanging fruit to ensure any caller of
__folio_free_raw_hwp won't introduce any problem by correcting one
thing in __folio_free_raw_hwp: since it wants to delete raw_hwp_page
entries in the list, it should do it by first llist_del_all, and then
kfree with a llist_for_each_safe.

As for folio_set_hugetlb_hwpoison, I am not very comfortable fixing
it. I imagine a way to fix it is llist_del_all() =>
llist_for_each_safe{...} => llist_add_batch(), or llist_add() within
llist_for_each_safe{...}. I haven't really thought through if this is
a correct fix.

[1] https://lore.kernel.org/lkml/CACw3F51o1ZFSYZa+XLnk4Wwjy2w_q=Kn+aOQs0=qpfG-ZYDFKg@mail.gmail.com/#t

>
> Thanks,
> Naoya Horiguchi
>
> > ---
> >  mm/memory-failure.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 004a02f44271..c415c3c462a3 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1825,12 +1825,11 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
> >
> >  static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
> >  {
> > -	struct llist_head *head;
> > -	struct llist_node *t, *tnode;
> > +	struct llist_node *t, *tnode, *head;
> >  	unsigned long count = 0;
> >
> > -	head = raw_hwp_list_head(folio);
> > -	llist_for_each_safe(tnode, t, head->first) {
> > +	head = llist_del_all(raw_hwp_list_head(folio));
> > +	llist_for_each_safe(tnode, t, head) {
> >  		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
> >
> >  		if (move_flag)
> > @@ -1840,7 +1839,6 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
> >  		kfree(p);
> >  		count++;
> >  	}
> > -	llist_del_all(head);
> >  	return count;
> >  }
> >
> > --
> > 2.41.0.162.gfafddb0af9-goog
> >

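
The detach, traverse, re-add idea floated in the reply above for folio_set_hugetlb_hwpoison can be illustrated with a small user-space sketch in C11. This is only an illustration of the sequence, not a proposed kernel fix: the `sk_*` names are hypothetical analogues of `llist_del_all()` and `llist_add_batch()`, and the sketch says nothing about whether the approach is correct for raw_hwp_list.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel's llist types. */
struct sk_node {
	struct sk_node *next;
	unsigned long pfn;
};

struct sk_head {
	_Atomic(struct sk_node *) first;
};

/* Analogue of llist_add_batch(): splice the chain first..last onto the head. */
static void sk_add_batch(struct sk_head *head, struct sk_node *first,
			 struct sk_node *last)
{
	struct sk_node *old = atomic_load(&head->first);

	do {
		last->next = old;
	} while (!atomic_compare_exchange_weak(&head->first, &old, first));
}

/* Analogue of llist_del_all(): atomically detach the whole chain. */
static struct sk_node *sk_del_all(struct sk_head *head)
{
	return atomic_exchange(&head->first, (struct sk_node *)NULL);
}

/* Detach the list, search the private chain, then put the chain back. */
static bool sk_contains(struct sk_head *head, unsigned long pfn)
{
	struct sk_node *first = sk_del_all(head);
	struct sk_node *last = NULL;
	bool found = false;

	for (struct sk_node *n = first; n; n = n->next) {
		if (n->pfn == pfn)
			found = true;
		last = n; /* remember the tail for the re-add */
	}
	if (first)
		sk_add_batch(head, first, last);
	return found;
}

/* Build a two-entry list and check lookups leave it intact. */
int sk_run(void)
{
	struct sk_node a = { .pfn = 10 }, b = { .pfn = 20 };
	struct sk_head head;

	atomic_init(&head.first, (struct sk_node *)NULL);
	sk_add_batch(&head, &a, &a);
	sk_add_batch(&head, &b, &b);

	return sk_contains(&head, 20) && !sk_contains(&head, 30) &&
	       atomic_load(&head.first) == &b;
}
```

`sk_run()` returns 1. One consequence visible even in this sketch is that between the detach and the re-add, any other thread looking at the head sees an empty list, which may be part of why such a change needs more thought than the simple free path.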
On Fri, Jun 30, 2023 at 01:59:23PM -0700, Jiaqi Yan wrote:
> On Fri, Jun 30, 2023 at 7:52 AM Naoya Horiguchi
> <naoya.horiguchi@linux.dev> wrote:
> >
> > On Fri, Jun 23, 2023 at 04:40:12PM +0000, Jiaqi Yan wrote:
> > > Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> > > are deleted from the llist.
> > >
> > > llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> > > from __update_and_free_hugetlb_folio and memory_failure won't need
> > > explicit locking when freeing the raw_hwp_list.
> > >
> > > Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> >
> > (Sorry if stupid question...) folio_set_hugetlb_hwpoison() also calls
> > llist_for_each_safe() but it still traverses the list without calling
> > llist_del_all(). This convention applies only when removing item(s)?
>
> I think in our previous discussion, Mike and I agree as of today's
> code in hugetlb.c and memory-failure.c, concurrent adding, deleting,
> traversing are fine with each other and with themselves [1], but new
> code need to be careful wrt ops on raw_hwp_list.
>
> This patch is a low-hanging fruit to ensure any caller of
> __folio_free_raw_hwp won't introduce any problem by correcting one
> thing in __folio_free_raw_hwp: since it wants to delete raw_hwp_page
> entries in the list, it should do it by first llist_del_all, and then
> kfree with a llist_for_each_safe.

Thanks for the explanation, this is worth adding to the patch description
for future developers to understand the background.

>
> As for folio_set_hugetlb_hwpoison, I am not very comfortable fixing
> it. I imagine a way to fix it is llist_del_all() =>
> llist_for_each_safe{...} => llist_add_batch(), or llist_add() within
> llist_for_each_safe{...}. I haven't really thought through if this is
> a correct fix.

I see. Changing folio_set_hugetlb_hwpoison() like this is a little too
complex considering that this fix is for precaution. So no change on
this for now is fine to me.

Anyway this patch looks fine to me.

Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>

>
> [1] https://lore.kernel.org/lkml/CACw3F51o1ZFSYZa+XLnk4Wwjy2w_q=Kn+aOQs0=qpfG-ZYDFKg@mail.gmail.com/#t
>
> >
> > Thanks,
> > Naoya Horiguchi
> >
> > > ---
> > >  mm/memory-failure.c | 8 +++-----
> > >  1 file changed, 3 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > > index 004a02f44271..c415c3c462a3 100644
> > > --- a/mm/memory-failure.c
> > > +++ b/mm/memory-failure.c
> > > @@ -1825,12 +1825,11 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
> > >
> > >  static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
> > >  {
> > > -	struct llist_head *head;
> > > -	struct llist_node *t, *tnode;
> > > +	struct llist_node *t, *tnode, *head;
> > >  	unsigned long count = 0;
> > >
> > > -	head = raw_hwp_list_head(folio);
> > > -	llist_for_each_safe(tnode, t, head->first) {
> > > +	head = llist_del_all(raw_hwp_list_head(folio));
> > > +	llist_for_each_safe(tnode, t, head) {
> > >  		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
> > >
> > >  		if (move_flag)
> > > @@ -1840,7 +1839,6 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
> > >  		kfree(p);
> > >  		count++;
> > >  	}
> > > -	llist_del_all(head);
> > >  	return count;
> > >  }
> > >
> > > --
> > > 2.41.0.162.gfafddb0af9-goog
> > >

On 06/23/23 16:40, Jiaqi Yan wrote:
> Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> are deleted from the llist.
>
> llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> from __update_and_free_hugetlb_folio and memory_failure won't need
> explicit locking when freeing the raw_hwp_list.
>
> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> ---
>  mm/memory-failure.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)

After updating the reason for patch in commit message as suggested by Naoya,

Acked-by: Mike Kravetz <mike.kravetz@oracle.com>

On Wed, Jul 5, 2023 at 4:36 PM Mike Kravetz <mike.kravetz@oracle.com> wrote:
>
> On 06/23/23 16:40, Jiaqi Yan wrote:
> > Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
> > are deleted from the llist.
> >
> > llist_del_all are lock free with itself. folio_clear_hugetlb_hwpoison()s
> > from __update_and_free_hugetlb_folio and memory_failure won't need
> > explicit locking when freeing the raw_hwp_list.
> >
> > Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
> > ---
> >  mm/memory-failure.c | 8 +++-----
> >  1 file changed, 3 insertions(+), 5 deletions(-)
>
> After updating the reason for patch in commit message as suggested by Naoya,

Thank you both Mike and Naoya! I will add the explanation in the next version.

>
> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
>
> --
> Mike Kravetz
>
> >
> > diff --git a/mm/memory-failure.c b/mm/memory-failure.c
> > index 004a02f44271..c415c3c462a3 100644
> > --- a/mm/memory-failure.c
> > +++ b/mm/memory-failure.c
> > @@ -1825,12 +1825,11 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
> >
> >  static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
> >  {
> > -	struct llist_head *head;
> > -	struct llist_node *t, *tnode;
> > +	struct llist_node *t, *tnode, *head;
> >  	unsigned long count = 0;
> >
> > -	head = raw_hwp_list_head(folio);
> > -	llist_for_each_safe(tnode, t, head->first) {
> > +	head = llist_del_all(raw_hwp_list_head(folio));
> > +	llist_for_each_safe(tnode, t, head) {
> >  		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
> >
> >  		if (move_flag)
> > @@ -1840,7 +1839,6 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
> >  		kfree(p);
> >  		count++;
> >  	}
> > -	llist_del_all(head);
> >  	return count;
> >  }
> >
> > --
> > 2.41.0.162.gfafddb0af9-goog
> >

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index 004a02f44271..c415c3c462a3 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1825,12 +1825,11 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
 
 static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 {
-	struct llist_head *head;
-	struct llist_node *t, *tnode;
+	struct llist_node *t, *tnode, *head;
 	unsigned long count = 0;
 
-	head = raw_hwp_list_head(folio);
-	llist_for_each_safe(tnode, t, head->first) {
+	head = llist_del_all(raw_hwp_list_head(folio));
+	llist_for_each_safe(tnode, t, head) {
 		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
 
 		if (move_flag)
@@ -1840,7 +1839,6 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 		kfree(p);
 		count++;
 	}
-	llist_del_all(head);
 	return count;
 }