From patchwork Fri Jul 7 20:19:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaqi Yan X-Patchwork-Id: 117282 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f45:0:b0:3ea:f831:8777 with SMTP id v5csp3524189vqx; Fri, 7 Jul 2023 13:33:40 -0700 (PDT) X-Google-Smtp-Source: APBJJlHN5qjdDpykPRhPlRcKQeYt5xoOA5FNbRK/ptFL0iBIogF2eg1nRGSblDzaPsFHYE+/jmiX X-Received: by 2002:a17:90a:a613:b0:263:a966:7a75 with SMTP id c19-20020a17090aa61300b00263a9667a75mr4239780pjq.49.1688762020193; Fri, 07 Jul 2023 13:33:40 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688762020; cv=none; d=google.com; s=arc-20160816; b=hjyGOgbX4ZDo+eH/PJqyGGdbBw4furHr6Cc28W4cUghnNhoG3ZzpxlwyEJO3Oq7amk DTUH2vJpmhwiMhBr8lCDi3tdtiZw0QZt3CNTNppdE0c9Ytllu3BVeK9gcRgtmbQVywgZ rxtJOUTGgSK4DvKplrnImMkQ2lRFjmj4KS1X1jy/nqgZzrFtXxU5pDKu846OuP9Ac2Vk qPKFn6KPyUILDjM+KLJ0ThQXxqgDf6RxCdUoE7l1clRAK6hAVsHOW1Zg6HPsTln7GehF MEenYJFWYALXnhyi/ucyBHc2wQ64G7pV4SRwCwUySzETEJgOF8UnDdiPuoKcnXuB9Ene ifbQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=8FNetjBnnWs4aWcoESXEPv5EQNod9M4H8ra/lwVlWng=; fh=dXuKmv+n+4RrXbgVwL2QlK4VwZMBf3WumGals2t2WtM=; b=ofmSEb250iAp8RFCFqvLnZvlwvpjPiATMPFy08YS8+S4X9dKsjCFOo4Qzt0iv45Yyv HtdFdhUA06EglEr4ScM9fLctSaSC3aC/aNh3LXKW9uSJIUxGgMkpld832RIrxzbkS/C3 ePV02qJZzNpPdpfKkDn5S9zKyptdS5N1vUAVOa5PJWbnJnp4f64pwka0vLauTGcOYzYi zx9ZnP1y1GJEXQLEd94M4sNMVD6xuNXhaB+md+Wbd5AobAGF253MKBaDZQLXH3a0hq18 gAQqSz6yoB6GMVaBntuH2cZuwvCsufz5F1BnQOrKKPoZdt05sNrKf902N289LPMtZwxU LA4Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=VqyQmTQk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass 
(p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id l6-20020a17090a598600b0024e5ed38294si2728866pji.66.2023.07.07.13.33.26; Fri, 07 Jul 2023 13:33:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=VqyQmTQk; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232747AbjGGUTZ (ORCPT + 99 others); Fri, 7 Jul 2023 16:19:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45160 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232740AbjGGUTU (ORCPT ); Fri, 7 Jul 2023 16:19:20 -0400 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C01382114 for ; Fri, 7 Jul 2023 13:19:16 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id d9443c01a7336-1b8130aceefso30900095ad.2 for ; Fri, 07 Jul 2023 13:19:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1688761156; x=1691353156; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=8FNetjBnnWs4aWcoESXEPv5EQNod9M4H8ra/lwVlWng=; b=VqyQmTQkCzL2+2WReICnzA6BIm0pcsZl2UxgOC96YeMxOiWjK9NhMMsd6sh9vgNsyp l48aVL+YSen8q6QehXxVJmNvqqI169EdhdJF1wP18IrISpvNJpDmE8qvdh8iCozeWtb6 JHgOCmY5zsqIie6UwoBTIJKIq84brRDTo5Ou4o4qfUOH/nsVpBgCFa70jcSCS8JQjRGc 
YPkr1zj/Al/4tkKYO5OxobtQsE4XPWag4pb/fMTB6fXhhvDh+IoKStTY4VGEtEBkAolL +GRm79XEsc5P3ls1k+A19D7pSp5BkV+9M69nyu7mjSupo4UAMKBhqh3ZJivJ7t7YjP3Z tU8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688761156; x=1691353156; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=8FNetjBnnWs4aWcoESXEPv5EQNod9M4H8ra/lwVlWng=; b=VpV0Qg/ynk+7W2KX8qdFPvY1s1EMr4qx7aNSKou0Pf3Q9hUO3ZtI1lvPATg/jcI+iP ZMvBunc6qKVx+uKdmGS8z2ni7Z5sVyc34rtAgQxCpRKD5O5FbiujJWjZhnHbaBExYGDr MB5GaOBPrl3CJ0Sbnk0WQKd95CQ+sVoE4ogLBN7fbWAMGrB4gh67/CnQBsQevhGwnTtB kclM1NkA0HHWG7CtDjdVKOTZKdOaMRDoQHiN+eJHRA4HXS9qZ3rGDxwPHKoxETYJ6XX2 xCpqxYObfefOE47FrqrURMjCpASsYEYZapuc7A+wE2kLGVcwFtWioxMN7tO6GNiarWzj GIDQ== X-Gm-Message-State: ABy/qLYoHPihWsJupAghCCLQmRZxTIv6ZeZ7b0nzEvr8Rt9xS8IioGaf 3sslkk4f0eyVunuZRGg8/BM+GFPsNkm3Hg== X-Received: from yjq3.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:272f]) (user=jiaqiyan job=sendgmr) by 2002:a17:902:ec88:b0:1b7:fa05:e0c7 with SMTP id x8-20020a170902ec8800b001b7fa05e0c7mr5745441plg.13.1688761156325; Fri, 07 Jul 2023 13:19:16 -0700 (PDT) Date: Fri, 7 Jul 2023 20:19:01 +0000 In-Reply-To: <20230707201904.953262-1-jiaqiyan@google.com> Mime-Version: 1.0 References: <20230707201904.953262-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230707201904.953262-2-jiaqiyan@google.com> Subject: [PATCH v3 1/4] mm/hwpoison: delete all entries before traversal in __folio_free_raw_hwp From: Jiaqi Yan To: akpm@linux-foundation.org, mike.kravetz@oracle.com, naoya.horiguchi@nec.com Cc: songmuchun@bytedance.com, shy828301@gmail.com, linmiaohe@huawei.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, duenwen@google.com, axelrasmussen@google.com, jthoughton@google.com, Jiaqi Yan X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, 
SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1770795324215715821?=
X-GMAIL-MSGID: =?utf-8?q?1770795324215715821?=

Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
are deleted from the llist. Correct the way __folio_free_raw_hwp deletes
and frees raw_hwp_page entries in raw_hwp_list: first llist_del_all, then
kfree within llist_for_each_safe.

As of today, concurrent adding, deleting, and traversal on raw_hwp_list
from hugetlb.c and/or memory-failure.c are fine with each other. Note
this is guaranteed partly by the lock-free nature of llist, and partly
by holding hugetlb_lock and/or mf_mutex. For example, as llist_del_all
is lock-free with itself, folio_clear_hugetlb_hwpoison()s from
__update_and_free_hugetlb_folio and memory_failure won't need explicit
locking when freeing the raw_hwp_list.

New code that manipulates raw_hwp_list must be careful to ensure the
concurrency correctness.
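
For illustration only, not part of the patch: a minimal userspace sketch
of the detach-then-traverse pattern described above. It uses C11 atomics
as a stand-in for the kernel's llist. struct node, struct list_head,
list_push, list_del_all, list_free_all and demo_drain are hypothetical
names chosen for this sketch, not kernel APIs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace stand-ins for llist_node / llist_head. */
struct node {
	struct node *next;
	int value;
};

struct list_head {
	_Atomic(struct node *) first;
};

/* Lock-free push onto the head, in the spirit of llist_add(). */
static void list_push(struct list_head *h, struct node *n)
{
	struct node *old = atomic_load(&h->first);

	do {
		n->next = old;	/* old is refreshed when the CAS fails */
	} while (!atomic_compare_exchange_weak(&h->first, &old, n));
}

/* Atomically detach the whole chain, in the spirit of llist_del_all(). */
static struct node *list_del_all(struct list_head *h)
{
	return atomic_exchange(&h->first, (struct node *)NULL);
}

/*
 * Detach first, then walk and free the now-private chain. Concurrent
 * pushers only ever see the (empty) head, never a node being freed.
 */
static unsigned long list_free_all(struct list_head *h)
{
	struct node *n = list_del_all(h);
	unsigned long count = 0;

	while (n) {
		struct node *next = n->next;	/* read before freeing n */

		free(n);
		n = next;
		count++;
	}
	return count;
}

/* Illustration helper: push nr nodes, drain them, return how many were freed. */
unsigned long demo_drain(int nr)
{
	struct list_head head;
	unsigned long freed;

	atomic_init(&head.first, (struct node *)NULL);
	for (int i = 0; i < nr; i++) {
		struct node *n = malloc(sizeof(*n));

		assert(n);
		n->value = i;
		list_push(&head, n);
	}
	freed = list_free_all(&head);
	assert(atomic_load(&head.first) == NULL);
	return freed;
}
```

The point of the ordering is that after the exchange the walker owns the
chain exclusively, so freeing nodes during the walk cannot race with a
concurrent push. Walking head->first in place and clearing the head only
afterwards, as the old code did, gives no such guarantee.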

Acked-by: Mike Kravetz
Acked-by: Naoya Horiguchi
Signed-off-by: Jiaqi Yan
---
 mm/memory-failure.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index e245191e6b04..a08677dcf953 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1829,12 +1829,11 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
 
 static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 {
-	struct llist_head *head;
-	struct llist_node *t, *tnode;
+	struct llist_node *t, *tnode, *head;
 	unsigned long count = 0;
 
-	head = raw_hwp_list_head(folio);
-	llist_for_each_safe(tnode, t, head->first) {
+	head = llist_del_all(raw_hwp_list_head(folio));
+	llist_for_each_safe(tnode, t, head) {
 		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
 
 		if (move_flag)
@@ -1844,7 +1843,6 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 		kfree(p);
 		count++;
 	}
-	llist_del_all(head);
 	return count;
 }
 

From patchwork Fri Jul 7 20:19:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 117281
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:9f45:0:b0:3ea:f831:8777 with SMTP id v5csp3520463vqx; Fri, 7 Jul 2023 13:25:24 -0700 (PDT)
X-Google-Smtp-Source: APBJJlHT4fOJhm+pM/vWW/D047RgouG1YQ84l/p3r3xwoyFCFeD8o9Zc3FbaoCOxHhVo+2tu2m2n
X-Received: by 2002:a05:6402:12d8:b0:51d:9d59:7a11 with SMTP id k24-20020a05640212d800b0051d9d597a11mr4441046edx.4.1688761524087; Fri, 07 Jul 2023 13:25:24 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1688761524; cv=none; d=google.com; s=arc-20160816; b=aWJlmazrf35I9T+eeoIR+7nk5wmiT1YzMbkQQ5JyQqQs1fZ/h6XGdVzS8MtiJXkjbK 2LaK2o7KCHHajq6tryfFfUV4fyuESPDXYwo8+lwz+ayHRucSPKQeA4Ro0/0Vzb0bSgWW 8vnbxS+B/OgNd1Rimvns/49vHjBTT7L6Pn4DFmGtm34wZBXJwvPH5xODn7PoEdHz4BuG slgRN7MIcg+qT+iPBUdim+gHKkrFxLoFN0CL7Pcw/uRms6MO02xkozc9WQ2yD2acujpU
5q/5TduQHC57HCcupPQ0Z8280Ud6Cx8lWC1Ov7J4hTxzf+uZeZenXlfnJesLzZ2jOZI0 AdQA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=lhHFjAXtuMYeiDPTb9enUa1sagPmUht2lF56qHA5nAY=; fh=dXuKmv+n+4RrXbgVwL2QlK4VwZMBf3WumGals2t2WtM=; b=z7gmXDGhcaZNiXwJ296Ewd5gg6zJ9l/gYwMiP3j+Lv1kh5PfNoTMYJ72wk5skdePJp MYUhCBbY2Kla92lmFqzikAdOZa5eCtMrnjAHddUx2ldFgFrPLkkhQNvWZmBQIioGzXP4 /bpSvPjhaTVrnMeihuCZ3KzOUpC03wJDF+Q8U+3EppSEalKEbFkiTe1oH896q88c/gjo 3iy9nwVKNyDbXy8m64bKY/9zXJJVRzkLyt+8whU8F7wurSJN/+lsZIcfnuoM3PhVo+GJ w3/i7aXYkMmykg+H3CqbLNlIylUDnVmEMnUJ8O2wm5RlBLZtVmrYz8PkvsDJ/EZTt8/m l0+Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=Lm821Q9A; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id j20-20020aa7ca54000000b0051a7bccf383si2622084edt.86.2023.07.07.13.25.00; Fri, 07 Jul 2023 13:25:24 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=Lm821Q9A; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232805AbjGGUT2 (ORCPT + 99 others); Fri, 7 Jul 2023 16:19:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45180 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232770AbjGGUTW (ORCPT ); Fri, 7 Jul 2023 16:19:22 -0400 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38DB52126 for ; Fri, 7 Jul 2023 13:19:18 -0700 (PDT) Received: by mail-pg1-x54a.google.com with SMTP id 41be03b00d2f7-55b2c66d713so3346115a12.2 for ; Fri, 07 Jul 2023 13:19:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1688761158; x=1691353158; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=lhHFjAXtuMYeiDPTb9enUa1sagPmUht2lF56qHA5nAY=; b=Lm821Q9A8z9/jwUhSfYSBmBg3a1CJgB8uC9S29RR8iamr7rpAYSfO0Eeq787JeWICj aSBaqXte1TU+6+8Sh0EwU0DomQ2dskj2yaXJhdXbMBNjJ8QUuqx8PFueYsGG0LTCGMQM AG5oIBFVs6KD00JEaeesIKk4hiB7jSUeRJ5U9ZVeA4Xg20DZQrowBV2mwu3oDdpyaXg2 VSPJ+oq7NOFgZNlDG3dULstPjbk7FgX8TBv4efKwHSj4dG2SHkzR24FKkM89NmGdxa+V ORZx2FMfNdL9uzR1qedaRjJLe7tuSJorO9cCwuwQmen2ZaI5XXih1iy8y9odfFxhoErc 4nvg== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688761158; x=1691353158; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=lhHFjAXtuMYeiDPTb9enUa1sagPmUht2lF56qHA5nAY=; b=OQtSlw81L/X+Cft54edpqtYZzn3QBJ3yuyRk/rQ2bjQdEkJDi9UMNOX727vcZ6nkIE eptZ2Pz77edVoNmPdJNp4C6nytOp8ejqofkCtajlVTF2STMm+HgGW4gVIRVA5KQYeZQt s1akhcl3olJvd0Ccfz2UoRTbmaNbjMsJ2j9Lpap0NUQVW3vD422rne0ulZ5Z5QaqN7Cg chJpFAt5HebKq/otse+w2ChXO2KhQ7UZxj6kX20nCgIwbzkgFbC7fE1+ICCdcAQd6fI0 WBAQAdJfsK4rmp6SZsnlYfzd+9YHdaji20DPasvd/5gNG+iuqFN3+/Psj2rjHIIi+cwE zTAA== X-Gm-Message-State: ABy/qLbAIIN9DN0JLZr1iaXm+jDS8Q53DSvRP4bymIRoQIT9X/N/9jze Tl1aZGnkyR+IGbxRS2zxYPcSzIWYLHOKSA== X-Received: from yjq3.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:272f]) (user=jiaqiyan job=sendgmr) by 2002:a63:7f51:0:b0:557:5649:381 with SMTP id p17-20020a637f51000000b0055756490381mr4012081pgn.3.1688761157717; Fri, 07 Jul 2023 13:19:17 -0700 (PDT) Date: Fri, 7 Jul 2023 20:19:02 +0000 In-Reply-To: <20230707201904.953262-1-jiaqiyan@google.com> Mime-Version: 1.0 References: <20230707201904.953262-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230707201904.953262-3-jiaqiyan@google.com> Subject: [PATCH v3 2/4] mm/hwpoison: check if a subpage of a hugetlb folio is raw HWPOISON From: Jiaqi Yan To: akpm@linux-foundation.org, mike.kravetz@oracle.com, naoya.horiguchi@nec.com Cc: songmuchun@bytedance.com, shy828301@gmail.com, linmiaohe@huawei.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, duenwen@google.com, axelrasmussen@google.com, jthoughton@google.com, Jiaqi Yan X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 
(2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1770794803647405252?=
X-GMAIL-MSGID: =?utf-8?q?1770794803647405252?=

Add the functionality, is_raw_hwp_subpage, to tell if a subpage of a
hugetlb folio is a raw HWPOISON page. This functionality relies on
RawHwpUnreliable not being set; otherwise the hugepage's raw HWPOISON
list becomes meaningless.

is_raw_hwp_subpage needs to hold hugetlb_lock in order to synchronize
with __get_huge_page_for_hwpoison, which iterates over raw_hwp_list and
inserts an entry into it. llist itself doesn't ensure insertion is
synchronized with the iteration used by __is_raw_hwp_subpage. The caller
can minimize the overhead of lock cycles by first checking whether the
folio / head page's HWPOISON flag is set.

Export this functionality so it can be used immediately in the read
operation for hugetlbfs.

Reviewed-by: Mike Kravetz
Reviewed-by: Naoya Horiguchi
Signed-off-by: Jiaqi Yan
---
 include/linux/hugetlb.h | 19 +++++++++++++++++++
 include/linux/mm.h      |  7 +++++++
 mm/hugetlb.c            | 10 ++++++++++
 mm/memory-failure.c     | 34 ++++++++++++++++++++++++----------
 4 files changed, 60 insertions(+), 10 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ca3c8e10f24a..4a745af98525 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1007,6 +1007,25 @@ void hugetlb_register_node(struct node *node);
 void hugetlb_unregister_node(struct node *node);
 #endif
 
+/*
+ * Struct raw_hwp_page represents information about "raw error page",
+ * constructing singly linked list from ->_hugetlb_hwpoison field of folio.
+ */
+struct raw_hwp_page {
+	struct llist_node node;
+	struct page *page;
+};
+
+static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
+{
+	return (struct llist_head *)&folio->_hugetlb_hwpoison;
+}
+
+/*
+ * Check if a given raw @subpage in a hugepage @folio is HWPOISON.
+ */
+bool is_raw_hwp_subpage(struct folio *folio, struct page *subpage);
+
 #else /* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 74f1be743ba2..edaa18b6f731 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3680,6 +3680,7 @@ extern const struct attribute_group memory_failure_attr_group;
 extern void memory_failure_queue(unsigned long pfn, int flags);
 extern int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 					bool *migratable_cleared);
+extern bool __is_raw_hwp_subpage(struct folio *folio, struct page *subpage);
 void num_poisoned_pages_inc(unsigned long pfn);
 void num_poisoned_pages_sub(unsigned long pfn, long i);
 struct task_struct *task_early_kill(struct task_struct *tsk, int force_early);
@@ -3694,6 +3695,12 @@ static inline int __get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 	return 0;
 }
 
+static inline bool __is_raw_hwp_subpage(struct folio *folio,
+					struct page *subpage)
+{
+	return false;
+}
+
 static inline void num_poisoned_pages_inc(unsigned long pfn)
 {
 }
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bce28cca73a1..9c608d2f6630 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7373,6 +7373,16 @@ int get_huge_page_for_hwpoison(unsigned long pfn, int flags,
 	return ret;
 }
 
+bool is_raw_hwp_subpage(struct folio *folio, struct page *subpage)
+{
+	bool ret;
+
+	spin_lock_irq(&hugetlb_lock);
+	ret = __is_raw_hwp_subpage(folio, subpage);
+	spin_unlock_irq(&hugetlb_lock);
+	return ret;
+}
+
 void folio_putback_active_hugetlb(struct folio *folio)
 {
 	spin_lock_irq(&hugetlb_lock);
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a08677dcf953..5b6c8ceb13c0 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1813,18 +1813,32 @@ EXPORT_SYMBOL_GPL(mf_dax_kill_procs);
 #endif /* CONFIG_FS_DAX */
 
 #ifdef CONFIG_HUGETLB_PAGE
-/*
- * Struct raw_hwp_page represents information about "raw error page",
- * constructing singly linked list from ->_hugetlb_hwpoison field of folio.
- */
-struct raw_hwp_page {
-	struct llist_node node;
-	struct page *page;
-};
 
-static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
+bool __is_raw_hwp_subpage(struct folio *folio, struct page *subpage)
 {
-	return (struct llist_head *)&folio->_hugetlb_hwpoison;
+	struct llist_head *raw_hwp_head;
+	struct raw_hwp_page *p, *tmp;
+	bool ret = false;
+
+	if (!folio_test_hwpoison(folio))
+		return false;
+
+	/*
+	 * When RawHwpUnreliable is set, kernel lost track of which subpages
+	 * are HWPOISON. So return as if ALL subpages are HWPOISONed.
+	 */
+	if (folio_test_hugetlb_raw_hwp_unreliable(folio))
+		return true;
+
+	raw_hwp_head = raw_hwp_list_head(folio);
+	llist_for_each_entry_safe(p, tmp, raw_hwp_head->first, node) {
+		if (subpage == p->page) {
+			ret = true;
+			break;
+		}
+	}
+
+	return ret;
 }
 
 static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)

From patchwork Fri Jul 7 20:19:03 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 117283
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:9f45:0:b0:3ea:f831:8777 with SMTP id v5csp3524193vqx; Fri, 7 Jul 2023 13:33:40 -0700 (PDT)
X-Google-Smtp-Source: APBJJlHLaA9Qv95MwP4ZbrdbyqG3NDGvsRxb/9IlCpiMG120sjzgd8np5Vs1QtRXygBBp1JhsHBx
X-Received: by 2002:a05:6a00:80f:b0:681:eddd:51fb with SMTP id m15-20020a056a00080f00b00681eddd51fbmr7296354pfk.18.1688762020281; Fri, 07 Jul 2023 13:33:40 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1688762020; cv=none; d=google.com; s=arc-20160816; b=etbjEbAIfZb8ELxoOCP5jDR2S8RZ3IV8KcutEvj18QQUsGNqbFn0PMeN4kVys7BMgy ojztm8AEsmt+RoxafdTXL+KipndHqcc1pHcXQoS1WF/gYM5SYWoKQsTKTccqJTO7bGU5 stvdo9YtGYGknXE97OCNG4D2vhZZ/5Jo4zwk77lku/4o6PH6Sk9Tqm4GiInuVTSRcEhf yab51EqbISTMvOUR4awE+7v3VX77qb+Gjuqdfi+b/qudbdfk6USK9jO+eP2ffqPER9cR 29Rj8gqCa4Zr1m/Q2FI4fzq2KVlR71D7N7/H9GPBK25y+XcsDRYUV4sAXI3KmLsa6TWO HGiQ==
ARC-Message-Signature: i=1; a=rsa-sha256;
c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=A+DZafON7JsoL7Ay0CKBI9HDnuWj0nfz7I/atmY5O6E=; fh=dXuKmv+n+4RrXbgVwL2QlK4VwZMBf3WumGals2t2WtM=; b=uZfHqZr1xsg3kNleAOBMXnaj4AuS/EPxU0PlpygTVJCAAFEGcNtMvlKzp2wrLScoda 0Wv7h5a+PCcB8kURgd8mfDWwM4rJ/oqXPUiYUAjWliN+TSrJA0is9XYWo6dYN3sj+a95 +PTTv6suAkCwGkfX2wrgxwDypC2FEZJZ+jLaFMGTCFbbt33UtNv8wZECMJ+wQBQzwG23 Vygoc6nNi8umJJqrs8FxtZbu+98IZcnNxzxIfYFJ6E8ZwTMhkv040bztI7CGPsuIY9/r 5aP0ViM1qWJXOPkCpoKPf1tN6KiZ5YheFH/NFrI8H/X/0JyZG/UJpdgUpnV23RYOD+sw mYKQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=TldvMxXO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id t33-20020a634461000000b0055bf3d0c991si4404733pgk.5.2023.07.07.13.33.26; Fri, 07 Jul 2023 13:33:40 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=TldvMxXO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232809AbjGGUTb (ORCPT + 99 others); Fri, 7 Jul 2023 16:19:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45170 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232774AbjGGUTW (ORCPT ); Fri, 7 Jul 2023 16:19:22 -0400 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B66F22105 for ; Fri, 7 Jul 2023 13:19:19 -0700 (PDT) Received: by mail-pj1-x1049.google.com with SMTP id 98e67ed59e1d1-263047f46f4so3725753a91.1 for ; Fri, 07 Jul 2023 13:19:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1688761159; x=1691353159; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=A+DZafON7JsoL7Ay0CKBI9HDnuWj0nfz7I/atmY5O6E=; b=TldvMxXOlAitzanCYBdJPKTFsceaBUtZtfHE+NCMC7uRLSuEMdNFphYxZysQPlOWyt 6IIkCSmbZ0QhAj6BdA2S0ZbkwbhR7x3BX+nwYBGgaOxeyqB+HsmP8eFpNEMX/EjaKoHw Osqpj/mJ1MqPaQbsakEeDVTjMuhGWE9GnXwVc8F/i6rZzYOS2/oL2zNYYKzVVfG8aSiy zLaZC2KHe+RKfCbvOVNYPOcE1YOJUR4cbXfL9Nm1YDmuYA43h0rfivcjthaU3URwFYSK 96WU5M8uzwqIi+5+ywPkqjzgjsM0fGa0w0GzRv5r4kHftPEO8J5DvOvCOoUqVSHiN5aw 6/8g== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688761159; x=1691353159; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=A+DZafON7JsoL7Ay0CKBI9HDnuWj0nfz7I/atmY5O6E=; b=ItyuyXKON4UB+2+GHoelozx7BZmZJXPjydnVs3IQR+DMuO+tGfVICWK24nJDRKZDCe RIied27OUsPgmFI0KNyyYmPd6AWEeuC5QhffUteYpC7jIqzIJO9ifCihyXDM1d6lJqrc Igwjro/ULtmnRwoxyULx+BHE10YGdfG71DVpsmkPhMfLwMLOkPQGrKLYrZr/SOudAqax 4Tnb1QRElHxl4uku9UX+lvdBOyybYm+dqUhkYWcckGqh5KShjifrX31v8djdYBIuh26E 0VQhtEypjO+DMBg3lGbBnb4MDhn2IXlQucvUR0RhJRPdOZXaULGdjegzrZ9mtPOiJENt dO+A== X-Gm-Message-State: ABy/qLZJBYizQ5/7tunZH7JSUm2Y0KTyaStCmsazBMwE/eD+og+34WlF kOwyR3Vk/v3+Q3fczTeSCUjZGgHKFSa9kg== X-Received: from yjq3.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:272f]) (user=jiaqiyan job=sendgmr) by 2002:a17:90a:ce18:b0:263:49d3:8024 with SMTP id f24-20020a17090ace1800b0026349d38024mr4778324pju.1.1688761159291; Fri, 07 Jul 2023 13:19:19 -0700 (PDT) Date: Fri, 7 Jul 2023 20:19:03 +0000 In-Reply-To: <20230707201904.953262-1-jiaqiyan@google.com> Mime-Version: 1.0 References: <20230707201904.953262-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230707201904.953262-4-jiaqiyan@google.com> Subject: [PATCH v3 3/4] hugetlbfs: improve read HWPOISON hugepage From: Jiaqi Yan To: akpm@linux-foundation.org, mike.kravetz@oracle.com, naoya.horiguchi@nec.com Cc: songmuchun@bytedance.com, shy828301@gmail.com, linmiaohe@huawei.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, duenwen@google.com, axelrasmussen@google.com, jthoughton@google.com, Jiaqi Yan X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on 
lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1770795324007786199?=
X-GMAIL-MSGID: =?utf-8?q?1770795324007786199?=

When a hugepage contains HWPOISON pages, read() fails to read any byte
of the hugepage and returns -EIO, although many bytes in the HWPOISON
hugepage are readable.

Improve this by allowing hugetlbfs_read_iter to return as many bytes as
possible. For a requested range [offset, offset + len) that contains a
HWPOISON page, return [offset, first HWPOISON page addr); the next read
attempt will fail and return -EIO.

Reviewed-by: Mike Kravetz
Reviewed-by: Naoya Horiguchi
Signed-off-by: Jiaqi Yan
---
 fs/hugetlbfs/inode.c | 58 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 52 insertions(+), 6 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 7b17ccfa039d..c2b807d37f85 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -282,6 +282,42 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 }
 #endif
 
+/*
+ * Someone wants to read @bytes from a HWPOISON hugetlb @page from @offset.
+ * Returns the maximum number of bytes one can read without touching the 1st raw
+ * HWPOISON subpage.
+ *
+ * The implementation borrows the iteration logic from copy_page_to_iter*.
+ */
+static size_t adjust_range_hwpoison(struct page *page, size_t offset, size_t bytes)
+{
+	size_t n = 0;
+	size_t res = 0;
+	struct folio *folio = page_folio(page);
+
+	/* First subpage to start the loop. */
+	page += offset / PAGE_SIZE;
+	offset %= PAGE_SIZE;
+	while (1) {
+		if (is_raw_hwp_subpage(folio, page))
+			break;
+
+		/* Safe to read n bytes without touching HWPOISON subpage. */
+		n = min(bytes, (size_t)PAGE_SIZE - offset);
+		res += n;
+		bytes -= n;
+		if (!bytes || !n)
+			break;
+		offset += n;
+		if (offset == PAGE_SIZE) {
+			page++;
+			offset = 0;
+		}
+	}
+
+	return res;
+}
+
 /*
  * Support for read() - Find the page attached to f_mapping and copy out the
  * data. This provides functionality similar to filemap_read().
@@ -300,7 +336,7 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
 
 	while (iov_iter_count(to)) {
 		struct page *page;
-		size_t nr, copied;
+		size_t nr, copied, want;
 
 		/* nr is the maximum number of bytes to copy from this page */
 		nr = huge_page_size(h);
@@ -328,16 +364,26 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		} else {
 			unlock_page(page);
 
-			if (PageHWPoison(page)) {
-				put_page(page);
-				retval = -EIO;
-				break;
+			if (!PageHWPoison(page))
+				want = nr;
+			else {
+				/*
+				 * Adjust how many bytes safe to read without
+				 * touching the 1st raw HWPOISON subpage after
+				 * offset.
+				 */
+				want = adjust_range_hwpoison(page, offset, nr);
+				if (want == 0) {
+					put_page(page);
+					retval = -EIO;
+					break;
+				}
 			}
 
 			/*
 			 * We have the page, copy it to user space buffer.
 			 */
-			copied = copy_page_to_iter(page, offset, nr, to);
+			copied = copy_page_to_iter(page, offset, want, to);
 			put_page(page);
 		}
 		offset += copied;

From patchwork Fri Jul 7 20:19:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 117290
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:9f45:0:b0:3ea:f831:8777 with SMTP id v5csp3540839vqx; Fri, 7 Jul 2023 14:10:26 -0700 (PDT)
X-Google-Smtp-Source: APBJJlGg+6OGsevnrsnB0Ffa/I5/kGUstxq0AeWBQMyscQF3EOKb8GGVXKX3/ZFzo66oqw4PtvhK
X-Received: by 2002:a17:907:7f26:b0:966:1bf2:2af5 with SMTP id qf38-20020a1709077f2600b009661bf22af5mr9478536ejc.22.1688764226084; Fri, 07 Jul 2023 14:10:26 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1688764226; cv=none; d=google.com; s=arc-20160816; b=Ss8LPt0FqUx9gHFnVLfavo6ZkKrmIVvslDftEdBODaC1rz676kfI68jigY8aQPcZWC Wwo5idFPl4KQBnM5dAGqqY0ofF3FLN94oPMfJqxno8fYGOxB0+lSE0lK3o60px5Jbz/t oC30gGZJAnYneboue6FHC5xWYGFezU/OYwwwv7z2SRPBXoyFx4rZ+jQ7jgqnDUlBvfqy HqiY+abh5RKBdo3Kn7e4lmHPMM4yF0JDD2i7bBNZk09nI71qeytX42H4YYDsU7RJbCOY nDMV24lb/Pz2EDazs8ST2V4sxCj/qPwwSGTzrBj9GBELk3FrukjOT6+RM4gm3BGk6Oil +UBg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=dmSXGhNgjtW0Yb8+yIyK70NmvAjyR1vjV+RIYl0ULq4=; fh=dXuKmv+n+4RrXbgVwL2QlK4VwZMBf3WumGals2t2WtM=; b=OW5ZCVdgcL2M1NKNBrcQ7UduV25MLVGuaTxZT9KLcHIIyvKx0w4c8vnYEjSZqTon/u gbXjlSKTNXPhWNY+D11GPZTT1e2VvIiNRdLSi3AhmGsLmvVmH4RmKK2eWt3WGrS8xFW/ kwbG7OVWRhNTmRxrpuhuauwrlxkCeJrHL5/oK/isX8UIp+XPn87ndGLRwgVHEIRRg6lN tiqFU/ColQQAc71g5L8mGxCbvbTGPbG6oeUgLX97uKWzdekpQIbVCRQtKAVOpfCljTQD YS7zmeB+ZO9cO0peJZgVFP84VWzM2xrDOqH643KNbVHYpmRmRsSvRZiEABm1kPvo8Mmh wQ2g==
ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=cgDoU3tD; spf=pass (google.com: domain of
linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id z7-20020a170906714700b00992d0de8763si2861769ejj.910.2023.07.07.14.10.02; Fri, 07 Jul 2023 14:10:26 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=cgDoU3tD; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232611AbjGGUTe (ORCPT + 99 others); Fri, 7 Jul 2023 16:19:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45190 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231280AbjGGUTY (ORCPT ); Fri, 7 Jul 2023 16:19:24 -0400 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1596B2106 for ; Fri, 7 Jul 2023 13:19:21 -0700 (PDT) Received: by mail-pg1-x54a.google.com with SMTP id 41be03b00d2f7-55bf5cd4fb8so2535552a12.3 for ; Fri, 07 Jul 2023 13:19:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1688761160; x=1691353160; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=dmSXGhNgjtW0Yb8+yIyK70NmvAjyR1vjV+RIYl0ULq4=; b=cgDoU3tDkKP7JTQ+F0CzAiBgvzjb4qUPJbX80bmEdZNC2kBFPhSxB/2KWqZSIhGnu5 
DNnicuDpK4RRjfvI8Od0feJbKV0/XU2e63+LYPuJBKE3rThjUkLPqnYx6RjItC6itmd6 E8oybw7gnCOvuqj/lkQyUEa8CGnU/e60XfcGhlrIjKezX1hZaq+Fbq1wQe5hiMJLfSxY 04YtCgIm9GcwTklBS9EFbPw5TLQgXGtx3bb42RDjvnpfonqnq7jboBzGk/LZy1us4JKo w+8/QeI5pWZgUe4X5h5uG4tRJUsMzcCNRRNm03kEX6WEvTDa0+st3K25FuKUsfWelq3J yj+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688761160; x=1691353160; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=dmSXGhNgjtW0Yb8+yIyK70NmvAjyR1vjV+RIYl0ULq4=; b=PURmerhlN8w14FyMItiPfsI47nkcSVYi/0JsFSKOyKgsW8zeJrC8eb33671cS45mKl Kj82nD8Q7REfvXoO1t/5QpAbwzwu3dmsmjhbeaf1yTBXCo2PfNxdrtuzSCIbMNFqnJHz FtIJOEDMUWbZtT4Wqzzu9gBtglvkjnNG4maNbsvGOatdEkw3YQONGAglUi0b0jKaDZS6 5hjA0+4zjljJ1/CZFQHBQr2l6qyakwaIJlNRKkbfifrkikS5X6DPSBHy324paLxiHgna 5rpf6VzGA3NXKM+QxcY0JmLU4sBRnmcBZUi0D6Z6WEmJPYkBI4JkMZXqYyR+jPI4rapN An6w== X-Gm-Message-State: ABy/qLYgrFXLZXDZ4Q4Ei4Lu8vfLQpRNxAXWuIQo0HXC/qbvwp/ZWN4M 2dHsnscFzgkaQ90MyTi6oi88n65zNp4v3A== X-Received: from yjq3.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:272f]) (user=jiaqiyan job=sendgmr) by 2002:a63:9316:0:b0:53f:f32b:1f20 with SMTP id b22-20020a639316000000b0053ff32b1f20mr4028211pge.2.1688761160596; Fri, 07 Jul 2023 13:19:20 -0700 (PDT) Date: Fri, 7 Jul 2023 20:19:04 +0000 In-Reply-To: <20230707201904.953262-1-jiaqiyan@google.com> Mime-Version: 1.0 References: <20230707201904.953262-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230707201904.953262-5-jiaqiyan@google.com> Subject: [PATCH v3 4/4] selftests/mm: add tests for HWPOISON hugetlbfs read From: Jiaqi Yan To: akpm@linux-foundation.org, mike.kravetz@oracle.com, naoya.horiguchi@nec.com Cc: songmuchun@bytedance.com, shy828301@gmail.com, linmiaohe@huawei.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, duenwen@google.com, axelrasmussen@google.com, jthoughton@google.com, Jiaqi Yan X-Spam-Status: No, score=-9.6 
required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1770797637529475458?= X-GMAIL-MSGID: =?utf-8?q?1770797637529475458?= Add tests for the improvement made to read operation on HWPOISON hugetlb page with different read granularities. For each chunk size, three read scenarios are tested: 1. Simple regression test on read without HWPOISON. 2. Sequential read page by page should succeed until encounters the 1st raw HWPOISON subpage. 3. After skip a raw HWPOISON subpage by lseek, read()s always succeed. Acked-by: Mike Kravetz Reviewed-by: Naoya Horiguchi Signed-off-by: Jiaqi Yan --- tools/testing/selftests/mm/.gitignore | 1 + tools/testing/selftests/mm/Makefile | 1 + .../selftests/mm/hugetlb-read-hwpoison.c | 322 ++++++++++++++++++ 3 files changed, 324 insertions(+) create mode 100644 tools/testing/selftests/mm/hugetlb-read-hwpoison.c diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore index 7e2a982383c0..cdc9ce4426b9 100644 --- a/tools/testing/selftests/mm/.gitignore +++ b/tools/testing/selftests/mm/.gitignore @@ -5,6 +5,7 @@ hugepage-mremap hugepage-shm hugepage-vmemmap hugetlb-madvise +hugetlb-read-hwpoison khugepaged map_hugetlb map_populate diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile index 66d7c07dc177..b7fce9073279 100644 --- a/tools/testing/selftests/mm/Makefile +++ b/tools/testing/selftests/mm/Makefile @@ -41,6 +41,7 @@ TEST_GEN_PROGS += gup_longterm TEST_GEN_PROGS += gup_test TEST_GEN_PROGS += hmm-tests TEST_GEN_PROGS += hugetlb-madvise +TEST_GEN_PROGS += hugetlb-read-hwpoison 
 TEST_GEN_PROGS += hugepage-mmap
 TEST_GEN_PROGS += hugepage-mremap
 TEST_GEN_PROGS += hugepage-shm
diff --git a/tools/testing/selftests/mm/hugetlb-read-hwpoison.c b/tools/testing/selftests/mm/hugetlb-read-hwpoison.c
new file mode 100644
index 000000000000..ba6cc6f9cabc
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb-read-hwpoison.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <linux/magic.h>
+#include <sys/mman.h>
+#include <sys/statfs.h>
+#include <errno.h>
+#include <stdbool.h>
+
+#include "../kselftest.h"
+
+#define PREFIX " ... "
+#define ERROR_PREFIX " !!! "
+
+#define MAX_WRITE_READ_CHUNK_SIZE (getpagesize() * 16)
+#define MAX(a, b) (((a) > (b)) ? (a) : (b))
+
+enum test_status {
+	TEST_PASSED = 0,
+	TEST_FAILED = 1,
+	TEST_SKIPPED = 2,
+};
+
+static char *status_to_str(enum test_status status)
+{
+	switch (status) {
+	case TEST_PASSED:
+		return "TEST_PASSED";
+	case TEST_FAILED:
+		return "TEST_FAILED";
+	case TEST_SKIPPED:
+		return "TEST_SKIPPED";
+	default:
+		return "TEST_???";
+	}
+}
+
+static int setup_filemap(char *filemap, size_t len, size_t wr_chunk_size)
+{
+	char iter = 0;
+
+	for (size_t offset = 0; offset < len;
+	     offset += wr_chunk_size) {
+		iter++;
+		memset(filemap + offset, iter, wr_chunk_size);
+	}
+
+	return 0;
+}
+
+static bool verify_chunk(char *buf, size_t len, char val)
+{
+	size_t i;
+
+	for (i = 0; i < len; ++i) {
+		if (buf[i] != val) {
+			printf(PREFIX ERROR_PREFIX "check fail: buf[%lu] = %u != %u\n",
+			       i, buf[i], val);
+			return false;
+		}
+	}
+
+	return true;
+}
+
+static bool seek_read_hugepage_filemap(int fd, size_t len, size_t wr_chunk_size,
+				       off_t offset, size_t expected)
+{
+	char buf[MAX_WRITE_READ_CHUNK_SIZE];
+	ssize_t ret_count = 0;
+	ssize_t total_ret_count = 0;
+	char val = offset / wr_chunk_size + offset % wr_chunk_size;
+
+	printf(PREFIX PREFIX "init val=%u with offset=0x%lx\n", val, offset);
+	printf(PREFIX PREFIX "expect to read 0x%lx bytes of data in total\n",
+	       expected);
+	if (lseek(fd, offset, SEEK_SET) < 0) {
+		perror(PREFIX ERROR_PREFIX "seek failed");
+		return false;
+	}
+
+	while (offset + total_ret_count < len) {
+		ret_count = read(fd, buf, wr_chunk_size);
+		if (ret_count == 0) {
+			printf(PREFIX PREFIX "read reached end of the file\n");
+			break;
+		} else if (ret_count < 0) {
+			perror(PREFIX ERROR_PREFIX "read failed");
+			break;
+		}
+		++val;
+		if (!verify_chunk(buf, ret_count, val))
+			return false;
+
+		total_ret_count += ret_count;
+	}
+	printf(PREFIX PREFIX "actually read 0x%lx bytes of data in total\n",
+	       total_ret_count);
+
+	return total_ret_count == expected;
+}
+
+static bool read_hugepage_filemap(int fd, size_t len,
+				  size_t wr_chunk_size, size_t expected)
+{
+	char buf[MAX_WRITE_READ_CHUNK_SIZE];
+	ssize_t ret_count = 0;
+	ssize_t total_ret_count = 0;
+	char val = 0;
+
+	printf(PREFIX PREFIX "expect to read 0x%lx bytes of data in total\n",
+	       expected);
+	while (total_ret_count < len) {
+		ret_count = read(fd, buf, wr_chunk_size);
+		if (ret_count == 0) {
+			printf(PREFIX PREFIX "read reached end of the file\n");
+			break;
+		} else if (ret_count < 0) {
+			perror(PREFIX ERROR_PREFIX "read failed");
+			break;
+		}
+		++val;
+		if (!verify_chunk(buf, ret_count, val))
+			return false;
+
+		total_ret_count += ret_count;
+	}
+	printf(PREFIX PREFIX "actually read 0x%lx bytes of data in total\n",
+	       total_ret_count);
+
+	return total_ret_count == expected;
+}
+
+static enum test_status
+test_hugetlb_read(int fd, size_t len, size_t wr_chunk_size)
+{
+	enum test_status status = TEST_SKIPPED;
+	char *filemap = NULL;
+
+	if (ftruncate(fd, len) < 0) {
+		perror(PREFIX ERROR_PREFIX "ftruncate failed");
+		return status;
+	}
+
+	filemap = mmap(NULL, len, PROT_READ | PROT_WRITE,
+		       MAP_SHARED | MAP_POPULATE, fd, 0);
+	if (filemap == MAP_FAILED) {
+		perror(PREFIX ERROR_PREFIX "mmap for primary mapping failed");
+		goto done;
+	}
+
+	setup_filemap(filemap, len, wr_chunk_size);
+	status = TEST_FAILED;
+
+	if (read_hugepage_filemap(fd, len, wr_chunk_size, len))
+		status = TEST_PASSED;
+
+	munmap(filemap, len);
+done:
+	if (ftruncate(fd, 0) < 0) {
+		perror(PREFIX ERROR_PREFIX "ftruncate back to 0 failed");
+		status = TEST_FAILED;
+	}
+
+	return status;
+}
+
+static enum test_status
+test_hugetlb_read_hwpoison(int fd, size_t len, size_t wr_chunk_size,
+			   bool skip_hwpoison_page)
+{
+	enum test_status status = TEST_SKIPPED;
+	char *filemap = NULL;
+	char *hwp_addr = NULL;
+	const unsigned long pagesize = getpagesize();
+
+	if (ftruncate(fd, len) < 0) {
+		perror(PREFIX ERROR_PREFIX "ftruncate failed");
+		return status;
+	}
+
+	filemap = mmap(NULL, len, PROT_READ | PROT_WRITE,
+		       MAP_SHARED | MAP_POPULATE, fd, 0);
+	if (filemap == MAP_FAILED) {
+		perror(PREFIX ERROR_PREFIX "mmap for primary mapping failed");
+		goto done;
+	}
+
+	setup_filemap(filemap, len, wr_chunk_size);
+	status = TEST_FAILED;
+
+	/*
+	 * Poisoned hugetlb page layout (assume hugepagesize=2MB):
+	 * |<---------------------- 1MB ---------------------->|
+	 * |<---- healthy page ---->|<---- HWPOISON page ----->|
+	 * |<------------------- (1MB - 8KB) ----------------->|
+	 */
+	hwp_addr = filemap + len / 2 + pagesize;
+	if (madvise(hwp_addr, pagesize, MADV_HWPOISON) < 0) {
+		perror(PREFIX ERROR_PREFIX "MADV_HWPOISON failed");
+		goto unmap;
+	}
+
+	if (!skip_hwpoison_page) {
+		/*
+		 * Userspace should be able to read (1MB + 1 page) from
+		 * the beginning of the HWPOISONed hugepage.
+		 */
+		if (read_hugepage_filemap(fd, len, wr_chunk_size,
+					  len / 2 + pagesize))
+			status = TEST_PASSED;
+	} else {
+		/*
+		 * After seeking past the HWPOISON subpage, userspace should
+		 * read (1MB - MAX(2 pages, wr_chunk_size)) from the hugepage.
+		 */
+		if (seek_read_hugepage_filemap(fd, len, wr_chunk_size,
+					       len / 2 + MAX(2 * pagesize, wr_chunk_size),
+					       len / 2 - MAX(2 * pagesize, wr_chunk_size)))
+			status = TEST_PASSED;
+	}
+
+unmap:
+	munmap(filemap, len);
+done:
+	if (ftruncate(fd, 0) < 0) {
+		perror(PREFIX ERROR_PREFIX "ftruncate back to 0 failed");
+		status = TEST_FAILED;
+	}
+
+	return status;
+}
+
+static int create_hugetlbfs_file(struct statfs *file_stat)
+{
+	int fd;
+
+	fd = memfd_create("hugetlb_tmp", MFD_HUGETLB);
+	if (fd < 0) {
+		perror(PREFIX ERROR_PREFIX "could not open hugetlbfs file");
+		return -1;
+	}
+
+	memset(file_stat, 0, sizeof(*file_stat));
+	if (fstatfs(fd, file_stat)) {
+		perror(PREFIX ERROR_PREFIX "fstatfs failed");
+		goto close;
+	}
+	if (file_stat->f_type != HUGETLBFS_MAGIC) {
+		printf(PREFIX ERROR_PREFIX "not hugetlbfs file\n");
+		goto close;
+	}
+
+	return fd;
+close:
+	close(fd);
+	return -1;
+}
+
+int main(void)
+{
+	int fd;
+	struct statfs file_stat;
+	enum test_status status;
+	/* Test read() at different granularities. */
+	size_t wr_chunk_sizes[] = {
+		getpagesize() / 2, getpagesize(),
+		getpagesize() * 2, getpagesize() * 4
+	};
+	size_t i;
+
+	for (i = 0; i < ARRAY_SIZE(wr_chunk_sizes); ++i) {
+		printf("Write/read chunk size=0x%lx\n",
+		       wr_chunk_sizes[i]);
+
+		fd = create_hugetlbfs_file(&file_stat);
+		if (fd < 0)
+			goto create_failure;
+		printf(PREFIX "HugeTLB read regression test...\n");
+		status = test_hugetlb_read(fd, file_stat.f_bsize,
+					   wr_chunk_sizes[i]);
+		printf(PREFIX "HugeTLB read regression test...%s\n",
+		       status_to_str(status));
+		close(fd);
+		if (status == TEST_FAILED)
+			return -1;
+
+		fd = create_hugetlbfs_file(&file_stat);
+		if (fd < 0)
+			goto create_failure;
+		printf(PREFIX "HugeTLB read HWPOISON test...\n");
+		status = test_hugetlb_read_hwpoison(fd, file_stat.f_bsize,
+						    wr_chunk_sizes[i], false);
+		printf(PREFIX "HugeTLB read HWPOISON test...%s\n",
+		       status_to_str(status));
+		close(fd);
+		if (status == TEST_FAILED)
+			return -1;
+
+		fd = create_hugetlbfs_file(&file_stat);
+		if (fd < 0)
+			goto create_failure;
+		printf(PREFIX "HugeTLB seek then read HWPOISON test...\n");
+		status = test_hugetlb_read_hwpoison(fd, file_stat.f_bsize,
+						    wr_chunk_sizes[i], true);
+		printf(PREFIX "HugeTLB seek then read HWPOISON test...%s\n",
+		       status_to_str(status));
+		close(fd);
+		if (status == TEST_FAILED)
+			return -1;
+	}
+
+	return 0;
+
+create_failure:
+	printf(ERROR_PREFIX "Abort test: failed to create hugetlbfs file\n");
+	return -1;
+}