From patchwork Thu Jul 13 00:18:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 119446
Date: Thu, 13 Jul 2023 00:18:30 +0000
In-Reply-To: <20230713001833.3778937-1-jiaqiyan@google.com>
References: <20230713001833.3778937-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog
Message-ID: <20230713001833.3778937-2-jiaqiyan@google.com>
Subject: [PATCH v4 1/4] mm/hwpoison: delete all entries before traversal in __folio_free_raw_hwp
From: Jiaqi Yan
To: linmiaohe@huawei.com, mike.kravetz@oracle.com, naoya.horiguchi@nec.com
Cc: akpm@linux-foundation.org, songmuchun@bytedance.com, shy828301@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, jthoughton@google.com,
 Jiaqi Yan
X-Mailing-List: linux-kernel@vger.kernel.org

Traversal on llist (e.g. llist_for_each_safe) is only safe AFTER entries
are deleted from the llist. Correct the way __folio_free_raw_hwp deletes
and frees raw_hwp_page entries in raw_hwp_list: first llist_del_all, then
kfree within llist_for_each_safe.

As of today, concurrent adding, deleting, and traversal on raw_hwp_list
from hugetlb.c and/or memory-failure.c are fine with each other. Note
this is guaranteed partly by the lock-free nature of llist, and partly
by holding hugetlb_lock and/or mf_mutex. For example, as llist_del_all
is lock-free with itself, folio_clear_hugetlb_hwpoison()s from
__update_and_free_hugetlb_folio and memory_failure won't need explicit
locking when freeing the raw_hwp_list.

New code that manipulates raw_hwp_list must be careful to ensure the
concurrency correctness.
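
The ordering described above can be illustrated with a minimal userspace
sketch. It does not use the kernel's llist API: struct node, struct
list_head, list_add, list_del_all, free_all and free_all_demo are
hypothetical stand-ins built on C11 atomics, written only to show why
detaching the whole chain first makes the free-while-walking loop safe.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for llist_node / llist_head (not kernel code). */
struct node {
	struct node *next;
};

struct list_head {
	_Atomic(struct node *) first;
};

/* Push one node at the head with a compare-and-swap loop. */
static void list_add(struct list_head *head, struct node *n)
{
	struct node *old = atomic_load(&head->first);

	do {
		n->next = old;
	} while (!atomic_compare_exchange_weak(&head->first, &old, n));
}

/* Detach the whole chain in one atomic exchange; the head becomes empty. */
static struct node *list_del_all(struct list_head *head)
{
	return atomic_exchange(&head->first, (struct node *)NULL);
}

/*
 * Free every entry: detach first, then walk the chain, which is now
 * private to this caller, so no concurrent adder can touch it.
 */
static unsigned long free_all(struct list_head *head)
{
	struct node *n = list_del_all(head);
	unsigned long count = 0;

	while (n) {
		struct node *next = n->next; /* read next before freeing n */

		free(n);
		n = next;
		count++;
	}
	return count;
}

/* Build a three-entry list, free it, and return the number freed. */
unsigned long free_all_demo(void)
{
	struct list_head head;
	unsigned long count;
	int i;

	atomic_init(&head.first, (struct node *)NULL);
	for (i = 0; i < 3; i++) {
		struct node *n = malloc(sizeof(*n));

		assert(n != NULL);
		list_add(&head, n);
	}
	count = free_all(&head);
	assert(atomic_load(&head.first) == NULL);
	return count;
}
```

Calling free_all_demo() builds three entries and frees them through the
detached chain. Walking head->first directly while freeing, as the old
code did, would leave the shared head pointing at freed memory until the
final delete.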

Acked-by: Mike Kravetz
Acked-by: Naoya Horiguchi
Signed-off-by: Jiaqi Yan
---
 mm/memory-failure.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index e245191e6b04..a08677dcf953 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1829,12 +1829,11 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
 
 static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 {
-	struct llist_head *head;
-	struct llist_node *t, *tnode;
+	struct llist_node *t, *tnode, *head;
 	unsigned long count = 0;
 
-	head = raw_hwp_list_head(folio);
-	llist_for_each_safe(tnode, t, head->first) {
+	head = llist_del_all(raw_hwp_list_head(folio));
+	llist_for_each_safe(tnode, t, head) {
 		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
 
 		if (move_flag)
@@ -1844,7 +1843,6 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 		kfree(p);
 		count++;
 	}
-	llist_del_all(head);
 	return count;
 }

From patchwork Thu Jul 13 00:18:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 119452
Date: Thu, 13 Jul 2023 00:18:31 +0000
In-Reply-To: <20230713001833.3778937-1-jiaqiyan@google.com>
References: <20230713001833.3778937-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog
Message-ID: <20230713001833.3778937-3-jiaqiyan@google.com>
Subject: [PATCH v4 2/4] mm/hwpoison: check if a raw page in a hugetlb folio is raw HWPOISON
From: Jiaqi Yan
To: linmiaohe@huawei.com, mike.kravetz@oracle.com, naoya.horiguchi@nec.com
Cc: akpm@linux-foundation.org, songmuchun@bytedance.com, shy828301@gmail.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, jthoughton@google.com,
 Jiaqi Yan
X-Mailing-List: linux-kernel@vger.kernel.org

Add the functionality, is_raw_hwpoison_page_in_hugepage, to tell if a
raw page in a hugetlb folio is HWPOISON. This functionality relies on
RawHwpUnreliable not being set; otherwise the hugepage's raw HWPOISON
list becomes meaningless.

is_raw_hwpoison_page_in_hugepage holds mf_mutex in order to synchronize
with folio_set_hugetlb_hwpoison and folio_free_raw_hwp, which iterate,
insert, or delete entries in raw_hwp_list. llist itself doesn't ensure
insertion and removal are synchronized with the llist_for_each_entry
used by is_raw_hwpoison_page_in_hugepage (unless iterated entries are
already deleted from the list). Callers can minimize the overhead of
lock cycles by first checking the HWPOISON flag of the folio.

Export this functionality so it can be used immediately in the read
operation for hugetlbfs.

Reviewed-by: Mike Kravetz
Reviewed-by: Naoya Horiguchi
Reviewed-by: Miaohe Lin
Signed-off-by: Jiaqi Yan
---
 include/linux/hugetlb.h |  5 +++++
 mm/memory-failure.c     | 40 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ca3c8e10f24a..0a96cfacb746 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -1007,6 +1007,11 @@ void hugetlb_register_node(struct node *node);
 void hugetlb_unregister_node(struct node *node);
 #endif
 
+/*
+ * Check if a given raw @page in a hugepage is HWPOISON.
+ */
+bool is_raw_hwpoison_page_in_hugepage(struct page *page);
+
 #else /* CONFIG_HUGETLB_PAGE */
 struct hstate {};
 
diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index a08677dcf953..d610d8f03f69 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -75,6 +75,8 @@ atomic_long_t num_poisoned_pages __read_mostly = ATOMIC_LONG_INIT(0);
 
 static bool hw_memory_failure __read_mostly = false;
 
+static DEFINE_MUTEX(mf_mutex);
+
 inline void num_poisoned_pages_inc(unsigned long pfn)
 {
 	atomic_long_inc(&num_poisoned_pages);
@@ -1813,6 +1815,7 @@ EXPORT_SYMBOL_GPL(mf_dax_kill_procs);
 #endif /* CONFIG_FS_DAX */
 
 #ifdef CONFIG_HUGETLB_PAGE
+
 /*
  * Struct raw_hwp_page represents information about "raw error page",
  * constructing singly linked list from ->_hugetlb_hwpoison field of folio.
@@ -1827,6 +1830,41 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
 	return (struct llist_head *)&folio->_hugetlb_hwpoison;
 }
 
+bool is_raw_hwpoison_page_in_hugepage(struct page *page)
+{
+	struct llist_head *raw_hwp_head;
+	struct raw_hwp_page *p;
+	struct folio *folio = page_folio(page);
+	bool ret = false;
+
+	if (!folio_test_hwpoison(folio))
+		return false;
+
+	if (!folio_test_hugetlb(folio))
+		return PageHWPoison(page);
+
+	/*
+	 * When RawHwpUnreliable is set, kernel lost track of which subpages
+	 * are HWPOISON. So return as if ALL subpages are HWPOISONed.
+	 */
+	if (folio_test_hugetlb_raw_hwp_unreliable(folio))
+		return true;
+
+	mutex_lock(&mf_mutex);
+
+	raw_hwp_head = raw_hwp_list_head(folio);
+	llist_for_each_entry(p, raw_hwp_head->first, node) {
+		if (page == p->page) {
+			ret = true;
+			break;
+		}
+	}
+
+	mutex_unlock(&mf_mutex);
+
+	return ret;
+}
+
 static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 {
 	struct llist_node *t, *tnode, *head;
@@ -2106,8 +2144,6 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags,
 	return rc;
 }
 
-static DEFINE_MUTEX(mf_mutex);
-
 /**
  * memory_failure - Handle memory failure of a page.
  * @pfn: Page Number of the corrupted page

From patchwork Thu Jul 13 00:18:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 119450
Date: Thu, 13 Jul 2023 00:18:32 +0000
In-Reply-To: <20230713001833.3778937-1-jiaqiyan@google.com>
References: <20230713001833.3778937-1-jiaqiyan@google.com>
X-Mailer:
git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230713001833.3778937-4-jiaqiyan@google.com> Subject: [PATCH v4 3/4] hugetlbfs: improve read HWPOISON hugepage From: Jiaqi Yan To: linmiaohe@huawei.com, mike.kravetz@oracle.com, naoya.horiguchi@nec.com Cc: akpm@linux-foundation.org, songmuchun@bytedance.com, shy828301@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, jthoughton@google.com, Jiaqi Yan X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE, USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1771264711091561746 X-GMAIL-MSGID: 1771264711091561746 When a hugepage contains HWPOISON pages, read() fails to read any byte of the hugepage and returns -EIO, although many bytes in the HWPOISON hugepage are readable. Improve this by allowing hugetlbfs_read_iter returns as many bytes as possible. For a requested range [offset, offset + len) that contains HWPOISON page, return [offset, first HWPOISON page addr); the next read attempt will fail and return -EIO. Reviewed-by: Mike Kravetz Reviewed-by: Naoya Horiguchi Signed-off-by: Jiaqi Yan --- fs/hugetlbfs/inode.c | 57 +++++++++++++++++++++++++++++++++++++++----- 1 file changed, 51 insertions(+), 6 deletions(-) diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index 7b17ccfa039d..e7611ae1e612 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -282,6 +282,41 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr, } #endif +/* + * Someone wants to read @bytes from a HWPOISON hugetlb @page from @offset. + * Returns the maximum number of bytes one can read without touching the 1st raw + * HWPOISON subpage. 
+ * + * The implementation borrows the iteration logic from copy_page_to_iter*. + */ +static size_t adjust_range_hwpoison(struct page *page, size_t offset, size_t bytes) +{ + size_t n = 0; + size_t res = 0; + + /* First subpage to start the loop. */ + page += offset / PAGE_SIZE; + offset %= PAGE_SIZE; + while (1) { + if (is_raw_hwpoison_page_in_hugepage(page)) + break; + + /* Safe to read n bytes without touching HWPOISON subpage. */ + n = min(bytes, (size_t)PAGE_SIZE - offset); + res += n; + bytes -= n; + if (!bytes || !n) + break; + offset += n; + if (offset == PAGE_SIZE) { + page++; + offset = 0; + } + } + + return res; +} + /* * Support for read() - Find the page attached to f_mapping and copy out the * data. This provides functionality similar to filemap_read(). @@ -300,7 +335,7 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to) while (iov_iter_count(to)) { struct page *page; - size_t nr, copied; + size_t nr, copied, want; /* nr is the maximum number of bytes to copy from this page */ nr = huge_page_size(h); @@ -328,16 +363,26 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to) } else { unlock_page(page); - if (PageHWPoison(page)) { - put_page(page); - retval = -EIO; - break; + if (!PageHWPoison(page)) + want = nr; + else { + /* + * Adjust how many bytes safe to read without + * touching the 1st raw HWPOISON subpage after + * offset. + */ + want = adjust_range_hwpoison(page, offset, nr); + if (want == 0) { + put_page(page); + retval = -EIO; + break; + } } /* * We have the page, copy it to user space buffer. 
*/ - copied = copy_page_to_iter(page, offset, nr, to); + copied = copy_page_to_iter(page, offset, want, to); put_page(page); } offset += copied; From patchwork Thu Jul 13 00:18:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jiaqi Yan X-Patchwork-Id: 119449 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:a6b2:0:b0:3e4:2afc:c1 with SMTP id c18csp1511680vqm; Wed, 12 Jul 2023 17:54:11 -0700 (PDT) X-Google-Smtp-Source: APBJJlFhKpNcwnjsUNS+bqkwvfI+/Y8BxseNPCYQLI60upVbpgCBbFyq5hmtn9WAk60z3h+sczVs X-Received: by 2002:a05:6402:12c9:b0:51d:d615:19af with SMTP id k9-20020a05640212c900b0051dd61519afmr282694edx.28.1689209651071; Wed, 12 Jul 2023 17:54:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1689209651; cv=none; d=google.com; s=arc-20160816; b=Mfbi5XeMc76T3scj/kDf3Z52CWbHy+VCHzYUyEdwug3cYqaDf/Jg9pPbM7aXFVoR8S rMfen8S8xbAyklfRsOVLPtDMKIvknSSVvFlVP8YNKL5EUWsyCwZYABtn7qXGehWuXsgW RwGLZmy2ZOytVkcqN1uWaBCrC0Sjj03oKjypcP8+uWMULPCq41FEQjX0SvpLJ3Gr08Im ckZPSy/KNKBowdwWRSCSv4FIyz9ox3cmbjg+hmfv4xLKGe/8GE9exeB/UotJ/AsCW67c g40CmuY8COUjvPPN42BZdiWbU31cQGSilCNn8PamtU6442k5rbod+qohCeCfAaAegqNO OKlg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=dmSXGhNgjtW0Yb8+yIyK70NmvAjyR1vjV+RIYl0ULq4=; fh=so8aI3hPnDSYdD6rozMc9dHG7pAv+qgyPg3MIEBo9Hw=; b=rZkWha/27ks5HiF7edhQcmuFZkZ8XA0/rnHy8L9vnfZepynQinf4mt+QqhQBVfmKhU 9NvnR30YpUjuoFGe7rGOwz1H0lv/rVU/pMJUSNprX96EbQ/42M3ZG4Rf16kMlSkO0P9o 0tUK/V55764JNRW8oaUC78B9oXirhX/+dZ/8YWqoXIrh9St2pIIFc+SJdH3TpCMkOABB HiNCcoD3YhooeWaiRKGysTiwZZgxkPuE83wjsXAMtCkxfB2NAA+YkrSI/tmDn5gwS7re 898xcUmTeh/MuwvwBVtaW+FTbcdV45Jz+RMyKaGHSX7APdWBo9ETl70/7qUY2nATnhYZ iVVg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=FlZRL9jU; spf=pass (google.com: domain of 
linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id f20-20020a056402069400b0051d92aee623si5972947edy.54.2023.07.12.17.53.45; Wed, 12 Jul 2023 17:54:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=FlZRL9jU; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233096AbjGMATC (ORCPT + 99 others); Wed, 12 Jul 2023 20:19:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36542 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232746AbjGMASp (ORCPT ); Wed, 12 Jul 2023 20:18:45 -0400 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D14A41995 for ; Wed, 12 Jul 2023 17:18:43 -0700 (PDT) Received: by mail-pj1-x1049.google.com with SMTP id 98e67ed59e1d1-26314c2bdb8so18056a91.1 for ; Wed, 12 Jul 2023 17:18:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1689207523; x=1691799523; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=dmSXGhNgjtW0Yb8+yIyK70NmvAjyR1vjV+RIYl0ULq4=; b=FlZRL9jUxJFKgY4MqDtKMkc0rMg/7jFvumz8fttZ1mweOBBWCGzvC8lc/FCQJuLkTZ 
A3qVeyiodQbUroyDwmQMCZw9jh+ovjD2C6URvGmE5mQg2qAG3rIFa3u5O3utY9wqWDdh y2yPK+CMKrOvd+Vf4PsWNx1mpzo4cScAJX/G/GmXtN3BrOtfw2lROeQROcur5BX+u/nI HMmRA4ujFCbKxeSV9RFjI91KjjN5dVl+5reqgQrM5L+aNYcjlbm1Kknrzn4l69z7WPtO K7sCo37UfV1f9414ZzedguXvQ6M3kjBSF+jGOr6vSzlnQwTqTeW0uoCbjObPBWxUQGY6 wV8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689207523; x=1691799523; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=dmSXGhNgjtW0Yb8+yIyK70NmvAjyR1vjV+RIYl0ULq4=; b=LjloJDnhf/K40R7EsH1/H92xVYkQYi8IMOpQ/m3fVn6FvA2VOLgS3OD24UOYDG49Cw bdiVWt61f9kmCvOUUdT6Oho6ul+nlQzL2H5yhixy8FrTncINC+FnGt2TFychvuqETn4L P/SSN5IugNfvWDVtZt29Bv/+nLhyB5dST7NhtikIb6s/ZK5vM/SMVYsmO/FH9te+g3lN K27mt3JUZfbsX7lgEyuUbdn0Le72vEPTuZZBU7gBuouEZT75Zf9jpqRphtmRA+JCcO+v aKyOgmhmaGqgsTfBnrQxeDZ7Td/35hyMMdKJe2BWzIryEkWHoX2wN08xCSt+NtYJjyOX YHAQ== X-Gm-Message-State: ABy/qLYmYXYajMSzscmfpAZBqk0nurixKXi4tB6E51NVWG8xWJW+1TS4 dxVTTwsC4Rz8GVk5kbszBUoZTPNwamRJBw== X-Received: from yjq3.c.googlers.com ([fda3:e722:ac3:cc00:24:72f4:c0a8:272f]) (user=jiaqiyan job=sendgmr) by 2002:a17:902:d50f:b0:1b8:a555:385d with SMTP id b15-20020a170902d50f00b001b8a555385dmr624plg.9.1689207523281; Wed, 12 Jul 2023 17:18:43 -0700 (PDT) Date: Thu, 13 Jul 2023 00:18:33 +0000 In-Reply-To: <20230713001833.3778937-1-jiaqiyan@google.com> Mime-Version: 1.0 References: <20230713001833.3778937-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230713001833.3778937-5-jiaqiyan@google.com> Subject: [PATCH v4 4/4] selftests/mm: add tests for HWPOISON hugetlbfs read From: Jiaqi Yan To: linmiaohe@huawei.com, mike.kravetz@oracle.com, naoya.horiguchi@nec.com Cc: akpm@linux-foundation.org, songmuchun@bytedance.com, shy828301@gmail.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org, jthoughton@google.com, Jiaqi Yan X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, 
	DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,
	RCVD_IN_DNSWL_BLOCKED,SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,
	USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1771264699168322195
X-GMAIL-MSGID: 1771264699168322195

Add tests for the improvement made to the read operation on HWPOISON
hugetlb pages with different read granularities. For each chunk size,
three read scenarios are tested:
1. Simple regression test on read without HWPOISON.
2. Sequential read page by page should succeed until it encounters the
   1st raw HWPOISON subpage.
3. After skipping a raw HWPOISON subpage by lseek, read()s always succeed.

Acked-by: Mike Kravetz
Reviewed-by: Naoya Horiguchi
Signed-off-by: Jiaqi Yan
Tested-by: Muhammad Usama Anjum
---
 tools/testing/selftests/mm/.gitignore         |   1 +
 tools/testing/selftests/mm/Makefile           |   1 +
 .../selftests/mm/hugetlb-read-hwpoison.c      | 322 ++++++++++++++++++
 3 files changed, 324 insertions(+)
 create mode 100644 tools/testing/selftests/mm/hugetlb-read-hwpoison.c

diff --git a/tools/testing/selftests/mm/.gitignore b/tools/testing/selftests/mm/.gitignore
index 7e2a982383c0..cdc9ce4426b9 100644
--- a/tools/testing/selftests/mm/.gitignore
+++ b/tools/testing/selftests/mm/.gitignore
@@ -5,6 +5,7 @@ hugepage-mremap
 hugepage-shm
 hugepage-vmemmap
 hugetlb-madvise
+hugetlb-read-hwpoison
 khugepaged
 map_hugetlb
 map_populate
diff --git a/tools/testing/selftests/mm/Makefile b/tools/testing/selftests/mm/Makefile
index 66d7c07dc177..b7fce9073279 100644
--- a/tools/testing/selftests/mm/Makefile
+++ b/tools/testing/selftests/mm/Makefile
@@ -41,6 +41,7 @@ TEST_GEN_PROGS += gup_longterm
 TEST_GEN_PROGS += gup_test
 TEST_GEN_PROGS += hmm-tests
 TEST_GEN_PROGS += hugetlb-madvise
+TEST_GEN_PROGS += hugetlb-read-hwpoison
 TEST_GEN_PROGS += hugepage-mmap
 TEST_GEN_PROGS += hugepage-mremap
 TEST_GEN_PROGS += hugepage-shm
diff --git a/tools/testing/selftests/mm/hugetlb-read-hwpoison.c b/tools/testing/selftests/mm/hugetlb-read-hwpoison.c
new file mode 100644
index 000000000000..ba6cc6f9cabc
--- /dev/null
+++ b/tools/testing/selftests/mm/hugetlb-read-hwpoison.c
@@ -0,0 +1,322 @@
+// SPDX-License-Identifier: GPL-2.0
+
+#define _GNU_SOURCE
+#include <stdlib.h>
+#include <stdio.h>
+#include <string.h>
+
+#include <linux/magic.h>
+#include <sys/mman.h>
+#include <sys/statfs.h>
+#include <errno.h>
+#include <stdbool.h>
+
+#include "../kselftest.h"
+
+#define PREFIX " ... "
+#define ERROR_PREFIX " !!! "
+
+#define MAX_WRITE_READ_CHUNK_SIZE (getpagesize() * 16)
+#define MAX(a, b) (((a) > (b)) ? (a) : (b))
+
+enum test_status {
+	TEST_PASSED = 0,
+	TEST_FAILED = 1,
+	TEST_SKIPPED = 2,
+};
+
+static char *status_to_str(enum test_status status)
+{
+	switch (status) {
+	case TEST_PASSED:
+		return "TEST_PASSED";
+	case TEST_FAILED:
+		return "TEST_FAILED";
+	case TEST_SKIPPED:
+		return "TEST_SKIPPED";
+	default:
+		return "TEST_???";
+	}
+}
+
+static int setup_filemap(char *filemap, size_t len, size_t wr_chunk_size)
+{
+	char iter = 0;
+
+	for (size_t offset = 0; offset < len;
+	     offset += wr_chunk_size) {
+		iter++;
+		memset(filemap + offset, iter, wr_chunk_size);
+	}
+
+	return 0;
+}
+
+static bool verify_chunk(char *buf, size_t len, char val)
+{
+	size_t i;
+
+	for (i = 0; i < len; ++i) {
+		if (buf[i] != val) {
+			printf(PREFIX ERROR_PREFIX "check fail: buf[%lu] = %u != %u\n",
+				i, buf[i], val);
+			return false;
+		}
+	}
+
+	return true;
+}
+
+static bool seek_read_hugepage_filemap(int fd, size_t len, size_t wr_chunk_size,
+				       off_t offset, size_t expected)
+{
+	char buf[MAX_WRITE_READ_CHUNK_SIZE];
+	ssize_t ret_count = 0;
+	ssize_t total_ret_count = 0;
+	char val = offset / wr_chunk_size + offset % wr_chunk_size;
+
+	printf(PREFIX PREFIX "init val=%u with offset=0x%lx\n", val, offset);
+	printf(PREFIX PREFIX "expect to read 0x%lx bytes of data in total\n",
+	       expected);
+	if (lseek(fd, offset, SEEK_SET) < 0) {
+		perror(PREFIX ERROR_PREFIX "seek failed");
+		return false;
+	}
+
+	while (offset + total_ret_count < len) {
+		ret_count = read(fd, buf, wr_chunk_size);
+		if (ret_count == 0) {
+			printf(PREFIX PREFIX "read reach end of the file\n");
+			break;
+		} else if (ret_count < 0) {
+			perror(PREFIX ERROR_PREFIX "read failed");
+			break;
+		}
+		++val;
+		if (!verify_chunk(buf, ret_count, val))
+			return false;
+
+		total_ret_count += ret_count;
+	}
+	printf(PREFIX PREFIX "actually read 0x%lx bytes of data in total\n",
+	       total_ret_count);
+
+	return total_ret_count == expected;
+}
+
+static bool read_hugepage_filemap(int fd, size_t len,
+				  size_t wr_chunk_size, size_t expected)
+{
+	char buf[MAX_WRITE_READ_CHUNK_SIZE];
+	ssize_t ret_count = 0;
+	ssize_t total_ret_count = 0;
+	char val = 0;
+
+	printf(PREFIX PREFIX "expect to read 0x%lx bytes of data in total\n",
+	       expected);
+	while (total_ret_count < len) {
+		ret_count = read(fd, buf, wr_chunk_size);
+		if (ret_count == 0) {
+			printf(PREFIX PREFIX "read reach end of the file\n");
+			break;
+		} else if (ret_count < 0) {
+			perror(PREFIX ERROR_PREFIX "read failed");
+			break;
+		}
+		++val;
+		if (!verify_chunk(buf, ret_count, val))
+			return false;
+
+		total_ret_count += ret_count;
+	}
+	printf(PREFIX PREFIX "actually read 0x%lx bytes of data in total\n",
+	       total_ret_count);
+
+	return total_ret_count == expected;
+}
+
+static enum test_status
+test_hugetlb_read(int fd, size_t len, size_t wr_chunk_size)
+{
+	enum test_status status = TEST_SKIPPED;
+	char *filemap = NULL;
+
+	if (ftruncate(fd, len) < 0) {
+		perror(PREFIX ERROR_PREFIX "ftruncate failed");
+		return status;
+	}
+
+	filemap = mmap(NULL, len, PROT_READ | PROT_WRITE,
+		       MAP_SHARED | MAP_POPULATE, fd, 0);
+	if (filemap == MAP_FAILED) {
+		perror(PREFIX ERROR_PREFIX "mmap for primary mapping failed");
+		goto done;
+	}
+
+	setup_filemap(filemap, len, wr_chunk_size);
+	status = TEST_FAILED;
+
+	if (read_hugepage_filemap(fd, len, wr_chunk_size, len))
+		status = TEST_PASSED;
+
+	munmap(filemap, len);
+done:
+	if (ftruncate(fd, 0) < 0) {
+		perror(PREFIX ERROR_PREFIX "ftruncate back to 0 failed");
+		status = TEST_FAILED;
+	}
+
+	return status;
+}
+
+static enum test_status
+test_hugetlb_read_hwpoison(int fd, size_t len, size_t wr_chunk_size,
+			   bool skip_hwpoison_page)
+{
+	enum test_status status = TEST_SKIPPED;
+	char *filemap = NULL;
+	char *hwp_addr = NULL;
+	const unsigned long pagesize = getpagesize();
+
+	if (ftruncate(fd, len) < 0) {
+		perror(PREFIX ERROR_PREFIX "ftruncate failed");
+		return status;
+	}
+
+	filemap = mmap(NULL, len, PROT_READ | PROT_WRITE,
+		       MAP_SHARED | MAP_POPULATE, fd, 0);
+	if (filemap == MAP_FAILED) {
+		perror(PREFIX ERROR_PREFIX "mmap for primary mapping failed");
+		goto done;
+	}
+
+	setup_filemap(filemap, len, wr_chunk_size);
+	status = TEST_FAILED;
+
+	/*
+	 * Poisoned hugetlb page layout (assume hugepagesize=2MB):
+	 * |<---------------------- 1MB ---------------------->|
+	 * |<---- healthy page ---->|<---- HWPOISON page ----->|
+	 * |<------------------- (1MB - 8KB) ----------------->|
+	 */
+	hwp_addr = filemap + len / 2 + pagesize;
+	if (madvise(hwp_addr, pagesize, MADV_HWPOISON) < 0) {
+		perror(PREFIX ERROR_PREFIX "MADV_HWPOISON failed");
+		goto unmap;
+	}
+
+	if (!skip_hwpoison_page) {
+		/*
+		 * Userspace should be able to read (1MB + 1 page) from
+		 * the beginning of the HWPOISONed hugepage.
+		 */
+		if (read_hugepage_filemap(fd, len, wr_chunk_size,
+					  len / 2 + pagesize))
+			status = TEST_PASSED;
+	} else {
+		/*
+		 * Userspace should be able to read (1MB - 2 pages) from
+		 * HWPOISONed hugepage.
+		 */
+		if (seek_read_hugepage_filemap(fd, len, wr_chunk_size,
+					       len / 2 + MAX(2 * pagesize, wr_chunk_size),
+					       len / 2 - MAX(2 * pagesize, wr_chunk_size)))
+			status = TEST_PASSED;
+	}
+
+unmap:
+	munmap(filemap, len);
+done:
+	if (ftruncate(fd, 0) < 0) {
+		perror(PREFIX ERROR_PREFIX "ftruncate back to 0 failed");
+		status = TEST_FAILED;
+	}
+
+	return status;
+}
+
+static int create_hugetlbfs_file(struct statfs *file_stat)
+{
+	int fd;
+
+	fd = memfd_create("hugetlb_tmp", MFD_HUGETLB);
+	if (fd < 0) {
+		perror(PREFIX ERROR_PREFIX "could not open hugetlbfs file");
+		return -1;
+	}
+
+	memset(file_stat, 0, sizeof(*file_stat));
+	if (fstatfs(fd, file_stat)) {
+		perror(PREFIX ERROR_PREFIX "fstatfs failed");
+		goto close;
+	}
+	if (file_stat->f_type != HUGETLBFS_MAGIC) {
+		printf(PREFIX ERROR_PREFIX "not hugetlbfs file\n");
+		goto close;
+	}
+
+	return fd;
+close:
+	close(fd);
+	return -1;
+}
+
+int main(void)
+{
+	int fd;
+	struct statfs file_stat;
+	enum test_status status;
+	/* Test read() in different granularity. */
+	size_t wr_chunk_sizes[] = {
+		getpagesize() / 2, getpagesize(),
+		getpagesize() * 2, getpagesize() * 4
+	};
+	size_t i;
+
+	for (i = 0; i < ARRAY_SIZE(wr_chunk_sizes); ++i) {
+		printf("Write/read chunk size=0x%lx\n",
+		       wr_chunk_sizes[i]);
+
+		fd = create_hugetlbfs_file(&file_stat);
+		if (fd < 0)
+			goto create_failure;
+		printf(PREFIX "HugeTLB read regression test...\n");
+		status = test_hugetlb_read(fd, file_stat.f_bsize,
+					   wr_chunk_sizes[i]);
+		printf(PREFIX "HugeTLB read regression test...%s\n",
+		       status_to_str(status));
+		close(fd);
+		if (status == TEST_FAILED)
+			return -1;
+
+		fd = create_hugetlbfs_file(&file_stat);
+		if (fd < 0)
+			goto create_failure;
+		printf(PREFIX "HugeTLB read HWPOISON test...\n");
+		status = test_hugetlb_read_hwpoison(fd, file_stat.f_bsize,
+						    wr_chunk_sizes[i], false);
+		printf(PREFIX "HugeTLB read HWPOISON test...%s\n",
+		       status_to_str(status));
+		close(fd);
+		if (status == TEST_FAILED)
+			return -1;
+
+		fd = create_hugetlbfs_file(&file_stat);
+		if (fd < 0)
+			goto create_failure;
+		printf(PREFIX "HugeTLB seek then read HWPOISON test...\n");
+		status = test_hugetlb_read_hwpoison(fd, file_stat.f_bsize,
+						    wr_chunk_sizes[i], true);
+		printf(PREFIX "HugeTLB seek then read HWPOISON test...%s\n",
+		       status_to_str(status));
+		close(fd);
+		if (status == TEST_FAILED)
+			return -1;
+	}
+
+	return 0;
+
+create_failure:
+	printf(ERROR_PREFIX "Abort test: failed to create hugetlbfs file\n");
+	return -1;
+}
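
Note on the expected byte counts: the sketch below restates, as plain
arithmetic, the sizes that the two HWPOISON scenarios in the patch compare
against. It is only an illustration, not part of the patch. The helper
names (max_size, hwpoison_offset, expected_sequential, seek_offset,
expected_after_seek) are hypothetical, and the example values assume a 2MB
hugepage with 4KB base pages.

```c
#include <stddef.h>

/* Hypothetical helper: larger of two sizes, mirroring the test's MAX(). */
static size_t max_size(size_t a, size_t b)
{
	return a > b ? a : b;
}

/* Offset of the poisoned base page: one page past the hugepage midpoint. */
static size_t hwpoison_offset(size_t len, size_t pagesize)
{
	return len / 2 + pagesize;
}

/* Bytes a sequential read from offset 0 should return before the poison. */
static size_t expected_sequential(size_t len, size_t pagesize)
{
	return hwpoison_offset(len, pagesize);
}

/* Offset the test seeks to so that the poisoned page is skipped. */
static size_t seek_offset(size_t len, size_t pagesize, size_t chunk)
{
	return len / 2 + max_size(2 * pagesize, chunk);
}

/* Bytes that remain readable from the seek offset to the end of the file. */
static size_t expected_after_seek(size_t len, size_t pagesize, size_t chunk)
{
	return len - seek_offset(len, pagesize, chunk);
}
```

With len = 2097152 and pagesize = 4096, expected_sequential() gives
1052672 (1MB + 1 page). For chunk sizes up to two pages,
expected_after_seek() gives 1040384 (1MB - 2 pages); with a four-page
chunk the seek offset moves out to the chunk boundary, so the remainder
is 1MB - 4 pages.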