From patchwork Wed May 17 16:09:47 2023
X-Patchwork-Submitter: Jiaqi Yan
X-Patchwork-Id: 95406
Date: Wed, 17 May 2023 16:09:47 +0000
In-Reply-To: <20230517160948.811355-1-jiaqiyan@google.com>
References: <20230517160948.811355-1-jiaqiyan@google.com>
X-Mailer: git-send-email 2.40.1.606.ga4b1b128d6-goog
Message-ID: <20230517160948.811355-3-jiaqiyan@google.com>
Subject: [PATCH v1 2/3] hugetlbfs: improve read HWPOISON hugepage
From: Jiaqi Yan
To: mike.kravetz@oracle.com, songmuchun@bytedance.com,
 naoya.horiguchi@nec.com, shy828301@gmail.com, linmiaohe@huawei.com
Cc: akpm@linux-foundation.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, duenwen@google.com,
 axelrasmussen@google.com, jthoughton@google.com, Jiaqi Yan

When a hugepage contains HWPOISON pages, read() fails to read any byte
of the hugepage and returns -EIO, although many bytes in the HWPOISON
hugepage are readable.

Improve this by allowing hugetlbfs_read_iter to return as many bytes as
possible. For a requested range [offset, offset + len) that contains a
HWPOISON page, return [offset, first HWPOISON page addr); the next read
attempt will fail and return -EIO.

Signed-off-by: Jiaqi Yan
---
 fs/hugetlbfs/inode.c | 62 +++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 56 insertions(+), 6 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index ecfdfb2529a3..1baa08ec679f 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -282,6 +282,46 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 }
 #endif
 
+/*
+ * Someone wants to read @bytes from a HWPOISON hugetlb @page from @offset.
+ * Returns the maximum number of bytes one can read without touching the 1st raw
+ * HWPOISON subpage.
+ *
+ * The implementation borrows the iteration logic from copy_page_to_iter*.
+ */
+static size_t adjust_range_hwpoison(struct page *page, size_t offset, size_t bytes)
+{
+	size_t n = 0;
+	size_t res = 0;
+	struct folio *folio = page_folio(page);
+
+	folio_lock(folio);
+
+	/* First subpage to start the loop. */
+	page += offset / PAGE_SIZE;
+	offset %= PAGE_SIZE;
+	while (1) {
+		if (find_raw_hwp_page(folio, page) != NULL)
+			break;
+
+		/* Safe to read n bytes without touching HWPOISON subpage. */
+		n = min(bytes, (size_t)PAGE_SIZE - offset);
+		res += n;
+		bytes -= n;
+		if (!bytes || !n)
+			break;
+		offset += n;
+		if (offset == PAGE_SIZE) {
+			page++;
+			offset = 0;
+		}
+	}
+
+	folio_unlock(folio);
+
+	return res;
+}
+
 /*
  * Support for read() - Find the page attached to f_mapping and copy out the
  * data. This provides functionality similar to filemap_read().
@@ -300,7 +340,7 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
 
 	while (iov_iter_count(to)) {
 		struct page *page;
-		size_t nr, copied;
+		size_t nr, copied, want;
 
 		/* nr is the maximum number of bytes to copy from this page */
 		nr = huge_page_size(h);
@@ -328,16 +368,26 @@ static ssize_t hugetlbfs_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		} else {
 			unlock_page(page);
 
-			if (PageHWPoison(page)) {
-				put_page(page);
-				retval = -EIO;
-				break;
+			if (!PageHWPoison(page))
+				want = nr;
+			else {
+				/*
+				 * Adjust how many bytes safe to read without
+				 * touching the 1st raw HWPOISON subpage after
+				 * offset.
+				 */
+				want = adjust_range_hwpoison(page, offset, nr);
+				if (want == 0) {
+					put_page(page);
+					retval = -EIO;
+					break;
+				}
 			}
 
 			/*
 			 * We have the page, copy it to user space buffer.
 			 */
-			copied = copy_page_to_iter(page, offset, nr, to);
+			copied = copy_page_to_iter(page, offset, want, to);
 			put_page(page);
 		}
 		offset += copied;
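
For readers who want to check the range arithmetic outside the kernel, the
clamping done by adjust_range_hwpoison() can be modelled in user space. The
sketch below is not part of the patch and is only an approximation under
stated assumptions: a 4 KiB base page, a 2 MiB hugepage, and a plain bool
array standing in for the raw HWPOISON subpage list. The names
model_adjust_range(), model_subpage_poisoned() and the MODEL_* constants are
hypothetical, and folio locking is left out entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed sizes for the model only; real values depend on the architecture. */
#define MODEL_PAGE_SIZE   ((size_t)4096)
#define MODEL_NR_SUBPAGES ((size_t)512)
#define MODEL_HPAGE_SIZE  (MODEL_PAGE_SIZE * MODEL_NR_SUBPAGES)

/* Hypothetical stand-in for find_raw_hwp_page(): is subpage idx poisoned? */
static bool model_subpage_poisoned(const bool *poison, size_t idx)
{
	return poison[idx];
}

/*
 * Return how many of the requested bytes, starting at offset within the
 * hugepage, can be read before the first poisoned subpage is reached.
 * A return value of 0 corresponds to the -EIO case in the patch.
 */
static size_t model_adjust_range(const bool *poison, size_t offset, size_t bytes)
{
	size_t idx = offset / MODEL_PAGE_SIZE;
	size_t res = 0;

	/* The caller must keep the range inside one hugepage. */
	assert(offset + bytes <= MODEL_HPAGE_SIZE);

	offset %= MODEL_PAGE_SIZE;
	while (bytes) {
		size_t n;

		if (model_subpage_poisoned(poison, idx))
			break;

		/* Take the rest of this subpage, or whatever is still wanted. */
		n = MODEL_PAGE_SIZE - offset;
		if (n > bytes)
			n = bytes;
		res += n;
		bytes -= n;

		/* Every subpage after the first is read from its start. */
		offset = 0;
		idx++;
	}

	return res;
}
```

With subpage 3 marked poisoned, a request starting at offset 100 is clamped to
the bytes before that subpage (3 * 4096 - 100), and a request that starts
inside subpage 3 yields 0. This matches the behaviour described in the commit
message: a short read first, then -EIO on the next attempt.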