Message ID | 20230218002819.1486479-2-jthoughton@google.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp142049wrn; Fri, 17 Feb 2023 16:29:36 -0800 (PST) X-Google-Smtp-Source: AK7set/tx+525Vi888m6yK0oT+Guql77JWv7OiZJs3/IHyP5WMXOulOE3XDfGCQyuZQTZlml69Ki X-Received: by 2002:a05:6402:1016:b0:4aa:b216:3e23 with SMTP id c22-20020a056402101600b004aab2163e23mr2107787edu.30.1676680176029; Fri, 17 Feb 2023 16:29:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1676680176; cv=none; d=google.com; s=arc-20160816; b=hsP/WuWM2fsUi01h7zwQ1jJ9FFtl89Vwk8VWLIXFcVG+zzoSquvj6IhdLFym9LZ8an 1/Oa0mkMQMxIoWTXkORzkb5RlA8u/r372xoXqZlzuo9Igf0MotFaGRgAArnEq7LJWbMU kbCS+JzBgxDlLKwFoMYClR456ZCz4Svj4KWaFhvf826iTjCtjLOc31JO+23jKhD4YbDM bFPD9tfLfveVNtBCZVB/txC4sE/HfGDioQ/6bbFb3L/8W79N38mSV21CSyp1lzNbT0eG at4T/qbldhcq17q+Us1a+WByPZMY0ql8547B+ZXGstC/AYckuwN79h18fOgPFcXuML42 sAZw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=WRJSWRU8dQHXXhdXAbgeFtewLtYoHueyg1zV/lollrc=; b=OUKXsk8AE9Dnhz3j+nFcGbpV2atJOJTLvwarlmIu2ntAbbBmhQ0nIoXWPQGlOe5j/P P3OhjNgLa68BMjP37Q30UmiYDmYNEEroqZJCOD/kjDtc2T3zauDL0ed4LsLweUc3IQBw Al9p5hgB/0c20ZAHbjm2UwYHmfR+yWZuNSOQBSjhRg4Hzk0YnVhiDLzgOsg+fm8/uSAZ nEMKOAyydGe+ioNQj10QutAqIMLt4YbgIFuRnYXgTmjganjCTji3LtPOA2h977EHSpdI zTrneMGw228x9LLxBgZuNtDpsSY2AnkYIT7tnXQ4Kuyy9niVOCHS3ak5QfV4rv7RwUJi lX4w== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=JUkI+uUZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id m3-20020aa7c2c3000000b004acbe85e42csi6711754edp.414.2023.02.17.16.29.10; Fri, 17 Feb 2023 16:29:36 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=JUkI+uUZ; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229793AbjBRA2s (ORCPT <rfc822;daweilics@gmail.com> + 99 others); Fri, 17 Feb 2023 19:28:48 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41652 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229510AbjBRA2q (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Fri, 17 Feb 2023 19:28:46 -0500 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3CB2459B49 for <linux-kernel@vger.kernel.org>; Fri, 17 Feb 2023 16:28:45 -0800 (PST) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-536582abb72so22509327b3.5 for <linux-kernel@vger.kernel.org>; Fri, 17 Feb 2023 16:28:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=WRJSWRU8dQHXXhdXAbgeFtewLtYoHueyg1zV/lollrc=; b=JUkI+uUZUGikStVOrWOWfUFvkifZQjdvysrYVMcxjv1b2yQplxM71vw1fjhTzhEkxI BEohrL6MRF518g9B3YG2ljl8T7G93FwY6nToc+IVWoDTzVPkWMKdK99Fw+Y/07yhJ3P2 Pt0n2XoDP+p37aLwSJxXkZR/Jyh8LDLuXTOP46PclH0WaRZarQPsfJWYdmcsYTgivcxA V29rcePGsSV2Bm8JBI3cnvgY6VYwZ38Vy23D8uxt/hblg5U5P6fb6PnTeEwjCMXWB3jt Xt02v+X21m74LuwV377B2oYbsfLR/FpkRSgMkEe4zOTBZt33XomrJuA07ToJPeOqmC5B 2qvw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=WRJSWRU8dQHXXhdXAbgeFtewLtYoHueyg1zV/lollrc=; b=EFhDPtIx+Qrf0TIl/4WzadxlbNvCFb/DC0O4nJH6L5mqgMMaPMCn21myVcAXUcXeEh RfTkfQ4w7ZrSo/ssU1dGh9y1kn8UnUJPO1DHqQeVswx94rABzuihVLJGlinpHrXcNPQC PeCpgxQHiAT8W9gvJDNzTuEr/5R2ICjDzaRtIv6gBH7x0W9r+kQbAvw8SV98nWtx1jWu qsdSJgUQAgoJDcWIxAlOvNcDUOD4uCDa5D6iPiNVwz2WB1EWm1JsxsDfWRc8za4CD/wQ tBvokvaP006rcr7iYsbKHDSd72up7gagteZt3Lclwwx9S2DRWB5lLgMhpe9VLQUYvTBU R9WA== X-Gm-Message-State: AO0yUKXItu4v/sEEHBs8svAlKqTmYGHRsTmwIDH6YikvS8OTk9Cyvmmd usz/rZVcHt6xu/rt2Efoj330AKZlzMcJg5fS X-Received: from jthoughton.c.googlers.com ([fda3:e722:ac3:cc00:14:4d90:c0a8:2a4f]) (user=jthoughton job=sendgmr) by 2002:a81:4981:0:b0:527:adb4:3297 with SMTP id w123-20020a814981000000b00527adb43297mr1507453ywa.161.1676680124408; Fri, 17 Feb 2023 16:28:44 -0800 (PST) Date: Sat, 18 Feb 2023 00:27:34 +0000 In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com> Mime-Version: 1.0 References: <20230218002819.1486479-1-jthoughton@google.com> X-Mailer: git-send-email 2.39.2.637.g21b0678d19-goog Message-ID: <20230218002819.1486479-2-jthoughton@google.com> Subject: [PATCH v2 01/46] hugetlb: don't set PageUptodate for UFFDIO_CONTINUE From: James Houghton <jthoughton@google.com> To: Mike Kravetz <mike.kravetz@oracle.com>, Muchun Song <songmuchun@bytedance.com>, Peter Xu <peterx@redhat.com>, Andrew Morton <akpm@linux-foundation.org> Cc: David Hildenbrand <david@redhat.com>, David Rientjes <rientjes@google.com>, Axel Rasmussen <axelrasmussen@google.com>, Mina Almasry <almasrymina@google.com>, "Zach O'Keefe" <zokeefe@google.com>, Manish Mishra <manish.mishra@nutanix.com>, Naoya Horiguchi <naoya.horiguchi@nec.com>, "Dr . David Alan Gilbert" <dgilbert@redhat.com>, "Matthew Wilcox (Oracle)" <willy@infradead.org>, Vlastimil Babka <vbabka@suse.cz>, Baolin Wang <baolin.wang@linux.alibaba.com>, Miaohe Lin <linmiaohe@huawei.com>, Yang Shi <shy828301@gmail.com>, Frank van der Linden <fvdl@google.com>, Jiaqi Yan <jiaqiyan@google.com>, linux-mm@kvack.org, linux-kernel@vger.kernel.org, James Houghton <jthoughton@google.com> Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1758126592195884097?= X-GMAIL-MSGID: =?utf-8?q?1758126592195884097?= |
Series |
hugetlb: introduce HugeTLB high-granularity mapping
|
|
Commit Message
James Houghton
Feb. 18, 2023, 12:27 a.m. UTC
If would be bad if we actually set PageUptodate with UFFDIO_CONTINUE;
PageUptodate indicates that the page has been zeroed, and we don't want
to give a non-zeroed page to the user.
The reason this change is being made now is because UFFDIO_CONTINUEs on
subpages definitely shouldn't set this page flag on the head page.
Signed-off-by: James Houghton <jthoughton@google.com>
Comments
On Fri, Feb 17, 2023 at 4:28 PM James Houghton <jthoughton@google.com> wrote: > > If would be bad if we actually set PageUptodate with UFFDIO_CONTINUE; > PageUptodate indicates that the page has been zeroed, and we don't want > to give a non-zeroed page to the user. > > The reason this change is being made now is because UFFDIO_CONTINUEs on > subpages definitely shouldn't set this page flag on the head page. > > Signed-off-by: James Houghton <jthoughton@google.com> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > index 07abcb6eb203..792cb2e67ce5 100644 > --- a/mm/hugetlb.c > +++ b/mm/hugetlb.c > @@ -6256,7 +6256,16 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, > * preceding stores to the page contents become visible before > * the set_pte_at() write. > */ > - __folio_mark_uptodate(folio); > + if (!is_continue) > + __folio_mark_uptodate(folio); > + else if (!folio_test_uptodate(folio)) { > + /* > + * This should never happen; HugeTLB pages are always Uptodate > + * as soon as they are allocated. > + */ if (is_continue) then we grab a page from the page cache, no? Are pages in page caches always uptodate? Why? I guess that means they're mapped hence uptodate? Also this comment should explain why pages in the page cache are always uptodate, no? Because this error branch is hit if (is_continue && !folio_test_uptodate()), not when pages are freshly allocated. > + ret = -EFAULT; > + goto out_release_nounlock; > + } > > /* Add shared, newly allocated pages to the page cache. */ > if (vm_shared && !is_continue) { > -- > 2.39.2.637.g21b0678d19-goog >
On Fri, Feb 17, 2023 at 4:42 PM Mina Almasry <almasrymina@google.com> wrote: > > On Fri, Feb 17, 2023 at 4:28 PM James Houghton <jthoughton@google.com> wrote: > > > > If would be bad if we actually set PageUptodate with UFFDIO_CONTINUE; > > PageUptodate indicates that the page has been zeroed, and we don't want > > to give a non-zeroed page to the user. > > > > The reason this change is being made now is because UFFDIO_CONTINUEs on > > subpages definitely shouldn't set this page flag on the head page. > > > > Signed-off-by: James Houghton <jthoughton@google.com> > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > index 07abcb6eb203..792cb2e67ce5 100644 > > --- a/mm/hugetlb.c > > +++ b/mm/hugetlb.c > > @@ -6256,7 +6256,16 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, > > * preceding stores to the page contents become visible before > > * the set_pte_at() write. > > */ > > - __folio_mark_uptodate(folio); > > + if (!is_continue) > > + __folio_mark_uptodate(folio); > > + else if (!folio_test_uptodate(folio)) { > > + /* > > + * This should never happen; HugeTLB pages are always Uptodate > > + * as soon as they are allocated. > > + */ > > if (is_continue) then we grab a page from the page cache, no? Are > pages in page caches always uptodate? Why? I guess that means they're > mapped hence uptodate? > > Also this comment should explain why pages in the page cache are > always uptodate, no? Because this error branch is hit if (is_continue > && !folio_test_uptodate()), not when pages are freshly allocated. There was some discussion about it here[1]. Without even thinking about how the pages become uptodate, I think this patch is justified like this: UFFDIO_CONTINUE => we aren't actually changing the contents of the page, so we shouldn't be changing the uptodate-ness of the page. HugeTLB pages in the page cache are always uptodate: 1. fallocate -- the page is allocated, zeroed, marked as uptodate, and then placed in the page cache. 2. hugetlb_no_page -- same as above. So uptodate <=> "the page has been zeroed", so it would be very bad if we gave a !uptodate page to userspace via UFFDIO_CONTINUE. I'll update the comment to something like: "HugeTLB pages are always Uptodate as soon as they are added to the page cache. Given that we aren't changing the contents of the page, we shouldn't be updating the Uptodate-ness of the page." [1]: https://lore.kernel.org/linux-mm/Y5JrS4o5Detzid9V@monkey/ Thanks, Mina. :) - James
On 02/21/23 07:59, James Houghton wrote: > On Fri, Feb 17, 2023 at 4:42 PM Mina Almasry <almasrymina@google.com> wrote: > > > > On Fri, Feb 17, 2023 at 4:28 PM James Houghton <jthoughton@google.com> wrote: > > > > > > If would be bad if we actually set PageUptodate with UFFDIO_CONTINUE; > > > PageUptodate indicates that the page has been zeroed, and we don't want > > > to give a non-zeroed page to the user. > > > > > > The reason this change is being made now is because UFFDIO_CONTINUEs on > > > subpages definitely shouldn't set this page flag on the head page. > > > > > > Signed-off-by: James Houghton <jthoughton@google.com> > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > index 07abcb6eb203..792cb2e67ce5 100644 > > > --- a/mm/hugetlb.c > > > +++ b/mm/hugetlb.c > > > @@ -6256,7 +6256,16 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, > > > * preceding stores to the page contents become visible before > > > * the set_pte_at() write. > > > */ > > > - __folio_mark_uptodate(folio); > > > + if (!is_continue) > > > + __folio_mark_uptodate(folio); > > > + else if (!folio_test_uptodate(folio)) { > > > + /* > > > + * This should never happen; HugeTLB pages are always Uptodate > > > + * as soon as they are allocated. > > > + */ > > > > if (is_continue) then we grab a page from the page cache, no? Are > > pages in page caches always uptodate? Why? I guess that means they're > > mapped hence uptodate? > > > > Also this comment should explain why pages in the page cache are > > always uptodate, no? Because this error branch is hit if (is_continue > > && !folio_test_uptodate()), not when pages are freshly allocated. > > There was some discussion about it here[1]. > > Without even thinking about how the pages become uptodate, I think > this patch is justified like this: UFFDIO_CONTINUE => we aren't > actually changing the contents of the page, so we shouldn't be > changing the uptodate-ness of the page. Agree! > HugeTLB pages in the page cache are always uptodate: > 1. fallocate -- the page is allocated, zeroed, marked as uptodate, and > then placed in the page cache. > 2. hugetlb_no_page -- same as above. > > So uptodate <=> "the page has been zeroed", so it would be very bad if > we gave a !uptodate page to userspace via UFFDIO_CONTINUE. > > I'll update the comment to something like: > > "HugeTLB pages are always Uptodate as soon as they are added to the > page cache. Given that we aren't changing the contents of the page, we > shouldn't be updating the Uptodate-ness of the page." Perhaps a better way of saying is that hugetlb pages are marked uptodate shortly after allocation when their contents are initialized. Initialized data could be zero, or it could be contents copied from another location (such as in the UFFDIO_COPY case also handled in this routine). Saying "PageUptodate indicates that the page has been zeroed" as in the commit message is technically not correct. Ack to the patch.
On Tue, Feb 21, 2023 at 11:34 AM Mike Kravetz <mike.kravetz@oracle.com> wrote: > > On 02/21/23 07:59, James Houghton wrote: > > On Fri, Feb 17, 2023 at 4:42 PM Mina Almasry <almasrymina@google.com> wrote: > > > > > > On Fri, Feb 17, 2023 at 4:28 PM James Houghton <jthoughton@google.com> wrote: > > > > > > > > If would be bad if we actually set PageUptodate with UFFDIO_CONTINUE; > > > > PageUptodate indicates that the page has been zeroed, and we don't want > > > > to give a non-zeroed page to the user. > > > > > > > > The reason this change is being made now is because UFFDIO_CONTINUEs on > > > > subpages definitely shouldn't set this page flag on the head page. > > > > > > > > Signed-off-by: James Houghton <jthoughton@google.com> > > > > > > > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c > > > > index 07abcb6eb203..792cb2e67ce5 100644 > > > > --- a/mm/hugetlb.c > > > > +++ b/mm/hugetlb.c > > > > @@ -6256,7 +6256,16 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, > > > > * preceding stores to the page contents become visible before > > > > * the set_pte_at() write. > > > > */ > > > > - __folio_mark_uptodate(folio); > > > > + if (!is_continue) > > > > + __folio_mark_uptodate(folio); > > > > + else if (!folio_test_uptodate(folio)) { > > > > + /* > > > > + * This should never happen; HugeTLB pages are always Uptodate > > > > + * as soon as they are allocated. > > > > + */ > > > > > > if (is_continue) then we grab a page from the page cache, no? Are > > > pages in page caches always uptodate? Why? I guess that means they're > > > mapped hence uptodate? > > > > > > Also this comment should explain why pages in the page cache are > > > always uptodate, no? Because this error branch is hit if (is_continue > > > && !folio_test_uptodate()), not when pages are freshly allocated. > > > > There was some discussion about it here[1]. > > > > Without even thinking about how the pages become uptodate, I think > > this patch is justified like this: UFFDIO_CONTINUE => we aren't > > actually changing the contents of the page, so we shouldn't be > > changing the uptodate-ness of the page. > > Agree! > > > HugeTLB pages in the page cache are always uptodate: > > 1. fallocate -- the page is allocated, zeroed, marked as uptodate, and > > then placed in the page cache. > > 2. hugetlb_no_page -- same as above. > > > > So uptodate <=> "the page has been zeroed", so it would be very bad if > > we gave a !uptodate page to userspace via UFFDIO_CONTINUE. > > > > I'll update the comment to something like: > > > > "HugeTLB pages are always Uptodate as soon as they are added to the > > page cache. Given that we aren't changing the contents of the page, we > > shouldn't be updating the Uptodate-ness of the page." > > Perhaps a better way of saying is that hugetlb pages are marked uptodate > shortly after allocation when their contents are initialized. Initialized > data could be zero, or it could be contents copied from another location > (such as in the UFFDIO_COPY case also handled in this routine). I'll write something like this. Thank you! > > Saying "PageUptodate indicates that the page has been zeroed" as in the > commit message is technically not correct. And I'll make sure to update the commit description as well. > > Ack to the patch. Thanks, Mike!
diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 07abcb6eb203..792cb2e67ce5 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6256,7 +6256,16 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, * preceding stores to the page contents become visible before * the set_pte_at() write. */ - __folio_mark_uptodate(folio); + if (!is_continue) + __folio_mark_uptodate(folio); + else if (!folio_test_uptodate(folio)) { + /* + * This should never happen; HugeTLB pages are always Uptodate + * as soon as they are allocated. + */ + ret = -EFAULT; + goto out_release_nounlock; + } /* Add shared, newly allocated pages to the page cache. */ if (vm_shared && !is_continue) {