From patchwork Thu Jul 6 22:50:29 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 116877
Date: Thu, 6 Jul 2023 15:50:29 -0700
In-Reply-To: <20230706225037.1164380-1-axelrasmussen@google.com>
References: <20230706225037.1164380-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog
Message-ID: <20230706225037.1164380-2-axelrasmussen@google.com>
Subject: [PATCH v3 1/8] mm: make PTE_MARKER_SWAPIN_ERROR more general
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Brian Geffon, Christian Brauner,
 David Hildenbrand, Gaosheng Cui, Huang Ying, Hugh Dickins,
 James Houghton, "Jan Alexander Steffens (heftig)", Jiaqi Yan,
 Jonathan Corbet, Kefeng Wang, "Liam R. Howlett", Miaohe Lin,
 Mike Kravetz, "Mike Rapoport (IBM)", Muchun Song, Nadav Amit,
 Naoya Horiguchi, Peter Xu, Ryan Roberts, Shuah Khan, Suleiman Souhlal,
 Suren Baghdasaryan, "T.J. Alumbaugh", Yu Zhao, ZhangPeng
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, Axel Rasmussen
X-Mailing-List: linux-kernel@vger.kernel.org

Future patches will re-use PTE_MARKER_SWAPIN_ERROR to implement
UFFDIO_POISON, so make some preparations for that:

First, rename it to just PTE_MARKER_ERROR. The "SWAPIN" can be confusing
since we're going to re-use it for something not really related to swap.
This can be particularly confusing for things like hugetlbfs, which
doesn't support swap whatsoever. Also rename the related helper
functions.

Next, fix pte marker copying for hugetlbfs. Previously, it would WARN on
seeing a PTE_MARKER_SWAPIN_ERROR, since hugetlbfs doesn't support swap.
But, since we're going to re-use it, we want it to go ahead and copy it
just like non-hugetlbfs memory does today. Since the code to do this is
more complicated now, pull it out into a helper which can be re-used in
both places. While we're at it, also make it slightly more explicit in
its handling of e.g. uffd wp markers.

For non-hugetlbfs page faults, instead of returning VM_FAULT_SIGBUS for
an error entry, return VM_FAULT_HWPOISON. For most cases this change
doesn't matter, e.g. a userspace program would receive a SIGBUS either
way. But for UFFDIO_POISON, this change will let KVM guests get an MCE
out of the box, instead of giving a SIGBUS to the hypervisor and
requiring it to somehow inject an MCE.

Finally, for hugetlbfs faults, handle PTE_MARKER_ERROR, and return
VM_FAULT_HWPOISON_LARGE in such cases. Note that this can't happen today
because the lack of swap support means we'll never end up with such a
PTE anyway, but this behavior will be needed once such entries *can*
show up via UFFDIO_POISON.

Signed-off-by: Axel Rasmussen
---
 include/linux/mm_inline.h | 19 +++++++++++++++++++
 include/linux/swapops.h   | 10 +++++-----
 mm/hugetlb.c              | 32 +++++++++++++++++++++-----------
 mm/madvise.c              |  2 +-
 mm/memory.c               | 15 +++++++++------
 mm/mprotect.c             |  4 ++--
 mm/shmem.c                |  4 ++--
 mm/swapfile.c             |  2 +-
 8 files changed, 60 insertions(+), 28 deletions(-)

diff --git a/include/linux/mm_inline.h b/include/linux/mm_inline.h
index 21d6c72bcc71..329bd9370b49 100644
--- a/include/linux/mm_inline.h
+++ b/include/linux/mm_inline.h
@@ -523,6 +523,25 @@ static inline bool mm_tlb_flush_nested(struct mm_struct *mm)
 	return atomic_read(&mm->tlb_flush_pending) > 1;
 }
 
+/*
+ * Computes the pte marker to copy from the given source entry into dst_vma.
+ * If no marker should be copied, returns 0.
+ * The caller should insert a new pte created with make_pte_marker().
+ */
+static inline pte_marker copy_pte_marker(
+		swp_entry_t entry, struct vm_area_struct *dst_vma)
+{
+	pte_marker srcm = pte_marker_get(entry);
+	/* Always copy error entries. */
+	pte_marker dstm = srcm & PTE_MARKER_ERROR;
+
+	/* Only copy PTE markers if UFFD register matches. */
+	if ((srcm & PTE_MARKER_UFFD_WP) && userfaultfd_wp(dst_vma))
+		dstm |= PTE_MARKER_UFFD_WP;
+
+	return dstm;
+}
+
 /*
  * If this pte is wr-protected by uffd-wp in any form, arm the special pte to
  * replace a none pte. NOTE! This should only be called when *pte is already
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 4c932cb45e0b..5f1818d48dd6 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -393,7 +393,7 @@ static inline bool is_migration_entry_dirty(swp_entry_t entry)
 typedef unsigned long pte_marker;
 
 #define PTE_MARKER_UFFD_WP	BIT(0)
-#define PTE_MARKER_SWAPIN_ERROR	BIT(1)
+#define PTE_MARKER_ERROR	BIT(1)
 #define PTE_MARKER_MASK	(BIT(2) - 1)
 
 static inline swp_entry_t make_pte_marker_entry(pte_marker marker)
@@ -421,15 +421,15 @@ static inline pte_t make_pte_marker(pte_marker marker)
 	return swp_entry_to_pte(make_pte_marker_entry(marker));
 }
 
-static inline swp_entry_t make_swapin_error_entry(void)
+static inline swp_entry_t make_error_swp_entry(void)
 {
-	return make_pte_marker_entry(PTE_MARKER_SWAPIN_ERROR);
+	return make_pte_marker_entry(PTE_MARKER_ERROR);
 }
 
-static inline int is_swapin_error_entry(swp_entry_t entry)
+static inline int is_error_swp_entry(swp_entry_t entry)
 {
 	return is_pte_marker_entry(entry) &&
-	    (pte_marker_get(entry) & PTE_MARKER_SWAPIN_ERROR);
+	    (pte_marker_get(entry) & PTE_MARKER_ERROR);
 }
 
 /*
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bce28cca73a1..934e129d9939 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -5101,15 +5102,12 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				entry = huge_pte_clear_uffd_wp(entry);
 			set_huge_pte_at(dst, addr, dst_pte, entry);
 		} else if (unlikely(is_pte_marker(entry))) {
-			/* No swap on hugetlb */
-			WARN_ON_ONCE(
-				is_swapin_error_entry(pte_to_swp_entry(entry)));
-			/*
-			 * We copy the pte marker only if the dst vma has
-			 * uffd-wp enabled.
-			 */
-			if (userfaultfd_wp(dst_vma))
-				set_huge_pte_at(dst, addr, dst_pte, entry);
+			pte_marker marker = copy_pte_marker(
+				pte_to_swp_entry(entry), dst_vma);
+
+			if (marker)
+				set_huge_pte_at(dst, addr, dst_pte,
+						make_pte_marker(marker));
 		} else {
 			entry = huge_ptep_get(src_pte);
 			pte_folio = page_folio(pte_page(entry));
@@ -6090,14 +6088,26 @@ vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 	}
 
 	entry = huge_ptep_get(ptep);
-	/* PTE markers should be handled the same way as none pte */
-	if (huge_pte_none_mostly(entry))
+	if (huge_pte_none_mostly(entry)) {
+		if (is_pte_marker(entry)) {
+			pte_marker marker =
+				pte_marker_get(pte_to_swp_entry(entry));
+
+			if (marker & PTE_MARKER_ERROR) {
+				ret = VM_FAULT_HWPOISON_LARGE;
+				goto out_mutex;
+			}
+		}
+
 		/*
+		 * Other PTE markers should be handled the same way as none PTE.
+		 *
 		 * hugetlb_no_page will drop vma lock and hugetlb fault
 		 * mutex internally, which make us return immediately.
 		 */
 		return hugetlb_no_page(mm, vma, mapping, idx, address, ptep,
 				      entry, flags);
+	}
 
 	ret = 0;
 
diff --git a/mm/madvise.c b/mm/madvise.c
index 886f06066622..59e954586e2a 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -660,7 +660,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 				free_swap_and_cache(entry);
 				pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
 			} else if (is_hwpoison_entry(entry) ||
-				   is_swapin_error_entry(entry)) {
+				   is_error_swp_entry(entry)) {
 				pte_clear_not_present_full(mm, addr, pte, tlb->fullmm);
 			}
 			continue;
diff --git a/mm/memory.c b/mm/memory.c
index 0ae594703021..c8b6de99d14c 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -860,8 +860,11 @@ copy_nonpresent_pte(struct mm_struct *dst_mm, struct mm_struct *src_mm,
 			return -EBUSY;
 		return -ENOENT;
 	} else if (is_pte_marker_entry(entry)) {
-		if (is_swapin_error_entry(entry) || userfaultfd_wp(dst_vma))
-			set_pte_at(dst_mm, addr, dst_pte, pte);
+		pte_marker marker = copy_pte_marker(entry, dst_vma);
+
+		if (marker)
+			set_pte_at(dst_mm, addr, dst_pte,
+				   make_pte_marker(marker));
 		return 0;
 	}
 	if (!userfaultfd_wp(dst_vma))
@@ -1500,7 +1503,7 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 			    !zap_drop_file_uffd_wp(details))
 				continue;
 		} else if (is_hwpoison_entry(entry) ||
-			   is_swapin_error_entry(entry)) {
+			   is_error_swp_entry(entry)) {
 			if (!should_zap_cows(details))
 				continue;
 		} else {
@@ -3647,7 +3650,7 @@ static vm_fault_t pte_marker_clear(struct vm_fault *vmf)
 	 * none pte. Otherwise it means the pte could have changed, so retry.
 	 *
 	 * This should also cover the case where e.g. the pte changed
-	 * quickly from a PTE_MARKER_UFFD_WP into PTE_MARKER_SWAPIN_ERROR.
+	 * quickly from a PTE_MARKER_UFFD_WP into PTE_MARKER_ERROR.
 	 * So is_pte_marker() check is not enough to safely drop the pte.
 	 */
 	if (pte_same(vmf->orig_pte, ptep_get(vmf->pte)))
@@ -3693,8 +3696,8 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf)
 		return VM_FAULT_SIGBUS;
 
 	/* Higher priority than uffd-wp when data corrupted */
-	if (marker & PTE_MARKER_SWAPIN_ERROR)
-		return VM_FAULT_SIGBUS;
+	if (marker & PTE_MARKER_ERROR)
+		return VM_FAULT_HWPOISON;
 
 	if (pte_marker_entry_uffd_wp(entry))
 		return pte_marker_handle_uffd_wp(vmf);
diff --git a/mm/mprotect.c b/mm/mprotect.c
index 6f658d483704..47d255c8c2f2 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -230,10 +230,10 @@ static long change_pte_range(struct mmu_gather *tlb,
 					newpte = pte_swp_mkuffd_wp(newpte);
 			} else if (is_pte_marker_entry(entry)) {
 				/*
-				 * Ignore swapin errors unconditionally,
+				 * Ignore error swap entries unconditionally,
 				 * because any access should sigbus anyway.
 				 */
-				if (is_swapin_error_entry(entry))
+				if (is_error_swp_entry(entry))
 					continue;
 				/*
 				 * If this is uffd-wp pte marker and we'd like
diff --git a/mm/shmem.c b/mm/shmem.c
index 2f2e0e618072..c0f408c2c020 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1707,7 +1707,7 @@ static void shmem_set_folio_swapin_error(struct inode *inode, pgoff_t index,
 	swp_entry_t swapin_error;
 	void *old;
 
-	swapin_error = make_swapin_error_entry();
+	swapin_error = make_error_swp_entry();
 	old = xa_cmpxchg_irq(&mapping->i_pages, index,
 			     swp_to_radix_entry(swap),
 			     swp_to_radix_entry(swapin_error), 0);
@@ -1752,7 +1752,7 @@ static int shmem_swapin_folio(struct inode *inode, pgoff_t index,
 	swap = radix_to_swp_entry(*foliop);
 	*foliop = NULL;
 
-	if (is_swapin_error_entry(swap))
+	if (is_error_swp_entry(swap))
 		return -EIO;
 
 	si = get_swap_device(swap);
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 8e6dde68b389..72e110387e67 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1773,7 +1773,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 			swp_entry = make_hwpoison_entry(swapcache);
 			page = swapcache;
 		} else {
-			swp_entry = make_swapin_error_entry();
+			swp_entry = make_error_swp_entry();
 		}
 		new_pte = swp_entry_to_pte(swp_entry);
 		ret = 0;

From patchwork Thu Jul 6 22:50:30 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 116878
Date: Thu, 6 Jul 2023 15:50:30 -0700
In-Reply-To: <20230706225037.1164380-1-axelrasmussen@google.com>
References: <20230706225037.1164380-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog
Message-ID: <20230706225037.1164380-3-axelrasmussen@google.com>
Subject: [PATCH v3 2/8] mm: userfaultfd: check for start + len overflow in
 validate_range
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Brian Geffon, Christian Brauner,
 David Hildenbrand, Gaosheng Cui, Huang Ying, Hugh Dickins,
 James Houghton, "Jan Alexander Steffens (heftig)", Jiaqi Yan,
 Jonathan Corbet, Kefeng Wang, "Liam R. Howlett", Miaohe Lin,
 Mike Kravetz, "Mike Rapoport (IBM)", Muchun Song, Nadav Amit,
 Naoya Horiguchi, Peter Xu, Ryan Roberts, Shuah Khan, Suleiman Souhlal,
 Suren Baghdasaryan, "T.J. Alumbaugh", Yu Zhao, ZhangPeng
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, Axel Rasmussen
X-Mailing-List: linux-kernel@vger.kernel.org

Most userfaultfd ioctls take a `start + len` range as an argument.
We have the validate_range helper to check that such ranges are valid.
However, some (but not all!) ioctls *also* check that `start + len`
doesn't wrap around (overflow).

Just check for this in validate_range. This saves some repetitive code,
and adds the check to some ioctls which weren't bothering to check for
it before.

Signed-off-by: Axel Rasmussen
Reviewed-by: Peter Xu
---
 fs/userfaultfd.c | 15 +++------------
 1 file changed, 3 insertions(+), 12 deletions(-)

diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index 7cecd49e078b..2e84684c46f0 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -1306,6 +1306,8 @@ static __always_inline int validate_range(struct mm_struct *mm,
 		return -EINVAL;
 	if (len > task_size - start)
 		return -EINVAL;
+	if (start + len <= start)
+		return -EINVAL;
 	return 0;
 }
 
@@ -1760,14 +1762,8 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
 	ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len);
 	if (ret)
 		goto out;
-	/*
-	 * double check for wraparound just in case. copy_from_user()
-	 * will later check uffdio_copy.src + uffdio_copy.len to fit
-	 * in the userland range.
-	 */
+
 	ret = -EINVAL;
-	if (uffdio_copy.src + uffdio_copy.len <= uffdio_copy.src)
-		goto out;
 	if (uffdio_copy.mode & ~(UFFDIO_COPY_MODE_DONTWAKE|UFFDIO_COPY_MODE_WP))
 		goto out;
 	if (uffdio_copy.mode & UFFDIO_COPY_MODE_WP)
@@ -1927,11 +1923,6 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg)
 		goto out;
 
 	ret = -EINVAL;
-	/* double check for wraparound just in case. */
-	if (uffdio_continue.range.start + uffdio_continue.range.len <=
-	    uffdio_continue.range.start) {
-		goto out;
-	}
 	if (uffdio_continue.mode & ~(UFFDIO_CONTINUE_MODE_DONTWAKE |
 				     UFFDIO_CONTINUE_MODE_WP))
 		goto out;

From patchwork Thu Jul 6 22:50:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 116872
Date: Thu, 6 Jul 2023 15:50:31 -0700
In-Reply-To: <20230706225037.1164380-1-axelrasmussen@google.com>
References: <20230706225037.1164380-1-axelrasmussen@google.com>
X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog
Message-ID: <20230706225037.1164380-4-axelrasmussen@google.com>
Subject: [PATCH v3 3/8] mm: userfaultfd: extract file size check out into a
 helper
From: Axel Rasmussen
To: Alexander Viro, Andrew Morton, Brian Geffon, Christian Brauner,
 David Hildenbrand, Gaosheng Cui, Huang Ying, Hugh Dickins,
 James Houghton, "Jan Alexander Steffens (heftig)", Jiaqi Yan,
 Jonathan Corbet, Kefeng Wang, "Liam R. Howlett", Miaohe Lin,
 Mike Kravetz, "Mike Rapoport (IBM)", Muchun Song, Nadav Amit,
 Naoya Horiguchi, Peter Xu, Ryan Roberts, Shuah Khan, Suleiman Souhlal,
 Suren Baghdasaryan, "T.J. Alumbaugh", Yu Zhao, ZhangPeng
Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-kselftest@vger.kernel.org, Axel Rasmussen
X-Mailing-List: linux-kernel@vger.kernel.org

This code is already duplicated twice, and UFFDIO_POISON will do the
same check a third time. So, it's worth extracting into a helper to save
repetitive lines of code.

Signed-off-by: Axel Rasmussen
Reviewed-by: Peter Xu
---
 mm/userfaultfd.c | 38 ++++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 18 deletions(-)

diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index a2bf37ee276d..4244ca7ee903 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -45,6 +45,22 @@ struct vm_area_struct *find_dst_vma(struct mm_struct *dst_mm,
 	return dst_vma;
 }
 
+/* Check if dst_addr is outside of file's size. Must be called with ptl held. */
+static bool mfill_file_over_size(struct vm_area_struct *dst_vma,
+				 unsigned long dst_addr)
+{
+	struct inode *inode;
+	pgoff_t offset, max_off;
+
+	if (!dst_vma->vm_file)
+		return false;
+
+	inode = dst_vma->vm_file->f_inode;
+	offset = linear_page_index(dst_vma, dst_addr);
+	max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+	return offset >= max_off;
+}
+
 /*
  * Install PTEs, to map dst_addr (within dst_vma) to page.
  *
@@ -64,8 +80,6 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd,
 	bool page_in_cache = page_mapping(page);
 	spinlock_t *ptl;
 	struct folio *folio;
-	struct inode *inode;
-	pgoff_t offset, max_off;
 
 	_dst_pte = mk_pte(page, dst_vma->vm_page_prot);
 	_dst_pte = pte_mkdirty(_dst_pte);
@@ -81,14 +95,9 @@ int mfill_atomic_install_pte(pmd_t *dst_pmd,
 	if (!dst_pte)
 		goto out;
 
-	if (vma_is_shmem(dst_vma)) {
-		/* serialize against truncate with the page table lock */
-		inode = dst_vma->vm_file->f_inode;
-		offset = linear_page_index(dst_vma, dst_addr);
-		max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+	if (mfill_file_over_size(dst_vma, dst_addr)) {
 		ret = -EFAULT;
-		if (unlikely(offset >= max_off))
-			goto out_unlock;
+		goto out_unlock;
 	}
 
 	ret = -EEXIST;
@@ -211,8 +220,6 @@ static int mfill_atomic_pte_zeropage(pmd_t *dst_pmd,
 	pte_t _dst_pte, *dst_pte;
 	spinlock_t *ptl;
 	int ret;
-	pgoff_t offset, max_off;
-	struct inode *inode;
 
 	_dst_pte = pte_mkspecial(pfn_pte(my_zero_pfn(dst_addr),
 					 dst_vma->vm_page_prot));
@@ -220,14 +227,9 @@ static int mfill_atomic_pte_zeropage(pmd_t *dst_pmd,
 	dst_pte = pte_offset_map_lock(dst_vma->vm_mm, dst_pmd, dst_addr, &ptl);
 	if (!dst_pte)
 		goto out;
-	if (dst_vma->vm_file) {
-		/* the shmem MAP_PRIVATE case requires checking the i_size */
-		inode = dst_vma->vm_file->f_inode;
-		offset = linear_page_index(dst_vma, dst_addr);
-		max_off = DIV_ROUND_UP(i_size_read(inode), PAGE_SIZE);
+	if (mfill_file_over_size(dst_vma, dst_addr)) {
 		ret = -EFAULT;
-		if (unlikely(offset >= max_off))
-			goto out_unlock;
+		goto out_unlock;
 	}
 	ret = -EEXIST;
 	if (!pte_none(ptep_get(dst_pte)))

From patchwork Thu Jul 6 22:50:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Axel Rasmussen
X-Patchwork-Id: 116873
sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id lt2-20020a17090b354200b0024e35ef410fsi587500pjb.131.2023.07.06.15.52.37; Thu, 06 Jul 2023 15:52:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=D2k9xCsf; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232185AbjGFWvZ (ORCPT + 99 others); Thu, 6 Jul 2023 18:51:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56378 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231887AbjGFWvO (ORCPT ); Thu, 6 Jul 2023 18:51:14 -0400 Received: from mail-yw1-x1149.google.com (mail-yw1-x1149.google.com [IPv6:2607:f8b0:4864:20::1149]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D79C51FC9 for ; Thu, 6 Jul 2023 15:50:58 -0700 (PDT) Received: by mail-yw1-x1149.google.com with SMTP id 00721157ae682-577323ba3d5so34198647b3.0 for ; Thu, 06 Jul 2023 15:50:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1688683858; x=1691275858; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=PJqf0m2qw+Xjs8Quv7ufLANzkWFzCscIfWrhHyj74r4=; b=D2k9xCsfeg61xPSIpw+bsBr3jPwLNm9nVEHyGhqfkG8lUTP35cd6IFL212tdgreqNz 0YlkyIjNjF30q+hjO/oW6G1cNPIRff4f/oHqF7Wy+jVNAq1/WuXr6jz15FTr0ttknNyr 
0KjASHAlCDktsRxguFNoBBTItvv/YFICr/K4mQRPcS2hhFcyPon/giYdNfLAM48U0Jx0 eXvwik0EB73EKMqASVZMZZsmmHmToMBPVsvzhfYgIrpjZiLkZdUoFdyj4hnakuxSym+M ulEcJTqFS3AoXIeAsnUQhvNIC+QtVFc1duLHDu/kIEV70/F62oZksyCAFiDKBKPrHFko dgvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688683858; x=1691275858; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=PJqf0m2qw+Xjs8Quv7ufLANzkWFzCscIfWrhHyj74r4=; b=GP3MbGk/7mhi1Lx0FtESLQNLMvmwIx8cvBWDvsmaaMp4Ld7caFRo0fUQ30c7h2mmEJ qDZcdx7/hvyDcc1ajZVplJ0bc7NFlM7E0u20MombEQS6lx9ZzhGcaRDkAY/HGkORo/I5 i1nvWmt0VCBe908NP7sbtTr166gkdwM3bmHjBDehCwT9XaA2ym3AqIVR5BTMNRBtP1RX 7e7nV+PT2tKvttgb5T54oDC5e+6X7XZRNDBZEHAO93RbzwIHMoOIoM1sU8mniuK8A+P6 d/Ld8ncSZc0UQy81NFtgM+gAx6ThCkrdIAl8yR5cgEfJ6JQFpQBpL+Kh9sviI49t5rj3 fstA== X-Gm-Message-State: ABy/qLZCpKMSuRIp2JhbrTvhND+GlA9Oa1RZkuudrJXORy0V5rsYVd6h bTK8Nf4+ibc4YOMjedJrRdCPY8vFzdcjhWAesf3q X-Received: from axel.svl.corp.google.com ([2620:15c:2a3:200:bec3:2b1c:87a:fca2]) (user=axelrasmussen job=sendgmr) by 2002:a25:d17:0:b0:bd5:dc2d:9d7f with SMTP id 23-20020a250d17000000b00bd5dc2d9d7fmr78836ybn.4.1688683857962; Thu, 06 Jul 2023 15:50:57 -0700 (PDT) Date: Thu, 6 Jul 2023 15:50:32 -0700 In-Reply-To: <20230706225037.1164380-1-axelrasmussen@google.com> Mime-Version: 1.0 References: <20230706225037.1164380-1-axelrasmussen@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230706225037.1164380-5-axelrasmussen@google.com> Subject: [PATCH v3 4/8] mm: userfaultfd: add new UFFDIO_POISON ioctl From: Axel Rasmussen To: Alexander Viro , Andrew Morton , Brian Geffon , Christian Brauner , David Hildenbrand , Gaosheng Cui , Huang Ying , Hugh Dickins , James Houghton , "Jan Alexander Steffens (heftig)" , Jiaqi Yan , Jonathan Corbet , Kefeng Wang , "Liam R. 
Howlett" , Miaohe Lin , Mike Kravetz , "Mike Rapoport (IBM)" , Muchun Song , Nadav Amit , Naoya Horiguchi , Peter Xu , Ryan Roberts , Shuah Khan , Suleiman Souhlal , Suren Baghdasaryan , "T.J. Alumbaugh" , Yu Zhao , ZhangPeng Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Axel Rasmussen X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1770713484362353061?= X-GMAIL-MSGID: =?utf-8?q?1770713484362353061?= The basic idea here is to "simulate" memory poisoning for VMs. A VM running on some host might encounter a memory error, after which some page(s) are poisoned (i.e., future accesses SIGBUS). They expect that once poisoned, pages can never become "un-poisoned". So, when we live migrate the VM, we need to preserve the poisoned status of these pages. When live migrating, we try to get the guest running on its new host as quickly as possible. So, we start it running before all memory has been copied, and before we're certain which pages should be poisoned or not. So the basic way to use this new feature is: - On the new host, the guest's memory is registered with userfaultfd, in either MISSING or MINOR mode (doesn't really matter for this purpose). - On any first access, we get a userfaultfd event. At this point we can communicate with the old host to find out if the page was poisoned. - If so, we can respond with a UFFDIO_POISON - this places a swap marker so any future accesses will SIGBUS. 
Because the pte is now "present", future accesses won't generate more userfaultfd events, they'll just SIGBUS directly. UFFDIO_POISON does not handle unmapping previously-present PTEs. This isn't needed, because during live migration we want to intercept all accesses with userfaultfd (not just writes, so WP mode isn't useful for this). So whether minor or missing mode is being used (or both), the PTE won't be present in any case, so handling that case isn't needed. Similarly, UFFDIO_POISON won't replace existing PTE markers. This might be okay to do, but it seems to be safer to just refuse to overwrite any existing entry (like a UFFD_WP PTE marker). Signed-off-by: Axel Rasmussen Acked-by: Peter Xu --- fs/userfaultfd.c | 58 ++++++++++++++++++++++++++++++++ include/linux/userfaultfd_k.h | 4 +++ include/uapi/linux/userfaultfd.h | 16 +++++++++ mm/userfaultfd.c | 48 +++++++++++++++++++++++++- 4 files changed, 125 insertions(+), 1 deletion(-) diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 2e84684c46f0..53a7220c4679 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1956,6 +1956,61 @@ static int userfaultfd_continue(struct userfaultfd_ctx *ctx, unsigned long arg) return ret; } +static inline int userfaultfd_poison(struct userfaultfd_ctx *ctx, unsigned long arg) +{ + __s64 ret; + struct uffdio_poison uffdio_poison; + struct uffdio_poison __user *user_uffdio_poison; + struct userfaultfd_wake_range range; + + user_uffdio_poison = (struct uffdio_poison __user *)arg; + + ret = -EAGAIN; + if (atomic_read(&ctx->mmap_changing)) + goto out; + + ret = -EFAULT; + if (copy_from_user(&uffdio_poison, user_uffdio_poison, + /* don't copy the output fields */ + sizeof(uffdio_poison) - (sizeof(__s64)))) + goto out; + + ret = validate_range(ctx->mm, uffdio_poison.range.start, + uffdio_poison.range.len); + if (ret) + goto out; + + ret = -EINVAL; + if (uffdio_poison.mode & ~UFFDIO_POISON_MODE_DONTWAKE) + goto out; + + if (mmget_not_zero(ctx->mm)) { + ret = 
mfill_atomic_poison(ctx->mm, uffdio_poison.range.start, + uffdio_poison.range.len, + &ctx->mmap_changing, 0); + mmput(ctx->mm); + } else { + return -ESRCH; + } + + if (unlikely(put_user(ret, &user_uffdio_poison->updated))) + return -EFAULT; + if (ret < 0) + goto out; + + /* len == 0 would wake all */ + BUG_ON(!ret); + range.len = ret; + if (!(uffdio_poison.mode & UFFDIO_POISON_MODE_DONTWAKE)) { + range.start = uffdio_poison.range.start; + wake_userfault(ctx, &range); + } + ret = range.len == uffdio_poison.range.len ? 0 : -EAGAIN; + +out: + return ret; +} + static inline unsigned int uffd_ctx_features(__u64 user_features) { /* @@ -2057,6 +2112,9 @@ static long userfaultfd_ioctl(struct file *file, unsigned cmd, case UFFDIO_CONTINUE: ret = userfaultfd_continue(ctx, arg); break; + case UFFDIO_POISON: + ret = userfaultfd_poison(ctx, arg); + break; } return ret; } diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h index ac7b0c96d351..ac8c6854097c 100644 --- a/include/linux/userfaultfd_k.h +++ b/include/linux/userfaultfd_k.h @@ -46,6 +46,7 @@ enum mfill_atomic_mode { MFILL_ATOMIC_COPY, MFILL_ATOMIC_ZEROPAGE, MFILL_ATOMIC_CONTINUE, + MFILL_ATOMIC_POISON, NR_MFILL_ATOMIC_MODES, }; @@ -83,6 +84,9 @@ extern ssize_t mfill_atomic_zeropage(struct mm_struct *dst_mm, extern ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long dst_start, unsigned long len, atomic_t *mmap_changing, uffd_flags_t flags); +extern ssize_t mfill_atomic_poison(struct mm_struct *dst_mm, unsigned long start, + unsigned long len, atomic_t *mmap_changing, + uffd_flags_t flags); extern int mwriteprotect_range(struct mm_struct *dst_mm, unsigned long start, unsigned long len, bool enable_wp, atomic_t *mmap_changing); diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index 66dd4cd277bd..b5f07eacc697 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -71,6 +71,7 @@ #define _UFFDIO_ZEROPAGE (0x04) #define 
_UFFDIO_WRITEPROTECT (0x06) #define _UFFDIO_CONTINUE (0x07) +#define _UFFDIO_POISON (0x08) #define _UFFDIO_API (0x3F) /* userfaultfd ioctl ids */ @@ -91,6 +92,8 @@ struct uffdio_writeprotect) #define UFFDIO_CONTINUE _IOWR(UFFDIO, _UFFDIO_CONTINUE, \ struct uffdio_continue) +#define UFFDIO_POISON _IOWR(UFFDIO, _UFFDIO_POISON, \ + struct uffdio_poison) /* read() structure */ struct uffd_msg { @@ -225,6 +228,7 @@ struct uffdio_api { #define UFFD_FEATURE_EXACT_ADDRESS (1<<11) #define UFFD_FEATURE_WP_HUGETLBFS_SHMEM (1<<12) #define UFFD_FEATURE_WP_UNPOPULATED (1<<13) +#define UFFD_FEATURE_POISON (1<<14) __u64 features; __u64 ioctls; @@ -321,6 +325,18 @@ struct uffdio_continue { __s64 mapped; }; +struct uffdio_poison { + struct uffdio_range range; +#define UFFDIO_POISON_MODE_DONTWAKE ((__u64)1<<0) + __u64 mode; + + /* + * Fields below here are written by the ioctl and must be at the end: + * the copy_from_user will not read past here. + */ + __s64 updated; +}; + /* * Flags for the userfaultfd(2) system call itself. */ diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 4244ca7ee903..899aa621d7c1 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -288,6 +288,40 @@ static int mfill_atomic_pte_continue(pmd_t *dst_pmd, goto out; } +/* Handles UFFDIO_POISON for all non-hugetlb VMAs. */ +static int mfill_atomic_pte_poison(pmd_t *dst_pmd, + struct vm_area_struct *dst_vma, + unsigned long dst_addr, + uffd_flags_t flags) +{ + int ret; + struct mm_struct *dst_mm = dst_vma->vm_mm; + pte_t _dst_pte, *dst_pte; + spinlock_t *ptl; + + _dst_pte = make_pte_marker(PTE_MARKER_ERROR); + dst_pte = pte_offset_map_lock(dst_mm, dst_pmd, dst_addr, &ptl); + + if (mfill_file_over_size(dst_vma, dst_addr)) { + ret = -EFAULT; + goto out_unlock; + } + + ret = -EEXIST; + /* Refuse to overwrite any PTE, even a PTE marker (e.g. UFFD WP). 
*/ + if (!pte_none(*dst_pte)) + goto out_unlock; + + set_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); + + /* No need to invalidate - it was non-present before */ + update_mmu_cache(dst_vma, dst_addr, dst_pte); + ret = 0; +out_unlock: + pte_unmap_unlock(dst_pte, ptl); + return ret; +} + static pmd_t *mm_alloc_pmd(struct mm_struct *mm, unsigned long address) { pgd_t *pgd; @@ -339,7 +373,8 @@ static __always_inline ssize_t mfill_atomic_hugetlb( * by THP. Since we can not reliably insert a zero page, this * feature is not supported. */ - if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE)) { + if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE) || + uffd_flags_mode_is(flags, MFILL_ATOMIC_POISON)) { mmap_read_unlock(dst_mm); return -EINVAL; } @@ -483,6 +518,9 @@ static __always_inline ssize_t mfill_atomic_pte(pmd_t *dst_pmd, if (uffd_flags_mode_is(flags, MFILL_ATOMIC_CONTINUE)) { return mfill_atomic_pte_continue(dst_pmd, dst_vma, dst_addr, flags); + } else if (uffd_flags_mode_is(flags, MFILL_ATOMIC_POISON)) { + return mfill_atomic_pte_poison(dst_pmd, dst_vma, + dst_addr, flags); } /* @@ -704,6 +742,14 @@ ssize_t mfill_atomic_continue(struct mm_struct *dst_mm, unsigned long start, uffd_flags_set_mode(flags, MFILL_ATOMIC_CONTINUE)); } +ssize_t mfill_atomic_poison(struct mm_struct *dst_mm, unsigned long start, + unsigned long len, atomic_t *mmap_changing, + uffd_flags_t flags) +{ + return mfill_atomic(dst_mm, start, 0, len, mmap_changing, + uffd_flags_set_mode(flags, MFILL_ATOMIC_POISON)); +} + long uffd_wp_range(struct vm_area_struct *dst_vma, unsigned long start, unsigned long len, bool enable_wp) { From patchwork Thu Jul 6 22:50:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Axel Rasmussen X-Patchwork-Id: 116876 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f45:0:b0:3ea:f831:8777 with SMTP id v5csp2880560vqx; Thu, 6 Jul 2023 15:54:21 -0700 (PDT) X-Google-Smtp-Source: 
APBJJlH6AGPqg7QWTSEBxz59QQ7yOPak2eLh5+Gcs80xf3EG6azIOrBP3/CqFRhX19ZFLDbNiu3S X-Received: by 2002:a17:90a:b311:b0:263:71ff:d0c3 with SMTP id d17-20020a17090ab31100b0026371ffd0c3mr3271228pjr.8.1688684061382; Thu, 06 Jul 2023 15:54:21 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688684061; cv=none; d=google.com; s=arc-20160816; b=cwhxE6G2Iop2Oe9DlsXcHHGiTNcdA0rL8pQIPb0F/dqB8RSWd4DkbUgZVn4Ec9oYOF SGMmeqEMQXjgi+KFWy2XRorvDbevgkh7XBFD4CELkmKho+Rs32GhRunMJ9q05++E7jCs nhJV/6S5I8HzkVg7gJ4+RusYtct5eUd5ZGIJEKb4Sf8BR2p0BT115pSq7ANf2f9aWpL3 lOHsDgYqxTypBg+z+Zva0AqqRLYLqJtFzG+8WnmXDBscC6eHmxhJE+LQM9+G3Qza0+hQ qOCFtzmTS/I4xhg6gASFXgL+WLeevm5JKRmP6JUJzMVqOl6p2hiNIzcniEnITQ7SsyaK T6tQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=Qwjh0UAlAYXVze8TCXBt1h/rZKLWFE0vCD0k3jVLMVk=; fh=Opjbx899L+35XA6SNQJRoMm3wFpvelEj3IooMxkZLjI=; b=UKbOh7MIGwVDc8+cD8MGMn+gqWeJeMsNqSQ5HXnSHfXqg09WC/O8T466LUwJYIn4kS 2MSxUOSz6kUmCCx+Q5ITdx/DfhV0b+u+WBPsE7THiHsEfJdvf4NL/sQoSx4DYt3+gjFz ClecEpMhEeNWWDfzAhRKEamQYlOCE1APwZpNtQPmy0rtZV14FUxJ4qaHnltIYgaOWwA8 3T+GOXONMAIz1BbGKKgSnxeAII9HKzSq2R1vipDhRqOUgMu0VpRxXzgY1RNT2jwT9wmT w7WUTxL+H6NYMSgPEiq+7EE94MsbfMeIgCveNbCiLnAtkyAeb5d4hsoS0un9dGFvZ0eK L9sg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=Y2yizmMp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id pi12-20020a17090b1e4c00b00263d559dbf1si722492pjb.55.2023.07.06.15.54.06; Thu, 06 Jul 2023 15:54:21 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=Y2yizmMp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232029AbjGFWvb (ORCPT + 99 others); Thu, 6 Jul 2023 18:51:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56798 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231962AbjGFWvP (ORCPT ); Thu, 6 Jul 2023 18:51:15 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 60F691FE8 for ; Thu, 6 Jul 2023 15:51:01 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-56942442eb0so14211477b3.1 for ; Thu, 06 Jul 2023 15:51:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1688683860; x=1691275860; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=Qwjh0UAlAYXVze8TCXBt1h/rZKLWFE0vCD0k3jVLMVk=; b=Y2yizmMpjzL0EwPNfj1fzhhPy4fB1Fio/333NC+Pl3+CabD5WvQiBObIAnbDvfdIz0 55qqIyoV/B43yRJQMgc5PRzcnEO7oZ13pzUFcnMsRpYGZDSVSZeMWMJRv9WA6h3i6GfT TXaYPaz84vks82iZ0vbDvAGXP/Ef7g9EhKOVR1CvAJFhj45Vp0/ta+dk/HgbDOBioOwa 2BKF9RsPqJx9l5jS/r3knSsrH/fNyDrHo/fZURqQr/YXnDI5GwojI2TLVLkJxGr1ous4 fBuKAKMZ/SIafScZO/gr72YFQKtZJ6oKMnuQ1yiJQjaFtVwKCY9iv6k7Pl0tvv06AM6a 3A8g== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688683860; x=1691275860; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=Qwjh0UAlAYXVze8TCXBt1h/rZKLWFE0vCD0k3jVLMVk=; b=F1PZt3LzZsWhOcehX913UjRnrh982k6ZJiYpV5R2ePEHRkCCvrMnTfwWxtC+YFTjP4 EmOXReFP95w106KUD0hYaxGOlvjun545vYUCTEHq2qZoqnkj+ku8/eZnMEkkuz+o6sbi dF1KK83uAZPQ7rSQARsgpLJqIK4gNbEo6408mvu1cJecJXS6U+zLGQFl2SsLgBRXaFDy tMfPcez1QPA/BkYReQXCh20ZshKVA7oHXlcp+lmc/oFYEepwc9hgcEVQILtH85d/zmf0 coC1mjXDJ/NKeQbIeWkJYCo8R/jHt+HOwQN+ALhOOJk6/vSZ4pZNuW159Yq2iIXq5Jrm I+aQ== X-Gm-Message-State: ABy/qLbkX7zdVKIypukAFUVApBkRE14Ky3ElPzPJdF7CFYh0exHANV3u lkmkb71Mdsg+vjz5a4EnnuWvuOt0x92sQ7VvecfT X-Received: from axel.svl.corp.google.com ([2620:15c:2a3:200:bec3:2b1c:87a:fca2]) (user=axelrasmussen job=sendgmr) by 2002:a81:c642:0:b0:576:d9ea:1331 with SMTP id q2-20020a81c642000000b00576d9ea1331mr27910ywj.4.1688683860327; Thu, 06 Jul 2023 15:51:00 -0700 (PDT) Date: Thu, 6 Jul 2023 15:50:33 -0700 In-Reply-To: <20230706225037.1164380-1-axelrasmussen@google.com> Mime-Version: 1.0 References: <20230706225037.1164380-1-axelrasmussen@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230706225037.1164380-6-axelrasmussen@google.com> Subject: [PATCH v3 5/8] mm: userfaultfd: support UFFDIO_POISON for hugetlbfs From: Axel Rasmussen To: Alexander Viro , Andrew Morton , Brian Geffon , Christian Brauner , David Hildenbrand , Gaosheng Cui , Huang Ying , Hugh Dickins , James Houghton , "Jan Alexander Steffens (heftig)" , Jiaqi Yan , Jonathan Corbet , Kefeng Wang , "Liam R. Howlett" , Miaohe Lin , Mike Kravetz , "Mike Rapoport (IBM)" , Muchun Song , Nadav Amit , Naoya Horiguchi , Peter Xu , Ryan Roberts , Shuah Khan , Suleiman Souhlal , Suren Baghdasaryan , "T.J. 
Alumbaugh" , Yu Zhao , ZhangPeng Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Axel Rasmussen X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1770713578308857301?= X-GMAIL-MSGID: =?utf-8?q?1770713578308857301?= The behavior here is the same as it is for anon/shmem. This is done separately because hugetlb pte marker handling is a bit different. Signed-off-by: Axel Rasmussen Acked-by: Peter Xu --- mm/hugetlb.c | 19 +++++++++++++++++++ mm/userfaultfd.c | 3 +-- 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 934e129d9939..20c5f6a5420a 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -6263,6 +6263,25 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte, int writable; bool folio_in_pagecache = false; + if (uffd_flags_mode_is(flags, MFILL_ATOMIC_POISON)) { + ptl = huge_pte_lock(h, dst_mm, dst_pte); + + /* Don't overwrite any existing PTEs (even markers) */ + if (!huge_pte_none(huge_ptep_get(dst_pte))) { + spin_unlock(ptl); + return -EEXIST; + } + + _dst_pte = make_pte_marker(PTE_MARKER_ERROR); + set_huge_pte_at(dst_mm, dst_addr, dst_pte, _dst_pte); + + /* No need to invalidate - it was non-present before */ + update_mmu_cache(dst_vma, dst_addr, dst_pte); + + spin_unlock(ptl); + return 0; + } + if (is_continue) { ret = -EFAULT; folio = filemap_lock_folio(mapping, idx); diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 899aa621d7c1..9ce129fdd596 100644 --- a/mm/userfaultfd.c +++ 
b/mm/userfaultfd.c @@ -373,8 +373,7 @@ static __always_inline ssize_t mfill_atomic_hugetlb( * by THP. Since we can not reliably insert a zero page, this * feature is not supported. */ - if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE) || - uffd_flags_mode_is(flags, MFILL_ATOMIC_POISON)) { + if (uffd_flags_mode_is(flags, MFILL_ATOMIC_ZEROPAGE)) { mmap_read_unlock(dst_mm); return -EINVAL; } From patchwork Thu Jul 6 22:50:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Axel Rasmussen X-Patchwork-Id: 116874 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f45:0:b0:3ea:f831:8777 with SMTP id v5csp2880165vqx; Thu, 6 Jul 2023 15:53:11 -0700 (PDT) X-Google-Smtp-Source: APBJJlFs95qq9njj1GmZgjCp7SUdq0axJOw1ZKqfxZYwwqFbMf2ItgbCU8kR2SNZm7wZMAupmfN+ X-Received: by 2002:a05:6830:1d7a:b0:6b7:5c4:42e9 with SMTP id l26-20020a0568301d7a00b006b705c442e9mr3331081oti.36.1688683990854; Thu, 06 Jul 2023 15:53:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1688683990; cv=none; d=google.com; s=arc-20160816; b=wV7G9polyKxsRopmWHlKVk4MY1m+Uo79EV2jkkgJGmJyyvoxzAaIQ2ySop9wyhGpyy 1aYDHWChBKgGZlupWzx57QYxeupyx3nKQc9oVoqIf3O28u7wjbHdXgnx96HYHqHri2U2 F6ollBQk46mkfSAmQ0VWdx5XHdd1PUT52bmHbgiupmdz++XYs6BXWYOKSNZKAiM+SxuB 6Dbdlca+0tpxwJSv3JsZnwJm5y73ew77Pn9g1tCKJnw1kSsOj6W018fr8u9CaEgg9jcV j8DxzhKzW6T9vbA8oQSAc/l+DJRriYyplA53YNeiISDV8ViMwiu3MmxnJsnOqFQDtXrF ce+Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:dkim-signature; bh=7OMVl3YxFVN2A+Ry/UxbO1YmCHHRZcxw7VDxaJg9ek4=; fh=Opjbx899L+35XA6SNQJRoMm3wFpvelEj3IooMxkZLjI=; b=poUBbaHjbcMuaqb6ezFuVt+zbjhH9mI8dl40grr1m/Tvi4vFxnGBswaC6S4Za7SQtO 7nEeU0JSnsJKFbykk3fyy0vEG4wFh4ODxVbJeP8e8WVyuI+aNt/6RIAsg9VUboatJE20 yQo7GqYBB4Y9QwdqC3ha9yHrJrxgD4G4ViC5+Zt0tIx5OuvOHUvuyd4fisIOK2DYG34a 
PSKuzYtJnW5yEWgNxjQDn8mZPZoB+EMuFKRYZ1N5JQBddKmWmKznDVQ06jXnznxnOh2E OoxpS9J/Vov68sNUiRQYCJ9v6mKCqtsU+fIDe4u3ZfTT2xX6NdE6SzgYwu97KhorKZqM uSng== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=MUCpPcJ5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x16-20020a631710000000b00553810ea8e5si2271546pgl.303.2023.07.06.15.52.56; Thu, 06 Jul 2023 15:53:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20221208 header.b=MUCpPcJ5; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232290AbjGFWve (ORCPT + 99 others); Thu, 6 Jul 2023 18:51:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56804 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231666AbjGFWvP (ORCPT ); Thu, 6 Jul 2023 18:51:15 -0400 Received: from mail-yw1-x114a.google.com (mail-yw1-x114a.google.com [IPv6:2607:f8b0:4864:20::114a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A5E01FF6 for ; Thu, 6 Jul 2023 15:51:03 -0700 (PDT) Received: by mail-yw1-x114a.google.com with SMTP id 00721157ae682-57704aace46so13053777b3.2 for ; Thu, 06 Jul 2023 15:51:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20221208; t=1688683862; 
x=1691275862; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=7OMVl3YxFVN2A+Ry/UxbO1YmCHHRZcxw7VDxaJg9ek4=; b=MUCpPcJ5Cgcqj6eW6sHdBBG776f5QGOT4kgvGYw/zAIra7r+ur1/6MP7Dq6MkoSZmq 4xRE3PYyIXD4h8qZWQWCXhWmcWP1AwGUjGAf2ElqV4yAkcG0gjhOddOoakQQadmYz+QD Is/7A9d1hSTithrMC6POOEAOeFEch61LmagZDJM30S0vYIcKNDDe5/OQ3z3oRxh6zHQl utpA3lGtPFljt8j9Ntkp9LhkEyWfCZ3raeDW8NKQTH9gVvZpq5Wizvy5vkyeCh1Pwwja CGEW2G1lqZ2p9p/Ga+J136aVMeRPFS8hQNLl5QI5DDjAJCWHkUOphl3NTJbnSXMCQUoE o8sw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1688683862; x=1691275862; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=7OMVl3YxFVN2A+Ry/UxbO1YmCHHRZcxw7VDxaJg9ek4=; b=WsKGT1buqPxOx3vTQVAwn6tsDwGEqPPKgzaJ0p7d8DAK6XVY8Ke9j3iTrb6UyUHWYk lgloddgXs6FnF+tDo1BiD6mAl87SohZEqW8TVHqstx4OiFz30dEI11zCN4i2caRiSloK 5fofFne+gkrNxlzFWWfL7bKXS5RszmNipP6ZavP0T+wiCyp/WuZbt11o2Ilft43hfp5M FiIZjA1cz6CWoqs02J4+YB13Fxsnh2jmqVVUzy/JYfCj/nDp6AYq4XT668QTREpae9Aq yP3chGvwhJ7dqFXlGITV+yAAtrkhcBKMyghnDutgEipKq/xAs9pFLQ8OkIfggRkOEITK /soA== X-Gm-Message-State: ABy/qLbuWKyEsCMtAJ7WBBY6jueADjo8chRuHmrqyGSqsa5Lx8U87zQ8 OeU9GdZF2UR+PK2hc8XIO8+0Xy+ur4Go+EUttAhJ X-Received: from axel.svl.corp.google.com ([2620:15c:2a3:200:bec3:2b1c:87a:fca2]) (user=axelrasmussen job=sendgmr) by 2002:a81:af07:0:b0:565:a33a:a49f with SMTP id n7-20020a81af07000000b00565a33aa49fmr25333ywh.6.1688683862421; Thu, 06 Jul 2023 15:51:02 -0700 (PDT) Date: Thu, 6 Jul 2023 15:50:34 -0700 In-Reply-To: <20230706225037.1164380-1-axelrasmussen@google.com> Mime-Version: 1.0 References: <20230706225037.1164380-1-axelrasmussen@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230706225037.1164380-7-axelrasmussen@google.com> Subject: [PATCH v3 6/8] mm: userfaultfd: document and enable new UFFDIO_POISON feature From: Axel Rasmussen To: Alexander Viro 
, Andrew Morton , Brian Geffon , Christian Brauner , David Hildenbrand , Gaosheng Cui , Huang Ying , Hugh Dickins , James Houghton , "Jan Alexander Steffens (heftig)" , Jiaqi Yan , Jonathan Corbet , Kefeng Wang , "Liam R. Howlett" , Miaohe Lin , Mike Kravetz , "Mike Rapoport (IBM)" , Muchun Song , Nadav Amit , Naoya Horiguchi , Peter Xu , Ryan Roberts , Shuah Khan , Suleiman Souhlal , Suren Baghdasaryan , "T.J. Alumbaugh" , Yu Zhao , ZhangPeng Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Axel Rasmussen X-Spam-Status: No, score=-9.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1770713504244195493?= X-GMAIL-MSGID: =?utf-8?q?1770713504244195493?= Update the userfaultfd API to advertise this feature as part of feature flags and supported ioctls (returned upon registration). Add basic documentation describing the new feature. Acked-by: Peter Xu Signed-off-by: Axel Rasmussen --- Documentation/admin-guide/mm/userfaultfd.rst | 15 +++++++++++++++ include/uapi/linux/userfaultfd.h | 9 ++++++--- 2 files changed, 21 insertions(+), 3 deletions(-) diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst index 7c304e432205..4349a8c2b978 100644 --- a/Documentation/admin-guide/mm/userfaultfd.rst +++ b/Documentation/admin-guide/mm/userfaultfd.rst @@ -244,6 +244,21 @@ write-protected (so future writes will also result in a WP fault). 
These ioctls support a mode flag (``UFFDIO_COPY_MODE_WP`` or ``UFFDIO_CONTINUE_MODE_WP`` respectively) to configure the mapping this way. +Memory Poisoning Emulation +-------------------------- + +In response to a fault (either missing or minor), an action userspace can +take to "resolve" it is to issue a ``UFFDIO_POISON``. This will cause any +future faulters to either get a SIGBUS, or in KVM's case the guest will +receive an MCE as if there were hardware memory poisoning. + +This is used to emulate hardware memory poisoning. Imagine a VM running on a +machine which experiences a real hardware memory error. Later, we live migrate +the VM to another physical machine. Since we want the migration to be +transparent to the guest, we want that same address range to act as if it were +still poisoned, even though it's on a new physical host which ostensibly +doesn't have a memory error in the exact same spot. + QEMU/KVM ======== diff --git a/include/uapi/linux/userfaultfd.h b/include/uapi/linux/userfaultfd.h index b5f07eacc697..62151706c5a3 100644 --- a/include/uapi/linux/userfaultfd.h +++ b/include/uapi/linux/userfaultfd.h @@ -39,7 +39,8 @@ UFFD_FEATURE_MINOR_SHMEM | \ UFFD_FEATURE_EXACT_ADDRESS | \ UFFD_FEATURE_WP_HUGETLBFS_SHMEM | \ - UFFD_FEATURE_WP_UNPOPULATED) + UFFD_FEATURE_WP_UNPOPULATED | \ + UFFD_FEATURE_POISON) #define UFFD_API_IOCTLS \ ((__u64)1 << _UFFDIO_REGISTER | \ (__u64)1 << _UFFDIO_UNREGISTER | \ @@ -49,12 +50,14 @@ (__u64)1 << _UFFDIO_COPY | \ (__u64)1 << _UFFDIO_ZEROPAGE | \ (__u64)1 << _UFFDIO_WRITEPROTECT | \ - (__u64)1 << _UFFDIO_CONTINUE) + (__u64)1 << _UFFDIO_CONTINUE | \ + (__u64)1 << _UFFDIO_POISON) #define UFFD_API_RANGE_IOCTLS_BASIC \ ((__u64)1 << _UFFDIO_WAKE | \ (__u64)1 << _UFFDIO_COPY | \ + (__u64)1 << _UFFDIO_WRITEPROTECT | \ (__u64)1 << _UFFDIO_CONTINUE | \ - (__u64)1 << _UFFDIO_WRITEPROTECT) + (__u64)1 << _UFFDIO_POISON) /* * Valid ioctl command number range with this API is from 0x00 to From patchwork Thu Jul 6 22:50:35 2023 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Axel Rasmussen X-Patchwork-Id: 116875 Date: Thu, 6 Jul 2023 15:50:35 -0700 In-Reply-To: <20230706225037.1164380-1-axelrasmussen@google.com> Mime-Version: 1.0 References: <20230706225037.1164380-1-axelrasmussen@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230706225037.1164380-8-axelrasmussen@google.com> Subject: [PATCH v3 7/8] selftests/mm: refactor uffd_poll_thread to allow custom fault handlers From: Axel Rasmussen To: Alexander Viro , Andrew Morton , Brian Geffon , Christian Brauner , David Hildenbrand , Gaosheng Cui , Huang Ying , Hugh Dickins , James Houghton , "Jan Alexander Steffens (heftig)" , Jiaqi Yan , Jonathan Corbet , Kefeng Wang , "Liam R. Howlett" , Miaohe Lin , Mike Kravetz , "Mike Rapoport (IBM)" , Muchun Song , Nadav Amit , Naoya Horiguchi , Peter Xu , Ryan Roberts , Shuah Khan , Suleiman Souhlal , Suren Baghdasaryan , "T.J. 
Alumbaugh" , Yu Zhao , ZhangPeng Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Axel Rasmussen Previously, we had "one fault handler to rule them all", which used several branches to deal with all of the scenarios required by all of the various tests. In upcoming patches, I plan to add a new test, which has its own slightly different fault handling logic. Instead of continuing to add cruft to the existing fault handler, let's allow tests to define custom ones, separate from other tests. 
Signed-off-by: Axel Rasmussen --- tools/testing/selftests/mm/uffd-common.c | 5 ++++- tools/testing/selftests/mm/uffd-common.h | 3 +++ tools/testing/selftests/mm/uffd-stress.c | 12 +++++++----- 3 files changed, 14 insertions(+), 6 deletions(-) diff --git a/tools/testing/selftests/mm/uffd-common.c b/tools/testing/selftests/mm/uffd-common.c index ba20d7504022..02b89860e193 100644 --- a/tools/testing/selftests/mm/uffd-common.c +++ b/tools/testing/selftests/mm/uffd-common.c @@ -499,6 +499,9 @@ void *uffd_poll_thread(void *arg) int ret; char tmp_chr; + if (!args->handle_fault) + args->handle_fault = uffd_handle_page_fault; + pollfd[0].fd = uffd; pollfd[0].events = POLLIN; pollfd[1].fd = pipefd[cpu*2]; @@ -527,7 +530,7 @@ void *uffd_poll_thread(void *arg) err("unexpected msg event %u\n", msg.event); break; case UFFD_EVENT_PAGEFAULT: - uffd_handle_page_fault(&msg, args); + args->handle_fault(&msg, args); break; case UFFD_EVENT_FORK: close(uffd); diff --git a/tools/testing/selftests/mm/uffd-common.h b/tools/testing/selftests/mm/uffd-common.h index 197f5262fe0d..7c4fa964c3b0 100644 --- a/tools/testing/selftests/mm/uffd-common.h +++ b/tools/testing/selftests/mm/uffd-common.h @@ -77,6 +77,9 @@ struct uffd_args { unsigned long missing_faults; unsigned long wp_faults; unsigned long minor_faults; + + /* A custom fault handler; defaults to uffd_handle_page_fault. 
*/ + void (*handle_fault)(struct uffd_msg *msg, struct uffd_args *args); }; struct uffd_test_ops { diff --git a/tools/testing/selftests/mm/uffd-stress.c b/tools/testing/selftests/mm/uffd-stress.c index 995ff13e74c7..50b1224d72c7 100644 --- a/tools/testing/selftests/mm/uffd-stress.c +++ b/tools/testing/selftests/mm/uffd-stress.c @@ -189,10 +189,8 @@ static int stress(struct uffd_args *args) locking_thread, (void *)cpu)) return 1; if (bounces & BOUNCE_POLL) { - if (pthread_create(&uffd_threads[cpu], &attr, - uffd_poll_thread, - (void *)&args[cpu])) - return 1; + if (pthread_create(&uffd_threads[cpu], &attr, uffd_poll_thread, &args[cpu])) + err("uffd_poll_thread create"); } else { if (pthread_create(&uffd_threads[cpu], &attr, uffd_read_thread, @@ -247,9 +245,13 @@ static int userfaultfd_stress(void) { void *area; unsigned long nr; - struct uffd_args args[nr_cpus]; + struct uffd_args *args; uint64_t mem_size = nr_pages * page_size; + args = calloc(nr_cpus, sizeof(struct uffd_args)); + if (!args) + err("allocating args array failed"); + if (uffd_test_ctx_init(UFFD_FEATURE_WP_UNPOPULATED, NULL)) err("context init failed"); From patchwork Thu Jul 6 22:50:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Axel Rasmussen X-Patchwork-Id: 116879 Date: Thu, 6 Jul 2023 15:50:36 -0700 In-Reply-To: <20230706225037.1164380-1-axelrasmussen@google.com> Mime-Version: 1.0 References: <20230706225037.1164380-1-axelrasmussen@google.com> X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog Message-ID: <20230706225037.1164380-9-axelrasmussen@google.com> Subject: [PATCH v3 8/8] selftests/mm: add uffd unit test for UFFDIO_POISON From: Axel Rasmussen To: Alexander Viro , Andrew Morton , Brian Geffon , Christian Brauner , David Hildenbrand , Gaosheng Cui , Huang Ying , Hugh Dickins , James Houghton , "Jan Alexander Steffens (heftig)" , Jiaqi Yan , Jonathan Corbet , Kefeng Wang , "Liam R. Howlett" , Miaohe Lin , Mike Kravetz , "Mike Rapoport (IBM)" , Muchun Song , Nadav Amit , Naoya Horiguchi , Peter Xu , Ryan Roberts , Shuah Khan , Suleiman Souhlal , Suren Baghdasaryan , "T.J. 
Alumbaugh" , Yu Zhao , ZhangPeng Cc: linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, Axel Rasmussen The test is pretty basic, and exercises UFFDIO_POISON straightforwardly. We register a region with userfaultfd, in missing fault mode. For each fault, we either UFFDIO_COPY a zeroed page (odd pages) or UFFDIO_POISON (even pages). We do this mix to test "something like a real use case", where guest memory would be some mix of poisoned and non-poisoned pages. We read each page in the region, and assert that the odd pages are zeroed as expected, and the even pages yield a SIGBUS as expected. Why UFFDIO_COPY instead of UFFDIO_ZEROPAGE? Because hugetlb doesn't support UFFDIO_ZEROPAGE, and we don't want to have special case code. 
Acked-by: Peter Xu Signed-off-by: Axel Rasmussen --- tools/testing/selftests/mm/uffd-unit-tests.c | 117 +++++++++++++++++++ 1 file changed, 117 insertions(+) diff --git a/tools/testing/selftests/mm/uffd-unit-tests.c b/tools/testing/selftests/mm/uffd-unit-tests.c index 04d91f144d1c..2709a34a39c5 100644 --- a/tools/testing/selftests/mm/uffd-unit-tests.c +++ b/tools/testing/selftests/mm/uffd-unit-tests.c @@ -951,6 +951,117 @@ static void uffd_zeropage_test(uffd_test_args_t *args) uffd_test_pass(); } +static void uffd_register_poison(int uffd, void *addr, uint64_t len) +{ + uint64_t ioctls = 0; + uint64_t expected = (1 << _UFFDIO_COPY) | (1 << _UFFDIO_POISON); + + if (uffd_register_with_ioctls(uffd, addr, len, true, + false, false, &ioctls)) + err("poison register fail"); + + if ((ioctls & expected) != expected) + err("registered area doesn't support COPY and POISON ioctls"); +} + +static void do_uffdio_poison(int uffd, unsigned long offset) +{ + struct uffdio_poison uffdio_poison = { 0 }; + int ret; + __s64 res; + + uffdio_poison.range.start = (unsigned long) area_dst + offset; + uffdio_poison.range.len = page_size; + uffdio_poison.mode = 0; + ret = ioctl(uffd, UFFDIO_POISON, &uffdio_poison); + res = uffdio_poison.updated; + + if (ret) + err("UFFDIO_POISON error: %"PRId64, (int64_t)res); + else if (res != page_size) + err("UFFDIO_POISON unexpected size: %"PRId64, (int64_t)res); +} + +static void uffd_poison_handle_fault( + struct uffd_msg *msg, struct uffd_args *args) +{ + unsigned long offset; + + if (msg->event != UFFD_EVENT_PAGEFAULT) + err("unexpected msg event %u", msg->event); + + if (msg->arg.pagefault.flags & + (UFFD_PAGEFAULT_FLAG_WP | UFFD_PAGEFAULT_FLAG_MINOR)) + err("unexpected fault type %llu", msg->arg.pagefault.flags); + + offset = (char *)(unsigned long)msg->arg.pagefault.address - area_dst; + offset &= ~(page_size-1); + + /* Odd pages -> copy zeroed page; even pages -> poison. 
*/ + if (offset & page_size) + copy_page(uffd, offset, false); + else + do_uffdio_poison(uffd, offset); +} + +static void uffd_poison_test(uffd_test_args_t *targs) +{ + pthread_t uffd_mon; + char c; + struct uffd_args args = { 0 }; + struct sigaction act = { 0 }; + unsigned long nr_sigbus = 0; + unsigned long nr; + + fcntl(uffd, F_SETFL, uffd_flags | O_NONBLOCK); + + uffd_register_poison(uffd, area_dst, nr_pages * page_size); + memset(area_src, 0, nr_pages * page_size); + + args.handle_fault = uffd_poison_handle_fault; + if (pthread_create(&uffd_mon, NULL, uffd_poll_thread, &args)) + err("uffd_poll_thread create"); + + sigbuf = &jbuf; + act.sa_sigaction = sighndl; + act.sa_flags = SA_SIGINFO; + if (sigaction(SIGBUS, &act, 0)) + err("sigaction"); + + for (nr = 0; nr < nr_pages; ++nr) { + unsigned long offset = nr * page_size; + const char *bytes = (const char *) area_dst + offset; + const char *i; + + if (sigsetjmp(*sigbuf, 1)) { + /* + * Access below triggered a SIGBUS, which was caught by + * sighndl, which then jumped here. Count this SIGBUS, + * and move on to next page. + */ + ++nr_sigbus; + continue; + } + + for (i = bytes; i < bytes + page_size; ++i) { + if (*i) + err("nonzero byte in area_dst (%p) at %p: %u", + area_dst, i, *i); + } + } + + if (write(pipefd[1], &c, sizeof(c)) != sizeof(c)) + err("pipe write"); + if (pthread_join(uffd_mon, NULL)) + err("pthread_join()"); + + if (nr_sigbus != nr_pages / 2) + err("expected to receive %lu SIGBUS, actually received %lu", + nr_pages / 2, nr_sigbus); + + uffd_test_pass(); +} + /* * Test the returned uffdio_register.ioctls with different register modes. * Note that _UFFDIO_ZEROPAGE is tested separately in the zeropage test. @@ -1126,6 +1237,12 @@ uffd_test_case_t uffd_tests[] = { UFFD_FEATURE_PAGEFAULT_FLAG_WP | UFFD_FEATURE_WP_HUGETLBFS_SHMEM, }, + { + .name = "poison", + .uffd_fn = uffd_poison_test, + .mem_targets = MEM_ALL, + .uffd_feature_required = UFFD_FEATURE_POISON, + }, }; static void usage(const char *prog)
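A note on the odd/even split used by uffd_poison_handle_fault above: the handler rounds the faulting offset down to its page and then tests the page_size bit to decide between copy and poison. The sketch below isolates that arithmetic so it can be checked without a userfaultfd. The helper names (page_base, page_is_odd, nr_even_pages) are hypothetical and are not part of the patch. It assumes a power-of-two page size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not in the patch): round an offset down to its page. */
static inline uint64_t page_base(uint64_t offset, uint64_t page_size)
{
	/* The mask trick is only valid for power-of-two page sizes. */
	assert(page_size && (page_size & (page_size - 1)) == 0);
	return offset & ~(page_size - 1);
}

/*
 * Hypothetical helper: true when the page containing 'offset' has an odd
 * index. This mirrors the "offset & page_size" test in the handler, where
 * odd pages get a zeroed copy and even pages get poisoned.
 */
static inline bool page_is_odd(uint64_t offset, uint64_t page_size)
{
	return (page_base(offset, page_size) & page_size) != 0;
}

/* Hypothetical helper: count of even-indexed pages in [0, nr_pages). */
static inline uint64_t nr_even_pages(uint64_t nr_pages)
{
	return (nr_pages + 1) / 2;
}
```

The count of even-indexed pages equals the nr_pages / 2 value the test compares nr_sigbus against only when nr_pages is even. For an odd nr_pages there is one more even page than that expression yields.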