From patchwork Fri Nov 18 09:12:03 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hugh Dickins
X-Patchwork-Id: 22222
Date: Fri, 18 Nov 2022 01:12:03 -0800 (PST)
From: Hugh Dickins
To: Andrew Morton
cc: Linus Torvalds, Johannes Weiner, "Kirill A. Shutemov", Matthew Wilcox,
    David Hildenbrand, Vlastimil Babka, Peter Xu, Yang Shi, John Hubbard,
    Mike Kravetz, Sidhartha Kumar, Muchun Song, Miaohe Lin, Naoya Horiguchi,
    Mina Almasry, James Houghton, Zach O'Keefe,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 1/3] mm,thp,rmap: subpages_mapcount of PTE-mapped subpages
Message-ID: <78fa518-85b5-32c0-ee92-537fa46131f6@google.com>
References: <5f52de70-975-e94f-f141-543765736181@google.com>

Following suggestion from Linus, instead of counting every PTE map of a
compound page in subpages_mapcount, just count how many of its subpages
are PTE-mapped: this yields the exact number needed for NR_ANON_MAPPED and
NR_FILE_MAPPED stats, without any need for a locked scan of subpages; and
requires updating the count less often.

This does then revert total_mapcount() and folio_mapcount() to needing a
scan of subpages; but they are inherently racy, and need no locking, so
Linus is right that the scans are much better done there.  Plus (unlike
in 6.1 and previous) subpages_mapcount lets us avoid the scan in the
common case of no PTE maps.  And page_mapped() and folio_mapped() remain
scanless and just as efficient with the new meaning of subpages_mapcount:
those are the functions which I most wanted to remove the scan from.

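As a rough illustration of the counting rule described above, here is a
single-threaded toy model in plain C.  It is only a sketch: the toy_ names
and TOY_NR_SUBPAGES are hypothetical, there is no locking or atomicity, and
counts are kept as logical values rather than the kernel's -1-based fields
(except the per-subpage _mapcount array, which keeps the -1 base).

```c
#include <stdbool.h>

#define TOY_NR_SUBPAGES 8	/* hypothetical: a real THP has many more */

struct toy_compound {
	int compound_mapcount;		/* logical number of PMD mappings */
	int subpages_mapcount;		/* number of subpages mapped by PTE */
	int _mapcount[TOY_NR_SUBPAGES];	/* per-subpage PTE maps, based on -1 */
};

static void toy_init(struct toy_compound *c)
{
	c->compound_mapcount = 0;
	c->subpages_mapcount = 0;
	for (int i = 0; i < TOY_NR_SUBPAGES; i++)
		c->_mapcount[i] = -1;
}

/* PTE-map subpage i: returns how many pages newly count as mapped. */
static int toy_add_pte(struct toy_compound *c, int i)
{
	bool first = (++c->_mapcount[i] == 0);

	if (!first)
		return 0;
	/* Only the -1 to 0 transition touches subpages_mapcount */
	c->subpages_mapcount++;
	return !c->compound_mapcount;
}

/* PMD-map the whole page: returns how many pages newly count as mapped. */
static int toy_add_pmd(struct toy_compound *c)
{
	bool first = !c->compound_mapcount;

	c->compound_mapcount++;
	if (!first)
		return 0;
	/* No scan needed: discount the subpages already mapped by PTE */
	return TOY_NR_SUBPAGES - c->subpages_mapcount;
}

/* Total mappings: scans subpages, except when none is PTE-mapped. */
static int toy_total_mapcount(const struct toy_compound *c)
{
	int mapcount = c->compound_mapcount;

	if (c->subpages_mapcount == 0)
		return mapcount;
	for (int i = 0; i < TOY_NR_SUBPAGES; i++)
		mapcount += c->_mapcount[i];
	/* Each of those _mapcounts was based on -1 */
	return mapcount + TOY_NR_SUBPAGES;
}

/* Mapped at all?  No scan is needed for this question. */
static bool toy_mapped(const struct toy_compound *c)
{
	return c->compound_mapcount + c->subpages_mapcount > 0;
}
```

In this model a second PTE map of an already-mapped subpage changes only
that subpage's _mapcount, which is why the shared count is updated less
often than when it summed every PTE map.
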
The updated page_dup_compound_rmap() is no longer suitable for use by
anon THP's __split_huge_pmd_locked(); but page_add_anon_rmap() can be
used for that, so long as its VM_BUG_ON_PAGE(!PageLocked) is deleted.

Evidence is that this way goes slightly faster than the previous
implementation for most cases; but significantly faster in the (now
scanless) pmds after ptes case, which started out at 870ms and was
brought down to 495ms by the previous series, now takes around 105ms.

Suggested-by: Linus Torvalds
Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
---
 Documentation/mm/transhuge.rst |   3 +-
 include/linux/mm.h             |  52 ++++++-----
 include/linux/rmap.h           |   8 +-
 mm/huge_memory.c               |   2 +-
 mm/rmap.c                      | 155 ++++++++++++++-------------------
 5 files changed, 103 insertions(+), 117 deletions(-)

diff --git a/Documentation/mm/transhuge.rst b/Documentation/mm/transhuge.rst
index 1e2a637cc607..af4c9d70321d 100644
--- a/Documentation/mm/transhuge.rst
+++ b/Documentation/mm/transhuge.rst
@@ -122,7 +122,8 @@ pages:
 
   - map/unmap of sub-pages with PTE entry increment/decrement ->_mapcount
     on relevant sub-page of the compound page, and also increment/decrement
-    ->subpages_mapcount, stored in first tail page of the compound page.
+    ->subpages_mapcount, stored in first tail page of the compound page, when
+    _mapcount goes from -1 to 0 or 0 to -1: counting sub-pages mapped by PTE.
     In order to have race-free accounting of sub-pages mapped, changes to
     sub-page ->_mapcount, ->subpages_mapcount and ->compound_mapcount are
     are all locked by bit_spin_lock of PG_locked in the first tail ->flags.
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8fe6276d8cc2..c9e46d4d46f2 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -828,7 +828,7 @@ static inline int head_compound_mapcount(struct page *head)
 }
 
 /*
- * Sum of mapcounts of sub-pages, does not include compound mapcount.
+ * Number of sub-pages mapped by PTE, does not include compound mapcount.
  * Must be called only on head of compound page.
  */
 static inline int head_subpages_mapcount(struct page *head)
@@ -864,23 +864,7 @@ static inline int page_mapcount(struct page *page)
 	return head_compound_mapcount(page) + mapcount;
 }
 
-static inline int total_mapcount(struct page *page)
-{
-	if (likely(!PageCompound(page)))
-		return atomic_read(&page->_mapcount) + 1;
-	page = compound_head(page);
-	return head_compound_mapcount(page) + head_subpages_mapcount(page);
-}
-
-/*
- * Return true if this page is mapped into pagetables.
- * For compound page it returns true if any subpage of compound page is mapped,
- * even if this particular subpage is not itself mapped by any PTE or PMD.
- */
-static inline bool page_mapped(struct page *page)
-{
-	return total_mapcount(page) > 0;
-}
+int total_compound_mapcount(struct page *head);
 
 /**
  * folio_mapcount() - Calculate the number of mappings of this folio.
@@ -897,8 +881,20 @@ static inline int folio_mapcount(struct folio *folio)
 {
 	if (likely(!folio_test_large(folio)))
 		return atomic_read(&folio->_mapcount) + 1;
-	return atomic_read(folio_mapcount_ptr(folio)) + 1 +
-		atomic_read(folio_subpages_mapcount_ptr(folio));
+	return total_compound_mapcount(&folio->page);
+}
+
+static inline int total_mapcount(struct page *page)
+{
+	if (likely(!PageCompound(page)))
+		return atomic_read(&page->_mapcount) + 1;
+	return total_compound_mapcount(compound_head(page));
+}
+
+static inline bool folio_large_is_mapped(struct folio *folio)
+{
+	return atomic_read(folio_mapcount_ptr(folio)) +
+		atomic_read(folio_subpages_mapcount_ptr(folio)) >= 0;
 }
 
 /**
@@ -909,7 +905,21 @@ static inline int folio_mapcount(struct folio *folio)
  */
 static inline bool folio_mapped(struct folio *folio)
 {
-	return folio_mapcount(folio) > 0;
+	if (likely(!folio_test_large(folio)))
+		return atomic_read(&folio->_mapcount) >= 0;
+	return folio_large_is_mapped(folio);
+}
+
+/*
+ * Return true if this page is mapped into pagetables.
+ * For compound page it returns true if any sub-page of compound page is mapped,
+ * even if this particular sub-page is not itself mapped by any PTE or PMD.
+ */
+static inline bool page_mapped(struct page *page)
+{
+	if (likely(!PageCompound(page)))
+		return atomic_read(&page->_mapcount) >= 0;
+	return folio_large_is_mapped(page_folio(page));
 }
 
 static inline struct page *virt_to_head_page(const void *x)
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 011a7530dc76..860f558126ac 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -204,14 +204,14 @@ void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
 void hugepage_add_new_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long address);
 
-void page_dup_compound_rmap(struct page *page, bool compound);
+void page_dup_compound_rmap(struct page *page);
 
 static inline void page_dup_file_rmap(struct page *page, bool compound)
 {
-	if (PageCompound(page))
-		page_dup_compound_rmap(page, compound);
-	else
+	if (likely(!compound /* page is mapped by PTE */))
 		atomic_inc(&page->_mapcount);
+	else
+		page_dup_compound_rmap(page);
 }
 
 /**
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 30056efc79ad..3dee8665c585 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2215,7 +2215,7 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		BUG_ON(!pte_none(*pte));
 		set_pte_at(mm, addr, pte, entry);
 		if (!pmd_migration)
-			page_dup_compound_rmap(page + i, false);
+			page_add_anon_rmap(page + i, vma, addr, false);
 		pte_unmap(pte);
 	}
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 4833d28c5e1a..66be8cae640f 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1117,55 +1117,36 @@ static void unlock_compound_mapcounts(struct page *head,
 	bit_spin_unlock(PG_locked, &head[1].flags);
 }
 
-/*
- * When acting on a compound page under lock_compound_mapcounts(), avoid the
- * unnecessary overhead of an actual atomic operation on its subpage mapcount.
- * Return true if this is the first increment or the last decrement
- * (remembering that page->_mapcount -1 represents logical mapcount 0).
- */
-static bool subpage_mapcount_inc(struct page *page)
-{
-	int orig_mapcount = atomic_read(&page->_mapcount);
-
-	atomic_set(&page->_mapcount, orig_mapcount + 1);
-	return orig_mapcount < 0;
-}
-
-static bool subpage_mapcount_dec(struct page *page)
-{
-	int orig_mapcount = atomic_read(&page->_mapcount);
-
-	atomic_set(&page->_mapcount, orig_mapcount - 1);
-	return orig_mapcount == 0;
-}
-
-/*
- * When mapping a THP's first pmd, or unmapping its last pmd, if that THP
- * also has pte mappings, then those must be discounted: in order to maintain
- * NR_ANON_MAPPED and NR_FILE_MAPPED statistics exactly, without any drift,
- * and to decide when an anon THP should be put on the deferred split queue.
- * This function must be called between lock_ and unlock_compound_mapcounts().
- */
-static int nr_subpages_unmapped(struct page *head, int nr_subpages)
+int total_compound_mapcount(struct page *head)
 {
-	int nr = nr_subpages;
+	int mapcount = head_compound_mapcount(head);
+	int nr_subpages;
 	int i;
 
-	/* Discount those subpages mapped by pte */
+	/* In the common case, avoid the loop when no subpages mapped by PTE */
+	if (head_subpages_mapcount(head) == 0)
+		return mapcount;
+	/*
+	 * Add all the PTE mappings of those subpages mapped by PTE.
+	 * Limit the loop, knowing that only subpages_mapcount are mapped?
+	 * Perhaps: given all the raciness, that may be a good or a bad idea.
+	 */
+	nr_subpages = thp_nr_pages(head);
 	for (i = 0; i < nr_subpages; i++)
-		if (atomic_read(&head[i]._mapcount) >= 0)
-			nr--;
-	return nr;
+		mapcount += atomic_read(&head[i]._mapcount);
+
+	/* But each of those _mapcounts was based on -1 */
+	mapcount += nr_subpages;
+	return mapcount;
 }
 
 /*
- * page_dup_compound_rmap(), used when copying mm, or when splitting pmd,
+ * page_dup_compound_rmap(), used when copying mm,
  * provides a simple example of using lock_ and unlock_compound_mapcounts().
  */
-void page_dup_compound_rmap(struct page *page, bool compound)
+void page_dup_compound_rmap(struct page *head)
 {
 	struct compound_mapcounts mapcounts;
-	struct page *head;
 
 	/*
 	 * Hugetlb pages could use lock_compound_mapcounts(), like THPs do;
@@ -1176,20 +1157,16 @@ void page_dup_compound_rmap(struct page *page, bool compound)
 	 * Note that hugetlb does not call page_add_file_rmap():
 	 * here is where hugetlb shared page mapcount is raised.
 	 */
-	if (PageHuge(page)) {
-		atomic_inc(compound_mapcount_ptr(page));
-		return;
-	}
+	if (PageHuge(head)) {
+		atomic_inc(compound_mapcount_ptr(head));
 
-	head = compound_head(page);
-	lock_compound_mapcounts(head, &mapcounts);
-	if (compound) {
+	} else if (PageTransHuge(head)) {
+		/* That test is redundant: it's for safety or to optimize out */
+
+		lock_compound_mapcounts(head, &mapcounts);
 		mapcounts.compound_mapcount++;
-	} else {
-		mapcounts.subpages_mapcount++;
-		subpage_mapcount_inc(page);
+		unlock_compound_mapcounts(head, &mapcounts);
 	}
-	unlock_compound_mapcounts(head, &mapcounts);
 }
 
 /**
@@ -1308,31 +1285,29 @@ void page_add_anon_rmap(struct page *page,
 
 	if (unlikely(PageKsm(page)))
 		lock_page_memcg(page);
-	else
-		VM_BUG_ON_PAGE(!PageLocked(page), page);
 
-	if (likely(!PageCompound(page))) {
+	if (likely(!compound /* page is mapped by PTE */)) {
 		first = atomic_inc_and_test(&page->_mapcount);
 		nr = first;
+		if (first && PageCompound(page)) {
+			struct page *head = compound_head(page);
+
+			lock_compound_mapcounts(head, &mapcounts);
+			mapcounts.subpages_mapcount++;
+			nr = !mapcounts.compound_mapcount;
+			unlock_compound_mapcounts(head, &mapcounts);
+		}
+	} else if (PageTransHuge(page)) {
+		/* That test is redundant: it's for safety or to optimize out */
 
-	} else if (compound && PageTransHuge(page)) {
 		lock_compound_mapcounts(page, &mapcounts);
 		first = !mapcounts.compound_mapcount;
 		mapcounts.compound_mapcount++;
 		if (first) {
-			nr = nr_pmdmapped = thp_nr_pages(page);
-			if (mapcounts.subpages_mapcount)
-				nr = nr_subpages_unmapped(page, nr_pmdmapped);
+			nr_pmdmapped = thp_nr_pages(page);
+			nr = nr_pmdmapped - mapcounts.subpages_mapcount;
 		}
 		unlock_compound_mapcounts(page, &mapcounts);
-	} else {
-		struct page *head = compound_head(page);
-
-		lock_compound_mapcounts(head, &mapcounts);
-		mapcounts.subpages_mapcount++;
-		first = subpage_mapcount_inc(page);
-		nr = first && !mapcounts.compound_mapcount;
-		unlock_compound_mapcounts(head, &mapcounts);
 	}
 
 	VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page);
@@ -1411,28 +1386,28 @@ void page_add_file_rmap(struct page *page,
 
 	VM_BUG_ON_PAGE(compound && !PageTransHuge(page), page);
 	lock_page_memcg(page);
-	if (likely(!PageCompound(page))) {
+	if (likely(!compound /* page is mapped by PTE */)) {
 		first = atomic_inc_and_test(&page->_mapcount);
 		nr = first;
+		if (first && PageCompound(page)) {
+			struct page *head = compound_head(page);
+
+			lock_compound_mapcounts(head, &mapcounts);
+			mapcounts.subpages_mapcount++;
+			nr = !mapcounts.compound_mapcount;
+			unlock_compound_mapcounts(head, &mapcounts);
+		}
+	} else if (PageTransHuge(page)) {
+		/* That test is redundant: it's for safety or to optimize out */
 
-	} else if (compound && PageTransHuge(page)) {
 		lock_compound_mapcounts(page, &mapcounts);
 		first = !mapcounts.compound_mapcount;
 		mapcounts.compound_mapcount++;
 		if (first) {
-			nr = nr_pmdmapped = thp_nr_pages(page);
-			if (mapcounts.subpages_mapcount)
-				nr = nr_subpages_unmapped(page, nr_pmdmapped);
+			nr_pmdmapped = thp_nr_pages(page);
+			nr = nr_pmdmapped - mapcounts.subpages_mapcount;
 		}
 		unlock_compound_mapcounts(page, &mapcounts);
-	} else {
-		struct page *head = compound_head(page);
-
-		lock_compound_mapcounts(head, &mapcounts);
-		mapcounts.subpages_mapcount++;
-		first = subpage_mapcount_inc(page);
-		nr = first && !mapcounts.compound_mapcount;
-		unlock_compound_mapcounts(head, &mapcounts);
 	}
 
 	if (nr_pmdmapped)
@@ -1472,28 +1447,28 @@ void page_remove_rmap(struct page *page,
 	lock_page_memcg(page);
 
 	/* page still mapped by someone else? */
-	if (likely(!PageCompound(page))) {
+	if (likely(!compound /* page is mapped by PTE */)) {
 		last = atomic_add_negative(-1, &page->_mapcount);
 		nr = last;
+		if (last && PageCompound(page)) {
+			struct page *head = compound_head(page);
+
+			lock_compound_mapcounts(head, &mapcounts);
+			mapcounts.subpages_mapcount--;
+			nr = !mapcounts.compound_mapcount;
+			unlock_compound_mapcounts(head, &mapcounts);
+		}
+	} else if (PageTransHuge(page)) {
+		/* That test is redundant: it's for safety or to optimize out */
 
-	} else if (compound && PageTransHuge(page)) {
 		lock_compound_mapcounts(page, &mapcounts);
 		mapcounts.compound_mapcount--;
 		last = !mapcounts.compound_mapcount;
 		if (last) {
-			nr = nr_pmdmapped = thp_nr_pages(page);
-			if (mapcounts.subpages_mapcount)
-				nr = nr_subpages_unmapped(page, nr_pmdmapped);
+			nr_pmdmapped = thp_nr_pages(page);
+			nr = nr_pmdmapped - mapcounts.subpages_mapcount;
 		}
 		unlock_compound_mapcounts(page, &mapcounts);
-	} else {
-		struct page *head = compound_head(page);
-
-		lock_compound_mapcounts(head, &mapcounts);
-		mapcounts.subpages_mapcount--;
-		last = subpage_mapcount_dec(page);
-		nr = last && !mapcounts.compound_mapcount;
-		unlock_compound_mapcounts(head, &mapcounts);
 	}
 
 	if (nr_pmdmapped) {

From patchwork Fri Nov 18 09:14:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hugh Dickins
X-Patchwork-Id: 22221
Date: Fri, 18 Nov 2022 01:14:17 -0800 (PST)
From: Hugh Dickins
To: Andrew Morton
cc: Linus Torvalds, Johannes Weiner, "Kirill A. Shutemov", Matthew Wilcox,
    David Hildenbrand, Vlastimil Babka, Peter Xu, Yang Shi, John Hubbard,
    Mike Kravetz, Sidhartha Kumar, Muchun Song, Miaohe Lin, Naoya Horiguchi,
    Mina Almasry, James Houghton, Zach O'Keefe,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 2/3] mm,thp,rmap: subpages_mapcount COMPOUND_MAPPED if PMD-mapped
Message-ID: <25a09a7a-81a9-e9c2-7567-c94ce18ac2@google.com>
References: <5f52de70-975-e94f-f141-543765736181@google.com>

Can the lock_compound_mapcount() bit_spin_lock apparatus be removed now?
Yes.  Not by atomic64_t or cmpxchg games, those get difficult on 32-bit;
but if we slightly abuse subpages_mapcount by additionally demanding that
one bit be set there when the compound page is PMD-mapped, then a cascade
of two atomic ops is able to maintain the stats without bit_spin_lock.

This is harder to reason about than when bit_spin_locked, but I believe
safe; and no drift in stats detected when testing.  When there are racing
removes and adds, of course the sequence of operations is less
well-defined; but each operation on subpages_mapcount is atomically good.
What might be disastrous, is if subpages_mapcount could ever fleetingly
appear negative: but the pte lock (or pmd lock) these rmap functions are
called under, ensures that a last remove cannot race ahead of a first add.

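To make the "cascade of two atomic ops" above more concrete, here is a
userspace sketch using C11 <stdatomic.h>.  It is an assumption about how
such a cascade can be arranged, not the kernel code: the toy_ names and
TOY_NR_SUBPAGES are hypothetical, the kernel's atomic_t helpers are replaced
by C11 ones, and the pte/pmd locks that callers hold are not modelled.
COMPOUND_MAPPED and SUBPAGES_MAPPED use the values defined in this patch.

```c
#include <stdatomic.h>

#define COMPOUND_MAPPED	0x800000
#define SUBPAGES_MAPPED	(COMPOUND_MAPPED - 1)
#define TOY_NR_SUBPAGES	512	/* hypothetical THP size in subpages */

struct toy_thp {
	atomic_int compound_mapcount;		/* PMD maps, based on -1 */
	atomic_int subpages_mapcount;		/* PTE-mapped subpages, plus
						 * COMPOUND_MAPPED if PMD-mapped */
	atomic_int _mapcount[TOY_NR_SUBPAGES];	/* PTE maps, based on -1 */
};

static void toy_thp_init(struct toy_thp *t)
{
	atomic_init(&t->compound_mapcount, -1);
	atomic_init(&t->subpages_mapcount, 0);
	for (int i = 0; i < TOY_NR_SUBPAGES; i++)
		atomic_init(&t->_mapcount[i], -1);
}

/* PTE-map subpage i: returns how many pages newly count as mapped. */
static int toy_map_pte(struct toy_thp *t, int i)
{
	int mapped;

	/* First atomic op: only a -1 to 0 transition goes any further */
	if (atomic_fetch_add(&t->_mapcount[i], 1) != -1)
		return 0;
	/* Second atomic op: count the subpage, and learn if PMD-mapped */
	mapped = atomic_fetch_add(&t->subpages_mapcount, 1) + 1;
	return !(mapped & COMPOUND_MAPPED);
}

/* PMD-map the whole page: returns how many pages newly count as mapped. */
static int toy_map_pmd(struct toy_thp *t)
{
	int mapped;

	if (atomic_fetch_add(&t->compound_mapcount, 1) != -1)
		return 0;
	mapped = atomic_fetch_add(&t->subpages_mapcount, COMPOUND_MAPPED)
		 + COMPOUND_MAPPED;
	return TOY_NR_SUBPAGES - (mapped & SUBPAGES_MAPPED);
}

/* Remove a PMD map: returns how many pages stop counting as mapped. */
static int toy_unmap_pmd(struct toy_thp *t)
{
	int mapped;

	if (atomic_fetch_sub(&t->compound_mapcount, 1) != 0)
		return 0;
	mapped = atomic_fetch_sub(&t->subpages_mapcount, COMPOUND_MAPPED)
		 - COMPOUND_MAPPED;
	return TOY_NR_SUBPAGES - (mapped & SUBPAGES_MAPPED);
}
```

The point of the sketch is that each function learns everything it needs
for the stats from the value returned by its own atomic update of
subpages_mapcount, so no lock has to hold the two counts together.
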
Continue to make an exception for hugetlb (PageHuge) pages, though that
exception can be easily removed by a further commit if necessary: leave
subpages_mapcount 0, don't bother with COMPOUND_MAPPED in its case, just
carry on checking compound_mapcount too in folio_mapped(), page_mapped().

Evidence is that this way goes slightly faster than the previous
implementation in all cases (pmds after ptes now taking around 103ms);
and relieves us of worrying about contention on the bit_spin_lock.

Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
---
 Documentation/mm/transhuge.rst |   7 +-
 include/linux/mm.h             |  19 ++++-
 include/linux/rmap.h           |  12 ++--
 mm/debug.c                     |   2 +-
 mm/rmap.c                      | 124 +++++++--------------------------
 5 files changed, 52 insertions(+), 112 deletions(-)

diff --git a/Documentation/mm/transhuge.rst b/Documentation/mm/transhuge.rst
index af4c9d70321d..ec3dc5b04226 100644
--- a/Documentation/mm/transhuge.rst
+++ b/Documentation/mm/transhuge.rst
@@ -118,15 +118,14 @@ pages:
     succeeds on tail pages.
 
   - map/unmap of PMD entry for the whole compound page increment/decrement
-    ->compound_mapcount, stored in the first tail page of the compound page.
+    ->compound_mapcount, stored in the first tail page of the compound page;
+    and also increment/decrement ->subpages_mapcount (also in the first tail)
+    by COMPOUND_MAPPED when compound_mapcount goes from -1 to 0 or 0 to -1.
 
   - map/unmap of sub-pages with PTE entry increment/decrement ->_mapcount
     on relevant sub-page of the compound page, and also increment/decrement
     ->subpages_mapcount, stored in first tail page of the compound page, when
     _mapcount goes from -1 to 0 or 0 to -1: counting sub-pages mapped by PTE.
-    In order to have race-free accounting of sub-pages mapped, changes to
-    sub-page ->_mapcount, ->subpages_mapcount and ->compound_mapcount are
-    are all locked by bit_spin_lock of PG_locked in the first tail ->flags.
 
 split_huge_page internally has to distribute the refcounts in the head
 page to the tail pages before clearing all PG_head/tail bits from the page
diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9e46d4d46f2..a2bfb5e4be62 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -828,7 +828,16 @@ static inline int head_compound_mapcount(struct page *head)
 }
 
 /*
- * Number of sub-pages mapped by PTE, does not include compound mapcount.
+ * If a 16GB hugetlb page were mapped by PTEs of all of its 4kB sub-pages,
+ * its subpages_mapcount would be 0x400000: choose the COMPOUND_MAPPED bit
+ * above that range, instead of 2*(PMD_SIZE/PAGE_SIZE).  Hugetlb currently
+ * leaves subpages_mapcount at 0, but avoid surprise if it participates later.
+ */
+#define COMPOUND_MAPPED	0x800000
+#define SUBPAGES_MAPPED	(COMPOUND_MAPPED - 1)
+
+/*
+ * Number of sub-pages mapped by PTE, plus COMPOUND_MAPPED if compound mapped.
  * Must be called only on head of compound page.
  */
 static inline int head_subpages_mapcount(struct page *head)
@@ -893,8 +902,12 @@ static inline int total_mapcount(struct page *page)
 
 static inline bool folio_large_is_mapped(struct folio *folio)
 {
-	return atomic_read(folio_mapcount_ptr(folio)) +
-		atomic_read(folio_subpages_mapcount_ptr(folio)) >= 0;
+	/*
+	 * Reading folio_mapcount_ptr() below could be omitted if hugetlb
+	 * participated in incrementing subpages_mapcount when compound mapped.
+	 */
+	return atomic_read(folio_mapcount_ptr(folio)) >= 0 ||
+		atomic_read(folio_subpages_mapcount_ptr(folio)) > 0;
 }
 
 /**
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 860f558126ac..bd3504d11b15 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -204,14 +204,14 @@ void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
 void hugepage_add_new_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long address);
 
-void page_dup_compound_rmap(struct page *page);
+static inline void __page_dup_rmap(struct page *page, bool compound)
+{
+	atomic_inc(compound ? compound_mapcount_ptr(page) : &page->_mapcount);
+}
 
 static inline void page_dup_file_rmap(struct page *page, bool compound)
 {
-	if (likely(!compound /* page is mapped by PTE */))
-		atomic_inc(&page->_mapcount);
-	else
-		page_dup_compound_rmap(page);
+	__page_dup_rmap(page, compound);
 }
 
 /**
@@ -260,7 +260,7 @@ static inline int page_try_dup_anon_rmap(struct page *page, bool compound,
 	 * the page R/O into both processes.
*/ dup: - page_dup_file_rmap(page, compound); + __page_dup_rmap(page, compound); return 0; } diff --git a/mm/debug.c b/mm/debug.c index 7f8e5f744e42..1ef2ff6a05cb 100644 --- a/mm/debug.c +++ b/mm/debug.c @@ -97,7 +97,7 @@ static void __dump_page(struct page *page) pr_warn("head:%p order:%u compound_mapcount:%d subpages_mapcount:%d compound_pincount:%d\n", head, compound_order(head), head_compound_mapcount(head), - head_subpages_mapcount(head), + head_subpages_mapcount(head) & SUBPAGES_MAPPED, head_compound_pincount(head)); } diff --git a/mm/rmap.c b/mm/rmap.c index 66be8cae640f..5e4ce0a6d6f1 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1085,38 +1085,6 @@ int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff, return page_vma_mkclean_one(&pvmw); } -struct compound_mapcounts { - unsigned int compound_mapcount; - unsigned int subpages_mapcount; -}; - -/* - * lock_compound_mapcounts() first locks, then copies subpages_mapcount and - * compound_mapcount from head[1].compound_mapcount and subpages_mapcount, - * converting from struct page's internal representation to logical count - * (that is, adding 1 to compound_mapcount to hide its offset by -1). - */ -static void lock_compound_mapcounts(struct page *head, - struct compound_mapcounts *local) -{ - bit_spin_lock(PG_locked, &head[1].flags); - local->compound_mapcount = atomic_read(compound_mapcount_ptr(head)) + 1; - local->subpages_mapcount = atomic_read(subpages_mapcount_ptr(head)); -} - -/* - * After caller has updated subpage._mapcount, local subpages_mapcount and - * local compound_mapcount, as necessary, unlock_compound_mapcounts() converts - * and copies them back to the compound head[1] fields, and then unlocks. 
- */ -static void unlock_compound_mapcounts(struct page *head, - struct compound_mapcounts *local) -{ - atomic_set(compound_mapcount_ptr(head), local->compound_mapcount - 1); - atomic_set(subpages_mapcount_ptr(head), local->subpages_mapcount); - bit_spin_unlock(PG_locked, &head[1].flags); -} - int total_compound_mapcount(struct page *head) { int mapcount = head_compound_mapcount(head); @@ -1124,7 +1092,7 @@ int total_compound_mapcount(struct page *head) int i; /* In the common case, avoid the loop when no subpages mapped by PTE */ - if (head_subpages_mapcount(head) == 0) + if ((head_subpages_mapcount(head) & SUBPAGES_MAPPED) == 0) return mapcount; /* * Add all the PTE mappings of those subpages mapped by PTE. @@ -1140,35 +1108,6 @@ int total_compound_mapcount(struct page *head) return mapcount; } -/* - * page_dup_compound_rmap(), used when copying mm, - * provides a simple example of using lock_ and unlock_compound_mapcounts(). - */ -void page_dup_compound_rmap(struct page *head) -{ - struct compound_mapcounts mapcounts; - - /* - * Hugetlb pages could use lock_compound_mapcounts(), like THPs do; - * but at present they are still being managed by atomic operations: - * which are likely to be somewhat faster, so don't rush to convert - * them over without evaluating the effect. - * - * Note that hugetlb does not call page_add_file_rmap(): - * here is where hugetlb shared page mapcount is raised. 
- */ - if (PageHuge(head)) { - atomic_inc(compound_mapcount_ptr(head)); - - } else if (PageTransHuge(head)) { - /* That test is redundant: it's for safety or to optimize out */ - - lock_compound_mapcounts(head, &mapcounts); - mapcounts.compound_mapcount++; - unlock_compound_mapcounts(head, &mapcounts); - } -} - /** * page_move_anon_rmap - move a page to our anon_vma * @page: the page to move to our anon_vma @@ -1278,7 +1217,7 @@ static void __page_check_anon_rmap(struct page *page, void page_add_anon_rmap(struct page *page, struct vm_area_struct *vma, unsigned long address, rmap_t flags) { - struct compound_mapcounts mapcounts; + atomic_t *mapped; int nr = 0, nr_pmdmapped = 0; bool compound = flags & RMAP_COMPOUND; bool first; @@ -1290,24 +1229,20 @@ void page_add_anon_rmap(struct page *page, first = atomic_inc_and_test(&page->_mapcount); nr = first; if (first && PageCompound(page)) { - struct page *head = compound_head(page); - - lock_compound_mapcounts(head, &mapcounts); - mapcounts.subpages_mapcount++; - nr = !mapcounts.compound_mapcount; - unlock_compound_mapcounts(head, &mapcounts); + mapped = subpages_mapcount_ptr(compound_head(page)); + nr = atomic_inc_return_relaxed(mapped); + nr = !(nr & COMPOUND_MAPPED); } } else if (PageTransHuge(page)) { /* That test is redundant: it's for safety or to optimize out */ - lock_compound_mapcounts(page, &mapcounts); - first = !mapcounts.compound_mapcount; - mapcounts.compound_mapcount++; + first = atomic_inc_and_test(compound_mapcount_ptr(page)); if (first) { + mapped = subpages_mapcount_ptr(page); + nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped); nr_pmdmapped = thp_nr_pages(page); - nr = nr_pmdmapped - mapcounts.subpages_mapcount; + nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED); } - unlock_compound_mapcounts(page, &mapcounts); } VM_BUG_ON_PAGE(!first && (flags & RMAP_EXCLUSIVE), page); @@ -1360,6 +1295,7 @@ void page_add_new_anon_rmap(struct page *page, VM_BUG_ON_PAGE(!PageTransHuge(page), page); /* increment count 
(starts at -1) */ atomic_set(compound_mapcount_ptr(page), 0); + atomic_set(subpages_mapcount_ptr(page), COMPOUND_MAPPED); nr = thp_nr_pages(page); __mod_lruvec_page_state(page, NR_ANON_THPS, nr); } @@ -1379,7 +1315,7 @@ void page_add_new_anon_rmap(struct page *page, void page_add_file_rmap(struct page *page, struct vm_area_struct *vma, bool compound) { - struct compound_mapcounts mapcounts; + atomic_t *mapped; int nr = 0, nr_pmdmapped = 0; bool first; @@ -1390,24 +1326,20 @@ void page_add_file_rmap(struct page *page, first = atomic_inc_and_test(&page->_mapcount); nr = first; if (first && PageCompound(page)) { - struct page *head = compound_head(page); - - lock_compound_mapcounts(head, &mapcounts); - mapcounts.subpages_mapcount++; - nr = !mapcounts.compound_mapcount; - unlock_compound_mapcounts(head, &mapcounts); + mapped = subpages_mapcount_ptr(compound_head(page)); + nr = atomic_inc_return_relaxed(mapped); + nr = !(nr & COMPOUND_MAPPED); } } else if (PageTransHuge(page)) { /* That test is redundant: it's for safety or to optimize out */ - lock_compound_mapcounts(page, &mapcounts); - first = !mapcounts.compound_mapcount; - mapcounts.compound_mapcount++; + first = atomic_inc_and_test(compound_mapcount_ptr(page)); if (first) { + mapped = subpages_mapcount_ptr(page); + nr = atomic_add_return_relaxed(COMPOUND_MAPPED, mapped); nr_pmdmapped = thp_nr_pages(page); - nr = nr_pmdmapped - mapcounts.subpages_mapcount; + nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED); } - unlock_compound_mapcounts(page, &mapcounts); } if (nr_pmdmapped) @@ -1431,7 +1363,7 @@ void page_add_file_rmap(struct page *page, void page_remove_rmap(struct page *page, struct vm_area_struct *vma, bool compound) { - struct compound_mapcounts mapcounts; + atomic_t *mapped; int nr = 0, nr_pmdmapped = 0; bool last; @@ -1451,24 +1383,20 @@ void page_remove_rmap(struct page *page, last = atomic_add_negative(-1, &page->_mapcount); nr = last; if (last && PageCompound(page)) { - struct page *head = compound_head(page); 
- - lock_compound_mapcounts(head, &mapcounts); - mapcounts.subpages_mapcount--; - nr = !mapcounts.compound_mapcount; - unlock_compound_mapcounts(head, &mapcounts); + mapped = subpages_mapcount_ptr(compound_head(page)); + nr = atomic_dec_return_relaxed(mapped); + nr = !(nr & COMPOUND_MAPPED); } } else if (PageTransHuge(page)) { /* That test is redundant: it's for safety or to optimize out */ - lock_compound_mapcounts(page, &mapcounts); - mapcounts.compound_mapcount--; - last = !mapcounts.compound_mapcount; + last = atomic_add_negative(-1, compound_mapcount_ptr(page)); if (last) { + mapped = subpages_mapcount_ptr(page); + nr = atomic_sub_return_relaxed(COMPOUND_MAPPED, mapped); nr_pmdmapped = thp_nr_pages(page); - nr = nr_pmdmapped - mapcounts.subpages_mapcount; + nr = nr_pmdmapped - (nr & SUBPAGES_MAPPED); } - unlock_compound_mapcounts(page, &mapcounts); } if (nr_pmdmapped) { From patchwork Fri Nov 18 09:16:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hugh Dickins X-Patchwork-Id: 22223 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:f944:0:0:0:0:0 with SMTP id q4csp81822wrr; Fri, 18 Nov 2022 01:19:13 -0800 (PST) X-Google-Smtp-Source: AA0mqf7Ts7Q0zdVghVGeJD7WZ8qpuWyX9iZ6F6Kt1VbXT7D5DZ3E+it7wJVgAEqC9R9C+jmBGtWU X-Received: by 2002:a17:903:40d2:b0:17f:52af:d022 with SMTP id t18-20020a17090340d200b0017f52afd022mr6674390pld.122.1668763153081; Fri, 18 Nov 2022 01:19:13 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1668763153; cv=none; d=google.com; s=arc-20160816; b=uYa6XIWOJmAanaVFyidTLjSPvVPgMXnI5NfDy1LUZG7SZqJqIghuBiND+lcL99/MOR c9jk6FYiCEV8MDMWzlH0EOCavromKTTqCXLzGDDrJbnyKWw50Srb63kqQZDDUKVLlOyw FYRLtigIq/HaSkoiG8mAGyMFuzdVY09Jgj8owj6NXcY13BWzvjbZfTSsXKiXG7XL9N1f UiQ2lURTOKAB2TXU8re8X32F2AQpa3dnw8HtzGVmG8Hn0dLHir1NdKeLZxHPoFaRp1eu aVYwODP/HudijSUYIkxNv+e4QBI/GczyA/nDOHuRIXOoIT6c9qFALI1V5nRnXvVw6eVb LQiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; 
Date: Fri, 18 Nov 2022 01:16:20 -0800 (PST)
From: Hugh Dickins
To: Andrew Morton
cc: Linus Torvalds, Johannes Weiner, "Kirill A. Shutemov", Matthew Wilcox, David Hildenbrand, Vlastimil Babka, Peter Xu, Yang Shi, John Hubbard, Mike Kravetz, Sidhartha Kumar, Muchun Song, Miaohe Lin, Naoya Horiguchi, Mina Almasry, James Houghton, Zach O'Keefe, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 3/3] mm,thp,rmap: clean up the end of __split_huge_pmd_locked()
Message-ID: <2f4afe60-40d2-706c-af21-914fbbbd164@google.com>
References: <5f52de70-975-e94f-f141-543765736181@google.com>
MIME-Version: 1.0
X-Mailing-List: linux-kernel@vger.kernel.org

It's hard to add a page_add_anon_rmap() into __split_huge_pmd_locked()'s
HPAGE_PMD_NR set_pte_at() loop, without wincing at the "freeze" case's
HPAGE_PMD_NR page_remove_rmap() loop below it.

It's just a mistake to add rmaps in the "freeze" (insert migration entries
prior to splitting huge page) case: the pmd_migration case already avoids
doing that, so just follow its lead. page_ref_add() versus put_page()
likewise. But why is one more put_page() needed in the "freeze" case?
Because it's removing the pmd rmap, already removed when pmd_migration
(and freeze and pmd_migration are mutually exclusive cases).

Signed-off-by: Hugh Dickins
Acked-by: Kirill A. Shutemov
---
 mm/huge_memory.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 3dee8665c585..ab5ab1a013e1 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2135,7 +2135,6 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		uffd_wp = pmd_uffd_wp(old_pmd);

 		VM_BUG_ON_PAGE(!page_count(page), page);
-		page_ref_add(page, HPAGE_PMD_NR - 1);

 		/*
 		 * Without "freeze", we'll simply split the PMD, propagating the
@@ -2155,6 +2154,8 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 		anon_exclusive = PageAnon(page) && PageAnonExclusive(page);
 		if (freeze && anon_exclusive && page_try_share_anon_rmap(page))
 			freeze = false;
+		if (!freeze)
+			page_ref_add(page, HPAGE_PMD_NR - 1);
 	}

 	/*
@@ -2210,27 +2211,21 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 				entry = pte_mksoft_dirty(entry);
 			if (uffd_wp)
 				entry = pte_mkuffd_wp(entry);
+			page_add_anon_rmap(page + i, vma, addr, false);
 		}
 		pte = pte_offset_map(&_pmd, addr);
 		BUG_ON(!pte_none(*pte));
 		set_pte_at(mm, addr, pte, entry);
-		if (!pmd_migration)
-			page_add_anon_rmap(page + i, vma, addr, false);
 		pte_unmap(pte);
 	}

 	if (!pmd_migration)
 		page_remove_rmap(page, vma, true);
+	if (freeze)
+		put_page(page);

 	smp_wmb(); /* make pte visible before pmd */
 	pmd_populate(mm, pmd, pgtable);
-
-	if (freeze) {
-		for (i = 0; i < HPAGE_PMD_NR; i++) {
-			page_remove_rmap(page + i, vma, false);
-			put_page(page + i);
-		}
-	}
 }

 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,