Message ID: 20230830095011.1228673-1-ryan.roberts@arm.com
Series: Optimize mmap_exit for large folios
Message
Ryan Roberts
Aug. 30, 2023, 9:50 a.m. UTC
Hi All,

This is v2 of a series to improve performance of process teardown, taking
advantage of the fact that large folios are increasingly regularly pte-mapped
in user space; supporting filesystems already use large folios for pagecache
memory, and large folios for anonymous memory are (hopefully) on the horizon.
See the last patch for performance numbers, including measurements that show
this approach doesn't regress (and actually improves a little bit) when all
folios are small.

The basic approach is to accumulate contiguous ranges of pages in the
mmu_gather structure (instead of storing each individual page pointer), then
take advantage of this internal format to efficiently batch rmap removal,
swapcache removal and page release - see the commit messages for more
details.

This series replaces the previous approach I took at [2], which was much
smaller in scope, only attempting to batch rmap removal for anon pages.
Feedback was that I should do something more general that would also
batch-remove pagecache pages from the rmap. But while designing that, I found
it was also possible to improve swapcache removal and page release. Hopefully
I haven't gone too far the other way now! Note that patch 1 is unchanged from
that original series.

Note that this series will conflict with Matthew's series at [3]. I figure we
both race to mm-unstable and the loser has to do the conflict resolution?

This series is based on mm-unstable (b93868dbf9bc).

Changes since v1 [1]
--------------------

- Now using pfns for start and end of page ranges within a folio. `struct
  page`s may not be contiguous on some setups, so using pointers breaks those
  systems. (Thanks to Zi Yan.)
- Fixed zone_device folio reference putting. (Thanks to Matthew and David.)
- Refactored release_pages() and folios_put_refs() so that they now share a
  common implementation.

[1] https://lore.kernel.org/linux-mm/20230810103332.3062143-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/20230727141837.3386072-1-ryan.roberts@arm.com/
[3] https://lore.kernel.org/linux-mm/20230825135918.4164671-1-willy@infradead.org/

Thanks,
Ryan

Ryan Roberts (5):
  mm: Implement folio_remove_rmap_range()
  mm/mmu_gather: generalize mmu_gather rmap removal mechanism
  mm/mmu_gather: Remove encoded_page infrastructure
  mm: Refactor release_pages()
  mm/mmu_gather: Store and process pages in contig ranges

 arch/s390/include/asm/tlb.h |   9 +-
 include/asm-generic/tlb.h   |  49 ++++-----
 include/linux/mm.h          |  11 +-
 include/linux/mm_types.h    |  34 +-----
 include/linux/rmap.h        |   2 +
 include/linux/swap.h        |   6 +-
 mm/memory.c                 |  24 +++--
 mm/mmu_gather.c             | 114 ++++++++++++++------
 mm/rmap.c                   | 125 ++++++++++++++++------
 mm/swap.c                   | 201 ++++++++++++++++++++++--------------
 mm/swap_state.c             |  11 +-
 11 files changed, 367 insertions(+), 219 deletions(-)

--
2.25.1
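The cover letter describes the accumulation step only in prose. A minimal
sketch follows, assuming the `struct pfn_range` / `folios[]` layout quoted in
the comments below; the helper name tlb_batch_add_page() and its folding
logic are illustrative guesses at the approach, not code lifted from the
series:

#include <linux/mm.h>	/* page_to_pfn(), page_folio(), pfn_folio() */

struct pfn_range {
	unsigned long start;	/* first pfn of the run */
	unsigned long end;	/* one past the last pfn */
};

struct mmu_gather_batch {
	struct mmu_gather_batch *next;
	unsigned int nr;
	unsigned int max;
	struct pfn_range folios[];	/* one entry per (part of a) folio */
};

static bool tlb_batch_add_page(struct mmu_gather_batch *batch,
			       struct page *page)
{
	unsigned long pfn = page_to_pfn(page);

	if (batch->nr) {
		struct pfn_range *prev = &batch->folios[batch->nr - 1];

		/*
		 * Extend the previous range only while the new pfn is
		 * adjacent AND belongs to the same folio, preserving the
		 * "one range = one folio" invariant discussed below.
		 */
		if (pfn == prev->end &&
		    page_folio(page) == pfn_folio(prev->start)) {
			prev->end++;
			return true;
		}
	}

	if (batch->nr == batch->max)
		return false;	/* batch full: caller must flush first */

	batch->folios[batch->nr] = (struct pfn_range){ pfn, pfn + 1 };
	batch->nr++;
	return true;
}

The same-folio check is what lets every later batched operation (rmap
removal, swapcache removal, page release) run once per range rather than once
per page.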
Comments
On 30/08/2023 16:07, Matthew Wilcox wrote:
> On Wed, Aug 30, 2023 at 10:50:11AM +0100, Ryan Roberts wrote:
>> +++ b/include/asm-generic/tlb.h
>> @@ -246,11 +246,11 @@ struct mmu_gather_batch {
>>  	struct mmu_gather_batch *next;
>>  	unsigned int nr;
>>  	unsigned int max;
>> -	struct page *pages[];
>> +	struct pfn_range folios[];
>
> I think it's dangerous to call this 'folios' as it lets you think that
> each entry is a single folio. But as I understand this patch, you can
> coagulate contiguous ranges across multiple folios.

No, that's not quite the case; each contiguous range only ever spans a
*single* folio. If there are 2 contiguous folios, they will be represented as
separate ranges. This is done so that we can subsequently do the per-folio
operations without having to figure out how many folios are within each range
- one range = one (contiguous part of a) folio.

On naming, I was calling this variable "ranges" in v1 but thought folios was
actually clearer. How about "folio_regions"?

>> -void free_pages_and_swap_cache(struct page **pages, int nr)
>> +void free_folios_and_swap_cache(struct pfn_range *folios, int nr)
>>  {
>>  	lru_add_drain();
>>  	for (int i = 0; i < nr; i++)
>> -		free_swap_cache(pages[i]);
>> -	release_pages(pages, nr);
>> +		free_swap_cache(pfn_to_page(folios[i].start));
>
> ... but here, you only put the swapcache for the first folio covered by
> the range, not for each folio.

Yes, that's intentional - one range only ever covers one folio, so I only
need to call free_swap_cache() once for the folio. Unless I've misunderstood
and free_swap_cache() is actually decrementing a reference count and needs to
be called for every page? (It doesn't look like that in the code.)

>> +	folios_put_refs(folios, nr);
>
> It's kind of confusing to have folios_put() which takes a struct folio *
> and then folios_put_refs() which takes a struct pfn_range *.
> pfn_range_put()?

I think it's less confusing if you know that each pfn_range represents a
single contig range of pages within a *single* folio. pfn_range_put() would
make it sound like it's OK to pass a pfn_range that spans multiple folios
(this would break). I could rename `struct pfn_range` to `struct sub_folio`
or something like that. Would that help make the semantic clearer?
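The exchange above turns on the invariant that a pfn_range never crosses a
folio boundary. A sketch of the consumer side, reconstructed from the hunk
quoted above (lru_add_drain(), free_swap_cache() and pfn_to_page() are
existing kernel interfaces; folios_put_refs() is introduced by this series),
shows why one free_swap_cache() call per range suffices:

#include <linux/swap.h>	/* lru_add_drain(), free_swap_cache() */

/*
 * Reconstructed from the diff hunk quoted above; illustrative only.
 * Because every pfn_range lies within a single folio, per-folio
 * swapcache removal runs exactly once per entry - any pfn in the
 * range resolves to the same folio, so the first is as good as any.
 */
void free_folios_and_swap_cache(struct pfn_range *folios, int nr)
{
	int i;

	lru_add_drain();
	for (i = 0; i < nr; i++)
		free_swap_cache(pfn_to_page(folios[i].start));

	/* Added by this series: drops one ref per page in each range. */
	folios_put_refs(folios, nr);
}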