From patchwork Wed Aug 30 09:50:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 13752
From: Ryan Roberts
To: Will Deacon, "Aneesh Kumar K.V", Andrew Morton, Nick Piggin,
 Peter Zijlstra, Christian Borntraeger, Sven Schnelle, Arnd Bergmann,
 "Matthew Wilcox (Oracle)", David Hildenbrand, Yu Zhao, "Kirill A.
 Shutemov", Yin Fengwei, Yang Shi, "Huang, Ying", Zi Yan
Cc: Ryan Roberts, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH v2 0/5] Optimize mmap_exit for large folios
Date: Wed, 30 Aug 2023 10:50:06 +0100
Message-Id: <20230830095011.1228673-1-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
X-Mailing-List: linux-kernel@vger.kernel.org

Hi All,

This is v2 of a series to improve performance of process teardown,
taking advantage of the fact that large folios are increasingly
regularly pte-mapped in user space; supporting filesystems already use
large folios for pagecache memory, and large folios for anonymous memory
are (hopefully) on the horizon.

See last patch for performance numbers, including measurements that show
this approach doesn't regress (and actually improves a little bit) when
all folios are small.

The basic approach is to accumulate contiguous ranges of pages in the
mmu_gather structure (instead of storing each individual page pointer),
then take advantage of this internal format to efficiently batch rmap
removal, swapcache removal and page release - see the commit messages
for more details.

This series replaces the previous approach I took at [2], which was much
smaller in scope, only attempting to batch rmap removal for anon pages.
Feedback was that I should do something more general that would also
batch-remove pagecache pages from the rmap. But while designing that, I
found it was also possible to improve swapcache removal and page
release. Hopefully I haven't gone too far the other way now!
Note that patch 1 is unchanged from that original series.

Note that this series will conflict with Matthew's series at [3]. I
figure we both race to mm-unstable and the loser has to do the conflict
resolution?

This series is based on mm-unstable (b93868dbf9bc).

Changes since v1 [1]
--------------------

- Now using pfns for start and end of page ranges within a folio.
  `struct page`s may not be contiguous on some setups so using pointers
  breaks these systems. (Thanks to Zi Yan).
- Fixed zone_device folio reference putting. (Thanks to Matthew and
  David).
- Refactored release_pages() and folios_put_refs() so that they now
  share a common implementation.

[1] https://lore.kernel.org/linux-mm/20230810103332.3062143-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/20230727141837.3386072-1-ryan.roberts@arm.com/
[3] https://lore.kernel.org/linux-mm/20230825135918.4164671-1-willy@infradead.org/

Thanks,
Ryan

Ryan Roberts (5):
  mm: Implement folio_remove_rmap_range()
  mm/mmu_gather: generalize mmu_gather rmap removal mechanism
  mm/mmu_gather: Remove encoded_page infrastructure
  mm: Refactor release_pages()
  mm/mmu_gather: Store and process pages in contig ranges

 arch/s390/include/asm/tlb.h |   9 +-
 include/asm-generic/tlb.h   |  49 ++++-----
 include/linux/mm.h          |  11 +-
 include/linux/mm_types.h    |  34 +-----
 include/linux/rmap.h        |   2 +
 include/linux/swap.h        |   6 +-
 mm/memory.c                 |  24 +++--
 mm/mmu_gather.c             | 114 ++++++++++++++------
 mm/rmap.c                   | 125 ++++++++++++++++------
 mm/swap.c                   | 201 ++++++++++++++++++++++--------------
 mm/swap_state.c             |  11 +-
 11 files changed, 367 insertions(+), 219 deletions(-)

---
2.25.1
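
For readers new to the idea, the range accumulation described in the cover letter (storing contiguous page ranges by pfn rather than one pointer per page) can be illustrated with a minimal userspace sketch. This is not code from the series: `struct pfn_range` and `range_add_pfn()` are hypothetical names invented for illustration, and the sketch ignores folio boundaries, whereas the cover letter says ranges are kept within a folio.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration only; not the kernel's mmu_gather layout. */
struct pfn_range {
	unsigned long start;	/* first pfn in the range */
	unsigned long end;	/* one past the last pfn in the range */
};

/*
 * Record @pfn in @ranges, which currently holds *@nr entries and has room
 * for @cap. If @pfn directly follows the most recent range, that range is
 * extended; otherwise a new single-page range is started.
 *
 * Returns 0 on success, or -1 if a new range is needed but the array is
 * full (the point at which a real gather would have to flush).
 */
static int range_add_pfn(struct pfn_range *ranges, size_t *nr, size_t cap,
			 unsigned long pfn)
{
	assert(ranges && nr);

	/* Contiguous with the last range: just grow it. */
	if (*nr > 0 && ranges[*nr - 1].end == pfn) {
		ranges[*nr - 1].end++;
		return 0;
	}

	/* Not contiguous: a new slot is required. */
	if (*nr == cap)
		return -1;

	ranges[*nr].start = pfn;
	ranges[*nr].end = pfn + 1;
	(*nr)++;
	return 0;
}
```

With this shape, a run of pages belonging to one large folio occupies a single slot, so later passes (rmap removal, swapcache removal, page release in the series' terms) could in principle operate once per range instead of once per page. Using pfns rather than `struct page` pointers for the bounds mirrors the v2 change noted above, since page structures may not be contiguous on some setups.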