From patchwork Wed Oct 25 14:45:42 2023
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 15887
From: Ryan Roberts
To: Andrew Morton, David Hildenbrand, Matthew Wilcox, Huang Ying,
 Gao Xiang, Yu Zhao, Yang Shi, Michal Hocko, Kefeng Wang
Cc: Ryan Roberts, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org
Subject: [PATCH v3 0/4] Swap-out small-sized THP without splitting
Date: Wed, 25 Oct 2023 15:45:42 +0100
Message-Id: <20231025144546.577640-1-ryan.roberts@arm.com>

Hi All,

This is v3 of a series to add support for swapping out small-sized THP
without needing to first split the large folio via __split_huge_page().
It closely follows the approach already used by PMD-sized THP.

"Small-sized THP" is an upcoming feature that enables performance
improvements by allocating large folios for anonymous memory, where the
large folio size is smaller than the traditional PMD-size. See [3].

In some circumstances I've observed a performance regression (see patch 2
for details), and this series is an attempt to fix the regression in
advance of merging small-sized THP support.

I've done what I thought was the smallest change possible, and as a
result, this approach is only employed when the swap is backed by a
non-rotating block device (just as PMD-sized THP is supported today).
Discussion against the RFC concluded that this is probably sufficient.

The series applies against mm-unstable (1a3c85fa684a).


Changes since v2 [2]
====================

 - Reuse scan_swap_map_try_ssd_cluster() between order-0 and order > 0
   allocation. This required some refactoring to make everything work
   nicely (new patches 2 and 3).
 - Fix bug where nr_swap_pages would say there are pages available but
   the scanner would not be able to allocate them because they were
   reserved for the per-cpu allocator. We now allow stealing of order-0
   entries from the high order per-cpu clusters (in addition to existing
   stealing from order-0 per-cpu clusters).

Thanks to Huang, Ying for the review feedback and suggestions!


Changes since v1 [1]
====================

 - patch 1:
    - Use cluster_set_count() instead of cluster_set_count_flag() in
      swap_alloc_cluster() since we no longer have any flag to set. I was
      unable to kill cluster_set_count_flag() as proposed against v1 as
      other call sites depend on explicitly setting flags to 0.
 - patch 2:
    - Moved large_next[] array into percpu_cluster to make it per-cpu
      (recommended by Huang, Ying).
    - large_next[] array is dynamically allocated because PMD_ORDER is
      not compile-time constant for powerpc (fixes build error).


Thanks,
Ryan

P.S. I know we agreed this is not a prerequisite for merging small-sized
THP, but given Huang Ying had provided some review feedback, I wanted to
progress it. All the actual prerequisites are either complete or being
worked on by others.


[1] https://lore.kernel.org/linux-mm/20231010142111.3997780-1-ryan.roberts@arm.com/
[2] https://lore.kernel.org/linux-mm/20231017161302.2518826-1-ryan.roberts@arm.com/
[3] https://lore.kernel.org/linux-mm/15a52c3d-9584-449b-8228-1335e0753b04@arm.com/


Ryan Roberts (4):
  mm: swap: Remove CLUSTER_FLAG_HUGE from swap_cluster_info:flags
  mm: swap: Remove struct percpu_cluster
  mm: swap: Simplify ssd behavior when scanner steals entry
  mm: swap: Swap-out small-sized THP without splitting

 include/linux/swap.h |  31 +++---
 mm/huge_memory.c     |   3 -
 mm/swapfile.c        | 232 ++++++++++++++++++++++++-------------------
 mm/vmscan.c          |  10 +-
 4 files changed, 149 insertions(+), 127 deletions(-)

---
2.25.1
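
For anyone who wants to see the nr_swap_pages bug from the v2 changelog
in isolation, here is a tiny user-space model of it. It is only a sketch
under heavy assumptions and is not the code in mm/swapfile.c: every name
in it (toy_swap, toy_scan_cluster(), toy_alloc_order0(), toy_demo()) is
made up for illustration, there is a single CPU, and there are two
clusters of four entries each, with cluster 1 reserved for high-order
allocation. The point is only that the free count can be non-zero while
an order-0 scan fails, unless the scan is allowed to steal from the
reserved high-order cluster.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define NR_CLUSTERS   2
#define CLUSTER_SIZE  4
#define NR_ENTRIES    (NR_CLUSTERS * CLUSTER_SIZE)

/* Hypothetical toy state; not the kernel's swap_info_struct. */
struct toy_swap {
	bool used[NR_ENTRIES];	/* one flag per swap entry */
	int reserved_high;	/* cluster held for high-order allocation */
	int nr_free;		/* plays the role of nr_swap_pages */
};

/* Take the first free entry in one cluster, or return -1 if it is full. */
static int toy_scan_cluster(struct toy_swap *s, int cluster)
{
	assert(cluster >= 0 && cluster < NR_CLUSTERS);

	for (int i = 0; i < CLUSTER_SIZE; i++) {
		int off = cluster * CLUSTER_SIZE + i;

		if (!s->used[off]) {
			s->used[off] = true;
			s->nr_free--;
			return off;
		}
	}
	return -1;
}

/*
 * Order-0 allocation. With steal_high == false the scan skips the
 * cluster reserved for high orders (the buggy behaviour in this model);
 * with steal_high == true it may take an entry from that cluster too.
 */
static int toy_alloc_order0(struct toy_swap *s, bool steal_high)
{
	if (s->nr_free == 0)
		return -1;

	for (int c = 0; c < NR_CLUSTERS; c++) {
		int off;

		if (c == s->reserved_high && !steal_high)
			continue;

		off = toy_scan_cluster(s, c);
		if (off >= 0)
			return off;
	}
	return -1;
}

/*
 * Fill cluster 0 with order-0 entries, then try one more allocation
 * while nr_free is still 4 (all of it inside the reserved cluster).
 */
static int toy_demo(bool steal_high)
{
	struct toy_swap s;

	memset(&s, 0, sizeof(s));
	s.reserved_high = 1;
	s.nr_free = NR_ENTRIES;

	for (int i = 0; i < CLUSTER_SIZE; i++)
		toy_alloc_order0(&s, false);

	return toy_alloc_order0(&s, steal_high);
}
```

In this model toy_demo(false) fails even though four entries are still
counted as free, while toy_demo(true) hands out the first entry of the
reserved cluster.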