From patchwork Wed Jul 19 13:54:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ryan Roberts
X-Patchwork-Id: 12281
From: Ryan Roberts
To: Andrew Morton, Matthew Wilcox, Yin Fengwei, David Hildenbrand,
    Yu Zhao, Yang Shi, "Huang, Ying", Zi Yan
Cc: Ryan Roberts, linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH v2 0/3] Optimize large folio interaction with deferred split
Date: Wed, 19 Jul 2023 14:54:47 +0100
Message-Id: <20230719135450.545227-1-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.25.1
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Hi All,

This is v2 of a small series in support of my work to enable the use of
large folios for anonymous memory (known as "FLEXIBLE_THP" or
"LARGE_ANON_FOLIO") [1]. It first makes it possible to add large,
non-pmd-mappable folios to the deferred split queue. Then it modifies
zap_pte_range() to batch-remove spans of physically contiguous pages from
the rmap, which means that in the common case, we elide the need to ever
put the folio on the deferred split queue, thus reducing lock contention
and improving performance.

The contention becomes more visible once we have lots of large anonymous
folios in the system, and Huang Ying has suggested that solving this needs
to be a prerequisite for merging the main FLEXIBLE_THP/LARGE_ANON_FOLIO
work.

The series applies on top of v6.5-rc2 and a branch is available at [2].

I don't have a full test run with the latest versions of all the patches
on top of the latest baseline, so I am not posting results formally. I can
get these if people feel they are necessary though. But anecdotally, for
the kernel compilation workload, this series reduces kernel time by ~4%
and reduces real-time by ~0.4%, compared with [1].

Changes since v1 [3]
--------------------

- patch 2: Modified doc comment for folio_remove_rmap_range()
- patch 2: Hoisted _nr_pages_mapped manipulation out of page loop so it's
  now modified once per folio_remove_rmap_range() call.
- patch 2: Added check that page range is fully contained by folio in
  folio_remove_rmap_range()
- patch 2: Fixed some nits raised by Huang, Ying for
  folio_remove_rmap_range()
- patch 3: Support batch-zap of all anon pages, not just those in anon vmas
- patch 3: Renamed various functions to make their use clear
- patch 3: Various minor refactoring/cleanups
- Added Reviewed-By tags - thanks!

[1] https://lore.kernel.org/linux-mm/20230714160407.4142030-1-ryan.roberts@arm.com/
[2] https://gitlab.arm.com/linux-arm/linux-rr/-/tree/features/granule_perf/deferredsplit-lkml_v2
[3] https://lore.kernel.org/linux-mm/20230717143110.260162-1-ryan.roberts@arm.com/

Thanks,
Ryan

Ryan Roberts (3):
  mm: Allow deferred splitting of arbitrary large anon folios
  mm: Implement folio_remove_rmap_range()
  mm: Batch-zap large anonymous folio PTE mappings

 include/linux/rmap.h |   2 +
 mm/memory.c          | 120 +++++++++++++++++++++++++++++++++++++++++++
 mm/rmap.c            |  76 ++++++++++++++++++++++++++-
 3 files changed, 196 insertions(+), 2 deletions(-)

---
2.25.1
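
For readers who want a feel for the batching idea described in the cover
letter, here is a minimal userspace sketch in C. It is a toy model under
stated assumptions, not the code in the patches: struct toy_folio,
toy_remove_rmap_range() and toy_zap_range() are hypothetical names, an
array of page frame numbers stands in for the PTEs, every entry is assumed
to belong to the same folio, and "queued for deferred split" is modelled
simply as "the folio was observed partially mapped after a counter
update".

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a large folio: nr_pages physically contiguous pages. */
struct toy_folio {
	unsigned long first_pfn;       /* first page frame number */
	unsigned long nr_pages;        /* pages in the folio */
	unsigned long nr_pages_mapped; /* pages still mapped */
	unsigned long counter_updates; /* times the mapped count was modified */
	int queued_for_split;          /* ever seen partially mapped? (assumed trigger) */
};

/* Unmap nr pages starting at pfn, modifying the mapped count once. */
static void toy_remove_rmap_range(struct toy_folio *folio,
				  unsigned long pfn, unsigned long nr)
{
	/* The range must be fully contained by the folio. */
	assert(pfn >= folio->first_pfn);
	assert(pfn + nr <= folio->first_pfn + folio->nr_pages);
	assert(nr <= folio->nr_pages_mapped);

	folio->nr_pages_mapped -= nr;
	folio->counter_updates++;

	/* A folio left partially mapped is what this model counts as queued. */
	if (folio->nr_pages_mapped > 0 &&
	    folio->nr_pages_mapped < folio->nr_pages)
		folio->queued_for_split = 1;
}

/*
 * Walk the pfn array, group each run of contiguous pfns into one span,
 * and remove the span with a single call. Returns the number of spans.
 */
static size_t toy_zap_range(struct toy_folio *folio,
			    const unsigned long *pfns, size_t n)
{
	size_t i = 0, spans = 0;

	while (i < n) {
		size_t len = 1;

		while (i + len < n && pfns[i + len] == pfns[i] + len)
			len++;

		toy_remove_rmap_range(folio, pfns[i], len);
		spans++;
		i += len;
	}
	return spans;
}
```

In this model, zapping a fully mapped 16-page folio through
toy_zap_range() touches the counter once and never sees the folio
partially mapped, whereas calling toy_remove_rmap_range() once per page
touches it 16 times and marks the folio as queued after the first page.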