From patchwork Fri Jul 21 09:40:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 123741
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, minchan@kernel.org, yuzhao@google.com, willy@infradead.org, david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 1/4] madvise: not use mapcount() against large folio for sharing check
Date: Fri, 21 Jul 2023 17:40:40 +0800
Message-Id: <20230721094043.2506691-2-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230721094043.2506691-1-fengwei.yin@intel.com>
References: <20230721094043.2506691-1-fengwei.yin@intel.com>

Commit 07e8c82b5eff ("madvise: convert
madvise_cold_or_pageout_pte_range() to use folios") replaced
page_mapcount() with folio_mapcount() to check whether the folio is
shared by other mappings. But this is not correct for large folios.
folio_mapcount() returns the total mapcount of a large folio, which is
not suitable for detecting whether the folio is shared.

Use folio_estimated_sharers(), which returns an estimated number of
sharers. The estimate is not 100% accurate, but it should be OK for the
madvise case here.

Signed-off-by: Yin Fengwei
Reviewed-by: Yu Zhao
---
 mm/madvise.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index 38382a5d1e39..f12933ebcc24 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -383,7 +383,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		folio = pfn_folio(pmd_pfn(orig_pmd));
 
 		/* Do not interfere with other mappings of this folio */
-		if (folio_mapcount(folio) != 1)
+		if (folio_estimated_sharers(folio) != 1)
 			goto huge_unlock;
 
 		if (pageout_anon_only_filter && !folio_test_anon(folio))
@@ -459,7 +459,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		if (folio_test_large(folio)) {
 			int err;
 
-			if (folio_mapcount(folio) != 1)
+			if (folio_estimated_sharers(folio) != 1)
 				break;
 			if (pageout_anon_only_filter && !folio_test_anon(folio))
 				break;
@@ -682,7 +682,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 		if (folio_test_large(folio)) {
 			int err;
 
-			if (folio_mapcount(folio) != 1)
+			if (folio_estimated_sharers(folio) != 1)
 				break;
 			if (!folio_trylock(folio))
 				break;

From patchwork Fri Jul 21 09:40:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 123778
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, minchan@kernel.org, yuzhao@google.com, willy@infradead.org, david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 2/4] madvise: Use notify-able API to clear and flush page table entries
Date: Fri, 21 Jul 2023 17:40:41 +0800
Message-Id: <20230721094043.2506691-3-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230721094043.2506691-1-fengwei.yin@intel.com>
References: <20230721094043.2506691-1-fengwei.yin@intel.com>

Currently, in madvise_cold_or_pageout_pte_range(), the young bit of the
pte/pmd is cleared without notifying the subscriber.

Use the notify-able API to make sure the subscriber is signaled about
the young bit clearing.

Signed-off-by: Yin Fengwei
---
 mm/madvise.c | 18 ++----------------
 1 file changed, 2 insertions(+), 16 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index f12933ebcc24..b236e201a738 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -403,14 +403,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			return 0;
 		}
 
-		if (pmd_young(orig_pmd)) {
-			pmdp_invalidate(vma, addr, pmd);
-			orig_pmd = pmd_mkold(orig_pmd);
-
-			set_pmd_at(mm, addr, pmd, orig_pmd);
-			tlb_remove_pmd_tlb_entry(tlb, pmd, addr);
-		}
-
+		pmdp_clear_flush_young_notify(vma, addr, pmd);
 		folio_clear_referenced(folio);
 		folio_test_clear_young(folio);
 		if (folio_test_active(folio))
@@ -496,14 +489,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 
 		VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
 
-		if (pte_young(ptent)) {
-			ptent = ptep_get_and_clear_full(mm, addr, pte,
-							tlb->fullmm);
-			ptent = pte_mkold(ptent);
-			set_pte_at(mm, addr, pte, ptent);
-			tlb_remove_tlb_entry(tlb, pte, addr);
-		}
-
+		ptep_clear_flush_young_notify(vma, addr, pte);
 		/*
 		 * We are deactivating a folio for accelerating reclaiming.
 		 * VM couldn't reclaim the folio unless we clear PG_young.
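
As a rough illustration of why the notify-able helper matters, the
user-space sketch below models a PTE as a flags word with a young bit,
plus a list of subscribers that track their own young state (standing
in for MMU notifier users). Every name in it (PTE_YOUNG_MODEL,
subscriber_model, clear_young_silent_model(), clear_young_notify_model())
is hypothetical and not a kernel API; the real helper also flushes the
TLB, which this model ignores.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PTE_YOUNG_MODEL 0x1UL	/* hypothetical accessed/young bit */

/* Hypothetical subscriber that keeps its own young state for an address. */
struct subscriber_model {
	bool young;		/* secondary young state held by the subscriber */
	int notifications;	/* how many times the subscriber was signaled */
};

/* Clear only the primary young bit; subscribers are never told. */
bool clear_young_silent_model(unsigned long *pte)
{
	bool was_young;

	assert(pte != NULL);
	was_young = (*pte & PTE_YOUNG_MODEL) != 0;
	*pte &= ~PTE_YOUNG_MODEL;
	return was_young;
}

/* Clear the primary young bit and let every subscriber clear its own. */
bool clear_young_notify_model(unsigned long *pte,
			      struct subscriber_model *subs, size_t nr_subs)
{
	bool was_young = clear_young_silent_model(pte);
	size_t i;

	for (i = 0; i < nr_subs; i++) {
		was_young |= subs[i].young;
		subs[i].young = false;
		subs[i].notifications++;
	}
	return was_young;
}
```

With the silent variant a subscriber's young state survives the call,
so the page can still look recently used from the subscriber's side;
the notifying variant clears both.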

From patchwork Fri Jul 21 09:40:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 123779
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, minchan@kernel.org, yuzhao@google.com, willy@infradead.org, david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 3/4] mm: add functions folio_in_range() and folio_within_vma()
Date: Fri, 21 Jul 2023 17:40:42 +0800
Message-Id: <20230721094043.2506691-4-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230721094043.2506691-1-fengwei.yin@intel.com>
References: <20230721094043.2506691-1-fengwei.yin@intel.com>

Add folio_in_range(), which checks whether a folio is mapped to a
specific VMA and whether the folio's mapping address lies within a
given range.

Also add a helper, folio_within_vma(), built on folio_in_range(), to
check whether the folio lies within the range of the VMA.

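For readers following the address arithmetic, here is a minimal
user-space sketch of the range check described above. The struct names
(vma_model, folio_model), the function name folio_in_range_model(), and
the 4 KiB page size are assumptions made for illustration; the sketch
only models the proposed helper and is not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the vm_area_struct fields used by the check. */
struct vma_model {
	unsigned long vm_start;	/* first mapped address */
	unsigned long vm_end;	/* one past the last mapped address */
	unsigned long vm_pgoff;	/* offset of vm_start, in pages */
};

/* Hypothetical stand-in for a folio: page offset and size in pages. */
struct folio_model {
	unsigned long pgoff;
	unsigned long nr_pages;
};

bool folio_in_range_model(const struct folio_model *folio,
			  const struct vma_model *vma,
			  unsigned long start, unsigned long end)
{
	unsigned long vma_pglen, addr;

	assert(vma->vm_start <= vma->vm_end);
	vma_pglen = (vma->vm_end - vma->vm_start) >> MODEL_PAGE_SHIFT;

	/* Clamp the requested range to the VMA. */
	if (start < vma->vm_start)
		start = vma->vm_start;
	if (end > vma->vm_end)
		end = vma->vm_end;

	/* Reject a folio that starts before the VMA or past its end. */
	if (folio->pgoff < vma->vm_pgoff ||
	    folio->pgoff > vma->vm_pgoff + vma_pglen)
		return false;

	/* Translate the page offset into the address it is mapped at. */
	addr = vma->vm_start +
	       ((folio->pgoff - vma->vm_pgoff) << MODEL_PAGE_SHIFT);

	/* The whole folio must fit inside the clamped range. */
	return addr >= start &&
	       addr + (folio->nr_pages << MODEL_PAGE_SHIFT) <= end;
}
```

For example, with a VMA covering 0x100000-0x200000 at page offset 0, a
16-page folio at page offset 16 maps at 0x110000-0x120000: it is in
range for a request covering exactly that span, but not for one that
ends at 0x118000.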
Signed-off-by: Yin Fengwei
---
 mm/internal.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/mm/internal.h b/mm/internal.h
index 483add0bfb28..c7dd15d8de3e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -585,6 +585,38 @@ extern long faultin_vma_page_range(struct vm_area_struct *vma,
 				   bool write, int *locked);
 extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
 			       unsigned long bytes);
+
+static inline bool
+folio_in_range(struct folio *folio, struct vm_area_struct *vma,
+		unsigned long start, unsigned long end)
+{
+	pgoff_t pgoff, addr;
+	unsigned long vma_pglen = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+
+	VM_WARN_ON_FOLIO(folio_test_ksm(folio), folio);
+	if (start < vma->vm_start)
+		start = vma->vm_start;
+
+	if (end > vma->vm_end)
+		end = vma->vm_end;
+
+	pgoff = folio_pgoff(folio);
+
+	/* if folio start address is not in vma range */
+	if (pgoff < vma->vm_pgoff || pgoff > vma->vm_pgoff + vma_pglen)
+		return false;
+
+	addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+
+	return ((addr >= start) && (addr + folio_size(folio) <= end));
+}
+
+static inline bool
+folio_within_vma(struct folio *folio, struct vm_area_struct *vma)
+{
+	return folio_in_range(folio, vma, vma->vm_start, vma->vm_end);
+}
+
 /*
  * mlock_vma_folio() and munlock_vma_folio():
  * should be called with vma's mmap_lock held for read or write,

From patchwork Fri Jul 21 09:40:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 123728
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org, akpm@linux-foundation.org, minchan@kernel.org, yuzhao@google.com, willy@infradead.org, david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 4/4] madvise: avoid trying to split large folio always in cold_pageout
Date: Fri, 21 Jul 2023 17:40:43 +0800
Message-Id: <20230721094043.2506691-5-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230721094043.2506691-1-fengwei.yin@intel.com>
References: <20230721094043.2506691-1-fengwei.yin@intel.com>

Currently, madvise_cold_or_pageout_pte_range() always tries to split
large folios.

Avoid always trying to split large folios:
  - If the large folio is within the requested range, don't split it.
    Leave it to page reclaim to decide whether the large folio needs
    to be split.
  - If the large folio crosses the boundaries of the requested range,
    skip it if it's page cache. Try to split it if it's an anonymous
    large folio. If the split fails, just skip it.

Invoke folio_referenced() to clear the A bit for large folios.
As folio_referenced() acquires the pte lock, call it only after the pte
lock has been released.

Signed-off-by: Yin Fengwei
---
 mm/internal.h |  10 +++++
 mm/madvise.c  | 118 +++++++++++++++++++++++++++++++++++---------------
 2 files changed, 93 insertions(+), 35 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index c7dd15d8de3e..cd1ff348d690 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -586,6 +586,16 @@ extern long faultin_vma_page_range(struct vm_area_struct *vma,
 extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
 			       unsigned long bytes);
 
+static inline unsigned int
+folio_op_size(struct folio *folio, pte_t pte,
+		unsigned long addr, unsigned long end)
+{
+	unsigned int nr;
+
+	nr = folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte);
+	return min_t(unsigned int, nr, (end - addr) >> PAGE_SHIFT);
+}
+
 static inline bool
 folio_in_range(struct folio *folio, struct vm_area_struct *vma,
 		unsigned long start, unsigned long end)
diff --git a/mm/madvise.c b/mm/madvise.c
index b236e201a738..71af370c3251 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -339,6 +339,23 @@ static inline bool can_do_file_pageout(struct vm_area_struct *vma)
 	       file_permission(vma->vm_file, MAY_WRITE) == 0;
 }
 
+static inline bool skip_cur_entry(struct folio *folio, bool pageout_anon_only)
+{
+	if (!folio)
+		return true;
+
+	if (folio_is_zone_device(folio))
+		return true;
+
+	if (!folio_test_lru(folio))
+		return true;
+
+	if (pageout_anon_only && !folio_test_anon(folio))
+		return true;
+
+	return false;
+}
+
 static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 				unsigned long addr, unsigned long end,
 				struct mm_walk *walk)
@@ -352,7 +369,9 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 	spinlock_t *ptl;
 	struct folio *folio = NULL;
 	LIST_HEAD(folio_list);
+	LIST_HEAD(reclaim_list);
 	bool pageout_anon_only_filter;
+	unsigned long start = addr;
 
 	if (fatal_signal_pending(current))
 		return -EINTR;
@@ -442,54 +461,90 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			continue;
 
 		folio = vm_normal_folio(vma, addr, ptent);
-		if (!folio || folio_is_zone_device(folio))
+		if (skip_cur_entry(folio, pageout_anon_only_filter))
 			continue;
 
 		/*
-		 * Creating a THP page is expensive so split it only if we
-		 * are sure it's worth. Split it if we are only owner.
+		 * Split large folio only if it's anonymous, cross the
+		 * boundaries of request range and we are likely the
+		 * only owner.
 		 */
 		if (folio_test_large(folio)) {
-			int err;
+			int err, step;
 
 			if (folio_estimated_sharers(folio) != 1)
-				break;
-			if (pageout_anon_only_filter && !folio_test_anon(folio))
-				break;
-			if (!folio_trylock(folio))
-				break;
+				continue;
+			if (folio_in_range(folio, vma, start, end))
+				goto pageout_cold_folio;
+			if (!folio_test_anon(folio) || !folio_trylock(folio))
+				continue;
+
 			folio_get(folio);
+			step = folio_op_size(folio, ptent, addr, end);
 			arch_leave_lazy_mmu_mode();
 			pte_unmap_unlock(start_pte, ptl);
 			start_pte = NULL;
 			err = split_folio(folio);
 			folio_unlock(folio);
 			folio_put(folio);
-			if (err)
-				break;
+
 			start_pte = pte =
 				pte_offset_map_lock(mm, pmd, addr, &ptl);
 			if (!start_pte)
 				break;
 			arch_enter_lazy_mmu_mode();
-			pte--;
-			addr -= PAGE_SIZE;
-			continue;
-		}
 
-		/*
-		 * Do not interfere with other mappings of this folio and
-		 * non-LRU folio.
-		 */
-		if (!folio_test_lru(folio) || folio_mapcount(folio) != 1)
+			/* split success. retry the same entry */
+			if (!err)
+				step = 0;
+
+			/*
+			 * Split fails, jump over the whole folio to avoid
+			 * grabbing same folio but fails to split it again
+			 * and again.
+			 */
+			pte += step - 1;
+			addr += (step - 1) << PAGE_SHIFT;
 			continue;
+		}
 
-		if (pageout_anon_only_filter && !folio_test_anon(folio))
+		/* Do not interfere with other mappings of this folio */
+		if (folio_mapcount(folio) != 1)
 			continue;
 
 		VM_BUG_ON_FOLIO(folio_test_large(folio), folio);
-
 		ptep_clear_flush_young_notify(vma, addr, pte);
+
+pageout_cold_folio:
+		if (folio_isolate_lru(folio)) {
+			if (folio_test_unevictable(folio))
+				folio_putback_lru(folio);
+			else
+				list_add(&folio->lru, &folio_list);
+		}
+	}
+
+	if (start_pte) {
+		arch_leave_lazy_mmu_mode();
+		pte_unmap_unlock(start_pte, ptl);
+	}
+
+	while (!list_empty(&folio_list)) {
+		folio = lru_to_folio(&folio_list);
+		list_del(&folio->lru);
+
+		if (folio_test_large(folio)) {
+			int refs;
+			unsigned long flags;
+			struct mem_cgroup *memcg = folio_memcg(folio);
+
+			refs = folio_referenced(folio, 0, memcg, &flags);
+			if ((flags & VM_LOCKED) || (refs == -1)) {
+				folio_putback_lru(folio);
+				continue;
+			}
+		}
+
 		/*
 		 * We are deactivating a folio for accelerating reclaiming.
 		 * VM couldn't reclaim the folio unless we clear PG_young.
@@ -501,22 +556,15 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		if (folio_test_active(folio))
 			folio_set_workingset(folio);
 		if (pageout) {
-			if (folio_isolate_lru(folio)) {
-				if (folio_test_unevictable(folio))
-					folio_putback_lru(folio);
-				else
-					list_add(&folio->lru, &folio_list);
-			}
-		} else
-			folio_deactivate(folio);
+			list_add(&folio->lru, &reclaim_list);
+		} else {
+			folio_clear_active(folio);
+			folio_putback_lru(folio);
+		}
 	}
 
-	if (start_pte) {
-		arch_leave_lazy_mmu_mode();
-		pte_unmap_unlock(start_pte, ptl);
-	}
 	if (pageout)
-		reclaim_pages(&folio_list);
+		reclaim_pages(&reclaim_list);
 	cond_resched();
 
 	return 0;
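
The large-folio policy in this patch and its skip arithmetic can be
sketched in user space as follows. This is a simplified model under
stated assumptions: the enum, both function names, and the 4 KiB page
size are hypothetical, the folio lock is left out (the patch also skips
the folio when folio_trylock() fails), and page frame numbers are passed
as plain integers instead of being read from a folio and a PTE.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

enum large_folio_action {
	LARGE_FOLIO_SKIP,	/* leave the folio alone */
	LARGE_FOLIO_WHOLE,	/* isolate the whole folio without splitting */
	LARGE_FOLIO_TRY_SPLIT,	/* anonymous folio that crosses the range */
};

/* Decision taken for a large folio met during the page table walk. */
enum large_folio_action
large_folio_action_model(int estimated_sharers, bool in_range, bool is_anon)
{
	if (estimated_sharers != 1)
		return LARGE_FOLIO_SKIP;
	if (in_range)
		return LARGE_FOLIO_WHOLE;
	return is_anon ? LARGE_FOLIO_TRY_SPLIT : LARGE_FOLIO_SKIP;
}

/* Pages from the current PTE to the end of the folio, capped by the range. */
unsigned int folio_op_size_model(unsigned long folio_pfn,
				 unsigned long folio_nr_pages,
				 unsigned long pte_pfn,
				 unsigned long addr, unsigned long end)
{
	unsigned long nr, room;

	assert(pte_pfn >= folio_pfn && pte_pfn < folio_pfn + folio_nr_pages);
	assert(addr < end);

	nr = folio_pfn + folio_nr_pages - pte_pfn;
	room = (end - addr) >> MODEL_PAGE_SHIFT;
	return (unsigned int)(nr < room ? nr : room);
}
```

When a split fails, the walk advances by step - 1 entries and the loop
increment supplies the final one, so the next iteration starts just past
the folio (or at the end of the range). When the split succeeds, step is
reset to 0, so the same entry is visited again and now finds a small
folio.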