From patchwork Tue Sep 26 06:09:10 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Huang, Ying"
X-Patchwork-Id: 144788
From: Huang Ying
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Arjan Van De Ven, Huang Ying, Mel Gorman, Vlastimil Babka,
 David Hildenbrand, Johannes Weiner, Dave Hansen, Michal Hocko,
 Pavel Tatashin, Matthew Wilcox, Christoph Lameter
Subject: [PATCH -V2 09/10] mm, pcp: avoid to reduce PCP high unnecessarily
Date: Tue, 26 Sep 2023 14:09:10 +0800
Message-Id: <20230926060911.266511-10-ying.huang@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230926060911.266511-1-ying.huang@intel.com>
References: <20230926060911.266511-1-ying.huang@intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

In the PCP high auto-tuning algorithm, the periodic vmstat updating
kworker (via refresh_cpu_vm_stats()) decreases PCP high to minimize
idle pages in the PCP by freeing pages that may be idle.  One issue is
that PCP high may be reduced unnecessarily, even if the page
allocating/freeing depth is larger than the maximal PCP high.

To avoid this issue, this patch tracks the minimal PCP page count.
The periodic PCP high decrement will not exceed the recent minimal PCP
page count, so only pages detected as idle are freed.

On a 2-socket Intel server with 224 logical CPUs, we run 8 kbuild
instances in parallel (each with `make -j 28`) in 8 cgroups.  This
simulates the kbuild server that is used by the 0-Day kbuild service.
With the patch, the number of pages allocated from the zone (instead
of from the PCP) decreases by 21.4%.
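
To illustrate the capped decrement, the following is a minimal
userspace sketch, not the kernel code.  The struct, the helper names
and the value of MODEL_BATCH_SCALE_MAX are assumptions made for
illustration; the draining of pages above the new high is left out.

```c
#include <assert.h>

/* Hypothetical userspace model of the fields this patch touches. */
struct pcp_model {
	int count;	/* pages currently in the PCP */
	int count_min;	/* lowest count seen since the last decay */
	int high;	/* current high watermark */
	int high_min;	/* floor for the high watermark */
	int batch;
};

/* Assumed value, for illustration only. */
#define MODEL_BATCH_SCALE_MAX 5

static int min_int(int a, int b)
{
	return a < b ? a : b;
}

static int max3_int(int a, int b, int c)
{
	int m = a > b ? a : b;

	return m > c ? m : c;
}

/* Allocation path: take pages and record the low-water mark. */
static void model_alloc(struct pcp_model *pcp, int nr)
{
	assert(nr <= pcp->count);
	pcp->count -= nr;
	if (pcp->count < pcp->count_min)
		pcp->count_min = pcp->count;
}

/* Rule before this patch: always try to shave 1/5 off high. */
static int model_old_high(const struct pcp_model *pcp)
{
	if (pcp->high <= pcp->high_min)
		return pcp->high;
	return max3_int(pcp->count - (pcp->batch << MODEL_BATCH_SCALE_MAX),
			pcp->high * 4 / 5, pcp->high_min);
}

/*
 * Rule with this patch: the decrement is capped by count_min, that
 * is, by the pages that stayed in the PCP for the whole interval.
 * count_min is then reset to start a new interval.
 */
static int model_decay_high(struct pcp_model *pcp)
{
	if (pcp->high > pcp->high_min) {
		int decrease = min_int(pcp->count_min, pcp->high / 5);

		pcp->high = max3_int(pcp->count -
				     (pcp->batch << MODEL_BATCH_SCALE_MAX),
				     pcp->high - decrease, pcp->high_min);
	}
	pcp->count_min = pcp->count;
	return pcp->high;
}
```

For example, with high = 1000, high_min = 100, batch = 63, count = 900
and count_min = 0 (the PCP was emptied at some point in the interval),
the old rule lowers high to 800, while the capped rule keeps it at
1000 because no page was detected as idle.
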
Signed-off-by: "Huang, Ying"
Cc: Andrew Morton
Cc: Mel Gorman
Cc: Vlastimil Babka
Cc: David Hildenbrand
Cc: Johannes Weiner
Cc: Dave Hansen
Cc: Michal Hocko
Cc: Pavel Tatashin
Cc: Matthew Wilcox
Cc: Christoph Lameter
---
 include/linux/mmzone.h |  1 +
 mm/page_alloc.c        | 15 ++++++++++-----
 2 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 8a19e2af89df..35b78c7522a7 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -682,6 +682,7 @@ enum zone_watermarks {
 struct per_cpu_pages {
 	spinlock_t lock;	/* Protects lists field */
 	int count;		/* number of pages in the list */
+	int count_min;		/* minimal number of pages in the list recently */
 	int high;		/* high watermark, emptying needed */
 	int high_min;		/* min high watermark */
 	int high_max;		/* max high watermark */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 08b74c65b88a..d7b602822ab3 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2166,19 +2166,20 @@ static int rmqueue_bulk(struct zone *zone, unsigned int order,
  */
 int decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp)
 {
-	int high_min, to_drain, batch;
+	int high_min, decrease, to_drain, batch;
 	int todo = 0;
 
 	high_min = READ_ONCE(pcp->high_min);
 	batch = READ_ONCE(pcp->batch);
 	/*
-	 * Decrease pcp->high periodically to try to free possible
-	 * idle PCP pages. And, avoid to free too many pages to
-	 * control latency.
+	 * Decrease pcp->high periodically to free idle PCP pages counted
+	 * via pcp->count_min. And, avoid to free too many pages to
+	 * control latency. This caps pcp->high decrement too.
 	 */
 	if (pcp->high > high_min) {
+		decrease = min(pcp->count_min, pcp->high / 5);
 		pcp->high = max3(pcp->count - (batch << PCP_BATCH_SCALE_MAX),
-				 pcp->high * 4 / 5, high_min);
+				 pcp->high - decrease, high_min);
 		if (pcp->high > high_min)
 			todo++;
 	}
@@ -2191,6 +2192,8 @@ int decay_pcp_high(struct zone *zone, struct per_cpu_pages *pcp)
 		todo++;
 	}
 
+	pcp->count_min = pcp->count;
+
 	return todo;
 }
 
@@ -2828,6 +2831,8 @@ struct page *__rmqueue_pcplist(struct zone *zone, unsigned int order,
 			page = list_first_entry(list, struct page, pcp_list);
 			list_del(&page->pcp_list);
 			pcp->count -= 1 << order;
+			if (pcp->count < pcp->count_min)
+				pcp->count_min = pcp->count;
 		} while (check_new_pages(page, order));
 
 		return page;