Message ID: <20231222102255.56993-1-ryncsn@gmail.com>
Headers:
From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>, Yu Zhao <yuzhao@google.com>,
    linux-kernel@vger.kernel.org, Kairui Song <kasong@tencent.com>
Subject: [PATCH 0/3] mm, lru_gen: batch update pages when aging
Date: Fri, 22 Dec 2023 18:22:52 +0800
Message-ID: <20231222102255.56993-1-ryncsn@gmail.com>
Reply-To: Kairui Song <kasong@tencent.com>
Series: mm, lru_gen: batch update pages when aging
Message
Kairui Song
Dec. 22, 2023, 10:22 a.m. UTC
From: Kairui Song <kasong@tencent.com>
Currently, when MGLRU ages, it moves pages one by one and updates the mm
counters page by page. This is correct, but the overhead can be reduced
by batching these operations.
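
As an illustration of that pattern, here is a minimal, self-contained
userspace sketch (batch_move(), batch_flush() and the bare nr_pages[]
array are hypothetical stand-ins; the actual patch accumulates deltas in
a struct gen_update_batch inside folio_inc_gen() and flushes them into
lrugen->nr_pages once per scanned LRU list, as the quoted diff below
shows):

#include <stdio.h>

#define MAX_NR_GENS 4

/* Stand-in for the shared per-generation page counters. */
static long nr_pages[MAX_NR_GENS];

/* Per-scan accumulator: one signed delta per generation. */
struct gen_update_batch {
        long delta[MAX_NR_GENS];
};

/* Record a folio of nr pages moving between generations; only the
 * local deltas are touched, the shared counters are left alone. */
static void batch_move(struct gen_update_batch *batch,
                       int old_gen, int new_gen, long nr)
{
        batch->delta[old_gen] -= nr;
        batch->delta[new_gen] += nr;
}

/* Flush all accumulated deltas at once, after the scan finishes
 * or aborts. */
static void batch_flush(struct gen_update_batch *batch)
{
        int gen;

        for (gen = 0; gen < MAX_NR_GENS; gen++)
                nr_pages[gen] += batch->delta[gen];
}

int main(void)
{
        struct gen_update_batch batch = { { 0 } };
        int i;

        nr_pages[0] = 1000;
        for (i = 0; i < 64; i++)        /* age 64 pages: gen 0 -> gen 1 */
                batch_move(&batch, 0, 1, 1);
        batch_flush(&batch);
        printf("gen0=%ld gen1=%ld\n", nr_pages[0], nr_pages[1]); /* 936 64 */
        return 0;
}

The effect is that one scan performs at most MAX_NR_GENS counter writes
instead of one write per page moved.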
Batch moving also has a good effect on LRU ordering. Currently, when
MGLRU ages, it walks the LRU backward and moves protected pages to the
tail of the newer gen one by one, which reverses their order in the LRU.
Moving them in batches helps preserve their order, though only within a
small scope, due to the scan limit of MAX_LRU_BATCH pages.
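
A self-contained sketch of why the bulk move preserves order, using a toy
doubly linked list in place of the kernel's list.h helpers (the node type
and the add/splice functions below are simplified stand-ins, not the
patch's code):

#include <stdio.h>

/* Toy circular doubly linked list, standing in for the kernel's list.h. */
struct node {
        struct node *prev, *next;
        int id;
};

static void list_init(struct node *h) { h->prev = h->next = h; }

static void list_del(struct node *n)
{
        n->prev->next = n->next;
        n->next->prev = n->prev;
}

static void list_add_head(struct node *n, struct node *h)
{
        n->next = h->next;
        n->prev = h;
        h->next->prev = n;
        h->next = n;
}

static void list_add_tail(struct node *n, struct node *h)
{
        n->prev = h->prev;
        n->next = h;
        h->prev->next = n;
        h->prev = n;
}

/* Append a whole local list to the tail of h in one step. */
static void list_splice_tail(struct node *list, struct node *h)
{
        struct node *first = list->next, *last = list->prev;

        if (first == list)
                return;
        first->prev = h->prev;
        h->prev->next = first;
        last->next = h;
        h->prev = last;
        list_init(list);
}

int main(void)
{
        struct node p[3], old_gen, new_gen, batch, *n;
        int i;

        list_init(&old_gen); list_init(&new_gen); list_init(&batch);
        for (i = 0; i < 3; i++) {
                p[i].id = i;
                list_add_tail(&p[i], &old_gen);  /* old gen: 0 1 2 */
        }

        /*
         * Walk the old gen backward, as the aging scan does (2, 1, 0).
         * Moving each page straight to the new gen's tail would yield
         * 2 1 0; collecting them head-first into a local batch and then
         * splicing once keeps the original 0 1 2.
         */
        while (old_gen.prev != &old_gen) {
                n = old_gen.prev;
                list_del(n);
                list_add_head(n, &batch);
        }
        list_splice_tail(&batch, &new_gen);

        for (n = new_gen.next; n != &new_gen; n = n->next)
                printf("%d ", n->id);  /* prints: 0 1 2 */
        printf("\n");
        return 0;
}

Within one batch the relative order of the protected pages survives the
move; the guarantee resets every MAX_LRU_BATCH pages, hence the "small
scope" above.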
I noticed a higher performance gain when a lot of pages are being
protected, but that workload is hard to reproduce, so I instead tested
with memtier, a simpler benchmark that also gives a more generic result.
Aging is not the main overhead there, but the result still looks good:
Average result of 18 test runs:
Before: 44017.78 Ops/sec
After patches 1-3: 44890.50 Ops/sec (+2.0%)
Kairui Song (3):
mm, lru_gen: batch update counters on aging
mm, lru_gen: move pages in bulk when aging
mm, lru_gen: try to prefetch next page when scanning LRU
mm/vmscan.c | 140 ++++++++++++++++++++++++++++++++++++++++++++++------
1 file changed, 124 insertions(+), 16 deletions(-)
Comments
On Sun, Dec 24, 2023 at 11:41:31PM -0700, Yu Zhao wrote:
> On Fri, Dec 22, 2023 at 3:24 AM Kairui Song <ryncsn@gmail.com> wrote:
> >
> > From: Kairui Song <kasong@tencent.com>
> >
> > Prefetch for the inactive/active LRU has long existed; apply the same
> > optimization to MGLRU.
>
> I seriously doubt that prefetch helps in this case.
>
> Willy, any thoughts on this? Thanks.

It _might_ ... highly depends on microarchitecture. My experience is
that it offers more benefit on AMD than on Intel, but that experience
is several generations out of date and it may just not be applicable
to modern AMD. It's probably more effective on ARM Cortex A cores than
on ARM Cortex X cores ... maybe we can get someone from Android
(Suren?) to do some testing?
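
For reference, the hint being discussed is in the style of the
long-standing prefetch in the classic LRU scan (the
prefetchw_prev_lru_folio() pattern in mm/vmscan.c); a simplified sketch
using the GCC/Clang builtin directly, with a hypothetical folio_stub
type:

/* Hypothetical minimal stand-in for struct folio. */
struct folio_stub {
        struct folio_stub *prev, *next;
        unsigned long flags;
};

/*
 * While the current entry is being processed, ask the CPU to start
 * fetching the one the backward walk will touch next. In
 * __builtin_prefetch(addr, rw, locality), 1 means write intent and 3
 * means high temporal locality; it is only a hint and may be ignored.
 */
static inline void prefetch_prev(struct folio_stub *cur,
                                 struct folio_stub *head)
{
        if (cur->prev != head)
                __builtin_prefetch(cur->prev, 1, 3);
}

Whether the hint wins anything is exactly the open question here: a
hardware prefetcher that already follows the pointer chain makes it
redundant, which is why results may differ across microarchitectures.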
Hi Kairui,

Some early feedback on your patch. I am still working my way through
your patches. Might have more questions.

On Fri, Dec 22, 2023 at 2:24 AM Kairui Song <ryncsn@gmail.com> wrote:
>
> From: Kairui Song <kasong@tencent.com>
>
> When lru_gen is aging, it will update mm counters page by page,
> which causes a higher overhead if aging happens frequently or there
> are a lot of pages in one generation getting moved.
> Optimize this by doing the counter updates in batch.
>
> Although most __mod_*_state helpers have their own caches, the
> overhead is still observable.
>
> Tested in a 4G memcg on an EPYC 7K62 with:
>
>     memcached -u nobody -m 16384 -s /tmp/memcached.socket \
>         -a 0766 -t 16 -B binary &
>
>     memtier_benchmark -S /tmp/memcached.socket \
>         -P memcache_binary -n allkeys \
>         --key-minimum=1 --key-maximum=16000000 -d 1024 \
>         --ratio=1:0 --key-pattern=P:P -c 2 -t 16 --pipeline 8 -x 6
>
> Average result of 18 test runs:
>
> Before: 44017.78 Ops/sec
> After: 44687.08 Ops/sec (+1.5%)
>
> Signed-off-by: Kairui Song <kasong@tencent.com>
> ---
>  mm/vmscan.c | 64 +++++++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 55 insertions(+), 9 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index b4ca3563bcf4..e3b4797b9729 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3095,9 +3095,47 @@ static int folio_update_gen(struct folio *folio, int gen)
>          return ((old_flags & LRU_GEN_MASK) >> LRU_GEN_PGOFF) - 1;
>  }
>
> +/*
> + * Update LRU gen in batch for each lru_gen LRU list. The batch is limited to
> + * each gen / type / zone level LRU. Batch is applied after finished or aborted
> + * scanning one LRU list.
> + */
> +struct gen_update_batch {
> +        int delta[MAX_NR_GENS];
> +};
> +
> +static void lru_gen_update_batch(struct lruvec *lruvec, bool type, int zone,

"type" needs to be int; it is either LRU_GEN_FILE or LRU_GEN_ANON.
Ideally the type would be an enum that defines LRU_GEN_FILE and
LRU_GEN_ANON. bool is not the right C type for "type" here. The rest
of the code uses "int" for type as well.

I saw you use "bool type" in other patches as well. All need to change
to "int type".

Chris

> +                                 struct gen_update_batch *batch)
> +{
> +        int gen;
> +        int promoted = 0;
> +        struct lru_gen_folio *lrugen = &lruvec->lrugen;
> +        enum lru_list lru = type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
> +
> +        for (gen = 0; gen < MAX_NR_GENS; gen++) {
> +                int delta = batch->delta[gen];
> +
> +                if (!delta)
> +                        continue;
> +
> +                WRITE_ONCE(lrugen->nr_pages[gen][type][zone],
> +                           lrugen->nr_pages[gen][type][zone] + delta);
> +
> +                if (lru_gen_is_active(lruvec, gen))
> +                        promoted += delta;
> +        }
> +
> +        if (promoted) {
> +                __update_lru_size(lruvec, lru, zone, -promoted);
> +                __update_lru_size(lruvec, lru + LRU_ACTIVE, zone, promoted);
> +        }
> +}
> +
>  /* protect pages accessed multiple times through file descriptors */
> -static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclaiming)
> +static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio,
> +                         bool reclaiming, struct gen_update_batch *batch)
>  {
> +        int delta = folio_nr_pages(folio);
>          int type = folio_is_file_lru(folio);
>          struct lru_gen_folio *lrugen = &lruvec->lrugen;
>          int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
> @@ -3120,7 +3158,8 @@ static int folio_inc_gen(struct lruvec *lruvec, struct folio *folio, bool reclai
>                  new_flags |= BIT(PG_reclaim);
>          } while (!try_cmpxchg(&folio->flags, &old_flags, new_flags));
>
> -        lru_gen_update_size(lruvec, folio, old_gen, new_gen);
> +        batch->delta[old_gen] -= delta;
> +        batch->delta[new_gen] += delta;
>
>          return new_gen;
>  }
> @@ -3663,6 +3702,7 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, bool can_swap)
>  {
>          int zone;
>          int remaining = MAX_LRU_BATCH;
> +        struct gen_update_batch batch = { };
>          struct lru_gen_folio *lrugen = &lruvec->lrugen;
>          int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
>
> @@ -3681,12 +3721,15 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, bool can_swap)
>                  VM_WARN_ON_ONCE_FOLIO(folio_is_file_lru(folio) != type, folio);
>                  VM_WARN_ON_ONCE_FOLIO(folio_zonenum(folio) != zone, folio);
>
> -                new_gen = folio_inc_gen(lruvec, folio, false);
> +                new_gen = folio_inc_gen(lruvec, folio, false, &batch);
>                  list_move_tail(&folio->lru, &lrugen->folios[new_gen][type][zone]);
>
> -                if (!--remaining)
> +                if (!--remaining) {
> +                        lru_gen_update_batch(lruvec, type, zone, &batch);
>                          return false;
> +                }
>          }
> +        lru_gen_update_batch(lruvec, type, zone, &batch);
>  }
>  done:
>          reset_ctrl_pos(lruvec, type, true);
> @@ -4197,7 +4240,7 @@ static int lru_gen_memcg_seg(struct lruvec *lruvec)
>   ******************************************************************************/
>
>  static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_control *sc,
> -                       int tier_idx)
> +                       int tier_idx, struct gen_update_batch *batch)
>  {
>          bool success;
>          int gen = folio_lru_gen(folio);
> @@ -4239,7 +4282,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
>          if (tier > tier_idx || refs == BIT(LRU_REFS_WIDTH)) {
>                  int hist = lru_hist_from_seq(lrugen->min_seq[type]);
>
> -                gen = folio_inc_gen(lruvec, folio, false);
> +                gen = folio_inc_gen(lruvec, folio, false, batch);
>                  list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
>
>                  WRITE_ONCE(lrugen->protected[hist][type][tier - 1],
> @@ -4249,7 +4292,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
>
>          /* ineligible */
>          if (zone > sc->reclaim_idx || skip_cma(folio, sc)) {
> -                gen = folio_inc_gen(lruvec, folio, false);
> +                gen = folio_inc_gen(lruvec, folio, false, batch);
>                  list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
>                  return true;
>          }
> @@ -4257,7 +4300,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
>          /* waiting for writeback */
>          if (folio_test_locked(folio) || folio_test_writeback(folio) ||
>              (type == LRU_GEN_FILE && folio_test_dirty(folio))) {
> -                gen = folio_inc_gen(lruvec, folio, true);
> +                gen = folio_inc_gen(lruvec, folio, true, batch);
>                  list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
>                  return true;
>          }
> @@ -4323,6 +4366,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
>          for (i = MAX_NR_ZONES; i > 0; i--) {
>                  LIST_HEAD(moved);
>                  int skipped_zone = 0;
> +                struct gen_update_batch batch = { };
>                  int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
>                  struct list_head *head = &lrugen->folios[gen][type][zone];
>
> @@ -4337,7 +4381,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
>
>                          scanned += delta;
>
> -                        if (sort_folio(lruvec, folio, sc, tier))
> +                        if (sort_folio(lruvec, folio, sc, tier, &batch))
>                                  sorted += delta;
>                          else if (isolate_folio(lruvec, folio, sc)) {
>                                  list_add(&folio->lru, list);
> @@ -4357,6 +4401,8 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
>                          skipped += skipped_zone;
>                  }
>
> +                lru_gen_update_batch(lruvec, type, zone, &batch);
> +
>                  if (!remaining || isolated >= MIN_LRU_BATCH)
>                          break;
>          }
> --
> 2.43.0
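
To make the suggestion above concrete, a hypothetical sketch of the
signature Chris is describing (illustrative only; LRU_GEN_ANON and
LRU_GEN_FILE already exist in the kernel as plain enumerators, so simply
switching the parameter from bool to int would equally match the
surrounding code):

struct lruvec;                  /* opaque here, defined by the kernel */
struct gen_update_batch;        /* the per-scan delta accumulator above */

/* A named type makes the anon/file distinction explicit at call sites,
 * instead of coercing LRU_GEN_FILE through a bool. */
enum lru_gen_type {
        LRU_GEN_ANON,
        LRU_GEN_FILE,
};

void lru_gen_update_batch(struct lruvec *lruvec, enum lru_gen_type type,
                          int zone, struct gen_update_batch *batch);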