[v3,2/4] workingset: refactor LRU refault to expose refault recency check
Commit Message
In preparation for computing recently evicted pages in cachestat,
refactor workingset_refault and lru_gen_refault to expose a helper
function that would test if an evicted page is recently evicted.
Signed-off-by: Nhat Pham <nphamcs@gmail.com>
---
include/linux/swap.h | 1 +
mm/workingset.c | 129 ++++++++++++++++++++++++++++++-------------
2 files changed, 92 insertions(+), 38 deletions(-)
--
2.30.2
Comments
Hi Nhat,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on shuah-kselftest/next]
[also build test ERROR on shuah-kselftest/fixes tip/x86/asm akpm-mm/mm-everything linus/master v6.1-rc8 next-20221208]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Nhat-Pham/cachestat-a-new-syscall-for-page-cache-state-of-files/20221209-042856
base: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git next
patch link: https://lore.kernel.org/r/20221208202808.908690-3-nphamcs%40gmail.com
patch subject: [PATCH v3 2/4] workingset: refactor LRU refault to expose refault recency check
config: hexagon-randconfig-r045-20221207
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 6e4cea55f0d1104408b26ac574566a0e4de48036)
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/e2f34321a236f2cbe1ece4eda8c0aaadaa182739
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Nhat-Pham/cachestat-a-new-syscall-for-page-cache-state-of-files/20221209-042856
git checkout e2f34321a236f2cbe1ece4eda8c0aaadaa182739
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
In file included from mm/workingset.c:8:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __raw_readb(PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
#define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
^
In file included from mm/workingset.c:8:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
~~~~~~~~~~ ^
include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
#define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
^
In file included from mm/workingset.c:8:
In file included from include/linux/memcontrol.h:13:
In file included from include/linux/cgroup.h:26:
In file included from include/linux/kernel_stat.h:9:
In file included from include/linux/interrupt.h:11:
In file included from include/linux/hardirq.h:11:
In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
In file included from include/asm-generic/hardirq.h:17:
In file included from include/linux/irq.h:20:
In file included from include/linux/io.h:13:
In file included from arch/hexagon/include/asm/io.h:334:
include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writeb(value, PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
__raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
~~~~~~~~~~ ^
>> mm/workingset.c:265:38: error: indirection requires pointer operand ('int' invalid)
eviction_memcg = mem_cgroup_from_id(*memcgid);
^~~~~~~~
>> mm/workingset.c:267:45: error: passing 'struct pglist_data' to parameter of incompatible type 'struct pglist_data *'; remove *
lruvec = mem_cgroup_lruvec(eviction_memcg, *pgdat);
^~~~~~
include/linux/memcontrol.h:709:33: note: passing argument to parameter 'pgdat' here
struct pglist_data *pgdat)
^
>> mm/workingset.c:271:12: error: indirection requires pointer operand ('unsigned long' invalid)
return !((*token >> LRU_REFS_WIDTH) != (min_seq & (EVICTION_MASK >> LRU_REFS_WIDTH)));
^~~~~~
6 warnings and 3 errors generated.
vim +265 mm/workingset.c
246
247 /*
248 * Test if the folio is recently evicted.
249 *
250 * As a side effect, also populates the references with
251 * values unpacked from the shadow of the evicted folio.
252 */
253 static bool lru_gen_test_recent(void *shadow, bool file, bool *workingset)
254 {
255 struct mem_cgroup *eviction_memcg;
256 struct lruvec *lruvec;
257 struct lru_gen_struct *lrugen;
258 unsigned long min_seq;
259
260 int memcgid;
261 struct pglist_data *pgdat;
262 unsigned long token;
263
264 unpack_shadow(shadow, &memcgid, &pgdat, &token, workingset);
> 265 eviction_memcg = mem_cgroup_from_id(*memcgid);
266
> 267 lruvec = mem_cgroup_lruvec(eviction_memcg, *pgdat);
268 lrugen = &lruvec->lrugen;
269
270 min_seq = READ_ONCE(lrugen->min_seq[file]);
> 271 return !((*token >> LRU_REFS_WIDTH) != (min_seq & (EVICTION_MASK >> LRU_REFS_WIDTH)));
272 }
273
Hi Nhat,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on shuah-kselftest/next]
[also build test ERROR on shuah-kselftest/fixes tip/x86/asm akpm-mm/mm-everything linus/master v6.1-rc8 next-20221208]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Nhat-Pham/cachestat-a-new-syscall-for-page-cache-state-of-files/20221209-042856
base: https://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest.git next
patch link: https://lore.kernel.org/r/20221208202808.908690-3-nphamcs%40gmail.com
patch subject: [PATCH v3 2/4] workingset: refactor LRU refault to expose refault recency check
config: nios2-randconfig-r001-20221207
compiler: nios2-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# https://github.com/intel-lab-lkp/linux/commit/e2f34321a236f2cbe1ece4eda8c0aaadaa182739
git remote add linux-review https://github.com/intel-lab-lkp/linux
git fetch --no-tags linux-review Nhat-Pham/cachestat-a-new-syscall-for-page-cache-state-of-files/20221209-042856
git checkout e2f34321a236f2cbe1ece4eda8c0aaadaa182739
# save the config file
mkdir build_dir && cp config build_dir/.config
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=nios2 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>
All errors (new ones prefixed by >>):
mm/workingset.c: In function 'lru_gen_test_recent':
>> mm/workingset.c:265:45: error: invalid type argument of unary '*' (have 'int')
265 | eviction_memcg = mem_cgroup_from_id(*memcgid);
| ^~~~~~~~
>> mm/workingset.c:267:52: error: incompatible type for argument 2 of 'mem_cgroup_lruvec'
267 | lruvec = mem_cgroup_lruvec(eviction_memcg, *pgdat);
| ^~~~~~
| |
| struct pglist_data
In file included from mm/workingset.c:8:
include/linux/memcontrol.h:1251:68: note: expected 'struct pglist_data *' but argument is of type 'struct pglist_data'
1251 | struct pglist_data *pgdat)
| ~~~~~~~~~~~~~~~~~~~~^~~~~
>> mm/workingset.c:271:19: error: invalid type argument of unary '*' (have 'long unsigned int')
271 | return !((*token >> LRU_REFS_WIDTH) != (min_seq & (EVICTION_MASK >> LRU_REFS_WIDTH)));
| ^~~~~~
mm/workingset.c:272:1: error: control reaches end of non-void function [-Werror=return-type]
272 | }
| ^
cc1: some warnings being treated as errors
vim +265 mm/workingset.c
246
247 /*
248 * Test if the folio is recently evicted.
249 *
250 * As a side effect, also populates the references with
251 * values unpacked from the shadow of the evicted folio.
252 */
253 static bool lru_gen_test_recent(void *shadow, bool file, bool *workingset)
254 {
255 struct mem_cgroup *eviction_memcg;
256 struct lruvec *lruvec;
257 struct lru_gen_struct *lrugen;
258 unsigned long min_seq;
259
260 int memcgid;
261 struct pglist_data *pgdat;
262 unsigned long token;
263
264 unpack_shadow(shadow, &memcgid, &pgdat, &token, workingset);
> 265 eviction_memcg = mem_cgroup_from_id(*memcgid);
266
> 267 lruvec = mem_cgroup_lruvec(eviction_memcg, *pgdat);
268 lrugen = &lruvec->lrugen;
269
270 min_seq = READ_ONCE(lrugen->min_seq[file]);
> 271 return !((*token >> LRU_REFS_WIDTH) != (min_seq & (EVICTION_MASK >> LRU_REFS_WIDTH)));
272 }
273
@@ -361,6 +361,7 @@ static inline void folio_set_swap_entry(struct folio *folio, swp_entry_t entry)
}
/* linux/mm/workingset.c */
+bool workingset_test_recent(void *shadow, bool file, bool *workingset);
void workingset_age_nonresident(struct lruvec *lruvec, unsigned long nr_pages);
void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg);
void workingset_refault(struct folio *folio, void *shadow);
@@ -244,6 +244,33 @@ static void *lru_gen_eviction(struct folio *folio)
return pack_shadow(mem_cgroup_id(memcg), pgdat, token, refs);
}
+/*
+ * Test if the folio is recently evicted.
+ *
+ * As a side effect, also populates the references with
+ * values unpacked from the shadow of the evicted folio.
+ */
+static bool lru_gen_test_recent(void *shadow, bool file, bool *workingset)
+{
+ struct mem_cgroup *eviction_memcg;
+ struct lruvec *lruvec;
+ struct lru_gen_struct *lrugen;
+ unsigned long min_seq;
+
+ int memcgid;
+ struct pglist_data *pgdat;
+ unsigned long token;
+
+ unpack_shadow(shadow, &memcgid, &pgdat, &token, workingset);
+ eviction_memcg = mem_cgroup_from_id(*memcgid);
+
+ lruvec = mem_cgroup_lruvec(eviction_memcg, *pgdat);
+ lrugen = &lruvec->lrugen;
+
+ min_seq = READ_ONCE(lrugen->min_seq[file]);
+ return !((*token >> LRU_REFS_WIDTH) != (min_seq & (EVICTION_MASK >> LRU_REFS_WIDTH)));
+}
+
static void lru_gen_refault(struct folio *folio, void *shadow)
{
int hist, tier, refs;
@@ -306,6 +333,11 @@ static void *lru_gen_eviction(struct folio *folio)
return NULL;
}
+static bool lru_gen_test_recent(void *shadow, bool file, bool *workingset)
+{
+ return true;
+}
+
static void lru_gen_refault(struct folio *folio, void *shadow)
{
}
@@ -373,40 +405,31 @@ void *workingset_eviction(struct folio *folio, struct mem_cgroup *target_memcg)
folio_test_workingset(folio));
}
-/**
- * workingset_refault - Evaluate the refault of a previously evicted folio.
- * @folio: The freshly allocated replacement folio.
- * @shadow: Shadow entry of the evicted folio.
+/*
+ * Test if the folio is recently evicted by checking if
+ * refault distance of shadow exceeds workingset size.
*
- * Calculates and evaluates the refault distance of the previously
- * evicted folio in the context of the node and the memcg whose memory
- * pressure caused the eviction.
+ * As a side effect, populate workingset with the value
+ * unpacked from shadow.
*/
-void workingset_refault(struct folio *folio, void *shadow)
+bool workingset_test_recent(void *shadow, bool file, bool *workingset)
{
- bool file = folio_is_file_lru(folio);
struct mem_cgroup *eviction_memcg;
struct lruvec *eviction_lruvec;
unsigned long refault_distance;
unsigned long workingset_size;
- struct pglist_data *pgdat;
- struct mem_cgroup *memcg;
- unsigned long eviction;
- struct lruvec *lruvec;
unsigned long refault;
- bool workingset;
+
int memcgid;
- long nr;
+ struct pglist_data *pgdat;
+ unsigned long eviction;
- if (lru_gen_enabled()) {
- lru_gen_refault(folio, shadow);
- return;
- }
+ if (lru_gen_enabled())
+ return lru_gen_test_recent(shadow, file, workingset);
- unpack_shadow(shadow, &memcgid, &pgdat, &eviction, &workingset);
+ unpack_shadow(shadow, &memcgid, &pgdat, &eviction, workingset);
eviction <<= bucket_order;
- rcu_read_lock();
/*
* Look up the memcg associated with the stored ID. It might
* have been deleted since the folio's eviction.
@@ -425,7 +448,8 @@ void workingset_refault(struct folio *folio, void *shadow)
*/
eviction_memcg = mem_cgroup_from_id(memcgid);
if (!mem_cgroup_disabled() && !eviction_memcg)
- goto out;
+ return false;
+
eviction_lruvec = mem_cgroup_lruvec(eviction_memcg, pgdat);
refault = atomic_long_read(&eviction_lruvec->nonresident_age);
@@ -447,21 +471,6 @@ void workingset_refault(struct folio *folio, void *shadow)
*/
refault_distance = (refault - eviction) & EVICTION_MASK;
- /*
- * The activation decision for this folio is made at the level
- * where the eviction occurred, as that is where the LRU order
- * during folio reclaim is being determined.
- *
- * However, the cgroup that will own the folio is the one that
- * is actually experiencing the refault event.
- */
- nr = folio_nr_pages(folio);
- memcg = folio_memcg(folio);
- pgdat = folio_pgdat(folio);
- lruvec = mem_cgroup_lruvec(memcg, pgdat);
-
- mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
-
mem_cgroup_flush_stats_delayed();
/*
* Compare the distance to the existing workingset size. We
@@ -483,8 +492,51 @@ void workingset_refault(struct folio *folio, void *shadow)
NR_INACTIVE_ANON);
}
}
- if (refault_distance > workingset_size)
+
+ return refault_distance <= workingset_size;
+}
+
+/**
+ * workingset_refault - Evaluate the refault of a previously evicted folio.
+ * @folio: The freshly allocated replacement folio.
+ * @shadow: Shadow entry of the evicted folio.
+ *
+ * Calculates and evaluates the refault distance of the previously
+ * evicted folio in the context of the node and the memcg whose memory
+ * pressure caused the eviction.
+ */
+void workingset_refault(struct folio *folio, void *shadow)
+{
+ bool file = folio_is_file_lru(folio);
+ struct pglist_data *pgdat;
+ struct mem_cgroup *memcg;
+ struct lruvec *lruvec;
+ bool workingset;
+ long nr;
+
+ if (lru_gen_enabled()) {
+ lru_gen_refault(folio, shadow);
+ return;
+ }
+
+ rcu_read_lock();
+
+ nr = folio_nr_pages(folio);
+ memcg = folio_memcg(folio);
+ pgdat = folio_pgdat(folio);
+ lruvec = mem_cgroup_lruvec(memcg, pgdat);
+
+ if (!workingset_test_recent(shadow, file, &workingset)) {
+ /*
+ * The activation decision for this folio is made at the level
+ * where the eviction occurred, as that is where the LRU order
+ * during folio reclaim is being determined.
+ *
+ * However, the cgroup that will own the folio is the one that
+ * is actually experiencing the refault event.
+ */
goto out;
+ }
folio_set_active(folio);
workingset_age_nonresident(lruvec, nr);
@@ -498,6 +550,7 @@ void workingset_refault(struct folio *folio, void *shadow)
mod_lruvec_state(lruvec, WORKINGSET_RESTORE_BASE + file, nr);
}
out:
+ mod_lruvec_state(lruvec, WORKINGSET_REFAULT_BASE + file, nr);
rcu_read_unlock();
}