Message ID | 20230628044303.1412624-1-fengwei.yin@intel.com |
---|---|
State | New |
Headers |
From: Yin Fengwei <fengwei.yin@intel.com>
To: akpm@linux-foundation.org, mike.kravetz@oracle.com, willy@infradead.org, ackerleytng@google.com, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: fengwei.yin@intel.com, oliver.sang@intel.com
Subject: [PATCH v2] readahead: Correct the start and size in ondemand_readahead()
Date: Wed, 28 Jun 2023 12:43:03 +0800
Message-Id: <20230628044303.1412624-1-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
List-ID: <linux-kernel.vger.kernel.org> |
Series | [v2] readahead: Correct the start and size in ondemand_readahead() |
Commit Message
Yin Fengwei
June 28, 2023, 4:43 a.m. UTC
Commit
9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
updated page_cache_next_miss() to return the index beyond the
range when no gap is found.

However, this breaks the start/size of ra in ondemand_readahead() because
the off-by-one offset is accumulated into readahead_index. As a consequence,
a suboptimal readahead order is picked.
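The return convention described above can be sketched in user space. This is only a toy model under stated assumptions: a plain boolean array stands in for the page cache, the name `toy_next_miss` is hypothetical, and wrap-around handling is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy stand-in for the page cache: present[i] says whether page i is cached.
 * Hypothetical user-space model, not the kernel's xarray-based code.
 */
static size_t toy_next_miss(const bool *present, size_t nr_pages,
			    size_t index, size_t max_scan)
{
	/* Keep the model simple: the scanned range must fit in the array. */
	assert(index <= nr_pages && max_scan <= nr_pages - index);

	for (size_t i = index; i < index + max_scan; i++) {
		if (!present[i])
			return i;	/* first gap inside [index, index + max_scan) */
	}

	/* No gap: report the first index beyond the scanned range. */
	return index + max_scan;
}
```

In this model, a caller that scans from `index + 1` over `max_pages` pages gets back `index + 1 + max_pages` when every page is present, which is the value the `ondemand_readahead()` check has to account for.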
Tracing of the order parameter of filemap_alloc_folio() showed:
    page order    : count     distribution
       0          : 892073   |                                        |
       1          : 0        |                                        |
       2          : 65120457 |****************************************|
       3          : 32914005 |********************                    |
       4          : 33020991 |********************                    |
with 9425c591e06a9.
With parent commit:
    page order    : count     distribution
       0          : 3417288  |****                                    |
       1          : 0        |                                        |
       2          : 877012   |*                                       |
       3          : 288      |                                        |
       4          : 5607522  |*******                                 |
       5          : 29974228 |****************************************|
Fix the issue by removing the offset by one when page_cache_next_miss()
returns no gaps in the range.
After the fix:
    page order    : count     distribution
       0          : 2598561  |***                                     |
       1          : 0        |                                        |
       2          : 687739   |                                        |
       3          : 288      |                                        |
       4          : 207210   |                                        |
       5          : 32628260 |****************************************|
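To compare the three histograms numerically, the share of allocations at a given order can be computed from the counts. The sketch below is illustrative only: the helper name `order_share` and the array names are hypothetical, and the counts are copied from the histograms above.

```c
#include <assert.h>
#include <stddef.h>

/* Fraction of all allocations that used 'order'; counts[i] is the count for order i. */
static double order_share(const unsigned long *counts, size_t n, size_t order)
{
	unsigned long total = 0;

	assert(order < n);
	for (size_t i = 0; i < n; i++)
		total += counts[i];

	return total ? (double)counts[order] / (double)total : 0.0;
}

/* Counts for orders 0..5, copied from the histograms above. */
static const unsigned long regressed[6] = { 892073, 0, 65120457, 32914005, 33020991, 0 };
static const unsigned long parent[6]    = { 3417288, 0, 877012, 288, 5607522, 29974228 };
static const unsigned long fixed[6]     = { 2598561, 0, 687739, 288, 207210, 32628260 };
```

Calling `order_share(parent, 6, 5)` and `order_share(fixed, 6, 5)` gives roughly 75% and 90% order-5 allocations respectively, while the regressed trace has none.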
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202306211346.1e9ff03e-oliver.sang@intel.com
Fixes: 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
---
Changes from v1:
- Only remove the offset by one when no gaps are found by
  page_cache_next_miss()
- Update commit message to include the histogram of page order
  after the fix

mm/readahead.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
Comments
On 06/28/23 12:43, Yin Fengwei wrote:
> The commit
> 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
> updated the page_cache_next_miss() to return the index beyond
> range.
>
> But it breaks the start/size of ra in ondemand_readahead() because
> the offset by one is accumulated to readahead_index. As a consequence,
> not best readahead order is picked.
>
> Tracing of the order parameter of filemap_alloc_folio() showed:
>     page order    : count     distribution
>        0          : 892073   |                                        |
>        1          : 0        |                                        |
>        2          : 65120457 |****************************************|
>        3          : 32914005 |********************                    |
>        4          : 33020991 |********************                    |
> with 9425c591e06a9.
>
> With parent commit:
>     page order    : count     distribution
>        0          : 3417288  |****                                    |
>        1          : 0        |                                        |
>        2          : 877012   |*                                       |
>        3          : 288      |                                        |
>        4          : 5607522  |*******                                 |
>        5          : 29974228 |****************************************|
>
> Fix the issue by removing the offset by one when page_cache_next_miss()
> returns no gaps in the range.
>
> After the fix:
>     page order    : count     distribution
>        0          : 2598561  |***                                     |
>        1          : 0        |                                        |
>        2          : 687739   |                                        |
>        3          : 288      |                                        |
>        4          : 207210   |                                        |
>        5          : 32628260 |****************************************|
>

Thank you for your detailed analysis!

When the regression was initially discovered, I sent a patch to revert
commit 9425c591e06a. Andrew has picked up this change. And, Andrew has
also picked up this patch.

I have not verified yet, but I suspect that this patch is going to cause
a regression because it depends on the behavior of page_cache_next_miss
in 9425c591e06a which has been reverted.

Sorry for the delay in responding as I was traveling.

On 7/4/2023 2:49 AM, Mike Kravetz wrote:
> On 06/28/23 12:43, Yin Fengwei wrote:
>> The commit
>> 9425c591e06a ("page cache: fix page_cache_next/prev_miss off by one")
>> updated the page_cache_next_miss() to return the index beyond
>> range.
>>
>> But it breaks the start/size of ra in ondemand_readahead() because
>> the offset by one is accumulated to readahead_index. As a consequence,
>> not best readahead order is picked.
>>
>> Tracing of the order parameter of filemap_alloc_folio() showed:
>>     page order    : count     distribution
>>        0          : 892073   |                                        |
>>        1          : 0        |                                        |
>>        2          : 65120457 |****************************************|
>>        3          : 32914005 |********************                    |
>>        4          : 33020991 |********************                    |
>> with 9425c591e06a9.
>>
>> With parent commit:
>>     page order    : count     distribution
>>        0          : 3417288  |****                                    |
>>        1          : 0        |                                        |
>>        2          : 877012   |*                                       |
>>        3          : 288      |                                        |
>>        4          : 5607522  |*******                                 |
>>        5          : 29974228 |****************************************|
>>
>> Fix the issue by removing the offset by one when page_cache_next_miss()
>> returns no gaps in the range.
>>
>> After the fix:
>>     page order    : count     distribution
>>        0          : 2598561  |***                                     |
>>        1          : 0        |                                        |
>>        2          : 687739   |                                        |
>>        3          : 288      |                                        |
>>        4          : 207210   |                                        |
>>        5          : 32628260 |****************************************|
>>
>
> Thank you for your detailed analysis!
>
> When the regression was initially discovered, I sent a patch to revert
> commit 9425c591e06a. Andrew has picked up this change. And, Andrew has
> also picked up this patch.
Oh. I didn't notice that you sent revert patch. My understanding is that
commit 9425c591e06a is a good change.

>
> I have not verified yet, but I suspect that this patch is going to cause
> a regression because it depends on the behavior of page_cache_next_miss
> in 9425c591e06a which has been reverted.
Yes. If the 9425c591e06a was reverted, this patch could introduce regression.
Which fixing do you prefer? reverting 9425c591e06a or this patch? Then we
can suggest to Andrew to take it.

Regards
Yin, Fengwei

On 07/04/23 09:41, Yin, Fengwei wrote:
> On 7/4/2023 2:49 AM, Mike Kravetz wrote:
> > On 06/28/23 12:43, Yin Fengwei wrote:
> >
> > Thank you for your detailed analysis!
> >
> > When the regression was initially discovered, I sent a patch to revert
> > commit 9425c591e06a. Andrew has picked up this change. And, Andrew has
> > also picked up this patch.
> Oh. I didn't notice that you sent revert patch. My understanding is that
> commit 9425c591e06a is a good change.
>
> >
> > I have not verified yet, but I suspect that this patch is going to cause
> > a regression because it depends on the behavior of page_cache_next_miss
> > in 9425c591e06a which has been reverted.
> Yes. If the 9425c591e06a was reverted, this patch could introduce regression.
> Which fixing do you prefer? reverting 9425c591e06a or this patch? Then we
> can suggest to Andrew to take it.

For now, I suggest we go with the revert. Why?
- The revert is already going into stable trees.
- I may not be remembering correctly, but I seem to recall Matthew
  mentioning plans to redo/redesign the page cache and possibly
  readahead code. If this is the case, then better to keep the legacy
  behavior for now. But, I am not sure if this is actually part of any
  plan or work in progress.

On 7/6/23 00:52, Mike Kravetz wrote:
> On 07/04/23 09:41, Yin, Fengwei wrote:
>> On 7/4/2023 2:49 AM, Mike Kravetz wrote:
>>> On 06/28/23 12:43, Yin Fengwei wrote:
>>>
>>> Thank you for your detailed analysis!
>>>
>>> When the regression was initially discovered, I sent a patch to revert
>>> commit 9425c591e06a. Andrew has picked up this change. And, Andrew has
>>> also picked up this patch.
>> Oh. I didn't notice that you sent revert patch. My understanding is that
>> commit 9425c591e06a is a good change.
>>
>>>
>>> I have not verified yet, but I suspect that this patch is going to cause
>>> a regression because it depends on the behavior of page_cache_next_miss
>>> in 9425c591e06a which has been reverted.
>> Yes. If the 9425c591e06a was reverted, this patch could introduce regression.
>> Which fixing do you prefer? reverting 9425c591e06a or this patch? Then we
>> can suggest to Andrew to take it.
>
> For now, I suggest we go with the revert. Why?
> - The revert is already going into stable trees.
> - I may not be remembering correctly, but I seem to recall Matthew
>   mentioning plans to redo/redesign the page cache and possibly
>   readahead code. If this is the case, then better to keep the legacy
>   behavior for now. But, I am not sure if this is actually part of any
>   plan or work in progress.
>
It's fine to me and thanks a lot for detail explanations.

Hi Andrew,
Could you please help to drop this patch? Thanks.

Regards
Yin, Fengwei

diff --git a/mm/readahead.c b/mm/readahead.c
index 47afbca1d122..a93af773686f 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -614,9 +614,17 @@ static void ondemand_readahead(struct readahead_control *ractl,
 				max_pages);
 		rcu_read_unlock();
 
-		if (!start || start - index > max_pages)
+		if (!start || start - index - 1 > max_pages)
 			return;
 
+		/*
+		 * If no gaps in the range, page_cache_next_miss() returns
+		 * index beyond range. Adjust it back to make sure
+		 * ractl->_index is updated correctly later.
+		 */
+		if ((start - index - 1) == max_pages)
+			start--;
+
 		ra->start = start;
 		ra->size = start - index;	/* old async_size */
 		ra->size += req_size;
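
The patched check in the diff above can be exercised in isolation. The following is a user-space sketch only, not kernel code: `struct ra_window` and `patched_window()` are hypothetical names, `start` is assumed to be the value a next-miss scan over `max_pages` pages starting at `index + 1` returned, and only the lines visible in the hunk are mirrored.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical subset of the readahead state touched by the hunk. */
struct ra_window {
	unsigned long start;
	unsigned long size;
};

/*
 * Mirrors the patched check. Returns false where the kernel code would
 * return early, leaving *ra untouched.
 */
static bool patched_window(unsigned long index, unsigned long start,
			   unsigned long max_pages, unsigned long req_size,
			   struct ra_window *ra)
{
	assert(ra);

	if (!start || start - index - 1 > max_pages)
		return false;

	/* No gap found: pull the "beyond range" index back by one. */
	if (start - index - 1 == max_pages)
		start--;

	ra->start = start;
	ra->size = start - index;	/* old async_size */
	ra->size += req_size;
	return true;
}
```

With `index = 10` and `max_pages = 32`, a no-gap result of 43 is pulled back to 42, so the size before `req_size` is added is 32 rather than 33. A gap found inside the range (for example `start = 20`) passes through unchanged.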