Message ID: 1700823445-27531-1-git-send-email-quic_charante@quicinc.com
State: New
Headers:
From: Charan Teja Kalla <quic_charante@quicinc.com>
To: <akpm@linux-foundation.org>, <mgorman@techsingularity.net>, <mhocko@suse.com>, <david@redhat.com>, <vbabka@suse.cz>, <hannes@cmpxchg.org>, <quic_pkondeti@quicinc.com>, <quic_cgoldswo@quicinc.com>
CC: <linux-mm@kvack.org>, <linux-kernel@vger.kernel.org>, Charan Teja Kalla <quic_charante@quicinc.com>
Subject: [RESEND PATCH V2] mm: page_alloc: unreserve highatomic page blocks before oom
Date: Fri, 24 Nov 2023 16:27:25 +0530
Message-ID: <1700823445-27531-1-git-send-email-quic_charante@quicinc.com>
X-Mailer: git-send-email 2.7.4
Series: [RESEND,V2] mm: page_alloc: unreserve highatomic page blocks before oom
Commit Message
Charan Teja Kalla
Nov. 24, 2023, 10:57 a.m. UTC
__alloc_pages_direct_reclaim() is called from slowpath allocation where
high atomic reserves can be unreserved after there is progress in
reclaim and yet no suitable page is found. Later should_reclaim_retry()
gets called from slow path allocation to decide if the reclaim needs to
be retried before the OOM kill path is taken.

should_reclaim_retry() checks the available (reclaimable + free pages)
memory against the min wmark levels of a zone and returns:
a) true, if it is above the min wmark, so that slow path allocation will
do the reclaim retries.
b) false, thus slowpath allocation takes the oom kill path.

should_reclaim_retry() can also unreserve the high atomic reserves,
**but only after all the reclaim retries are exhausted.**

In a case where there is almost no reclaimable memory and the free pages
consist mostly of the high atomic reserves, but the allocation context
can't use these high atomic reserves, the available memory falls below
the min wmark levels, hence false is returned from should_reclaim_retry(),
leading the allocation request to take the OOM kill path. This can turn
into an early oom kill if the high atomic reserves are holding a lot of
free memory and unreserving them is not attempted.

An (early) OOM was encountered on a VM with the below state:

[  295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB
high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB
active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB
present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB
local_pcp:492kB free_cma:0kB
[  295.998656] lowmem_reserve[]: 0 32
[  295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH)
33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
0*4096kB = 7752kB

Per the above log, ~7MB of free memory exists in the high atomic
reserves and is not freed up before falling back to the oom kill path.
Fix it by trying to unreserve the high atomic reserves in
should_reclaim_retry() before __alloc_pages_direct_reclaim() can fall
back to the oom kill path.

Fixes: 0aaa29a56e4f ("mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand")
Reported-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
Suggested-by: Michal Hocko <mhocko@suse.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>
---
Changes in V2 and RESEND:
o Unreserve the high atomic pageblock from should_reclaim_retry()
o Collected the tags from Michal.
o Started a separate discussion for the high atomic reserves.
o https://lore.kernel.org/linux-mm/cover.1699104759.git.quic_charante@quicinc.com/#r

Changes in V1:
o Unreserving the high atomic page blocks was attempted from the oom
  kill path rather than in should_reclaim_retry().
o https://lore.kernel.org/linux-mm/1698669590-3193-1-git-send-email-quic_charante@quicinc.com/

 mm/page_alloc.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)
Comments
On Fri, 24 Nov 2023, Charan Teja Kalla wrote:

> __alloc_pages_direct_reclaim() is called from slowpath allocation where
> high atomic reserves can be unreserved after there is progress in
> reclaim and yet no suitable page is found. Later should_reclaim_retry()
> gets called from slow path allocation to decide if the reclaim needs to
> be retried before the OOM kill path is taken.
>
> should_reclaim_retry() checks the available (reclaimable + free pages)
> memory against the min wmark levels of a zone and returns:
> a) true, if it is above the min wmark, so that slow path allocation will
> do the reclaim retries.
> b) false, thus slowpath allocation takes the oom kill path.
>
> should_reclaim_retry() can also unreserve the high atomic reserves,
> **but only after all the reclaim retries are exhausted.**
>
> In a case where there is almost no reclaimable memory and the free pages
> consist mostly of the high atomic reserves, but the allocation context
> can't use these high atomic reserves, the available memory falls below
> the min wmark levels, hence false is returned from should_reclaim_retry(),
> leading the allocation request to take the OOM kill path. This can turn
> into an early oom kill if the high atomic reserves are holding a lot of
> free memory and unreserving them is not attempted.
>
> An (early) OOM was encountered on a VM with the below state:
>
> [  295.998653] Normal free:7728kB boost:0kB min:804kB low:1004kB
> high:1204kB reserved_highatomic:8192KB active_anon:4kB inactive_anon:0kB
> active_file:24kB inactive_file:24kB unevictable:1220kB writepending:0kB
> present:70732kB managed:49224kB mlocked:0kB bounce:0kB free_pcp:688kB
> local_pcp:492kB free_cma:0kB
> [  295.998656] lowmem_reserve[]: 0 32
> [  295.998659] Normal: 508*4kB (UMEH) 241*8kB (UMEH) 143*16kB (UMEH)
> 33*32kB (UH) 7*64kB (UH) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB
> 0*4096kB = 7752kB
>
> Per the above log, ~7MB of free memory exists in the high atomic
> reserves and is not freed up before falling back to the oom kill path.
>
> Fix it by trying to unreserve the high atomic reserves in
> should_reclaim_retry() before __alloc_pages_direct_reclaim() can fall
> back to the oom kill path.
>
> Fixes: 0aaa29a56e4f ("mm, page_alloc: reserve pageblocks for high-order atomic allocations on demand")
> Reported-by: Chris Goldsworthy <quic_cgoldswo@quicinc.com>
> Suggested-by: Michal Hocko <mhocko@suse.com>
> Acked-by: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Charan Teja Kalla <quic_charante@quicinc.com>

Acked-by: David Rientjes <rientjes@google.com>
Could this patch be backported to stable? I have seen similar OOMs with
reserved_highatomic:4096KB

Upstream commit: 04c8716f7b0075def05dc05646e2408f318167d2

Jocke
On Thu, Jan 18, 2024 at 05:23:58PM +0000, Joakim Tjernlund wrote:
> Could this patch be backported to stable? I have seen similar OOMs with
> reserved_highatomic:4096KB
>
> Upstream commit: 04c8716f7b0075def05dc05646e2408f318167d2

Backported to exactly where? This commit is in the 4.20 kernel and newer,
please tell me you aren't relying on the 4.19.y kernel anymore...

thanks,

greg k-h
Seems like I pasted the wrong commit (sorry), should be: ac3f3b0a55518056bc80ed32a41931c99e1f7d81
I only see that one in master.
A: http://en.wikipedia.org/wiki/Top_post
Q: Where do I find info about this thing called top-posting?
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?

A: No.
Q: Should I include quotations after my reply?

http://daringfireball.net/2007/07/on_top

On Mon, Jan 22, 2024 at 10:49:32PM +0000, Joakim Tjernlund wrote:
> Seems like I pasted the wrong commit (sorry), should be: ac3f3b0a55518056bc80ed32a41931c99e1f7d81
> I only see that one in master.

And what kernels have you tested this on? How far back should it go?

For mm patches like this, that are not explicitly tagged by the
maintainers to be included in the stable tree, we need their ack to be
able to apply them based on their requests. So if you can get that for
this change and provide tested patches, we will be glad to queue them
up.

thanks,

greg k-h
On Mon, 2024-01-22 at 15:04 -0800, Greg KH wrote:
> A: http://en.wikipedia.org/wiki/Top_post
> Q: Where do I find info about this thing called top-posting?
> A: Because it messes up the order in which people normally read text.
> Q: Why is top-posting such a bad thing?
> A: Top-posting.
> Q: What is the most annoying thing in e-mail?
>
> A: No.
> Q: Should I include quotations after my reply?

haha, serves me right :) (note to self, do not use webmail for Linux stuff ...)

> http://daringfireball.net/2007/07/on_top
>
> On Mon, Jan 22, 2024 at 10:49:32PM +0000, Joakim Tjernlund wrote:
> > Seems like I pasted the wrong commit (sorry), should be: ac3f3b0a55518056bc80ed32a41931c99e1f7d81
> > I only see that one in master.
>
> And what kernels have you tested this on? How far back should it go?

I am testing it now in the latest 5.15.x but the jury is still out. No OOM
since a few days, but the error does not happen often.

> For mm patches like this, that are not explicitly tagged by the
> maintainers to be included in the stable tree, we need their ack to be
> able to apply them based on their requests. So if you can get that for
> this change and provide tested patches, we will be glad to queue them
> up.

I asked the author and he acknowledged it could be backported. Charan, please chime in.

Jocke

> thanks,
>
> greg k-h
On Mon 22-01-24 23:14:59, Joakim Tjernlund wrote:
> On Mon, 2024-01-22 at 15:04 -0800, Greg KH wrote:
[...]
> > For mm patches like this, that are not explicitly tagged by the
> > maintainers to be included in the stable tree, we need their ack to be
> > able to apply them based on their requests. So if you can get that for
> > this change and provide tested patches, we will be glad to queue them
> > up.
>
> I asked the author and he acknowledged it could be backported. Charan, please chime in.

The patch itself is safe to backport, but it would be great to hear what
kind of problem you are trying to deal with. The issue fixed by this
patch is more of a corner case than something that many users should
see. Could you share the oom report you are seeing?
On Tue, 2024-01-23 at 11:05 +0100, Michal Hocko wrote:
> On Mon 22-01-24 23:14:59, Joakim Tjernlund wrote:
> > On Mon, 2024-01-22 at 15:04 -0800, Greg KH wrote:
> [...]
> > > For mm patches like this, that are not explicitly tagged by the
> > > maintainers to be included in the stable tree, we need their ack to be
> > > able to apply them based on their requests. So if you can get that for
> > > this change and provide tested patches, we will be glad to queue them
> > > up.
> >
> > I asked the author and he acknowledged it could be backported. Charan, please chime in.
>
> The patch itself is safe to backport, but it would be great to hear what
> kind of problem you are trying to deal with. The issue fixed by this
> patch is more of a corner case than something that many users should
> see. Could you share the oom report you are seeing?

Yes, here it is:

Mar 9 12:52:39 xr kern.warn kernel: [ 1065.896824] xr-swm-install- invoked oom-killer: gfp_mask=0x1100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.907321] CPU: 0 PID: 2428 Comm: xr-swm-install- Not tainted 5.15.129-xr-linux #1
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.914974] Hardware name: infinera,xr (DT)
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.919158] Call trace:
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.921600]  dump_backtrace+0x0/0x148
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.925310]  show_stack+0x14/0x1c
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.928628]  dump_stack_lvl+0x64/0x7c
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.932314]  dump_stack+0x14/0x2c
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.935633]  dump_header+0x64/0x1fc
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.939123]  oom_kill_process+0xc0/0x28c
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.943063]  out_of_memory+0x2c8/0x2e0
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.946811]  __alloc_pages_slowpath.constprop.0+0x4f4/0x5b0
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.952388]  __alloc_pages+0xcc/0xdc
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.955962]  __page_cache_alloc+0x18/0x20
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.959993]  pagecache_get_page+0x14c/0x1bc
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.964174]  filemap_fault+0x1f4/0x390
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.967920]  __do_fault+0x48/0x78
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.971253]  __handle_mm_fault+0x35c/0x7c0
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.975351]  handle_mm_fault+0x2c/0xc4
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.979101]  do_page_fault+0x224/0x350
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.982857]  do_translation_fault+0x3c/0x58
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.987038]  do_mem_abort+0x40/0xa4
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.990526]  el0_ia+0x74/0xc8
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.993498]  el0t_32_sync_handler+0xa8/0xe8
Mar 9 12:52:40 xr kern.warn kernel: [ 1065.997679]  el0t_32_sync+0x15c/0x160
Mar 9 12:52:40 xr kern.warn kernel: [ 1066.001379] Mem-Info:
Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665] active_anon:18 inactive_anon:2435 isolated_anon:0
Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  active_file:1 inactive_file:0 isolated_file:0
Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  unevictable:267 dirty:0 writeback:0
Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  slab_reclaimable:384 slab_unreclaimable:2223
Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  mapped:1 shmem:0 pagetables:309 bounce:0
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.003665]  kernel_misc_reclaimable:0
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.003665]  free:1054 free_pcp:24 free_cma:0
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.037938] Node 0 active_anon:72kB inactive_anon:9740kB active_file:4kB inactive_file:0kB unevictable:1068kB isolated(anon):0kB isolated(file):0kB mapped:4kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:9
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.061723] Normal free:4216kB min:128kB low:160kB high:192kB reserved_highatomic:4096KB active_anon:72kB inactive_anon:9740kB active_file:4kB inactive_file:0kB unevictable:1068kB writepending:0kB present:36864kB managed:2944
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.087595] lowmem_reserve[]: 0 0
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.090918] Normal: 206*4kB (MEH) 198*8kB (UMEH) 79*16kB (UMEH) 13*32kB (H) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4216kB
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.104104] 269 total pagecache pages
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.107768] 9216 pages RAM
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.110469] 0 pages HighMem/MovableOnly
Mar 9 12:52:41 xr kern.warn kernel: [ 1066.114305] 1855 pages reserved
Mar 9 12:52:41 xr kern.info kernel: [ 1066.117448] Tasks state (memory values in pages):
Mar 9 12:52:41 xr kern.info kernel: [ 1066.122150] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
Mar 9 12:52:41 xr kern.info kernel: [ 1066.130764] [ 77] 0 77 380 24 24576 0 0 mdev
Mar 9 12:52:41 xr kern.info kernel: [ 1066.138857] [ 224] 0 224 364 9 24576 0 0 syslogd
Mar 9 12:52:41 xr kern.info kernel: [ 1066.147208] [ 227] 0 227 364 9 28672 0 0 klogd
Mar 9 12:52:41 xr kern.info kernel: [ 1066.155385] [ 242] 0 242 301 10 28672 0 0 dropbear
Mar 9 12:52:41 xr kern.info kernel: [ 1066.163824] [ 248] 0 248 337 38 24576 0 0 dhcpcd
Mar 9 12:52:41 xr kern.info kernel: [ 1066.172090] [ 259] 0 259 899 34 32768 0 0 watchdog
Mar 9 12:52:41 xr kern.info kernel: [ 1066.180528] [ 261] 0 261 741 33 32768 0 0 rpmsg_broker
Mar 9 12:52:41 xr kern.info kernel: [ 1066.189311] [ 263] 0 263 75466 965 622592 0 0 waactrl-main
Mar 9 12:52:41 xr kern.info kernel: [ 1066.198095] [ 320] 0 320 1127 90 32768 0 0 xr-fm-agent
Mar 9 12:52:41 xr kern.info kernel: [ 1066.206794] [ 324] 0 324 797 40 24576 0 0 factory_reset
Mar 9 12:52:41 xr kern.info kernel: [ 1066.215668] [ 328] 0 328 1674 242 40960 0 0 xr-cm-agent
Mar 9 12:52:41 xr kern.info kernel: [ 1066.224628] [ 331] 0 331 707 31 32768 0 0 mmcu-agent
Mar 9 12:52:41 xr kern.info kernel: [ 1066.233245] [ 334] 0 334 4054 427 49152 0 0 xr-swm-agent
Mar 9 12:52:41 xr kern.info kernel: [ 1066.242031] [ 338] 0 338 1129 94 32768 0 0 xr-pm-agent
Mar 9 12:52:41 xr kern.info kernel: [ 1066.250729] [ 364] 0 364 734 34 32768 0 0 process_supervi
Mar 9 12:52:41 xr kern.info kernel: [ 1066.259774] [ 399] 0 399 1327 215 32768 0 0 swupdate
Mar 9 12:52:41 xr kern.info kernel: [ 1066.268212] [ 413] 0 413 370 13 28672 0 0 sh
Mar 9 12:52:42 xr kern.info kernel: [ 1066.276131] [ 2072] 0 2072 306 13 28672 0 0 dropbear
Mar 9 12:52:42 xr kern.info kernel: [ 1066.284569] [ 2081] 0 2081 366 11 24576 0 0 sh
Mar 9 12:52:42 xr kern.info kernel: [ 1066.292492] [ 2427] 0 2427 876 93 28672 0 0 xr-swm-install-
Mar 9 12:52:42 xr kern.info kernel: [ 1066.301536] [ 2468] 0 2468 546 25 28672 0 0 ip
Mar 9 12:52:42 xr kern.info kernel: [ 1066.309453] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),task=waactrl-main,pid=263,uid=0
Mar 9 12:52:42 xr kern.err kernel: [ 1066.318197] Out of memory: Killed process 263 (waactrl-main) total-vm:301864kB, anon-rss:3860kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:608kB oom_score_adj:0
On Tue 23-01-24 11:08:28, Joakim Tjernlund wrote:
[...]
> Mar 9 12:52:40 xr kern.warn kernel: [ 1066.001379] Mem-Info:
> Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665] active_anon:18 inactive_anon:2435 isolated_anon:0
> Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  active_file:1 inactive_file:0 isolated_file:0
> Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  unevictable:267 dirty:0 writeback:0
> Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  slab_reclaimable:384 slab_unreclaimable:2223
> Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  mapped:1 shmem:0 pagetables:309 bounce:0
> Mar 9 12:52:41 xr kern.warn kernel: [ 1066.003665]  kernel_misc_reclaimable:0
> Mar 9 12:52:41 xr kern.warn kernel: [ 1066.003665]  free:1054 free_pcp:24 free_cma:0
> Mar 9 12:52:41 xr kern.warn kernel: [ 1066.037938] Node 0 active_anon:72kB inactive_anon:9740kB active_file:4kB inactive_file:0kB unevictable:1068kB isolated(anon):0kB isolated(file):0kB mapped:4kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:9
> Mar 9 12:52:41 xr kern.warn kernel: [ 1066.061723] Normal free:4216kB min:128kB low:160kB high:192kB reserved_highatomic:4096KB active_anon:72kB inactive_anon:9740kB active_file:4kB inactive_file:0kB unevictable:1068kB writepending:0kB present:36864kB managed:2944
> Mar 9 12:52:41 xr kern.warn kernel: [ 1066.087595] lowmem_reserve[]: 0 0
> Mar 9 12:52:41 xr kern.warn kernel: [ 1066.090918] Normal: 206*4kB (MEH) 198*8kB (UMEH) 79*16kB (UMEH) 13*32kB (H) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4216kB
> Mar 9 12:52:41 xr kern.warn kernel: [ 1066.104104] 269 total pagecache pages
> Mar 9 12:52:41 xr kern.warn kernel: [ 1066.107768] 9216 pages RAM

OK, so this is really a tiny system (36MB) and having 4MB in highatomic
reserves is clearly too much. So this matches the patch you are
referencing and it might help you indeed. Thanks!
On Tue, Jan 23, 2024 at 12:24:22PM +0100, Michal Hocko wrote:
> On Tue 23-01-24 11:08:28, Joakim Tjernlund wrote:
> [...]
> > Mar 9 12:52:40 xr kern.warn kernel: [ 1066.001379] Mem-Info:
> > Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665] active_anon:18 inactive_anon:2435 isolated_anon:0
> > Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  active_file:1 inactive_file:0 isolated_file:0
> > Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  unevictable:267 dirty:0 writeback:0
> > Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  slab_reclaimable:384 slab_unreclaimable:2223
> > Mar 9 12:52:40 xr kern.warn kernel: [ 1066.003665]  mapped:1 shmem:0 pagetables:309 bounce:0
> > Mar 9 12:52:41 xr kern.warn kernel: [ 1066.003665]  kernel_misc_reclaimable:0
> > Mar 9 12:52:41 xr kern.warn kernel: [ 1066.003665]  free:1054 free_pcp:24 free_cma:0
> > Mar 9 12:52:41 xr kern.warn kernel: [ 1066.037938] Node 0 active_anon:72kB inactive_anon:9740kB active_file:4kB inactive_file:0kB unevictable:1068kB isolated(anon):0kB isolated(file):0kB mapped:4kB dirty:0kB writeback:0kB shmem:0kB writeback_tmp:0kB kernel_stack:9
> > Mar 9 12:52:41 xr kern.warn kernel: [ 1066.061723] Normal free:4216kB min:128kB low:160kB high:192kB reserved_highatomic:4096KB active_anon:72kB inactive_anon:9740kB active_file:4kB inactive_file:0kB unevictable:1068kB writepending:0kB present:36864kB managed:2944
> > Mar 9 12:52:41 xr kern.warn kernel: [ 1066.087595] lowmem_reserve[]: 0 0
> > Mar 9 12:52:41 xr kern.warn kernel: [ 1066.090918] Normal: 206*4kB (MEH) 198*8kB (UMEH) 79*16kB (UMEH) 13*32kB (H) 2*64kB (H) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 4216kB
> > Mar 9 12:52:41 xr kern.warn kernel: [ 1066.104104] 269 total pagecache pages
> > Mar 9 12:52:41 xr kern.warn kernel: [ 1066.107768] 9216 pages RAM
>
> OK, so this is really a tiny system (36MB) and having 4MB in highatomic
> reserves is clearly too much. So this matches the patch you are
> referencing and it might help you indeed.

Ok, now queued up, thanks.

greg k-h
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 733732e..6d2a741 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3951,14 +3951,9 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 	else
 		(*no_progress_loops)++;
 
-	/*
-	 * Make sure we converge to OOM if we cannot make any progress
-	 * several times in the row.
-	 */
-	if (*no_progress_loops > MAX_RECLAIM_RETRIES) {
-		/* Before OOM, exhaust highatomic_reserve */
-		return unreserve_highatomic_pageblock(ac, true);
-	}
+	if (*no_progress_loops > MAX_RECLAIM_RETRIES)
+		goto out;
+
 
 	/*
 	 * Keep reclaiming pages while there is a chance this will lead
@@ -4001,6 +3996,11 @@ should_reclaim_retry(gfp_t gfp_mask, unsigned order,
 		schedule_timeout_uninterruptible(1);
 	else
 		cond_resched();
+out:
+	/* Before OOM, exhaust highatomic_reserve */
+	if (!ret)
+		return unreserve_highatomic_pageblock(ac, true);
+
 	return ret;
 }