Message ID | 20230114001556.43795-2-vishal.moola@gmail.com |
---|---|
State | New |
Headers |
From: "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org, "Vishal Moola (Oracle)" <vishal.moola@gmail.com>
Subject: [PATCH 2/2] mm/khugepaged: Convert release_pte_pages() to use folios
Date: Fri, 13 Jan 2023 16:15:56 -0800
Message-Id: <20230114001556.43795-2-vishal.moola@gmail.com>
X-Mailer: git-send-email 2.38.1
In-Reply-To: <20230114001556.43795-1-vishal.moola@gmail.com>
References: <20230114001556.43795-1-vishal.moola@gmail.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
List-ID: <linux-kernel.vger.kernel.org> |
Series |
[1/2] mm/khugepaged: Introduce release_pte_folio() to replace release_pte_page()
|
|
Commit Message
Vishal Moola
Jan. 14, 2023, 12:15 a.m. UTC
Converts release_pte_pages() to use folios instead of pages.
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
mm/khugepaged.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
Comments
Hi,

On 14.01.2023 01:15, Vishal Moola (Oracle) wrote:
> Converts release_pte_pages() to use folios instead of pages.
>
> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>

This patch has been merged some time ago to linux-next as commit
9bdfeea46f49 ("mm/khugepaged: convert release_pte_pages() to use
folios"). It took me a while to bisect this (mainly because I was busy
with other things), but I finally found that this change is responsible
for the following kernel panic:

Unable to handle kernel paging request at virtual address fffffc0000000008
Mem abort info:
  ESR = 0x0000000096000006
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x06: level 2 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000006
  CM = 0, WnR = 0
swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000021efa000
[fffffc0000000008] pgd=10000000df05a003, p4d=10000000df05a003,
pud=10000000df059003, pmd=0000000000000000
Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
Modules linked in: ip_tables x_tables ipv6
CPU: 7 PID: 61 Comm: khugepaged Not tainted 6.2.0-rc4+ #13307
Hardware name: Samsung TM2E board (DT)
pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : hpage_collapse_scan_pmd+0x12ec/0x1a20
lr : hpage_collapse_scan_pmd+0x14b0/0x1a20
sp : ffff80000be13c20
x29: ffff80000be13c20 x28: 0000000000000001 x27: fffffc0000d3f5c0
x26: fffffc0000d3f600 x25: 00000000000001f9 x24: 0000000000000007
x23: ffff0000296f9dd8 x22: ffff800009e5b490 x21: 0000000000000000
x20: 000000000000000f x19: ffff80000a9d0000 x18: ffff80000af52e58
x17: 0000000000000028 x16: 0000000000009249 x15: ffff80000af971f8
x14: 0000000000000000 x13: 00000000000443a0 x12: 0000000000040000
x11: 000000000fffffff x10: ffff000024928880 x9 : ffff80000b5c6e98
x8 : ffff000024928000 x7 : 00000000b35d04b9 x6 : 0000000000000000
x5 : fffffc0000000000 x4 : ffff8000cbf2e000 x3 : 0000000000000000
x2 : 0000000000000000 x1 : 0000000000000000 x0 : fffffc0000000000
Call trace:
 hpage_collapse_scan_pmd+0x12ec/0x1a20
 khugepaged+0x7e0/0x8dc
 kthread+0x118/0x11c
 ret_from_fork+0x10/0x20
Code: d34cbc43 cb813061 d37ae421 8b050020 (f9400404)
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops: Fatal exception
SMP: stopping secondary CPUs
Kernel Offset: disabled
CPU features: 0x8c000,41c78100,0000421b
Memory Limit: none
---[ end Kernel panic - not syncing: Oops: Fatal exception ]---

Reverting it on top of recent linux-next fixes the issue, so it looks
that some kind of a corner case is missing in this patch. I can
reproduce it usually during the system shutdown, 1 of 20 times on the
average.

> ---
>  mm/khugepaged.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 4888e8688401..27d010431ece 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -509,20 +509,20 @@ static void release_pte_page(struct page *page)
>  static void release_pte_pages(pte_t *pte, pte_t *_pte,
>  		struct list_head *compound_pagelist)
>  {
> -	struct page *page, *tmp;
> +	struct folio *folio, *tmp;
>
>  	while (--_pte >= pte) {
>  		pte_t pteval = *_pte;
>
> -		page = pte_page(pteval);
> +		folio = pfn_folio(pte_pfn(pteval));
>  		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> -				!PageCompound(page))
> -			release_pte_page(page);
> +				!folio_test_large(folio))
> +			release_pte_folio(folio);
>  	}
>
> -	list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
> -		list_del(&page->lru);
> -		release_pte_page(page);
> +	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> +		list_del(&folio->lru);
> +		release_pte_folio(folio);
>  	}
>  }
>

Best regards
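
One possible reading of the faulting address, offered only as a sketch: if the struct page array is assumed to start at the value held in x0/x5 (fffffc0000000000) and an empty PTE is assumed to yield pfn 0, then a load 8 bytes into the first entry lands exactly on the reported address fffffc0000000008. The 64-byte entry size, the 8-byte field offset, and the helper names below are illustrative assumptions, not values taken from kernel headers or stated in the report.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for illustration only (not taken from kernel headers):
 * each struct page entry is 64 bytes, and the field that the folio
 * lookup reads sits at byte offset 8 within the entry. */
#define ASSUMED_STRUCT_PAGE_SIZE 64ULL
#define ASSUMED_FIELD_OFFSET      8ULL

/* Address of the entry for pfn in a flat array starting at base. */
uint64_t struct_page_addr(uint64_t base, uint64_t pfn)
{
	return base + pfn * ASSUMED_STRUCT_PAGE_SIZE;
}

/* Address that a load of the assumed field would touch. */
uint64_t field_load_addr(uint64_t base, uint64_t pfn)
{
	return struct_page_addr(base, pfn) + ASSUMED_FIELD_OFFSET;
}

/* Compare with the oops: x0 and x5 hold 0xfffffc0000000000 and the
 * fault is reported at 0xfffffc0000000008. */
void check_against_oops(void)
{
	assert(field_load_addr(0xfffffc0000000000ULL, 0) ==
	       0xfffffc0000000008ULL);
}
```

Under these assumptions the crash would come from touching the array entry for pfn 0, which the pmd=0000000000000000 line suggests is not mapped on this board.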
Hi,

On 2/13/23 09:53, Marek Szyprowski wrote:
> Hi,
>
> On 14.01.2023 01:15, Vishal Moola (Oracle) wrote:
>> Converts release_pte_pages() to use folios instead of pages.
>>
>> Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
> This patch has been merged some time ago to linux-next as commit
> 9bdfeea46f49 ("mm/khugepaged: convert release_pte_pages() to use
> folios"). It took me a while to bisect this (mainly because I was busy
> with other things), but I finally found that this change is responsible
> for the following kernel panic:
>
> Unable to handle kernel paging request at virtual address fffffc0000000008
> Mem abort info:
>   ESR = 0x0000000096000006
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
>   FSC = 0x06: level 2 translation fault
> Data abort info:
>   ISV = 0, ISS = 0x00000006
>   CM = 0, WnR = 0
> swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000021efa000
> [fffffc0000000008] pgd=10000000df05a003, p4d=10000000df05a003,
> pud=10000000df059003, pmd=0000000000000000
> Internal error: Oops: 0000000096000006 [#1] PREEMPT SMP
> Modules linked in: ip_tables x_tables ipv6
> CPU: 7 PID: 61 Comm: khugepaged Not tainted 6.2.0-rc4+ #13307
> Hardware name: Samsung TM2E board (DT)
> pstate: 80000005 (Nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> pc : hpage_collapse_scan_pmd+0x12ec/0x1a20
> lr : hpage_collapse_scan_pmd+0x14b0/0x1a20
> sp : ffff80000be13c20
> x29: ffff80000be13c20 x28: 0000000000000001 x27: fffffc0000d3f5c0
> x26: fffffc0000d3f600 x25: 00000000000001f9 x24: 0000000000000007
> x23: ffff0000296f9dd8 x22: ffff800009e5b490 x21: 0000000000000000
> x20: 000000000000000f x19: ffff80000a9d0000 x18: ffff80000af52e58
> x17: 0000000000000028 x16: 0000000000009249 x15: ffff80000af971f8
> x14: 0000000000000000 x13: 00000000000443a0 x12: 0000000000040000
> x11: 000000000fffffff x10: ffff000024928880 x9 : ffff80000b5c6e98
> x8 : ffff000024928000 x7 : 00000000b35d04b9 x6 : 0000000000000000
> x5 : fffffc0000000000 x4 : ffff8000cbf2e000 x3 : 0000000000000000
> x2 : 0000000000000000 x1 : 0000000000000000 x0 : fffffc0000000000
> Call trace:
>  hpage_collapse_scan_pmd+0x12ec/0x1a20
>  khugepaged+0x7e0/0x8dc
>  kthread+0x118/0x11c
>  ret_from_fork+0x10/0x20
> Code: d34cbc43 cb813061 d37ae421 8b050020 (f9400404)
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Oops: Fatal exception
> SMP: stopping secondary CPUs
> Kernel Offset: disabled
> CPU features: 0x8c000,41c78100,0000421b
> Memory Limit: none
> ---[ end Kernel panic - not syncing: Oops: Fatal exception ]---
>
>
> Reverting it on top of recent linux-next fixes the issue, so it looks
> that some kind of a corner case is missing in this patch. I can
> reproduce it usually during the system shutdown, 1 of 20 times on the
> average.

I have been debugging this issue since this morning too!

>
>> ---
>>  mm/khugepaged.c | 14 +++++++-------
>>  1 file changed, 7 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 4888e8688401..27d010431ece 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -509,20 +509,20 @@ static void release_pte_page(struct page *page)
>>  static void release_pte_pages(pte_t *pte, pte_t *_pte,
>>  		struct list_head *compound_pagelist)
>>  {
>> -	struct page *page, *tmp;
>> +	struct folio *folio, *tmp;
>>
>>  	while (--_pte >= pte) {
>>  		pte_t pteval = *_pte;
>>
>> -		page = pte_page(pteval);
>> +		folio = pfn_folio(pte_pfn(pteval));

The issue lies here: before using pteval in pfn_folio(), we should test it.

The following patch fixes the issue for me:

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index eb38bd1b1b2f..fef3414b481b 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -514,10 +514,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
 	while (--_pte >= pte) {
 		pte_t pteval = *_pte;
 
-		folio = pfn_folio(pte_pfn(pteval));
-		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
-				!folio_test_large(folio))
-			release_pte_folio(folio);
+		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval))) {
+			folio = pfn_folio(pte_pfn(pteval));
+
+			if (!folio_test_large(folio))
+				release_pte_folio(folio);
+		}
 	}
 
 	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {

@Marek: could you give it a try?

I can send a separate patch if needed, let me know.

Thanks,

Alex

>>  		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
>> -				!PageCompound(page))
>> -			release_pte_page(page);
>> +				!folio_test_large(folio))
>> +			release_pte_folio(folio);
>>  	}
>>
>> -	list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
>> -		list_del(&page->lru);
>> -		release_pte_page(page);
>> +	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
>> +		list_del(&folio->lru);
>> +		release_pte_folio(folio);
>>  	}
>>  }
>>
> Best regards
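
The ordering problem described above can be modelled in user space. The sketch below is not kernel code: every `mock_` name, the counter, and the choice of pfn 0 as the "unbacked" entry are hypothetical stand-ins. It only shows that looking a folio up before testing the PTE touches an entry for an empty PTE, while testing first never does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; none of this is kernel code. */
#define MOCK_ZERO_PFN 1UL
#define MOCK_NR_PFNS  8UL

struct mock_folio {
	bool large;
	int released;
};

struct mock_folio mock_folios[MOCK_NR_PFNS];
int unbacked_lookups;	/* lookups that would have faulted in the kernel */

static bool mock_pte_none(unsigned long pte) { return pte == 0; }
static unsigned long mock_pte_pfn(unsigned long pte) { return pte; }
static bool mock_is_zero_pfn(unsigned long pfn) { return pfn == MOCK_ZERO_PFN; }

/* Stand-in for pfn_folio(): the lookup itself touches the entry, so an
 * unbacked pfn (here: pfn 0 or out of range) is counted, not crashed on. */
static struct mock_folio *mock_pfn_folio(unsigned long pfn)
{
	if (pfn == 0 || pfn >= MOCK_NR_PFNS)
		unbacked_lookups++;
	return &mock_folios[pfn % MOCK_NR_PFNS];
}

/* Ordering from the original patch: look up first, test afterwards. */
void release_lookup_first(const unsigned long *ptes, size_t n)
{
	while (n--) {
		unsigned long pte = ptes[n];
		struct mock_folio *folio = mock_pfn_folio(mock_pte_pfn(pte));

		if (!mock_pte_none(pte) && !mock_is_zero_pfn(mock_pte_pfn(pte)) &&
		    !folio->large)
			folio->released++;
	}
}

/* Ordering from the proposed fix: test the entry, then look up. */
void release_test_first(const unsigned long *ptes, size_t n)
{
	while (n--) {
		unsigned long pte = ptes[n];

		if (!mock_pte_none(pte) && !mock_is_zero_pfn(mock_pte_pfn(pte))) {
			struct mock_folio *folio = mock_pfn_folio(mock_pte_pfn(pte));

			if (!folio->large)
				folio->released++;
		}
	}
}

/* Usage: one empty entry, one zero-pfn entry, one small and one large folio. */
void demo_orderings(void)
{
	const unsigned long ptes[] = { 0, MOCK_ZERO_PFN, 2, 3 };

	mock_folios[3].large = true;

	unbacked_lookups = 0;
	release_lookup_first(ptes, 4);
	assert(unbacked_lookups == 1);	/* the empty entry was looked up */

	unbacked_lookups = 0;
	release_test_first(ptes, 4);
	assert(unbacked_lookups == 0);	/* the empty entry was skipped */
}
```

Both orderings release the same folios in this model; they differ only in whether the lookup runs for entries that the later checks would reject anyway.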
On Mon, Feb 13, 2023 at 04:28:05PM +0100, Alexandre Ghiti wrote:
> The issue lies here: before using pteval in pfn_folio(), we should test it.
> The following patch fixes the issue for me:

Thanks for debugging it. I'd rather see this written as ...

 		pte_t pteval = *_pte;
+		unsigned long pfn;

+		if (pte_none(pteval))
+			continue;
+		pfn = pte_pfn(pteval);
+		if (is_zero_pfn(pfn))
+			continue;
+		folio = pfn_folio(pfn);
+		if (folio_test_large(folio))
+			continue;
 		release_pte_folio(folio);

makes sense?

> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index eb38bd1b1b2f..fef3414b481b 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -514,10 +514,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>  	while (--_pte >= pte) {
>  		pte_t pteval = *_pte;
>
> -		folio = pfn_folio(pte_pfn(pteval));
> -		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> -				!folio_test_large(folio))
> -			release_pte_folio(folio);
> +		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval))) {
> +			folio = pfn_folio(pte_pfn(pteval));
> +
> +			if (!folio_test_large(folio))
> +				release_pte_folio(folio);
> +		}
>  	}
>
>  	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
>
>
> @Marek: could you give it a try?
>
> I can send a separate patch if needed, let me know.
>
> Thanks,
>
> Alex
>
>
> > >  		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> > > -				!PageCompound(page))
> > > -			release_pte_page(page);
> > > +				!folio_test_large(folio))
> > > +			release_pte_folio(folio);
> > >  	}
> > > -	list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
> > > -		list_del(&page->lru);
> > > -		release_pte_page(page);
> > > +	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> > > +		list_del(&folio->lru);
> > > +		release_pte_folio(folio);
> > >  	}
> > >  }
> > Best regards
>
Hi Matthew,

On 2/13/23 16:50, Matthew Wilcox wrote:
> On Mon, Feb 13, 2023 at 04:28:05PM +0100, Alexandre Ghiti wrote:
>> The issue lies here: before using pteval in pfn_folio(), we should test it.
>> The following patch fixes the issue for me:
> Thanks for debugging it. I'd rather see this written as ...
>
>  		pte_t pteval = *_pte;
> +		unsigned long pfn;
>
> +		if (pte_none(pteval))
> +			continue;
> +		pfn = pte_pfn(pteval);
> +		if (is_zero_pfn(pfn))
> +			continue;
> +		folio = pfn_folio(pfn);
> +		if (folio_test_large(folio))
> +			continue;
>  		release_pte_folio(folio);
>
> makes sense?

Sure, that's fine by me, I can send that or I'll add my tested-by on
what you send, whatever suits you.

Alex

>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index eb38bd1b1b2f..fef3414b481b 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -514,10 +514,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
>>  	while (--_pte >= pte) {
>>  		pte_t pteval = *_pte;
>>
>> -		folio = pfn_folio(pte_pfn(pteval));
>> -		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
>> -				!folio_test_large(folio))
>> -			release_pte_folio(folio);
>> +		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval))) {
>> +			folio = pfn_folio(pte_pfn(pteval));
>> +
>> +			if (!folio_test_large(folio))
>> +				release_pte_folio(folio);
>> +		}
>>  	}
>>
>>  	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
>>
>>
>> @Marek: could you give it a try?
>>
>> I can send a separate patch if needed, let me know.
>>
>> Thanks,
>>
>> Alex
>>
>>
>>>>  		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
>>>> -				!PageCompound(page))
>>>> -			release_pte_page(page);
>>>> +				!folio_test_large(folio))
>>>> +			release_pte_folio(folio);
>>>>  	}
>>>> -	list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
>>>> -		list_del(&page->lru);
>>>> -		release_pte_page(page);
>>>> +	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
>>>> +		list_del(&folio->lru);
>>>> +		release_pte_folio(folio);
>>>>  	}
>>>>  }
>>> Best regards
On Mon, Feb 13, 2023 at 7:55 AM Alexandre Ghiti <alex@ghiti.fr> wrote:
>
> Hi Matthew,
>
> On 2/13/23 16:50, Matthew Wilcox wrote:
> > On Mon, Feb 13, 2023 at 04:28:05PM +0100, Alexandre Ghiti wrote:
> >> The issue lies here: before using pteval in pfn_folio(), we should test it.
> >> The following patch fixes the issue for me:
> > Thanks for debugging it. I'd rather see this written as ...
> >
> >  		pte_t pteval = *_pte;
> > +		unsigned long pfn;
> >
> > +		if (pte_none(pteval))
> > +			continue;
> > +		pfn = pte_pfn(pteval);
> > +		if (is_zero_pfn(pfn))
> > +			continue;
> > +		folio = pfn_folio(pfn);
> > +		if (folio_test_large(folio))
> > +			continue;
> >  		release_pte_folio(folio);
> >
> > makes sense?
>
>
> Sure, that's fine by me, I can send that or I'll add my tested-by on
> what you send, whatever suits you.

Thanks for debugging this! I'll send a fix patch using Matthew's
approach later today.

> Alex
>
> >
> >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> >> index eb38bd1b1b2f..fef3414b481b 100644
> >> --- a/mm/khugepaged.c
> >> +++ b/mm/khugepaged.c
> >> @@ -514,10 +514,12 @@ static void release_pte_pages(pte_t *pte, pte_t *_pte,
> >>  	while (--_pte >= pte) {
> >>  		pte_t pteval = *_pte;
> >>
> >> -		folio = pfn_folio(pte_pfn(pteval));
> >> -		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> >> -				!folio_test_large(folio))
> >> -			release_pte_folio(folio);
> >> +		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval))) {
> >> +			folio = pfn_folio(pte_pfn(pteval));
> >> +
> >> +			if (!folio_test_large(folio))
> >> +				release_pte_folio(folio);
> >> +		}
> >>  	}
> >>
> >>  	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> >>
> >>
> >> @Marek: could you give it a try?
> >>
> >> I can send a separate patch if needed, let me know.
> >>
> >> Thanks,
> >>
> >> Alex
> >>
> >>
> >>>>  		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
> >>>> -				!PageCompound(page))
> >>>> -			release_pte_page(page);
> >>>> +				!folio_test_large(folio))
> >>>> +			release_pte_folio(folio);
> >>>>  	}
> >>>> -	list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
> >>>> -		list_del(&page->lru);
> >>>> -		release_pte_page(page);
> >>>> +	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
> >>>> +		list_del(&folio->lru);
> >>>> +		release_pte_folio(folio);
> >>>>  	}
> >>>>  }
> >>> Best regards
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4888e8688401..27d010431ece 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -509,20 +509,20 @@ static void release_pte_page(struct page *page)
 static void release_pte_pages(pte_t *pte, pte_t *_pte,
 		struct list_head *compound_pagelist)
 {
-	struct page *page, *tmp;
+	struct folio *folio, *tmp;
 
 	while (--_pte >= pte) {
 		pte_t pteval = *_pte;
 
-		page = pte_page(pteval);
+		folio = pfn_folio(pte_pfn(pteval));
 		if (!pte_none(pteval) && !is_zero_pfn(pte_pfn(pteval)) &&
-				!PageCompound(page))
-			release_pte_page(page);
+				!folio_test_large(folio))
+			release_pte_folio(folio);
 	}
 
-	list_for_each_entry_safe(page, tmp, compound_pagelist, lru) {
-		list_del(&page->lru);
-		release_pte_page(page);
+	list_for_each_entry_safe(folio, tmp, compound_pagelist, lru) {
+		list_del(&folio->lru);
+		release_pte_folio(folio);
 	}
 }
 