Message ID | 20230810095652.3905184-1-fengwei.yin@intel.com |
---|---|
State | New |
Headers |
From: Yin Fengwei <fengwei.yin@intel.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org, hannes@cmpxchg.org
Cc: fengwei.yin@intel.com
Subject: [PATCH] zswap: don't warn if none swapcache folio is passed to zswap_load
Date: Thu, 10 Aug 2023 17:56:52 +0800
Message-Id: <20230810095652.3905184-1-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
List-ID: <linux-kernel.vger.kernel.org> |
Series |
zswap: don't warn if none swapcache folio is passed to zswap_load
|
|
Commit Message
Yin Fengwei
Aug. 10, 2023, 9:56 a.m. UTC
With the mm-unstable branch, triggering swap activity can produce the
following warning:
[ 178.093511][ T651] WARNING: CPU: 2 PID: 651 at mm/zswap.c:1387 zswap_load+0x67/0x570
[ 178.095155][ T651] Modules linked in:
[ 178.096103][ T651] CPU: 2 PID: 651 Comm: gmain Not tainted 6.5.0-rc4-00492-gad3232df3e41 #148
[ 178.098372][ T651] Hardware name: QEMU Standard PC (i440FX + PIIX,1996), BIOS 1.14.0-2 04/01/2014
[ 178.101114][ T651] RIP: 0010:zswap_load+0x67/0x570
[ 178.102359][ T651] Code: a0 78 4b 85 e8 ea db ff ff 48 8b 00 a8 01 0f 84 84 04 00 00 48 89 df e8 d7 db ff ff 48 8b 00 a9 00 00 08 00 0f 85 c4
[ 178.106376][ T651] RSP: 0018:ffffc900011b3760 EFLAGS: 00010246
[ 178.107675][ T651] RAX: 0017ffffc0080001 RBX: ffffea0004a991c0 RCX:ffffc900011b37dc
[ 178.109242][ T651] RDX: 0000000000000000 RSI: 0000000000000001 RDI:ffffea0004a991c0
[ 178.110916][ T651] RBP: ffffea0004a991c0 R08: 0000000000000243 R09:00000000c9a1aafc
[ 178.112377][ T651] R10: 00000000c9657db3 R11: 000000003c9657db R12:0000000000014b9c
[ 178.113698][ T651] R13: ffff88813501e710 R14: ffff88810d591000 R15:0000000000000000
[ 178.115008][ T651] FS: 00007fb21a9ff700(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
[ 178.116423][ T651] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 178.117421][ T651] CR2: 00005632cbfc81f6 CR3: 0000000131450002 CR4:0000000000370ee0
[ 178.118683][ T651] DR0: 0000000000000000 DR1: 0000000000000000 DR2:0000000000000000
[ 178.119894][ T651] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
[ 178.121087][ T651] Call Trace:
[ 178.121654][ T651] <TASK>
[ 178.122109][ T651] ? zswap_load+0x67/0x570
[ 178.122658][ T651] ? __warn+0x81/0x170
[ 178.123119][ T651] ? zswap_load+0x67/0x570
[ 178.123608][ T651] ? report_bug+0x167/0x190
[ 178.124150][ T651] ? handle_bug+0x3c/0x70
[ 178.124615][ T651] ? exc_invalid_op+0x13/0x60
[ 178.125192][ T651] ? asm_exc_invalid_op+0x16/0x20
[ 178.125753][ T651] ? zswap_load+0x67/0x570
[ 178.126231][ T651] ? lock_acquire+0xbb/0x290
[ 178.126745][ T651] ? folio_add_lru+0x40/0x1c0
[ 178.127261][ T651] ? find_held_lock+0x2b/0x80
[ 178.127776][ T651] swap_readpage+0xc7/0x5c0
[ 178.128273][ T651] do_swap_page+0x86d/0xf50
[ 178.128770][ T651] ? __pte_offset_map+0x3e/0x290
[ 178.129321][ T651] ? __pte_offset_map+0x1c4/0x290
[ 178.129883][ T651] __handle_mm_fault+0x6ad/0xca0
[ 178.130419][ T651] handle_mm_fault+0x18b/0x410
[ 178.130992][ T651] do_user_addr_fault+0x1f1/0x820
[ 178.132076][ T651] exc_page_fault+0x63/0x1a0
[ 178.132599][ T651] asm_exc_page_fault+0x22/0x30
It's possible that swap_readpage() is called with a non-swapcache folio
in do_swap_page(), which triggers this warning. So we shouldn't assume
zswap_load() always takes a swapcache folio.
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
---
mm/zswap.c | 1 -
1 file changed, 1 deletion(-)
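The situation described in the commit message can be sketched in user-space C. This is only an illustrative model, not the kernel code: `struct mock_swap_info`, `fault_skips_swapcache()` and `old_check_would_warn()` are hypothetical names, the flag value is a stand-in, and the "single reference" condition is an assumption about how the fault path decides to bypass the swapcache on synchronous devices such as zram.

```c
#include <stdbool.h>

/* Stand-in for the kernel flag of the same name; the value here is illustrative. */
#define SWP_SYNCHRONOUS_IO (1u << 12)

/* Hypothetical, reduced model of a swap device descriptor. */
struct mock_swap_info {
	unsigned int flags;
};

/*
 * Assumed model of the fault-path decision: on a synchronous swap device,
 * an entry with a single reference is read straight into a fresh folio,
 * which is never added to the swapcache.
 */
static inline bool fault_skips_swapcache(const struct mock_swap_info *si,
					 int swap_count)
{
	return (si->flags & SWP_SYNCHRONOUS_IO) && swap_count == 1;
}

/*
 * Model of the check this patch drops: it fires for every folio that is
 * not in the swapcache, regardless of the kind of swap device.
 */
static inline bool old_check_would_warn(bool folio_in_swapcache)
{
	return !folio_in_swapcache;
}
```

Under these assumptions, a zram-like device with a singly referenced entry makes `fault_skips_swapcache()` return true, so the folio reaching zswap_load() is not in the swapcache and `old_check_would_warn(false)` fires, which matches the trace above.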
Comments
On Thu, Aug 10, 2023 at 3:58 AM Yin Fengwei <fengwei.yin@intel.com> wrote:
>
> With the mm-unstable branch, triggering swap activity can produce the
> following warning:
> [...]
>
> It's possible that swap_readpage() is called with a non-swapcache folio
> in do_swap_page(), which triggers this warning. So we shouldn't assume
> zswap_load() always takes a swapcache folio.

Did you use a bdev with QUEUE_FLAG_SYNCHRONOUS? Otherwise it sounds
like a bug to me.
On 8/11/2023 2:44 AM, Yu Zhao wrote:
> On Thu, Aug 10, 2023 at 3:58 AM Yin Fengwei <fengwei.yin@intel.com> wrote:
>> [...]
>
> Did you use a bdev with QUEUE_FLAG_SYNCHRONOUS? Otherwise it sounds
> like a bug to me.

I hit this warning with zram which has QUEUE_FLAG_SYNCHRONOUS set. Thanks.

Regards
Yin, Fengwei
On Thu, Aug 10, 2023 at 5:09 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
>
> On 8/11/2023 2:44 AM, Yu Zhao wrote:
> > [...]
> > Did you use a bdev with QUEUE_FLAG_SYNCHRONOUS? Otherwise it sounds
> > like a bug to me.
>
> I hit this warning with zram which has QUEUE_FLAG_SYNCHRONOUS set. Thanks.

Reviewed-by: Yu Zhao <yuzhao@google.com>
On Thu, Aug 10, 2023 at 4:09 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
>
> On 8/11/2023 2:44 AM, Yu Zhao wrote:
> > [...]
> > Did you use a bdev with QUEUE_FLAG_SYNCHRONOUS? Otherwise it sounds
> > like a bug to me.
> I hit this warning with zram which has QUEUE_FLAG_SYNCHRONOUS set. Thanks.

Does it make sense to keep the warning and instead change it to check
SWP_SYNCHRONOUS_IO as well? Something like:

VM_WARN_ON_ONCE(!folio_test_swapcache(folio) &&
        !swap_type_to_swap_info(type)->flags && SWP_SYNCHRONOUS_IO);

Of course this is too ugly, so perhaps we want a helper to check if a
swapfile is synchronous.
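The helper idea suggested above could look roughly like the following user-space sketch. It is not taken from the kernel: `swapfile_is_synchronous()` and `relaxed_check_would_warn()` are hypothetical names and the flag value is a stand-in. Note that the snippet in the message joins the flags word and the flag with a logical `&&`; a bitwise `&` is presumably what was intended, and that is what the sketch uses.

```c
#include <stdbool.h>

/* Stand-in for the kernel flag of the same name; the value here is illustrative. */
#define SWP_SYNCHRONOUS_IO (1u << 12)

/*
 * Hypothetical helper: report whether a swap device's flags word marks it
 * as synchronous. A bitwise AND isolates the flag bit; a logical && would
 * only ask whether both operands are non-zero.
 */
static inline bool swapfile_is_synchronous(unsigned int flags)
{
	return (flags & SWP_SYNCHRONOUS_IO) != 0;
}

/*
 * Hypothetical relaxed check: complain about a folio outside the swapcache
 * only when the backing swap device is not synchronous.
 */
static inline bool relaxed_check_would_warn(bool folio_in_swapcache,
					    unsigned int flags)
{
	return !folio_in_swapcache && !swapfile_is_synchronous(flags);
}
```

With this shape, a non-swapcache folio on a device whose flags include `SWP_SYNCHRONOUS_IO` passes silently, while the same folio on any other device still trips the check. Whether such a relaxed check is worth having is exactly what the rest of the thread debates.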
On 8/11/2023 7:15 AM, Yosry Ahmed wrote:
> [...]
> Does it make sense to keep the warning and instead change it to check
> SWP_SYNCHRONOUS_IO as well? Something like:
>
> VM_WARN_ON_ONCE(!folio_test_swapcache(folio) &&
>         !swap_type_to_swap_info(type)->flags && SWP_SYNCHRONOUS_IO);
>
> Of course this is too ugly, so perhaps we want a helper to check if a
> swapfile is synchronous.

My understanding was that the WARN here means zswap_load() doesn't expect
a folio that is not in the swapcache. With zram, swap_readpage() must accept
a folio that is not in the swapcache, so this warning should not be there.

But your comment makes more sense to me. I will update the patch so that
it does not remove this WARN. Thanks.

Regards
Yin, Fengwei
On Thu, Aug 10, 2023 at 4:31 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
> [...]
> My understanding was that the WARN here means zswap_load() doesn't expect
> a folio that is not in the swapcache. With zram, swap_readpage() must accept
> a folio that is not in the swapcache, so this warning should not be there.
>
> But your comment makes more sense to me. I will update the patch so that
> it does not remove this WARN. Thanks.

Thanks. What I have in mind is that usually zram & zswap are not used
together (which is probably why no one reported this warning before), so
in the common case this warning is valuable.
On Thu, Aug 10, 2023 at 5:31 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
>
> On 8/11/2023 7:15 AM, Yosry Ahmed wrote:
> > On Thu, Aug 10, 2023 at 4:09 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
> >>
> >> On 8/11/2023 2:44 AM, Yu Zhao wrote:
> >>> On Thu, Aug 10, 2023 at 3:58 AM Yin Fengwei <fengwei.yin@intel.com> wrote:
> >>>>
> >>>> With mm-unstable branch, if trigger swap activity and it's possible
> >>>> see following warning:
> >>>> [ 178.093511][ T651] WARNING: CPU: 2 PID: 651 at mm/zswap.c:1387 zswap_load+0x67/0x570
> >>>> [ 178.095155][ T651] Modules linked in:
> >>>> [ 178.096103][ T651] CPU: 2 PID: 651 Comm: gmain Not tainted 6.5.0-rc4-00492-gad3232df3e41 #148
> >>>> [ 178.098372][ T651] Hardware name: QEMU Standard PC (i440FX + PIIX,1996), BIOS 1.14.0-2 04/01/2014
> >>>> [ 178.101114][ T651] RIP: 0010:zswap_load+0x67/0x570
> >>>> [ 178.102359][ T651] Code: a0 78 4b 85 e8 ea db ff ff 48 8b 00 a8 01 0f 84 84 04 00 00 48 89 df e8 d7 db ff ff 48 8b 00 a9 00 00 08 00 0f 85 c4
> >>>> [ 178.106376][ T651] RSP: 0018:ffffc900011b3760 EFLAGS: 00010246
> >>>> [ 178.107675][ T651] RAX: 0017ffffc0080001 RBX: ffffea0004a991c0 RCX:ffffc900011b37dc
> >>>> [ 178.109242][ T651] RDX: 0000000000000000 RSI: 0000000000000001 RDI:ffffea0004a991c0
> >>>> [ 178.110916][ T651] RBP: ffffea0004a991c0 R08: 0000000000000243 R09:00000000c9a1aafc
> >>>> [ 178.112377][ T651] R10: 00000000c9657db3 R11: 000000003c9657db R12:0000000000014b9c
> >>>> [ 178.113698][ T651] R13: ffff88813501e710 R14: ffff88810d591000 R15:0000000000000000
> >>>> [ 178.115008][ T651] FS: 00007fb21a9ff700(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
> >>>> [ 178.116423][ T651] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> [ 178.117421][ T651] CR2: 00005632cbfc81f6 CR3: 0000000131450002 CR4:0000000000370ee0
> >>>> [ 178.118683][ T651] DR0: 0000000000000000 DR1: 0000000000000000 DR2:0000000000000000
> >>>> [ 178.119894][ T651] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
> >>>> [ 178.121087][ T651] Call Trace:
> >>>> [ 178.121654][ T651] <TASK>
> >>>> [ 178.122109][ T651] ? zswap_load+0x67/0x570
> >>>> [ 178.122658][ T651] ? __warn+0x81/0x170
> >>>> [ 178.123119][ T651] ? zswap_load+0x67/0x570
> >>>> [ 178.123608][ T651] ? report_bug+0x167/0x190
> >>>> [ 178.124150][ T651] ? handle_bug+0x3c/0x70
> >>>> [ 178.124615][ T651] ? exc_invalid_op+0x13/0x60
> >>>> [ 178.125192][ T651] ? asm_exc_invalid_op+0x16/0x20
> >>>> [ 178.125753][ T651] ? zswap_load+0x67/0x570
> >>>> [ 178.126231][ T651] ? lock_acquire+0xbb/0x290
> >>>> [ 178.126745][ T651] ? folio_add_lru+0x40/0x1c0
> >>>> [ 178.127261][ T651] ? find_held_lock+0x2b/0x80
> >>>> [ 178.127776][ T651] swap_readpage+0xc7/0x5c0
> >>>> [ 178.128273][ T651] do_swap_page+0x86d/0xf50
> >>>> [ 178.128770][ T651] ? __pte_offset_map+0x3e/0x290
> >>>> [ 178.129321][ T651] ? __pte_offset_map+0x1c4/0x290
> >>>> [ 178.129883][ T651] __handle_mm_fault+0x6ad/0xca0
> >>>> [ 178.130419][ T651] handle_mm_fault+0x18b/0x410
> >>>> [ 178.130992][ T651] do_user_addr_fault+0x1f1/0x820
> >>>> [ 178.132076][ T651] exc_page_fault+0x63/0x1a0
> >>>> [ 178.132599][ T651] asm_exc_page_fault+0x22/0x30
> >>>>
> >>>> It's possible that swap_readpage() is called with none swapcache folio
> >>>> in do_swap_page() and trigger this warning. So we shouldn't assume
> >>>> zswap_load() always takes swapcache folio.
> >>>
> >>> Did you use a bdev with QUEUE_FLAG_SYNCHRONOUS? Otherwise it sounds
> >>> like a bug to me.
> >> I hit this warning with zram which has QUEUE_FLAG_SYNCHRONOUS set. Thanks.
> >
> > Does it make sense to keep the warning and instead change it to check
> > SWP_SYNCHRONOUS_IO as well? Something like:
> >
> > VM_WARN_ON_ONCE(!folio_test_swapcache(folio) &&
> > !swap_type_to_swap_info(type)->flags && SWP_SYNCHRONOUS_IO);
> >
> > Of course this is too ugly, so perhaps we want a helper to check if a
> > swapfile is synchronous.
> My understanding was that the WARN here is zswap_load() doesn't expect
> a folio not in swapcache. With zram, swap_readpage() must accept the
> folio not in swapcache. So this warn should not be there.
>
> But your comment make more sense to me. I will update the patch not
> to remove this WARN. Thanks.

That can cause another warning.

Please don't overengineer.
On Thu, Aug 10, 2023 at 4:44 PM Yu Zhao <yuzhao@google.com> wrote:
>
> On Thu, Aug 10, 2023 at 5:31 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
> >
> > On 8/11/2023 7:15 AM, Yosry Ahmed wrote:
> > > [...]
> > > Does it make sense to keep the warning and instead change it to check
> > > SWP_SYNCHRONOUS_IO as well? Something like:
> > >
> > > VM_WARN_ON_ONCE(!folio_test_swapcache(folio) &&
> > > !swap_type_to_swap_info(type)->flags && SWP_SYNCHRONOUS_IO);
> > >
> > > Of course this is too ugly, so perhaps we want a helper to check if a
> > > swapfile is synchronous.
> > My understanding was that the WARN here is zswap_load() doesn't expect
> > a folio not in swapcache. With zram, swap_readpage() must accept the
> > folio not in swapcache. So this warn should not be there.
> >
> > But your comment make more sense to me. I will update the patch not
> > to remove this WARN. Thanks.
>
> That can cause another warning.
>
> Please don't overengineer.

How so?

Using zswap with zram is a weird combination, if anything I would
prefer leaving the warning as-is than removing it to be honest.
On 8/11/2023 7:43 AM, Yu Zhao wrote:
> On Thu, Aug 10, 2023 at 5:31 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
>> [...]
>> My understanding was that the WARN here is zswap_load() doesn't expect
>> a folio not in swapcache. With zram, swap_readpage() must accept the
>> folio not in swapcache. So this warn should not be there.
>>
>> But your comment make more sense to me. I will update the patch not
>> to remove this WARN. Thanks.
>
> That can cause another warning.
My understanding is that WARN may be wanted by zswap code.

>
> Please don't overengineer.
On Thu, Aug 10, 2023 at 6:37 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
>
> On 8/11/2023 7:43 AM, Yu Zhao wrote:
> > On Thu, Aug 10, 2023 at 5:31 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
> >> [...]
> >> But your comment make more sense to me. I will update the patch not
> >> to remove this WARN. Thanks.
> >
> > That can cause another warning.
> My understanding is that WARN may be wanted by zswap code.
>
> >
> > Please don't overengineer.

The original patch looks good to me. What Yosry suggested seems not
only overengineered but also can cause a new KCSAN warning.
On Thu, Aug 10, 2023 at 8:03 PM Yu Zhao <yuzhao@google.com> wrote:
>
> On Thu, Aug 10, 2023 at 6:37 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
> > [...]
> > > That can cause another warning.
> > My understanding is that WARN may be wanted by zswap code.
> >
> > >
> > > Please don't overengineer.
>
> The original patch looks good to me. What Yosry suggested seems not
> only overengineered but also can cause a new KCSAN warning.

I suppose that can be easily mitigated with data_race(), similar to
do_swap_page(). Anyway, I don't feel strongly about it, if you do then
we can go with the current patch :)

It just feels odd to me to drop a warning from zswap due to an
interaction with zram, which should not be happening in practice.
On Thu, Aug 10, 2023 at 5:46 PM Yosry Ahmed <yosryahmed@google.com> wrote:
>
> On Thu, Aug 10, 2023 at 4:44 PM Yu Zhao <yuzhao@google.com> wrote:
> > [...]
> > That can cause another warning.
> >
> > Please don't overengineer.
>
> How so?
>
> Using zswap with zram is a weird combination

Not at all -- it can achieve tiering between different compressors:
fast but low compression ratio for zswap but the opposite for zram.

> if anything I would
> prefer leaving the warning as-is than removing it to be honest.
On Thu, Aug 10, 2023 at 8:12 PM Yu Zhao <yuzhao@google.com> wrote:
>
> On Thu, Aug 10, 2023 at 5:46 PM Yosry Ahmed <yosryahmed@google.com> wrote:
> > [...]
> > How so?
> >
> > Using zswap with zram is a weird combination
>
> Not at all -- it can achieve tiering between different compressors:
> fast but low compression ratio for zswap but the opposite for zram.

That's definitely an interesting use case, thanks for pointing this out.

I would prefer creating a helper and using it in both do_swap_page()
and zswap_load() in the WARN_ON (with data_race()), but I am not against
just removing the WARN_ON either. I will leave it up to you and Yin :)

>
> > if anything I would
> > prefer leaving the warning as-is than removing it to be honest.
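
[Editorial sketch, not part of the thread] The helper idea discussed in
this thread can be illustrated in plain userspace C. Everything below is
a stand-in: the mock_* types, the MOCK_* flag values, and the function
names are hypothetical and are not kernel definitions. The sketch also
shows why the condition as quoted in the thread would likely not behave
as intended: "!flags && FLAG" parses as "(!flags) && FLAG", which is true
only when the flags word is zero, whereas testing a flag bit needs a
bitwise "&". In the kernel, the flags read would presumably be wrapped
in data_race() as the thread suggests; that is omitted here.

```c
#include <stdbool.h>

/* Hypothetical stand-in flag bits; the real values live in the kernel headers. */
#define MOCK_SWP_USED           (1UL << 0)
#define MOCK_SWP_SYNCHRONOUS_IO (1UL << 12)

/* Minimal stand-in for a swap device descriptor. */
struct mock_swap_info {
    unsigned long flags;
};

/* Minimal stand-in for a folio; only tracks swapcache membership. */
struct mock_folio {
    bool in_swapcache;
};

/* Hypothetical helper: true when the device has the synchronous-I/O bit set. */
bool mock_swap_is_synchronous(const struct mock_swap_info *si)
{
    return (si->flags & MOCK_SWP_SYNCHRONOUS_IO) != 0;
}

/*
 * Condition shaped like the one quoted in the thread. Because it uses
 * logical operators on the flags word, it reduces to
 * "not in swapcache AND flags == 0".
 */
bool warn_as_quoted(const struct mock_folio *folio,
                    const struct mock_swap_info *si)
{
    unsigned long flag = MOCK_SWP_SYNCHRONOUS_IO;

    return !folio->in_swapcache && !si->flags && flag;
}

/*
 * Presumably intended condition: warn only when the folio is outside the
 * swapcache and the device is not synchronous.
 */
bool warn_with_helper(const struct mock_folio *folio,
                      const struct mock_swap_info *si)
{
    return !folio->in_swapcache && !mock_swap_is_synchronous(si);
}
```

With a zram-like device (synchronous bit set) and a folio that bypassed
the swapcache, warn_with_helper() stays quiet, which is the case reported
in this thread. With a device that lacks the bit, it fires, while
warn_as_quoted() stays silent whenever any flag bit at all is set.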
On 8/11/2023 11:21 AM, Yosry Ahmed wrote:
> On Thu, Aug 10, 2023 at 8:12 PM Yu Zhao <yuzhao@google.com> wrote:
>>
>> On Thu, Aug 10, 2023 at 5:46 PM Yosry Ahmed <yosryahmed@google.com> wrote:
>>>
>>> On Thu, Aug 10, 2023 at 4:44 PM Yu Zhao <yuzhao@google.com> wrote:
>>>>
>>>> On Thu, Aug 10, 2023 at 5:31 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
>>>>>
>>>>>
>>>>>
>>>>> On 8/11/2023 7:15 AM, Yosry Ahmed wrote:
>>>>>> On Thu, Aug 10, 2023 at 4:09 PM Yin, Fengwei <fengwei.yin@intel.com> wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On 8/11/2023 2:44 AM, Yu Zhao wrote:
>>>>>>>> On Thu, Aug 10, 2023 at 3:58 AM Yin Fengwei <fengwei.yin@intel.com> wrote:
>>>>>>>>>
>>>>>>>>> With mm-unstable branch, if trigger swap activity and it's possible
>>>>>>>>> see following warning:
>>>>>>>>> [ 178.093511][ T651] WARNING: CPU: 2 PID: 651 at mm/zswap.c:1387 zswap_load+0x67/0x570
>>>>>>>>> [ 178.095155][ T651] Modules linked in:
>>>>>>>>> [ 178.096103][ T651] CPU: 2 PID: 651 Comm: gmain Not tainted 6.5.0-rc4-00492-gad3232df3e41 #148
>>>>>>>>> [ 178.098372][ T651] Hardware name: QEMU Standard PC (i440FX + PIIX,1996), BIOS 1.14.0-2 04/01/2014
>>>>>>>>> [ 178.101114][ T651] RIP: 0010:zswap_load+0x67/0x570
>>>>>>>>> [ 178.102359][ T651] Code: a0 78 4b 85 e8 ea db ff ff 48 8b 00 a8 01 0f 84 84 04 00 00 48 89 df e8 d7 db ff ff 48 8b 00 a9 00 00 08 00 0f 85 c4
>>>>>>>>> [ 178.106376][ T651] RSP: 0018:ffffc900011b3760 EFLAGS: 00010246
>>>>>>>>> [ 178.107675][ T651] RAX: 0017ffffc0080001 RBX: ffffea0004a991c0 RCX:ffffc900011b37dc
>>>>>>>>> [ 178.109242][ T651] RDX: 0000000000000000 RSI: 0000000000000001 RDI:ffffea0004a991c0
>>>>>>>>> [ 178.110916][ T651] RBP: ffffea0004a991c0 R08: 0000000000000243 R09:00000000c9a1aafc
>>>>>>>>> [ 178.112377][ T651] R10: 00000000c9657db3 R11: 000000003c9657db R12:0000000000014b9c
>>>>>>>>> [ 178.113698][ T651] R13: ffff88813501e710 R14: ffff88810d591000 R15:0000000000000000
>>>>>>>>> [ 178.115008][ T651] FS: 00007fb21a9ff700(0000) GS:ffff88813bc80000(0000) knlGS:0000000000000000
>>>>>>>>> [ 178.116423][ T651] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>>>>>> [ 178.117421][ T651] CR2: 00005632cbfc81f6 CR3: 0000000131450002 CR4:0000000000370ee0
>>>>>>>>> [ 178.118683][ T651] DR0: 0000000000000000 DR1: 0000000000000000 DR2:0000000000000000
>>>>>>>>> [ 178.119894][ T651] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:0000000000000400
>>>>>>>>> [ 178.121087][ T651] Call Trace:
>>>>>>>>> [ 178.121654][ T651] <TASK>
>>>>>>>>> [ 178.122109][ T651] ? zswap_load+0x67/0x570
>>>>>>>>> [ 178.122658][ T651] ? __warn+0x81/0x170
>>>>>>>>> [ 178.123119][ T651] ? zswap_load+0x67/0x570
>>>>>>>>> [ 178.123608][ T651] ? report_bug+0x167/0x190
>>>>>>>>> [ 178.124150][ T651] ? handle_bug+0x3c/0x70
>>>>>>>>> [ 178.124615][ T651] ? exc_invalid_op+0x13/0x60
>>>>>>>>> [ 178.125192][ T651] ? asm_exc_invalid_op+0x16/0x20
>>>>>>>>> [ 178.125753][ T651] ? zswap_load+0x67/0x570
>>>>>>>>> [ 178.126231][ T651] ? lock_acquire+0xbb/0x290
>>>>>>>>> [ 178.126745][ T651] ? folio_add_lru+0x40/0x1c0
>>>>>>>>> [ 178.127261][ T651] ? find_held_lock+0x2b/0x80
>>>>>>>>> [ 178.127776][ T651] swap_readpage+0xc7/0x5c0
>>>>>>>>> [ 178.128273][ T651] do_swap_page+0x86d/0xf50
>>>>>>>>> [ 178.128770][ T651] ? __pte_offset_map+0x3e/0x290
>>>>>>>>> [ 178.129321][ T651] ? __pte_offset_map+0x1c4/0x290
>>>>>>>>> [ 178.129883][ T651] __handle_mm_fault+0x6ad/0xca0
>>>>>>>>> [ 178.130419][ T651] handle_mm_fault+0x18b/0x410
>>>>>>>>> [ 178.130992][ T651] do_user_addr_fault+0x1f1/0x820
>>>>>>>>> [ 178.132076][ T651] exc_page_fault+0x63/0x1a0
>>>>>>>>> [ 178.132599][ T651] asm_exc_page_fault+0x22/0x30
>>>>>>>>>
>>>>>>>>> It's possible that swap_readpage() is called with none swapcache folio
>>>>>>>>> in do_swap_page() and trigger this warning. So we shouldn't assume
>>>>>>>>> zswap_load() always takes swapcache folio.
>>>>>>>>
>>>>>>>> Did you use a bdev with QUEUE_FLAG_SYNCHRONOUS? Otherwise it sounds
>>>>>>>> like a bug to me.
>>>>>>> I hit this warning with zram which has QUEUE_FLAG_SYNCHRONOUS set. Thanks.
>>>>>>
>>>>>> Does it make sense to keep the warning and instead change it to check
>>>>>> SWP_SYNCHRONOUS_IO as well? Something like:
>>>>>>
>>>>>> VM_WARN_ON_ONCE(!folio_test_swapcache(folio) &&
>>>>>>     !swap_type_to_swap_info(type)->flags && SWP_SYNCHRONOUS_IO);
>>>>>>
>>>>>> Of course this is too ugly, so perhaps we want a helper to check if a
>>>>>> swapfile is synchronous.
>>>>> My understanding was that the WARN here is zswap_load() doesn't expect
>>>>> a folio not in swapcache. With zram, swap_readpage() must accept the
>>>>> folio not in swapcache. So this warn should not be there.
>>>>>
>>>>> But your comment make more sense to me. I will update the patch not
>>>>> to remove this WARN. Thanks.
>>>>
>>>> That can cause another warning.
>>>>
>>>> Please don't overegineer.
>>>
>>> How so?
>>>
>>> Using zswap with zram is a weird combination
>>
>> Not at all -- it can achieve tiering between different compressors:
>> fast but low compression ratio for zswap but the opposite for zram.
>
> That's definitely an interesting use case, thanks for pointing this out.
>
> I would prefer creating a helper and using it in both do_swap_fault()
> and zswap_load() in the WARN_ON (with data_race()), but I am not
> against just removing the WARN_ON either. I will leave it up to you
> and Yin :)
OK. I will stick to the current patch.

Regards
Yin, Fengwei

>
>>
>>> if anything I would
>>> prefer leaving the warning as-is than removing it to be honest.
Hi Yin,

On Fri, Aug 11, 2023 at 01:21:21PM +0800, Yin, Fengwei wrote:
> OK. I will stick to the current patch.

I think remove that warning is fine. Feel free to add:

Reviewed-by: Chris Li (Google) <chrisl@kernel.org>

Chris
diff --git a/mm/zswap.c b/mm/zswap.c
index 1e17f11a7896..7300b98d4a03 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1384,7 +1384,6 @@ bool zswap_load(struct folio *folio)
 	bool ret;
 
 	VM_WARN_ON_ONCE(!folio_test_locked(folio));
-	VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
 
 	/* find */
 	spin_lock(&tree->lock);
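
For readers following the discussion: the alternative raised in the thread (keep the warning, but only fire it when the swap device is not synchronous) could be sketched roughly as below. This is a userspace illustration only, not kernel code and not part of the patch. Every `mock_`/`MOCK_` name is hypothetical, the flag value is a placeholder, and the flag is tested with a bitwise `&`, which is an assumption about what the suggested check intended. The sketch also says nothing about whether such a check would be race-free in the kernel, which was one of the concerns raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder bit for illustration; an assumption, not read from kernel headers. */
#define MOCK_SWP_SYNCHRONOUS_IO (1UL << 12)

/* Userspace stand-in for the per-swapfile state that carries the flags. */
struct mock_swap_info {
	unsigned long flags;
};

/* Hypothetical helper: does this swap device complete I/O synchronously? */
static bool mock_swap_is_synchronous(const struct mock_swap_info *si)
{
	assert(si != NULL);
	return (si->flags & MOCK_SWP_SYNCHRONOUS_IO) != 0;
}

/*
 * Condition the narrowed warning would test: a folio outside the swapcache
 * is only unexpected when the backing device is not synchronous (zram is
 * synchronous, so it would not trigger the warning).
 */
static bool mock_should_warn(bool in_swapcache, const struct mock_swap_info *si)
{
	return !in_swapcache && !mock_swap_is_synchronous(si);
}
```

With a zram-like device (flag set), `mock_should_warn(false, &si)` is false; with the flag clear and the folio outside the swapcache it is true. The posted patch takes the simpler route and drops the check entirely.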