Message ID | 20240209142901.126894-1-da.gomez@samsung.com |
---|---|
Headers | From: Daniel Gomez <da.gomez@samsung.com>; To: viro@zeniv.linux.org.uk, brauner@kernel.org, jack@suse.cz, hughd@google.com, akpm@linux-foundation.org; Cc: dagmcr@gmail.com, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, willy@infradead.org, hch@infradead.org, mcgrof@kernel.org, Pankaj Raghav <p.raghav@samsung.com>, gost.dev@samsung.com; Subject: [RFC PATCH 0/9] shmem: fix llseek in hugepages; Date: Fri, 9 Feb 2024 14:29:01 +0000; Message-ID: <20240209142901.126894-1-da.gomez@samsung.com> |
Series | shmem: fix llseek in hugepages |
Message
Daniel Gomez
Feb. 9, 2024, 2:29 p.m. UTC
Hi,

The following series fixes the generic/285 and generic/436 fstests for huge
pages (huge=always). These are tests for llseek (SEEK_HOLE and SEEK_DATA).

The implementation to fix the above tests is based on iomap per-block tracking
for uptodate and dirty states, but applied to the shmem uptodate flag.

The motivation is to avoid any regressions in tmpfs once it gets support for
large folios.

Testing with kdevops
Testing has been performed using fstests with kdevops for the v6.8-rc2 tag.
There are currently different profiles supported [1] and, for each of these,
a baseline of 20 loops has been performed with the following failures for
hugepages profiles: generic/080, generic/126, generic/193, generic/245,
generic/285, generic/436, generic/551, generic/619 and generic/732.

If anyone is interested, please find all of the failures in the expunges
directory:
https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/6.8.0-rc2/tmpfs/unassigned

[1] tmpfs profiles supported in kdevops: default, tmpfs_noswap_huge_never,
tmpfs_noswap_huge_always, tmpfs_noswap_huge_within_size,
tmpfs_noswap_huge_advise, tmpfs_huge_always, tmpfs_huge_within_size and
tmpfs_huge_advise.

More information:
https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/6.8.0-rc2/tmpfs/unassigned

All the patches have been tested on top of v6.8-rc2 and rebased onto the latest
next tag available (next-20240209).

Daniel

Daniel Gomez (8):
  shmem: add per-block uptodate tracking for hugepages
  shmem: move folio zero operation to write_begin()
  shmem: exit shmem_get_folio_gfp() if block is uptodate
  shmem: clear_highpage() if block is not uptodate
  shmem: set folio uptodate when reclaim
  shmem: check if a block is uptodate before splice into pipe
  shmem: clear uptodate blocks after PUNCH_HOLE
  shmem: enable per-block uptodate

Pankaj Raghav (1):
  splice: don't check for uptodate if partially uptodate is impl

 fs/splice.c |  17 ++-
 mm/shmem.c  | 340 ++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 332 insertions(+), 25 deletions(-)

--
2.43.0
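The per-block uptodate tracking the cover letter refers to follows the pattern iomap uses for sub-folio state. The sketch below is only a rough illustration of that pattern applied to a large shmem folio: the structure and helper names (shmem_folio_state, sfs_*) and the choice of PAGE_SIZE blocks are assumptions for this example, not code taken from the series.

```c
/*
 * Illustrative sketch only: per-block uptodate state for a large shmem
 * folio, modeled on iomap's iomap_folio_state. Assumes it lives in
 * mm/shmem.c context; all names here are hypothetical.
 */
struct shmem_folio_state {
	spinlock_t	state_lock;
	unsigned long	state[];	/* one uptodate bit per PAGE_SIZE block */
};

static struct shmem_folio_state *sfs_alloc(struct folio *folio, gfp_t gfp)
{
	unsigned int nr_blocks = folio_size(folio) >> PAGE_SHIFT;
	struct shmem_folio_state *sfs;

	sfs = kzalloc(struct_size(sfs, state, BITS_TO_LONGS(nr_blocks)), gfp);
	if (!sfs)
		return NULL;
	spin_lock_init(&sfs->state_lock);
	folio_attach_private(folio, sfs);
	return sfs;
}

/*
 * Mark the blocks covering [off, off + len) uptodate; once every block
 * in the folio is uptodate, promote the whole folio to uptodate.
 */
static void sfs_set_range_uptodate(struct folio *folio,
				   struct shmem_folio_state *sfs,
				   size_t off, size_t len)
{
	unsigned int first = off >> PAGE_SHIFT;
	unsigned int last = (off + len - 1) >> PAGE_SHIFT;
	unsigned int nr_blocks = folio_size(folio) >> PAGE_SHIFT;
	unsigned long flags;

	spin_lock_irqsave(&sfs->state_lock, flags);
	bitmap_set(sfs->state, first, last - first + 1);
	if (bitmap_full(sfs->state, nr_blocks))
		folio_mark_uptodate(folio);
	spin_unlock_irqrestore(&sfs->state_lock, flags);
}
```

With state like this, SEEK_HOLE/SEEK_DATA can report boundaries at block rather than folio granularity, which is what generic/285 and generic/436 expect.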
Comments
On Fri, Feb 09, 2024 at 02:29:01PM +0000, Daniel Gomez wrote:
> Hi,
>
> The following series fixes the generic/285 and generic/436 fstests for huge
> pages (huge=always). These are tests for llseek (SEEK_HOLE and SEEK_DATA).
>
> The implementation to fix above tests is based on iomap per-block tracking for
> uptodate and dirty states but applied to shmem uptodate flag.

Hi Hugh, Andrew,

Could you kindly provide feedback on these patches/fixes? I'd appreciate your
input on whether we're headed in the right direction, or maybe not.

Thanks,
Daniel
On Wed, 14 Feb 2024, Daniel Gomez wrote:
> On Fri, Feb 09, 2024 at 02:29:01PM +0000, Daniel Gomez wrote:
> > The following series fixes the generic/285 and generic/436 fstests for huge
> > pages (huge=always). These are tests for llseek (SEEK_HOLE and SEEK_DATA).
> >
> > The implementation to fix above tests is based on iomap per-block tracking for
> > uptodate and dirty states but applied to shmem uptodate flag.
>
> Hi Hugh, Andrew,
>
> Could you kindly provide feedback on these patches/fixes? I'd appreciate your
> input on whether we're headed in the right direction, or maybe not.

I am sorry, Daniel, but I see this series as misdirected effort.

We do not want to add overhead to tmpfs and the kernel, just to pass two
tests which were (very reasonably) written for fixed block size, before
the huge page possibility ever came in.

If one opts for transparent huge pages in the filesystem, then of course
the dividing line between hole and data becomes more elastic than before.

It would be a serious bug if lseek ever reported an area of non-0 data as
in a hole; but I don't think that is what generic/285 or generic/436 find.

Beyond that, "man 2 lseek" is very forgiving of filesystem implementation.

I'll send you my stack of xfstests patches (which, as usual, I cannot
afford the time now to re-review and post): there are several tweaks to
seek_sanity_test in there for tmpfs huge pages, along with other fixes
for tmpfs (and some fixes to suit an old 32-bit build environment).

With those tweaks, generic/285 and generic/436 and others (but not all)
have been passing on huge tmpfs for several years. If you see something
you'd like to add your name to in that stack, or can improve upon, please
go ahead and post to the fstests list (Cc me).

Thanks,
Hugh
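The elasticity Hugh describes is observable from userspace with a plain lseek() probe. The snippet below is only a hedged illustration, not part of xfstests or the series, and the mount path and sizes are placeholders: with 4KiB granularity the hole is reported right after the written block, whereas with a fully-uptodate 2MiB huge page the boundary moves out to the huge-page edge.

```c
/* Minimal SEEK_DATA/SEEK_HOLE probe; path and sizes are illustrative. */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/mnt/tmpfs/probe", O_RDWR | O_CREAT | O_TRUNC, 0644);
	char buf[4096] = { 1 };	/* 4KiB of non-zero data */

	if (fd < 0)
		return 1;
	/* 16MiB sparse file with 4KiB of data at offset 0. */
	if (ftruncate(fd, 16UL << 20) || pwrite(fd, buf, sizeof(buf), 0) < 0)
		return 1;

	off_t data = lseek(fd, 0, SEEK_DATA);
	off_t hole = lseek(fd, 0, SEEK_HOLE);

	/*
	 * With per-4KiB tracking the hole starts at 4096; with a whole
	 * 2MiB huge page marked uptodate it starts at 2MiB instead.
	 */
	printf("data at %lld, hole at %lld\n", (long long)data, (long long)hole);
	close(fd);
	return 0;
}
```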
On Mon, Feb 19, 2024 at 02:15:47AM -0800, Hugh Dickins wrote:
> On Wed, 14 Feb 2024, Daniel Gomez wrote:
> > Hi Hugh, Andrew,
> >
> > Could you kindly provide feedback on these patches/fixes? I'd appreciate your
> > input on whether we're headed in the right direction, or maybe not.
>
> I am sorry, Daniel, but I see this series as misdirected effort.
>
> We do not want to add overhead to tmpfs and the kernel, just to pass two
> tests which were (very reasonably) written for fixed block size, before
> the huge page possibility ever came in.

Is this overhead a concern in performance? Can you clarify what you mean?

I guess it is a matter of which kind of granularity we want for a filesystem.
Then, we can either adapt the test to work with different block sizes or
change the filesystem to support this fixed and minimum block size. I believe
the tests should remain unchanged if we still want to operate at this fixed
block size, regardless of how the memory is managed on the filesystem side
(whether it is a huge page or a large folio with arbitrary order).

> If one opts for transparent huge pages in the filesystem, then of course
> the dividing line between hole and data becomes more elastic than before.

I'm uncertain when we may want to be more elastic. In the case of XFS with
iomap and support for large folios, for instance, we are 'less' elastic than
here. So, what exactly is the rationale behind wanting shmem to be 'more
elastic'?

If we ever move shmem to large folios [1], and we use them in an opportunistic
way, then we are going to be more elastic in the default path.

[1] https://lore.kernel.org/all/20230919135536.2165715-1-da.gomez@samsung.com

In addition, I think that having this block granularity can benefit quota
support and the reclaim path. For example, in the generic/100 fstest, around
~26M of data are reported as 1G of used disk when using tmpfs with huge pages.

> It would be a serious bug if lseek ever reported an area of non-0 data as
> in a hole; but I don't think that is what generic/285 or generic/436 find.

I agree this is not the case here. We mark the entire folio (huge page) as
uptodate, hence we report that full area as data, making steps of 2M.

> Beyond that, "man 2 lseek" is very forgiving of filesystem implementation.

Thanks for bringing that up. This got me thinking along the same lines as
before, wanting to understand where we want to draw the line and the reasons
behind it.

> I'll send you my stack of xfstests patches (which, as usual, I cannot
> afford the time now to re-review and post): there are several tweaks to
> seek_sanity_test in there for tmpfs huge pages, along with other fixes
> for tmpfs (and some fixes to suit an old 32-bit build environment).
>
> With those tweaks, generic/285 and generic/436 and others (but not all)
> have been passing on huge tmpfs for several years. If you see something
> you'd like to add your name to in that stack, or can improve upon, please
> go ahead and post to the fstests list (Cc me).

Thanks for the patches, Hugh. I see how you are making the seeking tests a bit
more 'elastic'. I will post them shortly and see if we can minimize the number
of failures [2]. In kdevops [3], we are discussing the possibility of adding
tmpfs to 0-day and tracking any regressions.

[2] https://github.com/linux-kdevops/kdevops/tree/master/workflows/fstests/expunges/6.8.0-rc2/tmpfs/unassigned
[3] https://github.com/linux-kdevops/kdevops
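The generic/100 observation above (roughly 26M of data accounted as 1G of used disk) comes down to the difference between a file's apparent size and the space actually attached to its inode. A minimal check, assuming a placeholder path, could look like this:

```c
/*
 * Illustrative check (not taken from generic/100): st_size counts bytes
 * written, while st_blocks counts 512-byte units attached to the inode.
 * On tmpfs with huge pages, the allocated figure can be much larger.
 */
#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
	struct stat st;

	if (stat("/mnt/tmpfs/file", &st))	/* placeholder path */
		return 1;
	printf("apparent size: %lld bytes, allocated: %lld bytes\n",
	       (long long)st.st_size, (long long)st.st_blocks * 512);
	return 0;
}
```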
On Tue 20-02-24 10:26:48, Daniel Gomez wrote:
> On Mon, Feb 19, 2024 at 02:15:47AM -0800, Hugh Dickins wrote:
> I'm uncertain when we may want to be more elastic. In the case of XFS with iomap
> and support for large folios, for instance, we are 'less' elastic than here. So,
> what exactly is the rationale behind wanting shmem to be 'more elastic'?

Well, but if you allocated space in larger chunks - as is the case with
ext4 and bigalloc feature, you will be similarly 'elastic' as tmpfs with
large folio support... So simply the granularity of allocation of
underlying space is what matters here. And for tmpfs the underlying space
happens to be the page cache.

> If we ever move shmem to large folios [1], and we use them in an opportunistic way,
> then we are going to be more elastic in the default path.
>
> [1] https://lore.kernel.org/all/20230919135536.2165715-1-da.gomez@samsung.com
>
> In addition, I think that having this block granularity can benefit quota
> support and the reclaim path. For example, in the generic/100 fstest, around
> ~26M of data are reported as 1G of used disk when using tmpfs with huge pages.

And I'd argue this is a desirable thing. If 1G worth of pages is attached
to the inode, then quota should be accounting 1G usage even though you've
written just 26MB of data to the file. Quota is about constraining used
resources, not about "how much did I write to the file".

								Honza
On Tue, Feb 20, 2024 at 01:39:05PM +0100, Jan Kara wrote:
> On Tue 20-02-24 10:26:48, Daniel Gomez wrote:
> > I'm uncertain when we may want to be more elastic. In the case of XFS with iomap
> > and support for large folios, for instance, we are 'less' elastic than here. So,
> > what exactly is the rationale behind wanting shmem to be 'more elastic'?
>
> Well, but if you allocated space in larger chunks - as is the case with
> ext4 and bigalloc feature, you will be similarly 'elastic' as tmpfs with
> large folio support... So simply the granularity of allocation of
> underlying space is what matters here. And for tmpfs the underlying space
> happens to be the page cache.

But it seems like the underlying space 'behaves' differently when we talk about
large folios and huge pages. Is that correct? And this is reflected in the fstat
st_blksize. The first one is always based on the host base page size, regardless
of the order we get. The second one is always based on the host huge page size
configured (at the moment I've tested 2MiB and 1GiB for x86-64, and 2MiB, 512MiB
and 16GiB for ARM64).

If that is the case, I'd agree this is not needed for huge pages but only when
we adopt large folios. Otherwise, we won't have a way to determine the
step/granularity for seeking data/holes, as it could be anything from order-0 to
order-9. Note: order-1 support is currently in the LBS v1 thread here [1].

Regarding large folios adoption, we have the following implementations [2] being
sent to the mailing list. Would it make sense then, to have this block tracking
for the large folios case? Notice that my last attempt includes a partial
implementation of the block tracking discussed here.

[1] https://lore.kernel.org/all/20240226094936.2677493-2-kernel@pankajraghav.com/

[2] shmem: high order folios support in write path
v1: https://lore.kernel.org/all/20230915095042.1320180-1-da.gomez@samsung.com/
v2: https://lore.kernel.org/all/20230919135536.2165715-1-da.gomez@samsung.com/
v3 (RFC): https://lore.kernel.org/all/20231028211518.3424020-1-da.gomez@samsung.com/

> > If we ever move shmem to large folios [1], and we use them in an opportunistic way,
> > then we are going to be more elastic in the default path.
> >
> > [1] https://lore.kernel.org/all/20230919135536.2165715-1-da.gomez@samsung.com
> >
> > In addition, I think that having this block granularity can benefit quota
> > support and the reclaim path. For example, in the generic/100 fstest, around
> > ~26M of data are reported as 1G of used disk when using tmpfs with huge pages.
>
> And I'd argue this is a desirable thing. If 1G worth of pages is attached
> to the inode, then quota should be accounting 1G usage even though you've
> written just 26MB of data to the file. Quota is about constraining used
> resources, not about "how much did I write to the file".

But these are two separate values. I get that the system wants to track how many
pages are attached to the inode, so is there a way to report (in addition) the
actual use of these pages being consumed?

> 								Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
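For reference, the st_blksize hint discussed above can be read with a trivial fstat() call; seek tests typically treat it as the hole/data granularity. This is only a sketch with a placeholder mount path, reflecting the values reported earlier in this thread (the base page size for regular tmpfs, the huge page size for huge=always):

```c
/* Quick st_blksize probe; the tmpfs path is a placeholder. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
	struct stat st;
	int fd = open("/mnt/tmpfs/probe", O_RDWR | O_CREAT, 0644);

	if (fd < 0 || fstat(fd, &st))
		return 1;
	/* e.g. 4096 with base pages, 2097152 with huge=always on x86-64 */
	printf("st_blksize = %ld\n", (long)st.st_blksize);
	close(fd);
	return 0;
}
```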
On Tue, Feb 27, 2024 at 11:42:01AM +0000, Daniel Gomez wrote:
> On Tue, Feb 20, 2024 at 01:39:05PM +0100, Jan Kara wrote:
> > Well, but if you allocated space in larger chunks - as is the case with
> > ext4 and bigalloc feature, you will be similarly 'elastic' as tmpfs with
> > large folio support... So simply the granularity of allocation of
> > underlying space is what matters here. And for tmpfs the underlying space
> > happens to be the page cache.
>
> But it seems like the underlying space 'behaves' differently when we talk about
> large folios and huge pages. Is that correct? And this is reflected in the fstat
> st_blksize. The first one is always based on the host base page size, regardless
> of the order we get. The second one is always based on the host huge page size
> configured (at the moment I've tested 2MiB and 1GiB for x86-64, and 2MiB, 512MiB
> and 16GiB for ARM64).

Apologies, I was mixing the values available in HugeTLB and those supported in
THP (pmd-size only). Thus, it is 2MiB for x86-64, and 2MiB, 32MiB and 512MiB
for ARM64 with 4k, 16k and 64k base page size, respectively.