From patchwork Tue Nov 8 19:32:03 2022
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
    sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH v3 1/5] zswap: fix writeback lock ordering for zsmalloc
Date: Tue, 8 Nov 2022 11:32:03 -0800
Message-Id: <20221108193207.3297327-2-nphamcs@gmail.com>
In-Reply-To: <20221108193207.3297327-1-nphamcs@gmail.com>
References: <20221108193207.3297327-1-nphamcs@gmail.com>

From: Johannes Weiner

zswap's customary lock order is tree->lock before pool->lock, because
the tree->lock protects the entries' refcount, and the free callbacks
in the backends acquire their respective pool locks to dispatch the
backing object.

zsmalloc's map callback takes the pool lock, so zswap must not grab
the tree->lock while a handle is mapped. This currently only happens
during writeback, which isn't implemented for zsmalloc. In preparation
for it, move the tree->lock section out of the mapped entry section.
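To illustrate the ordering rule this patch enforces, here is a minimal
userspace sketch, with pthread mutexes standing in for the kernel
locks; every type and helper below is an invented stand-in, not the
zswap or zpool API:

/*
 * Safe pattern after this patch: copy what we need out of the mapped
 * object, unmap it (dropping the backend pool lock), and only then
 * take the tree lock. Compile with -pthread.
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t tree_lock = PTHREAD_MUTEX_INITIALIZER;
/* stands in for the backend pool lock held while a handle is mapped */
static pthread_mutex_t pool_lock = PTHREAD_MUTEX_INITIALIZER;

struct swp_entry { unsigned long val; };

/* emulates zpool_map_handle(): mapping pins the pool lock */
static struct swp_entry map_handle(void)
{
	pthread_mutex_lock(&pool_lock);
	return (struct swp_entry){ .val = 42 };
}

/* emulates zpool_unmap_handle() */
static void unmap_handle(void)
{
	pthread_mutex_unlock(&pool_lock);
}

int main(void)
{
	struct swp_entry swpentry = map_handle();

	unmap_handle();	/* unmap first... */

	pthread_mutex_lock(&tree_lock);	/* ...then take the tree lock */
	printf("look up zswap entry for offset %lu\n", swpentry.val);
	pthread_mutex_unlock(&tree_lock);
	return 0;
}

Taking tree_lock between map_handle() and unmap_handle() would invert
the tree->lock before pool->lock order described above and risk
deadlock.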
Signed-off-by: Johannes Weiner
Signed-off-by: Nhat Pham
---
 mm/zswap.c | 37 +++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)

--
2.30.2

diff --git a/mm/zswap.c b/mm/zswap.c
index 2d48fd59cc7a..2d69c1d678fe 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -958,7 +958,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	};
 
 	if (!zpool_can_sleep_mapped(pool)) {
-		tmp = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		tmp = kmalloc(PAGE_SIZE, GFP_KERNEL);
 		if (!tmp)
 			return -ENOMEM;
 	}
@@ -968,6 +968,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	swpentry = zhdr->swpentry; /* here */
 	tree = zswap_trees[swp_type(swpentry)];
 	offset = swp_offset(swpentry);
+	zpool_unmap_handle(pool, handle);
 
 	/* find and ref zswap entry */
 	spin_lock(&tree->lock);
@@ -975,20 +976,12 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	if (!entry) {
 		/* entry was invalidated */
 		spin_unlock(&tree->lock);
-		zpool_unmap_handle(pool, handle);
 		kfree(tmp);
 		return 0;
 	}
 	spin_unlock(&tree->lock);
 	BUG_ON(offset != entry->offset);
 
-	src = (u8 *)zhdr + sizeof(struct zswap_header);
-	if (!zpool_can_sleep_mapped(pool)) {
-		memcpy(tmp, src, entry->length);
-		src = tmp;
-		zpool_unmap_handle(pool, handle);
-	}
-
 	/* try to allocate swap cache page */
 	switch (zswap_get_swap_cache_page(swpentry, &page)) {
 	case ZSWAP_SWAPCACHE_FAIL: /* no memory or invalidate happened */
@@ -1006,6 +999,14 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 		acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
 		dlen = PAGE_SIZE;
 
+		zhdr = zpool_map_handle(pool, handle, ZPOOL_MM_RO);
+		src = (u8 *)zhdr + sizeof(struct zswap_header);
+		if (!zpool_can_sleep_mapped(pool)) {
+			memcpy(tmp, src, entry->length);
+			src = tmp;
+			zpool_unmap_handle(pool, handle);
+		}
+
 		mutex_lock(acomp_ctx->mutex);
 		sg_init_one(&input, src, entry->length);
 		sg_init_table(&output, 1);
@@ -1015,6 +1016,11 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 		dlen = acomp_ctx->req->dlen;
 		mutex_unlock(acomp_ctx->mutex);
 
+		if (!zpool_can_sleep_mapped(pool))
+			kfree(tmp);
+		else
+			zpool_unmap_handle(pool, handle);
+
 		BUG_ON(ret);
 		BUG_ON(dlen != PAGE_SIZE);
 
@@ -1045,7 +1051,11 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	zswap_entry_put(tree, entry);
 	spin_unlock(&tree->lock);
 
-	goto end;
+	return ret;
+
+fail:
+	if (!zpool_can_sleep_mapped(pool))
+		kfree(tmp);
 
 	/*
 	 * if we get here due to ZSWAP_SWAPCACHE_EXIST
@@ -1054,17 +1064,10 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	 * if we free the entry in the following put
 	 * it is also okay to return !0
 	 */
-fail:
 	spin_lock(&tree->lock);
 	zswap_entry_put(tree, entry);
 	spin_unlock(&tree->lock);
 
-end:
-	if (zpool_can_sleep_mapped(pool))
-		zpool_unmap_handle(pool, handle);
-	else
-		kfree(tmp);
-
 	return ret;
 }

From patchwork Tue Nov 8 19:32:04 2022
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
    sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH v3 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and size_class's locks
Date: Tue, 8 Nov 2022 11:32:04 -0800
Message-Id: <20221108193207.3297327-3-nphamcs@gmail.com>
In-Reply-To: <20221108193207.3297327-1-nphamcs@gmail.com>
References: <20221108193207.3297327-1-nphamcs@gmail.com>

Currently, zsmalloc has a hierarchy of locks, which includes a
pool-level migrate_lock and a lock for each size class. We have to
obtain both locks in the hotpath in most cases anyway, except for
zs_malloc. This exception will no longer exist when we introduce an
LRU into the zs_pool for the new writeback functionality - we will
need to obtain a pool-level lock to synchronize LRU handling even in
zs_malloc.

In preparation for zsmalloc writeback, consolidate these locks into a
single pool-level lock, which drastically reduces the complexity of
synchronization in zsmalloc.

We have also benchmarked the lock consolidation to see the performance
effect of this change on zram.

First, we ran a synthetic FS workload on a server machine with 36
cores (same machine for all runs), using

fs_mark -d ../zram1mnt -s 100000 -n 2500 -t 32 -k

before and after for btrfs and ext4 on zram (FS usage is 80%).

Here is the result (unit is file/second):

With lock consolidation (btrfs):
Average: 13520.2, Median: 13531.0, Stddev: 137.5961482019028

Without lock consolidation (btrfs):
Average: 13487.2, Median: 13575.0, Stddev: 309.08283679298665

With lock consolidation (ext4):
Average: 16824.4, Median: 16839.0, Stddev: 89.97388510006668

Without lock consolidation (ext4):
Average: 16958.0, Median: 16986.0, Stddev: 194.7370021336469

As you can see, we observe a 0.3% regression for btrfs, and a 0.9%
regression for ext4. This is a small, barely measurable difference in
my opinion.
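For reference, summary triples like the ones above can be derived from
the five per-run results in the usual way. The sketch below shows one
such computation; the run values are invented for demonstration (the
raw per-run numbers are not part of this message), and it assumes the
sample standard deviation over n - 1:

/* Compile with: cc stats.c -lm */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
	double d = *(const double *)a - *(const double *)b;

	return (d > 0) - (d < 0);
}

int main(void)
{
	/* five hypothetical fs_mark results (files/second) */
	double runs[] = { 13531.0, 13640.0, 13369.0, 13508.0, 13553.0 };
	size_t n = sizeof(runs) / sizeof(runs[0]);
	double sum = 0.0, var = 0.0, avg, stddev;
	size_t i;

	for (i = 0; i < n; i++)
		sum += runs[i];
	avg = sum / n;

	for (i = 0; i < n; i++)
		var += (runs[i] - avg) * (runs[i] - avg);
	stddev = sqrt(var / (n - 1));	/* sample stddev */

	qsort(runs, n, sizeof(runs[0]), cmp_double);
	printf("Average: %.1f, Median: %.1f, Stddev: %.10f\n",
	       avg, runs[n / 2], stddev);
	return 0;
}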
For a more realistic scenario, we also tried building the kernel on
zram. Here is the time it takes (in seconds):

With lock consolidation (btrfs):
real
Average: 319.6, Median: 320.0, Stddev: 0.8944271909999159
user
Average: 6894.2, Median: 6895.0, Stddev: 25.528415540334656
sys
Average: 521.4, Median: 522.0, Stddev: 1.51657508881031

Without lock consolidation (btrfs):
real
Average: 319.8, Median: 320.0, Stddev: 0.8366600265340756
user
Average: 6896.6, Median: 6899.0, Stddev: 16.04057355583023
sys
Average: 520.6, Median: 521.0, Stddev: 1.140175425099138

With lock consolidation (ext4):
real
Average: 320.0, Median: 319.0, Stddev: 1.4142135623730951
user
Average: 6896.8, Median: 6878.0, Stddev: 28.621670111997307
sys
Average: 521.2, Median: 521.0, Stddev: 1.7888543819998317

Without lock consolidation (ext4):
real
Average: 319.6, Median: 319.0, Stddev: 0.8944271909999159
user
Average: 6886.2, Median: 6887.0, Stddev: 16.93221781102523
sys
Average: 520.4, Median: 520.0, Stddev: 1.140175425099138

The difference is entirely within the noise of a typical run on zram.
This hardly justifies the complexity of maintaining both the pool lock
and the class lock. In fact, for writeback, we would need to introduce
yet another lock to prevent data races on the pool's LRU, further
complicating the lock handling logic. IMHO, it is just better to
collapse all of these into a single pool-level lock.
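To make the shape of the change concrete, below is a small userspace
sketch of the hot-path locking before and after the consolidation;
pthread primitives stand in for the kernel ones and all names are
illustrative. Compile with -pthread.

#include <pthread.h>

struct size_class_old { pthread_mutex_t lock; };

struct pool_old {
	pthread_rwlock_t migrate_lock;	/* pool-level, before this patch */
	struct size_class_old class;	/* one lock per size class */
};

struct pool_new {
	pthread_mutex_t lock;	/* the single pool-level lock */
};

/* old zs_free(): nested acquisition in the hot path */
static void free_obj_old(struct pool_old *p)
{
	pthread_rwlock_rdlock(&p->migrate_lock);
	pthread_mutex_lock(&p->class.lock);
	pthread_rwlock_unlock(&p->migrate_lock);
	/* ... free the object under class.lock ... */
	pthread_mutex_unlock(&p->class.lock);
}

/* new zs_free(): one pool-level critical section */
static void free_obj_new(struct pool_new *p)
{
	pthread_mutex_lock(&p->lock);
	/* ... free the object ... */
	pthread_mutex_unlock(&p->lock);
}

int main(void)
{
	struct pool_old po = {
		.migrate_lock = PTHREAD_RWLOCK_INITIALIZER,
		.class = { .lock = PTHREAD_MUTEX_INITIALIZER },
	};
	struct pool_new pn = { .lock = PTHREAD_MUTEX_INITIALIZER };

	free_obj_old(&po);
	free_obj_new(&pn);
	return 0;
}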
Suggested-by: Johannes Weiner
Signed-off-by: Nhat Pham
---
 mm/zsmalloc.c | 87 ++++++++++++++++++++++-----------------------
 1 file changed, 37 insertions(+), 50 deletions(-)

--
2.30.2

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index d03941cace2c..326faa751f0a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -33,8 +33,7 @@
 /*
  * lock ordering:
  *	page_lock
- *	pool->migrate_lock
- *	class->lock
+ *	pool->lock
  *	zspage->lock
  */
 
@@ -192,7 +191,6 @@ static const int fullness_threshold_frac = 4;
 static size_t huge_class_size;
 
 struct size_class {
-	spinlock_t lock;
 	struct list_head fullness_list[NR_ZS_FULLNESS];
 	/*
 	 * Size of objects stored in this class. Must be multiple
@@ -247,8 +245,7 @@ struct zs_pool {
 #ifdef CONFIG_COMPACTION
 	struct work_struct free_work;
 #endif
-	/* protect page/zspage migration */
-	rwlock_t migrate_lock;
+	spinlock_t lock;
 };
 
 struct zspage {
@@ -355,7 +352,7 @@ static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
 	kmem_cache_free(pool->zspage_cachep, zspage);
 }
 
-/* class->lock(which owns the handle) synchronizes races */
+/* pool->lock(which owns the handle) synchronizes races */
 static void record_obj(unsigned long handle, unsigned long obj)
 {
 	*(unsigned long *)handle = obj;
@@ -452,7 +449,7 @@ static __maybe_unused int is_first_page(struct page *page)
 	return PagePrivate(page);
 }
 
-/* Protected by class->lock */
+/* Protected by pool->lock */
 static inline int get_zspage_inuse(struct zspage *zspage)
 {
 	return zspage->inuse;
@@ -597,13 +594,13 @@ static int zs_stats_size_show(struct seq_file *s, void *v)
 		if (class->index != i)
 			continue;
 
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		class_almost_full = zs_stat_get(class, CLASS_ALMOST_FULL);
 		class_almost_empty = zs_stat_get(class, CLASS_ALMOST_EMPTY);
 		obj_allocated = zs_stat_get(class, OBJ_ALLOCATED);
 		obj_used = zs_stat_get(class, OBJ_USED);
 		freeable = zs_can_compact(class);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 
 		objs_per_zspage = class->objs_per_zspage;
 		pages_used = obj_allocated / objs_per_zspage *
@@ -916,7 +913,7 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
 
 	get_zspage_mapping(zspage, &class_idx, &fg);
 
-	assert_spin_locked(&class->lock);
+	assert_spin_locked(&pool->lock);
 
 	VM_BUG_ON(get_zspage_inuse(zspage));
 	VM_BUG_ON(fg != ZS_EMPTY);
@@ -1247,19 +1244,19 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
 	BUG_ON(in_interrupt());
 
 	/* It guarantees it can get zspage from handle safely */
-	read_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	obj = handle_to_obj(handle);
 	obj_to_location(obj, &page, &obj_idx);
 	zspage = get_zspage(page);
 
 	/*
-	 * migration cannot move any zpages in this zspage. Here, class->lock
+	 * migration cannot move any zpages in this zspage. Here, pool->lock
 	 * is too heavy since callers would take some time until they calls
 	 * zs_unmap_object API so delegate the locking from class to zspage
 	 * which is smaller granularity.
 	 */
 	migrate_read_lock(zspage);
-	read_unlock(&pool->migrate_lock);
+	spin_unlock(&pool->lock);
 
 	class = zspage_class(pool, zspage);
 	off = (class->size * obj_idx) & ~PAGE_MASK;
@@ -1412,8 +1409,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 	size += ZS_HANDLE_SIZE;
 	class = pool->size_class[get_size_class_index(size)];
 
-	/* class->lock effectively protects the zpage migration */
-	spin_lock(&class->lock);
+	/* pool->lock effectively protects the zpage migration */
+	spin_lock(&pool->lock);
 	zspage = find_get_zspage(class);
 	if (likely(zspage)) {
 		obj = obj_malloc(pool, zspage, handle);
@@ -1421,12 +1418,12 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 		fix_fullness_group(class, zspage);
 		record_obj(handle, obj);
 		class_stat_inc(class, OBJ_USED, 1);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 
 		return handle;
 	}
 
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 
 	zspage = alloc_zspage(pool, class, gfp);
 	if (!zspage) {
@@ -1434,7 +1431,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 		return (unsigned long)ERR_PTR(-ENOMEM);
 	}
 
-	spin_lock(&class->lock);
+	spin_lock(&pool->lock);
 	obj = obj_malloc(pool, zspage, handle);
 	newfg = get_fullness_group(class, zspage);
 	insert_zspage(class, zspage, newfg);
@@ -1447,7 +1444,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 
 	/* We completely set up zspage so mark them as movable */
 	SetZsPageMovable(pool, zspage);
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 
 	return handle;
 }
@@ -1491,16 +1488,14 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 		return;
 
 	/*
-	 * The pool->migrate_lock protects the race with zpage's migration
+	 * The pool->lock protects the race with zpage's migration
 	 * so it's safe to get the page from handle.
 	 */
-	read_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	obj = handle_to_obj(handle);
 	obj_to_page(obj, &f_page);
 	zspage = get_zspage(f_page);
 	class = zspage_class(pool, zspage);
-	spin_lock(&class->lock);
-	read_unlock(&pool->migrate_lock);
 
 	obj_free(class->size, obj);
 	class_stat_dec(class, OBJ_USED, 1);
@@ -1510,7 +1505,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 
 	free_zspage(pool, class, zspage);
 out:
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 	cache_free_handle(pool, handle);
 }
 EXPORT_SYMBOL_GPL(zs_free);
@@ -1867,16 +1862,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	pool = zspage->pool;
 
 	/*
-	 * The pool migrate_lock protects the race between zpage migration
+	 * The pool's lock protects the race between zpage migration
 	 * and zs_free.
 	 */
-	write_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	class = zspage_class(pool, zspage);
 
-	/*
-	 * the class lock protects zpage alloc/free in the zspage.
-	 */
-	spin_lock(&class->lock);
 	/* the migrate_write_lock protects zpage access via zs_map_object */
 	migrate_write_lock(zspage);
 
@@ -1906,10 +1897,9 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	replace_sub_page(class, zspage, newpage, page);
 	/*
 	 * Since we complete the data copy and set up new zspage structure,
-	 * it's okay to release migration_lock.
+	 * it's okay to release the pool's lock.
 	 */
-	write_unlock(&pool->migrate_lock);
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 	dec_zspage_isolation(zspage);
 	migrate_write_unlock(zspage);
 
@@ -1964,9 +1954,9 @@ static void async_free_zspage(struct work_struct *work)
 		if (class->index != i)
 			continue;
 
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		list_splice_init(&class->fullness_list[ZS_EMPTY], &free_pages);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 	}
 
 	list_for_each_entry_safe(zspage, tmp, &free_pages, list) {
@@ -1976,9 +1966,9 @@ static void async_free_zspage(struct work_struct *work)
 		get_zspage_mapping(zspage, &class_idx, &fullness);
 		VM_BUG_ON(fullness != ZS_EMPTY);
 		class = pool->size_class[class_idx];
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		__free_zspage(pool, class, zspage);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 	}
 };
 
@@ -2039,10 +2029,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 	struct zspage *dst_zspage = NULL;
 	unsigned long pages_freed = 0;
 
-	/* protect the race between zpage migration and zs_free */
-	write_lock(&pool->migrate_lock);
-	/* protect zpage allocation/free */
-	spin_lock(&class->lock);
+	/*
+	 * protect the race between zpage migration and zs_free
+	 * as well as zpage allocation/free
+	 */
+	spin_lock(&pool->lock);
 	while ((src_zspage = isolate_zspage(class, true))) {
 		/* protect someone accessing the zspage(i.e., zs_map_object) */
 		migrate_write_lock(src_zspage);
@@ -2067,7 +2058,7 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 			putback_zspage(class, dst_zspage);
 			migrate_write_unlock(dst_zspage);
 			dst_zspage = NULL;
-			if (rwlock_is_contended(&pool->migrate_lock))
+			if (spin_is_contended(&pool->lock))
 				break;
 		}
 
@@ -2084,11 +2075,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 			pages_freed += class->pages_per_zspage;
 		} else
 			migrate_write_unlock(src_zspage);
-		spin_unlock(&class->lock);
-		write_unlock(&pool->migrate_lock);
+		spin_unlock(&pool->lock);
 		cond_resched();
-		write_lock(&pool->migrate_lock);
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 	}
 
 	if (src_zspage) {
@@ -2096,8 +2085,7 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		migrate_write_unlock(src_zspage);
 	}
 
-	spin_unlock(&class->lock);
-	write_unlock(&pool->migrate_lock);
+	spin_unlock(&pool->lock);
 
 	return pages_freed;
 }
@@ -2200,7 +2188,7 @@ struct zs_pool *zs_create_pool(const char *name)
 		return NULL;
 
 	init_deferred_free(pool);
-	rwlock_init(&pool->migrate_lock);
+	spin_lock_init(&pool->lock);
 
 	pool->name = kstrdup(name, GFP_KERNEL);
 	if (!pool->name)
@@ -2271,7 +2259,6 @@ struct zs_pool *zs_create_pool(const char *name)
 		class->index = i;
 		class->pages_per_zspage = pages_per_zspage;
 		class->objs_per_zspage = objs_per_zspage;
-		spin_lock_init(&class->lock);
 		pool->size_class[i] = class;
 		for (fullness = ZS_EMPTY; fullness < NR_ZS_FULLNESS;
 		     fullness++)

From patchwork Tue Nov 8 19:32:05 2022
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
    sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH v3 3/5] zsmalloc: Add a LRU to zs_pool to keep track of zspages in LRU order
Date: Tue, 8 Nov 2022 11:32:05 -0800
Message-Id: <20221108193207.3297327-4-nphamcs@gmail.com>
In-Reply-To: <20221108193207.3297327-1-nphamcs@gmail.com>
References: <20221108193207.3297327-1-nphamcs@gmail.com>

This helps determine the coldest zspages as candidates for writeback.
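The idiom being added is the classic intrusive-list move-to-front.
Below is a self-contained userspace sketch with a tiny stand-in for
the kernel's <linux/list.h>, mirroring the move_to_front() helper
introduced by the diff that follows:

#include <stddef.h>
#include <stdio.h>

struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }

static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_del(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	list_init(e);
}

static void list_add(struct list_head *e, struct list_head *h)
{
	e->next = h->next;
	e->prev = h;
	h->next->prev = e;
	h->next = e;
}

struct zspage { int id; struct list_head lru; };

/* every allocation moves the zspage to the head of the pool's LRU */
static void move_to_front(struct list_head *pool_lru, struct zspage *z)
{
	if (!list_empty(&z->lru))
		list_del(&z->lru);
	list_add(&z->lru, pool_lru);
}

int main(void)
{
	struct list_head pool_lru;
	struct zspage a = { .id = 1 }, b = { .id = 2 };
	struct zspage *coldest;

	list_init(&pool_lru);
	list_init(&a.lru);
	list_init(&b.lru);

	move_to_front(&pool_lru, &a);	/* LRU: a */
	move_to_front(&pool_lru, &b);	/* LRU: b, a */
	move_to_front(&pool_lru, &a);	/* LRU: a, b */

	/* container_of by hand: the tail entry is the eviction victim */
	coldest = (struct zspage *)((char *)pool_lru.prev -
				    offsetof(struct zspage, lru));
	printf("coldest zspage id: %d\n", coldest->id);	/* prints 2 */
	return 0;
}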
Signed-off-by: Nhat Pham
---
 mm/zsmalloc.c | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

--
2.30.2

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 326faa751f0a..600c40121544 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -239,6 +239,9 @@ struct zs_pool {
 	/* Compact classes */
 	struct shrinker shrinker;
 
+	/* List tracking the zspages in LRU order by most recently added object */
+	struct list_head lru;
+
 #ifdef CONFIG_ZSMALLOC_STAT
 	struct dentry *stat_dentry;
 #endif
@@ -260,6 +263,10 @@ struct zspage {
 	unsigned int freeobj;
 	struct page *first_page;
 	struct list_head list; /* fullness list */
+
+	/* links the zspage to the lru list in the pool */
+	struct list_head lru;
+
 	struct zs_pool *pool;
 #ifdef CONFIG_COMPACTION
 	rwlock_t lock;
@@ -352,6 +359,16 @@ static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
 	kmem_cache_free(pool->zspage_cachep, zspage);
 }
 
+/* Moves the zspage to the front of the zspool's LRU */
+static void move_to_front(struct zs_pool *pool, struct zspage *zspage)
+{
+	assert_spin_locked(&pool->lock);
+
+	if (!list_empty(&zspage->lru))
+		list_del(&zspage->lru);
+	list_add(&zspage->lru, &pool->lru);
+}
+
 /* pool->lock(which owns the handle) synchronizes races */
 static void record_obj(unsigned long handle, unsigned long obj)
 {
@@ -953,6 +970,7 @@ static void free_zspage(struct zs_pool *pool, struct size_class *class,
 	}
 
 	remove_zspage(class, zspage, ZS_EMPTY);
+	list_del(&zspage->lru);
 	__free_zspage(pool, class, zspage);
 }
 
@@ -998,6 +1016,8 @@ static void init_zspage(struct size_class *class, struct zspage *zspage)
 		off %= PAGE_SIZE;
 	}
 
+	INIT_LIST_HEAD(&zspage->lru);
+
 	set_freeobj(zspage, 0);
 }
 
@@ -1418,6 +1438,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 		fix_fullness_group(class, zspage);
 		record_obj(handle, obj);
 		class_stat_inc(class, OBJ_USED, 1);
+		/* Move the zspage to front of pool's LRU */
+		move_to_front(pool, zspage);
 		spin_unlock(&pool->lock);
 
 		return handle;
@@ -1444,6 +1466,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 
 	/* We completely set up zspage so mark them as movable */
 	SetZsPageMovable(pool, zspage);
+	/* Move the zspage to front of pool's LRU */
+	move_to_front(pool, zspage);
 	spin_unlock(&pool->lock);
 
 	return handle;
@@ -1967,6 +1991,7 @@ static void async_free_zspage(struct work_struct *work)
 		VM_BUG_ON(fullness != ZS_EMPTY);
 		class = pool->size_class[class_idx];
 		spin_lock(&pool->lock);
+		list_del(&zspage->lru);
 		__free_zspage(pool, class, zspage);
 		spin_unlock(&pool->lock);
 	}
@@ -2278,6 +2303,8 @@ struct zs_pool *zs_create_pool(const char *name)
 	 */
 	zs_register_shrinker(pool);
 
+	INIT_LIST_HEAD(&pool->lru);
+
 	return pool;
 
 err:

From patchwork Tue Nov 8 19:32:06 2022
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
    sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH v3 4/5] zsmalloc: Add ops fields to zs_pool to store evict handlers
Date: Tue, 8 Nov 2022 11:32:06 -0800
Message-Id: <20221108193207.3297327-5-nphamcs@gmail.com>
In-Reply-To: <20221108193207.3297327-1-nphamcs@gmail.com>
References: <20221108193207.3297327-1-nphamcs@gmail.com>

This adds fields to zs_pool to store evict handlers for writeback,
analogous to those of the zbud allocator.
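The wiring being added is a simple callback-forwarding pattern. The
userspace sketch below shows its shape with invented stand-in types
(not the real zpool structures): the pool remembers the zpool handle
and its ops, and a pool-side evict hook forwards to the upper layer
when one is registered.

#include <errno.h>
#include <stdio.h>

struct zpool;		/* opaque, as seen from the allocator */
struct demo_pool;

struct zpool_ops {
	int (*evict)(struct zpool *zpool, unsigned long handle);
};

struct demo_ops {
	int (*evict)(struct demo_pool *pool, unsigned long handle);
};

struct demo_pool {
	const struct demo_ops *ops;
	struct zpool *zpool;
	const struct zpool_ops *zpool_ops;
};

/* analogous to zs_zpool_evict(): forward to the zpool layer */
static int demo_pool_evict(struct demo_pool *pool, unsigned long handle)
{
	if (pool->zpool_ops && pool->zpool_ops->evict)
		return pool->zpool_ops->evict(pool->zpool, handle);
	return -ENOENT;
}

static const struct demo_ops demo_pool_ops = {
	.evict = demo_pool_evict,
};

/* what a user (e.g. zswap) would hand in as its writeback callback */
static int user_evict(struct zpool *zpool, unsigned long handle)
{
	(void)zpool;
	printf("write back object behind handle %#lx\n", handle);
	return 0;
}

int main(void)
{
	static const struct zpool_ops user_ops = { .evict = user_evict };
	struct demo_pool pool = {
		.ops = &demo_pool_ops,
		.zpool = NULL,
		.zpool_ops = &user_ops,
	};

	return pool.ops->evict(&pool, 0x1234);
}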
Signed-off-by: Nhat Pham
---
 mm/zsmalloc.c | 38 +++++++++++++++++++++++++++++++++++++-
 1 file changed, 37 insertions(+), 1 deletion(-)

--
2.30.2

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 600c40121544..ac86cffa62cd 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -225,6 +225,12 @@ struct link_free {
 	};
 };
 
+struct zs_pool;
+
+struct zs_ops {
+	int (*evict)(struct zs_pool *pool, unsigned long handle);
+};
+
 struct zs_pool {
 	const char *name;
 
@@ -242,6 +248,12 @@ struct zs_pool {
 	/* List tracking the zspages in LRU order by most recently added object */
 	struct list_head lru;
 
+#ifdef CONFIG_ZPOOL
+	const struct zs_ops *ops;
+	struct zpool *zpool;
+	const struct zpool_ops *zpool_ops;
+#endif
+
 #ifdef CONFIG_ZSMALLOC_STAT
 	struct dentry *stat_dentry;
 #endif
@@ -379,6 +391,18 @@ static void record_obj(unsigned long handle, unsigned long obj)
 
 #ifdef CONFIG_ZPOOL
 
+static int zs_zpool_evict(struct zs_pool *pool, unsigned long handle)
+{
+	if (pool->zpool && pool->zpool_ops && pool->zpool_ops->evict)
+		return pool->zpool_ops->evict(pool->zpool, handle);
+	else
+		return -ENOENT;
+}
+
+static const struct zs_ops zs_zpool_ops = {
+	.evict = zs_zpool_evict
+};
+
 static void *zs_zpool_create(const char *name, gfp_t gfp,
 			     const struct zpool_ops *zpool_ops,
 			     struct zpool *zpool)
@@ -388,7 +412,19 @@ static void *zs_zpool_create(const char *name, gfp_t gfp,
 	 * different contexts and its caller must provide a valid
 	 * gfp mask.
 	 */
-	return zs_create_pool(name);
+	struct zs_pool *pool = zs_create_pool(name);
+
+	if (pool) {
+		pool->zpool = zpool;
+		pool->zpool_ops = zpool_ops;
+
+		if (zpool_ops)
+			pool->ops = &zs_zpool_ops;
+		else
+			pool->ops = NULL;
+	}
+
+	return pool;
 }
 
 static void zs_zpool_destroy(void *pool)

From patchwork Tue Nov 8 19:32:07 2022
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
    sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH v3 5/5] zsmalloc: Implement writeback mechanism for zsmalloc
Date: Tue, 8 Nov 2022 11:32:07 -0800
Message-Id: <20221108193207.3297327-6-nphamcs@gmail.com>
In-Reply-To: <20221108193207.3297327-1-nphamcs@gmail.com>
References: <20221108193207.3297327-1-nphamcs@gmail.com>

This commit adds the writeback mechanism for zsmalloc, analogous to
the zbud allocator. Zsmalloc will attempt to determine the coldest
zspage (i.e., the least recently used) in the pool and write back all
of its stored compressed objects via the pool's evict handler.
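The zpool-facing entry point added below follows a simple loop:
reclaim up to `pages` zspages, one LRU-tail victim at a time, stopping
at the first failure. Here is a userspace sketch of that loop shape;
reclaim_one() is an invented stand-in for zs_reclaim_page():

#include <errno.h>
#include <stdio.h>

static int lru_len = 3;	/* pretend the LRU holds three reclaimable zspages */

static int reclaim_one(unsigned int retries)
{
	(void)retries;	/* how many LRU-tail victims to try; 8 in the patch */
	return lru_len-- > 0 ? 0 : -EINVAL;
}

/* mirrors the shape of zs_zpool_shrink() in the diff below */
static int pool_shrink(unsigned int pages, unsigned int *reclaimed)
{
	unsigned int total = 0;
	int ret = -EINVAL;

	while (total < pages) {
		ret = reclaim_one(8);
		if (ret < 0)
			break;
		total++;
	}

	if (reclaimed)
		*reclaimed = total;
	return ret;
}

int main(void)
{
	unsigned int done;
	int ret = pool_shrink(5, &done);

	printf("reclaimed %u zspage(s), last status %d\n", done, ret);
	return 0;
}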
Signed-off-by: Nhat Pham
---
 mm/zsmalloc.c | 200 ++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 185 insertions(+), 15 deletions(-)

--
2.30.2

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index ac86cffa62cd..3868ad3cd038 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -279,10 +279,13 @@ struct zspage {
 	/* links the zspage to the lru list in the pool */
 	struct list_head lru;
 
+	bool under_reclaim;
+
+	/* list of unfreed handles whose objects have been reclaimed */
+	unsigned long *deferred_handles;
+
 	struct zs_pool *pool;
-#ifdef CONFIG_COMPACTION
 	rwlock_t lock;
-#endif
 };
 
 struct mapping_area {
@@ -303,10 +306,11 @@ static bool ZsHugePage(struct zspage *zspage)
 	return zspage->huge;
 }
 
-#ifdef CONFIG_COMPACTION
 static void migrate_lock_init(struct zspage *zspage);
 static void migrate_read_lock(struct zspage *zspage);
 static void migrate_read_unlock(struct zspage *zspage);
+
+#ifdef CONFIG_COMPACTION
 static void migrate_write_lock(struct zspage *zspage);
 static void migrate_write_lock_nested(struct zspage *zspage);
 static void migrate_write_unlock(struct zspage *zspage);
@@ -314,9 +318,6 @@ static void kick_deferred_free(struct zs_pool *pool);
 static void init_deferred_free(struct zs_pool *pool);
 static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage);
 #else
-static void migrate_lock_init(struct zspage *zspage) {}
-static void migrate_read_lock(struct zspage *zspage) {}
-static void migrate_read_unlock(struct zspage *zspage) {}
 static void migrate_write_lock(struct zspage *zspage) {}
 static void migrate_write_lock_nested(struct zspage *zspage) {}
 static void migrate_write_unlock(struct zspage *zspage) {}
@@ -446,6 +447,27 @@ static void zs_zpool_free(void *pool, unsigned long handle)
 	zs_free(pool, handle);
 }
 
+static int zs_reclaim_page(struct zs_pool *pool, unsigned int retries);
+
+static int zs_zpool_shrink(void *pool, unsigned int pages,
+			unsigned int *reclaimed)
+{
+	unsigned int total = 0;
+	int ret = -EINVAL;
+
+	while (total < pages) {
+		ret = zs_reclaim_page(pool, 8);
+		if (ret < 0)
+			break;
+		total++;
+	}
+
+	if (reclaimed)
+		*reclaimed = total;
+
+	return ret;
+}
+
 static void *zs_zpool_map(void *pool, unsigned long handle,
 			enum zpool_mapmode mm)
 {
@@ -484,6 +506,7 @@ static struct zpool_driver zs_zpool_driver = {
 	.malloc_support_movable = true,
 	.malloc = zs_zpool_malloc,
 	.free = zs_zpool_free,
+	.shrink = zs_zpool_shrink,
 	.map = zs_zpool_map,
 	.unmap = zs_zpool_unmap,
 	.total_size = zs_zpool_total_size,
@@ -957,6 +980,21 @@ static int trylock_zspage(struct zspage *zspage)
 	return 0;
 }
 
+/*
+ * Free all the deferred handles whose objects are freed in zs_free.
+ */
+static void free_handles(struct zs_pool *pool, struct zspage *zspage)
+{
+	unsigned long handle = (unsigned long)zspage->deferred_handles;
+
+	while (handle) {
+		unsigned long nxt_handle = handle_to_obj(handle);
+
+		cache_free_handle(pool, handle);
+		handle = nxt_handle;
+	}
+}
+
 static void __free_zspage(struct zs_pool *pool, struct size_class *class,
 				struct zspage *zspage)
 {
@@ -971,6 +1009,9 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
 	VM_BUG_ON(get_zspage_inuse(zspage));
 	VM_BUG_ON(fg != ZS_EMPTY);
 
+	/* Free all deferred handles from zs_free */
+	free_handles(pool, zspage);
+
 	next = page = get_first_page(zspage);
 	do {
 		VM_BUG_ON_PAGE(!PageLocked(page), page);
@@ -1053,6 +1094,8 @@ static void init_zspage(struct size_class *class, struct zspage *zspage)
 	}
 
 	INIT_LIST_HEAD(&zspage->lru);
+	zspage->under_reclaim = false;
+	zspage->deferred_handles = NULL;
 
 	set_freeobj(zspage, 0);
 }
@@ -1474,11 +1517,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 		fix_fullness_group(class, zspage);
 		record_obj(handle, obj);
 		class_stat_inc(class, OBJ_USED, 1);
-		/* Move the zspage to front of pool's LRU */
-		move_to_front(pool, zspage);
-		spin_unlock(&pool->lock);
 
-		return handle;
+		goto out;
 	}
 
 	spin_unlock(&pool->lock);
@@ -1502,6 +1542,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 
 	/* We completely set up zspage so mark them as movable */
 	SetZsPageMovable(pool, zspage);
+
+out:
 	/* Move the zspage to front of pool's LRU */
 	move_to_front(pool, zspage);
 	spin_unlock(&pool->lock);
@@ -1559,12 +1601,24 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 
 	obj_free(class->size, obj);
 	class_stat_dec(class, OBJ_USED, 1);
+
+	if (zspage->under_reclaim) {
+		/*
+		 * Reclaim needs the handles during writeback. It'll free
+		 * them along with the zspage when it's done with them.
+		 *
+		 * Record current deferred handle at the memory location
+		 * whose address is given by handle.
+		 */
+		record_obj(handle, (unsigned long)zspage->deferred_handles);
+		zspage->deferred_handles = (unsigned long *)handle;
+		spin_unlock(&pool->lock);
+		return;
+	}
 	fullness = fix_fullness_group(class, zspage);
-	if (fullness != ZS_EMPTY)
-		goto out;
+	if (fullness == ZS_EMPTY)
+		free_zspage(pool, class, zspage);
 
-	free_zspage(pool, class, zspage);
-out:
 	spin_unlock(&pool->lock);
 	cache_free_handle(pool, handle);
 }
@@ -1764,7 +1818,7 @@ static enum fullness_group putback_zspage(struct size_class *class,
 	return fullness;
 }
 
-#ifdef CONFIG_COMPACTION
+#if defined(CONFIG_ZPOOL) || defined(CONFIG_COMPACTION)
 /*
  * To prevent zspage destroy during migration, zspage freeing should
  * hold locks of all pages in the zspage.
@@ -1806,6 +1860,24 @@ static void lock_zspage(struct zspage *zspage)
 	}
 	migrate_read_unlock(zspage);
 }
+#endif /* defined(CONFIG_ZPOOL) || defined(CONFIG_COMPACTION) */
+
+#ifdef CONFIG_ZPOOL
+/*
+ * Unlocks all the pages of the zspage.
+ *
+ * pool->lock must be held before this function is called
+ * to prevent the underlying pages from migrating.
+ */
+static void unlock_zspage(struct zspage *zspage)
+{
+	struct page *page = get_first_page(zspage);
+
+	do {
+		unlock_page(page);
+	} while ((page = get_next_page(page)) != NULL);
+}
+#endif /* CONFIG_ZPOOL */
 
 static void migrate_lock_init(struct zspage *zspage)
 {
@@ -1822,6 +1894,7 @@ static void migrate_read_unlock(struct zspage *zspage) __releases(&zspage->lock)
 	read_unlock(&zspage->lock);
 }
 
+#ifdef CONFIG_COMPACTION
 static void migrate_write_lock(struct zspage *zspage)
 {
 	write_lock(&zspage->lock);
@@ -2382,6 +2455,103 @@ void zs_destroy_pool(struct zs_pool *pool)
 }
 EXPORT_SYMBOL_GPL(zs_destroy_pool);
 
+#ifdef CONFIG_ZPOOL
+static int zs_reclaim_page(struct zs_pool *pool, unsigned int retries)
+{
+	int i, obj_idx, ret = 0;
+	unsigned long handle;
+	struct zspage *zspage;
+	struct page *page;
+	enum fullness_group fullness;
+
+	if (retries == 0 || !pool->ops || !pool->ops->evict)
+		return -EINVAL;
+
+	/* Lock LRU and fullness list */
+	spin_lock(&pool->lock);
+	if (list_empty(&pool->lru)) {
+		spin_unlock(&pool->lock);
+		return -EINVAL;
+	}
+
+	for (i = 0; i < retries; i++) {
+		struct size_class *class;
+
+		zspage = list_last_entry(&pool->lru, struct zspage, lru);
+		list_del(&zspage->lru);
+
+		/* zs_free may free objects, but not the zspage and handles */
+		zspage->under_reclaim = true;
+
+		class = zspage_class(pool, zspage);
+		fullness = get_fullness_group(class, zspage);
+
+		/* Lock out object allocations and object compaction */
+		remove_zspage(class, zspage, fullness);
+
+		spin_unlock(&pool->lock);
+
+		/* Lock backing pages into place */
+		lock_zspage(zspage);
+
+		obj_idx = 0;
+		page = zspage->first_page;
+		while (1) {
+			handle = find_alloced_obj(class, page, &obj_idx);
+			if (!handle) {
+				page = get_next_page(page);
+				if (!page)
+					break;
+				obj_idx = 0;
+				continue;
+			}
+
+			/*
+			 * This will write the object and call
+			 * zs_free.
+			 *
+			 * zs_free will free the object, but the
+			 * under_reclaim flag prevents it from freeing
+			 * the zspage altogether. This is necessary so
+			 * that we can continue working with the
+			 * zspage potentially after the last object
+			 * has been freed.
+			 */
+			ret = pool->ops->evict(pool, handle);
+			if (ret)
+				goto next;
+
+			obj_idx++;
+		}
+
+next:
+		/* For freeing the zspage, or putting it back in the pool and LRU list. */
+		spin_lock(&pool->lock);
+		zspage->under_reclaim = false;
+
+		if (!get_zspage_inuse(zspage)) {
+			/*
+			 * Fullness went stale as zs_free() won't touch it
+			 * while the page is removed from the pool. Fix it
+			 * up for the check in __free_zspage().
+			 */
+			zspage->fullness = ZS_EMPTY;
+
+			__free_zspage(pool, class, zspage);
+			spin_unlock(&pool->lock);
+			return 0;
+		}
+
+		putback_zspage(class, zspage);
+		list_add(&zspage->lru, &pool->lru);
+		unlock_zspage(zspage);
+	}
+
+	spin_unlock(&pool->lock);
+	return -EAGAIN;
+}
+#endif /* CONFIG_ZPOOL */
+
 static int __init zs_init(void)
 {
 	int ret;