From patchwork Wed Oct 26 20:06:09 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 11434
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
 sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH 1/5] zswap: fix writeback lock ordering for zsmalloc
Date: Wed, 26 Oct 2022 13:06:09 -0700
Message-Id: <20221026200613.1031261-2-nphamcs@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221026200613.1031261-1-nphamcs@gmail.com>
References: <20221026200613.1031261-1-nphamcs@gmail.com>

From: Johannes Weiner

zswap's customary lock order is tree->lock before pool->lock, because
the tree->lock protects the entries' refcount, and the free callbacks in
the backends acquire their respective pool locks to dispatch the backing
object. zsmalloc's map callback takes the pool lock, so zswap must not
grab the tree->lock while a handle is mapped. This currently only
happens during writeback, which isn't implemented for zsmalloc.
In preparation for it, move the tree->lock section out of the mapped
entry section.

Signed-off-by: Johannes Weiner
Signed-off-by: Nhat Pham
---
 mm/zswap.c | 37 ++++++++++++++++++++-----------------
 1 file changed, 20 insertions(+), 17 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 2d48fd59cc7a..2d69c1d678fe 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -958,7 +958,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	};
 
 	if (!zpool_can_sleep_mapped(pool)) {
-		tmp = kmalloc(PAGE_SIZE, GFP_ATOMIC);
+		tmp = kmalloc(PAGE_SIZE, GFP_KERNEL);
 		if (!tmp)
 			return -ENOMEM;
 	}
@@ -968,6 +968,7 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	swpentry = zhdr->swpentry; /* here */
 	tree = zswap_trees[swp_type(swpentry)];
 	offset = swp_offset(swpentry);
+	zpool_unmap_handle(pool, handle);
 
 	/* find and ref zswap entry */
 	spin_lock(&tree->lock);
@@ -975,20 +976,12 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	if (!entry) {
 		/* entry was invalidated */
 		spin_unlock(&tree->lock);
-		zpool_unmap_handle(pool, handle);
 		kfree(tmp);
 		return 0;
 	}
 	spin_unlock(&tree->lock);
 	BUG_ON(offset != entry->offset);
 
-	src = (u8 *)zhdr + sizeof(struct zswap_header);
-	if (!zpool_can_sleep_mapped(pool)) {
-		memcpy(tmp, src, entry->length);
-		src = tmp;
-		zpool_unmap_handle(pool, handle);
-	}
-
 	/* try to allocate swap cache page */
 	switch (zswap_get_swap_cache_page(swpentry, &page)) {
 	case ZSWAP_SWAPCACHE_FAIL: /* no memory or invalidate happened */
@@ -1006,6 +999,14 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 		acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
 		dlen = PAGE_SIZE;
 
+		zhdr = zpool_map_handle(pool, handle, ZPOOL_MM_RO);
+		src = (u8 *)zhdr + sizeof(struct zswap_header);
+		if (!zpool_can_sleep_mapped(pool)) {
+			memcpy(tmp, src, entry->length);
+			src = tmp;
+			zpool_unmap_handle(pool, handle);
+		}
+
 		mutex_lock(acomp_ctx->mutex);
 		sg_init_one(&input, src, entry->length);
 		sg_init_table(&output, 1);
@@ -1015,6 +1016,11 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 		dlen = acomp_ctx->req->dlen;
 		mutex_unlock(acomp_ctx->mutex);
 
+		if (!zpool_can_sleep_mapped(pool))
+			kfree(tmp);
+		else
+			zpool_unmap_handle(pool, handle);
+
 		BUG_ON(ret);
 		BUG_ON(dlen != PAGE_SIZE);
 
@@ -1045,7 +1051,11 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	zswap_entry_put(tree, entry);
 	spin_unlock(&tree->lock);
 
-	goto end;
+	return ret;
+
+fail:
+	if (!zpool_can_sleep_mapped(pool))
+		kfree(tmp);
 
 	/*
 	* if we get here due to ZSWAP_SWAPCACHE_EXIST
@@ -1054,17 +1064,10 @@ static int zswap_writeback_entry(struct zpool *pool, unsigned long handle)
 	* if we free the entry in the following put
 	* it is also okay to return !0
 	*/
-fail:
 	spin_lock(&tree->lock);
 	zswap_entry_put(tree, entry);
 	spin_unlock(&tree->lock);
 
-end:
-	if (zpool_can_sleep_mapped(pool))
-		zpool_unmap_handle(pool, handle);
-	else
-		kfree(tmp);
-
 	return ret;
 }
 

From patchwork Wed Oct 26 20:06:10 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 11437
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
 sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH 2/5] zsmalloc: Consolidate zs_pool's migrate_lock and size_class's locks
Date: Wed, 26 Oct 2022 13:06:10 -0700
Message-Id: <20221026200613.1031261-3-nphamcs@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221026200613.1031261-1-nphamcs@gmail.com>
References: <20221026200613.1031261-1-nphamcs@gmail.com>

Currently, zsmalloc has a hierarchy of locks, which includes a
pool-level migrate_lock, and a lock for each size class. We have to
obtain both locks in the hotpath in most cases anyway, except for
zs_malloc. This exception will no longer exist when we introduce an LRU
into the zs_pool for the new writeback functionality - we will need to
obtain a pool-level lock to synchronize LRU handling even in zs_malloc.

In preparation for zsmalloc writeback, consolidate these locks into a
single pool-level lock, which drastically reduces the complexity of
synchronization in zsmalloc.

Suggested-by: Johannes Weiner
Signed-off-by: Nhat Pham
Acked-by: Johannes Weiner
---
 mm/zsmalloc.c | 87 ++++++++++++++++++++++-----------------------------
 1 file changed, 37 insertions(+), 50 deletions(-)

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index d03941cace2c..326faa751f0a 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -33,8 +33,7 @@
 /*
  * lock ordering:
  *	page_lock
- *	pool->migrate_lock
- *	class->lock
+ *	pool->lock
  *	zspage->lock
  */
 
@@ -192,7 +191,6 @@ static const int fullness_threshold_frac = 4;
 static size_t huge_class_size;
 
 struct size_class {
-	spinlock_t lock;
 	struct list_head fullness_list[NR_ZS_FULLNESS];
 	/*
 	 * Size of objects stored in this class. Must be multiple
@@ -247,8 +245,7 @@ struct zs_pool {
 #ifdef CONFIG_COMPACTION
 	struct work_struct free_work;
 #endif
-	/* protect page/zspage migration */
-	rwlock_t migrate_lock;
+	spinlock_t lock;
 };
 
 struct zspage {
@@ -355,7 +352,7 @@ static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage)
 	kmem_cache_free(pool->zspage_cachep, zspage);
 }
 
-/* class->lock(which owns the handle) synchronizes races */
+/* pool->lock(which owns the handle) synchronizes races */
 static void record_obj(unsigned long handle, unsigned long obj)
 {
 	*(unsigned long *)handle = obj;
@@ -452,7 +449,7 @@ static __maybe_unused int is_first_page(struct page *page)
 	return PagePrivate(page);
 }
 
-/* Protected by class->lock */
+/* Protected by pool->lock */
 static inline int get_zspage_inuse(struct zspage *zspage)
 {
 	return zspage->inuse;
@@ -597,13 +594,13 @@ static int zs_stats_size_show(struct seq_file *s, void *v)
 		if (class->index != i)
 			continue;
 
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		class_almost_full = zs_stat_get(class, CLASS_ALMOST_FULL);
 		class_almost_empty = zs_stat_get(class, CLASS_ALMOST_EMPTY);
 		obj_allocated = zs_stat_get(class, OBJ_ALLOCATED);
 		obj_used = zs_stat_get(class, OBJ_USED);
 		freeable = zs_can_compact(class);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 
 		objs_per_zspage = class->objs_per_zspage;
 		pages_used = obj_allocated / objs_per_zspage *
@@ -916,7 +913,7 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class,
 
 	get_zspage_mapping(zspage, &class_idx, &fg);
 
-	assert_spin_locked(&class->lock);
+	assert_spin_locked(&pool->lock);
 
 	VM_BUG_ON(get_zspage_inuse(zspage));
 	VM_BUG_ON(fg != ZS_EMPTY);
@@ -1247,19 +1244,19 @@ void *zs_map_object(struct zs_pool *pool, unsigned long handle,
 	BUG_ON(in_interrupt());
 
 	/* It guarantees it can get zspage from handle safely */
-	read_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	obj = handle_to_obj(handle);
 	obj_to_location(obj, &page, &obj_idx);
 	zspage = get_zspage(page);
 
 	/*
-	 * migration cannot move any zpages in this zspage. Here, class->lock
+	 * migration cannot move any zpages in this zspage. Here, pool->lock
 	 * is too heavy since callers would take some time until they calls
 	 * zs_unmap_object API so delegate the locking from class to zspage
 	 * which is smaller granularity.
 	 */
 	migrate_read_lock(zspage);
-	read_unlock(&pool->migrate_lock);
+	spin_unlock(&pool->lock);
 
 	class = zspage_class(pool, zspage);
 	off = (class->size * obj_idx) & ~PAGE_MASK;
@@ -1412,8 +1409,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 	size += ZS_HANDLE_SIZE;
 	class = pool->size_class[get_size_class_index(size)];
 
-	/* class->lock effectively protects the zpage migration */
-	spin_lock(&class->lock);
+	/* pool->lock effectively protects the zpage migration */
+	spin_lock(&pool->lock);
 	zspage = find_get_zspage(class);
 	if (likely(zspage)) {
 		obj = obj_malloc(pool, zspage, handle);
@@ -1421,12 +1418,12 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 		fix_fullness_group(class, zspage);
 		record_obj(handle, obj);
 		class_stat_inc(class, OBJ_USED, 1);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 
 		return handle;
 	}
 
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 
 	zspage = alloc_zspage(pool, class, gfp);
 	if (!zspage) {
@@ -1434,7 +1431,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 		return (unsigned long)ERR_PTR(-ENOMEM);
 	}
 
-	spin_lock(&class->lock);
+	spin_lock(&pool->lock);
 	obj = obj_malloc(pool, zspage, handle);
 	newfg = get_fullness_group(class, zspage);
 	insert_zspage(class, zspage, newfg);
@@ -1447,7 +1444,7 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp)
 
 	/* We completely set up zspage so mark them as movable */
 	SetZsPageMovable(pool, zspage);
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 
 	return handle;
 }
@@ -1491,16 +1488,14 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 		return;
 
 	/*
-	 * The pool->migrate_lock protects the race with zpage's migration
+	 * The pool->lock protects the race with zpage's migration
 	 * so it's safe to get the page from handle.
 	 */
-	read_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	obj = handle_to_obj(handle);
 	obj_to_page(obj, &f_page);
 	zspage = get_zspage(f_page);
 	class = zspage_class(pool, zspage);
-	spin_lock(&class->lock);
-	read_unlock(&pool->migrate_lock);
 
 	obj_free(class->size, obj);
 	class_stat_dec(class, OBJ_USED, 1);
@@ -1510,7 +1505,7 @@ void zs_free(struct zs_pool *pool, unsigned long handle)
 
 	free_zspage(pool, class, zspage);
 out:
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 	cache_free_handle(pool, handle);
 }
 EXPORT_SYMBOL_GPL(zs_free);
@@ -1867,16 +1862,12 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	pool = zspage->pool;
 
 	/*
-	 * The pool migrate_lock protects the race between zpage migration
+	 * The pool's lock protects the race between zpage migration
 	 * and zs_free.
 	 */
-	write_lock(&pool->migrate_lock);
+	spin_lock(&pool->lock);
 	class = zspage_class(pool, zspage);
 
-	/*
-	 * the class lock protects zpage alloc/free in the zspage.
-	 */
-	spin_lock(&class->lock);
 	/* the migrate_write_lock protects zpage access via zs_map_object */
 	migrate_write_lock(zspage);
 
@@ -1906,10 +1897,9 @@ static int zs_page_migrate(struct page *newpage, struct page *page,
 	replace_sub_page(class, zspage, newpage, page);
 	/*
 	 * Since we complete the data copy and set up new zspage structure,
-	 * it's okay to release migration_lock.
+	 * it's okay to release the pool's lock.
 	 */
-	write_unlock(&pool->migrate_lock);
-	spin_unlock(&class->lock);
+	spin_unlock(&pool->lock);
 	dec_zspage_isolation(zspage);
 	migrate_write_unlock(zspage);
 
@@ -1964,9 +1954,9 @@ static void async_free_zspage(struct work_struct *work)
 		if (class->index != i)
 			continue;
 
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		list_splice_init(&class->fullness_list[ZS_EMPTY], &free_pages);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 	}
 
 	list_for_each_entry_safe(zspage, tmp, &free_pages, list) {
@@ -1976,9 +1966,9 @@ static void async_free_zspage(struct work_struct *work)
 		get_zspage_mapping(zspage, &class_idx, &fullness);
 		VM_BUG_ON(fullness != ZS_EMPTY);
 		class = pool->size_class[class_idx];
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 		__free_zspage(pool, class, zspage);
-		spin_unlock(&class->lock);
+		spin_unlock(&pool->lock);
 	}
 };
 
@@ -2039,10 +2029,11 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 	struct zspage *dst_zspage = NULL;
 	unsigned long pages_freed = 0;
 
-	/* protect the race between zpage migration and zs_free */
-	write_lock(&pool->migrate_lock);
-	/* protect zpage allocation/free */
-	spin_lock(&class->lock);
+	/*
+	 * protect the race between zpage migration and zs_free
+	 * as well as zpage allocation/free
+	 */
+	spin_lock(&pool->lock);
 	while ((src_zspage = isolate_zspage(class, true))) {
 		/* protect someone accessing the zspage(i.e., zs_map_object) */
 		migrate_write_lock(src_zspage);
@@ -2067,7 +2058,7 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 			putback_zspage(class, dst_zspage);
 			migrate_write_unlock(dst_zspage);
 			dst_zspage = NULL;
-			if (rwlock_is_contended(&pool->migrate_lock))
+			if (spin_is_contended(&pool->lock))
 				break;
 		}
 
@@ -2084,11 +2075,9 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 			pages_freed += class->pages_per_zspage;
 		} else
 			migrate_write_unlock(src_zspage);
-		spin_unlock(&class->lock);
-		write_unlock(&pool->migrate_lock);
+		spin_unlock(&pool->lock);
 		cond_resched();
-		write_lock(&pool->migrate_lock);
-		spin_lock(&class->lock);
+		spin_lock(&pool->lock);
 	}
 
 	if (src_zspage) {
@@ -2096,8 +2085,7 @@ static unsigned long __zs_compact(struct zs_pool *pool,
 		migrate_write_unlock(src_zspage);
 	}
 
-	spin_unlock(&class->lock);
-	write_unlock(&pool->migrate_lock);
+	spin_unlock(&pool->lock);
 
 	return pages_freed;
 }
@@ -2200,7 +2188,7 @@ struct zs_pool *zs_create_pool(const char *name)
 		return NULL;
 
 	init_deferred_free(pool);
-	rwlock_init(&pool->migrate_lock);
+	spin_lock_init(&pool->lock);
 
 	pool->name = kstrdup(name, GFP_KERNEL);
 	if (!pool->name)
@@ -2271,7 +2259,6 @@ struct zs_pool *zs_create_pool(const char *name)
 		class->index = i;
 		class->pages_per_zspage = pages_per_zspage;
 		class->objs_per_zspage = objs_per_zspage;
-		spin_lock_init(&class->lock);
 		pool->size_class[i] = class;
 		for (fullness = ZS_EMPTY; fullness < NR_ZS_FULLNESS;
 							fullness++)

From patchwork Wed Oct 26 20:06:11 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 11435
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org,
 sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com
Subject: [PATCH 3/5] zsmalloc: Add a LRU to zs_pool to keep track of zspages in LRU order
Date: Wed, 26 Oct 2022 13:06:11 -0700
Message-Id: <20221026200613.1031261-4-nphamcs@gmail.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20221026200613.1031261-1-nphamcs@gmail.com>
References: <20221026200613.1031261-1-nphamcs@gmail.com>

This helps determine the coldest zspages as candidates for writeback.
Signed-off-by: Nhat Pham Acked-by: Johannes Weiner --- mm/zsmalloc.c | 27 +++++++++++++++++++++++++++ 1 file changed, 27 insertions(+) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 326faa751f0a..600c40121544 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -239,6 +239,9 @@ struct zs_pool { /* Compact classes */ struct shrinker shrinker; + /* List tracking the zspages in LRU order by most recently added object */ + struct list_head lru; + #ifdef CONFIG_ZSMALLOC_STAT struct dentry *stat_dentry; #endif @@ -260,6 +263,10 @@ struct zspage { unsigned int freeobj; struct page *first_page; struct list_head list; /* fullness list */ + + /* links the zspage to the lru list in the pool */ + struct list_head lru; + struct zs_pool *pool; #ifdef CONFIG_COMPACTION rwlock_t lock; @@ -352,6 +359,16 @@ static void cache_free_zspage(struct zs_pool *pool, struct zspage *zspage) kmem_cache_free(pool->zspage_cachep, zspage); } +/* Moves the zspage to the front of the zspool's LRU */ +static void move_to_front(struct zs_pool *pool, struct zspage *zspage) +{ + assert_spin_locked(&pool->lock); + + if (!list_empty(&zspage->lru)) + list_del(&zspage->lru); + list_add(&zspage->lru, &pool->lru); +} + /* pool->lock(which owns the handle) synchronizes races */ static void record_obj(unsigned long handle, unsigned long obj) { @@ -953,6 +970,7 @@ static void free_zspage(struct zs_pool *pool, struct size_class *class, } remove_zspage(class, zspage, ZS_EMPTY); + list_del(&zspage->lru); __free_zspage(pool, class, zspage); } @@ -998,6 +1016,8 @@ static void init_zspage(struct size_class *class, struct zspage *zspage) off %= PAGE_SIZE; } + INIT_LIST_HEAD(&zspage->lru); + set_freeobj(zspage, 0); } @@ -1418,6 +1438,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp) fix_fullness_group(class, zspage); record_obj(handle, obj); class_stat_inc(class, OBJ_USED, 1); + /* Move the zspage to front of pool's LRU */ + move_to_front(pool, zspage); spin_unlock(&pool->lock); return handle; 
@@ -1444,6 +1466,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp) /* We completely set up zspage so mark them as movable */ SetZsPageMovable(pool, zspage); + /* Move the zspage to front of pool's LRU */ + move_to_front(pool, zspage); spin_unlock(&pool->lock); return handle; @@ -1967,6 +1991,7 @@ static void async_free_zspage(struct work_struct *work) VM_BUG_ON(fullness != ZS_EMPTY); class = pool->size_class[class_idx]; spin_lock(&pool->lock); + list_del(&zspage->lru); __free_zspage(pool, class, zspage); spin_unlock(&pool->lock); } @@ -2278,6 +2303,8 @@ struct zs_pool *zs_create_pool(const char *name) */ zs_register_shrinker(pool); + INIT_LIST_HEAD(&pool->lru); + return pool; err: From patchwork Wed Oct 26 20:06:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nhat Pham X-Patchwork-Id: 11436 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp463000wru; Wed, 26 Oct 2022 13:09:44 -0700 (PDT) X-Google-Smtp-Source: AMsMyM6jCxc0Kp2jGvDLkKi6q8ba5dLMZ8v2Bq8rZ2EV7bEMsnR+yOimyNbhteyPnHbWAuOBaqJe X-Received: by 2002:a17:90a:6e4c:b0:213:2058:f456 with SMTP id s12-20020a17090a6e4c00b002132058f456mr6098291pjm.186.1666814984456; Wed, 26 Oct 2022 13:09:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666814984; cv=none; d=google.com; s=arc-20160816; b=EZwJMhn3mK75mSyIpzJEoAmF+/eAr8I2mHglAyaAOEpVFxilHQO6SgoQVfIHRLfDfH +o9tcxTfIPVx0PLBJGOglw55QfNpd/Ob4lvzyLPo9oS6/xoD3QuvEzxKS/osWLaQ45pw j+eTSehxGz+IjplY/+0AOy5QTPsA2A2tQtiXf9/azG5YbNr0v5uSlVYiEpeIYoOj16Zd w99ylnudQKFZvbBE2khc8Pv9yQIHDq4CFYBHGfXGzjrEGR3s07RZTmuW3eXMQc3gJE+Y GbvITjE5GLUrNexLmFNMA8+9QihVK7/de2DziSg0KoO5cDb10UYEg17pxskhHNXWr/Qt OrCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; 
bh=HSvm1OVp5HfWSmnERqwnki4xcSgkSWyRsvH49QuFnfE=; b=DFgsyoAnMitkIbs58p6ikoh5PqnUr4LWidIoa/PepZba1ZJueEvPSyvT05o0ZgLTAt LZWIONgbbNEsewm6ZqfS1tKXHMfnO1S7X1vvmDLxILUa7EU+y6KrrwzYty20Ux7jhhRB 2Yfrfue66Unj8+9KJXKCk7srb1iXS6F0YRdAscREq4D7ArgCNP7mgnXCQZ1HMMeXawt+ jAafr892jwVW3kCwRsXC/yElNNvOIYl07noOFCqjIYrS3++OmWBR7SCLWdCsU2qvmgi1 yLeFp1DsEW8kcb//gPFozZNOJpVRTzF9Po27OMIEbjMzsgUbIpUhmU7c3zgPYOVwFrt6 ixuA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=M6ji8O15; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id jb14-20020a170903258e00b00184000a834dsi6413457plb.455.2022.10.26.13.09.30; Wed, 26 Oct 2022 13:09:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b=M6ji8O15; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233992AbiJZUHA (ORCPT + 99 others); Wed, 26 Oct 2022 16:07:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57576 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235115AbiJZUGW (ORCPT ); Wed, 26 Oct 2022 16:06:22 -0400 Received: from mail-pj1-x1034.google.com (mail-pj1-x1034.google.com [IPv6:2607:f8b0:4864:20::1034]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C12D895AF5 for ; Wed, 
26 Oct 2022 13:06:19 -0700 (PDT) Received: by mail-pj1-x1034.google.com with SMTP id l6so11527361pjj.0 for ; Wed, 26 Oct 2022 13:06:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=HSvm1OVp5HfWSmnERqwnki4xcSgkSWyRsvH49QuFnfE=; b=M6ji8O15HzhOwy1jXWvAOlfmKQUeVoW/EbKvR9LcvTBC2x5En6d987HkhhRKanDnDV SFN0WzFT4ZC2IsTIbJT5za3RFvKWWLefUJFkXiy61PQEpN2Fz4H37MdsGumYX9eWk1OO +HWsUn9490Xsf9bxh5xgwAintd2PGJAmexI6ec0NZ4gFb2byWvu1AHzmsOmg2jfPs6E6 P90BnIAzYLyBAzAd7BTeGB5jkonOtk6qNosTYS54/+1/I0O53P9Jib8/1rTtg6vrPmRC ptPQUtnCiSrt2ARsE37QJsrbwBaiZRkzXYx1XNw/Jk7Ms0ycWP/g3fFmzFKmA5GSJXhG KNRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=HSvm1OVp5HfWSmnERqwnki4xcSgkSWyRsvH49QuFnfE=; b=osIFDripWQ80NtYEftNCDlYdaQ95lzAWHMIqUC3h/tV60628KJNkoohf31IOVzjfey 7Cxo65u/92Yp6izlscsUP3UlAwNZFnTwSWI1BrZvUY4IHUXuuK80JH8wuAX2gedcHZvV gONTKw+tPBBQK3/BUQtQPdA5WnmyWXB8vxV0QO6XFZ9zuBEMfaZZ1tHtaQ335kfIdY/C ZZ60c1J16+99GqeDObrqfCzMdlmL+LhH2hyJo9O9qsTRJyIx9/Dsuw4aBjmwHL6XuJLP 8BE3eEwwUEj4LJ26I9IVbsEgXpuI2ar9JMS4m/vyuvbwpuRER/kMBf4ZBAMTSBcixO1C /YHg== X-Gm-Message-State: ACrzQf3VWjavHy0ykFakxqNnlYLla9JgJDhJGdoaHk6tEyORGSBoKzgi kS2ICfbxFg5rBEyMj95kF4s= X-Received: by 2002:a17:902:be0b:b0:182:fd6:1293 with SMTP id r11-20020a170902be0b00b001820fd61293mr46049794pls.146.1666814779269; Wed, 26 Oct 2022 13:06:19 -0700 (PDT) Received: from localhost (fwdproxy-prn-023.fbsv.net. 
[2a03:2880:ff:17::face:b00c]) by smtp.gmail.com with ESMTPSA id 126-20020a620484000000b0056c349f5c70sm2565942pfe.79.2022.10.26.13.06.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 26 Oct 2022 13:06:18 -0700 (PDT) From: Nhat Pham To: akpm@linux-foundation.org Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com Subject: [PATCH 4/5] zsmalloc: Add ops fields to zs_pool to store evict handlers Date: Wed, 26 Oct 2022 13:06:12 -0700 Message-Id: <20221026200613.1031261-5-nphamcs@gmail.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221026200613.1031261-1-nphamcs@gmail.com> References: <20221026200613.1031261-1-nphamcs@gmail.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747782189131322262?= X-GMAIL-MSGID: =?utf-8?q?1747782189131322262?= This adds fields to zs_pool to store evict handlers for writeback, analogous to the zbud allocator. 
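As a rough userspace sketch of the forwarding pattern (not the kernel code itself), the pool keeps an ops table whose evict handler forwards to the owner's callback when one is registered and returns -ENOENT otherwise. The names demo_pool, demo_ops, demo_owner_ops, demo_pool_attach and demo_record_evict are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct demo_pool;

/* Pool-side ops table, analogous in spirit to struct zs_ops. */
struct demo_ops {
	int (*evict)(struct demo_pool *pool, unsigned long handle);
};

/* Owner-side callbacks, playing the role of zpool_ops in this sketch. */
struct demo_owner_ops {
	int (*evict)(void *owner, unsigned long handle);
};

struct demo_pool {
	const struct demo_ops *ops;
	void *owner;
	const struct demo_owner_ops *owner_ops;
};

/* Forward to the owner's handler if there is one, else report -ENOENT. */
static int demo_pool_evict(struct demo_pool *pool, unsigned long handle)
{
	if (pool->owner && pool->owner_ops && pool->owner_ops->evict)
		return pool->owner_ops->evict(pool->owner, handle);
	return -ENOENT;
}

static const struct demo_ops demo_pool_ops = {
	.evict = demo_pool_evict,
};

/* Record the owner; install the ops table only if owner ops were given. */
static void demo_pool_attach(struct demo_pool *pool, void *owner,
			     const struct demo_owner_ops *owner_ops)
{
	pool->owner = owner;
	pool->owner_ops = owner_ops;
	pool->ops = owner_ops ? &demo_pool_ops : NULL;
}

/* Example owner callback: remembers the last handle it was asked to evict. */
static int demo_record_evict(void *owner, unsigned long handle)
{
	*(unsigned long *)owner = handle;
	return 0;
}
```

Keeping pool->ops NULL when no owner ops are supplied lets a caller tell "eviction not configured" apart from "handler present but owner has no evict callback".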
Signed-off-by: Nhat Pham Acked-by: Johannes Weiner --- mm/zsmalloc.c | 36 +++++++++++++++++++++++++++++++++++- 1 file changed, 35 insertions(+), 1 deletion(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 600c40121544..76ff2ed839d0 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -225,6 +225,12 @@ struct link_free { }; }; +struct zs_pool; + +struct zs_ops { + int (*evict)(struct zs_pool *pool, unsigned long handle); +}; + struct zs_pool { const char *name; @@ -242,6 +248,12 @@ struct zs_pool { /* List tracking the zspages in LRU order by most recently added object */ struct list_head lru; +#ifdef CONFIG_ZPOOL + const struct zs_ops *ops; + struct zpool *zpool; + const struct zpool_ops *zpool_ops; +#endif + #ifdef CONFIG_ZSMALLOC_STAT struct dentry *stat_dentry; #endif @@ -379,6 +391,18 @@ static void record_obj(unsigned long handle, unsigned long obj) #ifdef CONFIG_ZPOOL +static int zs_zpool_evict(struct zs_pool *pool, unsigned long handle) +{ + if (pool->zpool && pool->zpool_ops && pool->zpool_ops->evict) + return pool->zpool_ops->evict(pool->zpool, handle); + else + return -ENOENT; +} + +static const struct zs_ops zs_zpool_ops = { + .evict = zs_zpool_evict +}; + static void *zs_zpool_create(const char *name, gfp_t gfp, const struct zpool_ops *zpool_ops, struct zpool *zpool) @@ -388,7 +412,17 @@ static void *zs_zpool_create(const char *name, gfp_t gfp, * different contexts and its caller must provide a valid * gfp mask. 
*/ - return zs_create_pool(name); + struct zs_pool *pool = zs_create_pool(name); + + if (pool) { + pool->zpool = zpool; + pool->zpool_ops = zpool_ops; + + if (zpool_ops) + pool->ops = &zs_zpool_ops; + } + + return pool; } static void zs_zpool_destroy(void *pool) From patchwork Wed Oct 26 20:06:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Nhat Pham X-Patchwork-Id: 11438 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp463151wru; Wed, 26 Oct 2022 13:09:58 -0700 (PDT) X-Google-Smtp-Source: AMsMyM4RvOshQulJsk/3Yzx9zEyYI3Y2d+/dDLUV+PTFyLnic7C2PUEWmPvuhz8ZfMsQdj4bm1r6 X-Received: by 2002:a17:902:e552:b0:179:e796:b432 with SMTP id n18-20020a170902e55200b00179e796b432mr45031714plf.21.1666814997974; Wed, 26 Oct 2022 13:09:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1666814997; cv=none; d=google.com; s=arc-20160816; b=rkOKPrKDNy0ensURi0x4OyZFwQK83TXGDbhtKsP/ip9CFMPJTCXXiSiFX0AEpzj+f1 olckfmFGvVQVGh6YQUgKOEe8bWzownJt1qRg9oN9vvuKpxTX6hiVhjI+664/i0rvf52k Z4KTQrZNI4DsCINtAVikhl4UZ1xOXjs5uvUZQ/b144M7aPAHC62mZIRjO6vQ49TwRuug FrEn+KemwuaqoyLSz2RaPEcOJnRIKVtSb/G3+XUt5pXaxHhbnkv0sQRWKB8Te2PMiWpW vPiQoVlW6ECy3B2642wwh3VGp0z+hHMQCq1Hne2r42iN4Jt/E824y56S7u13tjTuOKUs ao4w== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=7OqjbX0ITf4k24OuoKwYdR+7colwCLHKEor3QZ1c38U=; b=Svg+uJ+157Lpz29JsNWOURuHVNNosNn+TzZI5xkhyU1TWu/IXFJwhs3LGpoH/6RQqn B3aXFy9l29L43JvUdkPv/poQX0QXaNsxWxWbzi6r/5xEjYJx8J0zs0eRYTqRcXM2iluI TKF/F5B4Bmc5uEeZVD7WtIiqH4//9uiJcGMfGBXp8IyksPq9k+DH14H3sZtNhJh0FGx4 v3brJBysnc3B0WBLpFqULJBnOEWFVHp9eWQIgFEibGH7nx2gAmMMGdDqqSXfskq9APhZ 0DBx/EP/pNaqAQ5IIgfNwnT3ZywMbqAnqaXRvHZkrlB8ol7AQ1YByICzZqWstmu6P6qy X3yw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass 
header.i=@gmail.com header.s=20210112 header.b="B8/Zw4l4"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id bh5-20020a056a02020500b0044ef742ff8bsi8236436pgb.728.2022.10.26.13.09.44; Wed, 26 Oct 2022 13:09:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20210112 header.b="B8/Zw4l4"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234865AbiJZUHF (ORCPT + 99 others); Wed, 26 Oct 2022 16:07:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235119AbiJZUGX (ORCPT ); Wed, 26 Oct 2022 16:06:23 -0400 Received: from mail-pf1-x430.google.com (mail-pf1-x430.google.com [IPv6:2607:f8b0:4864:20::430]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39F82A220E for ; Wed, 26 Oct 2022 13:06:21 -0700 (PDT) Received: by mail-pf1-x430.google.com with SMTP id m6so16594946pfb.0 for ; Wed, 26 Oct 2022 13:06:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7OqjbX0ITf4k24OuoKwYdR+7colwCLHKEor3QZ1c38U=; 
b=B8/Zw4l4FSDq4hTG54dSyA1WeuK0Dz1r5JD85LjlCAjLNCsu+MH0bnTd8jjEnsSulV s/CLh+005SrNFDNzbqQrDnnoyMd5btQIsGqmYFFvJZIEP77Px713mxhnG60gec1LVfQC UfCzlpF8ILFekFuXunWwwlFeMahnDOiL0xypcbekLCThO56VkxqMbFSsgf7WfBl1hjrm oGUTYDJOstz3KhAVJ04QqOEkGCovf3rWK9WO4/xoDCtwHFQ2Tn98YCQbYwx5GM2rbNt3 t5yM59yjjbw8dwfk8XY0jNDoWln3+eSnlD+pYGCde2RSC+iu8Ou/tUYxaQqH0Fgrt4EW QiZA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7OqjbX0ITf4k24OuoKwYdR+7colwCLHKEor3QZ1c38U=; b=VdGIj51On1T4LReOVLCTu1HhcjFfcLwAdmakv5oGnAJahCKiGFew+g3Ta4iYfA60Zh DWVnmgZuIvQMpN+6k9xtTM2ucpwGwYLVvHlHzzIwJ1R/Lt0xYdWYVuj9/CJ1yXbdxP0j icYOZTu15ycKNrQd9av3zPs3Bsrd0B26MKeQ0yCqkalCvbRrmXuQMz6/iq0/MnjMs+G5 o8OYmScRWpglpzluG4BXz37RsagLSyKJgrulAqhifuh7WNpKzEryVEY4tyvjypajG85A 78INqcfM0PlBvw1ZjITYchHdGJOz9ancCX9lc9ajBpW6kGWB73f/kAvsmwxupEp9AeW5 QaCQ== X-Gm-Message-State: ACrzQf2MPBICY9CjQqPvvd62azeDOSdwtFDQn7bmDNcu6Gb5smb0oigh 97jQEkOjugicjpssG/rGQAA= X-Received: by 2002:a05:6a00:10cf:b0:563:34ce:4138 with SMTP id d15-20020a056a0010cf00b0056334ce4138mr46545932pfu.6.1666814780588; Wed, 26 Oct 2022 13:06:20 -0700 (PDT) Received: from localhost (fwdproxy-prn-001.fbsv.net. 
[2a03:2880:ff:1::face:b00c]) by smtp.gmail.com with ESMTPSA id g31-20020a63111f000000b0043a18cef977sm3156403pgl.13.2022.10.26.13.06.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 26 Oct 2022 13:06:20 -0700 (PDT) From: Nhat Pham To: akpm@linux-foundation.org Cc: hannes@cmpxchg.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, minchan@kernel.org, ngupta@vflare.org, senozhatsky@chromium.org, sjenning@redhat.com, ddstreet@ieee.org, vitaly.wool@konsulko.com Subject: [PATCH 5/5] zsmalloc: Implement writeback mechanism for zsmalloc Date: Wed, 26 Oct 2022 13:06:13 -0700 Message-Id: <20221026200613.1031261-6-nphamcs@gmail.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20221026200613.1031261-1-nphamcs@gmail.com> References: <20221026200613.1031261-1-nphamcs@gmail.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,FREEMAIL_FROM, RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1747782202900794755?= X-GMAIL-MSGID: =?utf-8?q?1747782202900794755?= This commit adds the writeback mechanism for zsmalloc, analogous to the zbud allocator. Zsmalloc will attempt to determine the coldest zspage (i.e., the least recently used) in the pool, and attempt to write back all the stored compressed objects via the pool's evict handler. 
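The reclaim loop described above can be approximated in userspace as follows. This is a simplified sketch under stated assumptions, not the kernel implementation: demo_page, demo_lru, demo_reclaim_page and the two example handlers are hypothetical names, locking and page pinning are omitted, and a slot is cleared directly after a successful evict, whereas in the patch the evict path is expected to end in zs_free().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define DEMO_MAX_OBJS 4

struct demo_page {
	unsigned long objs[DEMO_MAX_OBJS]; /* 0 marks an empty slot */
	struct demo_page *prev, *next;     /* LRU links, NULL-terminated */
};

struct demo_lru {
	struct demo_page *head; /* most recently used */
	struct demo_page *tail; /* coldest */
};

typedef int (*demo_evict_fn)(unsigned long handle);

static void demo_push_front(struct demo_lru *lru, struct demo_page *p)
{
	p->prev = NULL;
	p->next = lru->head;
	if (lru->head)
		lru->head->prev = p;
	else
		lru->tail = p;
	lru->head = p;
}

static struct demo_page *demo_pop_tail(struct demo_lru *lru)
{
	struct demo_page *p = lru->tail;

	if (!p)
		return NULL;
	lru->tail = p->prev;
	if (lru->tail)
		lru->tail->next = NULL;
	else
		lru->head = NULL;
	p->prev = p->next = NULL;
	return p;
}

/*
 * Try to empty one page, starting from the cold end. Returns 0 once a page
 * has been fully written back, -EINVAL on bad arguments or an empty LRU,
 * and -EAGAIN if every attempt hit an eviction failure.
 */
static int demo_reclaim_page(struct demo_lru *lru, demo_evict_fn evict,
			     unsigned int retries)
{
	unsigned int i;

	if (!evict || !lru->tail || retries == 0)
		return -EINVAL;

	for (i = 0; i < retries; i++) {
		struct demo_page *p = demo_pop_tail(lru);
		int slot, failed = 0;

		for (slot = 0; slot < DEMO_MAX_OBJS; slot++) {
			if (!p->objs[slot])
				continue;
			if (evict(p->objs[slot])) {
				failed = 1;
				break;
			}
			p->objs[slot] = 0; /* object written back */
		}
		if (!failed)
			return 0; /* page is empty; the caller may free it */

		demo_push_front(lru, p); /* put back and try the new tail */
	}
	return -EAGAIN;
}

/* Example handlers: one always succeeds, one refuses odd handles. */
static int demo_evict_ok(unsigned long handle)
{
	(void)handle;
	return 0;
}

static int demo_evict_even_only(unsigned long handle)
{
	return (handle & 1) ? -EBUSY : 0;
}
```

A page that fails eviction goes back to the front of the LRU, so the next retry picks a different tail whenever more than one page is tracked.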
Signed-off-by: Nhat Pham --- mm/zsmalloc.c | 192 ++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 177 insertions(+), 15 deletions(-) diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c index 76ff2ed839d0..c79cbd3f46f3 100644 --- a/mm/zsmalloc.c +++ b/mm/zsmalloc.c @@ -279,10 +279,13 @@ struct zspage { /* links the zspage to the lru list in the pool */ struct list_head lru; + bool under_reclaim; + + /* list of unfreed handles whose objects have been reclaimed */ + unsigned long *deferred_handles; + struct zs_pool *pool; -#ifdef CONFIG_COMPACTION rwlock_t lock; -#endif }; struct mapping_area { @@ -303,10 +306,11 @@ static bool ZsHugePage(struct zspage *zspage) return zspage->huge; } -#ifdef CONFIG_COMPACTION static void migrate_lock_init(struct zspage *zspage); static void migrate_read_lock(struct zspage *zspage); static void migrate_read_unlock(struct zspage *zspage); + +#ifdef CONFIG_COMPACTION static void migrate_write_lock(struct zspage *zspage); static void migrate_write_lock_nested(struct zspage *zspage); static void migrate_write_unlock(struct zspage *zspage); @@ -314,9 +318,6 @@ static void kick_deferred_free(struct zs_pool *pool); static void init_deferred_free(struct zs_pool *pool); static void SetZsPageMovable(struct zs_pool *pool, struct zspage *zspage); #else -static void migrate_lock_init(struct zspage *zspage) {} -static void migrate_read_lock(struct zspage *zspage) {} -static void migrate_read_unlock(struct zspage *zspage) {} static void migrate_write_lock(struct zspage *zspage) {} static void migrate_write_lock_nested(struct zspage *zspage) {} static void migrate_write_unlock(struct zspage *zspage) {} @@ -444,6 +445,27 @@ static void zs_zpool_free(void *pool, unsigned long handle) zs_free(pool, handle); } +static int zs_reclaim_page(struct zs_pool *pool, unsigned int retries); + +static int zs_zpool_shrink(void *pool, unsigned int pages, + unsigned int *reclaimed) +{ + unsigned int total = 0; + int ret = -EINVAL; + + while (total < pages) { 
+ ret = zs_reclaim_page(pool, 8); + if (ret < 0) + break; + total++; + } + + if (reclaimed) + *reclaimed = total; + + return ret; +} + static void *zs_zpool_map(void *pool, unsigned long handle, enum zpool_mapmode mm) { @@ -482,6 +504,7 @@ static struct zpool_driver zs_zpool_driver = { .malloc_support_movable = true, .malloc = zs_zpool_malloc, .free = zs_zpool_free, + .shrink = zs_zpool_shrink, .map = zs_zpool_map, .unmap = zs_zpool_unmap, .total_size = zs_zpool_total_size, @@ -955,6 +978,21 @@ static int trylock_zspage(struct zspage *zspage) return 0; } +/* + * Free all the deferred handles whose objects are freed in zs_free. + */ +static void free_handles(struct zs_pool *pool, struct zspage *zspage) +{ + unsigned long handle = (unsigned long) zspage->deferred_handles; + + while (handle) { + unsigned long nxt_handle = handle_to_obj(handle); + + cache_free_handle(pool, handle); + handle = nxt_handle; + } +} + static void __free_zspage(struct zs_pool *pool, struct size_class *class, struct zspage *zspage) { @@ -969,6 +1007,9 @@ static void __free_zspage(struct zs_pool *pool, struct size_class *class, VM_BUG_ON(get_zspage_inuse(zspage)); VM_BUG_ON(fg != ZS_EMPTY); + /* Free all deferred handles from zs_free */ + free_handles(pool, zspage); + next = page = get_first_page(zspage); do { VM_BUG_ON_PAGE(!PageLocked(page), page); @@ -1051,6 +1092,8 @@ static void init_zspage(struct size_class *class, struct zspage *zspage) } INIT_LIST_HEAD(&zspage->lru); + zspage->under_reclaim = false; + zspage->deferred_handles = NULL; set_freeobj(zspage, 0); } @@ -1472,11 +1515,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, gfp_t gfp) fix_fullness_group(class, zspage); record_obj(handle, obj); class_stat_inc(class, OBJ_USED, 1); - /* Move the zspage to front of pool's LRU */ - move_to_front(pool, zspage); - spin_unlock(&pool->lock); - return handle; + goto out; } spin_unlock(&pool->lock); @@ -1500,6 +1540,8 @@ unsigned long zs_malloc(struct zs_pool *pool, size_t size, 
gfp_t gfp) /* We completely set up zspage so mark them as movable */ SetZsPageMovable(pool, zspage); + +out: /* Move the zspage to front of pool's LRU */ move_to_front(pool, zspage); spin_unlock(&pool->lock); @@ -1557,12 +1599,24 @@ void zs_free(struct zs_pool *pool, unsigned long handle) obj_free(class->size, obj); class_stat_dec(class, OBJ_USED, 1); + + if (zspage->under_reclaim) { + /* + * Reclaim needs the handles during writeback. It'll free + * them along with the zspage when it's done with them. + * + * Record current deferred handle at the memory location + * whose address is given by handle. + */ + record_obj(handle, (unsigned long) zspage->deferred_handles); + zspage->deferred_handles = (unsigned long *) handle; + spin_unlock(&pool->lock); + return; + } fullness = fix_fullness_group(class, zspage); - if (fullness != ZS_EMPTY) - goto out; + if (fullness == ZS_EMPTY) + free_zspage(pool, class, zspage); - free_zspage(pool, class, zspage); -out: spin_unlock(&pool->lock); cache_free_handle(pool, handle); } @@ -1762,7 +1816,6 @@ static enum fullness_group putback_zspage(struct size_class *class, return fullness; } -#ifdef CONFIG_COMPACTION /* * To prevent zspage destroy during migration, zspage freeing should * hold locks of all pages in the zspage. @@ -1805,6 +1858,21 @@ static void lock_zspage(struct zspage *zspage) migrate_read_unlock(zspage); } +/* + * Unlocks all the pages of the zspage. + * + * pool->lock must be held before this function is called + * to prevent the underlying pages from migrating. 
+ */ +static void unlock_zspage(struct zspage *zspage) +{ + struct page *page = get_first_page(zspage); + + do { + unlock_page(page); + } while ((page = get_next_page(page)) != NULL); +} + static void migrate_lock_init(struct zspage *zspage) { rwlock_init(&zspage->lock); @@ -1820,6 +1888,7 @@ static void migrate_read_unlock(struct zspage *zspage) __releases(&zspage->lock) read_unlock(&zspage->lock); } +#ifdef CONFIG_COMPACTION static void migrate_write_lock(struct zspage *zspage) { write_lock(&zspage->lock); @@ -2380,6 +2449,99 @@ void zs_destroy_pool(struct zs_pool *pool) } EXPORT_SYMBOL_GPL(zs_destroy_pool); +static int zs_reclaim_page(struct zs_pool *pool, unsigned int retries) +{ + int i, obj_idx, ret = 0; + unsigned long handle; + struct zspage *zspage; + struct page *page; + enum fullness_group fullness; + + /* Lock LRU and fullness list */ + spin_lock(&pool->lock); + if (!pool->ops || !pool->ops->evict || list_empty(&pool->lru) || + retries == 0) { + spin_unlock(&pool->lock); + return -EINVAL; + } + + for (i = 0; i < retries; i++) { + struct size_class *class; + + zspage = list_last_entry(&pool->lru, struct zspage, lru); + list_del(&zspage->lru); + + /* zs_free may free objects, but not the zspage and handles */ + zspage->under_reclaim = true; + + /* Lock backing pages into place */ + lock_zspage(zspage); + + class = zspage_class(pool, zspage); + fullness = get_fullness_group(class, zspage); + + /* Lock out object allocations and object compaction */ + remove_zspage(class, zspage, fullness); + + spin_unlock(&pool->lock); + + obj_idx = 0; + page = zspage->first_page; + while (1) { + handle = find_alloced_obj(class, page, &obj_idx); + if (!handle) { + page = get_next_page(page); + if (!page) + break; + obj_idx = 0; + continue; + } + + /* + * This will write the object and call + * zs_free. + * + * zs_free will free the object, but the + * under_reclaim flag prevents it from freeing + * the zspage altogether. 
This is necessary so + * that we can continue working with the + * zspage potentially after the last object + * has been freed. + */ + ret = pool->ops->evict(pool, handle); + if (ret) + goto next; + + obj_idx++; + } + +next: + /* For freeing the zspage, or putting it back in the pool and LRU list. */ + spin_lock(&pool->lock); + zspage->under_reclaim = false; + + if (!get_zspage_inuse(zspage)) { + /* + * Fullness went stale as zs_free() won't touch it + * while the page is removed from the pool. Fix it + * up for the check in __free_zspage(). + */ + zspage->fullness = ZS_EMPTY; + + __free_zspage(pool, class, zspage); + spin_unlock(&pool->lock); + return 0; + } + + putback_zspage(class, zspage); + list_add(&zspage->lru, &pool->lru); + unlock_zspage(zspage); + } + + spin_unlock(&pool->lock); + return -EAGAIN; +} + static int __init zs_init(void) { int ret;