Message ID | 20240103003011.211382-1-v-songbaohua@oppo.com |
---|---|
State | New |
Headers |
From: Barry Song <21cnbao@gmail.com>
X-Google-Original-From: Barry Song <v-songbaohua@oppo.com>
To: minchan@kernel.org, senozhatsky@chromium.org, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, linux-block@vger.kernel.org, Barry Song <v-songbaohua@oppo.com>
Subject: [PATCH] zram: easy the allocation of zcomp_strm's buffers with 2 pages
Date: Wed, 3 Jan 2024 13:30:11 +1300
Message-Id: <20240103003011.211382-1-v-songbaohua@oppo.com>
X-Mailer: git-send-email 2.34.1
List-Id: <linux-kernel.vger.kernel.org> |
Series |
zram: easy the allocation of zcomp_strm's buffers with 2 pages
|
|
Commit Message
Barry Song
Jan. 3, 2024, 12:30 a.m. UTC
There is no need to keep zcomp_strm's buffers contiguous physically.
And rarely, 1-order allocation can fail while buddy is seriously
fragmented.
Signed-off-by: Barry Song <v-songbaohua@oppo.com>
---
drivers/block/zram/zcomp.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
Comments
On (24/01/03 13:30), Barry Song wrote:
> There is no need to keep zcomp_strm's buffers contiguous physically.
> And rarely, 1-order allocation can fail while buddy is seriously
> fragmented.

Dunno. Some of these don't sound like convincing reasons, I'm afraid.
We don't allocate compression streams all the time, we do it once
per-CPU. And if the system is under such a terrible memory pressure
then one probably should not use zram at all, because zsmalloc needs
pages for its pool.

I also wonder whether Android uses HW compression, in which case we
may need to have physically contig pages. Not to mention TLB shootdowns
that virt contig pages add to the picture.

[..]
> @@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm)
>  {
>      if (!IS_ERR_OR_NULL(zstrm->tfm))
>          crypto_free_comp(zstrm->tfm);
> -    free_pages((unsigned long)zstrm->buffer, 1);
> +    vfree(zstrm->buffer);
>      zstrm->tfm = NULL;
>      zstrm->buffer = NULL;
>  }
> @@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
>       * allocate 2 pages. 1 for compressed data, plus 1 extra for the
>       * case when compressed size is larger than the original one
>       */
> -    zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
> +    zstrm->buffer = vzalloc(2 * PAGE_SIZE);
>      if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
>          zcomp_strm_free(zstrm);
>          return -ENOMEM;
On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/01/03 13:30), Barry Song wrote:
> > There is no need to keep zcomp_strm's buffers contiguous physically.
> > And rarely, 1-order allocation can fail while buddy is seriously
> > fragmented.
>
> Dunno. Some of these don't sound like convincing reasons, I'm afraid.
> We don't allocate compression streams all the time, we do it once
> per-CPU. And if the system is under such a terrible memory pressure

We actually allocate them many times, because we free the streams while
unplugging a CPU and re-allocate them during hotplugging. This can
happen quite often on systems like Android that use hotplug for power
management.

> then one probably should not use zram at all, because zsmalloc needs
> pages for its pool.

In my humble opinion, 1-order allocation and 0-order allocation are
different things. A 1-order allocation is still harder than 0-order,
though easier than the 2-order allocations that used to be a big pain,
causing allocation latency for tasks' kernel stacks and negatively
affecting user experience; those have since been replaced by vmalloc,
which makes life easier :-)

> I also wonder whether Android uses HW compression, in which case we
> may need to have physically contig pages. Not to mention TLB shootdowns
> that virt contig pages add to the picture.

I don't understand how HW compression and TLB shootdowns are related
here, as zram uses the traditional comp API. We always pass a virtual
address, and traditional HW drivers use their own buffers to do DMA:

int crypto_comp_compress(struct crypto_comp *comp,
                         const u8 *src, unsigned int slen,
                         u8 *dst, unsigned int *dlen);
int crypto_comp_decompress(struct crypto_comp *comp,
                           const u8 *src, unsigned int slen,
                           u8 *dst, unsigned int *dlen);

In the new acomp API, we pass an sg - users' buffers - to drivers
directly:

sg_init_one(&input, src, entry->length);
sg_init_table(&output, 1);
sg_set_page(&output, page, PAGE_SIZE, 0);
acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req),
                      &acomp_ctx->wait);

But I agree a one-nents sg might have some advantage in the scompress
case after we move to the new acomp APIs, if we have the patch I sent
recently [patch 3/3]:
https://lore.kernel.org/linux-mm/20240103095006.608744-1-21cnbao@gmail.com/

For the current zram code, I guess HW compression/TLB is not a concern.

> [..]
> > @@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm)
> >  {
> >      if (!IS_ERR_OR_NULL(zstrm->tfm))
> >          crypto_free_comp(zstrm->tfm);
> > -    free_pages((unsigned long)zstrm->buffer, 1);
> > +    vfree(zstrm->buffer);
> >      zstrm->tfm = NULL;
> >      zstrm->buffer = NULL;
> >  }
> > @@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
> >       * allocate 2 pages. 1 for compressed data, plus 1 extra for the
> >       * case when compressed size is larger than the original one
> >       */
> > -    zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
> > +    zstrm->buffer = vzalloc(2 * PAGE_SIZE);
> >      if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
> >          zcomp_strm_free(zstrm);
> >          return -ENOMEM;

Thanks
Barry
On (24/01/06 15:38), Barry Song wrote:
> On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky
> <senozhatsky@chromium.org> wrote:
> >
> > On (24/01/03 13:30), Barry Song wrote:
> > > There is no need to keep zcomp_strm's buffers contiguous physically.
> > > And rarely, 1-order allocation can fail while buddy is seriously
> > > fragmented.
> >
> > Dunno. Some of these don't sound like convincing reasons, I'm afraid.
> > We don't allocate compression streams all the time, we do it once
> > per-CPU. And if the system is under such a terrible memory pressure
>
> We actually allocate them many times, because we free the streams while
> unplugging a CPU and re-allocate them during hotplugging. This can
> happen quite often on systems like Android that use hotplug for power
> management.

Okay, makes sense.
Do you see these problems in real life? I don't recall any reports.

> > then one probably should not use zram at all, because zsmalloc needs
> > pages for its pool.
>
> In my humble opinion, 1-order allocation and 0-order allocation are
> different things. [..]

Sure.

> > I also wonder whether Android uses HW compression, in which case we
> > may need to have physically contig pages. Not to mention TLB shootdowns
> > that virt contig pages add to the picture.
>
> I don't understand how HW compression and TLB shootdowns are related
> here, as zram uses the traditional comp API.

Oh, those are not related. TLB shootdowns are what now will be added to
all compressions/decompressions, so it's sort of an extra cost.
HW compression (which Android may be doing?) is another story.

Did you run any perf tests on zram w/ and w/o the patch?

> We always pass a virtual address, and traditional HW drivers use their
> own buffers to do DMA:
>
> int crypto_comp_compress(struct crypto_comp *comp,
>                          const u8 *src, unsigned int slen,
>                          u8 *dst, unsigned int *dlen);
> int crypto_comp_decompress(struct crypto_comp *comp,
>                            const u8 *src, unsigned int slen,
>                            u8 *dst, unsigned int *dlen);
>
> In the new acomp API, we pass an sg - users' buffers - to drivers
> directly:
>
> sg_init_one(&input, src, entry->length);
> sg_init_table(&output, 1);
> sg_set_page(&output, page, PAGE_SIZE, 0);
> acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, dlen);
> ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req),
>                       &acomp_ctx->wait);
>
> But I agree a one-nents sg might have some advantage in the scompress
> case

Right.

> after we move to the new acomp APIs, if we have the patch I sent
> recently [patch 3/3]:
> https://lore.kernel.org/linux-mm/20240103095006.608744-1-21cnbao@gmail.com/

Nice.
On Mon, Jan 15, 2024 at 10:35 AM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/01/06 15:38), Barry Song wrote:
> > On Sat, Jan 6, 2024 at 9:30 AM Sergey Senozhatsky
> > <senozhatsky@chromium.org> wrote:
> > >
> > > On (24/01/03 13:30), Barry Song wrote:
> > > > There is no need to keep zcomp_strm's buffers contiguous physically.
> > > > And rarely, 1-order allocation can fail while buddy is seriously
> > > > fragmented.
>
> Okay, makes sense.
> Do you see these problems in real life? I don't recall any reports.

I don't have problems with the current zram, which supports normal
pages only.

But in our out-of-tree code, we have enhanced zram/zsmalloc to support
large-folio compression/decompression, which will make zram work much
better with the large anon folios/mTHP things Ryan Roberts is working
on. I mean, a large folio with, for example, 16 normal pages can be
saved as one object in zram. On millions of phones, we have deployed
this approach and seen a huge improvement in compression ratio and a
decrease in CPU consumption. In that case, we need a larger per-CPU
buffer, and we have seen frequent allocation failures; that inspired me
to send this patch in advance.

> > > then one probably should not use zram at all, because zsmalloc needs
> > > pages for its pool.
> >
> > In my humble opinion, 1-order allocation and 0-order allocation are
> > different things. [..]
>
> Sure.
>
> > > I also wonder whether Android uses HW compression, in which case we
> > > may need to have physically contig pages. Not to mention TLB shootdowns
> > > that virt contig pages add to the picture.
> >
> > I don't understand how HW compression and TLB shootdowns are related
> > here, as zram uses the traditional comp API.
>
> Oh, those are not related. TLB shootdowns are what now will be added to
> all compressions/decompressions, so it's sort of an extra cost.

I am sorry, I still don't understand where the TLB shootdowns come from.
We don't unmap these per-cpu buffers during compression and
decompression, do we?

Am I missing something?

> HW compression (which Android may be doing?) is another story.
>
> Did you run any perf tests on zram w/ and w/o the patch?
>
[..]

Thanks
Barry
On (24/01/29 10:46), Barry Song wrote:
> On Mon, Jan 15, 2024 at 10:35 AM Sergey Senozhatsky
> <senozhatsky@chromium.org> wrote:
> >
> > On (24/01/06 15:38), Barry Song wrote:
[..]
> > Okay, makes sense.
> > Do you see these problems in real life? I don't recall any reports.
>
> I don't have problems with the current zram, which supports normal
> pages only.
>
> But in our out-of-tree code, we have enhanced zram/zsmalloc to support
> large-folio compression/decompression [..] In that case, we need a
> larger per-CPU buffer, and we have seen frequent allocation failures;
> that inspired me to send this patch in advance.

May I please ask you to resend this patch with an updated commit
message? E.g. mention CPU offlining/onlining, etc.

[..]
> > > I also wonder whether Android uses HW compression, in which case we
> > > may need to have physically contig pages. Not to mention TLB shootdowns
> > > that virt contig pages add to the picture.
> >
> > I don't understand how HW compression and TLB shootdowns are related
> > here, as zram uses the traditional comp API.
>
> Oh, those are not related. TLB shootdowns are what now will be added to
> all compressions/decompressions, so it's sort of an extra cost.

> I am sorry, I still don't understand where the TLB shootdowns come
> from. We don't unmap these per-cpu buffers during compression and
> decompression, do we?
>
> Am I missing something?

No, I guess you are right.
On Tue, Feb 6, 2024 at 9:55 AM Sergey Senozhatsky
<senozhatsky@chromium.org> wrote:
>
> On (24/01/29 10:46), Barry Song wrote:
[..]
> May I please ask you to resend this patch with an updated commit
> message? E.g. mention CPU offlining/onlining, etc.

I will send v2 to add this information in the commit message. Thanks!

[..]
> > I am sorry, I still don't understand where the TLB shootdowns come
> > from. We don't unmap these per-cpu buffers during compression and
> > decompression, do we?
> >
> > Am I missing something?
>
> No, I guess you are right.

Best regards
Barry
diff --git a/drivers/block/zram/zcomp.c b/drivers/block/zram/zcomp.c
index 55af4efd7983..8237b08c49d8 100644
--- a/drivers/block/zram/zcomp.c
+++ b/drivers/block/zram/zcomp.c
@@ -11,6 +11,7 @@
 #include <linux/sched.h>
 #include <linux/cpu.h>
 #include <linux/crypto.h>
+#include <linux/vmalloc.h>
 
 #include "zcomp.h"
 
@@ -37,7 +38,7 @@ static void zcomp_strm_free(struct zcomp_strm *zstrm)
 {
 	if (!IS_ERR_OR_NULL(zstrm->tfm))
 		crypto_free_comp(zstrm->tfm);
-	free_pages((unsigned long)zstrm->buffer, 1);
+	vfree(zstrm->buffer);
 	zstrm->tfm = NULL;
 	zstrm->buffer = NULL;
 }
@@ -53,7 +54,7 @@ static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
 	 * allocate 2 pages. 1 for compressed data, plus 1 extra for the
 	 * case when compressed size is larger than the original one
 	 */
-	zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
+	zstrm->buffer = vzalloc(2 * PAGE_SIZE);
 	if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
 		zcomp_strm_free(zstrm);
 		return -ENOMEM;