Message ID | 20230223001607.95523-1-andrealmeid@igalia.com |
---|---|
State | New |
Headers |
From: André Almeida <andrealmeid@igalia.com> To: linux-kernel@vger.kernel.org, linux-kbuild@vger.kernel.org, Masahiro Yamada <masahiroy@kernel.org> Cc: kernel-dev@igalia.com, Nathan Chancellor <nathan@kernel.org>, Nick Desaulniers <ndesaulniers@google.com>, Nicolas Schier <nicolas@fjasle.eu>, André Almeida <andrealmeid@igalia.com> Subject: [PATCH] kbuild: modinst: Enable multithread xz compression Date: Wed, 22 Feb 2023 21:16:07 -0300 Message-Id: <20230223001607.95523-1-andrealmeid@igalia.com> X-Mailer: git-send-email 2.39.2 |
Series |
kbuild: modinst: Enable multithread xz compression
|
|
Commit Message
André Almeida
Feb. 23, 2023, 12:16 a.m. UTC
As it's done for zstd compression, enable multithread compression for
xz to speed up module installation.
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
On my setup xz is a bottleneck during module installation. Here are the
numbers to install it in a local directory, before and after this patch:
$ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
Executed in 100.08 secs
$ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
Executed in 28.60 secs
---
scripts/Makefile.modinst | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
On Wed, Feb 22, 2023 at 09:16:07PM -0300, André Almeida wrote:
> As it's done for zstd compression, enable multithread compression for
> xz to speed up module installation.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>

This seems reasonable to me.

Reviewed-by: Nathan Chancellor <nathan@kernel.org>

If for some reason Masahiro does not want to take this, you could set
XZ_OPT=-T0 in your build environment, which should accomplish the same
thing.
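The XZ_OPT suggestion in the reply above can be sketched as a small shell helper. This is only an illustration and not part of the patch: the function name `install_modules_xz_mt` is hypothetical, and the sketch assumes GNU make, `nproc` from coreutils, and an xz build that supports threading.

```shell
# Hypothetical helper (not part of kbuild): run modules_install with
# multithreaded xz without patching scripts/Makefile.modinst.
# $1: install prefix. Run it from the top of an already built kernel tree.
install_modules_xz_mt() {
    # xz reads extra options from the XZ_OPT environment variable, so each
    # xz process spawned by make should pick up -T0 (assumption: kbuild
    # invokes plain xz and does not clear the environment).
    XZ_OPT=-T0 make INSTALL_MOD_PATH="$1" modules_install -j"$(nproc)"
}
```

Calling `install_modules_xz_mt /tmp/inst1` would then correspond to the benchmark command from the thread with `-T0` supplied through the environment instead of the Makefile.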
On Thu, Feb 23, 2023 at 9:17 AM André Almeida <andrealmeid@igalia.com> wrote:
>
> As it's done for zstd compression, enable multithread compression for
> xz to speed up module installation.
>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
>
> On my setup xz is a bottleneck during module installation. Here are the
> numbers to install it in a local directory, before and after this patch:
>
> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> Executed in 100.08 secs
>
> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> Executed in 28.60 secs

Heh, this is an interesting benchmark.

Without this patch, you ran 16 processes of 'xz' in parallel
since you gave -j16.

You created multi-threads in each xz process, then you got 3x faster.
What made it happen?

How many threads can your system run?

I did not get such an improvement in my testing.
In my machine $(nproc) is 24.

[Without this patch]

$ time make INSTALL_MOD_PATH=/tmp/inst1 modules_install -j$(nproc)

real 0m33.965s
user 10m6.118s
sys 0m37.231s

[With this patch]

$ time make INSTALL_MOD_PATH=/tmp/inst1 modules_install -j$(nproc)

real 0m32.568s
user 10m4.472s
sys 0m39.132s

Given that GNU Make provides the parallel execution environment,
you can control the number of processes of 'xz'.

There is no point in forcing multi-threading, which the user
did not ask or ever want.
Hi Masahiro,

On 24/02/2023 02:38, Masahiro Yamada wrote:
> On Thu, Feb 23, 2023 at 9:17 AM André Almeida <andrealmeid@igalia.com> wrote:
>>
>> As it's done for zstd compression, enable multithread compression for
>> xz to speed up module installation.
>>
>> Signed-off-by: André Almeida <andrealmeid@igalia.com>
>> ---
>>
>> On my setup xz is a bottleneck during module installation. Here are the
>> numbers to install it in a local directory, before and after this patch:
>>
>> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
>> Executed in 100.08 secs
>>
>> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
>> Executed in 28.60 secs
>
>
> Heh, this is an interesting benchmark.
>
> Without this patch, you ran 16 processes of 'xz' in parallel
> since you gave -j16.
>
> You created multi-threads in each xz process, then you got 3x faster.
> What made it happen?
>

During the modules installation in my setup, the build system would
spend most of its time compressing big modules (such as the 350M
amdgpu.ko) in a single thread, with 15 idle threads. Enabling
multithreading allowed amdgpu to be compressed really fast.

The real performance improvement during module compression is not
compressing as many small modules as possible in parallel, but
compressing the big ones with multiple threads, which proved to be the
bottleneck in my setup.

> How many threads can your system run?

$ nproc
16

>
> I did not get such an improvement in my testing.
> In my machine $(nproc) is 24.
>
>
> [Without this patch]
>
> $ time make INSTALL_MOD_PATH=/tmp/inst1 modules_install -j$(nproc)
>
> real 0m33.965s
> user 10m6.118s
> sys 0m37.231s
>
> [With this patch]
>
> $ time make INSTALL_MOD_PATH=/tmp/inst1 modules_install -j$(nproc)
>
> real 0m32.568s
> user 10m4.472s
> sys 0m39.132s
>

I can see that my patch did not introduce performance regressions to
your setup, at least.

>
> Given that GNU Make provides the parallel execution environment,
> you can control the number of processes of 'xz'.
>
> There is no point in forcing multi-threading, which the user
> did not ask or ever want.
>

Should we drop -T0 from zstd then? It is currently forcing multi-threading.

Thanks,
André Almeida
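Whether one oversized module dominates the install, as described in the message above, can be checked by listing the largest `.ko` files in the tree. The helper below is a minimal sketch under stated assumptions: `largest_modules` is a hypothetical name, and the `-printf` action assumes GNU find.

```shell
# Hypothetical helper: list the N largest *.ko files under a directory,
# biggest first, as "<size in bytes><TAB><path>".
# $1: directory to scan (e.g. a kernel build tree)
# $2: how many entries to show (defaults to 5)
largest_modules() {
    # -printf is a GNU find extension; sort -rn orders by the leading size.
    find "$1" -type f -name '*.ko' -printf '%s\t%p\n' | sort -rn | head -n "${2:-5}"
}
```

Running `largest_modules . 5` from the top of the build tree would show whether a file like amdgpu.ko is far larger than the rest, which is the situation where a single-threaded xz leaves the other make jobs idle.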
On Fri, Feb 24, 2023 at 9:13 PM André Almeida <andrealmeid@igalia.com> wrote:
>
> Hi Masahiro,
>
> On 24/02/2023 02:38, Masahiro Yamada wrote:
> > On Thu, Feb 23, 2023 at 9:17 AM André Almeida <andrealmeid@igalia.com> wrote:
> >>
> >> As it's done for zstd compression, enable multithread compression for
> >> xz to speed up module installation.
> >>
> >> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> >> ---
> >>
> >> On my setup xz is a bottleneck during module installation. Here are the
> >> numbers to install it in a local directory, before and after this patch:
> >>
> >> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> >> Executed in 100.08 secs
> >>
> >> $ time make INSTALL_MOD_PATH=/home/tonyk/codes/.kernel_deploy/ modules_install -j16
> >> Executed in 28.60 secs
> >
> >
> > Heh, this is an interesting benchmark.
> >
> > Without this patch, you ran 16 processes of 'xz' in parallel
> > since you gave -j16.
> >
> > You created multi-threads in each xz process, then you got 3x faster.
> > What made it happen?
> >
>
> During the modules installation in my setup, the build system would
> spend most of its time compressing big modules (such as the 350M
> amdgpu.ko) in a single thread, with 15 idle threads. Enabling
> multithreading allowed amdgpu to be compressed really fast.

It is a corner case, isn't it?

amdgpu.ko appears early in modules.order.
In most use-cases, other *.ko will fill the idle threads.

xz(1) says

  Setting threads to a special value 0 makes xz use up to as many
  threads as the processor(s) on the system support.

So, 'make -j$(nproc) modules_install' will have
(nproc * nproc) threads at maximum.

Of course, this is a theoretical calculation.
The actual number of spawned threads will be much less,
but spawning too many threads may not be nice.

For your case, Nathan's suggestion will do.

>
> The real performance improvement during module compression is not
> compressing as many small modules as possible in parallel, but
> compressing the big ones with multiple threads, which proved to be the
> bottleneck in my setup.
>
> > How many threads can your system run?
>
> $ nproc
> 16
>
> >
> > I did not get such an improvement in my testing.
> > In my machine $(nproc) is 24.
> >
> >
> > [Without this patch]
> >
> > $ time make INSTALL_MOD_PATH=/tmp/inst1 modules_install -j$(nproc)
> >
> > real 0m33.965s
> > user 10m6.118s
> > sys 0m37.231s
> >
> > [With this patch]
> >
> > $ time make INSTALL_MOD_PATH=/tmp/inst1 modules_install -j$(nproc)
> >
> > real 0m32.568s
> > user 10m4.472s
> > sys 0m39.132s
> >
>
> I can see that my patch did not introduce performance regressions to
> your setup, at least.
>
> >
> > Given that GNU Make provides the parallel execution environment,
> > you can control the number of processes of 'xz'.
> >
> > There is no point in forcing multi-threading, which the user
> > did not ask or ever want.
> >
>
> Should we drop -T0 from zstd then? It is currently forcing multi-threading.

I think yes.

--
Best Regards
Masahiro Yamada
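The theoretical (nproc * nproc) upper bound from the reply above is simple arithmetic, sketched here for the two machines mentioned in the thread. The function name `max_xz_threads` is hypothetical, and the figure is only the worst case in which every make job runs xz with -T0 at the same time.

```shell
# Hypothetical helper: worst-case number of xz threads when every make job
# runs one xz process with -T0.
# $1: number of parallel make jobs (-j)
# $2: number of CPUs each xz process sees
max_xz_threads() {
    echo $(( $1 * $2 ))
}

max_xz_threads 16 16   # -j16 on the 16-CPU machine, prints 256
max_xz_threads 24 24   # -j24 on the 24-CPU machine, prints 576
```

As the reply notes, the number of threads actually spawned should be much lower, since most modules are small.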
diff --git a/scripts/Makefile.modinst b/scripts/Makefile.modinst
index 4815a8e32227..28dcc523d2ee 100644
--- a/scripts/Makefile.modinst
+++ b/scripts/Makefile.modinst
@@ -99,7 +99,7 @@ endif
 quiet_cmd_gzip = GZIP $@
       cmd_gzip = $(KGZIP) -n -f $<
 quiet_cmd_xz = XZ $@
-      cmd_xz = $(XZ) --lzma2=dict=2MiB -f $<
+      cmd_xz = $(XZ) --lzma2=dict=2MiB -f -T0 $<
 quiet_cmd_zstd = ZSTD $@
       cmd_zstd = $(ZSTD) -T0 --rm -f -q $<