| Message ID | 20230121033942.350387-1-42.hyeyoo@gmail.com |
|---|---|
| State | New |

Headers:
From: Hyeonggon Yoo <42.hyeyoo@gmail.com>
To: Andrew Morton <akpm@linux-foundation.org>, Christoph Lameter <cl@linux.com>,
    Pekka Enberg <penberg@kernel.org>, David Rientjes <rientjes@google.com>,
    Joonsoo Kim <iamjoonsoo.kim@lge.com>, Vlastimil Babka <vbabka@suse.cz>,
    Roman Gushchin <roman.gushchin@linux.dev>
Cc: Ingo Molnar <mingo@redhat.com>, Johannes Weiner <hannes@cmpxchg.org>,
    Michal Hocko <mhocko@kernel.org>, Shakeel Butt <shakeelb@google.com>,
    Muchun Song <muchun.song@linux.dev>, Matthew Wilcox <willy@infradead.org>,
    Hyeonggon Yoo <42.hyeyoo@gmail.com>, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Subject: [PATCH mm-unstable] lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default
Date: Sat, 21 Jan 2023 12:39:42 +0900
Message-Id: <20230121033942.350387-1-42.hyeyoo@gmail.com>
Series: [mm-unstable] lib/Kconfig.debug: do not enable DEBUG_PREEMPT by default
Commit Message
Hyeonggon Yoo
Jan. 21, 2023, 3:39 a.m. UTC
In workloads where this_cpu operations are performed frequently,
enabling DEBUG_PREEMPT may result in a significant increase in runtime
overhead due to the frequent invocation of the
__this_cpu_preempt_check() function.

This can be demonstrated with benchmarks such as hackbench, where this
configuration results in a roughly 10% reduction in performance,
primarily due to the added overhead in the memcg charging path.

Therefore, do not enable DEBUG_PREEMPT by default, and make users aware
of its potential impact on performance in some workloads.
hackbench-process-sockets
                 debug_preempt        no_debug_preempt
Amean      1      0.4743 (  0.00%)     0.4295 *  9.45%*
Amean      4      1.4191 (  0.00%)     1.2650 * 10.86%*
Amean      7      2.2677 (  0.00%)     2.0094 * 11.39%*
Amean     12      3.6821 (  0.00%)     3.2115 * 12.78%*
Amean     21      6.6752 (  0.00%)     5.7956 * 13.18%*
Amean     30      9.6646 (  0.00%)     8.5197 * 11.85%*
Amean     48     15.3363 (  0.00%)    13.5559 * 11.61%*
Amean     79     24.8603 (  0.00%)    22.0597 * 11.27%*
Amean     96     30.1240 (  0.00%)    26.8073 * 11.01%*
Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>
---
lib/Kconfig.debug | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
Comments
On 1/21/23 04:39, Hyeonggon Yoo wrote:
> In workloads where this_cpu operations are frequently performed,
> enabling DEBUG_PREEMPT may result in significant increase in
> runtime overhead due to frequent invocation of
> __this_cpu_preempt_check() function.
[...]
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

Looks like it's there since the beginning of preempt and pre-git. But
probably should be something for scheduler maintainers rather than mm/slab,
even if the impact manifests there. You did Cc Ingo (the original author) so
let me Cc the rest here.

> ---
>  lib/Kconfig.debug | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
[...]
On Sat, Jan 21, 2023 at 12:29:44PM +0100, Vlastimil Babka wrote:
> On 1/21/23 04:39, Hyeonggon Yoo wrote:
[...]
> Looks like it's there since the beginning of preempt and pre-git. But
> probably should be something for scheduler maintainers rather than mm/slab,
> even if the impact manifests there. You did Cc Ingo (the original author) so
> let me Cc the rest here.

Whew, I still get confused about who to Cc, thanks for adding them.
I also didn't include the percpu memory allocator maintainers, who may
have an opinion, so let's add them too.
On Sat 21-01-23 20:54:15, Hyeonggon Yoo wrote:
> On Sat, Jan 21, 2023 at 12:29:44PM +0100, Vlastimil Babka wrote:
> > On 1/21/23 04:39, Hyeonggon Yoo wrote:
> > > In workloads where this_cpu operations are frequently performed,
> > > enabling DEBUG_PREEMPT may result in significant increase in
> > > runtime overhead due to frequent invocation of
> > > __this_cpu_preempt_check() function.
[...]
> > > Amean 96     30.1240 (  0.00%)    26.8073 * 11.01%*

Do you happen to have any perf data collected during those runs? I
would be interested in the memcg side of things. Maybe we can do
something better there.
On Sat, 21 Jan 2023, Hyeonggon Yoo wrote:

> Whew, I still get confused about who to Cc, thanks for adding them.
> and I also didn't include the percpu memory allocator maintainers, who may
> have opinion. let's add them too.

Well looks ok to me.

However, I thought most distro kernels disable PREEMPT anyways for
performance reasons? So DEBUG_PREEMPT should be off as well. I guess that
is why this has not been an issue so far.
Adding Peter to the cc as this should go via the tip tree even though
Ingo is cc'd already. Leaving full context and responding inline.

On Sat, Jan 21, 2023 at 12:39:42PM +0900, Hyeonggon Yoo wrote:
> In workloads where this_cpu operations are frequently performed,
> enabling DEBUG_PREEMPT may result in significant increase in
> runtime overhead due to frequent invocation of
> __this_cpu_preempt_check() function.
>
> This can be demonstrated through benchmarks such as hackbench where this
> configuration results in a 10% reduction in performance, primarily due to
> the added overhead within memcg charging path.
>
> Therefore, do not to enable DEBUG_PREEMPT by default and make users aware
> of its potential impact on performance in some workloads.
>
> hackbench-process-sockets
>                  debug_preempt        no_debug_preempt
> Amean      1      0.4743 (  0.00%)     0.4295 *  9.45%*
> Amean      4      1.4191 (  0.00%)     1.2650 * 10.86%*
> Amean      7      2.2677 (  0.00%)     2.0094 * 11.39%*
> Amean     12      3.6821 (  0.00%)     3.2115 * 12.78%*
> Amean     21      6.6752 (  0.00%)     5.7956 * 13.18%*
> Amean     30      9.6646 (  0.00%)     8.5197 * 11.85%*
> Amean     48     15.3363 (  0.00%)    13.5559 * 11.61%*
> Amean     79     24.8603 (  0.00%)    22.0597 * 11.27%*
> Amean     96     30.1240 (  0.00%)    26.8073 * 11.01%*
>
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>

This has been default y since very early on in the development of the
BKL removal. It was probably selected by default because it was expected
there would be a bunch of new SMP-related bugs. These days, there is no
real reason to enable it by default except when debugging a
preempt-related issue or during development. It's not like
CONFIG_SCHED_DEBUG which gets enabled in a lot of distros as it has some
features which are useful in production (which is unfortunate but
splitting CONFIG_SCHED_DEBUG is a completely separate topic).
> ---
>  lib/Kconfig.debug | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
[...]
> --
> 2.34.1
On Mon, Jan 23, 2023 at 12:05:00PM +0100, Christoph Lameter wrote:
> On Sat, 21 Jan 2023, Hyeonggon Yoo wrote:
>
> > Whew, I still get confused about who to Cc, thanks for adding them.
> > and I also didn't include the percpu memory allocator maintainers, who may
> > have opinion. let's add them too.
>
> Well looks ok to me.

Thanks for looking at it!

> However, I thought most distro kernels disable PREEMPT anyways for
> performance reasons? So DEBUG_PREEMPT should be off as well. I guess that
> is why this has not been an issue so far.

It depends on PREEMPTION, and PREEMPT_DYNAMIC ("Preemption behaviour
defined on boot") selects PREEMPTION even if I end up using
PREEMPT_VOLUNTARY. Not many distros use DEBUG_PREEMPT, but I am
occasionally hit by this because Debian and Fedora enabled it :)
On Thu 26-01-23 00:41:15, Hyeonggon Yoo wrote:
[...]
> > Do you happen to have any perf data collected during those runs? I
> > would be interested in the memcg side of things. Maybe we can do
> > something better there.
>
> Yes, below is performance data I've collected.
>
> 6.1.8-debug-preempt-dirty
> =========================
> Overhead  Command    Shared Object     Symbol
> +  9.14%  hackbench  [kernel.vmlinux]  [k] check_preemption_disabled

Thanks! Could you just add callers that are showing in the profile for
this call please?
On Mon, Jan 23, 2023 at 09:58:30AM +0100, Michal Hocko wrote:
> On Sat 21-01-23 20:54:15, Hyeonggon Yoo wrote:
> > On Sat, Jan 21, 2023 at 12:29:44PM +0100, Vlastimil Babka wrote:
> > > On 1/21/23 04:39, Hyeonggon Yoo wrote:
[...]

Hello Michal, thanks for looking at this.

> Do you happen to have any perf data collected during those runs? I
> would be interested in the memcg side of things. Maybe we can do
> something better there.

Yes, below is the performance data I've collected.

6.1.8-debug-preempt-dirty
=========================
Overhead  Command    Shared Object     Symbol
+  9.14%  hackbench  [kernel.vmlinux]  [k] check_preemption_disabled
+  7.33%  hackbench  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
+  7.32%  hackbench  [kernel.vmlinux]  [k] mod_objcg_state
   3.55%  hackbench  [kernel.vmlinux]  [k] refill_obj_stock
   3.39%  hackbench  [kernel.vmlinux]  [k] debug_smp_processor_id
   2.97%  hackbench  [kernel.vmlinux]  [k] memset_erms
   2.55%  hackbench  [kernel.vmlinux]  [k] __check_object_size
+  2.36%  hackbench  [kernel.vmlinux]  [k] native_queued_spin_lock_slowpath
   1.76%  hackbench  [kernel.vmlinux]  [k] unix_stream_read_generic
   1.64%  hackbench  [kernel.vmlinux]  [k] __slab_free
   1.58%  hackbench  [kernel.vmlinux]  [k] unix_stream_sendmsg
   1.46%  hackbench  [kernel.vmlinux]  [k] memcg_slab_post_alloc_hook
   1.35%  hackbench  [kernel.vmlinux]  [k] vfs_write
   1.33%  hackbench  [kernel.vmlinux]  [k] vfs_read
   1.28%  hackbench  [kernel.vmlinux]  [k] __alloc_skb
   1.18%  hackbench  [kernel.vmlinux]  [k] sock_read_iter
   1.16%  hackbench  [kernel.vmlinux]  [k] obj_cgroup_charge
   1.16%  hackbench  [kernel.vmlinux]  [k] entry_SYSCALL_64
   1.14%  hackbench  [kernel.vmlinux]  [k] sock_write_iter
   1.12%  hackbench  [kernel.vmlinux]  [k] skb_release_data
   1.08%  hackbench  [kernel.vmlinux]  [k] sock_wfree
   1.07%  hackbench  [kernel.vmlinux]  [k] cache_from_obj
   0.96%  hackbench  [kernel.vmlinux]  [k] unix_destruct_scm
   0.95%  hackbench  [kernel.vmlinux]  [k] kmem_cache_free
   0.94%  hackbench  [kernel.vmlinux]  [k] __kmem_cache_alloc_node
   0.92%  hackbench  [kernel.vmlinux]  [k] kmem_cache_alloc_node
   0.89%  hackbench  [kernel.vmlinux]  [k] _raw_spin_lock_irqsave
   0.84%  hackbench  [kernel.vmlinux]  [k] __x86_indirect_thunk_array
   0.84%  hackbench  libc.so.6         [.] write
   0.81%  hackbench  [kernel.vmlinux]  [k] exit_to_user_mode_prepare
   0.76%  hackbench  libc.so.6         [.] read
   0.75%  hackbench  [kernel.vmlinux]  [k] syscall_trace_enter.constprop.0
   0.75%  hackbench  [kernel.vmlinux]  [k] preempt_count_add
   0.74%  hackbench  [kernel.vmlinux]  [k] cmpxchg_double_slab.constprop.0.isra.0
   0.69%  hackbench  [kernel.vmlinux]  [k] get_partial_node
   0.69%  hackbench  [kernel.vmlinux]  [k] __virt_addr_valid
   0.69%  hackbench  [kernel.vmlinux]  [k] __rcu_read_unlock
   0.65%  hackbench  [kernel.vmlinux]  [k] get_obj_cgroup_from_current
   0.63%  hackbench  [kernel.vmlinux]  [k] __kmem_cache_free
   0.62%  hackbench  [kernel.vmlinux]  [k] entry_SYSRETQ_unsafe_stack
   0.60%  hackbench  [kernel.vmlinux]  [k] __rcu_read_lock
   0.59%  hackbench  [kernel.vmlinux]  [k] syscall_exit_to_user_mode_prepare
   0.54%  hackbench  [kernel.vmlinux]  [k] __unfreeze_partials
   0.53%  hackbench  [kernel.vmlinux]  [k] check_stack_object
   0.52%  hackbench  [kernel.vmlinux]  [k] entry_SYSCALL_64_after_hwframe
   0.51%  hackbench  [kernel.vmlinux]  [k] security_file_permission
   0.50%  hackbench  [kernel.vmlinux]  [k] __x64_sys_write
   0.49%  hackbench  [kernel.vmlinux]  [k] bpf_lsm_file_permission
   0.48%  hackbench  [kernel.vmlinux]  [k] ___slab_alloc
   0.46%  hackbench  [kernel.vmlinux]  [k] __check_heap_object

and attached flamegraph-6.1.8-debug-preempt-dirty.svg.

6.1.8 (no debug preempt)
========================
Overhead  Command    Shared Object     Symbol
+ 10.96%  hackbench  [kernel.vmlinux]  [k] mod_objcg_state
+  8.16%  hackbench  [kernel.vmlinux]  [k] copy_user_enhanced_fast_string
   3.29%  hackbench  [kernel.vmlinux]  [k] memset_erms
   3.07%  hackbench  [kernel.vmlinux]  [k] __slab_free
   2.89%  hackbench  [kernel.vmlinux]  [k] refill_obj_stock
   2.82%  hackbench  [kernel.vmlinux]  [k] __check_object_size
+  2.72%  hackbench  [kernel.vmlinux]  [k] native_queued_spin_lock_slowpath
   1.96%  hackbench  [kernel.vmlinux]  [k] __x86_indirect_thunk_rax
   1.88%  hackbench  [kernel.vmlinux]  [k] memcg_slab_post_alloc_hook
   1.69%  hackbench  [kernel.vmlinux]  [k] __rcu_read_unlock
   1.54%  hackbench  [kernel.vmlinux]  [k] __alloc_skb
   1.53%  hackbench  [kernel.vmlinux]  [k] unix_stream_sendmsg
   1.46%  hackbench  [kernel.vmlinux]  [k] kmem_cache_free
   1.44%  hackbench  [kernel.vmlinux]  [k] vfs_write
   1.43%  hackbench  [kernel.vmlinux]  [k] vfs_read
   1.33%  hackbench  [kernel.vmlinux]  [k] unix_stream_read_generic
   1.31%  hackbench  [kernel.vmlinux]  [k] sock_write_iter
   1.27%  hackbench  [kernel.vmlinux]  [k] kmalloc_slab
   1.22%  hackbench  [kernel.vmlinux]  [k] __rcu_read_lock
   1.20%  hackbench  [kernel.vmlinux]  [k] sock_read_iter
   1.18%  hackbench  [kernel.vmlinux]  [k] __entry_text_start
   1.15%  hackbench  [kernel.vmlinux]  [k] kmem_cache_alloc_node
   1.12%  hackbench  [kernel.vmlinux]  [k] unix_stream_recvmsg
   1.10%  hackbench  [kernel.vmlinux]  [k] obj_cgroup_charge
   0.98%  hackbench  [kernel.vmlinux]  [k] __kmem_cache_alloc_node
   0.97%  hackbench  libc.so.6         [.] write
   0.91%  hackbench  [kernel.vmlinux]  [k] exit_to_user_mode_prepare
   0.88%  hackbench  [kernel.vmlinux]  [k] __kmem_cache_free
   0.87%  hackbench  [kernel.vmlinux]  [k] syscall_trace_enter.constprop.0
   0.86%  hackbench  [kernel.vmlinux]  [k] __kmalloc_node_track_caller
   0.84%  hackbench  libc.so.6         [.] read
   0.81%  hackbench  [kernel.vmlinux]  [k] __lock_text_start
   0.80%  hackbench  [kernel.vmlinux]  [k] cache_from_obj
   0.74%  hackbench  [kernel.vmlinux]  [k] get_obj_cgroup_from_current
   0.73%  hackbench  [kernel.vmlinux]  [k] entry_SYSRETQ_unsafe_stack
   0.72%  hackbench  [kernel.vmlinux]  [k] unix_destruct_scm
   0.70%  hackbench  [kernel.vmlinux]  [k] get_partial_node
   0.69%  hackbench  [kernel.vmlinux]  [k] syscall_exit_to_user_mode_prepare
   0.65%  hackbench  [kernel.vmlinux]  [k] kfree
   0.63%  hackbench  [kernel.vmlinux]  [k] __unfreeze_partials
   0.60%  hackbench  [kernel.vmlinux]  [k] cmpxchg_double_slab.constprop.0.isra.0
   0.58%  hackbench  [kernel.vmlinux]  [k] skb_release_data
   0.56%  hackbench  [kernel.vmlinux]  [k] __virt_addr_valid
   0.56%  hackbench  [kernel.vmlinux]  [k] entry_SYSCALL_64_after_hwframe
   0.56%  hackbench  [kernel.vmlinux]  [k] __check_heap_object
   0.55%  hackbench  [kernel.vmlinux]  [k] sock_wfree
   0.54%  hackbench  [kernel.vmlinux]  [k] __audit_syscall_entry
   0.53%  hackbench  [kernel.vmlinux]  [k] ___slab_alloc
   0.53%  hackbench  [kernel.vmlinux]  [k] check_stack_object
   0.52%  hackbench  [kernel.vmlinux]  [k] bpf_lsm_file_permission

and attached flamegraph-6.1.8.svg.

If you need more information, feel free to ask.

--
Thanks,
Hyeonggon
On Sat, Jan 21, 2023 at 12:39:42PM +0900, Hyeonggon Yoo wrote:
> In workloads where this_cpu operations are frequently performed,
> enabling DEBUG_PREEMPT may result in significant increase in
> runtime overhead due to frequent invocation of
> __this_cpu_preempt_check() function.
[...]
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

Nice!

I checked out my very simple kmem performance test (1M 8-byte
allocations) and it shows a ~30% difference: 112319 us with vs
80836 us without.

Probably not that big for real workloads, but still nice to have.

Acked-by: Roman Gushchin <roman.gushchin@linux.dev>

Thank you!
On Wed, Jan 25, 2023 at 10:51:05AM +0100, Michal Hocko wrote:
> On Thu 26-01-23 00:41:15, Hyeonggon Yoo wrote:
> [...]
> > > Do you happen to have any perf data collected during those runs? I
> > > would be interested in the memcg side of things. Maybe we can do
> > > something better there.
> >
> > Yes, below is performance data I've collected.
> >
> > 6.1.8-debug-preempt-dirty
> > =========================
> > Overhead  Command    Shared Object     Symbol
> > +  9.14%  hackbench  [kernel.vmlinux]  [k] check_preemption_disabled
>
> Thanks! Could you just add callers that are showing in the profile for
> this call please?

- 14.56%  9.14%  hackbench  [kernel.vmlinux]  [k] check_preemption_disabled
   - 6.37% check_preemption_disabled
      + 3.48% mod_objcg_state
      + 1.10% obj_cgroup_charge
        1.02% refill_obj_stock
        0.67% memcg_slab_post_alloc_hook
        0.58% mod_objcg_state

According to perf, many memcg functions call this function, and that's
because __this_cpu_xxxx checks whether preemption is disabled.

In include/linux/percpu-defs.h:

/*
 * Operations for contexts that are safe from preemption/interrupts. These
 * operations verify that preemption is disabled.
 */
#define __this_cpu_read(pcp)						\
({									\
	__this_cpu_preempt_check("read");				\
	raw_cpu_read(pcp);						\
})

#define __this_cpu_write(pcp, val)					\
({									\
	__this_cpu_preempt_check("write");				\
	raw_cpu_write(pcp, val);					\
})

#define __this_cpu_add(pcp, val)					\
({									\
	__this_cpu_preempt_check("add");				\
	raw_cpu_add(pcp, val);						\
})

In lib/smp_processor_id.c:

noinstr void __this_cpu_preempt_check(const char *op)
{
	check_preemption_disabled("__this_cpu_", op);
}
EXPORT_SYMBOL(__this_cpu_preempt_check);
On Wed, Jan 25, 2023 at 06:02:04PM -0800, Roman Gushchin wrote:
> On Sat, Jan 21, 2023 at 12:39:42PM +0900, Hyeonggon Yoo wrote:
[...]
> Nice!
>
> I checkout my very simple kmem performance test (1M allocations 8-bytes allocations)
> and it shows ~30% difference: 112319 us with vs 80836 us without.

Hello Roman,

Oh, it has a higher impact on the micro benchmark.

> Probably not that big for real workloads, but still nice to have.
>
> Acked-by: Roman Gushchin <roman.gushchin@linux.dev>

Thank you for kindly measuring the impact of this patch and giving your ack!

--
Thanks,
Hyeonggon
On Fri 27-01-23 20:43:20, Hyeonggon Yoo wrote:
> On Wed, Jan 25, 2023 at 10:51:05AM +0100, Michal Hocko wrote:
> > Thanks! Could you just add callers that are showing in the profile for
> > this call please?
>
> - 14.56%  9.14%  hackbench  [kernel.vmlinux]  [k] check_preemption_disabled
>    - 6.37% check_preemption_disabled
>       + 3.48% mod_objcg_state
>       + 1.10% obj_cgroup_charge
>         1.02% refill_obj_stock
>         0.67% memcg_slab_post_alloc_hook
>         0.58% mod_objcg_state
>
> According to perf, many memcg functions call this function
> and that's because __this_cpu_xxxx checks if preemption is disabled.

OK, I see. Thanks! I was thinking about whether we can optimize for
that, but IIUC __this_cpu* is already an optimized form. mod_objcg_state
is already called with a local_lock held, so raw_cpu* could be used in
that path, but I guess it is not really worth optimizing just for the
benefit of a debug compile option.
On Sat, 21 Jan 2023, Hyeonggon Yoo wrote:

> In workloads where this_cpu operations are frequently performed,
> enabling DEBUG_PREEMPT may result in significant increase in
> runtime overhead due to frequent invocation of
> __this_cpu_preempt_check() function.
[...]
> Signed-off-by: Hyeonggon Yoo <42.hyeyoo@gmail.com>

Acked-by: Davidlohr Bueso <dave@stgolabs.net>
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ddbfac2adf9c..f6f845a4b9ec 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1176,13 +1176,16 @@ config DEBUG_TIMEKEEPING
 config DEBUG_PREEMPT
 	bool "Debug preemptible kernel"
 	depends on DEBUG_KERNEL && PREEMPTION && TRACE_IRQFLAGS_SUPPORT
-	default y
 	help
 	  If you say Y here then the kernel will use a debug variant of the
 	  commonly used smp_processor_id() function and will print warnings
 	  if kernel code uses it in a preemption-unsafe way. Also, the kernel
 	  will detect preemption count underflows.

+	  This option has potential to introduce high runtime overhead,
+	  depending on workload as it triggers debugging routines for each
+	  this_cpu operation. It should only be used for debugging purposes.
+
 menu "Lock Debugging (spinlocks, mutexes, etc...)"

 config LOCK_DEBUGGING_SUPPORT