Message ID | ZaqbP0QmVPAQTbYA@tpad |
---|---|
State | New |
Headers |
Date: Fri, 19 Jan 2024 12:54:39 -0300
From: Marcelo Tosatti <mtosatti@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>, Joe Mario <jmario@redhat.com>, Juri Lelli <juri.lelli@redhat.com>, linux-kernel@vger.kernel.org
Subject: [PATCH] mark power efficient workqueue as unbounded if nohz_full enabled
Message-ID: <ZaqbP0QmVPAQTbYA@tpad> |
Series |
mark power efficient workqueue as unbounded if nohz_full enabled
|
|
Commit Message
Marcelo Tosatti
Jan. 19, 2024, 3:54 p.m. UTC
A customer using nohz_full has experienced the following interruption:
oslat-1004510 [018] timer_cancel: timer=0xffff90a7ca663cf8
oslat-1004510 [018] timer_expire_entry: timer=0xffff90a7ca663cf8 function=delayed_work_timer_fn now=4709188240 baseclk=4709188240
oslat-1004510 [018] workqueue_queue_work: work struct=0xffff90a7ca663cd8 function=fb_flashcursor workqueue=events_power_efficient req_cpu=8192 cpu=18
oslat-1004510 [018] workqueue_activate_work: work struct 0xffff90a7ca663cd8
oslat-1004510 [018] sched_wakeup: kworker/18:1:326 [120] CPU:018
oslat-1004510 [018] timer_expire_exit: timer=0xffff90a7ca663cf8
oslat-1004510 [018] irq_work_entry: vector=246
oslat-1004510 [018] irq_work_exit: vector=246
oslat-1004510 [018] tick_stop: success=0 dependency=SCHED
oslat-1004510 [018] hrtimer_start: hrtimer=0xffff90a70009cb00 function=tick_sched_timer/0x0 ...
oslat-1004510 [018] softirq_exit: vec=1 [action=TIMER]
oslat-1004510 [018] softirq_entry: vec=7 [action=SCHED]
oslat-1004510 [018] softirq_exit: vec=7 [action=SCHED]
oslat-1004510 [018] tick_stop: success=0 dependency=SCHED
oslat-1004510 [018] sched_switch: oslat:1004510 [120] R ==> kworker/18:1:326 [120]
kworker/18:1-326 [018] workqueue_execute_start: work struct 0xffff90a7ca663cd8: function fb_flashcursor
kworker/18:1-326 [018] workqueue_queue_work: work struct=0xffff9078f119eed0 function=drm_fb_helper_damage_work workqueue=events req_cpu=8192 cpu=18
kworker/18:1-326 [018] workqueue_activate_work: work struct 0xffff9078f119eed0
kworker/18:1-326 [018] timer_start: timer=0xffff90a7ca663cf8 function=delayed_work_timer_fn ...
Set wq_power_efficient to true, in case nohz_full is enabled.
This makes the power efficient workqueue be unbounded, which allows
workqueue items there to be moved to HK CPUs.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Comments
On Fri, Jan 19, 2024 at 12:54:39PM -0300, Marcelo Tosatti wrote:
>
> A customer using nohz_full has experienced the following interruption:
>
> oslat-1004510 [018] timer_cancel: timer=0xffff90a7ca663cf8
> oslat-1004510 [018] timer_expire_entry: timer=0xffff90a7ca663cf8 function=delayed_work_timer_fn now=4709188240 baseclk=4709188240
> oslat-1004510 [018] workqueue_queue_work: work struct=0xffff90a7ca663cd8 function=fb_flashcursor workqueue=events_power_efficient req_cpu=8192 cpu=18
> oslat-1004510 [018] workqueue_activate_work: work struct 0xffff90a7ca663cd8
> oslat-1004510 [018] sched_wakeup: kworker/18:1:326 [120] CPU:018
> oslat-1004510 [018] timer_expire_exit: timer=0xffff90a7ca663cf8
> oslat-1004510 [018] irq_work_entry: vector=246
> oslat-1004510 [018] irq_work_exit: vector=246
> oslat-1004510 [018] tick_stop: success=0 dependency=SCHED
> oslat-1004510 [018] hrtimer_start: hrtimer=0xffff90a70009cb00 function=tick_sched_timer/0x0 ...
> oslat-1004510 [018] softirq_exit: vec=1 [action=TIMER]
> oslat-1004510 [018] softirq_entry: vec=7 [action=SCHED]
> oslat-1004510 [018] softirq_exit: vec=7 [action=SCHED]
> oslat-1004510 [018] tick_stop: success=0 dependency=SCHED
> oslat-1004510 [018] sched_switch: oslat:1004510 [120] R ==> kworker/18:1:326 [120]
> kworker/18:1-326 [018] workqueue_execute_start: work struct 0xffff90a7ca663cd8: function fb_flashcursor
> kworker/18:1-326 [018] workqueue_queue_work: work struct=0xffff9078f119eed0 function=drm_fb_helper_damage_work workqueue=events req_cpu=8192 cpu=18
> kworker/18:1-326 [018] workqueue_activate_work: work struct 0xffff9078f119eed0
> kworker/18:1-326 [018] timer_start: timer=0xffff90a7ca663cf8 function=delayed_work_timer_fn ...
>
> Set wq_power_efficient to true, in case nohz_full is enabled.
> This makes the power efficient workqueue be unbounded, which allows
> workqueue items there to be moved to HK CPUs.
>
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>

Applied to wq/for-6.9.

A side note: with the recent affinity improvements to unbound workqueues, I
wonder whether we'd be able to drop wq_power_efficient and just use
system_unbound_wq instead without noticeable perf difference.

Thanks.
On Fri, Jan 19, 2024 at 01:57:48PM -1000, Tejun Heo wrote:
> On Fri, Jan 19, 2024 at 12:54:39PM -0300, Marcelo Tosatti wrote:
> >
> > A customer using nohz_full has experienced the following interruption:
> >
> > oslat-1004510 [018] timer_cancel: timer=0xffff90a7ca663cf8
> > oslat-1004510 [018] timer_expire_entry: timer=0xffff90a7ca663cf8 function=delayed_work_timer_fn now=4709188240 baseclk=4709188240
> > oslat-1004510 [018] workqueue_queue_work: work struct=0xffff90a7ca663cd8 function=fb_flashcursor workqueue=events_power_efficient req_cpu=8192 cpu=18
> > oslat-1004510 [018] workqueue_activate_work: work struct 0xffff90a7ca663cd8
> > oslat-1004510 [018] sched_wakeup: kworker/18:1:326 [120] CPU:018
> > oslat-1004510 [018] timer_expire_exit: timer=0xffff90a7ca663cf8
> > oslat-1004510 [018] irq_work_entry: vector=246
> > oslat-1004510 [018] irq_work_exit: vector=246
> > oslat-1004510 [018] tick_stop: success=0 dependency=SCHED
> > oslat-1004510 [018] hrtimer_start: hrtimer=0xffff90a70009cb00 function=tick_sched_timer/0x0 ...
> > oslat-1004510 [018] softirq_exit: vec=1 [action=TIMER]
> > oslat-1004510 [018] softirq_entry: vec=7 [action=SCHED]
> > oslat-1004510 [018] softirq_exit: vec=7 [action=SCHED]
> > oslat-1004510 [018] tick_stop: success=0 dependency=SCHED
> > oslat-1004510 [018] sched_switch: oslat:1004510 [120] R ==> kworker/18:1:326 [120]
> > kworker/18:1-326 [018] workqueue_execute_start: work struct 0xffff90a7ca663cd8: function fb_flashcursor
> > kworker/18:1-326 [018] workqueue_queue_work: work struct=0xffff9078f119eed0 function=drm_fb_helper_damage_work workqueue=events req_cpu=8192 cpu=18
> > kworker/18:1-326 [018] workqueue_activate_work: work struct 0xffff9078f119eed0
> > kworker/18:1-326 [018] timer_start: timer=0xffff90a7ca663cf8 function=delayed_work_timer_fn ...
> >
> > Set wq_power_efficient to true, in case nohz_full is enabled.
> > This makes the power efficient workqueue be unbounded, which allows
> > workqueue items there to be moved to HK CPUs.
> >
> > Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
>
> Applied to wq/for-6.9.
>
> A side note: with the recent affinity improvements to unbound workqueues, I
> wonder whether we'd be able to drop wq_power_efficient and just use
> system_unbound_wq instead without noticeable perf difference.
>
> Thanks.
>
> --
> tejun

Tejun,

About the performance difference (of running locally VS running
remotely), can you list a few performance sensitive work queues
(where per-CPU execution makes a significant difference).

Because I suppose it would be safe (from a performance regression
perspective) to move all delayed works to housekeeping CPUs.

And also, being more extreme, why not an option to mark all workqueues
as unbounded (or perhaps userspace control of bounding, even for
workqueues marked as "per-CPU").

Thanks.
Hello, Marcelo.

On Mon, Jan 22, 2024 at 11:22:10AM -0300, Marcelo Tosatti wrote:
> About the performance difference (of running locally VS running
> remotely), can you list a few performance sensitive work queues
> (where per-CPU execution makes a significant difference).

Unfortunately, I have no idea. It goes way back and I'm not sure anyone
actually tested the difference in a long time. We'd have to dig through
history to gather some context, set up a benchmark which exercises the path
heavily and see whether the difference is still there.

> Because I suppose it would be safe (from a performance regression
> perspective) to move all delayed works to housekeeping CPUs.

Yeah, replacing power_efficient with unbound should be safe.

> And also, being more extreme, why not an option to mark all workqueues
> as unbounded (or perhaps userspace control of bounding, even for
> workqueues marked as "per-CPU").

There are correctness issues with per-cpu workqueues - e.g. accessing local
atomic counters, cpu states and what not. Also, many per-cpu users already
know that the cpu is hot as they're queueing on the local CPU. I'm not
against moving more users towards unbound workqueues but that'd have to be
done case by case unfortunately.

Thanks.
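To make the correctness concern with per-cpu workqueues concrete, here is a small userspace model in C. It is only a sketch of one possible failure mode, not kernel code: every name in it (model_pcpu_events, model_pick_cpu, model_account_work, model_events_on, model_pcpu_demo) is hypothetical, and the rule "unbound work always runs on the housekeeping CPU" is a deliberate simplification. The work item accounts an event against whichever CPU executes it, the way code using per-CPU data typically does.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4

/* Hypothetical per-CPU event counters, one slot per CPU. */
static long model_pcpu_events[MODEL_NR_CPUS];

/*
 * Simplified placement rule (an assumption for this sketch): a per-cpu
 * work item runs on the CPU that queued it, an unbound one is moved to
 * a housekeeping CPU.
 */
int model_pick_cpu(int queueing_cpu, bool unbound, int housekeeping_cpu)
{
	return unbound ? housekeeping_cpu : queueing_cpu;
}

/* Work body: touches the data of whichever CPU it happens to run on. */
void model_account_work(int executing_cpu)
{
	model_pcpu_events[executing_cpu]++;
}

long model_events_on(int cpu)
{
	return model_pcpu_events[cpu];
}

void model_pcpu_demo(void)
{
	const int isolated_cpu = 3, housekeeping_cpu = 0;

	/* Per-cpu queueing: the event lands on the CPU that queued it. */
	model_account_work(model_pick_cpu(isolated_cpu, false, housekeeping_cpu));
	assert(model_events_on(isolated_cpu) == 1);

	/*
	 * Same work forced unbound: it runs on the housekeeping CPU, so the
	 * event is accounted against a CPU that did not generate it.
	 */
	model_account_work(model_pick_cpu(isolated_cpu, true, housekeeping_cpu));
	assert(model_events_on(isolated_cpu) == 1);
	assert(model_events_on(housekeeping_cpu) == 1);
}
```

In this model, flipping a per-cpu user to unbound silently changes which CPU's state gets modified, which is one reason each user would have to be audited before conversion.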
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 76e60faed892..45b3a63954a9 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -6630,6 +6630,13 @@ void __init workqueue_init_early(void)
 	wq_update_pod_attrs_buf = alloc_workqueue_attrs();
 	BUG_ON(!wq_update_pod_attrs_buf);
 
+	/*
+	 * If nohz_full is enabled, set power efficient workqueue as unbound.
+	 * This allows workqueue items to be moved to HK CPUs.
+	 */
+	if (housekeeping_enabled(HK_TYPE_TICK))
+		wq_power_efficient = true;
+
 	/* initialize WQ_AFFN_SYSTEM pods */
 	pt->pod_cpus = kcalloc(1, sizeof(pt->pod_cpus[0]), GFP_KERNEL);
 	pt->pod_node = kcalloc(1, sizeof(pt->pod_node[0]), GFP_KERNEL);
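
The patch only flips wq_power_efficient; the effect on individual workqueues comes from the allocation path, where a workqueue created with WQ_POWER_EFFICIENT is turned into an unbound one when that variable is set. The userspace model below sketches that interaction. It is not kernel code: model_nohz_full_enabled is a hypothetical stand-in for housekeeping_enabled(HK_TYPE_TICK), the model_* functions are invented for illustration, and the flag values are illustrative only.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow include/linux/workqueue.h; values are illustrative. */
enum model_wq_flags {
	WQ_UNBOUND         = 1 << 1,
	WQ_POWER_EFFICIENT = 1 << 7,
};

/* Stand-in for the kernel's wq_power_efficient variable. */
static bool wq_power_efficient;

/* Hypothetical stand-in for housekeeping_enabled(HK_TYPE_TICK). */
static bool model_nohz_full_enabled;

/* Models the hunk added to workqueue_init_early(). */
void model_workqueue_init_early(void)
{
	if (model_nohz_full_enabled)
		wq_power_efficient = true;
}

/* Models flag resolution when a workqueue is allocated. */
unsigned int model_resolve_flags(unsigned int flags)
{
	if ((flags & WQ_POWER_EFFICIENT) && wq_power_efficient)
		flags |= WQ_UNBOUND;
	return flags;
}

void model_demo(void)
{
	/* Without nohz_full, a power efficient workqueue stays per-cpu. */
	wq_power_efficient = false;
	model_nohz_full_enabled = false;
	model_workqueue_init_early();
	assert(!(model_resolve_flags(WQ_POWER_EFFICIENT) & WQ_UNBOUND));

	/* With nohz_full, the same workqueue becomes unbound. */
	model_nohz_full_enabled = true;
	model_workqueue_init_early();
	assert(model_resolve_flags(WQ_POWER_EFFICIENT) & WQ_UNBOUND);

	/* Workqueues without WQ_POWER_EFFICIENT are not affected. */
	assert(!(model_resolve_flags(0) & WQ_UNBOUND));
}
```

Because the assignment happens in early init, workqueues such as events_power_efficient that are allocated afterwards pick up the unbound behaviour, which is what lets work like fb_flashcursor in the trace run away from the isolated CPU.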