From patchwork Thu May 18 03:00:27 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 95634
From: Tejun Heo
To: jiangshanlai@gmail.com
Cc: torvalds@linux-foundation.org, peterz@infradead.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, Tejun Heo
Subject: [PATCH 1/7] workqueue: Add pwq->stats[] and a monitoring script
Date: Wed, 17 May 2023 17:00:27 -1000
Message-Id:
<20230518030033.4163274-2-tj@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230518030033.4163274-1-tj@kernel.org> References: <20230518030033.4163274-1-tj@kernel.org> MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1766200420424434402?= X-GMAIL-MSGID: =?utf-8?q?1766200420424434402?= Currently, the only way to peer into workqueue operations is through tracing. While possible, it isn't easy or convenient to monitor per-workqueue behaviors over time this way. Let's add pwq->stats[] that track relevant events and a drgn monitoring script - tools/workqueue/wq_monitor.py. It's arguable whether this needs to be configurable. However, it currently only has several counters and the runtime overhead shouldn't be noticeable given that they're on pwq's which are per-cpu on per-cpu workqueues and per-numa-node on unbound ones. Let's keep it simple for the time being. v2: Patch reordered to earlier with fewer fields. Field will be added back gradually. Help message improved. Signed-off-by: Tejun Heo Cc: Lai Jiangshan --- Documentation/core-api/workqueue.rst | 32 ++++++ kernel/workqueue.c | 24 ++++- tools/workqueue/wq_monitor.py | 150 +++++++++++++++++++++++++++ 3 files changed, 205 insertions(+), 1 deletion(-) create mode 100644 tools/workqueue/wq_monitor.py diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst index 8ec4d6270b24..7e5c39310bbf 100644 --- a/Documentation/core-api/workqueue.rst +++ b/Documentation/core-api/workqueue.rst @@ -348,6 +348,37 @@ Guidelines level of locality in wq operations and work item execution. +Monitoring +========== + +Use tools/workqueue/wq_monitor.py to monitor workqueue operations: :: + + $ tools/workqueue/wq_monitor.py events + total infl CMwake mayday rescued + events 18545 0 5 - - + events_highpri 8 0 0 - - + events_long 3 0 0 - - + events_unbound 38306 0 - - - + events_freezable 0 0 0 - - + events_power_efficient 29598 0 0 - - + events_freezable_power_ 10 0 0 - - + sock_diag_events 0 0 0 - - + + total infl CMwake mayday rescued + events 18548 0 5 - - + events_highpri 8 0 0 - - + events_long 3 0 0 - - + events_unbound 38322 0 - - - + events_freezable 0 0 0 - - + events_power_efficient 29603 0 0 - - + events_freezable_power_ 10 0 0 - - + sock_diag_events 0 0 0 - - + + ... + +See the command's help message for more info. + + Debugging ========= @@ -387,6 +418,7 @@ For the second type of problems it should be possible to just check The work item's function should be trivially visible in the stack trace. + Non-reentrance Conditions ========================= diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 36bccc1285b3..60d5b84cccb2 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -199,6 +199,20 @@ struct worker_pool { struct rcu_head rcu; }; +/* + * Per-pool_workqueue statistics. These can be monitored using + * tools/workqueue/wq_monitor.py. 
+ */ +enum pool_workqueue_stats { + PWQ_STAT_STARTED, /* work items started execution */ + PWQ_STAT_COMPLETED, /* work items completed execution */ + PWQ_STAT_CM_WAKEUP, /* concurrency-management worker wakeups */ + PWQ_STAT_MAYDAY, /* maydays to rescuer */ + PWQ_STAT_RESCUED, /* linked work items executed by rescuer */ + + PWQ_NR_STATS, +}; + /* * The per-pool workqueue. While queued, the lower WORK_STRUCT_FLAG_BITS * of work_struct->data are used for flags and the remaining high bits @@ -236,6 +250,8 @@ struct pool_workqueue { struct list_head pwqs_node; /* WR: node on wq->pwqs */ struct list_head mayday_node; /* MD: node on wq->maydays */ + u64 stats[PWQ_NR_STATS]; + /* * Release of unbound pwq is punted to system_wq. See put_pwq() * and pwq_unbound_release_workfn() for details. pool_workqueue @@ -929,8 +945,10 @@ void wq_worker_sleeping(struct task_struct *task) } pool->nr_running--; - if (need_more_worker(pool)) + if (need_more_worker(pool)) { + worker->current_pwq->stats[PWQ_STAT_CM_WAKEUP]++; wake_up_worker(pool); + } raw_spin_unlock_irq(&pool->lock); } @@ -2165,6 +2183,7 @@ static void send_mayday(struct work_struct *work) get_pwq(pwq); list_add_tail(&pwq->mayday_node, &wq->maydays); wake_up_process(wq->rescuer->task); + pwq->stats[PWQ_STAT_MAYDAY]++; } } @@ -2403,6 +2422,7 @@ __acquires(&pool->lock) * workqueues), so hiding them isn't a problem. */ lockdep_invariant_state(true); + pwq->stats[PWQ_STAT_STARTED]++; trace_workqueue_execute_start(work); worker->current_func(work); /* @@ -2410,6 +2430,7 @@ __acquires(&pool->lock) * point will only record its address. */ trace_workqueue_execute_end(work, worker->current_func); + pwq->stats[PWQ_STAT_COMPLETED]++; lock_map_release(&lockdep_map); lock_map_release(&pwq->wq->lockdep_map); @@ -2653,6 +2674,7 @@ static int rescuer_thread(void *__rescuer) if (first) pool->watchdog_ts = jiffies; move_linked_works(work, scheduled, &n); + pwq->stats[PWQ_STAT_RESCUED]++; } first = false; } diff --git a/tools/workqueue/wq_monitor.py b/tools/workqueue/wq_monitor.py new file mode 100644 index 000000000000..fc1643ba06b3 --- /dev/null +++ b/tools/workqueue/wq_monitor.py @@ -0,0 +1,150 @@ +#!/usr/bin/env drgn +# +# Copyright (C) 2023 Tejun Heo +# Copyright (C) 2023 Meta Platforms, Inc. and affiliates. + +desc = """ +This is a drgn script to monitor workqueues. For more info on drgn, visit +https://github.com/osandov/drgn. + + total Total number of work items executed by the workqueue. + + infl The number of currently in-flight work items. + + CMwake The number of concurrency-management wake-ups while executing a + work item of the workqueue. + + mayday The number of times the rescuer was requested while waiting for + new worker creation. + + rescued The number of work items executed by the rescuer. 
+""" + +import sys +import signal +import os +import re +import time +import json + +import drgn +from drgn.helpers.linux.list import list_for_each_entry,list_empty +from drgn.helpers.linux.cpumask import for_each_possible_cpu + +import argparse +parser = argparse.ArgumentParser(description=desc, + formatter_class=argparse.RawTextHelpFormatter) +parser.add_argument('workqueue', metavar='REGEX', nargs='*', + help='Target workqueue name patterns (all if empty)') +parser.add_argument('-i', '--interval', metavar='SECS', type=float, default=1, + help='Monitoring interval (0 to print once and exit)') +parser.add_argument('-j', '--json', action='store_true', + help='Output in json') +args = parser.parse_args() + +def err(s): + print(s, file=sys.stderr, flush=True) + sys.exit(1) + +workqueues = prog['workqueues'] + +WQ_UNBOUND = prog['WQ_UNBOUND'] +WQ_MEM_RECLAIM = prog['WQ_MEM_RECLAIM'] + +PWQ_STAT_STARTED = prog['PWQ_STAT_STARTED'] # work items started execution +PWQ_STAT_COMPLETED = prog['PWQ_STAT_COMPLETED'] # work items completed execution +PWQ_STAT_CM_WAKEUP = prog['PWQ_STAT_CM_WAKEUP'] # concurrency-management worker wakeups +PWQ_STAT_MAYDAY = prog['PWQ_STAT_MAYDAY'] # maydays to rescuer +PWQ_STAT_RESCUED = prog['PWQ_STAT_RESCUED'] # linked work items executed by rescuer +PWQ_NR_STATS = prog['PWQ_NR_STATS'] + +class WqStats: + def __init__(self, wq): + self.name = wq.name.string_().decode() + self.unbound = wq.flags & WQ_UNBOUND != 0 + self.mem_reclaim = wq.flags & WQ_MEM_RECLAIM != 0 + self.stats = [0] * PWQ_NR_STATS + for pwq in list_for_each_entry('struct pool_workqueue', wq.pwqs.address_of_(), 'pwqs_node'): + for i in range(PWQ_NR_STATS): + self.stats[i] += int(pwq.stats[i]) + + def dict(self, now): + return { 'timestamp' : now, + 'name' : self.name, + 'unbound' : self.unbound, + 'mem_reclaim' : self.mem_reclaim, + 'started' : self.stats[PWQ_STAT_STARTED], + 'completed' : self.stats[PWQ_STAT_COMPLETED], + 'cm_wakeup' : self.stats[PWQ_STAT_CM_WAKEUP], + 'mayday' : self.stats[PWQ_STAT_MAYDAY], + 'rescued' : self.stats[PWQ_STAT_RESCUED], } + + def table_header_str(): + return f'{"":>24} {"total":>8} {"infl":>5} {"CMwake":>7} {"mayday":>7} {"rescued":>7}' + + def table_row_str(self): + cm_wakeup = '-' + mayday = '-' + rescued = '-' + + if not self.unbound: + cm_wakeup = str(self.stats[PWQ_STAT_CM_WAKEUP]) + + if self.mem_reclaim: + mayday = str(self.stats[PWQ_STAT_MAYDAY]) + rescued = str(self.stats[PWQ_STAT_RESCUED]) + + out = f'{self.name[-24:]:24} ' \ + f'{self.stats[PWQ_STAT_STARTED]:8} ' \ + f'{max(self.stats[PWQ_STAT_STARTED] - self.stats[PWQ_STAT_COMPLETED], 0):5} ' \ + f'{cm_wakeup:>7} ' \ + f'{mayday:>7} ' \ + f'{rescued:>7} ' + return out.rstrip(':') + +exit_req = False + +def sigint_handler(signr, frame): + global exit_req + exit_req = True + +def main(): + # handle args + table_fmt = not args.json + interval = args.interval + + re_str = None + if args.workqueue: + for r in args.workqueue: + if re_str is None: + re_str = r + else: + re_str += '|' + r + + filter_re = re.compile(re_str) if re_str else None + + # monitoring loop + signal.signal(signal.SIGINT, sigint_handler) + + while not exit_req: + now = time.time() + + if table_fmt: + print() + print(WqStats.table_header_str()) + + for wq in list_for_each_entry('struct workqueue_struct', workqueues.address_of_(), 'list'): + stats = WqStats(wq) + if filter_re and not filter_re.search(stats.name): + continue + if table_fmt: + print(stats.table_row_str()) + else: + print(stats.dict(now)) + + if interval == 0: + break + 
time.sleep(interval) + +if __name__ == "__main__": + main()
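The script above is the general-purpose monitor; for a one-shot readout of the same counters, a condensed drgn sketch along the following lines also works. This is an illustration distilled from wq_monitor.py, not part of the patch; it assumes a kernel that carries this patch, and 'prog' is the program object drgn provides when the script is run (e.g. as "drgn wq_totals.py", a hypothetical file name):

    #!/usr/bin/env drgn
    # One-shot variant of tools/workqueue/wq_monitor.py: sum pwq->stats[]
    # over each workqueue's pool_workqueues and print totals once.
    from drgn.helpers.linux.list import list_for_each_entry

    PWQ_STAT_STARTED = int(prog['PWQ_STAT_STARTED'])
    PWQ_STAT_COMPLETED = int(prog['PWQ_STAT_COMPLETED'])

    for wq in list_for_each_entry('struct workqueue_struct',
                                  prog['workqueues'].address_of_(), 'list'):
        started = completed = 0
        for pwq in list_for_each_entry('struct pool_workqueue',
                                       wq.pwqs.address_of_(), 'pwqs_node'):
            started += int(pwq.stats[PWQ_STAT_STARTED])
            completed += int(pwq.stats[PWQ_STAT_COMPLETED])
        name = wq.name.string_().decode()
        print(f'{name:24} total={started:<8} infl={max(started - completed, 0)}')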
From patchwork Thu May 18 03:00:28 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 95629
From: Tejun Heo
To: jiangshanlai@gmail.com
Cc: torvalds@linux-foundation.org, peterz@infradead.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, Tejun Heo
Subject: [PATCH 2/7] workqueue: Re-order struct worker fields
Date: Wed, 17 May 2023 17:00:28 -1000
Message-Id:
<20230518030033.4163274-3-tj@kernel.org>
In-Reply-To: <20230518030033.4163274-1-tj@kernel.org>
References: <20230518030033.4163274-1-tj@kernel.org>

struct worker was laid out with the intent that all fields that are modified for each work item execution are in the first cacheline. However, this hasn't been true for a while with the addition of ->last_func. Let's just collect hot fields together at the top. Move ->sleeping in the hole after ->current_color and move ->last_func right below. While at it, drop the cacheline comment which isn't useful anymore. Signed-off-by: Tejun Heo Cc: Lai Jiangshan --- kernel/workqueue_internal.h | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h index e00b1204a8e9..0600f04ceeb2 100644 --- a/kernel/workqueue_internal.h +++ b/kernel/workqueue_internal.h @@ -32,9 +32,12 @@ struct worker { work_func_t current_func; /* L: current_work's fn */ struct pool_workqueue *current_pwq; /* L: current_work's pwq */ unsigned int current_color; /* L: current_work's color */ - struct list_head scheduled; /* L: scheduled works */ + int sleeping; /* None */ + + /* used by the scheduler to determine a worker's last known identity */ + work_func_t last_func; /* L: last work's fn */ - /* 64 bytes boundary on 64bit, 32 on 32bit */ + struct list_head scheduled; /* L: scheduled works */ struct task_struct *task; /* I: worker task */ struct worker_pool *pool; /* A: the associated pool */ @@ -45,7 +48,6 @@ struct worker { unsigned long last_active; /* L: last active timestamp */ unsigned int flags; /* X: flags */ int id; /* I: worker id */ - int sleeping; /* None */ /* * Opaque string set with work_set_desc().
Printed out with task @@ -55,9 +57,6 @@ struct worker { /* used only by rescuers to point to the target workqueue */ struct workqueue_struct *rescue_wq; /* I: the workqueue to rescue */ - - /* used by the scheduler to determine a worker's last known identity */ - work_func_t last_func; }; /**
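Whether the reordered hot fields actually land in the first cache line on a given build can be double-checked with drgn in the same spirit as the monitoring script. The sketch below is purely illustrative and not part of the patch; offsets depend on kernel configuration and architecture, and 'prog' is supplied by drgn when run against a live kernel or a vmcore:

    #!/usr/bin/env drgn
    # Print the byte offset of each struct worker member so the layout can
    # be compared against the 64-byte (32-byte on 32-bit) cache line size.
    worker_type = prog.type('struct worker')
    for member in worker_type.members:
        print(f'{member.bit_offset // 8:4}  {member.name or "<anonymous>"}')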
From patchwork Thu May 18 03:00:29 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 95635
From: Tejun Heo
To: jiangshanlai@gmail.com
Cc: torvalds@linux-foundation.org, peterz@infradead.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, Tejun Heo
Subject: [PATCH 3/7] workqueue: Move worker_set/clr_flags() upwards
Date: Wed, 17 May 2023 17:00:29 -1000
Message-Id:
<20230518030033.4163274-4-tj@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230518030033.4163274-1-tj@kernel.org> References: <20230518030033.4163274-1-tj@kernel.org> MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1766200593531817497?= X-GMAIL-MSGID: =?utf-8?q?1766200593531817497?= They are going to be used in wq_worker_stopping(). Move them upwards. Signed-off-by: Tejun Heo Cc: Lai Jiangshan --- kernel/workqueue.c | 108 ++++++++++++++++++++++----------------------- 1 file changed, 54 insertions(+), 54 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 60d5b84cccb2..d70bb5be99ce 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -879,6 +879,60 @@ static void wake_up_worker(struct worker_pool *pool) wake_up_process(worker->task); } +/** + * worker_set_flags - set worker flags and adjust nr_running accordingly + * @worker: self + * @flags: flags to set + * + * Set @flags in @worker->flags and adjust nr_running accordingly. + * + * CONTEXT: + * raw_spin_lock_irq(pool->lock) + */ +static inline void worker_set_flags(struct worker *worker, unsigned int flags) +{ + struct worker_pool *pool = worker->pool; + + WARN_ON_ONCE(worker->task != current); + + /* If transitioning into NOT_RUNNING, adjust nr_running. */ + if ((flags & WORKER_NOT_RUNNING) && + !(worker->flags & WORKER_NOT_RUNNING)) { + pool->nr_running--; + } + + worker->flags |= flags; +} + +/** + * worker_clr_flags - clear worker flags and adjust nr_running accordingly + * @worker: self + * @flags: flags to clear + * + * Clear @flags in @worker->flags and adjust nr_running accordingly. + * + * CONTEXT: + * raw_spin_lock_irq(pool->lock) + */ +static inline void worker_clr_flags(struct worker *worker, unsigned int flags) +{ + struct worker_pool *pool = worker->pool; + unsigned int oflags = worker->flags; + + WARN_ON_ONCE(worker->task != current); + + worker->flags &= ~flags; + + /* + * If transitioning out of NOT_RUNNING, increment nr_running. Note + * that the nested NOT_RUNNING is not a noop. NOT_RUNNING is mask + * of multiple flags, not a single flag. + */ + if ((flags & WORKER_NOT_RUNNING) && (oflags & WORKER_NOT_RUNNING)) + if (!(worker->flags & WORKER_NOT_RUNNING)) + pool->nr_running++; +} + /** * wq_worker_running - a worker is running again * @task: task waking up @@ -983,60 +1037,6 @@ work_func_t wq_worker_last_func(struct task_struct *task) return worker->last_func; } -/** - * worker_set_flags - set worker flags and adjust nr_running accordingly - * @worker: self - * @flags: flags to set - * - * Set @flags in @worker->flags and adjust nr_running accordingly. - * - * CONTEXT: - * raw_spin_lock_irq(pool->lock) - */ -static inline void worker_set_flags(struct worker *worker, unsigned int flags) -{ - struct worker_pool *pool = worker->pool; - - WARN_ON_ONCE(worker->task != current); - - /* If transitioning into NOT_RUNNING, adjust nr_running. 
*/ - if ((flags & WORKER_NOT_RUNNING) && - !(worker->flags & WORKER_NOT_RUNNING)) { - pool->nr_running--; - } - - worker->flags |= flags; -} - -/** - * worker_clr_flags - clear worker flags and adjust nr_running accordingly - * @worker: self - * @flags: flags to clear - * - * Clear @flags in @worker->flags and adjust nr_running accordingly. - * - * CONTEXT: - * raw_spin_lock_irq(pool->lock) - */ -static inline void worker_clr_flags(struct worker *worker, unsigned int flags) -{ - struct worker_pool *pool = worker->pool; - unsigned int oflags = worker->flags; - - WARN_ON_ONCE(worker->task != current); - - worker->flags &= ~flags; - - /* - * If transitioning out of NOT_RUNNING, increment nr_running. Note - * that the nested NOT_RUNNING is not a noop. NOT_RUNNING is mask - * of multiple flags, not a single flag. - */ - if ((flags & WORKER_NOT_RUNNING) && (oflags & WORKER_NOT_RUNNING)) - if (!(worker->flags & WORKER_NOT_RUNNING)) - pool->nr_running++; -} - /** * find_worker_executing_work - find worker which is executing a work * @pool: pool of interest
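The comment retained with worker_clr_flags() ("the nested NOT_RUNNING is not a noop") is easy to misread. The toy Python model below is only an illustration; the two flag values merely stand in for the full WORKER_NOT_RUNNING mask and nothing here is kernel code. It shows why nr_running may only be bumped once the last NOT_RUNNING flag is cleared:

    # Toy model of worker_clr_flags()'s nr_running bookkeeping.
    WORKER_PREP = 1 << 3
    WORKER_CPU_INTENSIVE = 1 << 6
    WORKER_NOT_RUNNING = WORKER_PREP | WORKER_CPU_INTENSIVE  # subset of the real mask

    flags = WORKER_PREP | WORKER_CPU_INTENSIVE
    nr_running = 0

    def clr_flags(clr):
        global flags, nr_running
        oflags = flags
        flags &= ~clr
        # Only the transition out of the whole mask re-enables concurrency
        # management, i.e. increments nr_running.
        if (clr & WORKER_NOT_RUNNING) and (oflags & WORKER_NOT_RUNNING):
            if not flags & WORKER_NOT_RUNNING:
                nr_running += 1

    clr_flags(WORKER_CPU_INTENSIVE)   # WORKER_PREP still set: nr_running == 0
    clr_flags(WORKER_PREP)            # mask now empty:        nr_running == 1
    assert nr_running == 1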
From patchwork Thu May 18 03:00:30 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 95631
From: Tejun Heo
To: jiangshanlai@gmail.com
Cc: torvalds@linux-foundation.org, peterz@infradead.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, Tejun Heo
Subject: [PATCH 4/7] workqueue: Improve locking rule description for worker fields
Date: Wed, 17 May 2023 17:00:30 -1000
Message-Id:
<20230518030033.4163274-5-tj@kernel.org> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230518030033.4163274-1-tj@kernel.org> References: <20230518030033.4163274-1-tj@kernel.org> MIME-Version: 1.0 X-Spam-Status: No, score=-1.5 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_EF,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1766200081203034159?= X-GMAIL-MSGID: =?utf-8?q?1766200081203034159?= * Some worker fields are modified only by the worker itself while holding pool->lock thus making them safe to read from self, IRQ context if the CPU is running the worker or while holding pool->lock. Add 'K' locking rule for them. * worker->sleeping is currently marked "None" which isn't very descriptive. It's used only by the worker itself. Add 'S' locking rule for it. A future patch will depend on the 'K' rule to access worker->current_* from the scheduler ticks. Signed-off-by: Tejun Heo --- kernel/workqueue.c | 6 ++++++ kernel/workqueue_internal.h | 15 ++++++++------- 2 files changed, 14 insertions(+), 7 deletions(-) diff --git a/kernel/workqueue.c b/kernel/workqueue.c index d70bb5be99ce..942421443603 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -126,6 +126,12 @@ enum { * cpu or grabbing pool->lock is enough for read access. If * POOL_DISASSOCIATED is set, it's identical to L. * + * K: Only modified by worker while holding pool->lock. Can be safely read by + * self, while holding pool->lock or from IRQ context if %current is the + * kworker. + * + * S: Only modified by worker self. + * * A: wq_pool_attach_mutex protected. * * PL: wq_pool_mutex protected. diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h index 0600f04ceeb2..c2455be7b4c2 100644 --- a/kernel/workqueue_internal.h +++ b/kernel/workqueue_internal.h @@ -28,14 +28,15 @@ struct worker { struct hlist_node hentry; /* L: while busy */ }; - struct work_struct *current_work; /* L: work being processed */ - work_func_t current_func; /* L: current_work's fn */ - struct pool_workqueue *current_pwq; /* L: current_work's pwq */ - unsigned int current_color; /* L: current_work's color */ - int sleeping; /* None */ + struct work_struct *current_work; /* K: work being processed and its */ + work_func_t current_func; /* K: function */ + struct pool_workqueue *current_pwq; /* K: pwq */ + unsigned int current_color; /* K: color */ + + int sleeping; /* S: is worker sleeping? 
*/ /* used by the scheduler to determine a worker's last known identity */ - work_func_t last_func; /* L: last work's fn */ + work_func_t last_func; /* K: last work's fn */ struct list_head scheduled; /* L: scheduled works */ @@ -45,7 +46,7 @@ struct worker { struct list_head node; /* A: anchored at pool->workers */ /* A: runs through worker->node */ - unsigned long last_active; /* L: last active timestamp */ + unsigned long last_active; /* K: last active timestamp */ unsigned int flags; /* X: flags */ int id; /* I: worker id */
From patchwork Thu May 18 03:00:31 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 95632
From: Tejun Heo
To: jiangshanlai@gmail.com
Cc: torvalds@linux-foundation.org, peterz@infradead.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, Tejun Heo
Subject: [PATCH 5/7] workqueue: Automatically mark CPU-hogging work items CPU_INTENSIVE
Date: Wed, 17 May 2023 17:00:31 -1000
Message-Id:
<20230518030033.4163274-6-tj@kernel.org>
In-Reply-To: <20230518030033.4163274-1-tj@kernel.org>
References: <20230518030033.4163274-1-tj@kernel.org>

If a per-cpu work item hogs the CPU, it can prevent other work items from starting through concurrency management. A per-cpu workqueue which intends to host such CPU-hogging work items can choose to not participate in concurrency management by setting %WQ_CPU_INTENSIVE; however, this can be error-prone and difficult to debug when missed. This patch adds an automatic CPU usage based detection. If a concurrency-managed work item consumes more CPU time than the threshold (10ms by default) continuously without intervening sleeps, wq_worker_tick() which is called from scheduler_tick() will detect the condition and automatically mark it CPU_INTENSIVE. The mechanism isn't foolproof: * Detection depends on tick hitting the work item. Getting preempted at the right timings may allow a violating work item to evade detection at least temporarily. * nohz_full CPUs may not be running ticks and thus can fail detection. * Even when detection is working, the 10ms detection delays can add up if many CPU-hogging work items are queued at the same time. However, in the vast majority of cases, this should be able to detect violations reliably and provide reasonable protection with a small increase in code complexity. If some work items trigger this condition repeatedly, the bigger problem likely is the CPU being saturated with such per-cpu work items and the solution would be making them UNBOUND. The next patch will add a debug mechanism to help spot such cases. v4: Documentation for workqueue.cpu_intensive_thresh_us added to kernel-parameters.txt. v3: Switch to use wq_worker_tick() instead of hooking into preemptions as suggested by Peter. v2: Lai pointed out that wq_worker_stopping() also needs to be called from preemption and rtlock paths and an earlier patch was updated accordingly. This patch adds a comment describing the risk of infinite recursions and how they're avoided. Signed-off-by: Tejun Heo Acked-by: Peter Zijlstra Cc: Linus Torvalds Cc: Lai Jiangshan --- .../admin-guide/kernel-parameters.txt | 7 ++ Documentation/core-api/workqueue.rst | 38 +++++------ kernel/sched/core.c | 3 + kernel/workqueue.c | 68 +++++++++++++++++-- kernel/workqueue_internal.h | 2 + tools/workqueue/wq_monitor.py | 13 +++- 6 files changed, 106 insertions(+), 25 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 9e5bab29685f..1f2185cf2f0a 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -6931,6 +6931,13 @@ it can be updated at runtime by writing to the corresponding sysfs file.
+ workqueue.cpu_intensive_thresh_us= + Per-cpu work items which run for longer than this + threshold are automatically considered CPU intensive + and excluded from concurrency management to prevent + them from noticeably delaying other per-cpu work + items. Default is 10000 (10ms). + workqueue.disable_numa By default, all work items queued to unbound workqueues are affine to the NUMA nodes they're diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst index 7e5c39310bbf..a389f31b025c 100644 --- a/Documentation/core-api/workqueue.rst +++ b/Documentation/core-api/workqueue.rst @@ -354,25 +354,25 @@ Monitoring Use tools/workqueue/wq_monitor.py to monitor workqueue operations: :: $ tools/workqueue/wq_monitor.py events - total infl CMwake mayday rescued - events 18545 0 5 - - - events_highpri 8 0 0 - - - events_long 3 0 0 - - - events_unbound 38306 0 - - - - events_freezable 0 0 0 - - - events_power_efficient 29598 0 0 - - - events_freezable_power_ 10 0 0 - - - sock_diag_events 0 0 0 - - - - total infl CMwake mayday rescued - events 18548 0 5 - - - events_highpri 8 0 0 - - - events_long 3 0 0 - - - events_unbound 38322 0 - - - - events_freezable 0 0 0 - - - events_power_efficient 29603 0 0 - - - events_freezable_power_ 10 0 0 - - - sock_diag_events 0 0 0 - - + total infl CPUitsv CMwake mayday rescued + events 18545 0 0 5 - - + events_highpri 8 0 0 0 - - + events_long 3 0 0 0 - - + events_unbound 38306 0 - - - - + events_freezable 0 0 0 0 - - + events_power_efficient 29598 0 0 0 - - + events_freezable_power_ 10 0 0 0 - - + sock_diag_events 0 0 0 0 - - + + total infl CPUitsv CMwake mayday rescued + events 18548 0 0 5 - - + events_highpri 8 0 0 0 - - + events_long 3 0 0 0 - - + events_unbound 38322 0 - - - - + events_freezable 0 0 0 0 - - + events_power_efficient 29603 0 0 0 - - + events_freezable_power_ 10 0 0 0 - - + sock_diag_events 0 0 0 0 - - ... diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 944c3ae39861..3484cada9a4a 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -5632,6 +5632,9 @@ void scheduler_tick(void) perf_event_task_tick(); + if (curr->flags & PF_WQ_WORKER) + wq_worker_tick(curr); + #ifdef CONFIG_SMP rq->idle_balance = idle_cpu(cpu); trigger_load_balance(rq); diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 942421443603..3dc83d5eba50 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -212,6 +212,7 @@ struct worker_pool { enum pool_workqueue_stats { PWQ_STAT_STARTED, /* work items started execution */ PWQ_STAT_COMPLETED, /* work items completed execution */ + PWQ_STAT_CPU_INTENSIVE, /* wq_cpu_intensive_thresh_us violations */ PWQ_STAT_CM_WAKEUP, /* concurrency-management worker wakeups */ PWQ_STAT_MAYDAY, /* maydays to rescuer */ PWQ_STAT_RESCUED, /* linked work items executed by rescuer */ @@ -332,6 +333,14 @@ static struct kmem_cache *pwq_cache; static cpumask_var_t *wq_numa_possible_cpumask; /* possible CPUs of each node */ +/* + * Per-cpu work items which run for longer than the following threshold are + * automatically considered CPU intensive and excluded from concurrency + * management to prevent them from noticeably delaying other per-cpu work items. 
+ */ +static unsigned long wq_cpu_intensive_thresh_us = 10000; +module_param_named(cpu_intensive_thresh_us, wq_cpu_intensive_thresh_us, ulong, 0644); + static bool wq_disable_numa; module_param_named(disable_numa, wq_disable_numa, bool, 0444); @@ -962,6 +971,13 @@ void wq_worker_running(struct task_struct *task) if (!(worker->flags & WORKER_NOT_RUNNING)) worker->pool->nr_running++; preempt_enable(); + + /* + * CPU intensive auto-detection cares about how long a work item hogged + * CPU without sleeping. Reset the starting timestamp on wakeup. + */ + worker->current_at = worker->task->se.sum_exec_runtime; + worker->sleeping = 0; } @@ -1012,6 +1028,45 @@ void wq_worker_sleeping(struct task_struct *task) raw_spin_unlock_irq(&pool->lock); } +/** + * wq_worker_tick - a scheduler tick occurred while a kworker is running + * @task: task currently running + * + * Called from scheduler_tick(). We're in the IRQ context and the current + * worker's fields which follow the 'K' locking rule can be accessed safely. + */ +void wq_worker_tick(struct task_struct *task) +{ + struct worker *worker = kthread_data(task); + struct pool_workqueue *pwq = worker->current_pwq; + struct worker_pool *pool = worker->pool; + + if (!pwq) + return; + + /* + * If the current worker is concurrency managed and hogged the CPU for + * longer than wq_cpu_intensive_thresh_us, it's automatically marked + * CPU_INTENSIVE to avoid stalling other concurrency-managed work items. + */ + if ((worker->flags & WORKER_NOT_RUNNING) || + worker->task->se.sum_exec_runtime - worker->current_at < + wq_cpu_intensive_thresh_us * NSEC_PER_USEC) + return; + + raw_spin_lock(&pool->lock); + + worker_set_flags(worker, WORKER_CPU_INTENSIVE); + pwq->stats[PWQ_STAT_CPU_INTENSIVE]++; + + if (need_more_worker(pool)) { + pwq->stats[PWQ_STAT_CM_WAKEUP]++; + wake_up_worker(pool); + } + + raw_spin_unlock(&pool->lock); +} + /** * wq_worker_last_func - retrieve worker's last work function * @task: Task to retrieve last work function of. @@ -2327,7 +2382,6 @@ __acquires(&pool->lock) { struct pool_workqueue *pwq = get_work_pwq(work); struct worker_pool *pool = worker->pool; - bool cpu_intensive = pwq->wq->flags & WQ_CPU_INTENSIVE; unsigned long work_data; struct worker *collision; #ifdef CONFIG_LOCKDEP @@ -2364,6 +2418,7 @@ __acquires(&pool->lock) worker->current_work = work; worker->current_func = work->func; worker->current_pwq = pwq; + worker->current_at = worker->task->se.sum_exec_runtime; work_data = *work_data_bits(work); worker->current_color = get_work_color(work_data); @@ -2381,7 +2436,7 @@ __acquires(&pool->lock) * of concurrency management and the next code block will chain * execution of the pending work items. */ - if (unlikely(cpu_intensive)) + if (unlikely(pwq->wq->flags & WQ_CPU_INTENSIVE)) worker_set_flags(worker, WORKER_CPU_INTENSIVE); /* @@ -2461,9 +2516,12 @@ __acquires(&pool->lock) raw_spin_lock_irq(&pool->lock); - /* clear cpu intensive status */ - if (unlikely(cpu_intensive)) - worker_clr_flags(worker, WORKER_CPU_INTENSIVE); + /* + * In addition to %WQ_CPU_INTENSIVE, @worker may also have been marked + * CPU intensive by wq_worker_tick() if @work hogged CPU longer than + * wq_cpu_intensive_thresh_us. Clear it. 
+	 */
+	worker_clr_flags(worker, WORKER_CPU_INTENSIVE);
 
 	/* tag the worker for identification in schedule() */
 	worker->last_func = worker->current_func;
diff --git a/kernel/workqueue_internal.h b/kernel/workqueue_internal.h
index c2455be7b4c2..6b1d66e28269 100644
--- a/kernel/workqueue_internal.h
+++ b/kernel/workqueue_internal.h
@@ -31,6 +31,7 @@ struct worker {
 	struct work_struct	*current_work;	/* K: work being processed and its */
 	work_func_t		current_func;	/* K: function */
 	struct pool_workqueue	*current_pwq;	/* K: pwq */
+	u64			current_at;	/* K: runtime at start or last wakeup */
 	unsigned int		current_color;	/* K: color */
 
 	int			sleeping;	/* S: is worker sleeping? */
@@ -76,6 +77,7 @@ static inline struct worker *current_wq_worker(void)
  */
 void wq_worker_running(struct task_struct *task);
 void wq_worker_sleeping(struct task_struct *task);
+void wq_worker_tick(struct task_struct *task);
 work_func_t wq_worker_last_func(struct task_struct *task);
 
 #endif /* _KERNEL_WORKQUEUE_INTERNAL_H */
diff --git a/tools/workqueue/wq_monitor.py b/tools/workqueue/wq_monitor.py
index fc1643ba06b3..7c6f523b9164 100644
--- a/tools/workqueue/wq_monitor.py
+++ b/tools/workqueue/wq_monitor.py
@@ -11,6 +11,11 @@ https://github.com/osandov/drgn.
 
   infl      The number of currently in-flight work items.
 
+  CPUitsv   The number of times a concurrency-managed work item hogged CPU
+            longer than the threshold (workqueue.cpu_intensive_thresh_us)
+            and got excluded from concurrency management to avoid stalling
+            other work items.
+
   CMwake    The number of concurrency-management wake-ups while executing a
             work item of the workqueue.
@@ -53,6 +58,7 @@ WQ_MEM_RECLAIM = prog['WQ_MEM_RECLAIM']
 
 PWQ_STAT_STARTED        = prog['PWQ_STAT_STARTED']        # work items started execution
 PWQ_STAT_COMPLETED      = prog['PWQ_STAT_COMPLETED']      # work items completed execution
+PWQ_STAT_CPU_INTENSIVE  = prog['PWQ_STAT_CPU_INTENSIVE']  # wq_cpu_intensive_thresh_us violations
 PWQ_STAT_CM_WAKEUP      = prog['PWQ_STAT_CM_WAKEUP']      # concurrency-management worker wakeups
 PWQ_STAT_MAYDAY         = prog['PWQ_STAT_MAYDAY']         # maydays to rescuer
 PWQ_STAT_RESCUED        = prog['PWQ_STAT_RESCUED']        # linked work items executed by rescuer
@@ -75,19 +81,23 @@ PWQ_NR_STATS = prog['PWQ_NR_STATS']
             'mem_reclaim'   : self.mem_reclaim,
             'started'       : self.stats[PWQ_STAT_STARTED],
             'completed'     : self.stats[PWQ_STAT_COMPLETED],
+            'cpu_intensive' : self.stats[PWQ_STAT_CPU_INTENSIVE],
             'cm_wakeup'     : self.stats[PWQ_STAT_CM_WAKEUP],
             'mayday'        : self.stats[PWQ_STAT_MAYDAY],
             'rescued'       : self.stats[PWQ_STAT_RESCUED],
         }
 
     def table_header_str():
-        return f'{"":>24} {"total":>8} {"infl":>5} {"CMwake":>7} {"mayday":>7} {"rescued":>7}'
+        return f'{"":>24} {"total":>8} {"infl":>5} '\
+               f'{"CPUitsv":>7} {"CMwake":>7} {"mayday":>7} {"rescued":>7}'
 
     def table_row_str(self):
+        cpu_intensive = '-'
         cm_wakeup = '-'
         mayday = '-'
         rescued = '-'
 
         if not self.unbound:
+            cpu_intensive = str(self.stats[PWQ_STAT_CPU_INTENSIVE])
            cm_wakeup = str(self.stats[PWQ_STAT_CM_WAKEUP])
 
         if self.mem_reclaim:
@@ -97,6 +107,7 @@ PWQ_NR_STATS = prog['PWQ_NR_STATS']
         out = f'{self.name[-24:]:24} ' \
               f'{self.stats[PWQ_STAT_STARTED]:8} ' \
               f'{max(self.stats[PWQ_STAT_STARTED] - self.stats[PWQ_STAT_COMPLETED], 0):5} ' \
+              f'{cpu_intensive:>7} ' \
              f'{cm_wakeup:>7} ' \
              f'{mayday:>7} ' \
              f'{rescued:>7} '
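
[For illustration only; not part of the patch.] Because cpu_intensive_thresh_us is declared above as a 0644 module parameter, it should normally be visible and writable at runtime through sysfs. A minimal Python sketch, assuming the conventional /sys/module/workqueue/parameters/ path (verify it exists on your kernel before relying on it):

    #!/usr/bin/env python3
    # Read (and, as root, adjust) the CPU-intensive auto-detection threshold.
    # The PARAM path is an assumption based on the 0644 module_param_named()
    # in the patch above.
    import sys

    PARAM = "/sys/module/workqueue/parameters/cpu_intensive_thresh_us"

    def read_thresh_us() -> int:
        with open(PARAM) as f:
            return int(f.read().strip())

    def write_thresh_us(us: int) -> None:
        with open(PARAM, "w") as f:      # requires root
            f.write(str(us))

    if __name__ == "__main__":
        print(f"current threshold: {read_thresh_us()}us")
        if len(sys.argv) > 1:            # e.g. ./thresh.py 20000
            write_thresh_us(int(sys.argv[1]))
            print(f"new threshold: {read_thresh_us()}us")

Raising the value makes the auto-detection more tolerant of long-running per-cpu work items; lowering it flags them sooner.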
From patchwork Thu May 18 03:00:32 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 95630
Sender: Tejun Heo
From: Tejun Heo
To: jiangshanlai@gmail.com
Cc: torvalds@linux-foundation.org, peterz@infradead.org,
    linux-kernel@vger.kernel.org, kernel-team@meta.com, Tejun Heo
Subject: [PATCH 6/7] workqueue: Report work funcs that trigger automatic CPU_INTENSIVE mechanism
Date: Wed, 17 May 2023 17:00:32 -1000
Message-Id: <20230518030033.4163274-7-tj@kernel.org>
In-Reply-To: <20230518030033.4163274-1-tj@kernel.org>
References: <20230518030033.4163274-1-tj@kernel.org>

Workqueue now automatically marks per-cpu work items that hog CPU for too long
as CPU_INTENSIVE, which excludes them from concurrency management and prevents
them from stalling other concurrency-managed work items. If a work function
keeps running over the threshold, it likely needs to be switched to use an
unbound workqueue.

This patch adds a debug mechanism which tracks the work functions which trigger
the automatic CPU_INTENSIVE mechanism and reports them using pr_warn() with
exponential backoff.

v3: Documentation update.

v2: Drop bouncing to kthread_worker for printing messages. It was meant to
    avoid introducing a circular locking dependency through printk but was not
    effective as it still had a pool lock -> wci_lock -> printk -> pool lock
    loop. Let's just print directly using printk_deferred().

Signed-off-by: Tejun Heo
Suggested-by: Peter Zijlstra
---
 .../admin-guide/kernel-parameters.txt |  5 +
 kernel/workqueue.c                    | 93 +++++++++++++++++++
 lib/Kconfig.debug                     | 13 +++
 3 files changed, 111 insertions(+)
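
[For illustration only; not part of the patch.] The reporting condition used later in this patch is `cnt >= 4 && is_power_of_2(cnt)`, so a given work function is reported on its fourth violation and then only each time its count doubles. A stand-alone Python sketch of that cadence:

    # Not kernel code: mirrors `cnt >= 4 && is_power_of_2(cnt)` to show
    # which violation counts of a work function produce a warning.

    def should_report(cnt: int) -> bool:
        return cnt >= 4 and (cnt & (cnt - 1)) == 0

    print([cnt for cnt in range(1, 1025) if should_report(cnt)])
    # -> [4, 8, 16, 32, 64, 128, 256, 512, 1024]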
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 1f2185cf2f0a..3ed7dda4c994 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6938,6 +6938,11 @@
 			them from noticeably delaying other per-cpu work
 			items. Default is 10000 (10ms).
 
+			If CONFIG_WQ_CPU_INTENSIVE_REPORT is set, the kernel
+			will report the work functions which violate this
+			threshold repeatedly. They are likely good
+			candidates for using WQ_UNBOUND workqueues instead.
+
 	workqueue.disable_numa
 			By default, all work items queued to unbound workqueues
 			are affine to the NUMA nodes they're
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 3dc83d5eba50..4ca66384d288 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -948,6 +948,98 @@ static inline void worker_clr_flags(struct worker *worker, unsigned int flags)
 		pool->nr_running++;
 }
 
+#ifdef CONFIG_WQ_CPU_INTENSIVE_REPORT
+
+/*
+ * Concurrency-managed per-cpu work items that hog CPU for longer than
+ * wq_cpu_intensive_thresh_us trigger the automatic CPU_INTENSIVE mechanism,
+ * which prevents them from stalling other concurrency-managed work items. If a
+ * work function keeps triggering this mechanism, it's likely that the work item
+ * should be using an unbound workqueue instead.
+ *
+ * wq_cpu_intensive_report() tracks work functions which trigger such conditions
+ * and reports them so that they can be examined and converted to use unbound
+ * workqueues as appropriate. To avoid flooding the console, each violating work
+ * function is tracked and reported with exponential backoff.
+ */
+#define WCI_MAX_ENTS 128
+
+struct wci_ent {
+	work_func_t		func;
+	atomic64_t		cnt;
+	struct hlist_node	hash_node;
+};
+
+static struct wci_ent wci_ents[WCI_MAX_ENTS];
+static int wci_nr_ents;
+static DEFINE_RAW_SPINLOCK(wci_lock);
+static DEFINE_HASHTABLE(wci_hash, ilog2(WCI_MAX_ENTS));
+
+static struct wci_ent *wci_find_ent(work_func_t func)
+{
+	struct wci_ent *ent;
+
+	hash_for_each_possible_rcu(wci_hash, ent, hash_node,
+				   (unsigned long)func) {
+		if (ent->func == func)
+			return ent;
+	}
+	return NULL;
+}
+
+static void wq_cpu_intensive_report(work_func_t func)
+{
+	struct wci_ent *ent;
+
+restart:
+	ent = wci_find_ent(func);
+	if (ent) {
+		u64 cnt;
+
+		/*
+		 * Start reporting from the fourth time and back off
+		 * exponentially.
+		 */
+		cnt = atomic64_inc_return_relaxed(&ent->cnt);
+		if (cnt >= 4 && is_power_of_2(cnt))
+			printk_deferred(KERN_WARNING "workqueue: %ps hogged CPU for >%luus %llu times, consider switching to WQ_UNBOUND\n",
+					ent->func, wq_cpu_intensive_thresh_us,
+					atomic64_read(&ent->cnt));
+		return;
+	}
+
+	/*
+	 * @func is a new violation. Allocate a new entry for it. If wci_ents[]
+	 * is exhausted, something went really wrong and we probably made enough
+	 * noise already.
+	 */
+	if (wci_nr_ents >= WCI_MAX_ENTS)
+		return;
+
+	raw_spin_lock(&wci_lock);
+
+	if (wci_nr_ents >= WCI_MAX_ENTS) {
+		raw_spin_unlock(&wci_lock);
+		return;
+	}
+
+	if (wci_find_ent(func)) {
+		raw_spin_unlock(&wci_lock);
+		goto restart;
+	}
+
+	ent = &wci_ents[wci_nr_ents++];
+	ent->func = func;
+	atomic64_set(&ent->cnt, 1);
+	hash_add_rcu(wci_hash, &ent->hash_node, (unsigned long)func);
+
+	raw_spin_unlock(&wci_lock);
+}
+
+#else	/* CONFIG_WQ_CPU_INTENSIVE_REPORT */
+static void wq_cpu_intensive_report(work_func_t func) {}
+#endif	/* CONFIG_WQ_CPU_INTENSIVE_REPORT */
+
 /**
  * wq_worker_running - a worker is running again
  * @task: task waking up
@@ -1057,6 +1149,7 @@ void wq_worker_tick(struct task_struct *task)
 	raw_spin_lock(&pool->lock);
 
 	worker_set_flags(worker, WORKER_CPU_INTENSIVE);
+	wq_cpu_intensive_report(worker->current_func);
 	pwq->stats[PWQ_STAT_CPU_INTENSIVE]++;
 
 	if (need_more_worker(pool)) {
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index ce51d4dc6803..97e880aa48d7 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1134,6 +1134,19 @@ config WQ_WATCHDOG
 	  state. This can be configured through kernel parameter
 	  "workqueue.watchdog_thresh" and its sysfs counterpart.
 
+config WQ_CPU_INTENSIVE_REPORT
+	bool "Report per-cpu work items which hog CPU for too long"
+	depends on DEBUG_KERNEL
+	help
+	  Say Y here to enable reporting of concurrency-managed per-cpu work
+	  items that hog CPUs for longer than
+	  workqueue.cpu_intensive_thresh_us. Workqueue automatically
+	  detects and excludes them from concurrency management to prevent
+	  them from stalling other per-cpu work items. Occasional
+	  triggering may not necessarily indicate a problem. Repeated
+	  triggering likely indicates that the work item should be switched
+	  to use an unbound workqueue.
+
 config TEST_LOCKUP
 	tristate "Test module to generate lockups"
 	depends on m
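
[For illustration only; not part of the patch.] The warnings emitted above follow the format string "workqueue: %ps hogged CPU for >%luus %llu times, consider switching to WQ_UNBOUND". A rough Python sketch for tallying the worst offenders from kernel log text piped on stdin; the script name and the exact rendering of %ps (which may carry a "[module]" suffix) are assumptions:

    #!/usr/bin/env python3
    # Usage sketch:  dmesg | python3 wci_tally.py
    # Collects the highest reported violation count per work function.
    import collections
    import re
    import sys

    PAT = re.compile(r"workqueue: (\S+) hogged CPU for >(\d+)us (\d+) times")

    worst = collections.Counter()
    for line in sys.stdin:
        m = PAT.search(line)
        if m:
            func, times = m.group(1), int(m.group(3))
            worst[func] = max(worst[func], times)

    for func, times in worst.most_common():
        print(f"{func:40} {times:8}")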
From patchwork Thu May 18 03:00:33 2023
X-Patchwork-Submitter: Tejun Heo
X-Patchwork-Id: 95636
Sender: Tejun Heo
From: Tejun Heo
To: jiangshanlai@gmail.com
Cc: torvalds@linux-foundation.org, peterz@infradead.org,
    linux-kernel@vger.kernel.org, kernel-team@meta.com, Tejun Heo
Subject: [PATCH 7/7] workqueue: Track and monitor per-workqueue CPU time usage
Date: Wed, 17 May 2023 17:00:33 -1000
Message-Id: <20230518030033.4163274-8-tj@kernel.org>
In-Reply-To: <20230518030033.4163274-1-tj@kernel.org>
References: <20230518030033.4163274-1-tj@kernel.org>

Now that wq_worker_tick() is there, we can easily track the rough CPU time
consumption of each workqueue by charging the whole tick whenever a tick hits
an active workqueue. While not super accurate, it provides reasonable
visibility into the workqueues that consume a lot of CPU cycles.

wq_monitor.py is updated to report the per-workqueue CPU times.

v2: wq_monitor.py was using "cputime" as the key when outputting in json
    format. Use "cpu_time" instead for consistency with other fields.

Signed-off-by: Tejun Heo
---
 Documentation/core-api/workqueue.rst | 38 ++++++++++++++--------------
 kernel/workqueue.c                   |  3 +++
 tools/workqueue/wq_monitor.py        |  9 ++++++-
 3 files changed, 30 insertions(+), 20 deletions(-)

diff --git a/Documentation/core-api/workqueue.rst b/Documentation/core-api/workqueue.rst
index a389f31b025c..a4c9b9d1905f 100644
--- a/Documentation/core-api/workqueue.rst
+++ b/Documentation/core-api/workqueue.rst
@@ -354,25 +354,25 @@ Monitoring
 Use tools/workqueue/wq_monitor.py to monitor workqueue operations: ::
 
   $ tools/workqueue/wq_monitor.py events
-                            total  infl CPUitsv  CMwake  mayday rescued
-events                      18545     0       0       5       -       -
-events_highpri                  8     0       0       0       -       -
-events_long                     3     0       0       0       -       -
-events_unbound              38306     0       -       -       -       -
-events_freezable                0     0       0       0       -       -
-events_power_efficient      29598     0       0       0       -       -
-events_freezable_power_        10     0       0       0       -       -
-sock_diag_events                0     0       0       0       -       -
-
-                            total  infl CPUitsv  CMwake  mayday rescued
-events                      18548     0       0       5       -       -
-events_highpri                  8     0       0       0       -       -
-events_long                     3     0       0       0       -       -
-events_unbound              38322     0       -       -       -       -
-events_freezable                0     0       0       0       -       -
-events_power_efficient      29603     0       0       0       -       -
-events_freezable_power_        10     0       0       0       -       -
-sock_diag_events                0     0       0       0       -       -
+                            total  infl  CPUtime CPUitsv  CMwake  mayday rescued
+events                      18545     0      6.1       0       5       -       -
+events_highpri                  8     0      0.0       0       0       -       -
+events_long                     3     0      0.0       0       0       -       -
+events_unbound              38306     0      0.1       -       -       -       -
+events_freezable                0     0      0.0       0       0       -       -
+events_power_efficient      29598     0      0.2       0       0       -       -
+events_freezable_power_        10     0      0.0       0       0       -       -
+sock_diag_events                0     0      0.0       0       0       -       -
+
+                            total  infl  CPUtime CPUitsv  CMwake  mayday rescued
+events                      18548     0      6.1       0       5       -       -
+events_highpri                  8     0      0.0       0       0       -       -
+events_long                     3     0      0.0       0       0       -       -
+events_unbound              38322     0      0.1       -       -       -       -
+events_freezable                0     0      0.0       0       0       -       -
+events_power_efficient      29603     0      0.2       0       0       -       -
+events_freezable_power_        10     0      0.0       0       0       -       -
+sock_diag_events                0     0      0.0       0       0       -       -
 
 ...
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 4ca66384d288..ee16ddb0647c 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -212,6 +212,7 @@ struct worker_pool {
 enum pool_workqueue_stats {
 	PWQ_STAT_STARTED,	/* work items started execution */
 	PWQ_STAT_COMPLETED,	/* work items completed execution */
+	PWQ_STAT_CPU_TIME,	/* total CPU time consumed */
 	PWQ_STAT_CPU_INTENSIVE,	/* wq_cpu_intensive_thresh_us violations */
 	PWQ_STAT_CM_WAKEUP,	/* concurrency-management worker wakeups */
 	PWQ_STAT_MAYDAY,	/* maydays to rescuer */
@@ -1136,6 +1137,8 @@ void wq_worker_tick(struct task_struct *task)
 	if (!pwq)
 		return;
 
+	pwq->stats[PWQ_STAT_CPU_TIME] += TICK_USEC;
+
 	/*
 	 * If the current worker is concurrency managed and hogged the CPU for
 	 * longer than wq_cpu_intensive_thresh_us, it's automatically marked
diff --git a/tools/workqueue/wq_monitor.py b/tools/workqueue/wq_monitor.py
index 7c6f523b9164..6e258d123e8c 100644
--- a/tools/workqueue/wq_monitor.py
+++ b/tools/workqueue/wq_monitor.py
@@ -11,6 +11,10 @@ https://github.com/osandov/drgn.
 
   infl      The number of currently in-flight work items.
 
+  CPUtime   Total CPU time consumed by the workqueue in seconds. This is
+            sampled from scheduler ticks and only provides ballpark
+            measurement. "nohz_full=" CPUs are excluded from measurement.
+
   CPUitsv   The number of times a concurrency-managed work item hogged CPU
             longer than the threshold (workqueue.cpu_intensive_thresh_us)
             and got excluded from concurrency management to avoid stalling
@@ -58,6 +62,7 @@ WQ_MEM_RECLAIM = prog['WQ_MEM_RECLAIM']
 
 PWQ_STAT_STARTED        = prog['PWQ_STAT_STARTED']        # work items started execution
 PWQ_STAT_COMPLETED      = prog['PWQ_STAT_COMPLETED']      # work items completed execution
+PWQ_STAT_CPU_TIME       = prog['PWQ_STAT_CPU_TIME']       # total CPU time consumed
 PWQ_STAT_CPU_INTENSIVE  = prog['PWQ_STAT_CPU_INTENSIVE']  # wq_cpu_intensive_thresh_us violations
 PWQ_STAT_CM_WAKEUP      = prog['PWQ_STAT_CM_WAKEUP']      # concurrency-management worker wakeups
 PWQ_STAT_MAYDAY         = prog['PWQ_STAT_MAYDAY']         # maydays to rescuer
@@ -81,13 +86,14 @@ PWQ_NR_STATS = prog['PWQ_NR_STATS']
             'mem_reclaim'   : self.mem_reclaim,
             'started'       : self.stats[PWQ_STAT_STARTED],
             'completed'     : self.stats[PWQ_STAT_COMPLETED],
+            'cpu_time'      : self.stats[PWQ_STAT_CPU_TIME],
             'cpu_intensive' : self.stats[PWQ_STAT_CPU_INTENSIVE],
             'cm_wakeup'     : self.stats[PWQ_STAT_CM_WAKEUP],
             'mayday'        : self.stats[PWQ_STAT_MAYDAY],
             'rescued'       : self.stats[PWQ_STAT_RESCUED],
         }
 
     def table_header_str():
-        return f'{"":>24} {"total":>8} {"infl":>5} '\
+        return f'{"":>24} {"total":>8} {"infl":>5} {"CPUtime":>8} '\
               f'{"CPUitsv":>7} {"CMwake":>7} {"mayday":>7} {"rescued":>7}'
 
     def table_row_str(self):
@@ -107,6 +113,7 @@ PWQ_NR_STATS = prog['PWQ_NR_STATS']
         out = f'{self.name[-24:]:24} ' \
               f'{self.stats[PWQ_STAT_STARTED]:8} ' \
               f'{max(self.stats[PWQ_STAT_STARTED] - self.stats[PWQ_STAT_COMPLETED], 0):5} ' \
+              f'{self.stats[PWQ_STAT_CPU_TIME] / 1000000:8.1f} ' \
              f'{cpu_intensive:>7} ' \
              f'{cm_wakeup:>7} ' \
              f'{mayday:>7} ' \
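
[For illustration only; not part of the patch.] The CPUtime column is tick-granular: wq_worker_tick() adds TICK_USEC to PWQ_STAT_CPU_TIME for every tick that lands on a busy kworker, and wq_monitor.py divides by 1,000,000 to print seconds. A small Python sketch of that arithmetic; the 10ms tick length below is an assumption for illustration and depends on the kernel's TICK_USEC definition:

    # Rough model of the CPUtime accounting (not kernel code).
    TICK_USEC = 10_000                       # assumed microseconds charged per tick

    def cpu_time_seconds(busy_ticks: int) -> float:
        # Accumulate like pwq->stats[PWQ_STAT_CPU_TIME] += TICK_USEC,
        # then convert to seconds the way wq_monitor.py does.
        return busy_ticks * TICK_USEC / 1_000_000

    print(cpu_time_seconds(610))             # -> 6.1, matching the doc sample above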