Message ID | 20240214063708.972376-1-irogers@google.com |
---|---|
Headers |
Return-Path: <linux-kernel+bounces-64795-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7300:bc8a:b0:106:860b:bbdd with SMTP id dn10csp1025076dyb; Tue, 13 Feb 2024 22:38:15 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCViAccDlz8akBXFNfjRYSqzvoaIUz/a3ldpRL47aHCgpoI7MidvAhepZ2uICTBXAomet29N2a9ZTHr0hdv4TpisaFOjNg== X-Google-Smtp-Source: AGHT+IGQ25LPNy0iSD8qbsk7t8fAfK+wUgfChnmN/eV73VJA8QkKIJQfwd4IgwZ/Fs5A62bVvwbI X-Received: by 2002:a05:6358:3a15:b0:178:fac8:da7c with SMTP id g21-20020a0563583a1500b00178fac8da7cmr1196374rwe.21.1707892694989; Tue, 13 Feb 2024 22:38:14 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1707892694; cv=pass; d=google.com; s=arc-20160816; b=d6y9IyGLRJ8x0GYkvEwWXl/RL+dsDkZVODcARU4khc651yRdxxrUZgvtWzCKYGrX11 Qb9bjI2Rchjt3K5sBo1Eoqyss/PBtn6IDcvJOWMQ5W7eEgEyL2KflGWRJdQFApf7zQ49 KF9yr90QPXFhPss4g6ko1mvFgjcUay8KWIQrj2QPTK9RJ7RKJ2o8HtIOH/TgHQOy+c9v dlOQkC756m4FGqyYTtsnGCzSlkrNqTCWgPujz5AjLSwt9+3aCxAVCWtMGgoS4TsTuTFC CL1m56rxfqNy8zxG5Hkpz4IZ+IEH67KBsj0ftGWtgHvNeVLUC5HBmN025OHPgOkl9Sld M/XA== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=to:from:subject:mime-version:list-unsubscribe:list-subscribe :list-id:precedence:message-id:date:dkim-signature; bh=SjusGx6p43Fl1gAtWsqPYtqEBcBHhyA1NbBeHZWXXBk=; fh=6x37Wby9zaiFK1DjrD+M/ftTAPHbmLkS+j5TUbKtG8c=; b=ClLpC6Oyh/tikVy+7scoE1jwuT1ltMFpLYIQdYTKJ14llXzQ+oqfBp3VYHTc/5zBLs XyZGUN7hzRojch6HXtzE8PIIsUiLoYFgKD9xbp3myN3vht9doUFk2pRGs70vKUui/d/j UFIhVKRB5sUEK/eF7xSLjNKCcqvduyepZ0F8XZImu7H1Qkc4MGfXOxLn8keRDar22l2j v7QMTMI6pvm2XotyRKZUQoegnz0JsB8BAkFvucypvMOjkcmiLvHMBM/YrDSdm7iLmHz3 mUBPW6ZOZqFeTfzce5SN/4EoniYSsDgTG3D56keKz0nXcmLOdg+o7jNMJBIfCmBRJTUp tG8A==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=aKC2p1yW; arc=pass (i=1 spf=pass spfdomain=flex--irogers.bounces.google.com dkim=pass dkdomain=google.com dmarc=pass fromdomain=google.com); spf=pass (google.com: domain of linux-kernel+bounces-64795-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-64795-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com X-Forwarded-Encrypted: i=2; AJvYcCXMD4rHgMeQMEQ5E2SPsA4nGv6jR2Ua4NDIu7XSyhWKfIJDQVPWYLdwwio/cEAgLYI670o82SCWnrYEg4OxwgIpWieuCA== Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [139.178.88.99]) by mx.google.com with ESMTPS id w16-20020a639350000000b005dc42fd8c40si3154503pgm.404.2024.02.13.22.38.14 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 13 Feb 2024 22:38:14 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-64795-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) client-ip=139.178.88.99; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=aKC2p1yW; arc=pass (i=1 spf=pass spfdomain=flex--irogers.bounces.google.com dkim=pass dkdomain=google.com dmarc=pass fromdomain=google.com); spf=pass (google.com: domain of linux-kernel+bounces-64795-ouuuleilei=gmail.com@vger.kernel.org designates 139.178.88.99 as permitted sender) smtp.mailfrom="linux-kernel+bounces-64795-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id B91F0285C03 for <ouuuleilei@gmail.com>; Wed, 14 Feb 2024 06:38:14 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 004B2125BB; Wed, 14 Feb 2024 06:37:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="aKC2p1yW" Received: from mail-yw1-f202.google.com (mail-yw1-f202.google.com [209.85.128.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BF21B11729 for <linux-kernel@vger.kernel.org>; Wed, 14 Feb 2024 06:37:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707892643; cv=none; b=eK9s9BaptwrQese/M4UxySBRiPkZ1Xg412DT9W8lyNYdn4XenggxCmv0v/+RT+hmsxlRdvuqomTn2itHPe5rsyy8dfL74MrHzy2GAxrp5Y9WEAEok3imp5sfebPzKIiFnZcTZkpjTjV10yY39Ibzlc2pTl8pIXPSz/iBPh+j5X4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1707892643; c=relaxed/simple; bh=G/2siGEslfhIe2PbP6r2MxvFbe+8RJMIG2S2/Tjz5Sk=; h=Date:Message-Id:Mime-Version:Subject:From:To:Content-Type; b=LXx1TAkJIFG7WTzhvkxVyDedDx3USrmEpg4W/TK5X0ssIcGWMJOhWHBeGb6w2fkEu+PV4iM1MT8u5uVH05yENvUKWc4V6o4rbTQH+E7mjosCc9RMk2PMKLE8yGSknGl+t4Vd1dQRJno4pHlzNa3qF+B2fDj/R9cxWNYjwW0GBig= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--irogers.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=aKC2p1yW; arc=none smtp.client-ip=209.85.128.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--irogers.bounces.google.com Received: by mail-yw1-f202.google.com with SMTP id 00721157ae682-6079cf3541bso19270477b3.2 for <linux-kernel@vger.kernel.org>; Tue, 13 Feb 2024 22:37:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1707892641; x=1708497441; darn=vger.kernel.org; h=to:from:subject:mime-version:message-id:date:from:to:cc:subject :date:message-id:reply-to; bh=SjusGx6p43Fl1gAtWsqPYtqEBcBHhyA1NbBeHZWXXBk=; b=aKC2p1yW+k+f204eXYynN0DphVe+dqAkLQgK3xLaZ6jmJk+jt1IS3vh7gEDIi8QM0K tpS4zebZHOjfqUlZglRnc4FuUKJTF37LacDuO7Wf+nXddVip3XOdJWVdfy7HUXRjFyj6 f7SSiSEMPDkEQMTt80tRSFXkVwGY6dU9HLvH9WIgmbMnTPpCnVgkurEac7gWFR75ZxzG GHaSgyvZl79Ucno+D2W7HhpzrIMCf4lFiVmrc5Zubofm6iBJ5bEFSd1woVA+29jKKgqa tUEMPNC1d67NHBHb0kEaQ1NlhZnxyTMWTjlL3Y8RINxrpyclw0e3tFD/zguCkxF9l0XT ZMrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707892641; x=1708497441; h=to:from:subject:mime-version:message-id:date:x-gm-message-state :from:to:cc:subject:date:message-id:reply-to; bh=SjusGx6p43Fl1gAtWsqPYtqEBcBHhyA1NbBeHZWXXBk=; b=tR/Wtlq94drdt4po0yhSNadoKst/Xo3huyORKOEdhqkNtaDIfa2ElMLYICe92XV7C0 0llCNpG7064N6+p967UF/uwWOpn+2a64OknC5NqP/lVSZeRMk38GmlwkWKJD5LK9RO64 TRUudF/etUpO/V2PlD8RdjT2VVLHjSGBGU9MLj7H43TYGfQ0pH3Y1EfbEyrQgsH3PtoM gdQzi3+JhpJeaJmKqEjc8LW3vOb4tF5Idxj3+xtTn967QcoWxuYiAKOJOe3cgWTPb/Ow Yd/W5+iasZFVo7aFhMYIj+H3pwkFzZLevt4dp2gAkdMJylJS89l2CbuAVekU951N//f5 vy5g== X-Forwarded-Encrypted: i=1; AJvYcCUYUobSLGhn8o736E0dOE78reeMNfqRwWQNggdecWjf66RrIB2v62Law4BWwsyHa+g80NVVX1nX2J4Xfbt6dAy4u30TFoGc2NTkzuPs X-Gm-Message-State: AOJu0YyNZ1GCBScW1ZtcqKkAjO+M/WNQQioX3ct4i9NMIqLNFlJQjiLy PFROl0JdRe7S7/tcRSbMMJYkLf7r8K3+474g/Kr4Op98ZquVEnJMjHibryg7aPBUZ6kBdTN2E3O ATrQCyw== X-Received: from irogers.svl.corp.google.com ([2620:15c:2a3:200:6d92:85eb:9adc:66dd]) (user=irogers job=sendgmr) by 2002:a81:a105:0:b0:603:ea10:c37c with SMTP id y5-20020a81a105000000b00603ea10c37cmr360558ywg.7.1707892640759; Tue, 13 Feb 2024 22:37:20 -0800 (PST) Date: Tue, 13 Feb 2024 22:37:02 -0800 Message-Id: <20240214063708.972376-1-irogers@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> Mime-Version: 1.0 X-Mailer: git-send-email 2.43.0.687.g38aa6559b0-goog Subject: [PATCH v1 0/6] Thread memory improvements and fixes From: Ian Rogers <irogers@google.com> To: Peter Zijlstra <peterz@infradead.org>, Ingo Molnar <mingo@redhat.com>, Arnaldo Carvalho de Melo <acme@kernel.org>, Namhyung Kim <namhyung@kernel.org>, Mark Rutland <mark.rutland@arm.com>, Alexander Shishkin <alexander.shishkin@linux.intel.com>, Jiri Olsa <jolsa@kernel.org>, Ian Rogers <irogers@google.com>, Adrian Hunter <adrian.hunter@intel.com>, Oliver Upton <oliver.upton@linux.dev>, Yang Jihong <yangjihong1@huawei.com>, linux-kernel@vger.kernel.org, linux-perf-users@vger.kernel.org, bpf@vger.kernel.org Content-Type: text/plain; charset="UTF-8" X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1790855290173394191 X-GMAIL-MSGID: 1790855290173394191 |
Series |
Thread memory improvements and fixes
|
|
Message
Ian Rogers
Feb. 14, 2024, 6:37 a.m. UTC
The next 6 patches from: https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/ now the initial maps fixes have landed: https://lore.kernel.org/all/20240210031746.4057262-1-irogers@google.com/ Separate out and reimplement threads to use a hashmap for lower memory consumption and faster look up. The fixes a regression in memory usage where reference count checking switched to using non-invasive tree nodes. Reduce threads default size by 32 times and improve locking discipline. Also, fix regressions where tids had become unordered to make `perf report --tasks` and `perf trace --summary` output easier to read. Ian Rogers (6): perf report: Sort child tasks by tid perf trace: Ignore thread hashing in summary perf machine: Move fprintf to for_each loop and a callback perf threads: Move threads to its own files perf threads: Switch from rbtree to hashmap perf threads: Reduce table size from 256 to 8 tools/perf/builtin-report.c | 203 ++++++++------- tools/perf/builtin-trace.c | 41 +-- tools/perf/util/Build | 1 + tools/perf/util/bpf_lock_contention.c | 8 +- tools/perf/util/machine.c | 344 +++++++------------------- tools/perf/util/machine.h | 30 +-- tools/perf/util/rb_resort.h | 5 - tools/perf/util/thread.c | 2 +- tools/perf/util/thread.h | 6 - tools/perf/util/threads.c | 186 ++++++++++++++ tools/perf/util/threads.h | 35 +++ 11 files changed, 464 insertions(+), 397 deletions(-) create mode 100644 tools/perf/util/threads.c create mode 100644 tools/perf/util/threads.h
Comments
On Tue, Feb 13, 2024 at 10:37:03PM -0800, Ian Rogers wrote: > Commit 91e467bc568f ("perf machine: Use hashtable for machine > threads") made the iteration of thread tids unordered. The perf report > --tasks output now shows child threads in an order determined by the > hashing. For example, in this snippet tid 3 appears after tid 256 even > though they have the same ppid 2: > > ``` > $ perf report --tasks > % pid tid ppid comm > 0 0 -1 |swapper > 2 2 0 | kthreadd > 256 256 2 | kworker/12:1H-k > 693761 693761 2 | kworker/10:1-mm > 1301762 1301762 2 | kworker/1:1-mm_ > 1302530 1302530 2 | kworker/u32:0-k > 3 3 2 | rcu_gp > ... > ``` > > The output is easier to read if threads appear numerically > increasing. To allow for this, read all threads into a list then sort > with a comparator that orders by the child task's of the first common > parent. The list creation and deletion are created as utilities on > machine. The indentation is possible by counting the number of > parents a child has. > > With this change the output for the same data file is now like: > ``` > $ perf report --tasks > % pid tid ppid comm > 0 0 -1 |swapper > 1 1 0 | systemd > 823 823 1 | systemd-journal > 853 853 1 | systemd-udevd > 3230 3230 1 | systemd-timesyn > 3236 3236 1 | auditd > 3239 3239 3236 | audisp-syslog > 3321 3321 1 | accounts-daemon Since we're adding extra code for sorting wouldn't be more convenient to have this done in an graphically hierarchical output? But maybe to make this honour asking for a CSV output the above is enough? Or can we have both, i.e. for people just doing --tasks, the hirarchical way, for CSV, then like above, with the comma separator. But then perf stat has -x to ask for CSV that is used by the more obscure --exclude-other option :-\ Maybe we need a --csv that is consistent accross all tools. - Arnaldo ⬢[acme@toolbox b]$ perf stat -x, ls perf.data 0.65,msec,task-clock:u,648266,100.00,0.534,CPUs utilized 0,,context-switches:u,648266,100.00,0.000,/sec 0,,cpu-migrations:u,648266,100.00,0.000,/sec 91,,page-faults:u,648266,100.00,140.374,K/sec 775564,,cpu_atom/cycles/u,276334,42.00,1.196,GHz <not counted>,,cpu_core/cycles/u,0,0.00,, 508381,,cpu_atom/instructions/u,648266,100.00,0.66,insn per cycle <not counted>,,cpu_core/instructions/u,0,0.00,, 99137,,cpu_atom/branches/u,648266,100.00,152.926,M/sec <not counted>,,cpu_core/branches/u,0,0.00,, 6238,,cpu_atom/branch-misses/u,648266,100.00,6.29,of all branches <not counted>,,cpu_core/branch-misses/u,0,0.00,, ,648266,100.00,,,,TopdownL1 (cpu_atom) ,,,,,87.9,% tma_bad_speculation ,648266,100.00,,,24.0,% tma_frontend_bound ,648266,100.00,,,31.5,% tma_backend_bound ,,,,,31.5,% tma_backend_bound_aux ,371932,57.00,,,0.0,% tma_retiring ⬢[acme@toolbox b]$ perf report -h -x Usage: perf report [<options>] -x, --exclude-other Only display entries with parent-match ⬢[acme@toolbox b]$ - Arnaldo > ... > ``` > > Signed-off-by: Ian Rogers <irogers@google.com> > --- > tools/perf/builtin-report.c | 203 ++++++++++++++++++++---------------- > tools/perf/util/machine.c | 30 ++++++ > tools/perf/util/machine.h | 10 ++ > 3 files changed, 155 insertions(+), 88 deletions(-) > > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c > index 8e16fa261e6f..b48f1d5309e3 100644 > --- a/tools/perf/builtin-report.c > +++ b/tools/perf/builtin-report.c > @@ -59,6 +59,7 @@ > #include <linux/ctype.h> > #include <signal.h> > #include <linux/bitmap.h> > +#include <linux/list_sort.h> > #include <linux/string.h> > #include <linux/stringify.h> > #include <linux/time64.h> > @@ -828,35 +829,6 @@ static void tasks_setup(struct report *rep) > rep->tool.no_warn = true; > } > > -struct task { > - struct thread *thread; > - struct list_head list; > - struct list_head children; > -}; > - > -static struct task *tasks_list(struct task *task, struct machine *machine) > -{ > - struct thread *parent_thread, *thread = task->thread; > - struct task *parent_task; > - > - /* Already listed. */ > - if (!list_empty(&task->list)) > - return NULL; > - > - /* Last one in the chain. */ > - if (thread__ppid(thread) == -1) > - return task; > - > - parent_thread = machine__find_thread(machine, -1, thread__ppid(thread)); > - if (!parent_thread) > - return ERR_PTR(-ENOENT); > - > - parent_task = thread__priv(parent_thread); > - thread__put(parent_thread); > - list_add_tail(&task->list, &parent_task->children); > - return tasks_list(parent_task, machine); > -} > - > struct maps__fprintf_task_args { > int indent; > FILE *fp; > @@ -900,89 +872,144 @@ static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp) > return args.printed; > } > > -static void task__print_level(struct task *task, FILE *fp, int level) > +static int thread_level(struct machine *machine, const struct thread *thread) > { > - struct thread *thread = task->thread; > - struct task *child; > - int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", > - thread__pid(thread), thread__tid(thread), > - thread__ppid(thread), level, ""); > + struct thread *parent_thread; > + int res; > > - fprintf(fp, "%s\n", thread__comm_str(thread)); > + if (thread__tid(thread) <= 0) > + return 0; > > - maps__fprintf_task(thread__maps(thread), comm_indent, fp); > + if (thread__ppid(thread) <= 0) > + return 1; > > - if (!list_empty(&task->children)) { > - list_for_each_entry(child, &task->children, list) > - task__print_level(child, fp, level + 1); > + parent_thread = machine__find_thread(machine, -1, thread__ppid(thread)); > + if (!parent_thread) { > + pr_err("Missing parent thread of %d\n", thread__tid(thread)); > + return 0; > } > + res = 1 + thread_level(machine, parent_thread); > + thread__put(parent_thread); > + return res; > } > > -static int tasks_print(struct report *rep, FILE *fp) > +static void task__print_level(struct machine *machine, struct thread *thread, FILE *fp) > { > - struct perf_session *session = rep->session; > - struct machine *machine = &session->machines.host; > - struct task *tasks, *task; > - unsigned int nr = 0, itask = 0, i; > - struct rb_node *nd; > - LIST_HEAD(list); > + int level = thread_level(machine, thread); > + int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", > + thread__pid(thread), thread__tid(thread), > + thread__ppid(thread), level, ""); > > - /* > - * No locking needed while accessing machine->threads, > - * because --tasks is single threaded command. > - */ > + fprintf(fp, "%s\n", thread__comm_str(thread)); > > - /* Count all the threads. */ > - for (i = 0; i < THREADS__TABLE_SIZE; i++) > - nr += machine->threads[i].nr; > + maps__fprintf_task(thread__maps(thread), comm_indent, fp); > +} > > - tasks = malloc(sizeof(*tasks) * nr); > - if (!tasks) > - return -ENOMEM; > +static int task_list_cmp(void *priv, const struct list_head *la, const struct list_head *lb) > +{ > + struct machine *machine = priv; > + struct thread_list *task_a = list_entry(la, struct thread_list, list); > + struct thread_list *task_b = list_entry(lb, struct thread_list, list); > + struct thread *a = task_a->thread; > + struct thread *b = task_b->thread; > + int level_a, level_b, res; > + > + /* Compare a and b to root. */ > + if (thread__tid(a) == thread__tid(b)) > + return 0; > > - for (i = 0; i < THREADS__TABLE_SIZE; i++) { > - struct threads *threads = &machine->threads[i]; > + if (thread__tid(a) == 0) > + return -1; > > - for (nd = rb_first_cached(&threads->entries); nd; > - nd = rb_next(nd)) { > - task = tasks + itask++; > + if (thread__tid(b) == 0) > + return 1; > > - task->thread = rb_entry(nd, struct thread_rb_node, rb_node)->thread; > - INIT_LIST_HEAD(&task->children); > - INIT_LIST_HEAD(&task->list); > - thread__set_priv(task->thread, task); > - } > + /* If parents match sort by tid. */ > + if (thread__ppid(a) == thread__ppid(b)) { > + return thread__tid(a) < thread__tid(b) > + ? -1 > + : (thread__tid(a) > thread__tid(b) ? 1 : 0); > } > > /* > - * Iterate every task down to the unprocessed parent > - * and link all in task children list. Task with no > - * parent is added into 'list'. > + * Find a and b such that if they are a child of each other a and b's > + * tid's match, otherwise a and b have a common parent and distinct > + * tid's to sort by. First make the depths of the threads match. > */ > - for (itask = 0; itask < nr; itask++) { > - task = tasks + itask; > - > - if (!list_empty(&task->list)) > - continue; > - > - task = tasks_list(task, machine); > - if (IS_ERR(task)) { > - pr_err("Error: failed to process tasks\n"); > - free(tasks); > - return PTR_ERR(task); > + level_a = thread_level(machine, a); > + level_b = thread_level(machine, b); > + a = thread__get(a); > + b = thread__get(b); > + for (int i = level_a; i > level_b; i--) { > + struct thread *parent = machine__find_thread(machine, -1, thread__ppid(a)); > + > + thread__put(a); > + if (!parent) { > + pr_err("Missing parent thread of %d\n", thread__tid(a)); > + thread__put(b); > + return -1; > } > + a = parent; > + } > + for (int i = level_b; i > level_a; i--) { > + struct thread *parent = machine__find_thread(machine, -1, thread__ppid(b)); > > - if (task) > - list_add_tail(&task->list, &list); > + thread__put(b); > + if (!parent) { > + pr_err("Missing parent thread of %d\n", thread__tid(b)); > + thread__put(a); > + return 1; > + } > + b = parent; > + } > + /* Search up to a common parent. */ > + while (thread__ppid(a) != thread__ppid(b)) { > + struct thread *parent; > + > + parent = machine__find_thread(machine, -1, thread__ppid(a)); > + thread__put(a); > + if (!parent) > + pr_err("Missing parent thread of %d\n", thread__tid(a)); > + a = parent; > + parent = machine__find_thread(machine, -1, thread__ppid(b)); > + thread__put(b); > + if (!parent) > + pr_err("Missing parent thread of %d\n", thread__tid(b)); > + b = parent; > + if (!a || !b) > + return !a && !b ? 0 : (!a ? -1 : 1); > + } > + if (thread__tid(a) == thread__tid(b)) { > + /* a is a child of b or vice-versa, deeper levels appear later. */ > + res = level_a < level_b ? -1 : (level_a > level_b ? 1 : 0); > + } else { > + /* Sort by tid now the parent is the same. */ > + res = thread__tid(a) < thread__tid(b) ? -1 : 1; > } > + thread__put(a); > + thread__put(b); > + return res; > +} > + > +static int tasks_print(struct report *rep, FILE *fp) > +{ > + struct machine *machine = &rep->session->machines.host; > + LIST_HEAD(tasks); > + int ret; > > - fprintf(fp, "# %8s %8s %8s %s\n", "pid", "tid", "ppid", "comm"); > + ret = machine__thread_list(machine, &tasks); > + if (!ret) { > + struct thread_list *task; > > - list_for_each_entry(task, &list, list) > - task__print_level(task, fp, 0); > + list_sort(machine, &tasks, task_list_cmp); > > - free(tasks); > - return 0; > + fprintf(fp, "# %8s %8s %8s %s\n", "pid", "tid", "ppid", "comm"); > + > + list_for_each_entry(task, &tasks, list) > + task__print_level(machine, task->thread, fp); > + } > + thread_list__delete(&tasks); > + return ret; > } > > static int __cmd_report(struct report *rep) > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > index 3da92f18814a..7872ce92c9fc 100644 > --- a/tools/perf/util/machine.c > +++ b/tools/perf/util/machine.c > @@ -3261,6 +3261,36 @@ int machines__for_each_thread(struct machines *machines, > return rc; > } > > + > +static int thread_list_cb(struct thread *thread, void *data) > +{ > + struct list_head *list = data; > + struct thread_list *entry = malloc(sizeof(*entry)); > + > + if (!entry) > + return -ENOMEM; > + > + entry->thread = thread__get(thread); > + list_add_tail(&entry->list, list); > + return 0; > +} > + > +int machine__thread_list(struct machine *machine, struct list_head *list) > +{ > + return machine__for_each_thread(machine, thread_list_cb, list); > +} > + > +void thread_list__delete(struct list_head *list) > +{ > + struct thread_list *pos, *next; > + > + list_for_each_entry_safe(pos, next, list, list) { > + thread__zput(pos->thread); > + list_del(&pos->list); > + free(pos); > + } > +} > + > pid_t machine__get_current_tid(struct machine *machine, int cpu) > { > if (cpu < 0 || (size_t)cpu >= machine->current_tid_sz) > diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h > index 1279acda6a8a..b738ce84817b 100644 > --- a/tools/perf/util/machine.h > +++ b/tools/perf/util/machine.h > @@ -280,6 +280,16 @@ int machines__for_each_thread(struct machines *machines, > int (*fn)(struct thread *thread, void *p), > void *priv); > > +struct thread_list { > + struct list_head list; > + struct thread *thread; > +}; > + > +/* Make a list of struct thread_list based on threads in the machine. */ > +int machine__thread_list(struct machine *machine, struct list_head *list); > +/* Free up the nodes within the thread_list list. */ > +void thread_list__delete(struct list_head *list); > + > pid_t machine__get_current_tid(struct machine *machine, int cpu); > int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid, > pid_t tid); > -- > 2.43.0.687.g38aa6559b0-goog >
On Wed, Feb 14, 2024 at 9:42 AM Ian Rogers <irogers@google.com> wrote: > > On Wed, Feb 14, 2024 at 9:24 AM Arnaldo Carvalho de Melo > <acme@kernel.org> wrote: > > > > On Tue, Feb 13, 2024 at 10:37:03PM -0800, Ian Rogers wrote: > > > Commit 91e467bc568f ("perf machine: Use hashtable for machine > > > threads") made the iteration of thread tids unordered. The perf report > > > --tasks output now shows child threads in an order determined by the > > > hashing. For example, in this snippet tid 3 appears after tid 256 even > > > though they have the same ppid 2: > > > > > > ``` > > > $ perf report --tasks > > > % pid tid ppid comm > > > 0 0 -1 |swapper > > > 2 2 0 | kthreadd > > > 256 256 2 | kworker/12:1H-k > > > 693761 693761 2 | kworker/10:1-mm > > > 1301762 1301762 2 | kworker/1:1-mm_ > > > 1302530 1302530 2 | kworker/u32:0-k > > > 3 3 2 | rcu_gp > > > ... > > > ``` > > > > > > The output is easier to read if threads appear numerically > > > increasing. To allow for this, read all threads into a list then sort > > > with a comparator that orders by the child task's of the first common > > > parent. The list creation and deletion are created as utilities on > > > machine. The indentation is possible by counting the number of > > > parents a child has. > > > > > > With this change the output for the same data file is now like: > > > ``` > > > $ perf report --tasks > > > % pid tid ppid comm > > > 0 0 -1 |swapper > > > 1 1 0 | systemd > > > 823 823 1 | systemd-journal > > > 853 853 1 | systemd-udevd > > > 3230 3230 1 | systemd-timesyn > > > 3236 3236 1 | auditd > > > 3239 3239 3236 | audisp-syslog > > > 3321 3321 1 | accounts-daemon > > > > > > Since we're adding extra code for sorting wouldn't be more convenient to > > have this done in an graphically hierarchical output? > > > > But maybe to make this honour asking for a CSV output the above is > > enough? Or can we have both, i.e. for people just doing --tasks, the > > hirarchical way, for CSV, then like above, with the comma separator. > > > > But then perf stat has -x to ask for CSV that is used by the more > > obscure --exclude-other option :-\ > > > > Maybe we need a --csv that is consistent accross all tools. > > I've no objection to a graphical/CSV output, I was in this code as I > was restructuring it for memory savings. Fixing the output ordering > was a side-effect, the "graphical" sorting/indentation is mentioned as > it is carrying forward a behavior from the previous code but done in a > somewhat different way. Let's have other output things as follow up > work. Agreed, maybe a good project for GSoC students.. Thanks, Namhyung
On Tue, Feb 13, 2024 at 10:37 PM Ian Rogers <irogers@google.com> wrote: > > The next 6 patches from: > https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/ > now the initial maps fixes have landed: > https://lore.kernel.org/all/20240210031746.4057262-1-irogers@google.com/ > > Separate out and reimplement threads to use a hashmap for lower memory > consumption and faster look up. The fixes a regression in memory usage > where reference count checking switched to using non-invasive tree > nodes. Reduce threads default size by 32 times and improve locking > discipline. Also, fix regressions where tids had become unordered to > make `perf report --tasks` and `perf trace --summary` output easier to > read. > > Ian Rogers (6): > perf report: Sort child tasks by tid > perf trace: Ignore thread hashing in summary > perf machine: Move fprintf to for_each loop and a callback > perf threads: Move threads to its own files > perf threads: Switch from rbtree to hashmap > perf threads: Reduce table size from 256 to 8 > > tools/perf/builtin-report.c | 203 ++++++++------- > tools/perf/builtin-trace.c | 41 +-- > tools/perf/util/Build | 1 + > tools/perf/util/bpf_lock_contention.c | 8 +- > tools/perf/util/machine.c | 344 +++++++------------------- > tools/perf/util/machine.h | 30 +-- > tools/perf/util/rb_resort.h | 5 - > tools/perf/util/thread.c | 2 +- > tools/perf/util/thread.h | 6 - > tools/perf/util/threads.c | 186 ++++++++++++++ > tools/perf/util/threads.h | 35 +++ > 11 files changed, 464 insertions(+), 397 deletions(-) > create mode 100644 tools/perf/util/threads.c > create mode 100644 tools/perf/util/threads.h Arnaldo/Namhyung, anything outstanding that needs addressing in a v2? I'm looking for reviewed-by/acked-by tags :-) Thanks, Ian > -- > 2.43.0.687.g38aa6559b0-goog >
On Tue, Feb 13, 2024 at 10:37 PM Ian Rogers <irogers@google.com> wrote: > > Commit 91e467bc568f ("perf machine: Use hashtable for machine > threads") made the iteration of thread tids unordered. The perf report > --tasks output now shows child threads in an order determined by the > hashing. For example, in this snippet tid 3 appears after tid 256 even > though they have the same ppid 2: > > ``` > $ perf report --tasks > % pid tid ppid comm > 0 0 -1 |swapper > 2 2 0 | kthreadd > 256 256 2 | kworker/12:1H-k > 693761 693761 2 | kworker/10:1-mm > 1301762 1301762 2 | kworker/1:1-mm_ > 1302530 1302530 2 | kworker/u32:0-k > 3 3 2 | rcu_gp > ... > ``` > > The output is easier to read if threads appear numerically > increasing. To allow for this, read all threads into a list then sort > with a comparator that orders by the child task's of the first common > parent. The list creation and deletion are created as utilities on > machine. The indentation is possible by counting the number of > parents a child has. > > With this change the output for the same data file is now like: > ``` > $ perf report --tasks > % pid tid ppid comm > 0 0 -1 |swapper > 1 1 0 | systemd > 823 823 1 | systemd-journal > 853 853 1 | systemd-udevd > 3230 3230 1 | systemd-timesyn > 3236 3236 1 | auditd > 3239 3239 3236 | audisp-syslog > 3321 3321 1 | accounts-daemon > ... > ``` > > Signed-off-by: Ian Rogers <irogers@google.com> > --- > tools/perf/builtin-report.c | 203 ++++++++++++++++++++---------------- > tools/perf/util/machine.c | 30 ++++++ > tools/perf/util/machine.h | 10 ++ > 3 files changed, 155 insertions(+), 88 deletions(-) > > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c > index 8e16fa261e6f..b48f1d5309e3 100644 > --- a/tools/perf/builtin-report.c > +++ b/tools/perf/builtin-report.c > @@ -59,6 +59,7 @@ > #include <linux/ctype.h> > #include <signal.h> > #include <linux/bitmap.h> > +#include <linux/list_sort.h> > #include <linux/string.h> > #include <linux/stringify.h> > #include <linux/time64.h> > @@ -828,35 +829,6 @@ static void tasks_setup(struct report *rep) > rep->tool.no_warn = true; > } > > -struct task { > - struct thread *thread; > - struct list_head list; > - struct list_head children; > -}; > - > -static struct task *tasks_list(struct task *task, struct machine *machine) > -{ > - struct thread *parent_thread, *thread = task->thread; > - struct task *parent_task; > - > - /* Already listed. */ > - if (!list_empty(&task->list)) > - return NULL; > - > - /* Last one in the chain. */ > - if (thread__ppid(thread) == -1) > - return task; > - > - parent_thread = machine__find_thread(machine, -1, thread__ppid(thread)); > - if (!parent_thread) > - return ERR_PTR(-ENOENT); > - > - parent_task = thread__priv(parent_thread); > - thread__put(parent_thread); > - list_add_tail(&task->list, &parent_task->children); > - return tasks_list(parent_task, machine); > -} > - > struct maps__fprintf_task_args { > int indent; > FILE *fp; > @@ -900,89 +872,144 @@ static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp) > return args.printed; > } > > -static void task__print_level(struct task *task, FILE *fp, int level) > +static int thread_level(struct machine *machine, const struct thread *thread) > { > - struct thread *thread = task->thread; > - struct task *child; > - int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", > - thread__pid(thread), thread__tid(thread), > - thread__ppid(thread), level, ""); > + struct thread *parent_thread; > + int res; > > - fprintf(fp, "%s\n", thread__comm_str(thread)); > + if (thread__tid(thread) <= 0) > + return 0; > > - maps__fprintf_task(thread__maps(thread), comm_indent, fp); > + if (thread__ppid(thread) <= 0) > + return 1; > > - if (!list_empty(&task->children)) { > - list_for_each_entry(child, &task->children, list) > - task__print_level(child, fp, level + 1); > + parent_thread = machine__find_thread(machine, -1, thread__ppid(thread)); > + if (!parent_thread) { > + pr_err("Missing parent thread of %d\n", thread__tid(thread)); > + return 0; > } > + res = 1 + thread_level(machine, parent_thread); > + thread__put(parent_thread); > + return res; > } > > -static int tasks_print(struct report *rep, FILE *fp) > +static void task__print_level(struct machine *machine, struct thread *thread, FILE *fp) > { > - struct perf_session *session = rep->session; > - struct machine *machine = &session->machines.host; > - struct task *tasks, *task; > - unsigned int nr = 0, itask = 0, i; > - struct rb_node *nd; > - LIST_HEAD(list); > + int level = thread_level(machine, thread); > + int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", > + thread__pid(thread), thread__tid(thread), > + thread__ppid(thread), level, ""); > > - /* > - * No locking needed while accessing machine->threads, > - * because --tasks is single threaded command. > - */ > + fprintf(fp, "%s\n", thread__comm_str(thread)); > > - /* Count all the threads. */ > - for (i = 0; i < THREADS__TABLE_SIZE; i++) > - nr += machine->threads[i].nr; > + maps__fprintf_task(thread__maps(thread), comm_indent, fp); > +} > > - tasks = malloc(sizeof(*tasks) * nr); > - if (!tasks) > - return -ENOMEM; > +static int task_list_cmp(void *priv, const struct list_head *la, const struct list_head *lb) I'm a little afraid that this comparison logic becomes complex. But I think it's better than having a tree of thread relationship. Just a comment that explains why we need this would be nice. > +{ > + struct machine *machine = priv; > + struct thread_list *task_a = list_entry(la, struct thread_list, list); > + struct thread_list *task_b = list_entry(lb, struct thread_list, list); > + struct thread *a = task_a->thread; > + struct thread *b = task_b->thread; > + int level_a, level_b, res; > + > + /* Compare a and b to root. */ > + if (thread__tid(a) == thread__tid(b)) > + return 0; > > - for (i = 0; i < THREADS__TABLE_SIZE; i++) { > - struct threads *threads = &machine->threads[i]; > + if (thread__tid(a) == 0) > + return -1; > > - for (nd = rb_first_cached(&threads->entries); nd; > - nd = rb_next(nd)) { > - task = tasks + itask++; > + if (thread__tid(b) == 0) > + return 1; > > - task->thread = rb_entry(nd, struct thread_rb_node, rb_node)->thread; > - INIT_LIST_HEAD(&task->children); > - INIT_LIST_HEAD(&task->list); > - thread__set_priv(task->thread, task); > - } > + /* If parents match sort by tid. */ > + if (thread__ppid(a) == thread__ppid(b)) { > + return thread__tid(a) < thread__tid(b) > + ? -1 > + : (thread__tid(a) > thread__tid(b) ? 1 : 0); Can it be simply like this? We know tid(a) != tid(b). return thread__tid(a) < thread__tid(b) ? -1 : 1; > } > > /* > - * Iterate every task down to the unprocessed parent > - * and link all in task children list. Task with no > - * parent is added into 'list'. > + * Find a and b such that if they are a child of each other a and b's > + * tid's match, otherwise a and b have a common parent and distinct > + * tid's to sort by. First make the depths of the threads match. > */ > - for (itask = 0; itask < nr; itask++) { > - task = tasks + itask; > - > - if (!list_empty(&task->list)) > - continue; > - > - task = tasks_list(task, machine); > - if (IS_ERR(task)) { > - pr_err("Error: failed to process tasks\n"); > - free(tasks); > - return PTR_ERR(task); > + level_a = thread_level(machine, a); > + level_b = thread_level(machine, b); > + a = thread__get(a); > + b = thread__get(b); > + for (int i = level_a; i > level_b; i--) { > + struct thread *parent = machine__find_thread(machine, -1, thread__ppid(a)); > + > + thread__put(a); > + if (!parent) { > + pr_err("Missing parent thread of %d\n", thread__tid(a)); > + thread__put(b); > + return -1; > } > + a = parent; > + } > + for (int i = level_b; i > level_a; i--) { > + struct thread *parent = machine__find_thread(machine, -1, thread__ppid(b)); > > - if (task) > - list_add_tail(&task->list, &list); > + thread__put(b); > + if (!parent) { > + pr_err("Missing parent thread of %d\n", thread__tid(b)); > + thread__put(a); > + return 1; > + } > + b = parent; > + } > + /* Search up to a common parent. */ > + while (thread__ppid(a) != thread__ppid(b)) { > + struct thread *parent; > + > + parent = machine__find_thread(machine, -1, thread__ppid(a)); > + thread__put(a); > + if (!parent) > + pr_err("Missing parent thread of %d\n", thread__tid(a)); > + a = parent; > + parent = machine__find_thread(machine, -1, thread__ppid(b)); > + thread__put(b); > + if (!parent) > + pr_err("Missing parent thread of %d\n", thread__tid(b)); > + b = parent; > + if (!a || !b) > + return !a && !b ? 0 : (!a ? -1 : 1); Wouldn't it leak a refcount if either a or b is NULL (not both)? > + } > + if (thread__tid(a) == thread__tid(b)) { > + /* a is a child of b or vice-versa, deeper levels appear later. */ > + res = level_a < level_b ? -1 : (level_a > level_b ? 1 : 0); > + } else { > + /* Sort by tid now the parent is the same. */ > + res = thread__tid(a) < thread__tid(b) ? -1 : 1; > } > + thread__put(a); > + thread__put(b); > + return res; > +} > + > +static int tasks_print(struct report *rep, FILE *fp) > +{ > + struct machine *machine = &rep->session->machines.host; > + LIST_HEAD(tasks); > + int ret; > > - fprintf(fp, "# %8s %8s %8s %s\n", "pid", "tid", "ppid", "comm"); > + ret = machine__thread_list(machine, &tasks); > + if (!ret) { > + struct thread_list *task; Do we really need this thread_list? Why not use an array of threads directly? Thanks, Namhyung > > - list_for_each_entry(task, &list, list) > - task__print_level(task, fp, 0); > + list_sort(machine, &tasks, task_list_cmp); > > - free(tasks); > - return 0; > + fprintf(fp, "# %8s %8s %8s %s\n", "pid", "tid", "ppid", "comm"); > + > + list_for_each_entry(task, &tasks, list) > + task__print_level(machine, task->thread, fp); > + } > + thread_list__delete(&tasks); > + return ret; > } > > static int __cmd_report(struct report *rep) > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > index 3da92f18814a..7872ce92c9fc 100644 > --- a/tools/perf/util/machine.c > +++ b/tools/perf/util/machine.c > @@ -3261,6 +3261,36 @@ int machines__for_each_thread(struct machines *machines, > return rc; > } > > + > +static int thread_list_cb(struct thread *thread, void *data) > +{ > + struct list_head *list = data; > + struct thread_list *entry = malloc(sizeof(*entry)); > + > + if (!entry) > + return -ENOMEM; > + > + entry->thread = thread__get(thread); > + list_add_tail(&entry->list, list); > + return 0; > +} > + > +int machine__thread_list(struct machine *machine, struct list_head *list) > +{ > + return machine__for_each_thread(machine, thread_list_cb, list); > +} > + > +void thread_list__delete(struct list_head *list) > +{ > + struct thread_list *pos, *next; > + > + list_for_each_entry_safe(pos, next, list, list) { > + thread__zput(pos->thread); > + list_del(&pos->list); > + free(pos); > + } > +} > + > pid_t machine__get_current_tid(struct machine *machine, int cpu) > { > if (cpu < 0 || (size_t)cpu >= machine->current_tid_sz) > diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h > index 1279acda6a8a..b738ce84817b 100644 > --- a/tools/perf/util/machine.h > +++ b/tools/perf/util/machine.h > @@ -280,6 +280,16 @@ int machines__for_each_thread(struct machines *machines, > int (*fn)(struct thread *thread, void *p), > void *priv); > > +struct thread_list { > + struct list_head list; > + struct thread *thread; > +}; > + > +/* Make a list of struct thread_list based on threads in the machine. */ > +int machine__thread_list(struct machine *machine, struct list_head *list); > +/* Free up the nodes within the thread_list list. */ > +void thread_list__delete(struct list_head *list); > + > pid_t machine__get_current_tid(struct machine *machine, int cpu); > int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid, > pid_t tid); > -- > 2.43.0.687.g38aa6559b0-goog >
On Tue, Feb 27, 2024 at 10:11 PM Namhyung Kim <namhyung@kernel.org> wrote: > > On Mon, Feb 26, 2024 at 11:12 PM Ian Rogers <irogers@google.com> wrote: > > > > On Mon, Feb 26, 2024 at 10:39 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > > > On Tue, Feb 13, 2024 at 10:37 PM Ian Rogers <irogers@google.com> wrote: > > > > > > > > Commit 91e467bc568f ("perf machine: Use hashtable for machine > > > > threads") made the iteration of thread tids unordered. The perf report > > > > --tasks output now shows child threads in an order determined by the > > > > hashing. For example, in this snippet tid 3 appears after tid 256 even > > > > though they have the same ppid 2: > > > > > > > > ``` > > > > $ perf report --tasks > > > > % pid tid ppid comm > > > > 0 0 -1 |swapper > > > > 2 2 0 | kthreadd > > > > 256 256 2 | kworker/12:1H-k > > > > 693761 693761 2 | kworker/10:1-mm > > > > 1301762 1301762 2 | kworker/1:1-mm_ > > > > 1302530 1302530 2 | kworker/u32:0-k > > > > 3 3 2 | rcu_gp > > > > ... > > > > ``` > > > > > > > > The output is easier to read if threads appear numerically > > > > increasing. To allow for this, read all threads into a list then sort > > > > with a comparator that orders by the child task's of the first common > > > > parent. The list creation and deletion are created as utilities on > > > > machine. The indentation is possible by counting the number of > > > > parents a child has. > > > > > > > > With this change the output for the same data file is now like: > > > > ``` > > > > $ perf report --tasks > > > > % pid tid ppid comm > > > > 0 0 -1 |swapper > > > > 1 1 0 | systemd > > > > 823 823 1 | systemd-journal > > > > 853 853 1 | systemd-udevd > > > > 3230 3230 1 | systemd-timesyn > > > > 3236 3236 1 | auditd > > > > 3239 3239 3236 | audisp-syslog > > > > 3321 3321 1 | accounts-daemon > > > > ... > > > > ``` > > > > > > > > Signed-off-by: Ian Rogers <irogers@google.com> > > I know you sent out v2 already, but let me continue the discussion > here. > > > > > > --- > > > > tools/perf/builtin-report.c | 203 ++++++++++++++++++++---------------- > > > > tools/perf/util/machine.c | 30 ++++++ > > > > tools/perf/util/machine.h | 10 ++ > > > > 3 files changed, 155 insertions(+), 88 deletions(-) > > > > > > > > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c > > > > index 8e16fa261e6f..b48f1d5309e3 100644 > > > > --- a/tools/perf/builtin-report.c > > > > +++ b/tools/perf/builtin-report.c > > > > @@ -59,6 +59,7 @@ > > > > #include <linux/ctype.h> > > > > #include <signal.h> > > > > #include <linux/bitmap.h> > > > > +#include <linux/list_sort.h> > > > > #include <linux/string.h> > > > > #include <linux/stringify.h> > > > > #include <linux/time64.h> > > > > @@ -828,35 +829,6 @@ static void tasks_setup(struct report *rep) > > > > rep->tool.no_warn = true; > > > > } > > > > > > > > -struct task { > > > > - struct thread *thread; > > > > - struct list_head list; > > > > - struct list_head children; > > > > -}; > > > > - > > > > -static struct task *tasks_list(struct task *task, struct machine *machine) > > > > -{ > > > > - struct thread *parent_thread, *thread = task->thread; > > > > - struct task *parent_task; > > > > - > > > > - /* Already listed. */ > > > > - if (!list_empty(&task->list)) > > > > - return NULL; > > > > - > > > > - /* Last one in the chain. */ > > > > - if (thread__ppid(thread) == -1) > > > > - return task; > > > > - > > > > - parent_thread = machine__find_thread(machine, -1, thread__ppid(thread)); > > > > - if (!parent_thread) > > > > - return ERR_PTR(-ENOENT); > > > > - > > > > - parent_task = thread__priv(parent_thread); > > > > - thread__put(parent_thread); > > > > - list_add_tail(&task->list, &parent_task->children); > > > > - return tasks_list(parent_task, machine); > > > > -} > > > > - > > > > struct maps__fprintf_task_args { > > > > int indent; > > > > FILE *fp; > > > > @@ -900,89 +872,144 @@ static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp) > > > > return args.printed; > > > > } > > > > > > > > -static void task__print_level(struct task *task, FILE *fp, int level) > > > > +static int thread_level(struct machine *machine, const struct thread *thread) > > > > { > > > > - struct thread *thread = task->thread; > > > > - struct task *child; > > > > - int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", > > > > - thread__pid(thread), thread__tid(thread), > > > > - thread__ppid(thread), level, ""); > > > > + struct thread *parent_thread; > > > > + int res; > > > > > > > > - fprintf(fp, "%s\n", thread__comm_str(thread)); > > > > + if (thread__tid(thread) <= 0) > > > > + return 0; > > > > > > > > - maps__fprintf_task(thread__maps(thread), comm_indent, fp); > > > > + if (thread__ppid(thread) <= 0) > > > > + return 1; > > > > > > > > - if (!list_empty(&task->children)) { > > > > - list_for_each_entry(child, &task->children, list) > > > > - task__print_level(child, fp, level + 1); > > > > + parent_thread = machine__find_thread(machine, -1, thread__ppid(thread)); > > > > + if (!parent_thread) { > > > > + pr_err("Missing parent thread of %d\n", thread__tid(thread)); > > > > + return 0; > > > > } > > > > + res = 1 + thread_level(machine, parent_thread); > > > > + thread__put(parent_thread); > > > > + return res; > > > > } > > > > > > > > -static int tasks_print(struct report *rep, FILE *fp) > > > > +static void task__print_level(struct machine *machine, struct thread *thread, FILE *fp) > > > > { > > > > - struct perf_session *session = rep->session; > > > > - struct machine *machine = &session->machines.host; > > > > - struct task *tasks, *task; > > > > - unsigned int nr = 0, itask = 0, i; > > > > - struct rb_node *nd; > > > > - LIST_HEAD(list); > > > > + int level = thread_level(machine, thread); > > > > + int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", > > > > + thread__pid(thread), thread__tid(thread), > > > > + thread__ppid(thread), level, ""); > > > > > > > > - /* > > > > - * No locking needed while accessing machine->threads, > > > > - * because --tasks is single threaded command. > > > > - */ > > > > + fprintf(fp, "%s\n", thread__comm_str(thread)); > > > > > > > > - /* Count all the threads. */ > > > > - for (i = 0; i < THREADS__TABLE_SIZE; i++) > > > > - nr += machine->threads[i].nr; > > > > + maps__fprintf_task(thread__maps(thread), comm_indent, fp); > > > > +} > > > > > > > > - tasks = malloc(sizeof(*tasks) * nr); > > > > - if (!tasks) > > > > - return -ENOMEM; > > > > +static int task_list_cmp(void *priv, const struct list_head *la, const struct list_head *lb) > > > > > > I'm a little afraid that this comparison logic becomes complex. > > > But I think it's better than having a tree of thread relationship. > > > Just a comment that explains why we need this would be nice. > > > > I can add something in v2. > > > > > > > > > +{ > > > > + struct machine *machine = priv; > > > > + struct thread_list *task_a = list_entry(la, struct thread_list, list); > > > > + struct thread_list *task_b = list_entry(lb, struct thread_list, list); > > > > + struct thread *a = task_a->thread; > > > > + struct thread *b = task_b->thread; > > > > + int level_a, level_b, res; > > > > + > > > > + /* Compare a and b to root. */ > > > > + if (thread__tid(a) == thread__tid(b)) > > > > + return 0; > > > > > > > > - for (i = 0; i < THREADS__TABLE_SIZE; i++) { > > > > - struct threads *threads = &machine->threads[i]; > > > > + if (thread__tid(a) == 0) > > > > + return -1; > > > > > > > > - for (nd = rb_first_cached(&threads->entries); nd; > > > > - nd = rb_next(nd)) { > > > > - task = tasks + itask++; > > > > + if (thread__tid(b) == 0) > > > > + return 1; > > > > > > > > - task->thread = rb_entry(nd, struct thread_rb_node, rb_node)->thread; > > > > - INIT_LIST_HEAD(&task->children); > > > > - INIT_LIST_HEAD(&task->list); > > > > - thread__set_priv(task->thread, task); > > > > - } > > > > + /* If parents match sort by tid. */ > > > > + if (thread__ppid(a) == thread__ppid(b)) { > > > > + return thread__tid(a) < thread__tid(b) > > > > + ? -1 > > > > + : (thread__tid(a) > thread__tid(b) ? 1 : 0); > > > > > > Can it be simply like this? We know tid(a) != tid(b). > > > > > > return thread__tid(a) < thread__tid(b) ? -1 : 1; > > > > Yes, but the parent check is still required. > > Sure. I only meant the return statement. > > > > > > > } > > > > > > > > /* > > > > - * Iterate every task down to the unprocessed parent > > > > - * and link all in task children list. Task with no > > > > - * parent is added into 'list'. > > > > + * Find a and b such that if they are a child of each other a and b's > > > > + * tid's match, otherwise a and b have a common parent and distinct > > > > + * tid's to sort by. First make the depths of the threads match. > > > > */ > > > > - for (itask = 0; itask < nr; itask++) { > > > > - task = tasks + itask; > > > > - > > > > - if (!list_empty(&task->list)) > > > > - continue; > > > > - > > > > - task = tasks_list(task, machine); > > > > - if (IS_ERR(task)) { > > > > - pr_err("Error: failed to process tasks\n"); > > > > - free(tasks); > > > > - return PTR_ERR(task); > > > > + level_a = thread_level(machine, a); > > > > + level_b = thread_level(machine, b); > > > > + a = thread__get(a); > > > > + b = thread__get(b); > > > > + for (int i = level_a; i > level_b; i--) { > > > > + struct thread *parent = machine__find_thread(machine, -1, thread__ppid(a)); > > > > + > > > > + thread__put(a); > > > > + if (!parent) { > > > > + pr_err("Missing parent thread of %d\n", thread__tid(a)); > > > > + thread__put(b); > > > > + return -1; > > > > } > > > > + a = parent; > > > > + } > > > > + for (int i = level_b; i > level_a; i--) { > > > > + struct thread *parent = machine__find_thread(machine, -1, thread__ppid(b)); > > > > > > > > - if (task) > > > > - list_add_tail(&task->list, &list); > > > > + thread__put(b); > > > > + if (!parent) { > > > > + pr_err("Missing parent thread of %d\n", thread__tid(b)); > > > > + thread__put(a); > > > > + return 1; > > > > + } > > > > + b = parent; > > > > + } > > > > + /* Search up to a common parent. */ > > > > + while (thread__ppid(a) != thread__ppid(b)) { > > > > + struct thread *parent; > > > > + > > > > + parent = machine__find_thread(machine, -1, thread__ppid(a)); > > > > + thread__put(a); > > > > + if (!parent) > > > > + pr_err("Missing parent thread of %d\n", thread__tid(a)); > > > > + a = parent; > > > > + parent = machine__find_thread(machine, -1, thread__ppid(b)); > > > > + thread__put(b); > > > > + if (!parent) > > > > + pr_err("Missing parent thread of %d\n", thread__tid(b)); > > > > + b = parent; > > > > + if (!a || !b) > > > > + return !a && !b ? 0 : (!a ? -1 : 1); > > > > > > Wouldn't it leak a refcount if either a or b is NULL (not both)? > > > > It would, but this would be an error condition anyway. I can add puts. > > > > > > > > > + } > > > > + if (thread__tid(a) == thread__tid(b)) { > > > > + /* a is a child of b or vice-versa, deeper levels appear later. */ > > > > + res = level_a < level_b ? -1 : (level_a > level_b ? 1 : 0); > > > > + } else { > > > > + /* Sort by tid now the parent is the same. */ > > > > + res = thread__tid(a) < thread__tid(b) ? -1 : 1; > > > > } > > > > + thread__put(a); > > > > + thread__put(b); > > > > + return res; > > > > +} > > > > + > > > > +static int tasks_print(struct report *rep, FILE *fp) > > > > +{ > > > > + struct machine *machine = &rep->session->machines.host; > > > > + LIST_HEAD(tasks); > > > > + int ret; > > > > > > > > - fprintf(fp, "# %8s %8s %8s %s\n", "pid", "tid", "ppid", "comm"); > > > > + ret = machine__thread_list(machine, &tasks); > > > > + if (!ret) { > > > > + struct thread_list *task; > > > > > > Do we really need this thread_list? Why not use an > > > array of threads directly? > > > > The code isn't particularly performance critical. I used a list as it > > best approximated how the rbtree was being used. The code is reused in > > subsequent patches, there's no initial pass to size an array and I > > think the reallocarray/qsort logic is generally more problematic than > > the list ones. If we were worried about performance then I think > > arrays could make sense for optimization, but I think this is good > > enough for now. > > Well, it's not about performance. It made me think why we need > this thread_list but I couldn't find the reason. If you can move > machine__threads_nr() here then you won't need realloc(). But then you can race between allocating an array and traversing to fill it in. Using realloc in the iterator callback would avoid this but with capacity tracking, etc. If this were C++ its a call between a std::vector and a std::list, and std::vector would win that race there (imo). Here we're moving from code that was working on sorted tree nodes in code that tends to more heavily use lists. I wanted the transition from the rbtree nodes to list nodes to be as small as possible in the changes to the APIs that did strange things to the threads tree (resorting it). Moving to an array with indices would require more tracking and be a larger change in general. The array could move because of a realloc, whilst nodes wouldn't, etc. Having the code now work on a list its easier to see how it can migrate to an array, but that can be follow on work. I'm not sure we're motivated to do it given there's no code on a performance critical path. Thanks, Ian > Thanks, > Namhyung > > > > > > > > > - list_for_each_entry(task, &list, list) > > > > - task__print_level(task, fp, 0); > > > > + list_sort(machine, &tasks, task_list_cmp); > > > > > > > > - free(tasks); > > > > - return 0; > > > > + fprintf(fp, "# %8s %8s %8s %s\n", "pid", "tid", "ppid", "comm"); > > > > + > > > > + list_for_each_entry(task, &tasks, list) > > > > + task__print_level(machine, task->thread, fp); > > > > + } > > > > + thread_list__delete(&tasks); > > > > + return ret; > > > > } > > > > > > > > static int __cmd_report(struct report *rep) > > > > diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c > > > > index 3da92f18814a..7872ce92c9fc 100644 > > > > --- a/tools/perf/util/machine.c > > > > +++ b/tools/perf/util/machine.c > > > > @@ -3261,6 +3261,36 @@ int machines__for_each_thread(struct machines *machines, > > > > return rc; > > > > } > > > > > > > > + > > > > +static int thread_list_cb(struct thread *thread, void *data) > > > > +{ > > > > + struct list_head *list = data; > > > > + struct thread_list *entry = malloc(sizeof(*entry)); > > > > + > > > > + if (!entry) > > > > + return -ENOMEM; > > > > + > > > > + entry->thread = thread__get(thread); > > > > + list_add_tail(&entry->list, list); > > > > + return 0; > > > > +} > > > > + > > > > +int machine__thread_list(struct machine *machine, struct list_head *list) > > > > +{ > > > > + return machine__for_each_thread(machine, thread_list_cb, list); > > > > +} > > > > + > > > > +void thread_list__delete(struct list_head *list) > > > > +{ > > > > + struct thread_list *pos, *next; > > > > + > > > > + list_for_each_entry_safe(pos, next, list, list) { > > > > + thread__zput(pos->thread); > > > > + list_del(&pos->list); > > > > + free(pos); > > > > + } > > > > +} > > > > + > > > > pid_t machine__get_current_tid(struct machine *machine, int cpu) > > > > { > > > > if (cpu < 0 || (size_t)cpu >= machine->current_tid_sz) > > > > diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h > > > > index 1279acda6a8a..b738ce84817b 100644 > > > > --- a/tools/perf/util/machine.h > > > > +++ b/tools/perf/util/machine.h > > > > @@ -280,6 +280,16 @@ int machines__for_each_thread(struct machines *machines, > > > > int (*fn)(struct thread *thread, void *p), > > > > void *priv); > > > > > > > > +struct thread_list { > > > > + struct list_head list; > > > > + struct thread *thread; > > > > +}; > > > > + > > > > +/* Make a list of struct thread_list based on threads in the machine. */ > > > > +int machine__thread_list(struct machine *machine, struct list_head *list); > > > > +/* Free up the nodes within the thread_list list. */ > > > > +void thread_list__delete(struct list_head *list); > > > > + > > > > pid_t machine__get_current_tid(struct machine *machine, int cpu); > > > > int machine__set_current_tid(struct machine *machine, int cpu, pid_t pid, > > > > pid_t tid); > > > > -- > > > > 2.43.0.687.g38aa6559b0-goog > > > >
On Tue, Feb 27, 2024 at 11:06 PM Ian Rogers <irogers@google.com> wrote: > > On Tue, Feb 27, 2024 at 10:11 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > On Mon, Feb 26, 2024 at 11:12 PM Ian Rogers <irogers@google.com> wrote: > > > > > > On Mon, Feb 26, 2024 at 10:39 PM Namhyung Kim <namhyung@kernel.org> wrote: > > > > > > > > On Tue, Feb 13, 2024 at 10:37 PM Ian Rogers <irogers@googlecom> wrote: > > > > > > > > > > Commit 91e467bc568f ("perf machine: Use hashtable for machine > > > > > threads") made the iteration of thread tids unordered. The perf report > > > > > --tasks output now shows child threads in an order determined by the > > > > > hashing. For example, in this snippet tid 3 appears after tid 256 even > > > > > though they have the same ppid 2: > > > > > > > > > > ``` > > > > > $ perf report --tasks > > > > > % pid tid ppid comm > > > > > 0 0 -1 |swapper > > > > > 2 2 0 | kthreadd > > > > > 256 256 2 | kworker/12:1H-k > > > > > 693761 693761 2 | kworker/10:1-mm > > > > > 1301762 1301762 2 | kworker/1:1-mm_ > > > > > 1302530 1302530 2 | kworker/u32:0-k > > > > > 3 3 2 | rcu_gp > > > > > ... > > > > > ``` > > > > > > > > > > The output is easier to read if threads appear numerically > > > > > increasing. To allow for this, read all threads into a list then sort > > > > > with a comparator that orders by the child task's of the first common > > > > > parent. The list creation and deletion are created as utilities on > > > > > machine. The indentation is possible by counting the number of > > > > > parents a child has. > > > > > > > > > > With this change the output for the same data file is now like: > > > > > ``` > > > > > $ perf report --tasks > > > > > % pid tid ppid comm > > > > > 0 0 -1 |swapper > > > > > 1 1 0 | systemd > > > > > 823 823 1 | systemd-journal > > > > > 853 853 1 | systemd-udevd > > > > > 3230 3230 1 | systemd-timesyn > > > > > 3236 3236 1 | auditd > > > > > 3239 3239 3236 | audisp-syslog > > > > > 3321 3321 1 | accounts-daemon > > > > > ... > > > > > ``` > > > > > > > > > > Signed-off-by: Ian Rogers <irogers@google.com> > > > > I know you sent out v2 already, but let me continue the discussion > > here. > > > > > > > > > --- > > > > > tools/perf/builtin-report.c | 203 ++++++++++++++++++++---------------- > > > > > tools/perf/util/machine.c | 30 ++++++ > > > > > tools/perf/util/machine.h | 10 ++ > > > > > 3 files changed, 155 insertions(+), 88 deletions(-) > > > > > > > > > > diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c > > > > > index 8e16fa261e6f..b48f1d5309e3 100644 > > > > > --- a/tools/perf/builtin-report.c > > > > > +++ b/tools/perf/builtin-report.c > > > > > @@ -59,6 +59,7 @@ > > > > > #include <linux/ctype.h> > > > > > #include <signal.h> > > > > > #include <linux/bitmap.h> > > > > > +#include <linux/list_sort.h> > > > > > #include <linux/string.h> > > > > > #include <linux/stringify.h> > > > > > #include <linux/time64.h> > > > > > @@ -828,35 +829,6 @@ static void tasks_setup(struct report *rep) > > > > > rep->tool.no_warn = true; > > > > > } > > > > > > > > > > -struct task { > > > > > - struct thread *thread; > > > > > - struct list_head list; > > > > > - struct list_head children; > > > > > -}; > > > > > - > > > > > -static struct task *tasks_list(struct task *task, struct machine *machine) > > > > > -{ > > > > > - struct thread *parent_thread, *thread = task->thread; > > > > > - struct task *parent_task; > > > > > - > > > > > - /* Already listed. */ > > > > > - if (!list_empty(&task->list)) > > > > > - return NULL; > > > > > - > > > > > - /* Last one in the chain. */ > > > > > - if (thread__ppid(thread) == -1) > > > > > - return task; > > > > > - > > > > > - parent_thread = machine__find_thread(machine, -1, thread__ppid(thread)); > > > > > - if (!parent_thread) > > > > > - return ERR_PTR(-ENOENT); > > > > > - > > > > > - parent_task = thread__priv(parent_thread); > > > > > - thread__put(parent_thread); > > > > > - list_add_tail(&task->list, &parent_task->children); > > > > > - return tasks_list(parent_task, machine); > > > > > -} > > > > > - > > > > > struct maps__fprintf_task_args { > > > > > int indent; > > > > > FILE *fp; > > > > > @@ -900,89 +872,144 @@ static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp) > > > > > return args.printed; > > > > > } > > > > > > > > > > -static void task__print_level(struct task *task, FILE *fp, int level) > > > > > +static int thread_level(struct machine *machine, const struct thread *thread) > > > > > { > > > > > - struct thread *thread = task->thread; > > > > > - struct task *child; > > > > > - int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", > > > > > - thread__pid(thread), thread__tid(thread), > > > > > - thread__ppid(thread), level, ""); > > > > > + struct thread *parent_thread; > > > > > + int res; > > > > > > > > > > - fprintf(fp, "%s\n", thread__comm_str(thread)); > > > > > + if (thread__tid(thread) <= 0) > > > > > + return 0; > > > > > > > > > > - maps__fprintf_task(thread__maps(thread), comm_indent, fp); > > > > > + if (thread__ppid(thread) <= 0) > > > > > + return 1; > > > > > > > > > > - if (!list_empty(&task->children)) { > > > > > - list_for_each_entry(child, &task->children, list) > > > > > - task__print_level(child, fp, level + 1); > > > > > + parent_thread = machine__find_thread(machine, -1, thread__ppid(thread)); > > > > > + if (!parent_thread) { > > > > > + pr_err("Missing parent thread of %d\n", thread__tid(thread)); > > > > > + return 0; > > > > > } > > > > > + res = 1 + thread_level(machine, parent_thread); > > > > > + thread__put(parent_thread); > > > > > + return res; > > > > > } > > > > > > > > > > -static int tasks_print(struct report *rep, FILE *fp) > > > > > +static void task__print_level(struct machine *machine, struct thread *thread, FILE *fp) > > > > > { > > > > > - struct perf_session *session = rep->session; > > > > > - struct machine *machine = &session->machines.host; > > > > > - struct task *tasks, *task; > > > > > - unsigned int nr = 0, itask = 0, i; > > > > > - struct rb_node *nd; > > > > > - LIST_HEAD(list); > > > > > + int level = thread_level(machine, thread); > > > > > + int comm_indent = fprintf(fp, " %8d %8d %8d |%*s", > > > > > + thread__pid(thread), thread__tid(thread), > > > > > + thread__ppid(thread), level, ""); > > > > > > > > > > - /* > > > > > - * No locking needed while accessing machine->threads, > > > > > - * because --tasks is single threaded command. > > > > > - */ > > > > > + fprintf(fp, "%s\n", thread__comm_str(thread)); > > > > > > > > > > - /* Count all the threads. */ > > > > > - for (i = 0; i < THREADS__TABLE_SIZE; i++) > > > > > - nr += machine->threads[i].nr; > > > > > + maps__fprintf_task(thread__maps(thread), comm_indent, fp); > > > > > +} > > > > > > > > > > - tasks = malloc(sizeof(*tasks) * nr); > > > > > - if (!tasks) > > > > > - return -ENOMEM; > > > > > +static int task_list_cmp(void *priv, const struct list_head *la, const struct list_head *lb) > > > > > > > > I'm a little afraid that this comparison logic becomes complex. > > > > But I think it's better than having a tree of thread relationship. > > > > Just a comment that explains why we need this would be nice. > > > > > > I can add something in v2. > > > > > > > > > > > > +{ > > > > > + struct machine *machine = priv; > > > > > + struct thread_list *task_a = list_entry(la, struct thread_list, list); > > > > > + struct thread_list *task_b = list_entry(lb, struct thread_list, list); > > > > > + struct thread *a = task_a->thread; > > > > > + struct thread *b = task_b->thread; > > > > > + int level_a, level_b, res; > > > > > + > > > > > + /* Compare a and b to root. */ > > > > > + if (thread__tid(a) == thread__tid(b)) > > > > > + return 0; > > > > > > > > > > - for (i = 0; i < THREADS__TABLE_SIZE; i++) { > > > > > - struct threads *threads = &machine->threads[i]; > > > > > + if (thread__tid(a) == 0) > > > > > + return -1; > > > > > > > > > > - for (nd = rb_first_cached(&threads->entries); nd; > > > > > - nd = rb_next(nd)) { > > > > > - task = tasks + itask++; > > > > > + if (thread__tid(b) == 0) > > > > > + return 1; > > > > > > > > > > - task->thread = rb_entry(nd, struct thread_rb_node, rb_node)->thread; > > > > > - INIT_LIST_HEAD(&task->children); > > > > > - INIT_LIST_HEAD(&task->list); > > > > > - thread__set_priv(task->thread, task); > > > > > - } > > > > > + /* If parents match sort by tid. */ > > > > > + if (thread__ppid(a) == thread__ppid(b)) { > > > > > + return thread__tid(a) < thread__tid(b) > > > > > + ? -1 > > > > > + : (thread__tid(a) > thread__tid(b) ? 1 : 0); > > > > > > > > Can it be simply like this? We know tid(a) != tid(b). > > > > > > > > return thread__tid(a) < thread__tid(b) ? -1 : 1; > > > > > > Yes, but the parent check is still required. > > > > Sure. I only meant the return statement. > > > > > > > > > > } > > > > > > > > > > /* > > > > > - * Iterate every task down to the unprocessed parent > > > > > - * and link all in task children list. Task with no > > > > > - * parent is added into 'list'. > > > > > + * Find a and b such that if they are a child of each other a and b's > > > > > + * tid's match, otherwise a and b have a common parent and distinct > > > > > + * tid's to sort by. First make the depths of the threads match. > > > > > */ > > > > > - for (itask = 0; itask < nr; itask++) { > > > > > - task = tasks + itask; > > > > > - > > > > > - if (!list_empty(&task->list)) > > > > > - continue; > > > > > - > > > > > - task = tasks_list(task, machine); > > > > > - if (IS_ERR(task)) { > > > > > - pr_err("Error: failed to process tasks\n"); > > > > > - free(tasks); > > > > > - return PTR_ERR(task); > > > > > + level_a = thread_level(machine, a); > > > > > + level_b = thread_level(machine, b); > > > > > + a = thread__get(a); > > > > > + b = thread__get(b); > > > > > + for (int i = level_a; i > level_b; i--) { > > > > > + struct thread *parent = machine__find_thread(machine, -1, thread__ppid(a)); > > > > > + > > > > > + thread__put(a); > > > > > + if (!parent) { > > > > > + pr_err("Missing parent thread of %d\n", thread__tid(a)); > > > > > + thread__put(b); > > > > > + return -1; > > > > > } > > > > > + a = parent; > > > > > + } > > > > > + for (int i = level_b; i > level_a; i--) { > > > > > + struct thread *parent = machine__find_thread(machine, -1, thread__ppid(b)); > > > > > > > > > > - if (task) > > > > > - list_add_tail(&task->list, &list); > > > > > + thread__put(b); > > > > > + if (!parent) { > > > > > + pr_err("Missing parent thread of %d\n", thread__tid(b)); > > > > > + thread__put(a); > > > > > + return 1; > > > > > + } > > > > > + b = parent; > > > > > + } > > > > > + /* Search up to a common parent. */ > > > > > + while (thread__ppid(a) != thread__ppid(b)) { > > > > > + struct thread *parent; > > > > > + > > > > > + parent = machine__find_thread(machine, -1, thread__ppid(a)); > > > > > + thread__put(a); > > > > > + if (!parent) > > > > > + pr_err("Missing parent thread of %d\n", thread__tid(a)); > > > > > + a = parent; > > > > > + parent = machine__find_thread(machine, -1, thread__ppid(b)); > > > > > + thread__put(b); > > > > > + if (!parent) > > > > > + pr_err("Missing parent thread of %d\n", thread__tid(b)); > > > > > + b = parent; > > > > > + if (!a || !b) > > > > > + return !a && !b ? 0 : (!a ? -1 : 1); > > > > > > > > Wouldn't it leak a refcount if either a or b is NULL (not both)? > > > > > > It would, but this would be an error condition anyway. I can add puts. > > > > > > > > > > > > + } > > > > > + if (thread__tid(a) == thread__tid(b)) { > > > > > + /* a is a child of b or vice-versa, deeper levels appear later. */ > > > > > + res = level_a < level_b ? -1 : (level_a > level_b ? 1 : 0); > > > > > + } else { > > > > > + /* Sort by tid now the parent is the same. */ > > > > > + res = thread__tid(a) < thread__tid(b) ? -1 : 1; > > > > > } > > > > > + thread__put(a); > > > > > + thread__put(b); > > > > > + return res; > > > > > +} > > > > > + > > > > > +static int tasks_print(struct report *rep, FILE *fp) > > > > > +{ > > > > > + struct machine *machine = &rep->session->machines.host; > > > > > + LIST_HEAD(tasks); > > > > > + int ret; > > > > > > > > > > - fprintf(fp, "# %8s %8s %8s %s\n", "pid", "tid", "ppid", "comm"); > > > > > + ret = machine__thread_list(machine, &tasks); > > > > > + if (!ret) { > > > > > + struct thread_list *task; > > > > > > > > Do we really need this thread_list? Why not use an > > > > array of threads directly? > > > > > > The code isn't particularly performance critical. I used a list as it > > > best approximated how the rbtree was being used. The code is reused in > > > subsequent patches, there's no initial pass to size an array and I > > > think the reallocarray/qsort logic is generally more problematic than > > > the list ones. If we were worried about performance then I think > > > arrays could make sense for optimization, but I think this is good > > > enough for now. > > > > Well, it's not about performance. It made me think why we need > > this thread_list but I couldn't find the reason. If you can move > > machine__threads_nr() here then you won't need realloc(). > > But then you can race between allocating an array and traversing to > fill it in. Using realloc in the iterator callback would avoid this > but with capacity tracking, etc. If this were C++ its a call between a > std::vector and a std::list, and std::vector would win that race there > (imo). Here we're moving from code that was working on sorted tree > nodes in code that tends to more heavily use lists. I wanted the > transition from the rbtree nodes to list nodes to be as small as > possible in the changes to the APIs that did strange things to the > threads tree (resorting it). Moving to an array with indices would > require more tracking and be a larger change in general. The array > could move because of a realloc, whilst nodes wouldn't, etc. Having > the code now work on a list its easier to see how it can migrate to an > array, but that can be follow on work. I'm not sure we're motivated to > do it given there's no code on a performance critical path. Ok, as you said it's not a critical path. I'm ok with this change. Thanks, Namhyung