[v4,0/7] Thread memory improvements and fixes

Message ID 20240301053646.1449657-1-irogers@google.com

Message

Ian Rogers March 1, 2024, 5:36 a.m. UTC
These are the next 6 patches (now 7) from:
https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
now that the initial maps fixes have landed:
https://lore.kernel.org/all/20240210031746.4057262-1-irogers@google.com/

Separate out and reimplement threads to use a hashmap for lower memory
consumption and faster lookup. This fixes a regression in memory usage
that was introduced when reference count checking switched to using
non-invasive tree nodes. Reduce the default threads table size by a
factor of 32 (from 256 to 8) and improve locking discipline. Also, fix
regressions where tids had become unordered, to make
`perf report --tasks` and `perf trace --summary` output easier to read.
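
For readers who have not looked at the patches, the shape of the change
can be sketched roughly as below. This is not the code in
tools/perf/util/threads.c: every sketch_* name is hypothetical, each
shard is a plain linked list standing in for a real hashmap, and locking
is left out. It only illustrates why hashed storage loses tid order and
why the report paths need an explicit sort.

```c
#include <assert.h>
#include <stdlib.h>

/* Shard count; 8 mirrors the reduced table size named in the series. */
#define SKETCH_TABLE_SIZE 8

/* Hypothetical stand-in for perf's struct thread. */
struct sketch_thread {
	int tid;
	struct sketch_thread *next; /* chain within one shard */
};

struct sketch_threads {
	struct sketch_thread *shard[SKETCH_TABLE_SIZE];
	size_t nr; /* total entries across all shards */
};

/* Pick a shard from the tid; this sketch assumes non-negative tids. */
static unsigned int sketch_shard_of(int tid)
{
	assert(tid >= 0);
	return (unsigned int)tid % SKETCH_TABLE_SIZE;
}

/* Return the entry for tid, creating it if absent; NULL if out of memory. */
static struct sketch_thread *sketch_findnew(struct sketch_threads *ts, int tid)
{
	unsigned int i = sketch_shard_of(tid);
	struct sketch_thread *t;

	for (t = ts->shard[i]; t; t = t->next)
		if (t->tid == tid)
			return t;

	t = calloc(1, sizeof(*t));
	if (!t)
		return NULL;
	t->tid = tid;
	t->next = ts->shard[i];
	ts->shard[i] = t;
	ts->nr++;
	return t;
}

static int sketch_cmp_tid(const void *a, const void *b)
{
	int x = *(const int *)a, y = *(const int *)b;

	return (x > y) - (x < y);
}

/*
 * Walking the shards yields tids in storage order, not numeric order,
 * so collect them and sort before printing. Caller frees the result.
 */
static int *sketch_sorted_tids(const struct sketch_threads *ts)
{
	int *tids = malloc((ts->nr ? ts->nr : 1) * sizeof(*tids));
	size_t n = 0;

	if (!tids)
		return NULL;
	for (int i = 0; i < SKETCH_TABLE_SIZE; i++)
		for (const struct sketch_thread *t = ts->shard[i]; t; t = t->next)
			tids[n++] = t->tid;
	qsort(tids, n, sizeof(*tids), sketch_cmp_tid);
	return tids;
}
```

With tids 17, 3 and 9 inserted in that order, a walk over the shards
visits 9, 17, 3, while sketch_sorted_tids() returns 3, 9, 17.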

v4: Add read lock to threads__for_each_thread (Namhyung).
v3: Factor threads out of machine in 1 patch, then move threads
    functions in a second.
v2: Improve comments and a commit message.

Ian Rogers (7):
  perf report: Sort child tasks by tid
  perf trace: Ignore thread hashing in summary
  perf machine: Move fprintf to for_each loop and a callback
  perf machine: Move machine's threads into its own abstraction
  perf threads: Move threads to its own files
  perf threads: Switch from rbtree to hashmap
  perf threads: Reduce table size from 256 to 8

 tools/perf/builtin-report.c           | 217 +++++++++-------
 tools/perf/builtin-trace.c            |  41 ++--
 tools/perf/util/Build                 |   1 +
 tools/perf/util/bpf_lock_contention.c |   4 +-
 tools/perf/util/machine.c             | 341 +++++++-------------------
 tools/perf/util/machine.h             |  30 +--
 tools/perf/util/rb_resort.h           |   5 -
 tools/perf/util/thread.c              |   2 +-
 tools/perf/util/thread.h              |   6 -
 tools/perf/util/threads.c             | 190 ++++++++++++++
 tools/perf/util/threads.h             |  35 +++
 11 files changed, 478 insertions(+), 394 deletions(-)
 create mode 100644 tools/perf/util/threads.c
 create mode 100644 tools/perf/util/threads.h
  

Comments

Namhyung Kim March 4, 2024, 6:45 a.m. UTC | #1
Hi Ian,

On Thu, Feb 29, 2024 at 9:36 PM Ian Rogers <irogers@google.com> wrote:
>
> The next 6 patches (now 7) from:
> https://lore.kernel.org/lkml/20240202061532.1939474-1-irogers@google.com/
> now the initial maps fixes have landed:
> https://lore.kernel.org/all/20240210031746.4057262-1-irogers@google.com/
>
> Separate out and reimplement threads to use a hashmap for lower memory
> consumption and faster look up. The fixes a regression in memory usage
> where reference count checking switched to using non-invasive tree
> nodes.  Reduce threads default size by 32 times and improve locking
> discipline. Also, fix regressions where tids had become unordered to
> make `perf report --tasks` and `perf trace --summary` output easier to
> read.
>
> v4. Add read lock to threads__for_each_thread, Namhyung.
> v3. Factor threads out of machine in 1 patch, then move threads
>     functions in a second.
> v2: improve comments and a commit message.
>
> Ian Rogers (7):
>   perf report: Sort child tasks by tid
>   perf trace: Ignore thread hashing in summary
>   perf machine: Move fprintf to for_each loop and a callback
>   perf machine: Move machine's threads into its own abstraction
>   perf threads: Move threads to its own files
>   perf threads: Switch from rbtree to hashmap
>   perf threads: Reduce table size from 256 to 8

Acked-by: Namhyung Kim <namhyung@kernel.org>

Thanks,
Namhyung

>
>  tools/perf/builtin-report.c           | 217 +++++++++-------
>  tools/perf/builtin-trace.c            |  41 ++--
>  tools/perf/util/Build                 |   1 +
>  tools/perf/util/bpf_lock_contention.c |   4 +-
>  tools/perf/util/machine.c             | 341 +++++++-------------------
>  tools/perf/util/machine.h             |  30 +--
>  tools/perf/util/rb_resort.h           |   5 -
>  tools/perf/util/thread.c              |   2 +-
>  tools/perf/util/thread.h              |   6 -
>  tools/perf/util/threads.c             | 190 ++++++++++++++
>  tools/perf/util/threads.h             |  35 +++
>  11 files changed, 478 insertions(+), 394 deletions(-)
>  create mode 100644 tools/perf/util/threads.c
>  create mode 100644 tools/perf/util/threads.h
>
> --
> 2.44.0.278.ge034bb2e1d-goog
>