Message ID: 20240224001153.2584030-1-jstultz@google.com
Headers:
Date: Fri, 23 Feb 2024 16:11:40 -0800
Message-ID: <20240224001153.2584030-1-jstultz@google.com>
Subject: [RESEND][PATCH v8 0/7] Preparatory changes for Proxy Execution v8
From: John Stultz <jstultz@google.com>
To: LKML <linux-kernel@vger.kernel.org>
Cc: John Stultz <jstultz@google.com>, Joel Fernandes <joelaf@google.com>, Qais Yousef <qyousef@google.com>, Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Valentin Schneider <vschneid@redhat.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Zimuzo Ezeozue <zezeozue@google.com>, Youssef Esmat <youssefesmat@google.com>, Mel Gorman <mgorman@suse.de>, Daniel Bristot de Oliveira <bristot@redhat.com>, Will Deacon <will@kernel.org>, Waiman Long <longman@redhat.com>, Boqun Feng <boqun.feng@gmail.com>, "Paul E. McKenney" <paulmck@kernel.org>, Metin Kaya <Metin.Kaya@arm.com>, Xuewen Yan <xuewen.yan94@gmail.com>, K Prateek Nayak <kprateek.nayak@amd.com>, Thomas Gleixner <tglx@linutronix.de>, kernel-team@android.com
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Mailer: git-send-email 2.44.0.rc0.258.g7320e95886-goog
Series: Preparatory changes for Proxy Execution v8
Message
John Stultz
Feb. 24, 2024, 12:11 a.m. UTC
After sending out v7 of Proxy Execution, I got feedback that the patch
series was getting a bit unwieldy to review, and Qais suggested I break
out just the cleanups/preparatory components of the patch series and
submit them on their own, in the hope we can start to merge the less
complex bits and discussion can focus on the more complicated portions
afterwards. So for the v8 of this series, I only submitted those earlier
cleanup/preparatory changes:
https://lore.kernel.org/lkml/20240210002328.4126422-1-jstultz@google.com/

After sending this out a few weeks back, I've not heard much, so I
wanted to resend this again.

(I did correct one detail here, which was that I had accidentally lost
the author credit on one of the patches, and I've fixed that in this
submission.)

As before, if you are interested, the full v8 series can be found here:
https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v8-6.8-rc3
https://github.com/johnstultz-work/linux-dev.git proxy-exec-v8-6.8-rc3

However, I've been focusing pretty intensely on the series to shake out
some issues with the more complicated later patches (not in what I'm
submitting here), and have resolved a number of problems I uncovered in
doing wider testing (along with lots of review feedback from Metin), so
v9 and all of its improvements will hopefully be ready to send out soon.

If you want a preview, my current WIP tree (careful, as I rebase it
frequently) is here:
https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-WIP
https://github.com/johnstultz-work/linux-dev.git proxy-exec-WIP

Review and feedback would be greatly appreciated!

Thanks so much!
-john

Cc: Joel Fernandes <joelaf@google.com>
Cc: Qais Yousef <qyousef@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Cc: Dietmar Eggemann <dietmar.eggemann@arm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Ben Segall <bsegall@google.com>
Cc: Zimuzo Ezeozue <zezeozue@google.com>
Cc: Youssef Esmat <youssefesmat@google.com>
Cc: Mel Gorman <mgorman@suse.de>
Cc: Daniel Bristot de Oliveira <bristot@redhat.com>
Cc: Will Deacon <will@kernel.org>
Cc: Waiman Long <longman@redhat.com>
Cc: Boqun Feng <boqun.feng@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Metin Kaya <Metin.Kaya@arm.com>
Cc: Xuewen Yan <xuewen.yan94@gmail.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: kernel-team@android.com

Connor O'Brien (2):
  sched: Add do_push_task helper
  sched: Consolidate pick_*_task to task_is_pushable helper

John Stultz (1):
  sched: Split out __schedule() deactivate task logic into a helper

Juri Lelli (2):
  locking/mutex: Make mutex::wait_lock irq safe
  locking/mutex: Expose __mutex_owner()

Peter Zijlstra (2):
  locking/mutex: Remove wakeups from under mutex::wait_lock
  sched: Split scheduler and execution contexts

 kernel/locking/mutex.c       |  60 +++++++----------
 kernel/locking/mutex.h       |  25 +++++++
 kernel/locking/rtmutex.c     |  26 +++++---
 kernel/locking/rwbase_rt.c   |   4 +-
 kernel/locking/rwsem.c       |   4 +-
 kernel/locking/spinlock_rt.c |   3 +-
 kernel/locking/ww_mutex.h    |  49 ++++++++------
 kernel/sched/core.c          | 122 +++++++++++++++++++++--------------
 kernel/sched/deadline.c      |  53 ++++++---------
 kernel/sched/fair.c          |  18 +++---
 kernel/sched/rt.c            |  59 +++++++----------
 kernel/sched/sched.h         |  44 ++++++++++++-

 12 files changed, 268 insertions(+), 199 deletions(-)
Comments
Hello John,

Happy to report that I did not see any regressions with the series as
expected. Full results below.

On 2/24/2024 5:41 AM, John Stultz wrote:
> After sending out v7 of Proxy Execution, I got feedback that the
> patch series was getting a bit unwieldy to review, and Qais
> suggested I break out just the cleanups/preparatory components
> of the patch series and submit them on their own in the hope we
> can start to merge the less complex bits and discussion can focus
> on the more complicated portions afterwards.
>
> So for the v8 of this series, I only submitted those earlier
> cleanup/preparatory changes:
> https://lore.kernel.org/lkml/20240210002328.4126422-1-jstultz@google.com/
>
> After sending this out a few weeks back, I've not heard much, so
> I wanted to resend this again.
>
> (I did correct one detail here, which was that I had accidentally
> lost the author credit to one of the patches, and I've fixed that
> in this submission).
>
> As before, If you are interested, the full v8 series, it can be
> found here:
> https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-v8-6.8-rc3
> https://github.com/johnstultz-work/linux-dev.git proxy-exec-v8-6.8-rc3
>
> However, I've been focusing pretty intensely on the series to
> shake out some issues with the more complicated later patches in
> the series (not in what I'm submitting here), and have resolved
> a number of problems I uncovered in doing wider testing (along
> with lots of review feedback from Metin), so v9 and all of its
> improvements will hopefully be ready to send out soon.
>
> If you want a preview, my current WIP tree (careful, as I rebase
> it frequently) is here:
> https://github.com/johnstultz-work/linux-dev/commits/proxy-exec-WIP
> https://github.com/johnstultz-work/linux-dev.git proxy-exec-WIP
>
> Review and feedback would be greatly appreciated!
o System Details

- 3rd Generation EPYC System
- 2 x 64C/128T
- NPS1 mode

o Kernels

tip:         tip:sched/core at commit 8cec3dd9e593 ("sched/core:
             Simplify code by removing duplicate #ifdefs")

proxy-setup: tip + this series

o Results

==================================================================
Test          : hackbench
Units         : Normalized time in seconds
Interpretation: Lower is better
Statistic     : AMean
==================================================================
Case:           tip[pct imp](CV)    proxy-setup[pct imp](CV)
 1-groups     1.00 [ -0.00]( 2.08)     1.01 [ -0.53]( 2.45)
 2-groups     1.00 [ -0.00]( 0.89)     1.03 [ -3.32]( 1.48)
 4-groups     1.00 [ -0.00]( 0.81)     1.02 [ -2.26]( 1.22)
 8-groups     1.00 [ -0.00]( 0.78)     1.00 [ -0.29]( 0.97)
16-groups     1.00 [ -0.00]( 1.60)     1.00 [ -0.27]( 1.86)

==================================================================
Test          : tbench
Units         : Normalized throughput
Interpretation: Higher is better
Statistic     : AMean
==================================================================
Clients:        tip[pct imp](CV)    proxy-setup[pct imp](CV)
    1         1.00 [  0.00]( 0.71)     1.00 [  0.31]( 0.37)
    2         1.00 [  0.00]( 0.25)     0.99 [ -0.56]( 0.31)
    4         1.00 [  0.00]( 0.85)     0.98 [ -2.35]( 0.69)
    8         1.00 [  0.00]( 1.00)     0.99 [ -0.99]( 0.12)
   16         1.00 [  0.00]( 1.25)     0.99 [ -0.78]( 1.35)
   32         1.00 [  0.00]( 0.35)     1.00 [  0.12]( 2.23)
   64         1.00 [  0.00]( 0.71)     0.99 [ -0.97]( 0.55)
  128         1.00 [  0.00]( 0.46)     0.96 [ -4.38]( 0.47)
  256         1.00 [  0.00]( 0.24)     0.99 [ -1.32]( 0.95)
  512         1.00 [  0.00]( 0.30)     0.98 [ -1.52]( 0.10)
 1024         1.00 [  0.00]( 0.40)     0.98 [ -1.59]( 0.23)

==================================================================
Test          : stream-10
Units         : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic     : HMean
==================================================================
Test:           tip[pct imp](CV)    proxy-setup[pct imp](CV)
 Copy         1.00 [  0.00]( 9.73)     1.04 [  4.18]( 3.12)
Scale         1.00 [  0.00]( 5.57)     0.99 [ -1.35]( 5.74)
  Add         1.00 [  0.00]( 5.43)     0.99 [ -1.29]( 5.93)
Triad         1.00 [  0.00]( 5.50)     0.97 [ -3.47]( 7.81)

==================================================================
Test          : stream-100
Units         : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic     : HMean
==================================================================
Test:           tip[pct imp](CV)    proxy-setup[pct imp](CV)
 Copy         1.00 [  0.00]( 3.26)     1.01 [  0.83]( 2.69)
Scale         1.00 [  0.00]( 1.26)     1.00 [ -0.32]( 4.52)
  Add         1.00 [  0.00]( 1.47)     1.01 [  0.63]( 0.96)
Triad         1.00 [  0.00]( 1.77)     1.02 [  1.81]( 1.00)

==================================================================
Test          : netperf
Units         : Normalized Throughput
Interpretation: Higher is better
Statistic     : AMean
==================================================================
Clients:        tip[pct imp](CV)    proxy-setup[pct imp](CV)
 1-clients    1.00 [  0.00]( 0.22)     0.99 [ -0.53]( 0.26)
 2-clients    1.00 [  0.00]( 0.57)     1.00 [ -0.44]( 0.41)
 4-clients    1.00 [  0.00]( 0.43)     1.00 [ -0.48]( 0.39)
 8-clients    1.00 [  0.00]( 0.27)     1.00 [ -0.31]( 0.42)
16-clients    1.00 [  0.00]( 0.46)     1.00 [ -0.11]( 0.42)
32-clients    1.00 [  0.00]( 0.95)     1.00 [ -0.41]( 0.56)
64-clients    1.00 [  0.00]( 1.79)     1.00 [ -0.15]( 1.65)
128-clients   1.00 [  0.00]( 0.89)     1.00 [ -0.43]( 0.80)
256-clients   1.00 [  0.00]( 3.88)     1.00 [ -0.37]( 4.74)
512-clients   1.00 [  0.00](35.06)     1.01 [  1.05](50.84)

==================================================================
Test          : schbench
Units         : Normalized 99th percentile latency in us
Interpretation: Lower is better
Statistic     : Median
==================================================================
#workers:       tip[pct imp](CV)    proxy-setup[pct imp](CV)
  1           1.00 [ -0.00](27.28)     1.31 [-31.25]( 2.38)
  2           1.00 [ -0.00]( 3.85)     1.00 [ -0.00]( 8.85)
  4           1.00 [ -0.00](14.00)     1.11 [-10.53](11.18)
  8           1.00 [ -0.00]( 4.68)     1.08 [ -8.33]( 9.93)
 16           1.00 [ -0.00]( 4.08)     0.92 [  8.06]( 3.70)
 32           1.00 [ -0.00]( 6.68)     0.95 [  5.10]( 2.22)
 64           1.00 [ -0.00]( 1.79)     0.99 [  1.02]( 3.18)
128           1.00 [ -0.00]( 6.30)     1.02 [ -2.48]( 7.37)
256           1.00 [ -0.00](43.39)     1.00 [ -0.00](37.06)
512           1.00 [ -0.00]( 2.26)     0.98 [  1.88]( 6.96)

Note: schbench is known to have high run to run variance for 16-workers
and below.

==================================================================
Test          : Unixbench
Units         : Normalized scores
Interpretation: Lower is better
Statistic     : Various (Mentioned)
==================================================================
Metric   Variant                     tip     proxy-setup
Hmean    unixbench-dhry2reg-1       0.00%      -0.60%
Hmean    unixbench-dhry2reg-512     0.00%      -0.01%
Amean    unixbench-syscall-1        0.00%      -0.41%
Amean    unixbench-syscall-512      0.00%       0.13%
Hmean    unixbench-pipe-1           0.00%       1.02%
Hmean    unixbench-pipe-512         0.00%       0.53%
Hmean    unixbench-spawn-1          0.00%      -2.68%
Hmean    unixbench-spawn-512        0.00%       3.24%
Hmean    unixbench-execl-1          0.00%       0.61%
Hmean    unixbench-execl-512        0.00%       1.97%

--
Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>

>
> Thanks so much!
> -john
>
> [..snip..]
>

--
Thanks and Regards,
Prateek
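For readers unfamiliar with the reporting format above, the three numbers in each table cell (normalized mean, "pct imp", and CV) can be reproduced roughly as follows. This is an illustrative sketch only: the `summarize` helper is mine, and the exact statistics mmtests computes may differ in detail.

```python
from statistics import mean, stdev

def summarize(baseline_runs, test_runs, higher_is_better=False, harmonic=False):
    """Return (normalized mean, percent improvement, CV%) for one table cell.

    The baseline ("tip") column always normalizes to 1.00, and "pct imp"
    is signed so that positive always means "better", matching the tables.
    """
    def hmean(xs):
        # stream tables use the harmonic mean (HMean); most others AMean.
        return len(xs) / sum(1.0 / x for x in xs)

    stat = hmean if harmonic else mean
    base, test = stat(baseline_runs), stat(test_runs)
    norm = test / base                                # 1.00 on the tip side
    delta = (test - base) / base * 100.0              # raw percent change
    pct_imp = delta if higher_is_better else -delta   # flip sign for latency/time
    cv = stdev(test_runs) / mean(test_runs) * 100.0   # run-to-run variation
    return round(norm, 2), round(pct_imp, 2), round(cv, 2)
```

Note that the sign flip for lower-is-better metrics is also why those tables show `-0.00` in the baseline column.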
On Tue, Feb 27, 2024 at 8:43 PM 'K Prateek Nayak' via kernel-team
<kernel-team@android.com> wrote:
> Happy to report that I did not see any regressions with the series
> as expected. Full results below.
>
[snip]
> o System Details
>
> - 3rd Generation EPYC System
> - 2 x 64C/128T
> - NPS1 mode
>
> o Kernels
>
> tip: tip:sched/core at commit 8cec3dd9e593 ("sched/core:
> Simplify code by removing duplicate #ifdefs")
>
> proxy-setup: tip + this series
>

Hey! Thank you so much for taking the time to run these through the
testing! I *really* appreciate it!

Just to clarify: by "this series" did you test just the 7 preparatory
patches submitted to the list here, or did you pull the full
proxy-exec-v8-6.8-rc3 set from git?

(Either is great! I just wanted to make sure it's clear which were covered)

[snip]
> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>

Thanks so much again!
-john
Hello John,

On 2/28/2024 10:21 AM, John Stultz wrote:
> On Tue, Feb 27, 2024 at 8:43 PM 'K Prateek Nayak' via kernel-team
> <kernel-team@android.com> wrote:
>> Happy to report that I did not see any regressions with the series
>> as expected. Full results below.
>>
> [snip]
>> o System Details
>>
>> - 3rd Generation EPYC System
>> - 2 x 64C/128T
>> - NPS1 mode
>>
>> o Kernels
>>
>> tip: tip:sched/core at commit 8cec3dd9e593 ("sched/core:
>> Simplify code by removing duplicate #ifdefs")
>>
>> proxy-setup: tip + this series
>>
>
> Hey! Thank you so much for taking the time to run these through the
> testing! I *really* appreciate it!
>
> Just to clarify: by "this series" did you test just the 7 preparatory
> patches submitted to the list here, or did you pull the full
> proxy-exec-v8-6.8-rc3 set from git?

Just these preparatory patches for now. On my way to queue a run for the
whole set from your tree. I'll use the "proxy-exec-v8-6.8-rc3" branch and
pick the commits past the "[ANNOTATION] === Proxy Exec patches past this
point ===" till the commit ff90fb583a81 ("FIX: Avoid using possibly
uninitialized cpu value with activate_blocked_entities()") on top of the
tip:sched/core mentioned above, since it'll allow me to reuse the
baseline numbers :)

> (Either is great! I just wanted to make sure its clear which were covered)
>
> [snip]
>> Tested-by: K Prateek Nayak <kprateek.nayak@amd.com>
>
> Thanks so much again!
> -john

--
Thanks and Regards,
Prateek
Hello John,

On 2/28/2024 10:54 AM, John Stultz wrote:
> On Tue, Feb 27, 2024 at 9:12 PM K Prateek Nayak <kprateek.nayak@amd.com> wrote:
>> On 2/28/2024 10:21 AM, John Stultz wrote:
>>> Just to clarify: by "this series" did you test just the 7 preparatory
>>> patches submitted to the list here, or did you pull the full
>>> proxy-exec-v8-6.8-rc3 set from git?
>>
>> Just these preparatory patches for now. On my way to queue a run for the
>> whole set from your tree. I'll use the "proxy-exec-v8-6.8-rc3" branch and
>> pick the commits past the
>> "[ANNOTATION] === Proxy Exec patches past this point ===" till the commit
>> ff90fb583a81 ("FIX: Avoid using possibly uninitialized cpu value with
>> activate_blocked_entities()") on top of the tip:sched/core mentioned
>> above since it'll allow me to reuse the baseline numbers :)
>
> Ah, thank you for the clarification!
>
> Also, I really appreciate your testing with the rest of the series as
> well. It will be good to have any potential problems identified early

I got a chance to test the whole of v8 patches on the same dual socket
3rd Generation EPYC system:

tl;dr

- There is a slight regression in hackbench, but instead of the 10x
  blowup seen previously, it is only around 5%, with the overloaded case
  not regressing at all.

- A small but consistent (~2-3%) regression is seen in tbench and
  netperf.

- schbench is inconclusive due to run to run variance, and stream is
  perf neutral with proxy execution.

I've not looked deeper into the regressions. I'll let you know if I spot
anything when digging deeper. Below are the full results:

o System Details

- 3rd Generation EPYC System
- 2 x 64C/128T
- NPS1 mode

o Kernels

tip:             tip:sched/core at commit 8cec3dd9e593 ("sched/core:
                 Simplify code by removing duplicate #ifdefs")

proxy-exec-full: tip + proxy execution commits from "proxy-exec-v8-6.8-rc3"
                 described previously in this thread.
o Results

==================================================================
Test          : hackbench
Units         : Normalized time in seconds
Interpretation: Lower is better
Statistic     : AMean
==================================================================
Case:           tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
 1-groups     1.00 [ -0.00]( 2.08)     1.00 [ -0.18]( 3.90)
 2-groups     1.00 [ -0.00]( 0.89)     1.04 [ -4.43]( 0.78)
 4-groups     1.00 [ -0.00]( 0.81)     1.05 [ -4.82]( 1.03)
 8-groups     1.00 [ -0.00]( 0.78)     1.02 [ -1.90]( 1.00)
16-groups     1.00 [ -0.00]( 1.60)     1.01 [ -0.80]( 1.18)

==================================================================
Test          : tbench
Units         : Normalized throughput
Interpretation: Higher is better
Statistic     : AMean
==================================================================
Clients:        tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
    1         1.00 [  0.00]( 0.71)     0.97 [ -3.00]( 0.15)
    2         1.00 [  0.00]( 0.25)     0.97 [ -3.35]( 0.98)
    4         1.00 [  0.00]( 0.85)     0.97 [ -3.26]( 1.40)
    8         1.00 [  0.00]( 1.00)     0.97 [ -2.75]( 0.46)
   16         1.00 [  0.00]( 1.25)     0.99 [ -1.27]( 0.11)
   32         1.00 [  0.00]( 0.35)     0.98 [ -2.42]( 0.06)
   64         1.00 [  0.00]( 0.71)     0.97 [ -2.76]( 1.81)
  128         1.00 [  0.00]( 0.46)     0.97 [ -2.67]( 0.88)
  256         1.00 [  0.00]( 0.24)     0.98 [ -1.97]( 0.98)
  512         1.00 [  0.00]( 0.30)     0.98 [ -2.41]( 0.38)
 1024         1.00 [  0.00]( 0.40)     0.98 [ -2.21]( 0.11)

==================================================================
Test          : stream-10
Units         : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic     : HMean
==================================================================
Test:           tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
 Copy         1.00 [  0.00]( 9.73)     1.00 [  0.26]( 6.36)
Scale         1.00 [  0.00]( 5.57)     1.02 [  1.59]( 2.98)
  Add         1.00 [  0.00]( 5.43)     1.00 [  0.48]( 2.77)
Triad         1.00 [  0.00]( 5.50)     0.98 [ -2.18]( 6.06)

==================================================================
Test          : stream-100
Units         : Normalized Bandwidth, MB/s
Interpretation: Higher is better
Statistic     : HMean
==================================================================
Test:           tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
 Copy         1.00 [  0.00]( 3.26)     0.98 [ -1.96]( 3.24)
Scale         1.00 [  0.00]( 1.26)     0.96 [ -3.61]( 6.41)
  Add         1.00 [  0.00]( 1.47)     0.98 [ -1.84]( 4.14)
Triad         1.00 [  0.00]( 1.77)     1.00 [  0.27]( 2.60)

==================================================================
Test          : netperf
Units         : Normalized Throughput
Interpretation: Higher is better
Statistic     : AMean
==================================================================
Clients:        tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
 1-clients    1.00 [  0.00]( 0.22)     0.97 [ -3.01]( 0.40)
 2-clients    1.00 [  0.00]( 0.57)     0.97 [ -3.25]( 0.45)
 4-clients    1.00 [  0.00]( 0.43)     0.97 [ -3.26]( 0.59)
 8-clients    1.00 [  0.00]( 0.27)     0.97 [ -2.83]( 0.55)
16-clients    1.00 [  0.00]( 0.46)     0.97 [ -2.99]( 0.65)
32-clients    1.00 [  0.00]( 0.95)     0.97 [ -2.98]( 0.71)
64-clients    1.00 [  0.00]( 1.79)     0.97 [ -2.61]( 1.38)
128-clients   1.00 [  0.00]( 0.89)     0.97 [ -2.72]( 0.94)
256-clients   1.00 [  0.00]( 3.88)     0.98 [ -1.89]( 2.92)
512-clients   1.00 [  0.00](35.06)     0.99 [ -0.78](47.83)

==================================================================
Test          : schbench
Units         : Normalized 99th percentile latency in us
Interpretation: Lower is better
Statistic     : Median
==================================================================
#workers:       tip[pct imp](CV)    proxy-exec-full[pct imp](CV)
  1           1.00 [ -0.00](27.28)     1.31 [-31.25]( 6.45)
  2           1.00 [ -0.00]( 3.85)     0.95 [  5.00](10.02)
  4           1.00 [ -0.00](14.00)     1.11 [-10.53]( 1.36)
  8           1.00 [ -0.00]( 4.68)     1.15 [-14.58](14.55)
 16           1.00 [ -0.00]( 4.08)     0.98 [  1.61]( 3.28)
 32           1.00 [ -0.00]( 6.68)     1.02 [ -2.04]( 1.71)
 64           1.00 [ -0.00]( 1.79)     1.12 [-11.73]( 7.08)
128           1.00 [ -0.00]( 6.30)     1.11 [-10.84]( 5.52)
256           1.00 [ -0.00](43.39)     1.37 [-37.14](20.11)
512           1.00 [ -0.00]( 2.26)     0.99 [  1.17]( 1.43)

==================================================================
Test          : Unixbench
Units         : Normalized scores
Interpretation: Lower is better
Statistic     : Various (Mentioned)
==================================================================
Metric   Variant                     tip     proxy-exec-full
Hmean    unixbench-dhry2reg-1       0.00%      -0.67%
Hmean    unixbench-dhry2reg-512     0.00%       0.14%
Amean    unixbench-syscall-1        0.00%      -0.86%
Amean    unixbench-syscall-512      0.00%      -6.42%
Hmean    unixbench-pipe-1           0.00%       0.79%
Hmean    unixbench-pipe-512         0.00%       0.57%
Hmean    unixbench-spawn-1          0.00%      -3.91%
Hmean    unixbench-spawn-512        0.00%       3.17%
Hmean    unixbench-execl-1          0.00%      -1.18%
Hmean    unixbench-execl-512        0.00%       1.26%

--

> (I'm trying to get v9 ready as soon as I can here, as its fixed a
> number of smaller issues - However, I've also managed to uncover some
> new problems in stress testing, so we'll see how quickly I can chase
> those down).

I haven't seen any splats when running the above tests. I'll test some
larger workloads next. Please let me know if you would like me to test
any specific workload or need additional data from these tests :)

> thanks
> -john

--
Thanks and Regards,
Prateek
Hello John,

On 2/29/2024 11:49 AM, John Stultz wrote:
> On Wed, Feb 28, 2024 at 9:37 AM 'K Prateek Nayak' via kernel-team
> <kernel-team@android.com> wrote:
>> I got a chance to test the whole of v8 patches on the same dual socket
>> 3rd Generation EPYC system:
>>
>> tl;dr
>>
>> - There is a slight regression in hackbench but instead of the 10x
>>   blowup seen previously, it is only around 5% with overloaded case
>>   not regressing at all.
>>
>> - A small but consistent (~2-3%) regression is seen in tbench and
>>   netperf.
>
> Once again, thank you so much for your testing and reporting of the
> data! I really appreciate it!
>
> Do you mind sharing exactly how you're running the benchmarks? (I'd
> like to try to reproduce these locally (though my machine is much
> smaller).
>
> I'm guessing the hackbench one is the same command you shared earlier with v6?

Yup it is same as earlier. I'll list all the commands down below:

o Hackbench

  perf bench sched messaging -p -t -l 100000 -g <# of groups>

o Old schbench
  (git://git.kernel.org/pub/scm/linux/kernel/git/mason/schbench.git at
  commit e4aa540 ("Make sure rps isn't zero in auto_rps mode."))

  schbench -m 2 -t <# workers> -r 30

  (I should probably upgrade this to the latest! Let me get on it)

o tbench (https://www.samba.org/ftp/tridge/dbench/dbench-4.0.tar.gz)

  nohup tbench_srv 0 &
  tbench -c client.txt -t 60 <# clients> 127.0.0.1

o Stream (https://www.cs.virginia.edu/stream/FTP/Code/)

  export ARRAY_SIZE=128000000; # 4 * Local L3 size
  gcc -DSTREAM_ARRAY_SIZE=$ARRAY_SIZE -DNTIMES=<Loops internally> -fopenmp -O2 stream.c -o stream
  export OMP_NUM_THREADS=16; # Number of CCX on my machine
  ./stream;

o netperf

  netserver -L 127.0.0.1

  for i in `seq 0 1 <num clients>`; do
    netperf -H 127.0.0.1 -t TCP_RR -l 100 -- -r 100 -k REQUEST_SIZE,RESPONSE_SIZE,ELAPSED_TIME,THROUGHPUT,THROUGHPUT_UNITS,MIN_LATENCY,MEAN_LATENCY,P50_LATENCY,P90_LATENCY,P99_LATENCY,MAX_LATENCY,STDDEV_LATENCY &
  done
  wait;

o Unixbench (from mmtests)

  ./run-mmtests.sh --no-monitor --config configs/config-workload-unixbench

--

If you have any other question, please do let me know :)

> thanks
> -john

--
Thanks and Regards,
Prateek
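As an aside for anyone post-processing these runs: the `-k` flag in the netperf invocation above selects netperf's "keyed" omni output, which prints one KEY=VALUE pair per line. A small parsing sketch (the `parse_netperf_keyed` helper and the sample values are mine, made up for illustration):

```python
def parse_netperf_keyed(output: str) -> dict:
    """Parse netperf keyed output (-k KEY,LIST) into a dict.

    Keyed output is one KEY=VALUE pair per line; numeric values become
    floats, anything else (e.g. THROUGHPUT_UNITS) stays a string.
    """
    result = {}
    for line in output.strip().splitlines():
        if "=" not in line:
            continue  # skip banner or stray lines
        key, _, value = line.partition("=")
        try:
            result[key.strip()] = float(value)
        except ValueError:
            result[key.strip()] = value.strip()
    return result

# Hypothetical sample of what one TCP_RR client might emit:
sample = """\
THROUGHPUT=98234.51
THROUGHPUT_UNITS=Trans/s
MEAN_LATENCY=10.21
P99_LATENCY=57
"""
stats = parse_netperf_keyed(sample)
```

Collecting the dicts from all clients makes it straightforward to compute the AMean and CV figures reported in the tables earlier in the thread.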