Message ID | 20231016053002.756205-4-ying.huang@intel.com |
---|---|
State | New |
Headers |
From: Huang Ying <ying.huang@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Arjan Van De Ven <arjan@linux.intel.com>, Huang Ying <ying.huang@intel.com>, Mel Gorman <mgorman@techsingularity.net>, Sudeep Holla <sudeep.holla@arm.com>, Vlastimil Babka <vbabka@suse.cz>, David Hildenbrand <david@redhat.com>, Johannes Weiner <jweiner@redhat.com>, Dave Hansen <dave.hansen@linux.intel.com>, Michal Hocko <mhocko@suse.com>, Pavel Tatashin <pasha.tatashin@soleen.com>, Matthew Wilcox <willy@infradead.org>, Christoph Lameter <cl@linux.com>
Subject: [PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages
Date: Mon, 16 Oct 2023 13:29:56 +0800
Message-Id: <20231016053002.756205-4-ying.huang@intel.com>
In-Reply-To: <20231016053002.756205-1-ying.huang@intel.com>
References: <20231016053002.756205-1-ying.huang@intel.com> |
Series | mm: PCP high auto-tuning |
Commit Message
Huang, Ying
Oct. 16, 2023, 5:29 a.m. UTC
Since commit f26b3fa04611 ("mm/page_alloc: limit number of high-order pages on PCP during bulk free"), the PCP (Per-CPU Pageset) is drained when it is mostly used for freeing high-order pages, to improve the reuse of cache-hot pages between the page-allocating and page-freeing CPUs.

On a system with a small per-CPU data cache slice, no pages should be cached before draining, to guarantee that the freed pages are still cache-hot. But on a system with a large per-CPU data cache slice, some pages can be cached before draining to reduce zone lock contention. So, with this patch, instead of draining without any caching, "pcp->batch" pages will be cached in the PCP before draining if the size of the per-CPU data cache slice is larger than "3 * batch".

In theory, if the size of the per-CPU data cache slice is larger than "2 * batch", we can reuse cache-hot pages between CPUs. But considering other uses of the cache (code, other data accesses, etc.), "3 * batch" is used.

Note: "3 * batch" is chosen to make sure the optimization works on recent x86_64 server CPUs. If you want to increase it, please check whether it breaks the optimization.

On a 2-socket Intel server with 128 logical CPUs, with this patch, the network bandwidth of the UNIX (AF_UNIX) test case of the lmbench test suite with 16-pair processes increases by 70.5%. The cycles% of spinlock contention (mostly zone lock) decreases from 46.1% to 21.3%. The number of PCP drains for high-order page freeing (free_high) decreases by 89.9%. The cache miss rate stays at 0.2%.
Signed-off-by: "Huang, Ying" <ying.huang@intel.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Sudeep Holla <sudeep.holla@arm.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <jweiner@redhat.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Lameter <cl@linux.com>
---
 drivers/base/cacheinfo.c |  2 ++
 include/linux/gfp.h      |  1 +
 include/linux/mmzone.h   |  6 ++++++
 mm/page_alloc.c          | 38 +++++++++++++++++++++++++++++++++++++-
 4 files changed, 46 insertions(+), 1 deletion(-)
Comments
Hello,

kernel test robot noticed a 14.6% improvement of netperf.Throughput_Mbps on:

commit: f5ddc662f07d7d99e9cfc5e07778e26c7394caf8 ("[PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages")
url: https://github.com/intel-lab-lkp/linux/commits/Huang-Ying/mm-pcp-avoid-to-drain-PCP-when-process-exit/20231017-143633
base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git 36b2d7dd5a8ac95c8c1e69bdc93c4a6e2dc28a23
patch link: https://lore.kernel.org/all/20231016053002.756205-4-ying.huang@intel.com/
patch subject: [PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages

testcase: netperf
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:
  cluster: cs-localhost
  cpufreq_governor: performance
  ip: ipv4
  nr_threads: 200%
  runtime: 300s
  send_size: 10K
  test: SCTP_STREAM_MANY

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231027/202310271441.71ce0a9-oliver.sang@intel.com

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/send_size/tbox_group/test/testcase:
  cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-8.3/200%/debian-11.1-x86_64-20220510.cgz/300s/10K/lkp-icl-2sp2/SCTP_STREAM_MANY/netperf

commit:
  c828e65251 ("cacheinfo: calculate size of per-CPU data cache slice")
  f5ddc662f0 ("mm, pcp: reduce lock contention for draining high-order pages")

c828e65251502516  f5ddc662f07d7d99e9cfc5e0777
----------------  ---------------------------
      %stddev  %change  %stddev
26471            -11.1%  23520            uptime.idle
2.098e+10        -14.1%  1.802e+10        cpuidle..time
5.798e+08        +14.3%  6.628e+08        cpuidle..usage
1.329e+09        +14.7%  1.525e+09        numa-numastat.node0.local_node
1.329e+09        +14.7%  1.525e+09        numa-numastat.node0.numa_hit
1.336e+09        +14.6%  1.531e+09        numa-numastat.node1.local_node
1.336e+09        +14.6%  1.531e+09        numa-numastat.node1.numa_hit
1.329e+09        +14.7%  1.525e+09        numa-vmstat.node0.numa_hit
1.329e+09        +14.7%  1.525e+09        numa-vmstat.node0.numa_local
1.336e+09        +14.6%  1.531e+09        numa-vmstat.node1.numa_hit
1.336e+09        +14.6%  1.531e+09        numa-vmstat.node1.numa_local
26.31 ± 12%      +33.0%  35.00 ± 10%      perf-sched.wait_and_delay.avg.ms.__cond_resched.__kmem_cache_alloc_node.kmalloc_trace.sctp_datamsg_from_user.sctp_sendmsg_to_asoc
229.00 ± 13%     -24.7%  172.33 ± 5%      perf-sched.wait_and_delay.count.__cond_resched.__kmem_cache_alloc_node.kmalloc_trace.sctp_datamsg_from_user.sctp_sendmsg_to_asoc
929.50 ± 2%      +8.2%   1005 ± 4%        perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
26.30 ± 12%      +33.0%  35.00 ± 10%      perf-sched.wait_time.avg.ms.__cond_resched.__kmem_cache_alloc_node.kmalloc_trace.sctp_datamsg_from_user.sctp_sendmsg_to_asoc
53.98            -14.1%  46.36            vmstat.cpu.id
58.15            +17.6%  68.37            vmstat.procs.r
3720385          +15.6%  4301904          vmstat.system.cs
1991764          +14.5%  2281507          vmstat.system.in
53.69            -7.7    46.03            mpstat.cpu.all.idle%
2.10             +0.3    2.44             mpstat.cpu.all.irq%
7.25             +1.3    8.58             mpstat.cpu.all.soft%
35.74            +5.7    41.46            mpstat.cpu.all.sys%
1.23             +0.3    1.49             mpstat.cpu.all.usr%
2047040          +2.9%   2105598          proc-vmstat.nr_file_pages
1377160          +4.2%   1435588          proc-vmstat.nr_shmem
2.665e+09        +14.7%  3.056e+09        proc-vmstat.numa_hit
2.665e+09        +14.7%  3.056e+09        proc-vmstat.numa_local
1.534e+10        +14.6%  1.758e+10        proc-vmstat.pgalloc_normal
1.534e+10        +14.6%  1.758e+10        proc-vmstat.pgfree
1296             +16.3%  1507             turbostat.Avg_MHz
49.98            +8.1    58.12            turbostat.Busy%
5.797e+08        +14.3%  6.628e+08        turbostat.C1
53.88            -7.6    46.34            turbostat.C1%
50.02            -16.3%  41.88            turbostat.CPU%c1
6.081e+08        +14.5%  6.961e+08        turbostat.IRQ
391.82           +3.5%   405.41           turbostat.PkgWatt
2204             +14.6%  2527             netperf.ThroughputBoth_Mbps
564378           +14.6%  647027           netperf.ThroughputBoth_total_Mbps
2204             +14.6%  2527             netperf.Throughput_Mbps
564378           +14.6%  647027           netperf.Throughput_total_Mbps
146051           +5.9%   154705           netperf.time.involuntary_context_switches
3011             +16.8%  3516             netperf.time.percent_of_cpu_this_job_got
8875             +16.6%  10351            netperf.time.system_time
221.39           +18.0%  261.14           netperf.time.user_time
2759631          +8.0%   2981144          netperf.time.voluntary_context_switches
2.067e+09        +14.6%  2.369e+09        netperf.workload
2920531          +34.4%  3925407          sched_debug.cfs_rq:/.avg_vruntime.avg
3172407 ± 2%     +36.5%  4331807 ± 3%     sched_debug.cfs_rq:/.avg_vruntime.max
2801767          +35.2%  3787891 ± 2%     sched_debug.cfs_rq:/.avg_vruntime.min
45404 ± 5%       +33.3%  60516 ± 11%      sched_debug.cfs_rq:/.avg_vruntime.stddev
2817265 ± 10%    +40.6%  3961862          sched_debug.cfs_rq:/.left_vruntime.max
376003 ± 18%     +51.2%  568331 ± 13%     sched_debug.cfs_rq:/.left_vruntime.stddev
2920531          +34.4%  3925407          sched_debug.cfs_rq:/.min_vruntime.avg
3172407 ± 2%     +36.5%  4331807 ± 3%     sched_debug.cfs_rq:/.min_vruntime.max
2801767          +35.2%  3787891 ± 2%     sched_debug.cfs_rq:/.min_vruntime.min
45404 ± 5%       +33.3%  60516 ± 11%      sched_debug.cfs_rq:/.min_vruntime.stddev
2817265 ± 10%    +40.6%  3961862          sched_debug.cfs_rq:/.right_vruntime.max
376003 ± 18%     +51.2%  568331 ± 13%     sched_debug.cfs_rq:/.right_vruntime.stddev
157.25 ± 6%      +13.3%  178.14 ± 4%      sched_debug.cfs_rq:/.util_est_enqueued.avg
4361500          +15.5%  5035528          sched_debug.cpu.nr_switches.avg
4674667          +14.7%  5363125          sched_debug.cpu.nr_switches.max
3947619          +14.1%  4504637 ± 2%     sched_debug.cpu.nr_switches.min
0.56             -3.7%   0.54             perf-stat.i.MPKI
2.293e+10        +14.3%  2.622e+10        perf-stat.i.branch-instructions
1.449e+08        +15.6%  1.675e+08        perf-stat.i.branch-misses
2.15             -0.1    2.05             perf-stat.i.cache-miss-rate%
67409238         +10.2%  74274510         perf-stat.i.cache-misses
3.199e+09        +15.7%  3.702e+09        perf-stat.i.cache-references
3765045          +15.6%  4353228          perf-stat.i.context-switches
1.42             +1.7%   1.45             perf-stat.i.cpi
1.717e+11        +16.5%  2e+11            perf-stat.i.cpu-cycles
5094             +51.1%  7695 ± 3%        perf-stat.i.cpu-migrations
2554             +5.7%   2699             perf-stat.i.cycles-between-cache-misses
3.28e+10         +14.5%  3.756e+10        perf-stat.i.dTLB-loads
329792 ± 11%     +37.3%  452936 ± 15%     perf-stat.i.dTLB-store-misses
2.04e+10         +14.7%  2.339e+10        perf-stat.i.dTLB-stores
1.205e+11        +14.4%  1.379e+11        perf-stat.i.instructions
0.71             -1.7%   0.69             perf-stat.i.ipc
1.34             +16.5%  1.56             perf-stat.i.metric.GHz
221.29           +7.4%   237.74           perf-stat.i.metric.K/sec
619.67           +14.5%  709.77           perf-stat.i.metric.M/sec
7031738          +14.3%  8034255          perf-stat.i.node-load-misses
79.94            -1.3    78.62            perf-stat.i.node-store-miss-rate%
3349862 ± 2%     +9.2%   3656880          perf-stat.i.node-stores
0.56             -3.7%   0.54             perf-stat.overall.MPKI
2.11             -0.1    2.01             perf-stat.overall.cache-miss-rate%
1.42             +1.8%   1.45             perf-stat.overall.cpi
2546             +5.7%   2692             perf-stat.overall.cycles-between-cache-misses
0.70             -1.8%   0.69             perf-stat.overall.ipc
79.91            -1.4    78.54            perf-stat.overall.node-store-miss-rate%
2.286e+10        +14.3%  2.614e+10        perf-stat.ps.branch-instructions
1.444e+08        +15.6%  1.669e+08        perf-stat.ps.branch-misses
67192773         +10.2%  74037940         perf-stat.ps.cache-misses
3.189e+09        +15.7%  3.69e+09         perf-stat.ps.cache-references
3753095          +15.6%  4339552          perf-stat.ps.context-switches
1.711e+11        +16.5%  1.994e+11        perf-stat.ps.cpu-cycles
5078             +51.1%  7674 ± 3%        perf-stat.ps.cpu-migrations
3.269e+10        +14.5%  3.743e+10        perf-stat.ps.dTLB-loads
328489 ± 11%     +37.3%  451131 ± 15%     perf-stat.ps.dTLB-store-misses
2.033e+10        +14.7%  2.331e+10        perf-stat.ps.dTLB-stores
1.201e+11        +14.4%  1.374e+11        perf-stat.ps.instructions
7009249          +14.3%  8009170          perf-stat.ps.node-load-misses
3339511 ± 2%     +9.2%   3645997          perf-stat.ps.node-stores
3.635e+13        +14.3%  4.155e+13        perf-stat.total.instructions
4.40 ± 2%        -1.5    2.87             perf-profile.calltrace.cycles-pp.skb_release_data.kfree_skb_reason.sctp_recvmsg.inet_recvmsg.sock_recvmsg
5.83             -1.4    4.41             perf-profile.calltrace.cycles-pp.kfree_skb_reason.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg
1.92 ± 3%        -1.4    0.55             perf-profile.calltrace.cycles-pp.free_unref_page.skb_release_data.kfree_skb_reason.sctp_recvmsg.inet_recvmsg
22.33            -1.3    21.03            perf-profile.calltrace.cycles-pp.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg.___sys_recvmsg
22.42            -1.3    21.12            perf-profile.calltrace.cycles-pp.inet_recvmsg.sock_recvmsg.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg
22.75            -1.3    21.48            perf-profile.calltrace.cycles-pp.sock_recvmsg.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg.do_syscall_64
23.44            -1.2    22.20            perf-profile.calltrace.cycles-pp.____sys_recvmsg.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
24.65            -1.2    23.47            perf-profile.calltrace.cycles-pp.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg
25.14            -1.2    23.98            perf-profile.calltrace.cycles-pp.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg
25.46            -1.1    24.31            perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvmsg
25.59            -1.1    24.46            perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recvmsg
26.47            -1.1    25.36            perf-profile.calltrace.cycles-pp.recvmsg
3.57 ± 6%        -0.6    2.93 ± 9%        perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.__kmalloc_large_node.__kmalloc_node_track_caller
5.22 ± 2%        -0.4    4.79             perf-profile.calltrace.cycles-pp.__alloc_pages.__kmalloc_large_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb
4.76 ± 2%        -0.4    4.33             perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.__kmalloc_large_node.__kmalloc_node_track_caller.kmalloc_reserve
0.96             -0.4    0.59 ± 2%        perf-profile.calltrace.cycles-pp.release_sock.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg
3.16 ± 2%        -0.3    2.84             perf-profile.calltrace.cycles-pp.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb.sctp_packet_transmit.sctp_outq_flush
3.14 ± 2%        -0.3    2.82             perf-profile.calltrace.cycles-pp.__kmalloc_large_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb.sctp_packet_transmit
3.18 ± 2%        -0.3    2.86             perf-profile.calltrace.cycles-pp.kmalloc_reserve.__alloc_skb.sctp_packet_transmit.sctp_outq_flush.sctp_cmd_interpreter
3.44 ± 2%        -0.3    3.13             perf-profile.calltrace.cycles-pp.__alloc_skb.sctp_packet_transmit.sctp_outq_flush.sctp_cmd_interpreter.sctp_do_sm
1.62 ± 3%        -0.3    1.34 ± 2%        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue.get_page_from_freelist.__alloc_pages.__kmalloc_large_node
1.49 ± 3%        -0.3    1.22 ± 3%        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue.get_page_from_freelist.__alloc_pages
1.46 ± 2%        -0.2    1.25 ± 2%        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.__free_pages_ok.skb_release_data.kfree_skb_reason
1.62 ± 2%        -0.2    1.43 ± 2%        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.__free_pages_ok.skb_release_data.kfree_skb_reason.sctp_recvmsg
1.99 ± 2%        -0.2    1.80             perf-profile.calltrace.cycles-pp.__free_pages_ok.skb_release_data.kfree_skb_reason.sctp_recvmsg.inet_recvmsg
0.76             -0.2    0.58             perf-profile.calltrace.cycles-pp.asm_sysvec_apic_timer_interrupt.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter
0.85             -0.1    0.74             perf-profile.calltrace.cycles-pp.__slab_free.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg
0.84             -0.1    0.73             perf-profile.calltrace.cycles-pp.free_unref_page_commit.free_unref_page.skb_release_data.consume_skb.sctp_chunk_put
1.37             -0.1    1.28             perf-profile.calltrace.cycles-pp.free_unref_page.skb_release_data.consume_skb.sctp_chunk_put.sctp_outq_sack
2.65             -0.1    2.57             perf-profile.calltrace.cycles-pp.kmalloc_reserve.__alloc_skb._sctp_make_chunk.sctp_make_datafrag_empty.sctp_datamsg_from_user
2.56             -0.1    2.48             perf-profile.calltrace.cycles-pp.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb._sctp_make_chunk.sctp_make_datafrag_empty
2.49 ± 2%        -0.1    2.42             perf-profile.calltrace.cycles-pp.__kmalloc_large_node.__kmalloc_node_track_caller.kmalloc_reserve.__alloc_skb._sctp_make_chunk
1.92             -0.1    1.85             perf-profile.calltrace.cycles-pp.skb_release_data.consume_skb.sctp_chunk_put.sctp_outq_sack.sctp_cmd_interpreter
0.62             +0.0    0.64             perf-profile.calltrace.cycles-pp.simple_copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.sctp_recvmsg.inet_recvmsg
0.65             +0.0    0.68             perf-profile.calltrace.cycles-pp.sctp_chunk_put.sctp_ulpevent_free.sctp_recvmsg.inet_recvmsg.sock_recvmsg
0.89             +0.0    0.93             perf-profile.calltrace.cycles-pp.copy_msghdr_from_user.___sys_recvmsg.__sys_recvmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
1.24             +0.0    1.28             perf-profile.calltrace.cycles-pp.try_to_wake_up.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sctp_data_ready
0.56 ± 2%        +0.0    0.60             perf-profile.calltrace.cycles-pp.sctp_packet_config.sctp_outq_select_transport.sctp_outq_flush_data.sctp_outq_flush.sctp_cmd_interpreter
1.32             +0.0    1.36             perf-profile.calltrace.cycles-pp.autoremove_wake_function.__wake_up_common.__wake_up_common_lock.sctp_data_ready.sctp_ulpq_tail_event
1.29             +0.0    1.33             perf-profile.calltrace.cycles-pp.sctp_ulpevent_free.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg
0.71 ± 2%        +0.0    0.75             perf-profile.calltrace.cycles-pp.sctp_outq_select_transport.sctp_outq_flush_data.sctp_outq_flush.sctp_cmd_interpreter.sctp_do_sm
0.61             +0.0    0.66             perf-profile.calltrace.cycles-pp.dequeue_entity.dequeue_task_fair.__schedule.schedule.schedule_timeout
0.62             +0.1    0.67             perf-profile.calltrace.cycles-pp.enqueue_entity.enqueue_task_fair.activate_task.ttwu_do_activate.sched_ttwu_pending
1.50             +0.1    1.56             perf-profile.calltrace.cycles-pp.__wake_up_common.__wake_up_common_lock.sctp_data_ready.sctp_ulpq_tail_event.sctp_ulpq_tail_data
1.58             +0.1    1.64             perf-profile.calltrace.cycles-pp.__wake_up_common_lock.sctp_data_ready.sctp_ulpq_tail_event.sctp_ulpq_tail_data.sctp_cmd_interpreter
0.70             +0.1    0.76             perf-profile.calltrace.cycles-pp.dequeue_task_fair.__schedule.schedule.schedule_timeout.sctp_skb_recv_datagram
1.02             +0.1    1.08             perf-profile.calltrace.cycles-pp.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.__sysvec_call_function_single.sysvec_call_function_single
2.02             +0.1    2.09             perf-profile.calltrace.cycles-pp.sctp_outq_flush_data.sctp_outq_flush.sctp_cmd_interpreter.sctp_do_sm.sctp_primitive_SEND
1.86             +0.1    1.93             perf-profile.calltrace.cycles-pp.sctp_data_ready.sctp_ulpq_tail_event.sctp_ulpq_tail_data.sctp_cmd_interpreter.sctp_do_sm
0.76             +0.1    0.83             perf-profile.calltrace.cycles-pp.activate_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue.__sysvec_call_function_single
0.73             +0.1    0.80             perf-profile.calltrace.cycles-pp.enqueue_task_fair.activate_task.ttwu_do_activate.sched_ttwu_pending.__flush_smp_call_function_queue
0.89             +0.1    0.96             perf-profile.calltrace.cycles-pp.__schedule.schedule_idle.do_idle.cpu_startup_entry.start_secondary
0.82             +0.1    0.89             perf-profile.calltrace.cycles-pp.__sk_mem_reduce_allocated.sctp_wfree.skb_release_head_state.consume_skb.sctp_chunk_put
0.95             +0.1    1.03             perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
2.06             +0.1    2.14             perf-profile.calltrace.cycles-pp.sctp_ulpq_tail_event.sctp_ulpq_tail_data.sctp_cmd_interpreter.sctp_do_sm.sctp_assoc_bh_rcv
3.68             +0.1    3.76             perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.sctp_user_addto_chunk.sctp_datamsg_from_user.sctp_sendmsg_to_asoc
0.98             +0.1    1.06             perf-profile.calltrace.cycles-pp.__sk_mem_reduce_allocated.skb_release_head_state.kfree_skb_reason.sctp_recvmsg.inet_recvmsg
1.34             +0.1    1.43             perf-profile.calltrace.cycles-pp.sched_ttwu_pending.__flush_smp_call_function_queue.__sysvec_call_function_single.sysvec_call_function_single.asm_sysvec_call_function_single
1.38             +0.1    1.47             perf-profile.calltrace.cycles-pp.sctp_wfree.skb_release_head_state.consume_skb.sctp_chunk_put.sctp_outq_sack
1.54             +0.1    1.63             perf-profile.calltrace.cycles-pp.skb_release_head_state.consume_skb.sctp_chunk_put.sctp_outq_sack.sctp_cmd_interpreter
1.25 ± 2%        +0.1    1.35             perf-profile.calltrace.cycles-pp.__sk_mem_raise_allocated.__sk_mem_schedule.sctp_sendmsg_to_asoc.sctp_sendmsg.sock_sendmsg
1.28 ± 2%        +0.1    1.38             perf-profile.calltrace.cycles-pp.__sk_mem_schedule.sctp_sendmsg_to_asoc.sctp_sendmsg.sock_sendmsg.____sys_sendmsg
1.82             +0.1    1.93             perf-profile.calltrace.cycles-pp.__flush_smp_call_function_queue.__sysvec_call_function_single.sysvec_call_function_single.asm_sysvec_call_function_single.acpi_safe_halt
2.00             +0.1    2.11             perf-profile.calltrace.cycles-pp.__sysvec_call_function_single.sysvec_call_function_single.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_enter
1.39             +0.1    1.50             perf-profile.calltrace.cycles-pp.skb_release_head_state.kfree_skb_reason.sctp_recvmsg.inet_recvmsg.sock_recvmsg
4.39             +0.1    4.51             perf-profile.calltrace.cycles-pp.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
2.68             +0.2    2.84             perf-profile.calltrace.cycles-pp.sysvec_call_function_single.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state
2.98             +0.2    3.14             perf-profile.calltrace.cycles-pp.sctp_ulpevent_make_rcvmsg.sctp_ulpq_tail_data.sctp_cmd_interpreter.sctp_do_sm.sctp_assoc_bh_rcv
1.88             +0.2    2.06             perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.sctp_skb_recv_datagram.sctp_recvmsg
0.34 ± 70%       +0.2    0.54             perf-profile.calltrace.cycles-pp.pick_next_task_fair.__schedule.schedule.schedule_timeout.sctp_skb_recv_datagram
10.32            +0.2    10.53            perf-profile.calltrace.cycles-pp.sctp_do_sm.sctp_primitive_SEND.sctp_sendmsg_to_asoc.sctp_sendmsg.sock_sendmsg
3.60             +0.2    3.81             perf-profile.calltrace.cycles-pp.sctp_skb_recv_datagram.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg
1.94             +0.2    2.14             perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.sctp_skb_recv_datagram.sctp_recvmsg.inet_recvmsg
2.20             +0.2    2.41             perf-profile.calltrace.cycles-pp.schedule_timeout.sctp_skb_recv_datagram.sctp_recvmsg.inet_recvmsg.sock_recvmsg
10.93            +0.2    11.16            perf-profile.calltrace.cycles-pp.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
10.51            +0.2    10.74            perf-profile.calltrace.cycles-pp.sctp_cmd_interpreter.sctp_do_sm.sctp_primitive_SEND.sctp_sendmsg_to_asoc.sctp_sendmsg
7.26             +0.2    7.50             perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.sctp_recvmsg
11.17            +0.2    11.42            perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
5.40             +0.2    5.64             perf-profile.calltrace.cycles-pp.sctp_ulpq_tail_data.sctp_cmd_interpreter.sctp_do_sm.sctp_assoc_bh_rcv.sctp_rcv
11.25            +0.2    11.50            perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
7.38             +0.3    7.64             perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.sctp_recvmsg.inet_recvmsg
20.03            +0.3    20.29            perf-profile.calltrace.cycles-pp.sctp_backlog_rcv.__release_sock.release_sock.sctp_sendmsg.sock_sendmsg
20.09            +0.3    20.36            perf-profile.calltrace.cycles-pp.__release_sock.release_sock.sctp_sendmsg.sock_sendmsg.____sys_sendmsg
20.30            +0.3    20.57            perf-profile.calltrace.cycles-pp.release_sock.sctp_sendmsg.sock_sendmsg.____sys_sendmsg.___sys_sendmsg
8.40             +0.3    8.68             perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.sctp_recvmsg.inet_recvmsg.sock_recvmsg
8.44             +0.3    8.72             perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.sctp_recvmsg.inet_recvmsg.sock_recvmsg.____sys_recvmsg
11.85            +0.3    12.14            perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
11.22            +0.3    11.52            perf-profile.calltrace.cycles-pp.sctp_packet_transmit.sctp_outq_flush.sctp_cmd_interpreter.sctp_do_sm.sctp_primitive_SEND
13.26            +0.3    13.61            perf-profile.calltrace.cycles-pp.sctp_outq_flush.sctp_cmd_interpreter.sctp_do_sm.sctp_primitive_SEND.sctp_sendmsg_to_asoc
13.21            +0.4    13.59            perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
13.25            +0.4    13.64            perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
13.24            +0.4    13.62            perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
13.34            +0.4    13.74            perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
15.70            +0.4    16.12            perf-profile.calltrace.cycles-pp.sctp_primitive_SEND.sctp_sendmsg_to_asoc.sctp_sendmsg.sock_sendmsg.____sys_sendmsg
0.55             +0.5    1.02 ± 19%       perf-profile.calltrace.cycles-pp.__sk_mem_raise_allocated.__sk_mem_schedule.sctp_ulpevent_make_rcvmsg.sctp_ulpq_tail_data.sctp_cmd_interpreter
0.66 ± 28%       +0.5    1.14             perf-profile.calltrace.cycles-pp.__sk_mem_schedule.sctp_ulpevent_make_rcvmsg.sctp_ulpq_tail_data.sctp_cmd_interpreter.sctp_do_sm
0.00             +0.5    0.54             perf-profile.calltrace.cycles-pp.sctp_sf_eat_data_6_2.sctp_do_sm.sctp_assoc_bh_rcv.sctp_rcv.ip_protocol_deliver_rcu
51.26            +0.5    51.80            perf-profile.calltrace.cycles-pp.sctp_sendmsg.sock_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg
15.28            +0.5    15.82            perf-profile.calltrace.cycles-pp.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter
51.76            +0.6    52.32            perf-profile.calltrace.cycles-pp.sock_sendmsg.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64
53.77            +0.6    54.34            perf-profile.calltrace.cycles-pp.____sys_sendmsg.___sys_sendmsg.__sys_sendmsg.do_syscall_64.entry_SYSCALL_64_after_hwframe
6.06 ± 2%        -2.4    3.68             perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
5.94 ± 2%        -2.2    3.75             perf-profile.children.cycles-pp._raw_spin_lock_irqsave
6.84             -1.9    4.97             perf-profile.children.cycles-pp.skb_release_data
3.64             -1.7    1.92             perf-profile.children.cycles-pp.free_unref_page
2.04 ± 2%        -1.7    0.34 ± 2%        perf-profile.children.cycles-pp.free_pcppages_bulk
5.84             -1.4    4.42             perf-profile.children.cycles-pp.kfree_skb_reason
22.43            -1.3    21.14            perf-profile.children.cycles-pp.inet_recvmsg
22.67            -1.3    21.39            perf-profile.children.cycles-pp.sctp_recvmsg
22.76            -1.3    21.50            perf-profile.children.cycles-pp.sock_recvmsg
23.46            -1.2    22.22            perf-profile.children.cycles-pp.____sys_recvmsg
24.68            -1.2    23.50            perf-profile.children.cycles-pp.___sys_recvmsg
25.16            -1.2    24.00            perf-profile.children.cycles-pp.__sys_recvmsg
26.69            -1.1    25.59            perf-profile.children.cycles-pp.recvmsg
82.77            -0.5    82.24            perf-profile.children.cycles-pp.do_syscall_64
83.14            -0.5    82.63            perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
5.02             -0.5    4.53             perf-profile.children.cycles-pp.get_page_from_freelist
5.46             -0.5    4.98             perf-profile.children.cycles-pp.__alloc_pages
5.96             -0.5    5.50             perf-profile.children.cycles-pp.__kmalloc_node_track_caller
6.21             -0.5    5.76             perf-profile.children.cycles-pp.kmalloc_reserve
3.86             -0.5    3.41             perf-profile.children.cycles-pp.rmqueue
5.88             -0.5    5.44             perf-profile.children.cycles-pp.__kmalloc_large_node
7.47             -0.4    7.07             perf-profile.children.cycles-pp.__alloc_skb
0.65 ± 3%        -0.3    0.30 ± 5%        perf-profile.children.cycles-pp.sctp_wait_for_sndbuf
1.91             -0.3    1.58             perf-profile.children.cycles-pp._raw_spin_lock_bh
1.78             -0.3    1.46             perf-profile.children.cycles-pp.lock_sock_nested
4.43             -0.2    4.22             perf-profile.children.cycles-pp.consume_skb
6.00             -0.2    5.80             perf-profile.children.cycles-pp.sctp_outq_sack
5.82             -0.2    5.62             perf-profile.children.cycles-pp.sctp_chunk_put
2.00 ± 2%        -0.2    1.82 ± 2%        perf-profile.children.cycles-pp.__free_pages_ok
1.20             -0.2    1.04             perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
1.27             -0.1    1.16             perf-profile.children.cycles-pp.__slab_free
0.39             -0.1    0.32 ± 2%        perf-profile.children.cycles-pp.__free_one_page
0.86 ± 2%        -0.1    0.79             perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
0.42             -0.1    0.36 ± 2%        perf-profile.children.cycles-pp.__zone_watermark_ok
0.45 ± 2%        -0.1    0.40 ± 2%        perf-profile.children.cycles-pp.rmqueue_bulk
0.54             -0.0    0.51             perf-profile.children.cycles-pp.__list_add_valid_or_report
0.65 ± 2%        -0.0    0.62             perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
0.47 ± 2%        -0.0    0.44 ± 2%        perf-profile.children.cycles-pp.__kmalloc
0.25 ± 3%        -0.0    0.22 ± 2%        perf-profile.children.cycles-pp.__irq_exit_rcu
0.24 ± 4%        -0.0    0.22 ± 3%        perf-profile.children.cycles-pp.perf_event_task_tick
0.24 ± 3%        -0.0    0.22 ± 3%        perf-profile.children.cycles-pp.perf_adjust_freq_unthr_context
0.15 ± 5%        -0.0    0.13 ± 4%        perf-profile.children.cycles-pp.__intel_pmu_enable_all
0.11 ± 4%        -0.0    0.09 ± 5%        perf-profile.children.cycles-pp.sctp_assoc_rwnd_increase
0.06             +0.0    0.07             perf-profile.children.cycles-pp.ct_idle_exit
0.12 ± 3%        +0.0    0.13 ± 2%        perf-profile.children.cycles-pp.get_pfnblock_flags_mask
0.42             +0.0    0.44             perf-profile.children.cycles-pp.free_unref_page_prepare
0.14 ± 2%        +0.0    0.16 ± 3%        perf-profile.children.cycles-pp.check_stack_object
0.13 ± 2%        +0.0    0.15 ± 3%        perf-profile.children.cycles-pp.__mod_lruvec_page_state
0.27             +0.0    0.28             perf-profile.children.cycles-pp.update_curr
0.22             +0.0    0.24 ± 2%        perf-profile.children.cycles-pp.__switch_to_asm
0.16 ± 2%        +0.0    0.18 ± 3%        perf-profile.children.cycles-pp.__update_load_avg_se
0.29 ± 2%        +0.0    0.30             perf-profile.children.cycles-pp.sctp_outq_flush_ctrl
0.42             +0.0    0.44             perf-profile.children.cycles-pp.free_large_kmalloc
0.13 ± 2%        +0.0    0.15 ± 7%        perf-profile.children.cycles-pp.update_cfs_group
0.40             +0.0    0.42 ± 2%        perf-profile.children.cycles-pp.loopback_xmit
0.24 ± 3%        +0.0    0.26 ± 2%        perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
0.45             +0.0    0.47             perf-profile.children.cycles-pp.dev_hard_start_xmit
0.20             +0.0    0.23 ± 3%        perf-profile.children.cycles-pp.set_next_entity
0.63             +0.0    0.65             perf-profile.children.cycles-pp.simple_copy_to_iter
0.13 ± 3%        +0.0    0.15 ± 3%        perf-profile.children.cycles-pp.sk_leave_memory_pressure
0.30             +0.0    0.32 ± 2%        perf-profile.children.cycles-pp.sctp_inet_skb_msgname
0.54 ± 2%        +0.0    0.57 ± 2%        perf-profile.children.cycles-pp.__copy_skb_header
0.31             +0.0    0.34             perf-profile.children.cycles-pp.___perf_sw_event
0.27 ± 3%        +0.0    0.30 ± 2%        perf-profile.children.cycles-pp.security_socket_recvmsg
0.24 ± 3%        +0.0    0.26             perf-profile.children.cycles-pp.ipv4_dst_check
0.42 ± 2%        +0.0    0.44 ± 3%        perf-profile.children.cycles-pp.page_counter_try_charge
1.30             +0.0    1.33             perf-profile.children.cycles-pp.try_to_wake_up
0.42             +0.0    0.45             perf-profile.children.cycles-pp.__mod_node_page_state
0.79             +0.0    0.82             perf-profile.children.cycles-pp.__skb_clone
0.44             +0.0    0.48             perf-profile.children.cycles-pp.aa_sk_perm
0.30             +0.0    0.33 ± 4%        perf-profile.children.cycles-pp.accept_connection
0.30             +0.0    0.33 ± 4%        perf-profile.children.cycles-pp.spawn_child
0.30             +0.0    0.33 ± 4%        perf-profile.children.cycles-pp.process_requests
0.36             +0.0    0.40             perf-profile.children.cycles-pp.prepare_task_switch
0.28 ± 2%        +0.0    0.31 ± 5%        perf-profile.children.cycles-pp.recv_sctp_stream_1toMany
0.66             +0.0    0.70             perf-profile.children.cycles-pp.sctp_addrs_lookup_transport
0.69             +0.0    0.72             perf-profile.children.cycles-pp.__sctp_rcv_lookup
0.39 ± 3%        +0.0    0.43             perf-profile.children.cycles-pp.dst_release
1.36             +0.0    1.40             perf-profile.children.cycles-pp.autoremove_wake_function
0.77             +0.0    0.81             perf-profile.children.cycles-pp.kmem_cache_alloc_node
1.31             +0.0    1.35             perf-profile.children.cycles-pp.sctp_ulpevent_free
0.92             +0.0    0.96             perf-profile.children.cycles-pp.try_charge_memcg
0.64             +0.0    0.69             perf-profile.children.cycles-pp.dequeue_entity
0.83             +0.0    0.88             perf-profile.children.cycles-pp.sctp_packet_config
2.48             +0.0    2.53             perf-profile.children.cycles-pp.copy_msghdr_from_user
0.61 ± 3%        +0.1    0.66 ± 2%        perf-profile.children.cycles-pp.mem_cgroup_uncharge_skmem
0.66             +0.1    0.71             perf-profile.children.cycles-pp.enqueue_entity
1.56             +0.1    1.61             perf-profile.children.cycles-pp.__wake_up_common
1.39             +0.1    1.45             perf-profile.children.cycles-pp.kmem_cache_free
1.02 ± 2%        +0.1    1.08             perf-profile.children.cycles-pp.sctp_outq_select_transport
0.00             +0.1    0.06 ± 9%        perf-profile.children.cycles-pp.pick_next_task_idle
1.64             +0.1    1.70             perf-profile.children.cycles-pp.__wake_up_common_lock
0.86 ± 3%        +0.1    0.92             perf-profile.children.cycles-pp.pick_next_task_fair
0.58             +0.1    0.64             perf-profile.children.cycles-pp.update_load_avg
1.56             +0.1    1.62
perf-profile.children.cycles-pp.__check_object_size 0.71 +0.1 0.77 perf-profile.children.cycles-pp.dequeue_task_fair 0.86 +0.1 0.93 perf-profile.children.cycles-pp.sctp_eat_data 1.92 +0.1 1.99 perf-profile.children.cycles-pp.sctp_data_ready 1.05 +0.1 1.12 perf-profile.children.cycles-pp.ttwu_do_activate 0.26 ± 32% +0.1 0.33 ± 4% perf-profile.children.cycles-pp.accept_connections 2.16 +0.1 2.22 perf-profile.children.cycles-pp.sctp_ulpq_tail_event 0.76 +0.1 0.83 perf-profile.children.cycles-pp.enqueue_task_fair 0.78 +0.1 0.86 perf-profile.children.cycles-pp.activate_task 0.98 +0.1 1.05 perf-profile.children.cycles-pp.sctp_sf_eat_data_6_2 0.97 +0.1 1.04 perf-profile.children.cycles-pp.schedule_idle 3.22 +0.1 3.30 perf-profile.children.cycles-pp.sctp_outq_flush_data 1.78 +0.1 1.85 perf-profile.children.cycles-pp.mem_cgroup_charge_skmem 1.48 +0.1 1.56 perf-profile.children.cycles-pp.sctp_wfree 1.38 +0.1 1.46 perf-profile.children.cycles-pp.sched_ttwu_pending 3.80 +0.1 3.89 perf-profile.children.cycles-pp.copyin 3.92 +0.1 4.00 perf-profile.children.cycles-pp._copy_from_iter 10.14 +0.1 10.24 perf-profile.children.cycles-pp.sctp_datamsg_from_user 1.87 +0.1 1.97 perf-profile.children.cycles-pp.__flush_smp_call_function_queue 4.48 +0.1 4.59 perf-profile.children.cycles-pp.sctp_user_addto_chunk 2.04 +0.1 2.15 perf-profile.children.cycles-pp.__sysvec_call_function_single 6.96 +0.1 7.09 perf-profile.children.cycles-pp.__memcpy 7.57 +0.1 7.71 perf-profile.children.cycles-pp.sctp_packet_pack 3.20 +0.1 3.34 perf-profile.children.cycles-pp.sctp_ulpevent_make_rcvmsg 1.85 +0.2 2.00 perf-profile.children.cycles-pp.__sk_mem_reduce_allocated 12.41 +0.2 12.56 perf-profile.children.cycles-pp.sctp_rcv 2.74 +0.2 2.90 perf-profile.children.cycles-pp.sysvec_call_function_single 2.41 +0.2 2.57 perf-profile.children.cycles-pp.__sk_mem_raise_allocated 2.48 +0.2 2.65 perf-profile.children.cycles-pp.__sk_mem_schedule 13.86 +0.2 14.04 perf-profile.children.cycles-pp.__do_softirq 13.28 +0.2 13.45 
perf-profile.children.cycles-pp.process_backlog 13.31 +0.2 13.49 perf-profile.children.cycles-pp.__napi_poll 13.45 +0.2 13.63 perf-profile.children.cycles-pp.net_rx_action 2.04 +0.2 2.21 perf-profile.children.cycles-pp.schedule 2.28 +0.2 2.46 perf-profile.children.cycles-pp.schedule_timeout 12.53 +0.2 12.71 perf-profile.children.cycles-pp.ip_local_deliver_finish 13.05 +0.2 13.23 perf-profile.children.cycles-pp.__netif_receive_skb_one_core 12.51 +0.2 12.69 perf-profile.children.cycles-pp.ip_protocol_deliver_rcu 29.73 +0.2 29.92 perf-profile.children.cycles-pp.sctp_outq_flush 3.63 +0.2 3.84 perf-profile.children.cycles-pp.sctp_skb_recv_datagram 13.78 +0.2 13.98 perf-profile.children.cycles-pp.do_softirq 5.68 +0.2 5.89 perf-profile.children.cycles-pp.sctp_ulpq_tail_data 13.98 +0.2 14.20 perf-profile.children.cycles-pp.__local_bh_enable_ip 3.22 +0.2 3.44 perf-profile.children.cycles-pp.skb_release_head_state 2.90 +0.2 3.13 perf-profile.children.cycles-pp.__schedule 36.67 +0.2 36.90 perf-profile.children.cycles-pp.sctp_do_sm 36.13 +0.2 36.36 perf-profile.children.cycles-pp.sctp_cmd_interpreter 10.99 +0.2 11.22 perf-profile.children.cycles-pp.acpi_safe_halt 7.30 +0.2 7.54 perf-profile.children.cycles-pp.copyout 14.37 +0.2 14.61 perf-profile.children.cycles-pp.__dev_queue_xmit 11.01 +0.2 11.26 perf-profile.children.cycles-pp.acpi_idle_enter 14.53 +0.2 14.78 perf-profile.children.cycles-pp.ip_finish_output2 7.40 +0.3 7.65 perf-profile.children.cycles-pp._copy_to_iter 15.04 +0.3 15.29 perf-profile.children.cycles-pp.__ip_queue_xmit 11.26 +0.3 11.52 perf-profile.children.cycles-pp.cpuidle_enter_state 11.33 +0.3 11.59 perf-profile.children.cycles-pp.cpuidle_enter 29.10 +0.3 29.37 perf-profile.children.cycles-pp.sctp_sendmsg_to_asoc 8.41 +0.3 8.69 perf-profile.children.cycles-pp.__skb_datagram_iter 8.45 +0.3 8.73 perf-profile.children.cycles-pp.skb_copy_datagram_iter 11.94 +0.3 12.25 perf-profile.children.cycles-pp.cpuidle_idle_call 9.15 +0.4 9.52 
perf-profile.children.cycles-pp.asm_sysvec_call_function_single 13.25 +0.4 13.64 perf-profile.children.cycles-pp.start_secondary 13.32 +0.4 13.71 perf-profile.children.cycles-pp.do_idle 13.34 +0.4 13.74 perf-profile.children.cycles-pp.secondary_startup_64_no_verify 13.34 +0.4 13.74 perf-profile.children.cycles-pp.cpu_startup_entry 16.00 +0.4 16.41 perf-profile.children.cycles-pp.sctp_primitive_SEND 52.23 +0.6 52.80 perf-profile.children.cycles-pp.sock_sendmsg 52.14 +0.6 52.72 perf-profile.children.cycles-pp.sctp_sendmsg 54.28 +0.6 54.87 perf-profile.children.cycles-pp.____sys_sendmsg 56.24 +0.6 56.85 perf-profile.children.cycles-pp.___sys_sendmsg 56.83 +0.6 57.45 perf-profile.children.cycles-pp.__sys_sendmsg 6.05 ± 2% -2.4 3.68 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 0.97 -0.2 0.81 perf-profile.self.cycles-pp._raw_spin_lock_irqsave 1.26 -0.1 1.14 perf-profile.self.cycles-pp.__slab_free 1.22 -0.1 1.14 perf-profile.self.cycles-pp.rmqueue 0.40 -0.1 0.35 ± 2% perf-profile.self.cycles-pp.__zone_watermark_ok 0.46 -0.0 0.42 perf-profile.self.cycles-pp.__list_add_valid_or_report 0.18 ± 4% -0.0 0.16 ± 4% perf-profile.self.cycles-pp.__free_one_page 0.15 ± 5% -0.0 0.13 ± 4% perf-profile.self.cycles-pp.__intel_pmu_enable_all 0.10 ± 3% +0.0 0.11 perf-profile.self.cycles-pp._copy_to_iter 0.31 +0.0 0.32 perf-profile.self.cycles-pp.sctp_v4_xmit 0.24 +0.0 0.26 ± 2% perf-profile.self.cycles-pp.__sys_sendmsg 0.06 ± 7% +0.0 0.08 perf-profile.self.cycles-pp.dequeue_task_fair 0.07 ± 5% +0.0 0.09 ± 5% perf-profile.self.cycles-pp.newidle_balance 0.40 +0.0 0.42 perf-profile.self.cycles-pp.sctp_skb_recv_datagram 0.19 ± 3% +0.0 0.20 ± 2% perf-profile.self.cycles-pp.menu_select 0.11 ± 4% +0.0 0.13 ± 4% perf-profile.self.cycles-pp.enqueue_task_fair 0.37 +0.0 0.39 perf-profile.self.cycles-pp.__check_object_size 0.78 +0.0 0.80 perf-profile.self.cycles-pp._raw_spin_lock_bh 0.22 ± 2% +0.0 0.24 ± 2% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq 0.21 +0.0 0.23 
perf-profile.self.cycles-pp.__switch_to_asm 0.27 +0.0 0.30 perf-profile.self.cycles-pp.___perf_sw_event 0.38 +0.0 0.40 ± 2% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe 0.12 ± 3% +0.0 0.15 ± 6% perf-profile.self.cycles-pp.update_cfs_group 0.35 +0.0 0.38 perf-profile.self.cycles-pp.____sys_recvmsg 0.05 +0.0 0.08 ± 6% perf-profile.self.cycles-pp.schedule 0.20 +0.0 0.22 ± 2% perf-profile.self.cycles-pp.update_load_avg 0.28 ± 2% +0.0 0.31 perf-profile.self.cycles-pp.sctp_inet_skb_msgname 0.23 ± 3% +0.0 0.25 perf-profile.self.cycles-pp.ipv4_dst_check 0.12 ± 4% +0.0 0.14 ± 3% perf-profile.self.cycles-pp.sk_leave_memory_pressure 0.58 +0.0 0.61 ± 2% perf-profile.self.cycles-pp.kmem_cache_alloc_node 0.41 +0.0 0.44 ± 2% perf-profile.self.cycles-pp.__mod_node_page_state 0.36 +0.0 0.39 ± 2% perf-profile.self.cycles-pp.aa_sk_perm 0.27 ± 3% +0.0 0.30 ± 5% perf-profile.self.cycles-pp.recv_sctp_stream_1toMany 0.78 +0.0 0.82 perf-profile.self.cycles-pp.sctp_recvmsg 0.71 +0.0 0.74 perf-profile.self.cycles-pp.sctp_sendmsg 0.38 ± 3% +0.0 0.42 ± 2% perf-profile.self.cycles-pp.dst_release 1.36 +0.1 1.42 perf-profile.self.cycles-pp.kmem_cache_free 0.51 ± 2% +0.1 0.58 ± 2% perf-profile.self.cycles-pp.__sk_mem_raise_allocated 0.63 +0.1 0.70 ± 2% perf-profile.self.cycles-pp.sctp_eat_data 0.47 ± 4% +0.1 0.55 ± 3% perf-profile.self.cycles-pp.__sk_mem_reduce_allocated 3.77 +0.1 3.86 perf-profile.self.cycles-pp.copyin 6.90 +0.1 7.03 perf-profile.self.cycles-pp.__memcpy 7.30 +0.2 7.45 perf-profile.self.cycles-pp.acpi_safe_halt 7.26 +0.2 7.49 perf-profile.self.cycles-pp.copyout Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance.
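The comparison lines above follow the usual 0day layout: base value, optional "± N%" stddev, signed change (absolute for percentages, relative otherwise), head value, optional stddev, then the metric name. As a rough aid for machine-processing such reports, here is a minimal parsing sketch in Python; the regex and the `parse_delta` helper are my own illustration, not part of the lkp tooling:

```python
import re

# One comparison line: base value, optional ±stddev%, signed change,
# head value, optional ±stddev%, then the metric name.
LINE_RE = re.compile(
    r"^\s*([\d.e+]+)\s*(?:±\s*(\d+)%)?\s+"   # base value, optional stddev%
    r"([+-][\d.]+%?)\s+"                      # change (absolute or percent)
    r"([\d.e+]+)\s*(?:±\s*(\d+)%)?\s+"       # head value, optional stddev%
    r"(\S+)\s*$"                              # metric name
)

def parse_delta(line):
    """Parse one 0day comparison line into a small dict (sketch only)."""
    m = LINE_RE.match(line)
    if not m:
        return None
    base, base_sd, change, head, head_sd, metric = m.groups()
    return {
        "base": float(base),
        "change": change,
        "head": float(head),
        "metric": metric,
    }

sample = ("6.06 ± 2%      -2.4        3.68  "
          "perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath")
print(parse_delta(sample))
```

This handles the plain-value and percentage-stddev forms seen above; lines with other shapes (e.g. headers) simply return None.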
hi, Huang Ying,

Sorry for the lateness of this report. We reported "a 14.6% improvement of
netperf.Throughput_Mbps" in
https://lore.kernel.org/all/202310271441.71ce0a9-oliver.sang@intel.com/

Later, our auto-bisect tool captured a regression on a netperf test with a
different configuration but, unfortunately, marked it as already 'reported',
so we missed sending this report the first time. Now sending it again, FYI.

Hello,

kernel test robot noticed a -60.4% regression of netperf.Throughput_Mbps on:

commit: f5ddc662f07d7d99e9cfc5e07778e26c7394caf8 ("[PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages")
url: https://github.com/intel-lab-lkp/linux/commits/Huang-Ying/mm-pcp-avoid-to-drain-PCP-when-process-exit/20231017-143633
base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git 36b2d7dd5a8ac95c8c1e69bdc93c4a6e2dc28a23
patch link: https://lore.kernel.org/all/20231016053002.756205-4-ying.huang@intel.com/
patch subject: [PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages

testcase: netperf
test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
parameters:

	ip: ipv4
	runtime: 300s
	nr_threads: 50%
	cluster: cs-localhost
	test: UDP_STREAM
	cpufreq_governor: performance

If you fix the issue in a separate patch/commit (i.e. not just a new version of the same patch/commit), kindly add the following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202311061311.8d63998-oliver.sang@intel.com

Details are as below:
-------------------------------------------------------------------------------------------------->

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231106/202311061311.8d63998-oliver.sang@intel.com

=========================================================================================
cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
  cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-8.3/50%/debian-11.1-x86_64-20220510.cgz/300s/lkp-icl-2sp2/UDP_STREAM/netperf

commit:
  c828e65251 ("cacheinfo: calculate size of per-CPU data cache slice")
  f5ddc662f0 ("mm, pcp: reduce lock contention for draining high-order pages")

c828e65251502516 f5ddc662f07d7d99e9cfc5e0777
---------------- ---------------------------
         %stddev     %change         %stddev
             \          |                \
      7321 ±  4%     +28.2%       9382        uptime.idle
     50.65 ±  4%      -4.0%      48.64        boot-time.boot
      6042 ±  4%      -4.2%       5785        boot-time.idle
 1.089e+09 ±  2%    +232.1%  3.618e+09        cpuidle..time
   1087075 ±  2%  +24095.8%   2.63e+08        cpuidle..usage
   3357014           +99.9%    6710312        vmstat.memory.cache
     48731 ± 19%   +4666.5%    2322787        vmstat.system.cs
    144637          +711.2%    1173334        vmstat.system.in
      2.59 ±  2%      +6.2        8.79        mpstat.cpu.all.idle%
      1.01            +0.7        1.66        mpstat.cpu.all.irq%
      6.00            -3.2        2.79        mpstat.cpu.all.soft%
      1.13 ±  2%      -0.1        1.02        mpstat.cpu.all.usr%
 1.407e+09 ±  3%     -28.2%  1.011e+09        numa-numastat.node0.local_node
 1.407e+09 ±  3%     -28.2%   1.01e+09        numa-numastat.node0.numa_hit
 1.469e+09 ±  8%     -32.0%  9.979e+08        numa-numastat.node1.local_node
 1.469e+09 ±  8%     -32.1%  9.974e+08        numa-numastat.node1.numa_hit
    103.00 ± 19%     -44.0%      57.67 ± 20%  perf-c2c.DRAM.local
      8970 ± 12%     -89.4%     951.00 ±  4%  perf-c2c.DRAM.remote
      8192 ±  5%     +68.5%      13807        perf-c2c.HITM.local
      6675 ± 11%     -92.6%     491.00 ±  2%
perf-c2c.HITM.remote
   1051014 ±  2%  +24922.0%   2.63e+08        turbostat.C1
      2.75 ±  2%      +6.5        9.29        turbostat.C1%
      2.72 ±  2%    +178.3%       7.57        turbostat.CPU%c1
      0.09           -22.2%       0.07        turbostat.IPC
  44589125          +701.5%  3.574e+08        turbostat.IRQ
    313.00 ± 57%   +1967.0%       6469 ±  8%  turbostat.POLL
     70.33            +3.3%      72.67        turbostat.PkgTmp
     44.23 ±  4%     -31.8%      30.15 ±  2%  turbostat.RAMWatt
    536096          +583.7%    3665194        meminfo.Active
    535414          +584.4%    3664543        meminfo.Active(anon)
   3238301          +103.2%    6579677        meminfo.Cached
   1204424          +278.9%    4563575        meminfo.Committed_AS
    469093           +47.9%     693889 ±  3%  meminfo.Inactive
    467250           +48.4%     693496 ±  3%  meminfo.Inactive(anon)
     53615          +562.5%     355225 ±  4%  meminfo.Mapped
   5223078           +64.1%    8571212        meminfo.Memused
    557305          +599.6%    3899111        meminfo.Shmem
   5660207           +58.9%    8993642        meminfo.max_used_kB
     78504 ±  3%     -30.1%      54869        netperf.ThroughputBoth_Mbps
   5024292 ±  3%     -30.1%    3511666        netperf.ThroughputBoth_total_Mbps
      7673 ±  5%    +249.7%      26832        netperf.ThroughputRecv_Mbps
    491074 ±  5%    +249.7%    1717287        netperf.ThroughputRecv_total_Mbps
     70831 ±  2%     -60.4%      28037        netperf.Throughput_Mbps
   4533217 ±  2%     -60.4%    1794379        netperf.Throughput_total_Mbps
      5439            +9.4%       5949        netperf.time.percent_of_cpu_this_job_got
     16206            +9.4%      17728        netperf.time.system_time
    388.14           -51.9%     186.53        netperf.time.user_time
 2.876e+09 ±  3%     -30.1%   2.01e+09        netperf.workload
    177360 ± 30%     -36.0%     113450 ± 20%  numa-meminfo.node0.AnonPages
    255926 ± 12%     -40.6%     152052 ± 12%  numa-meminfo.node0.AnonPages.max
     22582 ± 61%    +484.2%     131916 ± 90%  numa-meminfo.node0.Mapped
    138287 ± 17%     +22.6%     169534 ± 12%  numa-meminfo.node1.AnonHugePages
    267468 ± 20%     +29.1%     345385 ±  6%  numa-meminfo.node1.AnonPages
    346204 ± 18%     +34.5%     465696 ±  2%  numa-meminfo.node1.AnonPages.max
    279416 ± 19%     +77.0%     494652 ± 18%  numa-meminfo.node1.Inactive
    278445 ± 19%     +77.6%     494393 ± 18%  numa-meminfo.node1.Inactive(anon)
     31726 ± 45%    +607.7%     224533 ± 45%  numa-meminfo.node1.Mapped
      4802 ±  6%     +19.4%       5733 ±  3%  numa-meminfo.node1.PageTables
    297323 ± 12%    +792.6%    2653850 ± 63%  numa-meminfo.node1.Shmem
     44325 ± 30%     -36.0%      28379 ± 20%
numa-vmstat.node0.nr_anon_pages 5590 ± 61% +491.0% 33042 ± 90% numa-vmstat.node0.nr_mapped 1.407e+09 ± 3% -28.2% 1.01e+09 numa-vmstat.node0.numa_hit 1.407e+09 ± 3% -28.2% 1.011e+09 numa-vmstat.node0.numa_local 66858 ± 20% +29.2% 86385 ± 6% numa-vmstat.node1.nr_anon_pages 69601 ± 20% +77.8% 123729 ± 18% numa-vmstat.node1.nr_inactive_anon 7953 ± 45% +608.3% 56335 ± 45% numa-vmstat.node1.nr_mapped 1201 ± 6% +19.4% 1434 ± 3% numa-vmstat.node1.nr_page_table_pages 74288 ± 11% +792.6% 663111 ± 63% numa-vmstat.node1.nr_shmem 69601 ± 20% +77.8% 123728 ± 18% numa-vmstat.node1.nr_zone_inactive_anon 1.469e+09 ± 8% -32.1% 9.974e+08 numa-vmstat.node1.numa_hit 1.469e+09 ± 8% -32.0% 9.979e+08 numa-vmstat.node1.numa_local 133919 +584.2% 916254 proc-vmstat.nr_active_anon 111196 +3.3% 114828 proc-vmstat.nr_anon_pages 5602484 -1.5% 5518799 proc-vmstat.nr_dirty_background_threshold 11218668 -1.5% 11051092 proc-vmstat.nr_dirty_threshold 809646 +103.2% 1645012 proc-vmstat.nr_file_pages 56374629 -1.5% 55536913 proc-vmstat.nr_free_pages 116775 +48.4% 173349 ± 3% proc-vmstat.nr_inactive_anon 13386 ± 2% +563.3% 88793 ± 4% proc-vmstat.nr_mapped 2286 +6.5% 2434 proc-vmstat.nr_page_table_pages 139393 +599.4% 974869 proc-vmstat.nr_shmem 29092 +6.6% 31019 proc-vmstat.nr_slab_reclaimable 133919 +584.2% 916254 proc-vmstat.nr_zone_active_anon 116775 +48.4% 173349 ± 3% proc-vmstat.nr_zone_inactive_anon 32135 ± 11% +257.2% 114797 ± 21% proc-vmstat.numa_hint_faults 20858 ± 16% +318.3% 87244 ± 6% proc-vmstat.numa_hint_faults_local 2.876e+09 ± 3% -30.2% 2.008e+09 proc-vmstat.numa_hit 2.876e+09 ± 3% -30.2% 2.008e+09 proc-vmstat.numa_local 25453 ± 7% -75.2% 6324 ± 30% proc-vmstat.numa_pages_migrated 178224 ± 2% +76.6% 314680 ± 7% proc-vmstat.numa_pte_updates 160889 ± 3% +267.6% 591393 ± 6% proc-vmstat.pgactivate 2.295e+10 ± 3% -30.2% 1.601e+10 proc-vmstat.pgalloc_normal 1026605 +21.9% 1251671 proc-vmstat.pgfault 2.295e+10 ± 3% -30.2% 1.601e+10 proc-vmstat.pgfree 25453 ± 7% -75.2% 6324 ± 30% 
proc-vmstat.pgmigrate_success 39208 ± 2% -6.1% 36815 proc-vmstat.pgreuse 3164416 -20.3% 2521344 ± 2% proc-vmstat.unevictable_pgs_scanned 19248627 -22.1% 14989905 sched_debug.cfs_rq:/.avg_vruntime.avg 20722680 -24.9% 15569530 sched_debug.cfs_rq:/.avg_vruntime.max 17634233 -22.5% 13663168 sched_debug.cfs_rq:/.avg_vruntime.min 949063 ± 2% -70.5% 280388 sched_debug.cfs_rq:/.avg_vruntime.stddev 0.78 ± 10% -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_running.min 0.16 ± 8% +113.3% 0.33 ± 2% sched_debug.cfs_rq:/.h_nr_running.stddev 0.56 ±141% +2.2e+07% 122016 ± 52% sched_debug.cfs_rq:/.left_vruntime.avg 45.01 ±141% +2.2e+07% 10035976 ± 28% sched_debug.cfs_rq:/.left_vruntime.max 4.58 ±141% +2.3e+07% 1072762 ± 36% sched_debug.cfs_rq:/.left_vruntime.stddev 5814 ± 10% -100.0% 0.00 sched_debug.cfs_rq:/.load.min 5.39 ± 9% -73.2% 1.44 ± 10% sched_debug.cfs_rq:/.load_avg.min 19248627 -22.1% 14989905 sched_debug.cfs_rq:/.min_vruntime.avg 20722680 -24.9% 15569530 sched_debug.cfs_rq:/.min_vruntime.max 17634233 -22.5% 13663168 sched_debug.cfs_rq:/.min_vruntime.min 949063 ± 2% -70.5% 280388 sched_debug.cfs_rq:/.min_vruntime.stddev 0.78 ± 10% -100.0% 0.00 sched_debug.cfs_rq:/.nr_running.min 0.06 ± 8% +369.2% 0.30 ± 3% sched_debug.cfs_rq:/.nr_running.stddev 4.84 ± 26% +1611.3% 82.79 ± 67% sched_debug.cfs_rq:/.removed.load_avg.avg 27.92 ± 12% +3040.3% 876.79 ± 68% sched_debug.cfs_rq:/.removed.load_avg.stddev 0.56 ±141% +2.2e+07% 122016 ± 52% sched_debug.cfs_rq:/.right_vruntime.avg 45.06 ±141% +2.2e+07% 10035976 ± 28% sched_debug.cfs_rq:/.right_vruntime.max 4.59 ±141% +2.3e+07% 1072762 ± 36% sched_debug.cfs_rq:/.right_vruntime.stddev 900.25 -10.4% 806.45 sched_debug.cfs_rq:/.runnable_avg.avg 533.28 ± 4% -87.0% 69.56 ± 39% sched_debug.cfs_rq:/.runnable_avg.min 122.77 ± 2% +92.9% 236.86 sched_debug.cfs_rq:/.runnable_avg.stddev 896.13 -10.8% 799.44 sched_debug.cfs_rq:/.util_avg.avg 379.06 ± 4% -83.4% 62.94 ± 37% sched_debug.cfs_rq:/.util_avg.min 116.35 ± 8% +99.4% 232.04 
sched_debug.cfs_rq:/.util_avg.stddev 550.87 -14.2% 472.66 ± 2% sched_debug.cfs_rq:/.util_est_enqueued.avg 1124 ± 8% +18.2% 1329 ± 3% sched_debug.cfs_rq:/.util_est_enqueued.max 134.17 ± 30% -100.0% 0.00 sched_debug.cfs_rq:/.util_est_enqueued.min 558243 ± 6% -66.9% 184666 sched_debug.cpu.avg_idle.avg 12860 ± 11% -56.1% 5644 sched_debug.cpu.avg_idle.min 365635 -53.5% 169863 ± 5% sched_debug.cpu.avg_idle.stddev 9.56 ± 3% -28.4% 6.84 ± 8% sched_debug.cpu.clock.stddev 6999 ± 2% -85.6% 1007 ± 3% sched_debug.cpu.clock_task.stddev 3985 ± 10% -100.0% 0.00 sched_debug.cpu.curr->pid.min 491.71 ± 10% +209.3% 1520 ± 4% sched_debug.cpu.curr->pid.stddev 270.19 ±141% +1096.6% 3233 ± 51% sched_debug.cpu.max_idle_balance_cost.stddev 0.78 ± 10% -100.0% 0.00 sched_debug.cpu.nr_running.min 0.15 ± 6% +121.7% 0.34 ± 2% sched_debug.cpu.nr_running.stddev 62041 ± 15% +4280.9% 2717948 sched_debug.cpu.nr_switches.avg 1074922 ± 14% +292.6% 4220307 ± 2% sched_debug.cpu.nr_switches.max 1186 ± 2% +1.2e+05% 1379073 ± 4% sched_debug.cpu.nr_switches.min 132392 ± 21% +294.6% 522476 ± 5% sched_debug.cpu.nr_switches.stddev 6.44 ± 4% +21.4% 7.82 ± 12% sched_debug.cpu.nr_uninterruptible.stddev 6.73 ± 13% -84.8% 1.02 ± 5% perf-stat.i.MPKI 1.652e+10 ± 2% -22.2% 1.285e+10 perf-stat.i.branch-instructions 0.72 +0.0 0.75 perf-stat.i.branch-miss-rate% 1.19e+08 ± 3% -19.8% 95493630 perf-stat.i.branch-misses 27.46 ± 12% -26.2 1.30 ± 4% perf-stat.i.cache-miss-rate% 5.943e+08 ± 10% -88.6% 67756219 ± 5% perf-stat.i.cache-misses 2.201e+09 +143.7% 5.364e+09 perf-stat.i.cache-references 48911 ± 19% +4695.4% 2345525 perf-stat.i.context-switches 3.66 ± 2% +28.5% 4.71 perf-stat.i.cpi 3.228e+11 -4.1% 3.097e+11 perf-stat.i.cpu-cycles 190.51 +1363.7% 2788 ± 10% perf-stat.i.cpu-migrations 803.99 ± 6% +510.2% 4905 ± 5% perf-stat.i.cycles-between-cache-misses 0.00 ± 16% +0.0 0.01 ± 14% perf-stat.i.dTLB-load-miss-rate% 755654 ± 18% +232.4% 2512024 ± 14% perf-stat.i.dTLB-load-misses 2.385e+10 ± 2% -26.9% 1.742e+10 
perf-stat.i.dTLB-loads 0.00 ± 31% +0.0 0.01 ± 35% perf-stat.i.dTLB-store-miss-rate% 305657 ± 36% +200.0% 916822 ± 35% perf-stat.i.dTLB-store-misses 1.288e+10 ± 2% -28.8% 9.179e+09 perf-stat.i.dTLB-stores 8.789e+10 ± 2% -25.2% 6.578e+10 perf-stat.i.instructions 0.28 ± 2% -21.6% 0.22 perf-stat.i.ipc 2.52 -4.1% 2.42 perf-stat.i.metric.GHz 873.89 ± 12% -67.0% 288.04 ± 8% perf-stat.i.metric.K/sec 435.61 ± 2% -19.6% 350.06 perf-stat.i.metric.M/sec 2799 +29.9% 3637 ± 2% perf-stat.i.minor-faults 99.74 -2.6 97.11 perf-stat.i.node-load-miss-rate% 1.294e+08 ± 12% -92.4% 9879207 ± 7% perf-stat.i.node-load-misses 76.55 +16.4 92.92 perf-stat.i.node-store-miss-rate% 2.257e+08 ± 10% -90.4% 21721672 ± 8% perf-stat.i.node-store-misses 69217511 ± 13% -97.7% 1625810 ± 7% perf-stat.i.node-stores 2799 +29.9% 3637 ± 2% perf-stat.i.page-faults 6.79 ± 13% -84.9% 1.03 ± 5% perf-stat.overall.MPKI 0.72 +0.0 0.74 perf-stat.overall.branch-miss-rate% 27.06 ± 12% -25.8 1.26 ± 4% perf-stat.overall.cache-miss-rate% 3.68 ± 2% +28.1% 4.71 perf-stat.overall.cpi 549.38 ± 10% +736.0% 4592 ± 5% perf-stat.overall.cycles-between-cache-misses 0.00 ± 18% +0.0 0.01 ± 14% perf-stat.overall.dTLB-load-miss-rate% 0.00 ± 36% +0.0 0.01 ± 35% perf-stat.overall.dTLB-store-miss-rate% 0.27 ± 2% -22.0% 0.21 perf-stat.overall.ipc 99.80 -2.4 97.37 perf-stat.overall.node-load-miss-rate% 76.60 +16.4 93.03 perf-stat.overall.node-store-miss-rate% 9319 +5.8% 9855 perf-stat.overall.path-length 1.646e+10 ± 2% -22.2% 1.281e+10 perf-stat.ps.branch-instructions 1.186e+08 ± 3% -19.8% 95167897 perf-stat.ps.branch-misses 5.924e+08 ± 10% -88.6% 67384354 ± 5% perf-stat.ps.cache-misses 2.193e+09 +143.4% 5.339e+09 perf-stat.ps.cache-references 49100 ± 19% +4668.0% 2341074 perf-stat.ps.context-switches 3.218e+11 -4.1% 3.087e+11 perf-stat.ps.cpu-cycles 189.73 +1368.4% 2786 ± 10% perf-stat.ps.cpu-migrations 753056 ± 18% +229.9% 2484575 ± 14% perf-stat.ps.dTLB-load-misses 2.377e+10 ± 2% -26.9% 1.737e+10 perf-stat.ps.dTLB-loads 304509 ± 36% 
+199.1% 910856 ± 35% perf-stat.ps.dTLB-store-misses 1.284e+10 ± 2% -28.7% 9.152e+09 perf-stat.ps.dTLB-stores 8.76e+10 ± 2% -25.2% 6.557e+10 perf-stat.ps.instructions 2791 +28.2% 3580 ± 2% perf-stat.ps.minor-faults 1.29e+08 ± 12% -92.4% 9815672 ± 7% perf-stat.ps.node-load-misses 2.25e+08 ± 10% -90.4% 21575943 ± 8% perf-stat.ps.node-store-misses 69002373 ± 13% -97.7% 1615410 ± 7% perf-stat.ps.node-stores 2791 +28.2% 3580 ± 2% perf-stat.ps.page-faults 2.68e+13 ± 2% -26.1% 1.981e+13 perf-stat.total.instructions 0.00 ± 35% +2600.0% 0.04 ± 23% perf-sched.sch_delay.avg.ms.__cond_resched.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg 1.18 ± 9% -98.1% 0.02 ± 32% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part 0.58 ± 3% -62.1% 0.22 ± 97% perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown] 0.51 ± 22% -82.7% 0.09 ± 11% perf-sched.sch_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64 0.25 ± 23% -59.6% 0.10 ± 10% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64 0.03 ± 42% -64.0% 0.01 ± 15% perf-sched.sch_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64 0.04 ± 7% +434.6% 0.23 ± 36% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll 1.00 ± 20% -84.1% 0.16 ± 78% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select 0.01 ± 7% -70.0% 0.00 perf-sched.sch_delay.avg.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg 0.02 ± 2% +533.9% 0.12 ± 43% perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork 0.03 ± 7% +105.9% 0.06 ± 33% perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread 0.01 ± 15% +67.5% 0.02 ± 8% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm 0.09 ± 50% -85.7% 0.01 ± 33% 
perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open 0.04 ± 7% +343.4% 0.16 ± 6% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm 0.06 ± 41% +3260.7% 1.88 ± 30% perf-sched.sch_delay.max.ms.__cond_resched.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg 3.78 -96.2% 0.14 ± 3% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part 2.86 ± 4% -72.6% 0.78 ±113% perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown] 4.09 ± 7% -34.1% 2.69 ± 7% perf-sched.sch_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64 3.09 ± 37% -64.1% 1.11 ± 5% perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64 0.00 ±141% +6200.0% 0.13 ± 82% perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt 3.94 -40.5% 2.35 ± 48% perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64 1.63 ± 21% -77.0% 0.38 ± 90% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select 7.29 ± 39% +417.5% 37.72 ± 16% perf-sched.sch_delay.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg 3.35 ± 14% -51.7% 1.62 ± 3% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone 0.05 ± 13% +2245.1% 1.13 ± 40% perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork 3.01 ± 26% +729.6% 25.01 ± 91% perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread 1.93 ± 59% -85.5% 0.28 ± 62% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open 0.01 -50.0% 0.00 perf-sched.total_sch_delay.average.ms 7.29 ± 39% +468.8% 41.46 ± 26% perf-sched.total_sch_delay.max.ms 6.04 ± 4% -94.1% 0.35 perf-sched.total_wait_and_delay.average.ms 205790 ± 3% +1811.0% 3932742 
perf-sched.total_wait_and_delay.count.ms
      6.03 ±  4%     -94.2%       0.35        perf-sched.total_wait_time.average.ms
     75.51 ± 41%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     23.01 ± 17%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
     23.82 ±  7%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
     95.27 ± 41%    -100.0%       0.00        perf-sched.wait_and_delay.avg.ms.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin
     55.86 ±141%   +1014.6%     622.64 ±  5%  perf-sched.wait_and_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      0.07 ± 23%     -82.5%       0.01        perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
    137.41 ±  3%    +345.1%     611.63 ±  2%  perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.04 ±  5%     -49.6%       0.02        perf-sched.wait_and_delay.avg.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
    536.33 ±  5%     -46.5%     287.00        perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
     21.67 ± 32%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      5.67 ±  8%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
      1.67 ± 56%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
      5.67 ± 29%    -100.0%       0.00        perf-sched.wait_and_delay.count.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin
      5.33 ± 23%     +93.8%      10.33 ± 25%  perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    101725 ±  3%     +15.3%     117243 ± 10%  perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
    100.00 ±  7%     -80.3%      19.67 ±  2%  perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
     97762 ±  4%   +3794.8%    3807606        perf-sched.wait_and_delay.count.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
      1091 ±  9%    +111.9%       2311 ±  3%  perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    604.50 ± 43%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
     37.41 ±  9%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
     27.08 ± 13%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
    275.41 ± 32%    -100.0%       0.00        perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin
      1313 ± 69%    +112.1%       2786 ± 15%  perf-sched.wait_and_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    333.38 ±141%    +200.4%       1001        perf-sched.wait_and_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      1000            -96.8%      31.85 ± 48%  perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
     17.99 ± 33%    +387.5%      87.71 ±  8%  perf-sched.wait_and_delay.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
      0.33 ± 19%     -74.1%       0.09 ± 10%  perf-sched.wait_time.avg.ms.__cond_resched.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg
      0.02 ± 53%    +331.4%       0.10 ± 50%  perf-sched.wait_time.avg.ms.__cond_resched.aa_sk_perm.security_socket_recvmsg.sock_recvmsg.__sys_recvfrom
      0.09 ± 65%     -75.9%       0.02 ±  9%  perf-sched.wait_time.avg.ms.__cond_resched.aa_sk_perm.security_socket_sendmsg.sock_sendmsg.__sys_sendto
      0.02 ± 22%     -70.2%       0.01 ±141%  perf-sched.wait_time.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
     75.51 ± 41%    -100.0%       0.04 ± 42%  perf-sched.wait_time.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      0.10 ± 36%     -80.3%       0.02 ±  9%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
      0.55 ± 61%     -94.9%       0.03 ± 45%  perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
     23.01 ± 17%    -100.0%       0.00 ±141%  perf-sched.wait_time.avg.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
     23.82 ±  7%     -99.7%       0.07 ± 57%  perf-sched.wait_time.avg.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
     95.27 ± 41%    -100.0%       0.03 ± 89%  perf-sched.wait_time.avg.ms.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin
     56.30 ±139%   +1005.5%     622.44 ±  5%  perf-sched.wait_time.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
      2.78 ± 66%     -98.2%       0.05 ± 52%  perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      0.07 ± 23%     -82.5%       0.01        perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
    137.37 ±  3%    +345.1%     611.40 ±  2%  perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
      0.02 ±  5%     -41.9%       0.01 ±  3%  perf-sched.wait_time.avg.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
    536.32 ±  5%     -46.5%     286.98        perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
      4.66 ± 20%     -56.7%       2.02 ± 26%  perf-sched.wait_time.max.ms.__cond_resched.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg
      0.03 ± 63%    +995.0%       0.37 ± 26%  perf-sched.wait_time.max.ms.__cond_resched.aa_sk_perm.security_socket_recvmsg.sock_recvmsg.__sys_recvfrom
      1.67 ± 87%     -92.6%       0.12 ± 57%  perf-sched.wait_time.max.ms.__cond_resched.aa_sk_perm.security_socket_sendmsg.sock_sendmsg.__sys_sendto
      0.54 ±117%     -95.1%       0.03 ±105%  perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
      0.06 ± 49%     -89.1%       0.01 ±141%  perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
    604.50 ± 43%    -100.0%       0.16 ± 83%  perf-sched.wait_time.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
      2.77 ± 45%     -95.4%       0.13 ± 64%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
      2.86 ± 45%     -94.3%       0.16 ± 91%  perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
     37.41 ±  9%    -100.0%       0.01 ±141%  perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
     27.08 ± 13%     -99.7%       0.08 ± 61%  perf-sched.wait_time.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
    275.41 ± 32%    -100.0%       0.03 ± 89%  perf-sched.wait_time.max.ms.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin
      1313 ± 69%    +112.1%       2786 ± 15%  perf-sched.wait_time.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
    334.74 ±140%    +198.9%       1000        perf-sched.wait_time.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
     21.74 ± 58%     -95.4%       1.00 ±103%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
      1000            -97.6%      24.49 ± 50%  perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
     10.90 ± 27%    +682.9%      85.36 ±  6%  perf-sched.wait_time.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
     32.91 ± 58%     -63.5%      12.01 ±115%  perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
    169.97 ±  7%     -49.2%      86.29 ± 15%  perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
     44.08           -19.8       24.25        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page.skb_release_data.__consume_stateless_skb
     44.47           -19.6       24.87        perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page.skb_release_data.__consume_stateless_skb.udp_recvmsg
     43.63           -19.5       24.15        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page.skb_release_data
     45.62           -19.2       26.39        perf-profile.calltrace.cycles-pp.skb_release_data.__consume_stateless_skb.udp_recvmsg.inet_recvmsg.sock_recvmsg
     45.62           -19.2       26.40        perf-profile.calltrace.cycles-pp.__consume_stateless_skb.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
     45.00           -19.1       25.94        perf-profile.calltrace.cycles-pp.free_unref_page.skb_release_data.__consume_stateless_skb.udp_recvmsg.inet_recvmsg
     50.41           -16.8       33.64 ± 39%  perf-profile.calltrace.cycles-pp.accept_connections.main.__libc_start_main
     50.41           -16.8       33.64 ± 39%  perf-profile.calltrace.cycles-pp.accept_connection.accept_connections.main.__libc_start_main
     50.41           -16.8       33.64 ± 39%  perf-profile.calltrace.cycles-pp.spawn_child.accept_connection.accept_connections.main.__libc_start_main
     50.41           -16.8       33.64 ± 39%  perf-profile.calltrace.cycles-pp.process_requests.spawn_child.accept_connection.accept_connections.main
     99.92           -14.2       85.72 ± 15%  perf-profile.calltrace.cycles-pp.main.__libc_start_main
     99.96           -14.2       85.77 ± 15%  perf-profile.calltrace.cycles-pp.__libc_start_main
     50.10            -8.6       41.52        perf-profile.calltrace.cycles-pp.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
     50.11            -8.6       41.55        perf-profile.calltrace.cycles-pp.inet_recvmsg.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
     50.13            -8.5       41.64        perf-profile.calltrace.cycles-pp.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
     50.28            -8.0       42.27        perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom
     50.29            -8.0       42.29        perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni
     50.31            -7.9       42.42        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni.process_requests
     50.32            -7.8       42.47        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni.process_requests.spawn_child
     50.36            -7.6       42.78        perf-profile.calltrace.cycles-pp.recvfrom.recv_omni.process_requests.spawn_child.accept_connection
     50.41            -7.3       43.07        perf-profile.calltrace.cycles-pp.recv_omni.process_requests.spawn_child.accept_connection.accept_connections
     19.93 ±  2%      -6.6       13.36        perf-profile.calltrace.cycles-pp.ip_generic_getfrag.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg
     19.44 ±  2%      -6.3       13.16        perf-profile.calltrace.cycles-pp._copy_from_iter.ip_generic_getfrag.__ip_append_data.ip_make_skb.udp_sendmsg
     18.99 ±  2%      -6.1       12.90        perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.ip_generic_getfrag.__ip_append_data.ip_make_skb
      8.95            -5.1        3.82        perf-profile.calltrace.cycles-pp.udp_send_skb.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
      8.70            -5.0        3.71        perf-profile.calltrace.cycles-pp.ip_send_skb.udp_send_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
      8.10            -4.6        3.45        perf-profile.calltrace.cycles-pp.ip_finish_output2.ip_send_skb.udp_send_skb.udp_sendmsg.sock_sendmsg
      7.69            -4.4        3.27        perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.ip_send_skb.udp_send_skb.udp_sendmsg
      6.51            -3.7        2.78        perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.ip_send_skb.udp_send_skb
      6.47            -3.7        2.75        perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.ip_send_skb
      6.41            -3.7        2.71        perf-profile.calltrace.cycles-pp.__do_softirq.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2
      5.88            -3.5        2.43        perf-profile.calltrace.cycles-pp.net_rx_action.__do_softirq.do_softirq.__local_bh_enable_ip.__dev_queue_xmit
      5.73            -3.4        2.35        perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.__do_softirq.do_softirq.__local_bh_enable_ip
      5.69            -3.4        2.33        perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.__do_softirq.do_softirq
      5.36            -3.2        2.19        perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.__do_softirq
      4.59            -2.7        1.89        perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
      4.55 ±  2%      -2.7        1.88        perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
      4.40 ±  2%      -2.6        1.81        perf-profile.calltrace.cycles-pp.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
      3.81 ±  2%      -2.2        1.57        perf-profile.calltrace.cycles-pp.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
      3.75 ±  2%      -2.2        1.55        perf-profile.calltrace.cycles-pp.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
      2.21 ±  2%      -1.6        0.63        perf-profile.calltrace.cycles-pp.__ip_make_skb.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
      1.94 ±  2%      -1.4        0.51 ±  2%  perf-profile.calltrace.cycles-pp.__ip_select_ident.__ip_make_skb.ip_make_skb.udp_sendmsg.sock_sendmsg
      1.14            -0.6        0.51        perf-profile.calltrace.cycles-pp.sock_alloc_send_pskb.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg
      0.00            +0.5        0.53 ±  2%  perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      0.00            +0.7        0.69        perf-profile.calltrace.cycles-pp.sysvec_call_function_single.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state
      0.00            +0.7        0.71        perf-profile.calltrace.cycles-pp.__sk_mem_raise_allocated.__sk_mem_schedule.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb
      0.00            +0.7        0.72        perf-profile.calltrace.cycles-pp.__sk_mem_schedule.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv
      0.00            +1.0        0.99 ± 20%  perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp
      0.00            +1.0        1.01 ± 20%  perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
      0.00            +1.1        1.05 ± 20%  perf-profile.calltrace.cycles-pp.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg
      0.00            +1.1        1.12        perf-profile.calltrace.cycles-pp.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
      0.00            +1.2        1.18 ± 20%  perf-profile.calltrace.cycles-pp.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg.sock_recvmsg
      0.00            +1.3        1.32        perf-profile.calltrace.cycles-pp.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu
      0.00            +2.2        2.23        perf-profile.calltrace.cycles-pp.__skb_recv_udp.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
     49.51            +2.6       52.08        perf-profile.calltrace.cycles-pp.send_udp_stream.main.__libc_start_main
     49.49            +2.6       52.07        perf-profile.calltrace.cycles-pp.send_omni_inner.send_udp_stream.main.__libc_start_main
      0.00            +3.0        2.96 ±  2%  perf-profile.calltrace.cycles-pp.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
     48.71            +3.0       51.73        perf-profile.calltrace.cycles-pp.sendto.send_omni_inner.send_udp_stream.main.__libc_start_main
      0.00            +3.1        3.06 ±  2%  perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
      0.00            +3.1        3.09        perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
     48.34            +3.2       51.56        perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner.send_udp_stream.main
      0.00            +3.3        3.33 ±  2%  perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
     48.13            +3.8       51.96        perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner.send_udp_stream
     47.82            +4.0       51.82        perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner
     47.70            +4.1       51.76        perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto
      0.00            +4.1        4.08        perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      0.00            +4.1        4.10        perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
      0.00            +4.1        4.10        perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
      0.00            +4.1        4.14        perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
      0.00            +4.3        4.35 ±  2%  perf-profile.calltrace.cycles-pp.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter
     46.52            +4.8       51.27        perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
     46.04            +5.0       51.08        perf-profile.calltrace.cycles-pp.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
      3.67            +8.0       11.63        perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg
      3.71            +8.1       11.80        perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg
      3.96            +8.5       12.42        perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg.sock_recvmsg
      3.96            +8.5       12.44        perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
     35.13           +11.3       46.39        perf-profile.calltrace.cycles-pp.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
     32.68 ±  2%    +13.0       45.65        perf-profile.calltrace.cycles-pp.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
     10.27           +20.3       30.59        perf-profile.calltrace.cycles-pp.sk_page_frag_refill.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg
     10.24           +20.3       30.58        perf-profile.calltrace.cycles-pp.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data.ip_make_skb.udp_sendmsg
      9.84           +20.5       30.32        perf-profile.calltrace.cycles-pp.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data.ip_make_skb
      9.59           +20.5       30.11        perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data
      8.40           +21.0       29.42        perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill
      6.13           +21.9       28.05        perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist
      6.20           +22.0       28.15        perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages
      6.46           +22.5       28.91        perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.skb_page_frag_refill
     48.24           -21.8       26.43        perf-profile.children.cycles-pp.skb_release_data
     47.19           -21.2       25.98        perf-profile.children.cycles-pp.free_unref_page
     44.48           -19.6       24.88        perf-profile.children.cycles-pp.free_pcppages_bulk
     45.62           -19.2       26.40        perf-profile.children.cycles-pp.__consume_stateless_skb
     99.95           -14.2       85.76 ± 15%  perf-profile.children.cycles-pp.main
     99.96           -14.2       85.77 ± 15%  perf-profile.children.cycles-pp.__libc_start_main
     50.10            -8.6       41.53        perf-profile.children.cycles-pp.udp_recvmsg
     50.11            -8.6       41.56        perf-profile.children.cycles-pp.inet_recvmsg
     50.13            -8.5       41.65        perf-profile.children.cycles-pp.sock_recvmsg
     50.29            -8.0       42.28        perf-profile.children.cycles-pp.__sys_recvfrom
     50.29            -8.0       42.30        perf-profile.children.cycles-pp.__x64_sys_recvfrom
     50.38            -7.5       42.86        perf-profile.children.cycles-pp.recvfrom
     50.41            -7.3       43.07        perf-profile.children.cycles-pp.accept_connections
     50.41            -7.3       43.07        perf-profile.children.cycles-pp.accept_connection
     50.41            -7.3       43.07        perf-profile.children.cycles-pp.spawn_child
     50.41            -7.3       43.07        perf-profile.children.cycles-pp.process_requests
     50.41            -7.3       43.07        perf-profile.children.cycles-pp.recv_omni
     19.96 ±  2%      -6.5       13.50        perf-profile.children.cycles-pp.ip_generic_getfrag
     19.46 ±  2%      -6.2       13.28        perf-profile.children.cycles-pp._copy_from_iter
     19.21 ±  2%      -6.1       13.14        perf-profile.children.cycles-pp.copyin
      8.96            -5.1        3.86        perf-profile.children.cycles-pp.udp_send_skb
      8.72            -5.0        3.75        perf-profile.children.cycles-pp.ip_send_skb
      8.11            -4.6        3.49        perf-profile.children.cycles-pp.ip_finish_output2
      7.72            -4.4        3.32        perf-profile.children.cycles-pp.__dev_queue_xmit
     98.71            -4.1       94.59        perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
     98.51            -4.0       94.46        perf-profile.children.cycles-pp.do_syscall_64
      6.49            -3.7        2.78        perf-profile.children.cycles-pp.do_softirq
      6.51            -3.7        2.82        perf-profile.children.cycles-pp.__local_bh_enable_ip
      6.43            -3.7        2.78        perf-profile.children.cycles-pp.__do_softirq
      5.90            -3.4        2.46        perf-profile.children.cycles-pp.net_rx_action
      5.74            -3.4        2.38        perf-profile.children.cycles-pp.__napi_poll
      5.71            -3.4        2.36        perf-profile.children.cycles-pp.process_backlog
      5.37            -3.2        2.21        perf-profile.children.cycles-pp.__netif_receive_skb_one_core
      4.60            -2.7        1.91        perf-profile.children.cycles-pp.ip_local_deliver_finish
      4.57 ±  2%      -2.7        1.90        perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
      4.42 ±  2%      -2.6        1.83        perf-profile.children.cycles-pp.__udp4_lib_rcv
      3.82 ±  2%      -2.2        1.58 ±  2%  perf-profile.children.cycles-pp.udp_unicast_rcv_skb
      3.78 ±  2%      -2.2        1.57 ±  2%  perf-profile.children.cycles-pp.udp_queue_rcv_one_skb
      2.23 ±  2%      -1.6        0.65 ±  2%  perf-profile.children.cycles-pp.__ip_make_skb
      1.95 ±  2%      -1.4        0.52 ±  3%  perf-profile.children.cycles-pp.__ip_select_ident
      1.51 ±  4%      -1.2        0.34        perf-profile.children.cycles-pp.free_unref_page_commit
      1.17            -0.7        0.51 ±  2%  perf-profile.children.cycles-pp.ip_route_output_flow
      1.15            -0.6        0.52        perf-profile.children.cycles-pp.sock_alloc_send_pskb
      0.91            -0.5        0.39        perf-profile.children.cycles-pp.alloc_skb_with_frags
      0.86            -0.5        0.37        perf-profile.children.cycles-pp.__alloc_skb
      0.83            -0.5        0.36 ±  2%  perf-profile.children.cycles-pp.ip_route_output_key_hash_rcu
      0.75            -0.4        0.32        perf-profile.children.cycles-pp.dev_hard_start_xmit
      0.72            -0.4        0.31 ±  3%  perf-profile.children.cycles-pp.fib_table_lookup
      0.67            -0.4        0.28        perf-profile.children.cycles-pp.loopback_xmit
      0.70 ±  2%      -0.4        0.33        perf-profile.children.cycles-pp.__zone_watermark_ok
      0.47 ±  4%      -0.3        0.15        perf-profile.children.cycles-pp.kmem_cache_free
      0.57            -0.3        0.26        perf-profile.children.cycles-pp.kmem_cache_alloc_node
      0.46            -0.3        0.18 ±  2%  perf-profile.children.cycles-pp.ip_rcv
      0.42            -0.3        0.17        perf-profile.children.cycles-pp.move_addr_to_kernel
      0.41            -0.2        0.16 ±  2%  perf-profile.children.cycles-pp.__udp4_lib_lookup
      0.32            -0.2        0.13        perf-profile.children.cycles-pp.__netif_rx
      0.30            -0.2        0.12        perf-profile.children.cycles-pp.netif_rx_internal
      0.30            -0.2        0.12        perf-profile.children.cycles-pp._copy_from_user
      0.31            -0.2        0.13        perf-profile.children.cycles-pp.kmalloc_reserve
      0.63            -0.2        0.46 ±  2%  perf-profile.children.cycles-pp.free_unref_page_prepare
      0.28            -0.2        0.11        perf-profile.children.cycles-pp.enqueue_to_backlog
      0.27            -0.2        0.11        perf-profile.children.cycles-pp.udp4_lib_lookup2
      0.29            -0.2        0.13 ±  6%  perf-profile.children.cycles-pp.send_data
      0.25            -0.2        0.10        perf-profile.children.cycles-pp.__netif_receive_skb_core
      0.23 ±  2%      -0.1        0.10 ±  4%  perf-profile.children.cycles-pp.security_socket_sendmsg
      0.19 ±  2%      -0.1        0.06        perf-profile.children.cycles-pp.ip_rcv_core
      0.37            -0.1        0.24        perf-profile.children.cycles-pp.irqtime_account_irq
      0.21            -0.1        0.08        perf-profile.children.cycles-pp.sock_wfree
      0.21 ±  3%      -0.1        0.08        perf-profile.children.cycles-pp.validate_xmit_skb
      0.20 ±  2%      -0.1        0.08        perf-profile.children.cycles-pp.ip_output
      0.22 ±  2%      -0.1        0.10 ±  4%  perf-profile.children.cycles-pp.ip_rcv_finish_core
      0.20 ±  6%      -0.1        0.09 ±  5%  perf-profile.children.cycles-pp.__mkroute_output
      0.21 ±  2%      -0.1        0.09 ±  5%  perf-profile.children.cycles-pp._raw_spin_lock_irq
      0.28            -0.1        0.18        perf-profile.children.cycles-pp._raw_spin_trylock
      0.34 ±  3%      -0.1        0.25        perf-profile.children.cycles-pp.__slab_free
      0.13 ±  3%      -0.1        0.05        perf-profile.children.cycles-pp.siphash_3u32
      0.12 ±  4%      -0.1        0.03 ± 70%  perf-profile.children.cycles-pp.ipv4_pktinfo_prepare
      0.14 ±  3%      -0.1        0.06 ±  7%  perf-profile.children.cycles-pp.__ip_local_out
      0.20 ±  2%      -0.1        0.12        perf-profile.children.cycles-pp.aa_sk_perm
      0.18 ±  2%      -0.1        0.10        perf-profile.children.cycles-pp.get_pfnblock_flags_mask
      0.12 ±  3%      -0.1        0.05        perf-profile.children.cycles-pp.sk_filter_trim_cap
      0.13            -0.1        0.06        perf-profile.children.cycles-pp.ip_setup_cork
      0.13 ±  7%      -0.1        0.06 ±  8%  perf-profile.children.cycles-pp.fib_lookup_good_nhc
      0.15 ±  3%      -0.1        0.08 ±  5%  perf-profile.children.cycles-pp.skb_set_owner_w
      0.11 ±  4%      -0.1        0.05        perf-profile.children.cycles-pp.dst_release
      0.23 ±  2%      -0.1        0.17 ±  2%  perf-profile.children.cycles-pp.__entry_text_start
      0.11            -0.1        0.05        perf-profile.children.cycles-pp.ipv4_mtu
      0.20 ±  2%      -0.1        0.15 ±  3%  perf-profile.children.cycles-pp.__list_add_valid_or_report
      0.10            -0.1        0.05        perf-profile.children.cycles-pp.ip_send_check
      0.31 ±  2%      -0.0        0.26 ±  3%  perf-profile.children.cycles-pp.sockfd_lookup_light
      0.27            -0.0        0.22 ±  2%  perf-profile.children.cycles-pp.__fget_light
      0.63            -0.0        0.58        perf-profile.children.cycles-pp.__check_object_size
      0.15 ±  3%      -0.0        0.11        perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.13            -0.0        0.09 ±  5%  perf-profile.children.cycles-pp.alloc_pages
      0.27            -0.0        0.24        perf-profile.children.cycles-pp.sched_clock_cpu
      0.11 ±  4%      -0.0        0.08 ±  6%  perf-profile.children.cycles-pp.__cond_resched
      0.14 ±  3%      -0.0        0.11        perf-profile.children.cycles-pp.free_tail_page_prepare
      0.11            -0.0        0.08 ±  5%  perf-profile.children.cycles-pp.syscall_return_via_sysret
      0.09 ±  9%      -0.0        0.06 ±  7%  perf-profile.children.cycles-pp.__xfrm_policy_check2
      0.23 ±  2%      -0.0        0.21 ±  2%  perf-profile.children.cycles-pp.sched_clock
      0.14 ±  3%      -0.0        0.11 ±  4%  perf-profile.children.cycles-pp.prep_compound_page
      0.21 ±  2%      -0.0        0.20 ±  2%  perf-profile.children.cycles-pp.native_sched_clock
      0.06            -0.0        0.05        perf-profile.children.cycles-pp.task_tick_fair
      0.06            -0.0        0.05        perf-profile.children.cycles-pp.check_stack_object
      0.18 ±  2%      +0.0        0.20 ±  2%  perf-profile.children.cycles-pp.perf_event_task_tick
      0.18 ±  2%      +0.0        0.19 ±  2%  perf-profile.children.cycles-pp.perf_adjust_freq_unthr_context
      0.31 ±  3%      +0.0        0.33        perf-profile.children.cycles-pp.tick_sched_handle
      0.31 ±  3%      +0.0        0.33        perf-profile.children.cycles-pp.update_process_times
      0.41 ±  2%      +0.0        0.43        perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
      0.40 ±  2%      +0.0        0.42        perf-profile.children.cycles-pp.hrtimer_interrupt
      0.32 ±  2%      +0.0        0.34        perf-profile.children.cycles-pp.tick_sched_timer
      0.36 ±  2%      +0.0        0.39        perf-profile.children.cycles-pp.__hrtimer_run_queues
      0.06 ±  7%      +0.0        0.10 ±  4%  perf-profile.children.cycles-pp.exit_to_user_mode_prepare
      0.05 ±  8%      +0.0        0.10        perf-profile.children.cycles-pp._raw_spin_lock_bh
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.update_cfs_group
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.cpuidle_governor_latency_req
      0.00            +0.1        0.05        perf-profile.children.cycles-pp.flush_smp_call_function_queue
      0.00            +0.1        0.05 ±  8%  perf-profile.children.cycles-pp.prepare_to_wait_exclusive
      0.07            +0.1        0.13 ±  3%  perf-profile.children.cycles-pp.__mod_zone_page_state
      0.00            +0.1        0.06 ± 13%  perf-profile.children.cycles-pp.cgroup_rstat_updated
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.__x2apic_send_IPI_dest
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.security_socket_recvmsg
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.select_task_rq_fair
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.tick_irq_enter
      0.00            +0.1        0.06        perf-profile.children.cycles-pp.tick_nohz_idle_enter
      0.42 ±  2%      +0.1        0.49 ±  2%  perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
      0.00            +0.1        0.07 ±  7%  perf-profile.children.cycles-pp.ktime_get
      0.00            +0.1        0.07        perf-profile.children.cycles-pp.__get_user_4
      0.00            +0.1        0.07        perf-profile.children.cycles-pp.update_rq_clock
      0.00            +0.1        0.07        perf-profile.children.cycles-pp.select_task_rq
      0.00            +0.1        0.07        perf-profile.children.cycles-pp.native_apic_msr_eoi
      0.49            +0.1        0.57 ±  2%  perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
      0.11 ± 11%      +0.1        0.19 ±  2%  perf-profile.children.cycles-pp._raw_spin_lock
      0.00            +0.1        0.08        perf-profile.children.cycles-pp.update_rq_clock_task
      0.00            +0.1        0.08        perf-profile.children.cycles-pp.__update_load_avg_se
      0.00            +0.1        0.09 ±  5%  perf-profile.children.cycles-pp.irq_enter_rcu
      0.00            +0.1        0.09 ±  5%  perf-profile.children.cycles-pp.__irq_exit_rcu
      0.00            +0.1        0.09        perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
      0.00            +0.1        0.09        perf-profile.children.cycles-pp.update_blocked_averages
      0.00            +0.1        0.09        perf-profile.children.cycles-pp.update_sg_lb_stats
      0.00            +0.1        0.09 ±  5%  perf-profile.children.cycles-pp.set_next_entity
      0.00            +0.1        0.10        perf-profile.children.cycles-pp.__switch_to_asm
      0.00            +0.1        0.11 ± 12%  perf-profile.children.cycles-pp._copy_to_user
      0.00            +0.1        0.12 ±  3%  perf-profile.children.cycles-pp.menu_select
      0.00            +0.1        0.12 ±  3%  perf-profile.children.cycles-pp.recv_data
      0.00            +0.1        0.12 ±  3%  perf-profile.children.cycles-pp.update_sd_lb_stats
      0.00            +0.1        0.13 ±  3%  perf-profile.children.cycles-pp.native_irq_return_iret
      0.00            +0.1        0.13 ±  3%  perf-profile.children.cycles-pp.__switch_to
      0.00            +0.1        0.13 ±  3%  perf-profile.children.cycles-pp.find_busiest_group
      0.00            +0.1        0.14        perf-profile.children.cycles-pp.finish_task_switch
      0.00            +0.1        0.15 ±  3%  perf-profile.children.cycles-pp.update_curr
      0.00            +0.2        0.15 ±  3%  perf-profile.children.cycles-pp.mem_cgroup_uncharge_skmem
      0.00            +0.2        0.16        perf-profile.children.cycles-pp.ttwu_queue_wakelist
      0.05            +0.2        0.22 ±  2%  perf-profile.children.cycles-pp.page_counter_try_charge
      0.00            +0.2        0.17 ±  2%  perf-profile.children.cycles-pp.load_balance
      0.00            +0.2        0.17 ±  2%  perf-profile.children.cycles-pp.___perf_sw_event
      0.02 ±141%      +0.2        0.19 ±  2%  perf-profile.children.cycles-pp.page_counter_uncharge
      0.33            +0.2        0.52        perf-profile.children.cycles-pp.__free_one_page
      0.02 ±141%      +0.2        0.21 ±  2%  perf-profile.children.cycles-pp.drain_stock
      0.00            +0.2        0.20 ±  2%  perf-profile.children.cycles-pp.prepare_task_switch
      0.16 ±  3%      +0.2        0.38 ±  2%  perf-profile.children.cycles-pp.simple_copy_to_iter
      0.07 ± 11%      +0.2        0.31        perf-profile.children.cycles-pp.refill_stock
      0.07 ±  6%      +0.2        0.31 ±  4%  perf-profile.children.cycles-pp.move_addr_to_user
      0.00            +0.2        0.24        perf-profile.children.cycles-pp.enqueue_entity
      0.00            +0.2        0.25        perf-profile.children.cycles-pp.update_load_avg
      0.21 ±  2%      +0.3        0.48        perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
      0.00            +0.3        0.31 ±  4%  perf-profile.children.cycles-pp.dequeue_entity
      0.08 ±  5%      +0.3        0.40 ±  3%  perf-profile.children.cycles-pp.try_charge_memcg
      0.00            +0.3        0.33        perf-profile.children.cycles-pp.enqueue_task_fair
      0.00            +0.4        0.35 ±  2%  perf-profile.children.cycles-pp.dequeue_task_fair
      0.00            +0.4        0.35 ±  2%  perf-profile.children.cycles-pp.activate_task
      0.00            +0.4        0.36 ±  2%  perf-profile.children.cycles-pp.try_to_wake_up
      0.00            +0.4        0.37 ±  2%  perf-profile.children.cycles-pp.autoremove_wake_function
      0.00            +0.4        0.39 ±  3%  perf-profile.children.cycles-pp.newidle_balance
      0.12 ±  8%      +0.4        0.51 ±  2%  perf-profile.children.cycles-pp.mem_cgroup_charge_skmem
      0.00            +0.4        0.39        perf-profile.children.cycles-pp.ttwu_do_activate
      0.00            +0.4        0.40 ±  2%  perf-profile.children.cycles-pp.__wake_up_common
      0.18 ±  4%      +0.4        0.59        perf-profile.children.cycles-pp.udp_rmem_release
      0.11 ±  7%      +0.4        0.52        perf-profile.children.cycles-pp.__sk_mem_reduce_allocated
      0.00            +0.4        0.43        perf-profile.children.cycles-pp.__wake_up_common_lock
      0.00            +0.5        0.46        perf-profile.children.cycles-pp.sched_ttwu_pending
      0.00            +0.5        0.49        perf-profile.children.cycles-pp.sock_def_readable
      0.00            +0.5        0.53 ±  2%  perf-profile.children.cycles-pp.pick_next_task_fair
      0.00            +0.5        0.54 ±  2%  perf-profile.children.cycles-pp.schedule_idle
      0.00            +0.6        0.55        perf-profile.children.cycles-pp.__flush_smp_call_function_queue
      0.15 ±  3%      +0.6        0.73 ±  2%  perf-profile.children.cycles-pp.__sk_mem_raise_allocated
      0.00            +0.6        0.57        perf-profile.children.cycles-pp.__sysvec_call_function_single
      0.16 ±  5%      +0.6        0.74 ±  2%  perf-profile.children.cycles-pp.__sk_mem_schedule
      0.00            +0.8        0.78        perf-profile.children.cycles-pp.sysvec_call_function_single
      0.41 ±  3%      +0.9        1.33 ±  2%  perf-profile.children.cycles-pp.__udp_enqueue_schedule_skb
      0.00            +1.2        1.16 ±  2%  perf-profile.children.cycles-pp.schedule
      0.00            +1.2        1.21 ±  2%  perf-profile.children.cycles-pp.schedule_timeout
      0.00            +1.3        1.33 ±  2%  perf-profile.children.cycles-pp.__skb_wait_for_more_packets
      0.00            +1.7        1.66 ±  2%  perf-profile.children.cycles-pp.__schedule
      0.27 ±  3%      +2.0        2.25        perf-profile.children.cycles-pp.__skb_recv_udp
     50.41            +2.4       52.81        perf-profile.children.cycles-pp._raw_spin_lock_irqsave
      0.00            +2.7        2.68        perf-profile.children.cycles-pp.asm_sysvec_call_function_single
     49.78            +2.7       52.49        perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
      0.00            +3.0        2.98        perf-profile.children.cycles-pp.acpi_safe_halt
      0.00            +3.0        3.00        perf-profile.children.cycles-pp.acpi_idle_enter
     49.51            +3.1       52.57        perf-profile.children.cycles-pp.send_udp_stream
     49.50            +3.1       52.56        perf-profile.children.cycles-pp.send_omni_inner
      0.00            +3.1        3.10        perf-profile.children.cycles-pp.cpuidle_enter_state
      0.00            +3.1        3.12        perf-profile.children.cycles-pp.cpuidle_enter
      0.00            +3.4        3.37        perf-profile.children.cycles-pp.cpuidle_idle_call
     48.90            +3.4       52.30        perf-profile.children.cycles-pp.sendto
     47.85            +4.0       51.83        perf-profile.children.cycles-pp.__x64_sys_sendto
     47.73            +4.0       51.77        perf-profile.children.cycles-pp.__sys_sendto
      0.00            +4.1        4.10        perf-profile.children.cycles-pp.start_secondary
      0.00            +4.1        4.13        perf-profile.children.cycles-pp.do_idle
      0.00            +4.1        4.14        perf-profile.children.cycles-pp.secondary_startup_64_no_verify
      0.00            +4.1        4.14        perf-profile.children.cycles-pp.cpu_startup_entry
     46.54            +4.7       51.28        perf-profile.children.cycles-pp.sock_sendmsg
     46.10            +5.0       51.11        perf-profile.children.cycles-pp.udp_sendmsg
      3.70            +8.0       11.71        perf-profile.children.cycles-pp.copyout
      3.71            +8.1       11.80        perf-profile.children.cycles-pp._copy_to_iter
      3.96            +8.5       12.43        perf-profile.children.cycles-pp.__skb_datagram_iter
      3.96            +8.5       12.44        perf-profile.children.cycles-pp.skb_copy_datagram_iter
     35.14           +11.3       46.40        perf-profile.children.cycles-pp.ip_make_skb
     32.71 ±  2%    +13.0       45.66        perf-profile.children.cycles-pp.__ip_append_data
     10.28           +20.6       30.89        perf-profile.children.cycles-pp.sk_page_frag_refill
     10.25           +20.6       30.88        perf-profile.children.cycles-pp.skb_page_frag_refill
      9.86           +20.8       30.63        perf-profile.children.cycles-pp.__alloc_pages
      9.62           +20.8       30.42        perf-profile.children.cycles-pp.get_page_from_freelist
      8.42           +21.3       29.72        perf-profile.children.cycles-pp.rmqueue
      6.47           +22.8       29.22        perf-profile.children.cycles-pp.rmqueue_bulk
     19.11 ±  2%      -6.0       13.08        perf-profile.self.cycles-pp.copyin
      1.81 ±  2%      -1.4        0.39        perf-profile.self.cycles-pp.rmqueue
      1.81 ±  2%      -1.3        0.46 ±  2%  perf-profile.self.cycles-pp.__ip_select_ident
      1.47 ±  4%      -1.2        0.31        perf-profile.self.cycles-pp.free_unref_page_commit
      1.29 ±  2%      -0.5        0.75        perf-profile.self.cycles-pp.__ip_append_data
      0.71            -0.4        0.29        perf-profile.self.cycles-pp.udp_sendmsg
      0.68 ±  2%      -0.4        0.32        perf-profile.self.cycles-pp.__zone_watermark_ok
      0.50            -0.3        0.16        perf-profile.self.cycles-pp.skb_release_data
      0.59 ±  3%      -0.3        0.26 ±  3%  perf-profile.self.cycles-pp.fib_table_lookup
      0.46 ±  4%      -0.3        0.15 ±  3%  perf-profile.self.cycles-pp.kmem_cache_free
      0.63            -0.3        0.33 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock_irqsave
      0.47            -0.3        0.19        perf-profile.self.cycles-pp.__sys_sendto
      0.44            -0.2        0.21 ±  2%  perf-profile.self.cycles-pp.kmem_cache_alloc_node
      0.36            -0.2        0.16 ±  3%  perf-profile.self.cycles-pp.send_omni_inner
      0.35 ±  2%      -0.2        0.15 ±  3%  perf-profile.self.cycles-pp.ip_finish_output2
      0.29            -0.2        0.12        perf-profile.self.cycles-pp._copy_from_user
      0.24            -0.1        0.10 ±  4%  perf-profile.self.cycles-pp.__netif_receive_skb_core
      0.22 ±  2%      -0.1        0.08 ±  5%  perf-profile.self.cycles-pp.free_unref_page
      0.19 ±  2%      -0.1        0.06        perf-profile.self.cycles-pp.ip_rcv_core
      0.21 ±  2%      -0.1        0.08        perf-profile.self.cycles-pp.__alloc_skb
      0.20 ±  2%      -0.1        0.08        perf-profile.self.cycles-pp.sock_wfree
      0.22 ±  2%      -0.1        0.10 ±  4%  perf-profile.self.cycles-pp.send_data
      0.21            -0.1        0.09        perf-profile.self.cycles-pp.sendto
      0.21 ±  2%      -0.1        0.10 ±  4%  perf-profile.self.cycles-pp.ip_rcv_finish_core
      0.21 ±  2%      -0.1        0.09 ±  5%  perf-profile.self.cycles-pp.__ip_make_skb
      0.20 ±  4%      -0.1        0.09 ±  5%  perf-profile.self.cycles-pp._raw_spin_lock_irq
      0.21 ±  2%      -0.1        0.10 ±  4%  perf-profile.self.cycles-pp.__dev_queue_xmit
      0.38 ±  3%      -0.1        0.27        perf-profile.self.cycles-pp.get_page_from_freelist
      0.20 ±  2%      -0.1        0.09        perf-profile.self.cycles-pp.udp_send_skb
      0.18 ±  2%      -0.1        0.07        perf-profile.self.cycles-pp.__udp_enqueue_schedule_skb
      0.18 ±  4%      -0.1        0.08 ±  6%  perf-profile.self.cycles-pp.__mkroute_output
      0.25            -0.1        0.15 ±  3%  perf-profile.self.cycles-pp._copy_from_iter
      0.27 ±  4%      -0.1        0.17 ±  2%  perf-profile.self.cycles-pp.skb_page_frag_refill
      0.16            -0.1        0.06 ±  7%  perf-profile.self.cycles-pp.sock_sendmsg
      0.33 ±  2%      -0.1        0.24        perf-profile.self.cycles-pp.__slab_free
      0.15 ±  3%      -0.1        0.06        perf-profile.self.cycles-pp.udp4_lib_lookup2
      0.38 ±  2%      -0.1        0.29 ±  2%  perf-profile.self.cycles-pp.free_unref_page_prepare
      0.26            -0.1        0.17        perf-profile.self.cycles-pp._raw_spin_trylock
      0.15            -0.1        0.06        perf-profile.self.cycles-pp.ip_output
      0.14            -0.1        0.05 ±  8%  perf-profile.self.cycles-pp.process_backlog
      0.14            -0.1        0.06        perf-profile.self.cycles-pp.ip_route_output_flow
      0.14            -0.1        0.06        perf-profile.self.cycles-pp.__udp4_lib_lookup
      0.21 ±  2%      -0.1        0.13 ±  3%  perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
      0.12 ±  3%      -0.1        0.05        perf-profile.self.cycles-pp.siphash_3u32
      0.13 ±  3%      -0.1        0.06 ±  8%  perf-profile.self.cycles-pp.ip_send_skb
      0.17            -0.1        0.10        perf-profile.self.cycles-pp.__do_softirq
      0.15 ±  3%      -0.1        0.08 ±  5%  perf-profile.self.cycles-pp.skb_set_owner_w
      0.17 ±  2%      -0.1        0.10 ±  4%  perf-profile.self.cycles-pp.aa_sk_perm
      0.12            -0.1        0.05        perf-profile.self.cycles-pp.__x64_sys_sendto
      0.12 ±  6%      -0.1        0.05        perf-profile.self.cycles-pp.fib_lookup_good_nhc
      0.19 ±  2%      -0.1        0.13        perf-profile.self.cycles-pp.__list_add_valid_or_report
      0.14 ±  3%      -0.1        0.07 ±  6%  perf-profile.self.cycles-pp.net_rx_action
      0.16 ±  2%      -0.1        0.10        perf-profile.self.cycles-pp.do_syscall_64
      0.11            -0.1        0.05        perf-profile.self.cycles-pp.__udp4_lib_rcv
      0.16 ±  3%      -0.1        0.10 ±  4%  perf-profile.self.cycles-pp.get_pfnblock_flags_mask
      0.11 ±  4%      -0.1        0.05        perf-profile.self.cycles-pp.ip_route_output_key_hash_rcu
      0.10 ±  4%      -0.1        0.05        perf-profile.self.cycles-pp.ip_generic_getfrag
      0.10            -0.1        0.05        perf-profile.self.cycles-pp.ipv4_mtu
      0.26            -0.0        0.21 ±  2%  perf-profile.self.cycles-pp.__fget_light
      0.15 ±  3%      -0.0        0.11 ±  4%  perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
      0.24            -0.0        0.20 ±  2%  perf-profile.self.cycles-pp.__alloc_pages
      0.15 ±  3%      -0.0        0.12        perf-profile.self.cycles-pp.__check_object_size
      0.11            -0.0        0.08 ±  6%  perf-profile.self.cycles-pp.syscall_return_via_sysret
      0.08 ±  5%      -0.0        0.05        perf-profile.self.cycles-pp.loopback_xmit
      0.13            -0.0        0.11 ±  4%  perf-profile.self.cycles-pp.prep_compound_page
      0.11            -0.0        0.09 ±  5%  perf-profile.self.cycles-pp.irqtime_account_irq
      0.09 ± 10%      -0.0        0.06 ±  7%  perf-profile.self.cycles-pp.__xfrm_policy_check2
      0.07            -0.0        0.05        perf-profile.self.cycles-pp.alloc_pages
      0.08            -0.0        0.06 ±  7%  perf-profile.self.cycles-pp.__entry_text_start
      0.09 ±  5%      -0.0        0.07        perf-profile.self.cycles-pp.free_tail_page_prepare
      0.10            +0.0        0.11        perf-profile.self.cycles-pp.perf_adjust_freq_unthr_context
      0.06            +0.0        0.08 ±  6%  perf-profile.self.cycles-pp.free_pcppages_bulk
      0.05 ±  8%      +0.0        0.10 ±  4%  perf-profile.self.cycles-pp._raw_spin_lock_bh
      0.07            +0.0        0.12        perf-profile.self.cycles-pp.__mod_zone_page_state
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.cpuidle_idle_call
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.udp_rmem_release
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.__flush_smp_call_function_queue
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.sock_def_readable
      0.00            +0.1        0.05        perf-profile.self.cycles-pp.update_cfs_group
      0.11 ± 11%      +0.1        0.17 ±  2%  perf-profile.self.cycles-pp._raw_spin_lock
      0.00            +0.1        0.05 ±  8%  perf-profile.self.cycles-pp.finish_task_switch
      0.00            +0.1        0.05 ±  8%  perf-profile.self.cycles-pp.cgroup_rstat_updated
      0.00            +0.1        0.06        perf-profile.self.cycles-pp.do_idle
      0.00            +0.1        0.06        perf-profile.self.cycles-pp.__skb_wait_for_more_packets
      0.00            +0.1        0.06        perf-profile.self.cycles-pp.__x2apic_send_IPI_dest
      0.00            +0.1        0.06 ±  7%  perf-profile.self.cycles-pp.enqueue_entity
      0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.schedule_timeout
      0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.move_addr_to_user
      0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.menu_select
      0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.native_apic_msr_eoi
      0.00            +0.1        0.07 ±  7%  perf-profile.self.cycles-pp.update_sg_lb_stats
      0.00            +0.1        0.07        perf-profile.self.cycles-pp.__update_load_avg_se
      0.00            +0.1        0.07        perf-profile.self.cycles-pp.__get_user_4
      0.00            +0.1        0.08 ±  6%  perf-profile.self.cycles-pp.__sk_mem_reduce_allocated
      0.00            +0.1        0.08        perf-profile.self.cycles-pp.update_curr
      0.00            +0.1        0.08 ±  5%  perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
      0.00            +0.1        0.09 ±  5%  perf-profile.self.cycles-pp.try_to_wake_up
      0.00            +0.1        0.09        perf-profile.self.cycles-pp.recvfrom
      0.00
+0.1 0.09 perf-profile.self.cycles-pp.mem_cgroup_charge_skmem 0.00 +0.1 0.09 perf-profile.self.cycles-pp.update_load_avg 0.00 +0.1 0.09 ± 5% perf-profile.self.cycles-pp.enqueue_task_fair 0.00 +0.1 0.10 ± 4% perf-profile.self.cycles-pp._copy_to_iter 0.00 +0.1 0.10 ± 4% perf-profile.self.cycles-pp.newidle_balance 0.00 +0.1 0.10 ± 4% perf-profile.self.cycles-pp.recv_data 0.00 +0.1 0.10 perf-profile.self.cycles-pp.refill_stock 0.00 +0.1 0.10 perf-profile.self.cycles-pp.__switch_to_asm 0.00 +0.1 0.11 ± 15% perf-profile.self.cycles-pp._copy_to_user 0.00 +0.1 0.12 perf-profile.self.cycles-pp.recv_omni 0.00 +0.1 0.12 perf-profile.self.cycles-pp.mem_cgroup_uncharge_skmem 0.00 +0.1 0.13 ± 3% perf-profile.self.cycles-pp.native_irq_return_iret 0.00 +0.1 0.13 perf-profile.self.cycles-pp.__switch_to 0.06 +0.1 0.20 ± 2% perf-profile.self.cycles-pp.rmqueue_bulk 0.09 ± 5% +0.1 0.23 ± 4% perf-profile.self.cycles-pp.udp_recvmsg 0.00 +0.1 0.14 ± 3% perf-profile.self.cycles-pp.__skb_recv_udp 0.00 +0.1 0.14 ± 3% perf-profile.self.cycles-pp.___perf_sw_event 0.08 +0.1 0.22 ± 2% perf-profile.self.cycles-pp.__skb_datagram_iter 0.03 ± 70% +0.2 0.20 ± 4% perf-profile.self.cycles-pp.page_counter_try_charge 0.02 ±141% +0.2 0.18 ± 4% perf-profile.self.cycles-pp.__sys_recvfrom 0.00 +0.2 0.17 ± 2% perf-profile.self.cycles-pp.__schedule 0.00 +0.2 0.17 ± 2% perf-profile.self.cycles-pp.try_charge_memcg 0.00 +0.2 0.17 ± 2% perf-profile.self.cycles-pp.page_counter_uncharge 0.00 +0.2 0.21 ± 2% perf-profile.self.cycles-pp.__sk_mem_raise_allocated 0.14 ± 3% +0.2 0.36 perf-profile.self.cycles-pp.__free_one_page 0.20 ± 2% +0.3 0.47 perf-profile.self.cycles-pp.__list_del_entry_valid_or_report 0.00 +2.1 2.07 ± 2% perf-profile.self.cycles-pp.acpi_safe_halt 49.78 +2.7 52.49 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath 3.68 +8.0 11.64 perf-profile.self.cycles-pp.copyout Disclaimer: Results have been estimated based on internal Intel analysis and are provided for informational purposes only. 
Any difference in system hardware or software design or configuration may affect actual performance.
Hi,

kernel test robot <oliver.sang@intel.com> writes:

> hi, Huang Ying,
>
> sorry for the late report.
> we reported
> "a 14.6% improvement of netperf.Throughput_Mbps"
> in
> https://lore.kernel.org/all/202310271441.71ce0a9-oliver.sang@intel.com/
>
> later, our auto-bisect tool captured a regression on a netperf test with
> different configurations; unfortunately, it regarded the issue as already
> 'reported', so we missed sending this report the first time.
>
> now sending it again, FYI.
>
>
> Hello,
>
> kernel test robot noticed a -60.4% regression of netperf.Throughput_Mbps on:
>
>
> commit: f5ddc662f07d7d99e9cfc5e07778e26c7394caf8 ("[PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages")
> url: https://github.com/intel-lab-lkp/linux/commits/Huang-Ying/mm-pcp-avoid-to-drain-PCP-when-process-exit/20231017-143633
> base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/driver-core.git 36b2d7dd5a8ac95c8c1e69bdc93c4a6e2dc28a23
> patch link: https://lore.kernel.org/all/20231016053002.756205-4-ying.huang@intel.com/
> patch subject: [PATCH -V3 3/9] mm, pcp: reduce lock contention for draining high-order pages
>
> testcase: netperf
> test machine: 128 threads 2 sockets Intel(R) Xeon(R) Gold 6338 CPU @ 2.00GHz (Ice Lake) with 256G memory
> parameters:
>
>     ip: ipv4
>     runtime: 300s
>     nr_threads: 50%
>     cluster: cs-localhost
>     test: UDP_STREAM
>     cpufreq_governor: performance
>
>
>
> If you fix the issue in a separate patch/commit (i.e.
not just a new version of
> the same patch/commit), kindly add following tags
> | Reported-by: kernel test robot <oliver.sang@intel.com>
> | Closes: https://lore.kernel.org/oe-lkp/202311061311.8d63998-oliver.sang@intel.com
>
>
> Details are as below:
> -------------------------------------------------------------------------------------------------->
>
>
> The kernel config and materials to reproduce are available at:
> https://download.01.org/0day-ci/archive/20231106/202311061311.8d63998-oliver.sang@intel.com
>
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
>   cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-8.3/50%/debian-11.1-x86_64-20220510.cgz/300s/lkp-icl-2sp2/UDP_STREAM/netperf
>
> commit:
>   c828e65251 ("cacheinfo: calculate size of per-CPU data cache slice")
>   f5ddc662f0 ("mm, pcp: reduce lock contention for draining high-order pages")
>
> c828e65251502516 f5ddc662f07d7d99e9cfc5e0777
> ---------------- ---------------------------
>          %stddev     %change         %stddev
>              \          |                \
>       7321 ±  4%     +28.2%       9382        uptime.idle
>      50.65 ±  4%      -4.0%      48.64        boot-time.boot
>       6042 ±  4%      -4.2%       5785        boot-time.idle
>  1.089e+09 ±  2%    +232.1%  3.618e+09        cpuidle..time
>    1087075 ±  2%  +24095.8%   2.63e+08        cpuidle..usage
>    3357014           +99.9%    6710312        vmstat.memory.cache
>      48731 ± 19%   +4666.5%    2322787        vmstat.system.cs
>     144637          +711.2%    1173334        vmstat.system.in
>       2.59 ±  2%      +6.2        8.79        mpstat.cpu.all.idle%
>       1.01            +0.7        1.66        mpstat.cpu.all.irq%
>       6.00            -3.2        2.79        mpstat.cpu.all.soft%
>       1.13 ±  2%      -0.1        1.02        mpstat.cpu.all.usr%
>  1.407e+09 ±  3%     -28.2%  1.011e+09        numa-numastat.node0.local_node
>  1.407e+09 ±  3%     -28.2%   1.01e+09        numa-numastat.node0.numa_hit
>  1.469e+09 ±  8%     -32.0%  9.979e+08        numa-numastat.node1.local_node
>  1.469e+09 ±  8%     -32.1%  9.974e+08        numa-numastat.node1.numa_hit
>     103.00 ± 19%     -44.0%      57.67 ± 20%  perf-c2c.DRAM.local
>       8970 ± 12%     -89.4%     951.00 ±  4%  perf-c2c.DRAM.remote
>       8192 ±  5%     +68.5%      13807
perf-c2c.HITM.local
>       6675 ± 11%     -92.6%     491.00 ±  2%  perf-c2c.HITM.remote
>    1051014 ±  2%  +24922.0%   2.63e+08        turbostat.C1
>       2.75 ±  2%      +6.5        9.29        turbostat.C1%
>       2.72 ±  2%    +178.3%       7.57        turbostat.CPU%c1
>       0.09           -22.2%       0.07        turbostat.IPC
>   44589125          +701.5%  3.574e+08        turbostat.IRQ
>     313.00 ± 57%   +1967.0%       6469 ±  8%  turbostat.POLL
>      70.33            +3.3%      72.67        turbostat.PkgTmp
>      44.23 ±  4%     -31.8%      30.15 ±  2%  turbostat.RAMWatt
>     536096          +583.7%    3665194        meminfo.Active
>     535414          +584.4%    3664543        meminfo.Active(anon)
>    3238301          +103.2%    6579677        meminfo.Cached
>    1204424          +278.9%    4563575        meminfo.Committed_AS
>     469093           +47.9%     693889 ±  3%  meminfo.Inactive
>     467250           +48.4%     693496 ±  3%  meminfo.Inactive(anon)
>      53615          +562.5%     355225 ±  4%  meminfo.Mapped
>    5223078           +64.1%    8571212        meminfo.Memused
>     557305          +599.6%    3899111        meminfo.Shmem
>    5660207           +58.9%    8993642        meminfo.max_used_kB
>      78504 ±  3%     -30.1%      54869        netperf.ThroughputBoth_Mbps
>    5024292 ±  3%     -30.1%    3511666        netperf.ThroughputBoth_total_Mbps
>       7673 ±  5%    +249.7%      26832        netperf.ThroughputRecv_Mbps
>     491074 ±  5%    +249.7%    1717287        netperf.ThroughputRecv_total_Mbps
>      70831 ±  2%     -60.4%      28037        netperf.Throughput_Mbps
>    4533217 ±  2%     -60.4%    1794379        netperf.Throughput_total_Mbps

This is a UDP test, so the sender does not wait for the receiver.  In the
results above, the sender throughput drops by 60.4%, while the receiver
throughput increases by 249.7%.  In addition, far fewer packets are dropped
during the test, which is also an improvement.  All in all, considering the
performance of both the sender and the receiver, I think the patch helps
performance.
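The point that a UDP sender does not wait for the receiver can be illustrated
with a minimal userspace sketch (illustrative only, not part of the report;
it assumes Linux loopback semantics, where sendto() succeeds and excess
datagrams are silently dropped once the receive buffer is full):

```python
import socket

# Receiver socket that is bound but never drained: once its receive
# buffer fills, further datagrams delivered to it are dropped.
recv_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
recv_sock.bind(("127.0.0.1", 0))
addr = recv_sock.getsockname()

# The sender keeps handing datagrams to the stack regardless of whether
# the receiver reads them; sendto() reports success either way.
send_sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sent = 0
for _ in range(10000):
    sent += send_sock.sendto(b"x" * 1024, addr)

# Drain whatever actually survived in the receive buffer.
recv_sock.setblocking(False)
received = 0
try:
    while True:
        received += len(recv_sock.recv(2048))
except BlockingIOError:
    pass

print("bytes handed to the stack:", sent)
print("bytes actually received:  ", received)
```

The sender-side byte count far exceeds the receiver-side one, which is why
netperf's UDP_STREAM send throughput and receive throughput have to be read
together, as done above.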
--
Best Regards,
Huang, Ying

>       5439            +9.4%       5949        netperf.time.percent_of_cpu_this_job_got
>      16206            +9.4%      17728        netperf.time.system_time
>     388.14           -51.9%     186.53        netperf.time.user_time
>  2.876e+09 ±  3%     -30.1%   2.01e+09        netperf.workload
>     177360 ± 30%     -36.0%     113450 ± 20%  numa-meminfo.node0.AnonPages
>     255926 ± 12%     -40.6%     152052 ± 12%  numa-meminfo.node0.AnonPages.max
>      22582 ± 61%    +484.2%     131916 ± 90%  numa-meminfo.node0.Mapped
>     138287 ± 17%     +22.6%     169534 ± 12%  numa-meminfo.node1.AnonHugePages
>     267468 ± 20%     +29.1%     345385 ±  6%  numa-meminfo.node1.AnonPages
>     346204 ± 18%     +34.5%     465696 ±  2%  numa-meminfo.node1.AnonPages.max
>     279416 ± 19%     +77.0%     494652 ± 18%  numa-meminfo.node1.Inactive
>     278445 ± 19%     +77.6%     494393 ± 18%  numa-meminfo.node1.Inactive(anon)
>      31726 ± 45%    +607.7%     224533 ± 45%  numa-meminfo.node1.Mapped
>       4802 ±  6%     +19.4%       5733 ±  3%  numa-meminfo.node1.PageTables
>     297323 ± 12%    +792.6%    2653850 ± 63%  numa-meminfo.node1.Shmem
>      44325 ± 30%     -36.0%      28379 ± 20%  numa-vmstat.node0.nr_anon_pages
>       5590 ± 61%    +491.0%      33042 ± 90%  numa-vmstat.node0.nr_mapped
>  1.407e+09 ±  3%     -28.2%   1.01e+09        numa-vmstat.node0.numa_hit
>  1.407e+09 ±  3%     -28.2%  1.011e+09        numa-vmstat.node0.numa_local
>      66858 ± 20%     +29.2%      86385 ±  6%  numa-vmstat.node1.nr_anon_pages
>      69601 ± 20%     +77.8%     123729 ± 18%  numa-vmstat.node1.nr_inactive_anon
>       7953 ± 45%    +608.3%      56335 ± 45%  numa-vmstat.node1.nr_mapped
>       1201 ±  6%     +19.4%       1434 ±  3%  numa-vmstat.node1.nr_page_table_pages
>      74288 ± 11%    +792.6%     663111 ± 63%  numa-vmstat.node1.nr_shmem
>      69601 ± 20%     +77.8%     123728 ± 18%  numa-vmstat.node1.nr_zone_inactive_anon
>  1.469e+09 ±  8%     -32.1%  9.974e+08        numa-vmstat.node1.numa_hit
>  1.469e+09 ±  8%     -32.0%  9.979e+08        numa-vmstat.node1.numa_local
>     133919          +584.2%     916254        proc-vmstat.nr_active_anon
>     111196            +3.3%     114828        proc-vmstat.nr_anon_pages
>    5602484            -1.5%    5518799        proc-vmstat.nr_dirty_background_threshold
>   11218668            -1.5%   11051092        proc-vmstat.nr_dirty_threshold
>     809646          +103.2%    1645012        proc-vmstat.nr_file_pages
>   56374629            -1.5%   55536913        proc-vmstat.nr_free_pages
>     116775           +48.4%     173349 ±  3%  proc-vmstat.nr_inactive_anon
>
13386 2% +563.3% 88793 4% proc-vmstat.nr_mapped > 2286 +6.5% 2434 proc-vmstat.nr_page_table_pages > 139393 +599.4% 974869 proc-vmstat.nr_shmem > 29092 +6.6% 31019 proc-vmstat.nr_slab_reclaimable > 133919 +584.2% 916254 proc-vmstat.nr_zone_active_anon > 116775 +48.4% 173349 3% proc-vmstat.nr_zone_inactive_anon > 32135 11% +257.2% 114797 21% proc-vmstat.numa_hint_faults > 20858 16% +318.3% 87244 6% proc-vmstat.numa_hint_faults_local > 2.876e+09 3% -30.2% 2.008e+09 proc-vmstat.numa_hit > 2.876e+09 3% -30.2% 2.008e+09 proc-vmstat.numa_local > 25453 7% -75.2% 6324 30% proc-vmstat.numa_pages_migrated > 178224 2% +76.6% 314680 7% proc-vmstat.numa_pte_updates > 160889 3% +267.6% 591393 6% proc-vmstat.pgactivate > 2.295e+10 3% -30.2% 1.601e+10 proc-vmstat.pgalloc_normal > 1026605 +21.9% 1251671 proc-vmstat.pgfault > 2.295e+10 3% -30.2% 1.601e+10 proc-vmstat.pgfree > 25453 7% -75.2% 6324 30% proc-vmstat.pgmigrate_success > 39208 2% -6.1% 36815 proc-vmstat.pgreuse > 3164416 -20.3% 2521344 2% proc-vmstat.unevictable_pgs_scanned > 19248627 -22.1% 14989905 sched_debug.cfs_rq:/.avg_vruntime.avg > 20722680 -24.9% 15569530 sched_debug.cfs_rq:/.avg_vruntime.max > 17634233 -22.5% 13663168 sched_debug.cfs_rq:/.avg_vruntime.min > 949063 2% -70.5% 280388 sched_debug.cfs_rq:/.avg_vruntime.stddev > 0.78 10% -100.0% 0.00 sched_debug.cfs_rq:/.h_nr_running.min > 0.16 8% +113.3% 0.33 2% sched_debug.cfs_rq:/.h_nr_running.stddev > 0.56 141% +2.2e+07% 122016 52% sched_debug.cfs_rq:/.left_vruntime.avg > 45.01 141% +2.2e+07% 10035976 28% sched_debug.cfs_rq:/.left_vruntime.max > 4.58 141% +2.3e+07% 1072762 36% sched_debug.cfs_rq:/.left_vruntime.stddev > 5814 10% -100.0% 0.00 sched_debug.cfs_rq:/.load.min > 5.39 9% -73.2% 1.44 10% sched_debug.cfs_rq:/.load_avg.min > 19248627 -22.1% 14989905 sched_debug.cfs_rq:/.min_vruntime.avg > 20722680 -24.9% 15569530 sched_debug.cfs_rq:/.min_vruntime.max > 17634233 -22.5% 13663168 sched_debug.cfs_rq:/.min_vruntime.min > 949063 2% -70.5% 280388 
sched_debug.cfs_rq:/.min_vruntime.stddev > 0.78 10% -100.0% 0.00 sched_debug.cfs_rq:/.nr_running.min > 0.06 8% +369.2% 0.30 3% sched_debug.cfs_rq:/.nr_running.stddev > 4.84 26% +1611.3% 82.79 67% sched_debug.cfs_rq:/.removed.load_avg.avg > 27.92 12% +3040.3% 876.79 68% sched_debug.cfs_rq:/.removed.load_avg.stddev > 0.56 141% +2.2e+07% 122016 52% sched_debug.cfs_rq:/.right_vruntime.avg > 45.06 141% +2.2e+07% 10035976 28% sched_debug.cfs_rq:/.right_vruntime.max > 4.59 141% +2.3e+07% 1072762 36% sched_debug.cfs_rq:/.right_vruntime.stddev > 900.25 -10.4% 806.45 sched_debug.cfs_rq:/.runnable_avg.avg > 533.28 4% -87.0% 69.56 39% sched_debug.cfs_rq:/.runnable_avg.min > 122.77 2% +92.9% 236.86 sched_debug.cfs_rq:/.runnable_avg.stddev > 896.13 -10.8% 799.44 sched_debug.cfs_rq:/.util_avg.avg > 379.06 4% -83.4% 62.94 37% sched_debug.cfs_rq:/.util_avg.min > 116.35 8% +99.4% 232.04 sched_debug.cfs_rq:/.util_avg.stddev > 550.87 -14.2% 472.66 2% sched_debug.cfs_rq:/.util_est_enqueued.avg > 1124 8% +18.2% 1329 3% sched_debug.cfs_rq:/.util_est_enqueued.max > 134.17 30% -100.0% 0.00 sched_debug.cfs_rq:/.util_est_enqueued.min > 558243 6% -66.9% 184666 sched_debug.cpu.avg_idle.avg > 12860 11% -56.1% 5644 sched_debug.cpu.avg_idle.min > 365635 -53.5% 169863 5% sched_debug.cpu.avg_idle.stddev > 9.56 3% -28.4% 6.84 8% sched_debug.cpu.clock.stddev > 6999 2% -85.6% 1007 3% sched_debug.cpu.clock_task.stddev > 3985 10% -100.0% 0.00 sched_debug.cpu.curr->pid.min > 491.71 10% +209.3% 1520 4% sched_debug.cpu.curr->pid.stddev > 270.19 141% +1096.6% 3233 51% sched_debug.cpu.max_idle_balance_cost.stddev > 0.78 10% -100.0% 0.00 sched_debug.cpu.nr_running.min > 0.15 6% +121.7% 0.34 2% sched_debug.cpu.nr_running.stddev > 62041 15% +4280.9% 2717948 sched_debug.cpu.nr_switches.avg > 1074922 14% +292.6% 4220307 2% sched_debug.cpu.nr_switches.max > 1186 2% +1.2e+05% 1379073 4% sched_debug.cpu.nr_switches.min > 132392 21% +294.6% 522476 5% sched_debug.cpu.nr_switches.stddev > 6.44 4% +21.4% 7.82 12% 
sched_debug.cpu.nr_uninterruptible.stddev > 6.73 13% -84.8% 1.02 5% perf-stat.i.MPKI > 1.652e+10 2% -22.2% 1.285e+10 perf-stat.i.branch-instructions > 0.72 +0.0 0.75 perf-stat.i.branch-miss-rate% > 1.19e+08 3% -19.8% 95493630 perf-stat.i.branch-misses > 27.46 12% -26.2 1.30 4% perf-stat.i.cache-miss-rate% > 5.943e+08 10% -88.6% 67756219 5% perf-stat.i.cache-misses > 2.201e+09 +143.7% 5.364e+09 perf-stat.i.cache-references > 48911 19% +4695.4% 2345525 perf-stat.i.context-switches > 3.66 2% +28.5% 4.71 perf-stat.i.cpi > 3.228e+11 -4.1% 3.097e+11 perf-stat.i.cpu-cycles > 190.51 +1363.7% 2788 10% perf-stat.i.cpu-migrations > 803.99 6% +510.2% 4905 5% perf-stat.i.cycles-between-cache-misses > 0.00 16% +0.0 0.01 14% perf-stat.i.dTLB-load-miss-rate% > 755654 18% +232.4% 2512024 14% perf-stat.i.dTLB-load-misses > 2.385e+10 2% -26.9% 1.742e+10 perf-stat.i.dTLB-loads > 0.00 31% +0.0 0.01 35% perf-stat.i.dTLB-store-miss-rate% > 305657 36% +200.0% 916822 35% perf-stat.i.dTLB-store-misses > 1.288e+10 2% -28.8% 9.179e+09 perf-stat.i.dTLB-stores > 8.789e+10 2% -25.2% 6.578e+10 perf-stat.i.instructions > 0.28 2% -21.6% 0.22 perf-stat.i.ipc > 2.52 -4.1% 2.42 perf-stat.i.metric.GHz > 873.89 12% -67.0% 288.04 8% perf-stat.i.metric.K/sec > 435.61 2% -19.6% 350.06 perf-stat.i.metric.M/sec > 2799 +29.9% 3637 2% perf-stat.i.minor-faults > 99.74 -2.6 97.11 perf-stat.i.node-load-miss-rate% > 1.294e+08 12% -92.4% 9879207 7% perf-stat.i.node-load-misses > 76.55 +16.4 92.92 perf-stat.i.node-store-miss-rate% > 2.257e+08 10% -90.4% 21721672 8% perf-stat.i.node-store-misses > 69217511 13% -97.7% 1625810 7% perf-stat.i.node-stores > 2799 +29.9% 3637 2% perf-stat.i.page-faults > 6.79 13% -84.9% 1.03 5% perf-stat.overall.MPKI > 0.72 +0.0 0.74 perf-stat.overall.branch-miss-rate% > 27.06 12% -25.8 1.26 4% perf-stat.overall.cache-miss-rate% > 3.68 2% +28.1% 4.71 perf-stat.overall.cpi > 549.38 10% +736.0% 4592 5% perf-stat.overall.cycles-between-cache-misses > 0.00 18% +0.0 0.01 14% 
perf-stat.overall.dTLB-load-miss-rate% > 0.00 36% +0.0 0.01 35% perf-stat.overall.dTLB-store-miss-rate% > 0.27 2% -22.0% 0.21 perf-stat.overall.ipc > 99.80 -2.4 97.37 perf-stat.overall.node-load-miss-rate% > 76.60 +16.4 93.03 perf-stat.overall.node-store-miss-rate% > 9319 +5.8% 9855 perf-stat.overall.path-length > 1.646e+10 2% -22.2% 1.281e+10 perf-stat.ps.branch-instructions > 1.186e+08 3% -19.8% 95167897 perf-stat.ps.branch-misses > 5.924e+08 10% -88.6% 67384354 5% perf-stat.ps.cache-misses > 2.193e+09 +143.4% 5.339e+09 perf-stat.ps.cache-references > 49100 19% +4668.0% 2341074 perf-stat.ps.context-switches > 3.218e+11 -4.1% 3.087e+11 perf-stat.ps.cpu-cycles > 189.73 +1368.4% 2786 10% perf-stat.ps.cpu-migrations > 753056 18% +229.9% 2484575 14% perf-stat.ps.dTLB-load-misses > 2.377e+10 2% -26.9% 1.737e+10 perf-stat.ps.dTLB-loads > 304509 36% +199.1% 910856 35% perf-stat.ps.dTLB-store-misses > 1.284e+10 2% -28.7% 9.152e+09 perf-stat.ps.dTLB-stores > 8.76e+10 2% -25.2% 6.557e+10 perf-stat.ps.instructions > 2791 +28.2% 3580 2% perf-stat.ps.minor-faults > 1.29e+08 12% -92.4% 9815672 7% perf-stat.ps.node-load-misses > 2.25e+08 10% -90.4% 21575943 8% perf-stat.ps.node-store-misses > 69002373 13% -97.7% 1615410 7% perf-stat.ps.node-stores > 2791 +28.2% 3580 2% perf-stat.ps.page-faults > 2.68e+13 2% -26.1% 1.981e+13 perf-stat.total.instructions > 0.00 35% +2600.0% 0.04 23% perf-sched.sch_delay.avg.ms.__cond_resched.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg > 1.18 9% -98.1% 0.02 32% perf-sched.sch_delay.avg.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part > 0.58 3% -62.1% 0.22 97% perf-sched.sch_delay.avg.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown] > 0.51 22% -82.7% 0.09 11% perf-sched.sch_delay.avg.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64 > 0.25 23% -59.6% 0.10 10% perf-sched.sch_delay.avg.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64 > 0.03 42% -64.0% 0.01 15% 
perf-sched.sch_delay.avg.ms.pipe_read.vfs_read.ksys_read.do_syscall_64 > 0.04 7% +434.6% 0.23 36% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll > 1.00 20% -84.1% 0.16 78% perf-sched.sch_delay.avg.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select > 0.01 7% -70.0% 0.00 perf-sched.sch_delay.avg.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg > 0.02 2% +533.9% 0.12 43% perf-sched.sch_delay.avg.ms.schedule_timeout.kcompactd.kthread.ret_from_fork > 0.03 7% +105.9% 0.06 33% perf-sched.sch_delay.avg.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread > 0.01 15% +67.5% 0.02 8% perf-sched.sch_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 0.09 50% -85.7% 0.01 33% perf-sched.sch_delay.avg.ms.wait_for_partner.fifo_open.do_dentry_open.do_open > 0.04 7% +343.4% 0.16 6% perf-sched.sch_delay.avg.ms.worker_thread.kthread.ret_from_fork.ret_from_fork_asm > 0.06 41% +3260.7% 1.88 30% perf-sched.sch_delay.max.ms.__cond_resched.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg > 3.78 -96.2% 0.14 3% perf-sched.sch_delay.max.ms.__cond_resched.stop_one_cpu.sched_exec.bprm_execve.part > 2.86 4% -72.6% 0.78 113% perf-sched.sch_delay.max.ms.__x64_sys_pause.do_syscall_64.entry_SYSCALL_64_after_hwframe.[unknown] > 4.09 7% -34.1% 2.69 7% perf-sched.sch_delay.max.ms.do_task_dead.do_exit.do_group_exit.__x64_sys_exit_group.do_syscall_64 > 3.09 37% -64.1% 1.11 5% perf-sched.sch_delay.max.ms.do_wait.kernel_wait4.__do_sys_wait4.do_syscall_64 > 0.00 141% +6200.0% 0.13 82% perf-sched.sch_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt > 3.94 -40.5% 2.35 48% perf-sched.sch_delay.max.ms.pipe_read.vfs_read.ksys_read.do_syscall_64 > 1.63 21% -77.0% 0.38 90% perf-sched.sch_delay.max.ms.schedule_hrtimeout_range_clock.do_select.core_sys_select.kern_select > 7.29 39% +417.5% 37.72 16% 
perf-sched.sch_delay.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg > 3.35 14% -51.7% 1.62 3% perf-sched.sch_delay.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone > 0.05 13% +2245.1% 1.13 40% perf-sched.sch_delay.max.ms.schedule_timeout.kcompactd.kthread.ret_from_fork > 3.01 26% +729.6% 25.01 91% perf-sched.sch_delay.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread > 1.93 59% -85.5% 0.28 62% perf-sched.sch_delay.max.ms.wait_for_partner.fifo_open.do_dentry_open.do_open > 0.01 -50.0% 0.00 perf-sched.total_sch_delay.average.ms > 7.29 39% +468.8% 41.46 26% perf-sched.total_sch_delay.max.ms > 6.04 4% -94.1% 0.35 perf-sched.total_wait_and_delay.average.ms > 205790 3% +1811.0% 3932742 perf-sched.total_wait_and_delay.count.ms > 6.03 4% -94.2% 0.35 perf-sched.total_wait_time.average.ms > 75.51 41% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write > 23.01 17% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop > 23.82 7% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter > 95.27 41% -100.0% 0.00 perf-sched.wait_and_delay.avg.ms.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin > 55.86 141% +1014.6% 622.64 5% perf-sched.wait_and_delay.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep > 0.07 23% -82.5% 0.01 perf-sched.wait_and_delay.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64 > 137.41 3% +345.1% 611.63 2% perf-sched.wait_and_delay.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll > 0.04 5% -49.6% 0.02 perf-sched.wait_and_delay.avg.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg > 536.33 5% -46.5% 287.00 
perf-sched.wait_and_delay.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 21.67 32% -100.0% 0.00 perf-sched.wait_and_delay.count.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write > 5.67 8% -100.0% 0.00 perf-sched.wait_and_delay.count.__cond_resched.mutex_lock.perf_poll.do_poll.constprop > 1.67 56% -100.0% 0.00 perf-sched.wait_and_delay.count.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter > 5.67 29% -100.0% 0.00 perf-sched.wait_and_delay.count.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin > 5.33 23% +93.8% 10.33 25% perf-sched.wait_and_delay.count.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 101725 3% +15.3% 117243 10% perf-sched.wait_and_delay.count.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64 > 100.00 7% -80.3% 19.67 2% perf-sched.wait_and_delay.count.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll > 97762 4% +3794.8% 3807606 perf-sched.wait_and_delay.count.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg > 1091 9% +111.9% 2311 3% perf-sched.wait_and_delay.count.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 604.50 43% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write > 37.41 9% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop > 27.08 13% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter > 275.41 32% -100.0% 0.00 perf-sched.wait_and_delay.max.ms.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin > 1313 69% +112.1% 2786 15% 
perf-sched.wait_and_delay.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm > 333.38 141% +200.4% 1001 perf-sched.wait_and_delay.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep > 1000 -96.8% 31.85 48% perf-sched.wait_and_delay.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64 > 17.99 33% +387.5% 87.71 8% perf-sched.wait_and_delay.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg > 0.33 19% -74.1% 0.09 10% perf-sched.wait_time.avg.ms.__cond_resched.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg > 0.02 53% +331.4% 0.10 50% perf-sched.wait_time.avg.ms.__cond_resched.aa_sk_perm.security_socket_recvmsg.sock_recvmsg.__sys_recvfrom > 0.09 65% -75.9% 0.02 9% perf-sched.wait_time.avg.ms.__cond_resched.aa_sk_perm.security_socket_sendmsg.sock_sendmsg.__sys_sendto > 0.02 22% -70.2% 0.01 141% perf-sched.wait_time.avg.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64 > 75.51 41% -100.0% 0.04 42% perf-sched.wait_time.avg.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write > 0.10 36% -80.3% 0.02 9% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb > 0.55 61% -94.9% 0.03 45% perf-sched.wait_time.avg.ms.__cond_resched.kmem_cache_alloc_node.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags > 23.01 17% -100.0% 0.00 141% perf-sched.wait_time.avg.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop > 23.82 7% -99.7% 0.07 57% perf-sched.wait_time.avg.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter > 95.27 41% -100.0% 0.03 89% perf-sched.wait_time.avg.ms.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin > 56.30 139% +1005.5% 622.44 5% 
perf-sched.wait_time.avg.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
> 2.78 66% -98.2% 0.05 52% perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
> 0.07 23% -82.5% 0.01 perf-sched.wait_time.avg.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> 137.37 3% +345.1% 611.40 2% perf-sched.wait_time.avg.ms.schedule_hrtimeout_range_clock.do_poll.constprop.0.do_sys_poll
> 0.02 5% -41.9% 0.01 3% perf-sched.wait_time.avg.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
> 536.32 5% -46.5% 286.98 perf-sched.wait_time.avg.ms.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 4.66 20% -56.7% 2.02 26% perf-sched.wait_time.max.ms.__cond_resched.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg
> 0.03 63% +995.0% 0.37 26% perf-sched.wait_time.max.ms.__cond_resched.aa_sk_perm.security_socket_recvmsg.sock_recvmsg.__sys_recvfrom
> 1.67 87% -92.6% 0.12 57% perf-sched.wait_time.max.ms.__cond_resched.aa_sk_perm.security_socket_sendmsg.sock_sendmsg.__sys_sendto
> 0.54 117% -95.1% 0.03 105% perf-sched.wait_time.max.ms.__cond_resched.down_write_killable.exec_mmap.begin_new_exec.load_elf_binary
> 0.06 49% -89.1% 0.01 141% perf-sched.wait_time.max.ms.__cond_resched.dput.__fput.__x64_sys_close.do_syscall_64
> 604.50 43% -100.0% 0.16 83% perf-sched.wait_time.max.ms.__cond_resched.generic_perform_write.shmem_file_write_iter.vfs_write.ksys_write
> 2.77 45% -95.4% 0.13 64% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node.__alloc_skb.alloc_skb_with_frags.sock_alloc_send_pskb
> 2.86 45% -94.3% 0.16 91% perf-sched.wait_time.max.ms.__cond_resched.kmem_cache_alloc_node.kmalloc_reserve.__alloc_skb.alloc_skb_with_frags
> 37.41 9% -100.0% 0.01 141% perf-sched.wait_time.max.ms.__cond_resched.mutex_lock.perf_poll.do_poll.constprop
> 27.08 13% -99.7% 0.08 61% perf-sched.wait_time.max.ms.__cond_resched.shmem_get_folio_gfp.shmem_write_begin.generic_perform_write.shmem_file_write_iter
> 275.41 32% -100.0% 0.03 89% perf-sched.wait_time.max.ms.__cond_resched.shmem_inode_acct_block.shmem_alloc_and_acct_folio.shmem_get_folio_gfp.shmem_write_begin
> 1313 69% +112.1% 2786 15% perf-sched.wait_time.max.ms.__cond_resched.smpboot_thread_fn.kthread.ret_from_fork.ret_from_fork_asm
> 334.74 140% +198.9% 1000 perf-sched.wait_time.max.ms.do_nanosleep.hrtimer_nanosleep.common_nsleep.__x64_sys_clock_nanosleep
> 21.74 58% -95.4% 1.00 103% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.irqentry_exit_to_user_mode.asm_sysvec_apic_timer_interrupt
> 1000 -97.6% 24.49 50% perf-sched.wait_time.max.ms.exit_to_user_mode_loop.exit_to_user_mode_prepare.syscall_exit_to_user_mode.do_syscall_64
> 10.90 27% +682.9% 85.36 6% perf-sched.wait_time.max.ms.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
> 32.91 58% -63.5% 12.01 115% perf-sched.wait_time.max.ms.schedule_timeout.__wait_for_common.wait_for_completion_state.kernel_clone
> 169.97 7% -49.2% 86.29 15% perf-sched.wait_time.max.ms.schedule_timeout.rcu_gp_fqs_loop.rcu_gp_kthread.kthread
> 44.08 -19.8 24.25 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page.skb_release_data.__consume_stateless_skb
> 44.47 -19.6 24.87 perf-profile.calltrace.cycles-pp.free_pcppages_bulk.free_unref_page.skb_release_data.__consume_stateless_skb.udp_recvmsg
> 43.63 -19.5 24.15 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.free_pcppages_bulk.free_unref_page.skb_release_data
> 45.62 -19.2 26.39 perf-profile.calltrace.cycles-pp.skb_release_data.__consume_stateless_skb.udp_recvmsg.inet_recvmsg.sock_recvmsg
> 45.62 -19.2 26.40 perf-profile.calltrace.cycles-pp.__consume_stateless_skb.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
> 45.00 -19.1 25.94 perf-profile.calltrace.cycles-pp.free_unref_page.skb_release_data.__consume_stateless_skb.udp_recvmsg.inet_recvmsg
> 50.41 -16.8 33.64 39% perf-profile.calltrace.cycles-pp.accept_connections.main.__libc_start_main
> 50.41 -16.8 33.64 39% perf-profile.calltrace.cycles-pp.accept_connection.accept_connections.main.__libc_start_main
> 50.41 -16.8 33.64 39% perf-profile.calltrace.cycles-pp.spawn_child.accept_connection.accept_connections.main.__libc_start_main
> 50.41 -16.8 33.64 39% perf-profile.calltrace.cycles-pp.process_requests.spawn_child.accept_connection.accept_connections.main
> 99.92 -14.2 85.72 15% perf-profile.calltrace.cycles-pp.main.__libc_start_main
> 99.96 -14.2 85.77 15% perf-profile.calltrace.cycles-pp.__libc_start_main
> 50.10 -8.6 41.52 perf-profile.calltrace.cycles-pp.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom
> 50.11 -8.6 41.55 perf-profile.calltrace.cycles-pp.inet_recvmsg.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64
> 50.13 -8.5 41.64 perf-profile.calltrace.cycles-pp.sock_recvmsg.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 50.28 -8.0 42.27 perf-profile.calltrace.cycles-pp.__sys_recvfrom.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom
> 50.29 -8.0 42.29 perf-profile.calltrace.cycles-pp.__x64_sys_recvfrom.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni
> 50.31 -7.9 42.42 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni.process_requests
> 50.32 -7.8 42.47 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.recvfrom.recv_omni.process_requests.spawn_child
> 50.36 -7.6 42.78 perf-profile.calltrace.cycles-pp.recvfrom.recv_omni.process_requests.spawn_child.accept_connection
> 50.41 -7.3 43.07 perf-profile.calltrace.cycles-pp.recv_omni.process_requests.spawn_child.accept_connection.accept_connections
> 19.93 2% -6.6 13.36 perf-profile.calltrace.cycles-pp.ip_generic_getfrag.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg
> 19.44 2% -6.3 13.16 perf-profile.calltrace.cycles-pp._copy_from_iter.ip_generic_getfrag.__ip_append_data.ip_make_skb.udp_sendmsg
> 18.99 2% -6.1 12.90 perf-profile.calltrace.cycles-pp.copyin._copy_from_iter.ip_generic_getfrag.__ip_append_data.ip_make_skb
> 8.95 -5.1 3.82 perf-profile.calltrace.cycles-pp.udp_send_skb.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
> 8.70 -5.0 3.71 perf-profile.calltrace.cycles-pp.ip_send_skb.udp_send_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
> 8.10 -4.6 3.45 perf-profile.calltrace.cycles-pp.ip_finish_output2.ip_send_skb.udp_send_skb.udp_sendmsg.sock_sendmsg
> 7.69 -4.4 3.27 perf-profile.calltrace.cycles-pp.__dev_queue_xmit.ip_finish_output2.ip_send_skb.udp_send_skb.udp_sendmsg
> 6.51 -3.7 2.78 perf-profile.calltrace.cycles-pp.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.ip_send_skb.udp_send_skb
> 6.47 -3.7 2.75 perf-profile.calltrace.cycles-pp.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2.ip_send_skb
> 6.41 -3.7 2.71 perf-profile.calltrace.cycles-pp.__do_softirq.do_softirq.__local_bh_enable_ip.__dev_queue_xmit.ip_finish_output2
> 5.88 -3.5 2.43 perf-profile.calltrace.cycles-pp.net_rx_action.__do_softirq.do_softirq.__local_bh_enable_ip.__dev_queue_xmit
> 5.73 -3.4 2.35 perf-profile.calltrace.cycles-pp.__napi_poll.net_rx_action.__do_softirq.do_softirq.__local_bh_enable_ip
> 5.69 -3.4 2.33 perf-profile.calltrace.cycles-pp.process_backlog.__napi_poll.net_rx_action.__do_softirq.do_softirq
> 5.36 -3.2 2.19 perf-profile.calltrace.cycles-pp.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action.__do_softirq
> 4.59 -2.7 1.89 perf-profile.calltrace.cycles-pp.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll.net_rx_action
> 4.55 2% -2.7 1.88 perf-profile.calltrace.cycles-pp.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog.__napi_poll
> 4.40 2% -2.6 1.81 perf-profile.calltrace.cycles-pp.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core.process_backlog
> 3.81 2% -2.2 1.57 perf-profile.calltrace.cycles-pp.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish.__netif_receive_skb_one_core
> 3.75 2% -2.2 1.55 perf-profile.calltrace.cycles-pp.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu.ip_local_deliver_finish
> 2.21 2% -1.6 0.63 perf-profile.calltrace.cycles-pp.__ip_make_skb.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
> 1.94 2% -1.4 0.51 2% perf-profile.calltrace.cycles-pp.__ip_select_ident.__ip_make_skb.ip_make_skb.udp_sendmsg.sock_sendmsg
> 1.14 -0.6 0.51 perf-profile.calltrace.cycles-pp.sock_alloc_send_pskb.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg
> 0.00 +0.5 0.53 2% perf-profile.calltrace.cycles-pp.schedule_idle.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 0.00 +0.7 0.69 perf-profile.calltrace.cycles-pp.sysvec_call_function_single.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state
> 0.00 +0.7 0.71 perf-profile.calltrace.cycles-pp.__sk_mem_raise_allocated.__sk_mem_schedule.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb
> 0.00 +0.7 0.72 perf-profile.calltrace.cycles-pp.__sk_mem_schedule.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv
> 0.00 +1.0 0.99 20% perf-profile.calltrace.cycles-pp.__schedule.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp
> 0.00 +1.0 1.01 20% perf-profile.calltrace.cycles-pp.schedule.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg
> 0.00 +1.1 1.05 20% perf-profile.calltrace.cycles-pp.schedule_timeout.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg
> 0.00 +1.1 1.12 perf-profile.calltrace.cycles-pp.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call
> 0.00 +1.2 1.18 20% perf-profile.calltrace.cycles-pp.__skb_wait_for_more_packets.__skb_recv_udp.udp_recvmsg.inet_recvmsg.sock_recvmsg
> 0.00 +1.3 1.32 perf-profile.calltrace.cycles-pp.__udp_enqueue_schedule_skb.udp_queue_rcv_one_skb.udp_unicast_rcv_skb.__udp4_lib_rcv.ip_protocol_deliver_rcu
> 0.00 +2.2 2.23 perf-profile.calltrace.cycles-pp.__skb_recv_udp.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
> 49.51 +2.6 52.08 perf-profile.calltrace.cycles-pp.send_udp_stream.main.__libc_start_main
> 49.49 +2.6 52.07 perf-profile.calltrace.cycles-pp.send_omni_inner.send_udp_stream.main.__libc_start_main
> 0.00 +3.0 2.96 2% perf-profile.calltrace.cycles-pp.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle
> 48.71 +3.0 51.73 perf-profile.calltrace.cycles-pp.sendto.send_omni_inner.send_udp_stream.main.__libc_start_main
> 0.00 +3.1 3.06 2% perf-profile.calltrace.cycles-pp.cpuidle_enter_state.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry
> 0.00 +3.1 3.09 perf-profile.calltrace.cycles-pp.cpuidle_enter.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary
> 48.34 +3.2 51.56 perf-profile.calltrace.cycles-pp.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner.send_udp_stream.main
> 0.00 +3.3 3.33 2% perf-profile.calltrace.cycles-pp.cpuidle_idle_call.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 48.13 +3.8 51.96 perf-profile.calltrace.cycles-pp.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner.send_udp_stream
> 47.82 +4.0 51.82 perf-profile.calltrace.cycles-pp.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto.send_omni_inner
> 47.70 +4.1 51.76 perf-profile.calltrace.cycles-pp.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe.sendto
> 0.00 +4.1 4.08 perf-profile.calltrace.cycles-pp.do_idle.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 0.00 +4.1 4.10 perf-profile.calltrace.cycles-pp.cpu_startup_entry.start_secondary.secondary_startup_64_no_verify
> 0.00 +4.1 4.10 perf-profile.calltrace.cycles-pp.start_secondary.secondary_startup_64_no_verify
> 0.00 +4.1 4.14 perf-profile.calltrace.cycles-pp.secondary_startup_64_no_verify
> 0.00 +4.3 4.35 2% perf-profile.calltrace.cycles-pp.asm_sysvec_call_function_single.acpi_safe_halt.acpi_idle_enter.cpuidle_enter_state.cpuidle_enter
> 46.52 +4.8 51.27 perf-profile.calltrace.cycles-pp.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64.entry_SYSCALL_64_after_hwframe
> 46.04 +5.0 51.08 perf-profile.calltrace.cycles-pp.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto.do_syscall_64
> 3.67 +8.0 11.63 perf-profile.calltrace.cycles-pp.copyout._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg
> 3.71 +8.1 11.80 perf-profile.calltrace.cycles-pp._copy_to_iter.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg
> 3.96 +8.5 12.42 perf-profile.calltrace.cycles-pp.__skb_datagram_iter.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg.sock_recvmsg
> 3.96 +8.5 12.44 perf-profile.calltrace.cycles-pp.skb_copy_datagram_iter.udp_recvmsg.inet_recvmsg.sock_recvmsg.__sys_recvfrom
> 35.13 +11.3 46.39 perf-profile.calltrace.cycles-pp.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto.__x64_sys_sendto
> 32.68 2% +13.0 45.65 perf-profile.calltrace.cycles-pp.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg.__sys_sendto
> 10.27 +20.3 30.59 perf-profile.calltrace.cycles-pp.sk_page_frag_refill.__ip_append_data.ip_make_skb.udp_sendmsg.sock_sendmsg
> 10.24 +20.3 30.58 perf-profile.calltrace.cycles-pp.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data.ip_make_skb.udp_sendmsg
> 9.84 +20.5 30.32 perf-profile.calltrace.cycles-pp.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data.ip_make_skb
> 9.59 +20.5 30.11 perf-profile.calltrace.cycles-pp.get_page_from_freelist.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill.__ip_append_data
> 8.40 +21.0 29.42 perf-profile.calltrace.cycles-pp.rmqueue.get_page_from_freelist.__alloc_pages.skb_page_frag_refill.sk_page_frag_refill
> 6.13 +21.9 28.05 perf-profile.calltrace.cycles-pp.native_queued_spin_lock_slowpath._raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist
> 6.20 +22.0 28.15 perf-profile.calltrace.cycles-pp._raw_spin_lock_irqsave.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages
> 6.46 +22.5 28.91 perf-profile.calltrace.cycles-pp.rmqueue_bulk.rmqueue.get_page_from_freelist.__alloc_pages.skb_page_frag_refill
> 48.24 -21.8 26.43 perf-profile.children.cycles-pp.skb_release_data
> 47.19 -21.2 25.98 perf-profile.children.cycles-pp.free_unref_page
> 44.48 -19.6 24.88 perf-profile.children.cycles-pp.free_pcppages_bulk
> 45.62 -19.2 26.40 perf-profile.children.cycles-pp.__consume_stateless_skb
> 99.95 -14.2 85.76 15% perf-profile.children.cycles-pp.main
> 99.96 -14.2 85.77 15% perf-profile.children.cycles-pp.__libc_start_main
> 50.10 -8.6 41.53 perf-profile.children.cycles-pp.udp_recvmsg
> 50.11 -8.6 41.56 perf-profile.children.cycles-pp.inet_recvmsg
> 50.13 -8.5 41.65 perf-profile.children.cycles-pp.sock_recvmsg
> 50.29 -8.0 42.28 perf-profile.children.cycles-pp.__sys_recvfrom
> 50.29 -8.0 42.30 perf-profile.children.cycles-pp.__x64_sys_recvfrom
> 50.38 -7.5 42.86 perf-profile.children.cycles-pp.recvfrom
> 50.41 -7.3 43.07 perf-profile.children.cycles-pp.accept_connections
> 50.41 -7.3 43.07 perf-profile.children.cycles-pp.accept_connection
> 50.41 -7.3 43.07 perf-profile.children.cycles-pp.spawn_child
> 50.41 -7.3 43.07 perf-profile.children.cycles-pp.process_requests
> 50.41 -7.3 43.07 perf-profile.children.cycles-pp.recv_omni
> 19.96 2% -6.5 13.50 perf-profile.children.cycles-pp.ip_generic_getfrag
> 19.46 2% -6.2 13.28 perf-profile.children.cycles-pp._copy_from_iter
> 19.21 2% -6.1 13.14 perf-profile.children.cycles-pp.copyin
> 8.96 -5.1 3.86 perf-profile.children.cycles-pp.udp_send_skb
> 8.72 -5.0 3.75 perf-profile.children.cycles-pp.ip_send_skb
> 8.11 -4.6 3.49 perf-profile.children.cycles-pp.ip_finish_output2
> 7.72 -4.4 3.32 perf-profile.children.cycles-pp.__dev_queue_xmit
> 98.71 -4.1 94.59 perf-profile.children.cycles-pp.entry_SYSCALL_64_after_hwframe
> 98.51 -4.0 94.46 perf-profile.children.cycles-pp.do_syscall_64
> 6.49 -3.7 2.78 perf-profile.children.cycles-pp.do_softirq
> 6.51 -3.7 2.82 perf-profile.children.cycles-pp.__local_bh_enable_ip
> 6.43 -3.7 2.78 perf-profile.children.cycles-pp.__do_softirq
> 5.90 -3.4 2.46 perf-profile.children.cycles-pp.net_rx_action
> 5.74 -3.4 2.38 perf-profile.children.cycles-pp.__napi_poll
> 5.71 -3.4 2.36 perf-profile.children.cycles-pp.process_backlog
> 5.37 -3.2 2.21 perf-profile.children.cycles-pp.__netif_receive_skb_one_core
> 4.60 -2.7 1.91 perf-profile.children.cycles-pp.ip_local_deliver_finish
> 4.57 2% -2.7 1.90 perf-profile.children.cycles-pp.ip_protocol_deliver_rcu
> 4.42 2% -2.6 1.83 perf-profile.children.cycles-pp.__udp4_lib_rcv
> 3.82 2% -2.2 1.58 2% perf-profile.children.cycles-pp.udp_unicast_rcv_skb
> 3.78 2% -2.2 1.57 2% perf-profile.children.cycles-pp.udp_queue_rcv_one_skb
> 2.23 2% -1.6 0.65 2% perf-profile.children.cycles-pp.__ip_make_skb
> 1.95 2% -1.4 0.52 3% perf-profile.children.cycles-pp.__ip_select_ident
> 1.51 4% -1.2 0.34 perf-profile.children.cycles-pp.free_unref_page_commit
> 1.17 -0.7 0.51 2% perf-profile.children.cycles-pp.ip_route_output_flow
> 1.15 -0.6 0.52 perf-profile.children.cycles-pp.sock_alloc_send_pskb
> 0.91 -0.5 0.39 perf-profile.children.cycles-pp.alloc_skb_with_frags
> 0.86 -0.5 0.37 perf-profile.children.cycles-pp.__alloc_skb
> 0.83 -0.5 0.36 2% perf-profile.children.cycles-pp.ip_route_output_key_hash_rcu
> 0.75 -0.4 0.32 perf-profile.children.cycles-pp.dev_hard_start_xmit
> 0.72 -0.4 0.31 3% perf-profile.children.cycles-pp.fib_table_lookup
> 0.67 -0.4 0.28 perf-profile.children.cycles-pp.loopback_xmit
> 0.70 2% -0.4 0.33 perf-profile.children.cycles-pp.__zone_watermark_ok
> 0.47 4% -0.3 0.15 perf-profile.children.cycles-pp.kmem_cache_free
> 0.57 -0.3 0.26 perf-profile.children.cycles-pp.kmem_cache_alloc_node
> 0.46 -0.3 0.18 2% perf-profile.children.cycles-pp.ip_rcv
> 0.42 -0.3 0.17 perf-profile.children.cycles-pp.move_addr_to_kernel
> 0.41 -0.2 0.16 2% perf-profile.children.cycles-pp.__udp4_lib_lookup
> 0.32 -0.2 0.13 perf-profile.children.cycles-pp.__netif_rx
> 0.30 -0.2 0.12 perf-profile.children.cycles-pp.netif_rx_internal
> 0.30 -0.2 0.12 perf-profile.children.cycles-pp._copy_from_user
> 0.31 -0.2 0.13 perf-profile.children.cycles-pp.kmalloc_reserve
> 0.63 -0.2 0.46 2% perf-profile.children.cycles-pp.free_unref_page_prepare
> 0.28 -0.2 0.11 perf-profile.children.cycles-pp.enqueue_to_backlog
> 0.27 -0.2 0.11 perf-profile.children.cycles-pp.udp4_lib_lookup2
> 0.29 -0.2 0.13 6% perf-profile.children.cycles-pp.send_data
> 0.25 -0.2 0.10 perf-profile.children.cycles-pp.__netif_receive_skb_core
> 0.23 2% -0.1 0.10 4% perf-profile.children.cycles-pp.security_socket_sendmsg
> 0.19 2% -0.1 0.06 perf-profile.children.cycles-pp.ip_rcv_core
> 0.37 -0.1 0.24 perf-profile.children.cycles-pp.irqtime_account_irq
> 0.21 -0.1 0.08 perf-profile.children.cycles-pp.sock_wfree
> 0.21 3% -0.1 0.08 perf-profile.children.cycles-pp.validate_xmit_skb
> 0.20 2% -0.1 0.08 perf-profile.children.cycles-pp.ip_output
> 0.22 2% -0.1 0.10 4% perf-profile.children.cycles-pp.ip_rcv_finish_core
> 0.20 6% -0.1 0.09 5% perf-profile.children.cycles-pp.__mkroute_output
> 0.21 2% -0.1 0.09 5% perf-profile.children.cycles-pp._raw_spin_lock_irq
> 0.28 -0.1 0.18 perf-profile.children.cycles-pp._raw_spin_trylock
> 0.34 3% -0.1 0.25 perf-profile.children.cycles-pp.__slab_free
> 0.13 3% -0.1 0.05 perf-profile.children.cycles-pp.siphash_3u32
> 0.12 4% -0.1 0.03 70% perf-profile.children.cycles-pp.ipv4_pktinfo_prepare
> 0.14 3% -0.1 0.06 7% perf-profile.children.cycles-pp.__ip_local_out
> 0.20 2% -0.1 0.12 perf-profile.children.cycles-pp.aa_sk_perm
> 0.18 2% -0.1 0.10 perf-profile.children.cycles-pp.get_pfnblock_flags_mask
> 0.12 3% -0.1 0.05 perf-profile.children.cycles-pp.sk_filter_trim_cap
> 0.13 -0.1 0.06 perf-profile.children.cycles-pp.ip_setup_cork
> 0.13 7% -0.1 0.06 8% perf-profile.children.cycles-pp.fib_lookup_good_nhc
> 0.15 3% -0.1 0.08 5% perf-profile.children.cycles-pp.skb_set_owner_w
> 0.11 4% -0.1 0.05 perf-profile.children.cycles-pp.dst_release
> 0.23 2% -0.1 0.17 2% perf-profile.children.cycles-pp.__entry_text_start
> 0.11 -0.1 0.05 perf-profile.children.cycles-pp.ipv4_mtu
> 0.20 2% -0.1 0.15 3% perf-profile.children.cycles-pp.__list_add_valid_or_report
> 0.10 -0.1 0.05 perf-profile.children.cycles-pp.ip_send_check
> 0.31 2% -0.0 0.26 3% perf-profile.children.cycles-pp.sockfd_lookup_light
> 0.27 -0.0 0.22 2% perf-profile.children.cycles-pp.__fget_light
> 0.63 -0.0 0.58 perf-profile.children.cycles-pp.__check_object_size
> 0.15 3% -0.0 0.11 perf-profile.children.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.13 -0.0 0.09 5% perf-profile.children.cycles-pp.alloc_pages
> 0.27 -0.0 0.24 perf-profile.children.cycles-pp.sched_clock_cpu
> 0.11 4% -0.0 0.08 6% perf-profile.children.cycles-pp.__cond_resched
> 0.14 3% -0.0 0.11 perf-profile.children.cycles-pp.free_tail_page_prepare
> 0.11 -0.0 0.08 5% perf-profile.children.cycles-pp.syscall_return_via_sysret
> 0.09 9% -0.0 0.06 7% perf-profile.children.cycles-pp.__xfrm_policy_check2
> 0.23 2% -0.0 0.21 2% perf-profile.children.cycles-pp.sched_clock
> 0.14 3% -0.0 0.11 4% perf-profile.children.cycles-pp.prep_compound_page
> 0.21 2% -0.0 0.20 2% perf-profile.children.cycles-pp.native_sched_clock
> 0.06 -0.0 0.05 perf-profile.children.cycles-pp.task_tick_fair
> 0.06 -0.0 0.05 perf-profile.children.cycles-pp.check_stack_object
> 0.18 2% +0.0 0.20 2% perf-profile.children.cycles-pp.perf_event_task_tick
> 0.18 2% +0.0 0.19 2% perf-profile.children.cycles-pp.perf_adjust_freq_unthr_context
> 0.31 3% +0.0 0.33 perf-profile.children.cycles-pp.tick_sched_handle
> 0.31 3% +0.0 0.33 perf-profile.children.cycles-pp.update_process_times
> 0.41 2% +0.0 0.43 perf-profile.children.cycles-pp.__sysvec_apic_timer_interrupt
> 0.40 2% +0.0 0.42 perf-profile.children.cycles-pp.hrtimer_interrupt
> 0.32 2% +0.0 0.34 perf-profile.children.cycles-pp.tick_sched_timer
> 0.36 2% +0.0 0.39 perf-profile.children.cycles-pp.__hrtimer_run_queues
> 0.06 7% +0.0 0.10 4% perf-profile.children.cycles-pp.exit_to_user_mode_prepare
> 0.05 8% +0.0 0.10 perf-profile.children.cycles-pp._raw_spin_lock_bh
> 0.00 +0.1 0.05 perf-profile.children.cycles-pp.update_cfs_group
> 0.00 +0.1 0.05 perf-profile.children.cycles-pp.cpuidle_governor_latency_req
> 0.00 +0.1 0.05 perf-profile.children.cycles-pp.flush_smp_call_function_queue
> 0.00 +0.1 0.05 8% perf-profile.children.cycles-pp.prepare_to_wait_exclusive
> 0.07 +0.1 0.13 3% perf-profile.children.cycles-pp.__mod_zone_page_state
> 0.00 +0.1 0.06 13% perf-profile.children.cycles-pp.cgroup_rstat_updated
> 0.00 +0.1 0.06 perf-profile.children.cycles-pp.__x2apic_send_IPI_dest
> 0.00 +0.1 0.06 perf-profile.children.cycles-pp.security_socket_recvmsg
> 0.00 +0.1 0.06 perf-profile.children.cycles-pp.select_task_rq_fair
> 0.00 +0.1 0.06 perf-profile.children.cycles-pp.tick_irq_enter
> 0.00 +0.1 0.06 perf-profile.children.cycles-pp.tick_nohz_idle_enter
> 0.42 2% +0.1 0.49 2% perf-profile.children.cycles-pp.sysvec_apic_timer_interrupt
> 0.00 +0.1 0.07 7% perf-profile.children.cycles-pp.ktime_get
> 0.00 +0.1 0.07 perf-profile.children.cycles-pp.__get_user_4
> 0.00 +0.1 0.07 perf-profile.children.cycles-pp.update_rq_clock
> 0.00 +0.1 0.07 perf-profile.children.cycles-pp.select_task_rq
> 0.00 +0.1 0.07 perf-profile.children.cycles-pp.native_apic_msr_eoi
> 0.49 +0.1 0.57 2% perf-profile.children.cycles-pp.asm_sysvec_apic_timer_interrupt
> 0.11 11% +0.1 0.19 2% perf-profile.children.cycles-pp._raw_spin_lock
> 0.00 +0.1 0.08 perf-profile.children.cycles-pp.update_rq_clock_task
> 0.00 +0.1 0.08 perf-profile.children.cycles-pp.__update_load_avg_se
> 0.00 +0.1 0.09 5% perf-profile.children.cycles-pp.irq_enter_rcu
> 0.00 +0.1 0.09 5% perf-profile.children.cycles-pp.__irq_exit_rcu
> 0.00 +0.1 0.09 perf-profile.children.cycles-pp.__update_load_avg_cfs_rq
> 0.00 +0.1 0.09 perf-profile.children.cycles-pp.update_blocked_averages
> 0.00 +0.1 0.09 perf-profile.children.cycles-pp.update_sg_lb_stats
> 0.00 +0.1 0.09 5% perf-profile.children.cycles-pp.set_next_entity
> 0.00 +0.1 0.10 perf-profile.children.cycles-pp.__switch_to_asm
> 0.00 +0.1 0.11 12% perf-profile.children.cycles-pp._copy_to_user
> 0.00 +0.1 0.12 3% perf-profile.children.cycles-pp.menu_select
> 0.00 +0.1 0.12 3% perf-profile.children.cycles-pp.recv_data
> 0.00 +0.1 0.12 3% perf-profile.children.cycles-pp.update_sd_lb_stats
> 0.00 +0.1 0.13 3% perf-profile.children.cycles-pp.native_irq_return_iret
> 0.00 +0.1 0.13 3% perf-profile.children.cycles-pp.__switch_to
> 0.00 +0.1 0.13 3% perf-profile.children.cycles-pp.find_busiest_group
> 0.00 +0.1 0.14 perf-profile.children.cycles-pp.finish_task_switch
> 0.00 +0.1 0.15 3% perf-profile.children.cycles-pp.update_curr
> 0.00 +0.2 0.15 3% perf-profile.children.cycles-pp.mem_cgroup_uncharge_skmem
> 0.00 +0.2 0.16 perf-profile.children.cycles-pp.ttwu_queue_wakelist
> 0.05 +0.2 0.22 2% perf-profile.children.cycles-pp.page_counter_try_charge
> 0.00 +0.2 0.17 2% perf-profile.children.cycles-pp.load_balance
> 0.00 +0.2 0.17 2% perf-profile.children.cycles-pp.___perf_sw_event
> 0.02 141% +0.2 0.19 2% perf-profile.children.cycles-pp.page_counter_uncharge
> 0.33 +0.2 0.52 perf-profile.children.cycles-pp.__free_one_page
> 0.02 141% +0.2 0.21 2% perf-profile.children.cycles-pp.drain_stock
> 0.00 +0.2 0.20 2% perf-profile.children.cycles-pp.prepare_task_switch
> 0.16 3% +0.2 0.38 2% perf-profile.children.cycles-pp.simple_copy_to_iter
> 0.07 11% +0.2 0.31 perf-profile.children.cycles-pp.refill_stock
> 0.07 6% +0.2 0.31 4% perf-profile.children.cycles-pp.move_addr_to_user
> 0.00 +0.2 0.24 perf-profile.children.cycles-pp.enqueue_entity
> 0.00 +0.2 0.25 perf-profile.children.cycles-pp.update_load_avg
> 0.21 2% +0.3 0.48 perf-profile.children.cycles-pp.__list_del_entry_valid_or_report
> 0.00 +0.3 0.31 4% perf-profile.children.cycles-pp.dequeue_entity
> 0.08 5% +0.3 0.40 3% perf-profile.children.cycles-pp.try_charge_memcg
> 0.00 +0.3 0.33 perf-profile.children.cycles-pp.enqueue_task_fair
> 0.00 +0.4 0.35 2% perf-profile.children.cycles-pp.dequeue_task_fair
> 0.00 +0.4 0.35 2% perf-profile.children.cycles-pp.activate_task
> 0.00 +0.4 0.36 2% perf-profile.children.cycles-pp.try_to_wake_up
> 0.00 +0.4 0.37 2% perf-profile.children.cycles-pp.autoremove_wake_function
> 0.00 +0.4 0.39 3% perf-profile.children.cycles-pp.newidle_balance
> 0.12 8% +0.4 0.51 2% perf-profile.children.cycles-pp.mem_cgroup_charge_skmem
> 0.00 +0.4 0.39 perf-profile.children.cycles-pp.ttwu_do_activate
> 0.00 +0.4 0.40 2% perf-profile.children.cycles-pp.__wake_up_common
> 0.18 4% +0.4 0.59 perf-profile.children.cycles-pp.udp_rmem_release
> 0.11 7% +0.4 0.52 perf-profile.children.cycles-pp.__sk_mem_reduce_allocated
> 0.00 +0.4 0.43 perf-profile.children.cycles-pp.__wake_up_common_lock
> 0.00 +0.5 0.46 perf-profile.children.cycles-pp.sched_ttwu_pending
> 0.00 +0.5 0.49 perf-profile.children.cycles-pp.sock_def_readable
> 0.00 +0.5 0.53 2% perf-profile.children.cycles-pp.pick_next_task_fair
> 0.00 +0.5 0.54 2% perf-profile.children.cycles-pp.schedule_idle
> 0.00 +0.6 0.55 perf-profile.children.cycles-pp.__flush_smp_call_function_queue
> 0.15 3% +0.6 0.73 2% perf-profile.children.cycles-pp.__sk_mem_raise_allocated
> 0.00 +0.6 0.57 perf-profile.children.cycles-pp.__sysvec_call_function_single
> 0.16 5% +0.6 0.74 2% perf-profile.children.cycles-pp.__sk_mem_schedule
> 0.00 +0.8 0.78 perf-profile.children.cycles-pp.sysvec_call_function_single
> 0.41 3% +0.9 1.33 2% perf-profile.children.cycles-pp.__udp_enqueue_schedule_skb
> 0.00 +1.2 1.16 2% perf-profile.children.cycles-pp.schedule
> 0.00 +1.2 1.21 2% perf-profile.children.cycles-pp.schedule_timeout
> 0.00 +1.3 1.33 2% perf-profile.children.cycles-pp.__skb_wait_for_more_packets
> 0.00 +1.7 1.66 2% perf-profile.children.cycles-pp.__schedule
> 0.27 3% +2.0 2.25 perf-profile.children.cycles-pp.__skb_recv_udp
> 50.41 +2.4 52.81 perf-profile.children.cycles-pp._raw_spin_lock_irqsave
> 0.00 +2.7 2.68 perf-profile.children.cycles-pp.asm_sysvec_call_function_single
> 49.78 +2.7 52.49 perf-profile.children.cycles-pp.native_queued_spin_lock_slowpath
> 0.00 +3.0 2.98 perf-profile.children.cycles-pp.acpi_safe_halt
> 0.00 +3.0 3.00 perf-profile.children.cycles-pp.acpi_idle_enter
> 49.51 +3.1 52.57 perf-profile.children.cycles-pp.send_udp_stream
> 49.50 +3.1 52.56 perf-profile.children.cycles-pp.send_omni_inner
> 0.00 +3.1 3.10 perf-profile.children.cycles-pp.cpuidle_enter_state
> 0.00 +3.1 3.12 perf-profile.children.cycles-pp.cpuidle_enter
> 0.00 +3.4 3.37 perf-profile.children.cycles-pp.cpuidle_idle_call
> 48.90 +3.4 52.30 perf-profile.children.cycles-pp.sendto
> 47.85 +4.0 51.83 perf-profile.children.cycles-pp.__x64_sys_sendto
> 47.73 +4.0 51.77 perf-profile.children.cycles-pp.__sys_sendto
> 0.00 +4.1 4.10 perf-profile.children.cycles-pp.start_secondary
> 0.00 +4.1 4.13 perf-profile.children.cycles-pp.do_idle
> 0.00 +4.1 4.14 perf-profile.children.cycles-pp.secondary_startup_64_no_verify
> 0.00 +4.1 4.14 perf-profile.children.cycles-pp.cpu_startup_entry
> 46.54 +4.7 51.28 perf-profile.children.cycles-pp.sock_sendmsg
> 46.10 +5.0 51.11 perf-profile.children.cycles-pp.udp_sendmsg
> 3.70 +8.0 11.71 perf-profile.children.cycles-pp.copyout
> 3.71 +8.1 11.80 perf-profile.children.cycles-pp._copy_to_iter
> 3.96 +8.5 12.43 perf-profile.children.cycles-pp.__skb_datagram_iter
> 3.96 +8.5 12.44 perf-profile.children.cycles-pp.skb_copy_datagram_iter
> 35.14 +11.3 46.40 perf-profile.children.cycles-pp.ip_make_skb
> 32.71 2% +13.0 45.66 perf-profile.children.cycles-pp.__ip_append_data
> 10.28 +20.6 30.89 perf-profile.children.cycles-pp.sk_page_frag_refill
> 10.25 +20.6 30.88 perf-profile.children.cycles-pp.skb_page_frag_refill
> 9.86 +20.8 30.63 perf-profile.children.cycles-pp.__alloc_pages
> 9.62 +20.8 30.42 perf-profile.children.cycles-pp.get_page_from_freelist
> 8.42 +21.3 29.72 perf-profile.children.cycles-pp.rmqueue
> 6.47 +22.8 29.22 perf-profile.children.cycles-pp.rmqueue_bulk
> 19.11 2% -6.0 13.08 perf-profile.self.cycles-pp.copyin
> 1.81 2% -1.4 0.39 perf-profile.self.cycles-pp.rmqueue
> 1.81 2% -1.3 0.46 2% perf-profile.self.cycles-pp.__ip_select_ident
> 1.47 4% -1.2 0.31 perf-profile.self.cycles-pp.free_unref_page_commit
> 1.29 2% -0.5 0.75 perf-profile.self.cycles-pp.__ip_append_data
> 0.71 -0.4 0.29 perf-profile.self.cycles-pp.udp_sendmsg
> 0.68 2% -0.4 0.32 perf-profile.self.cycles-pp.__zone_watermark_ok
> 0.50 -0.3 0.16 perf-profile.self.cycles-pp.skb_release_data
> 0.59 3% -0.3 0.26 3% perf-profile.self.cycles-pp.fib_table_lookup
> 0.46 4% -0.3 0.15 3% perf-profile.self.cycles-pp.kmem_cache_free
> 0.63 -0.3 0.33 2% perf-profile.self.cycles-pp._raw_spin_lock_irqsave
> 0.47 -0.3 0.19 perf-profile.self.cycles-pp.__sys_sendto
> 0.44 -0.2 0.21 2% perf-profile.self.cycles-pp.kmem_cache_alloc_node
> 0.36 -0.2 0.16 3% perf-profile.self.cycles-pp.send_omni_inner
> 0.35 2% -0.2 0.15 3% perf-profile.self.cycles-pp.ip_finish_output2
> 0.29 -0.2 0.12 perf-profile.self.cycles-pp._copy_from_user
> 0.24 -0.1 0.10 4% perf-profile.self.cycles-pp.__netif_receive_skb_core
> 0.22 2% -0.1 0.08 5% perf-profile.self.cycles-pp.free_unref_page
> 0.19 2% -0.1 0.06 perf-profile.self.cycles-pp.ip_rcv_core
> 0.21 2% -0.1 0.08 perf-profile.self.cycles-pp.__alloc_skb
> 0.20 2% -0.1 0.08 perf-profile.self.cycles-pp.sock_wfree
> 0.22 2% -0.1 0.10 4% perf-profile.self.cycles-pp.send_data
> 0.21 -0.1 0.09 perf-profile.self.cycles-pp.sendto
> 0.21 2% -0.1 0.10 4% perf-profile.self.cycles-pp.ip_rcv_finish_core
> 0.21 2% -0.1 0.09 5% perf-profile.self.cycles-pp.__ip_make_skb
> 0.20 4% -0.1 0.09 5% perf-profile.self.cycles-pp._raw_spin_lock_irq
> 0.21 2% -0.1 0.10 4% perf-profile.self.cycles-pp.__dev_queue_xmit
> 0.38 3% -0.1 0.27 perf-profile.self.cycles-pp.get_page_from_freelist
> 0.20 2% -0.1 0.09 perf-profile.self.cycles-pp.udp_send_skb
> 0.18 2% -0.1 0.07 perf-profile.self.cycles-pp.__udp_enqueue_schedule_skb
> 0.18 4% -0.1 0.08 6% perf-profile.self.cycles-pp.__mkroute_output
> 0.25 -0.1 0.15 3% perf-profile.self.cycles-pp._copy_from_iter
> 0.27 4% -0.1 0.17 2% perf-profile.self.cycles-pp.skb_page_frag_refill
> 0.16 -0.1 0.06 7% perf-profile.self.cycles-pp.sock_sendmsg
> 0.33 2% -0.1 0.24 perf-profile.self.cycles-pp.__slab_free
> 0.15 3% -0.1 0.06 perf-profile.self.cycles-pp.udp4_lib_lookup2
> 0.38 2% -0.1 0.29 2% perf-profile.self.cycles-pp.free_unref_page_prepare
> 0.26 -0.1 0.17 perf-profile.self.cycles-pp._raw_spin_trylock
> 0.15 -0.1 0.06 perf-profile.self.cycles-pp.ip_output
> 0.14 -0.1 0.05 8% perf-profile.self.cycles-pp.process_backlog
> 0.14 -0.1 0.06 perf-profile.self.cycles-pp.ip_route_output_flow
> 0.14 -0.1 0.06 perf-profile.self.cycles-pp.__udp4_lib_lookup
> 0.21 2% -0.1 0.13 3% perf-profile.self.cycles-pp.entry_SYSCALL_64_after_hwframe
> 0.12 3% -0.1 0.05 perf-profile.self.cycles-pp.siphash_3u32
> 0.13 3% -0.1 0.06 8% perf-profile.self.cycles-pp.ip_send_skb
> 0.17 -0.1 0.10 perf-profile.self.cycles-pp.__do_softirq
> 0.15 3% -0.1 0.08 5% perf-profile.self.cycles-pp.skb_set_owner_w
> 0.17 2% -0.1 0.10 4% perf-profile.self.cycles-pp.aa_sk_perm
> 0.12 -0.1 0.05 perf-profile.self.cycles-pp.__x64_sys_sendto
> 0.12 6% -0.1 0.05 perf-profile.self.cycles-pp.fib_lookup_good_nhc
> 0.19 2% -0.1 0.13 perf-profile.self.cycles-pp.__list_add_valid_or_report
> 0.14 3% -0.1 0.07 6% perf-profile.self.cycles-pp.net_rx_action
> 0.16 2% -0.1 0.10 perf-profile.self.cycles-pp.do_syscall_64
> 0.11 -0.1 0.05 perf-profile.self.cycles-pp.__udp4_lib_rcv
> 0.16 3% -0.1 0.10 4% perf-profile.self.cycles-pp.get_pfnblock_flags_mask
> 0.11 4% -0.1 0.05 perf-profile.self.cycles-pp.ip_route_output_key_hash_rcu
> 0.10 4% -0.1 0.05 perf-profile.self.cycles-pp.ip_generic_getfrag
> 0.10 -0.1 0.05 perf-profile.self.cycles-pp.ipv4_mtu
> 0.26 -0.0 0.21 2% perf-profile.self.cycles-pp.__fget_light
> 0.15 3% -0.0 0.11 4% perf-profile.self.cycles-pp.entry_SYSRETQ_unsafe_stack
> 0.24 -0.0 0.20 2% perf-profile.self.cycles-pp.__alloc_pages
> 0.15 3% -0.0 0.12 perf-profile.self.cycles-pp.__check_object_size
> 0.11 -0.0 0.08 6% perf-profile.self.cycles-pp.syscall_return_via_sysret
> 0.08 5% -0.0 0.05 perf-profile.self.cycles-pp.loopback_xmit
> 0.13 -0.0 0.11 4% perf-profile.self.cycles-pp.prep_compound_page
> 0.11 -0.0 0.09 5% perf-profile.self.cycles-pp.irqtime_account_irq
> 0.09 10% -0.0 0.06 7% perf-profile.self.cycles-pp.__xfrm_policy_check2
> 0.07 -0.0 0.05 perf-profile.self.cycles-pp.alloc_pages
> 0.08 -0.0 0.06 7% perf-profile.self.cycles-pp.__entry_text_start
> 0.09 5% -0.0 0.07 perf-profile.self.cycles-pp.free_tail_page_prepare
> 0.10 +0.0 0.11 perf-profile.self.cycles-pp.perf_adjust_freq_unthr_context
> 0.06 +0.0 0.08 6% perf-profile.self.cycles-pp.free_pcppages_bulk
> 0.05 8% +0.0 0.10 4% perf-profile.self.cycles-pp._raw_spin_lock_bh
> 0.07 +0.0 0.12 perf-profile.self.cycles-pp.__mod_zone_page_state
> 0.00 +0.1 0.05 perf-profile.self.cycles-pp.cpuidle_idle_call
> 0.00 +0.1 0.05 perf-profile.self.cycles-pp.udp_rmem_release
> 0.00 +0.1 0.05 perf-profile.self.cycles-pp.__flush_smp_call_function_queue
> 0.00 +0.1 0.05 perf-profile.self.cycles-pp.sock_def_readable
> 0.00 +0.1 0.05 perf-profile.self.cycles-pp.update_cfs_group
> 0.11 11% +0.1 0.17 2% perf-profile.self.cycles-pp._raw_spin_lock
> 0.00 +0.1 0.05 8% perf-profile.self.cycles-pp.finish_task_switch
> 0.00 +0.1 0.05 8% perf-profile.self.cycles-pp.cgroup_rstat_updated
> 0.00 +0.1 0.06 perf-profile.self.cycles-pp.do_idle
> 0.00 +0.1 0.06 perf-profile.self.cycles-pp.__skb_wait_for_more_packets
> 0.00 +0.1 0.06 perf-profile.self.cycles-pp.__x2apic_send_IPI_dest
> 0.00 +0.1 0.06 7% perf-profile.self.cycles-pp.enqueue_entity
> 0.00 +0.1 0.07 7% perf-profile.self.cycles-pp.schedule_timeout
> 0.00 +0.1 0.07 7% perf-profile.self.cycles-pp.move_addr_to_user
> 0.00 +0.1 0.07 7% perf-profile.self.cycles-pp.menu_select
> 0.00 +0.1 0.07 7% perf-profile.self.cycles-pp.native_apic_msr_eoi
> 0.00 +0.1 0.07 7% perf-profile.self.cycles-pp.update_sg_lb_stats
> 0.00 +0.1 0.07 perf-profile.self.cycles-pp.__update_load_avg_se
> 0.00 +0.1 0.07 perf-profile.self.cycles-pp.__get_user_4
> 0.00 +0.1 0.08 6% perf-profile.self.cycles-pp.__sk_mem_reduce_allocated
> 0.00 +0.1 0.08 perf-profile.self.cycles-pp.update_curr
> 0.00 +0.1 0.08 5% perf-profile.self.cycles-pp.__update_load_avg_cfs_rq
> 0.00 +0.1 0.09 5% perf-profile.self.cycles-pp.try_to_wake_up
> 0.00 +0.1 0.09 perf-profile.self.cycles-pp.recvfrom
> 0.00 +0.1 0.09 perf-profile.self.cycles-pp.mem_cgroup_charge_skmem
> 0.00 +0.1 0.09 perf-profile.self.cycles-pp.update_load_avg
> 0.00 +0.1 0.09 5% perf-profile.self.cycles-pp.enqueue_task_fair
> 0.00 +0.1 0.10 4% perf-profile.self.cycles-pp._copy_to_iter
> 0.00 +0.1 0.10 4% perf-profile.self.cycles-pp.newidle_balance
> 0.00 +0.1 0.10 4% perf-profile.self.cycles-pp.recv_data
> 0.00 +0.1 0.10 perf-profile.self.cycles-pp.refill_stock
> 0.00 +0.1 0.10 perf-profile.self.cycles-pp.__switch_to_asm
> 0.00 +0.1 0.11 15% perf-profile.self.cycles-pp._copy_to_user
> 0.00 +0.1 0.12 perf-profile.self.cycles-pp.recv_omni
> 0.00 +0.1 0.12 perf-profile.self.cycles-pp.mem_cgroup_uncharge_skmem
> 0.00 +0.1 0.13 3% perf-profile.self.cycles-pp.native_irq_return_iret
> 0.00 +0.1 0.13 perf-profile.self.cycles-pp.__switch_to
> 0.06 +0.1 0.20 2% perf-profile.self.cycles-pp.rmqueue_bulk
> 0.09 5% +0.1 0.23 4% perf-profile.self.cycles-pp.udp_recvmsg
> 0.00 +0.1 0.14 3% perf-profile.self.cycles-pp.__skb_recv_udp
> 0.00 +0.1 0.14 3% perf-profile.self.cycles-pp.___perf_sw_event
> 0.08 +0.1 0.22 2% perf-profile.self.cycles-pp.__skb_datagram_iter
> 0.03 70% +0.2 0.20 4% perf-profile.self.cycles-pp.page_counter_try_charge
> 0.02 141% +0.2 0.18 4% perf-profile.self.cycles-pp.__sys_recvfrom
> 0.00 +0.2 0.17 2% perf-profile.self.cycles-pp.__schedule
> 0.00 +0.2 0.17 2% perf-profile.self.cycles-pp.try_charge_memcg
> 0.00 +0.2 0.17 2% perf-profile.self.cycles-pp.page_counter_uncharge
> 0.00 +0.2 0.21 2% perf-profile.self.cycles-pp.__sk_mem_raise_allocated
> 0.14 3% +0.2 0.36 perf-profile.self.cycles-pp.__free_one_page
> 0.20 2% +0.3 0.47 perf-profile.self.cycles-pp.__list_del_entry_valid_or_report
> 0.00 +2.1 2.07 2% perf-profile.self.cycles-pp.acpi_safe_halt
> 49.78 +2.7 52.49 perf-profile.self.cycles-pp.native_queued_spin_lock_slowpath
> 3.68 +8.0 11.64 perf-profile.self.cycles-pp.copyout
>
>
>
>
> Disclaimer:
> Results have been estimated based on internal Intel analysis and are provided
> for informational purposes only. Any difference in system hardware or software
> design or configuration may affect actual performance.
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index 585c66fce9d9..f1e79263fe61 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -950,6 +950,7 @@ static int cacheinfo_cpu_online(unsigned int cpu)
 	if (rc)
 		goto err;
 	update_per_cpu_data_slice_size(true, cpu);
+	setup_pcp_cacheinfo();
 	return 0;
 err:
 	free_cache_attributes(cpu);
@@ -963,6 +964,7 @@ static int cacheinfo_cpu_pre_down(unsigned int cpu)
 
 	free_cache_attributes(cpu);
 	update_per_cpu_data_slice_size(false, cpu);
+	setup_pcp_cacheinfo();
 	return 0;
 }
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index 665f06675c83..665edc11fb9f 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -325,6 +325,7 @@ void drain_all_pages(struct zone *zone);
 void drain_local_pages(struct zone *zone);
 
 void page_alloc_init_late(void);
+void setup_pcp_cacheinfo(void);
 
 /*
  * gfp_allowed_mask is set to GFP_BOOT_MASK during early boot to restrict what
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 19c40a6f7e45..cdff247e8c6f 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -682,8 +682,14 @@ enum zone_watermarks {
  * PCPF_PREV_FREE_HIGH_ORDER: a high-order page is freed in the
  * previous page freeing.  To avoid to drain PCP for an accident
  * high-order page freeing.
+ *
+ * PCPF_FREE_HIGH_BATCH: preserve "pcp->batch" pages in PCP before
+ * draining PCP for consecutive high-order pages freeing without
+ * allocation if data cache slice of CPU is large enough.  To reduce
+ * zone lock contention and keep cache-hot pages reusing.
  */
 #define	PCPF_PREV_FREE_HIGH_ORDER	BIT(0)
+#define	PCPF_FREE_HIGH_BATCH		BIT(1)
 
 struct per_cpu_pages {
 	spinlock_t lock;	/* Protects lists field */
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 295e61f0c49d..ba2d8f06523e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -52,6 +52,7 @@
 #include <linux/psi.h>
 #include <linux/khugepaged.h>
 #include <linux/delayacct.h>
+#include <linux/cacheinfo.h>
 #include <asm/div64.h>
 #include "internal.h"
 #include "shuffle.h"
@@ -2385,7 +2386,9 @@ static void free_unref_page_commit(struct zone *zone, struct per_cpu_pages *pcp,
 	 */
 	if (order && order <= PAGE_ALLOC_COSTLY_ORDER) {
 		free_high = (pcp->free_factor &&
-			     (pcp->flags & PCPF_PREV_FREE_HIGH_ORDER));
+			     (pcp->flags & PCPF_PREV_FREE_HIGH_ORDER) &&
+			     (!(pcp->flags & PCPF_FREE_HIGH_BATCH) ||
+			      pcp->count >= READ_ONCE(pcp->batch)));
 		pcp->flags |= PCPF_PREV_FREE_HIGH_ORDER;
 	} else if (pcp->flags & PCPF_PREV_FREE_HIGH_ORDER) {
 		pcp->flags &= ~PCPF_PREV_FREE_HIGH_ORDER;
@@ -5418,6 +5421,39 @@ static void zone_pcp_update(struct zone *zone, int cpu_online)
 	mutex_unlock(&pcp_batch_high_lock);
 }
 
+static void zone_pcp_update_cacheinfo(struct zone *zone)
+{
+	int cpu;
+	struct per_cpu_pages *pcp;
+	struct cpu_cacheinfo *cci;
+
+	for_each_online_cpu(cpu) {
+		pcp = per_cpu_ptr(zone->per_cpu_pageset, cpu);
+		cci = get_cpu_cacheinfo(cpu);
+		/*
+		 * If data cache slice of CPU is large enough, "pcp->batch"
+		 * pages can be preserved in PCP before draining PCP for
+		 * consecutive high-order pages freeing without allocation.
+		 * This can reduce zone lock contention without hurting
+		 * cache-hot pages sharing.
+		 */
+		spin_lock(&pcp->lock);
+		if ((cci->per_cpu_data_slice_size >> PAGE_SHIFT) > 3 * pcp->batch)
+			pcp->flags |= PCPF_FREE_HIGH_BATCH;
+		else
+			pcp->flags &= ~PCPF_FREE_HIGH_BATCH;
+		spin_unlock(&pcp->lock);
+	}
+}
+
+void setup_pcp_cacheinfo(void)
+{
+	struct zone *zone;
+
+	for_each_populated_zone(zone)
+		zone_pcp_update_cacheinfo(zone);
+}
+
 /*
  * Allocate per cpu pagesets and initialize them.
  * Before this call only boot pagesets were available.