From patchwork Wed Oct 4 08:34:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Bristot de Oliveira X-Patchwork-Id: 148178 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:612c:2a8e:b0:403:3b70:6f57 with SMTP id in14csp2609074vqb; Wed, 4 Oct 2023 01:34:58 -0700 (PDT) X-Google-Smtp-Source: AGHT+IGnw502PjVCuf8u1FrpMP1zdS2YmpCerxCsxTsFuwMzS9nsaXXLEKVpb3zMF92sR+Oi+srg X-Received: by 2002:a25:aacb:0:b0:c6f:b843:c502 with SMTP id t69-20020a25aacb000000b00c6fb843c502mr1502441ybi.25.1696408498775; Wed, 04 Oct 2023 01:34:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1696408498; cv=none; d=google.com; s=arc-20160816; b=CDU5wJs35byHVgkLhw6PyXeob8JligXwAc2pOaQV1NJfz2Sg+gAbm0GvHkNafMTWVE pgNeGDxrMScKjivHgg/nO/UXfxrBJ98CqN58BgmVW+p3lqZJ5SWzWh4LrPxiuvVokb19 zYZqhT0d/WLX0Cw/2+yExvgaB7N3P8uEHvWhSsp0jUtVHMONtIlvrgEUYxqaZm20NiBf HK7QRR/yyhVpcNr/2jRvrYm91C7iltRnJtnwcB+LFTteP+NWZKH29ITL2IJZ+pLwoQxj H8FMfEbFIV/VbIIg65O30aryU8gP9AyMnQ/pNWGQssytdJ4I7vJ5gmEXVtVf98eLNWpq VqMg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=r14+VtA2oNOgzPz4V7rv+ABPmrgLWjydK0mMiBn5oOw=; fh=bayJ+cVr10UhSV0tg0pPYVDfbzFwUe6jRgS/QS7zYvo=; b=JaG/acFR9TdArsM/J3Z/HpjZqWrXi8xTEjTzYufsN3E3q0uRfc708mrZl3C0Po4ew3 TBT5KmyQ9UZZQYoZods2ARbZBIM4d28MXsDcK3RDHz0OLnSMibbt0j/SjG6hxYi0DVM5 SPFn1JiVt4vIMA2pLRfkSUoqjBrreX0ymDfrUOhr0zgMDnu0Gi7Msudm0ZCrXg+csPaU WLZ11DSUv8Ap20IwnAbYz7FvB2Ky/4ZCdCnnuFVJzz/9B2NbAgl5ppbP/lS2x5oqiPoy S6SxQvOHyhacSJqf4T4gKLZJNn+T5BTDRppB6dLniNmd6PE/nxNwf4pXx3kPDq8LIy2t PiNw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b="ImT0ZF/w"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from agentk.vger.email (agentk.vger.email. [23.128.96.32]) by mx.google.com with ESMTPS id c17-20020a631c11000000b00584b3e195basi3061901pgc.289.2023.10.04.01.34.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 04 Oct 2023 01:34:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) client-ip=23.128.96.32; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b="ImT0ZF/w"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.32 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by agentk.vger.email (Postfix) with ESMTP id 27C9B81A05BF; Wed, 4 Oct 2023 01:34:56 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at agentk.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241691AbjJDIen (ORCPT + 18 others); Wed, 4 Oct 2023 04:34:43 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42102 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241688AbjJDIem (ORCPT ); Wed, 4 Oct 2023 04:34:42 -0400 Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D945AA7 for ; Wed, 4 Oct 2023 01:34:38 -0700 (PDT) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 018C5C433C8; Wed, 4 Oct 2023 08:34:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1696408478; bh=MS6eVdU/ecL57/ElNUj1x5YXHRLiQNWPGabx9+f+Ksk=; h=From:To:Cc:Subject:Date:From; b=ImT0ZF/w6Vu7BVGrLdeMj7FeACfBWlEEc+wyyTgeP2pLRa24N8nyEtQ3DWJiDN35e EV+2hZlKkxlNUz+gcHRwTaqFAW8dtUQ0uVYNN1bzKlHgwpHiy0S2EpW2YQKUWvvTzW pnSnu9TwDF8nZWq69IHBRan2SWfm9WGx6F0dEB0cAunUqI0RFZWVWO5ItuL1ACrNXn 9wnlLxh6xZlQfUjyXf6lieKxLnQm6zZmcUaW+JLO3K8pWkzu9M96OzhpkYGgv/tL4y RJyyrwzTcnkiwaRvG6Ccl4YyL2aArjXI341WuJ1MD3sdccsAF0uk5hmjNAskYedRsz ajWLlwlFA/5hQ== From: Daniel Bristot de Oliveira To: Steven Rostedt Cc: Daniel Bristot de Oliveira , Xie XiuQi , Masami Hiramatsu , linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org Subject: [GIT PULL] rtla: Fixes for 6.6 Date: Wed, 4 Oct 2023 10:34:28 +0200 Message-ID: <20231004083428.51747-1-bristot@kernel.org> X-Mailer: git-send-email 2.41.0 MIME-Version: 1.0 X-Spam-Status: No, score=-1.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on agentk.vger.email Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (agentk.vger.email [0.0.0.0]); Wed, 04 Oct 2023 01:34:56 -0700 (PDT) X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1778813238041737180 X-GMAIL-MSGID: 1778813238041737180 Steven, Timerlat auto-analysis: - Timerlat is reporting thread interference time without thread noise events occurrence. It was caused because the thread interference variable was not reset after the analysis of a timerlat activation that did not hit the threshold. - The IRQ handler delay is estimated from the delta of the IRQ latency reported by timerlat, and the timestamp from IRQ handler start event. If the delta is near-zero, the drift from the external clock and the trace event and/or the overhead can cause the value to be negative. If the value is negative, print a zero-delay. - IRQ handlers happening after the timerlat thread event but before the stop tracing were being reported as IRQ that happened before the *current* IRQ occurrence. Ignore Previous IRQ noise in this condition because they are valid only for the *next* timerlat activation. Timerlat user-space: - Timerlat is stopping all user-space thread if a CPU becomes offline. Do not stop the entire tool if a CPU is/become offline, but only the thread of the unavailable CPU. Stop the tool only, if all threads leave because the CPUs become/are offline. man-pages: - Fix command line example in timerlat hist man page. Please pull the latest rtla-v6.6-fixes tree, which can be found at: git://git.kernel.org/pub/scm/linux/kernel/git/bristot/linux.git rtla-v6.6-fixes Tag SHA1: 6a0df51b5c0a075a553a5ad73b1fd421f559eb6b Head SHA1: 81ec384b80ffbda752c230778d39ea620c7e3bcf Daniel Bristot de Oliveira (4): rtla/timerlat_aa: Zero thread sum after every sample analysis rtla/timerlat_aa: Fix negative IRQ delay rtla/timerlat_aa: Fix previous IRQ delay for IRQs that happens after thread sample rtla/timerlat: Do not stop user-space if a cpu is offline Xie XiuQi (1): rtla: fix a example in rtla-timerlat-hist.rst ---- Documentation/tools/rtla/rtla-timerlat-hist.rst | 4 ++-- tools/tracing/rtla/src/timerlat_aa.c | 32 ++++++++++++++++++++----- tools/tracing/rtla/src/timerlat_u.c | 6 +++-- 3 files changed, 32 insertions(+), 10 deletions(-) --------------------------- diff --git a/Documentation/tools/rtla/rtla-timerlat-hist.rst b/Documentation/tools/rtla/rtla-timerlat-hist.rst index 057db78d4095..03b7f3deb069 100644 --- a/Documentation/tools/rtla/rtla-timerlat-hist.rst +++ b/Documentation/tools/rtla/rtla-timerlat-hist.rst @@ -36,11 +36,11 @@ EXAMPLE In the example below, **rtla timerlat hist** is set to run for *10* minutes, in the cpus *0-4*, *skipping zero* only lines. Moreover, **rtla timerlat hist** will change the priority of the *timerlat* threads to run under -*SCHED_DEADLINE* priority, with a *10us* runtime every *1ms* period. The +*SCHED_DEADLINE* priority, with a *100us* runtime every *1ms* period. The *1ms* period is also passed to the *timerlat* tracer. Auto-analysis is disabled to reduce overhead :: - [root@alien ~]# timerlat hist -d 10m -c 0-4 -P d:100us:1ms -p 1ms --no-aa + [root@alien ~]# timerlat hist -d 10m -c 0-4 -P d:100us:1ms -p 1000 --no-aa # RTLA timerlat histogram # Time unit is microseconds (us) # Duration: 0 00:10:00 diff --git a/tools/tracing/rtla/src/timerlat_aa.c b/tools/tracing/rtla/src/timerlat_aa.c index e0ffe69c271c..7093fd5333be 100644 --- a/tools/tracing/rtla/src/timerlat_aa.c +++ b/tools/tracing/rtla/src/timerlat_aa.c @@ -159,6 +159,7 @@ static int timerlat_aa_irq_latency(struct timerlat_aa_data *taa_data, taa_data->thread_nmi_sum = 0; taa_data->thread_irq_sum = 0; taa_data->thread_softirq_sum = 0; + taa_data->thread_thread_sum = 0; taa_data->thread_blocking_duration = 0; taa_data->timer_irq_start_time = 0; taa_data->timer_irq_duration = 0; @@ -337,7 +338,23 @@ static int timerlat_aa_irq_handler(struct trace_seq *s, struct tep_record *recor taa_data->timer_irq_start_time = start; taa_data->timer_irq_duration = duration; - taa_data->timer_irq_start_delay = taa_data->timer_irq_start_time - expected_start; + /* + * We are dealing with two different clock sources: the + * external clock source that timerlat uses as a reference + * and the clock used by the tracer. There are also two + * moments: the time reading the clock and the timer in + * which the event is placed in the buffer (the trace + * event timestamp). If the processor is slow or there + * is some hardware noise, the difference between the + * timestamp and the external clock read can be longer + * than the IRQ handler delay, resulting in a negative + * time. If so, set IRQ start delay as 0. In the end, + * it is less relevant than the noise. + */ + if (expected_start < taa_data->timer_irq_start_time) + taa_data->timer_irq_start_delay = taa_data->timer_irq_start_time - expected_start; + else + taa_data->timer_irq_start_delay = 0; /* * not exit from idle. @@ -528,7 +545,7 @@ static int timerlat_aa_kworker_start_handler(struct trace_seq *s, struct tep_rec static void timerlat_thread_analysis(struct timerlat_aa_data *taa_data, int cpu, int irq_thresh, int thread_thresh) { - unsigned long long exp_irq_ts; + long long exp_irq_ts; int total; int irq; @@ -545,12 +562,15 @@ static void timerlat_thread_analysis(struct timerlat_aa_data *taa_data, int cpu, /* * Expected IRQ arrival time using the trace clock as the base. + * + * TODO: Add a list of previous IRQ, and then run the list backwards. */ exp_irq_ts = taa_data->timer_irq_start_time - taa_data->timer_irq_start_delay; - - if (exp_irq_ts < taa_data->prev_irq_timstamp + taa_data->prev_irq_duration) - printf(" Previous IRQ interference: \t\t up to %9.2f us\n", - ns_to_usf(taa_data->prev_irq_duration)); + if (exp_irq_ts < taa_data->prev_irq_timstamp + taa_data->prev_irq_duration) { + if (taa_data->prev_irq_timstamp < taa_data->timer_irq_start_time) + printf(" Previous IRQ interference: \t\t up to %9.2f us\n", + ns_to_usf(taa_data->prev_irq_duration)); + } /* * The delay that the IRQ suffered before starting. diff --git a/tools/tracing/rtla/src/timerlat_u.c b/tools/tracing/rtla/src/timerlat_u.c index 05e310696dd5..01dbf9a6b5a5 100644 --- a/tools/tracing/rtla/src/timerlat_u.c +++ b/tools/tracing/rtla/src/timerlat_u.c @@ -45,7 +45,7 @@ static int timerlat_u_main(int cpu, struct timerlat_u_params *params) retval = sched_setaffinity(gettid(), sizeof(set), &set); if (retval == -1) { - err_msg("Error setting user thread affinity\n"); + debug_msg("Error setting user thread affinity %d, is the CPU online?\n", cpu); exit(1); } @@ -193,7 +193,9 @@ void *timerlat_u_dispatcher(void *data) procs_count--; } } - break; + + if (!procs_count) + break; } sleep(1);