Message ID | 20240102073351.1527503-1-alexs@kernel.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-14116-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:7301:6f82:b0:100:9c79:88ff with SMTP id tb2csp4313036dyb; Mon, 1 Jan 2024 23:32:38 -0800 (PST) X-Google-Smtp-Source: AGHT+IFpBByqROLqGGqm4IPo4JN6mM3GG1zl0zknbahgW4gZHpIKzoxZtpl5RWD955qkq08G7sF0 X-Received: by 2002:ad4:5b8e:0:b0:67f:49af:8ba1 with SMTP id 14-20020ad45b8e000000b0067f49af8ba1mr24213262qvp.68.1704180757895; Mon, 01 Jan 2024 23:32:37 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1704180757; cv=none; d=google.com; s=arc-20160816; b=rA8hk66Gc/y5jr7sZYKeWW00o4OKth1ZEnAVzxYzMHjKTD77WSYjcBAqHLk7BlbJK8 kBzQOjqoxF3Rsz4tXsejTCkxerZh/4OPaa00lGYi0Wt0SUB5kuTCQ3HVQs9sldwDHSje ABVtWiPWk0GRJU1BgcSAWW/xPamWc0cLJ5h2Qyl+GCWM5MbJQRRBn/qMn9UgxCxDFkix 7ZaeoYah9oXyCaZjDnue1gXsBu8Ir+Pv24C/pTMcHiJ1eysbNRYTvuawMu2mbEOz2BtT wvifXoBfdOQOvZv1PNZ3/88bo01A08uxHpK8VPQzUK9PTp1sI/f7M5dR5tHx/pz60cYH 1VXQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:message-id:date:subject:cc:to :from:dkim-signature; bh=qJjUIUInDVD7OsJS68gW9bZ3E+xtjqhB6saMdJYbhgk=; fh=anjk1Egm383vfZAe+uFZ4BcIo6SzVgNVjjnhXgCkdws=; b=o2JpX1nFsr05GX2pap3wwk9TywWKxvzaXYZMn9xogOFeCxtxp5xKCruo2V8n83u5Hp iZhdl7PZ4z+J+eULb9a5JVAXHJ3/UtV5iafXX55CcKQ9DA4jKX+FRzcjmn9X+RJmncEL 3y9dfHQ2Qy5FybBXl5Jxw7OLPloMzUqYY98OLsszTDu+U9rnFOyNX8FRCC+aOBjJg7Oj 2+EFRqd3qjypT4oKlmg+DhzXhXLOrLlu4OpOJ6+yBQDa6PR1DWZy/E4YnezPeY4OVQ+k 1/EvvzWrPF4F/ftf8//DXkjtlu+9O5SKySfjT8qHaoGYd5O5jlc7sLgDKddtmfMGAv2i yjxw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=lfcmYcFy; spf=pass (google.com: domain of linux-kernel+bounces-14116-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-14116-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id b18-20020a0cc992000000b0067f84fac63fsi25472461qvk.10.2024.01.01.23.32.37 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 01 Jan 2024 23:32:37 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-14116-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=lfcmYcFy; spf=pass (google.com: domain of linux-kernel+bounces-14116-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-14116-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id ABB381C21385 for <ouuuleilei@gmail.com>; Tue, 2 Jan 2024 07:32:37 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id B055363CC; Tue, 2 Jan 2024 07:32:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="lfcmYcFy" X-Original-To: linux-kernel@vger.kernel.org Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2292E53A5 for <linux-kernel@vger.kernel.org>; Tue, 2 Jan 2024 07:32:23 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id AFDF6C433C8; Tue, 2 Jan 2024 07:32:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1704180743; bh=OV5kQZVtgYNjmq8xkJXDaZLj5pV6ouoj+mekcsjkUbw=; h=From:To:Cc:Subject:Date:From; b=lfcmYcFy9H2YhEXtMCT30jydaiWMpym50FEYmk4ujJgEuOo82ndTjafV3Jwem+h6o hrKzNvd67MsZbUwccQzsqVxe45tO2SVdL6tpbuoy/Z6aTjSDFeQf1NDy0odHpWgOvm IjxOND1U5cuuFOyxnV3MvjaHwbrPtHH1wvY+JfvP+riFcgWv1v4eY4uk4kvNe3r3KP mh9xPp45pOk46m+wFFgL0A2LCZJeSbh7ModG+D2fDaxGspJKuMuUTsPgZEgtRwVwdI va9u9pzENaG1AxuTTxbaVsyprQyCkwn4jnZH/E/zvcRNuuAizLH2I09BgwrcUDt2uI jJ8gPLcJmd3vg== From: alexs@kernel.org To: Ingo Molnar <mingo@redhat.com>, Peter Zijlstra <peterz@infradead.org>, Juri Lelli <juri.lelli@redhat.com>, Vincent Guittot <vincent.guittot@linaro.org>, Dietmar Eggemann <dietmar.eggemann@arm.com>, Steven Rostedt <rostedt@goodmis.org>, Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>, Daniel Bristot de Oliveira <bristot@redhat.com>, Valentin Schneider <vschneid@redhat.com>, linux-kernel@vger.kernel.org Cc: curuwang@tencent.com, Alex Shi <alexs@kernel.org> Subject: [PATCH] sched/tracing: correct the task blocking state Date: Tue, 2 Jan 2024 15:33:51 +0800 Message-ID: <20240102073351.1527503-1-alexs@kernel.org> X-Mailer: git-send-email 2.43.0 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1786963042088082981 X-GMAIL-MSGID: 1786963042088082981 |
Series |
sched/tracing: correct the task blocking state
|
|
Commit Message
alexs@kernel.org
Jan. 2, 2024, 7:33 a.m. UTC
From: Alex Shi <alexs@kernel.org> commit 80ed87c8a9ca ("sched/wait: Introduce TASK_NOLOAD and TASK_IDLE") stopped the idle kthreads contribution to loadavg. Also task idle should separated from blocked state too, otherwise we will get incorrect task blocking state from event tracing sched:sched_stat_blocked. Originally-from: Curu Wong <curuwang@tencent.com> Signed-off-by: Alex Shi <alexs@kernel.org> To: linux-kernel@vger.kernel.org To: Valentin Schneider <vschneid@redhat.com> To: Daniel Bristot de Oliveira <bristot@redhat.com> To: Mel Gorman <mgorman@suse.de> To: Ben Segall <bsegall@google.com> To: Steven Rostedt <rostedt@goodmis.org> To: Dietmar Eggemann <dietmar.eggemann@arm.com> To: Vincent Guittot <vincent.guittot@linaro.org> To: Juri Lelli <juri.lelli@redhat.com> To: Peter Zijlstra <peterz@infradead.org> To: Ingo Molnar <mingo@redhat.com> --- include/linux/sched.h | 4 ++++ kernel/sched/deadline.c | 2 +- kernel/sched/fair.c | 2 +- kernel/sched/rt.c | 2 +- 4 files changed, 7 insertions(+), 3 deletions(-)
Comments
On 02/01/24 15:33, alexs@kernel.org wrote: > From: Alex Shi <alexs@kernel.org> > > commit 80ed87c8a9ca ("sched/wait: Introduce TASK_NOLOAD and TASK_IDLE") > stopped the idle kthreads contribution to loadavg. Also task idle should > separated from blocked state too, otherwise we will get incorrect task > blocking state from event tracing sched:sched_stat_blocked. > Why is that incorrect? AFAICT we have mapped the (schedstat) 'blocked' meaning to TASK_UNINTERRUPTIBLE. TASK_IDLE tasks don't contribute to loadavg yes, but they are still in an UNINTERRUPTIBLE wait. > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > index b28114478b82..b6afa596f071 100644 > --- a/kernel/sched/deadline.c > +++ b/kernel/sched/deadline.c > @@ -1570,7 +1570,7 @@ update_stats_dequeue_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se, > __schedstat_set(p->stats.sleep_start, > rq_clock(rq_of_dl_rq(dl_rq))); > > - if (state & TASK_UNINTERRUPTIBLE) > + if (is_blocked_task_state(state)) > __schedstat_set(p->stats.block_start, > rq_clock(rq_of_dl_rq(dl_rq))); This change makes it so tasks waiting in TASK_IDLE have their waiting ignored by schedstat (they are seen as neither INTERRUPTIBLE nor UNINTERRUPTIBLE).
On Tue, Jan 2, 2024 at 6:19 PM Valentin Schneider <vschneid@redhat.com> wrote: > > On 02/01/24 15:33, alexs@kernel.org wrote: > > From: Alex Shi <alexs@kernel.org> > > > > commit 80ed87c8a9ca ("sched/wait: Introduce TASK_NOLOAD and TASK_IDLE") > > stopped the idle kthreads contribution to loadavg. Also task idle should > > separated from blocked state too, otherwise we will get incorrect task > > blocking state from event tracing sched:sched_stat_blocked. > > > > Why is that incorrect? AFAICT we have mapped the (schedstat) 'blocked' > meaning to TASK_UNINTERRUPTIBLE. TASK_IDLE tasks don't contribute to > loadavg yes, but they are still in an UNINTERRUPTIBLE wait. Hi Valentin, Thanks a lot for the reply. I agree with you the current usage, but if so, we account for the idle task into blocked state. And it's better to distinguish between idle and block. > > > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > > index b28114478b82..b6afa596f071 100644 > > --- a/kernel/sched/deadline.c > > +++ b/kernel/sched/deadline.c > > @@ -1570,7 +1570,7 @@ update_stats_dequeue_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se, > > __schedstat_set(p->stats.sleep_start, > > rq_clock(rq_of_dl_rq(dl_rq))); > > > > - if (state & TASK_UNINTERRUPTIBLE) > > + if (is_blocked_task_state(state)) > > __schedstat_set(p->stats.block_start, > > rq_clock(rq_of_dl_rq(dl_rq))); > > This change makes it so tasks waiting in TASK_IDLE have their waiting > ignored by schedstat (they are seen as neither INTERRUPTIBLE nor UNINTERRUPTIBLE). > Right, maybe it's time to add a new state for TASK_IDLE? Many thanks! Alex Shi
On 02/01/24 21:00, Alex Shi wrote: > On Tue, Jan 2, 2024 at 6:19 PM Valentin Schneider <vschneid@redhat.com> wrote: >> >> On 02/01/24 15:33, alexs@kernel.org wrote: >> > From: Alex Shi <alexs@kernel.org> >> > >> > commit 80ed87c8a9ca ("sched/wait: Introduce TASK_NOLOAD and TASK_IDLE") >> > stopped the idle kthreads contribution to loadavg. Also task idle should >> > separated from blocked state too, otherwise we will get incorrect task >> > blocking state from event tracing sched:sched_stat_blocked. >> > >> >> Why is that incorrect? AFAICT we have mapped the (schedstat) 'blocked' >> meaning to TASK_UNINTERRUPTIBLE. TASK_IDLE tasks don't contribute to >> loadavg yes, but they are still in an UNINTERRUPTIBLE wait. > > > Hi Valentin, > Thanks a lot for the reply. > > I agree with you the current usage, but if so, we account for the idle task into > blocked state. And it's better to distinguish between idle and block. > Why is that an issue? If those tasks didn't have to be TASK_UNINTERRUPTIBLE (via TASK_IDLE), we'd make them TASK_INTERRUPTIBLE and they'd also inflate the 'sleeping' schedstat (rather than the 'blocked'). What problem are you facing with those tasks being flagged as blocked during their wait?
On Wed, Jan 3, 2024 at 12:16 AM Valentin Schneider <vschneid@redhat.com> wrote: > > On 02/01/24 21:00, Alex Shi wrote: > > On Tue, Jan 2, 2024 at 6:19 PM Valentin Schneider <vschneid@redhat.com> wrote: > >> > >> On 02/01/24 15:33, alexs@kernel.org wrote: > >> > From: Alex Shi <alexs@kernel.org> > >> > > >> > commit 80ed87c8a9ca ("sched/wait: Introduce TASK_NOLOAD and TASK_IDLE") > >> > stopped the idle kthreads contribution to loadavg. Also task idle should > >> > separated from blocked state too, otherwise we will get incorrect task > >> > blocking state from event tracing sched:sched_stat_blocked. > >> > > >> > >> Why is that incorrect? AFAICT we have mapped the (schedstat) 'blocked' > >> meaning to TASK_UNINTERRUPTIBLE. TASK_IDLE tasks don't contribute to > >> loadavg yes, but they are still in an UNINTERRUPTIBLE wait. > > > > > > Hi Valentin, > > Thanks a lot for the reply. > > > > I agree with you the current usage, but if so, we account for the idle task into > > blocked state. And it's better to distinguish between idle and block. > > > > Why is that an issue? If those tasks didn't have to be > TASK_UNINTERRUPTIBLE (via TASK_IDLE), we'd make them TASK_INTERRUPTIBLE and > they'd also inflate the 'sleeping' schedstat (rather than the 'blocked'). > > What problem are you facing with those tasks being flagged as blocked during > their wait? > Uh, Tencent cloud has some latency sensitive services, a blocked state means the service has some trouble, but with IDLE state involved, it's failed on this judgement. and 2nd, if a service has abnormal, we want to check if it's hanging on io or sth else, but the top 3 D tasks are often queuework in our system, and even a task in blocked state we have no quick way to figure out if it's IDLE or BLOCKED. 2 different states will help us a lot. Thanks!
On Tue, Jan 2, 2024 at 6:19 PM Valentin Schneider <vschneid@redhat.com> wrote: > > On 02/01/24 15:33, alexs@kernel.org wrote: > > From: Alex Shi <alexs@kernel.org> > > > > commit 80ed87c8a9ca ("sched/wait: Introduce TASK_NOLOAD and TASK_IDLE") > > stopped the idle kthreads contribution to loadavg. Also task idle should > > separated from blocked state too, otherwise we will get incorrect task > > blocking state from event tracing sched:sched_stat_blocked. > > > > Why is that incorrect? AFAICT we have mapped the (schedstat) 'blocked' > meaning to TASK_UNINTERRUPTIBLE. TASK_IDLE tasks don't contribute to > loadavg yes, but they are still in an UNINTERRUPTIBLE wait. > > > diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c > > index b28114478b82..b6afa596f071 100644 > > --- a/kernel/sched/deadline.c > > +++ b/kernel/sched/deadline.c > > @@ -1570,7 +1570,7 @@ update_stats_dequeue_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se, > > __schedstat_set(p->stats.sleep_start, > > rq_clock(rq_of_dl_rq(dl_rq))); > > > > - if (state & TASK_UNINTERRUPTIBLE) > > + if (is_blocked_task_state(state)) > > __schedstat_set(p->stats.block_start, > > rq_clock(rq_of_dl_rq(dl_rq))); > > This change makes it so tasks waiting in TASK_IDLE have their waiting > ignored by schedstat (they are seen as neither INTERRUPTIBLE nor UNINTERRUPTIBLE). Right, I will fix it by adding idle time into sleep. will send the 2nd version patch. Thanks Alex
On 03/01/24 11:14, Alex Shi wrote: > On Wed, Jan 3, 2024 at 12:16 AM Valentin Schneider <vschneid@redhat.com> wrote: >> >> Why is that an issue? If those tasks didn't have to be >> TASK_UNINTERRUPTIBLE (via TASK_IDLE), we'd make them TASK_INTERRUPTIBLE and >> they'd also inflate the 'sleeping' schedstat (rather than the 'blocked'). >> >> What problem are you facing with those tasks being flagged as blocked during >> their wait? >> > > Uh, Tencent cloud has some latency sensitive services, a blocked state > means the service has > some trouble, but with IDLE state involved, it's failed on this judgement. > and 2nd, if a service has abnormal, we want to check if it's hanging > on io or sth else, but the top > 3 D tasks are often queuework in our system, and even a task in > blocked state we have no > quick way to figure out if it's IDLE or BLOCKED. 2 different states > will help us a lot. > That's useful information - generally it's good to add the motivation for a patch in the changelog. > Thanks!
On Wed, Jan 3, 2024 at 9:56 PM Valentin Schneider <vschneid@redhat.com> wrote: > > On 03/01/24 11:14, Alex Shi wrote: > > On Wed, Jan 3, 2024 at 12:16 AM Valentin Schneider <vschneid@redhat.com> wrote: > >> > >> Why is that an issue? If those tasks didn't have to be > >> TASK_UNINTERRUPTIBLE (via TASK_IDLE), we'd make them TASK_INTERRUPTIBLE and > >> they'd also inflate the 'sleeping' schedstat (rather than the 'blocked'). > >> > >> What problem are you facing with those tasks being flagged as blocked during > >> their wait? > >> > > > > Uh, Tencent cloud has some latency sensitive services, a blocked state > > means the service has > > some trouble, but with IDLE state involved, it's failed on this judgement. > > and 2nd, if a service has abnormal, we want to check if it's hanging > > on io or sth else, but the top > > 3 D tasks are often queuework in our system, and even a task in > > blocked state we have no > > quick way to figure out if it's IDLE or BLOCKED. 2 different states > > will help us a lot. > > > > That's useful information - generally it's good to add the motivation for a > patch in the changelog. Thanks a lot for coaching, I changed the commit log in the v2 patch: https://lore.kernel.org/lkml/20240103081042.1549189-1-alexs@kernel.org/ Btw, which way is usual prefered, send v2 patch in a separate thread or add '--in-reply-to' to follow this thread? Thanks Alex > > > Thanks! >
On 04/01/24 15:56, Alex Shi wrote: > On Wed, Jan 3, 2024 at 9:56 PM Valentin Schneider <vschneid@redhat.com> wrote: >> >> That's useful information - generally it's good to add the motivation for a >> patch in the changelog. > > Thanks a lot for coaching, I changed the commit log in the v2 patch: > https://lore.kernel.org/lkml/20240103081042.1549189-1-alexs@kernel.org/ > Btw, which way is usual prefered, send v2 patch in a separate thread > or add '--in-reply-to' to follow this thread? > New threads are usually better, it makes it clear that it is a new version of the patch and not just a reply to the previous version's thread. Reviewers/Maintainers can also jump more easily on the v(n+1) when it's a separate thread, in case they missed / didn't have time to review the previous version. > Thanks > Alex > >> >> > Thanks! >>
diff --git a/include/linux/sched.h b/include/linux/sched.h index 292c31697248..341e62255ea7 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -140,6 +140,10 @@ struct user_event_mm; #define is_special_task_state(state) \ ((state) & (__TASK_STOPPED | __TASK_TRACED | TASK_PARKED | TASK_DEAD)) +/* blocked task is UNINTERRUPTIBLE but not NOLOAD */ +#define is_blocked_task_state(state) \ + ((state) & TASK_UNINTERRUPTIBLE && (!((state) & TASK_NOLOAD))) + #ifdef CONFIG_DEBUG_ATOMIC_SLEEP # define debug_normal_state_change(state_value) \ do { \ diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c index b28114478b82..b6afa596f071 100644 --- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -1570,7 +1570,7 @@ update_stats_dequeue_dl(struct dl_rq *dl_rq, struct sched_dl_entity *dl_se, __schedstat_set(p->stats.sleep_start, rq_clock(rq_of_dl_rq(dl_rq))); - if (state & TASK_UNINTERRUPTIBLE) + if (is_blocked_task_state(state)) __schedstat_set(p->stats.block_start, rq_clock(rq_of_dl_rq(dl_rq))); } diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index d7a3c63a2171..349b0c5104b6 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -1281,7 +1281,7 @@ update_stats_dequeue_fair(struct cfs_rq *cfs_rq, struct sched_entity *se, int fl if (state & TASK_INTERRUPTIBLE) __schedstat_set(tsk->stats.sleep_start, rq_clock(rq_of(cfs_rq))); - if (state & TASK_UNINTERRUPTIBLE) + if (is_blocked_task_state(state)) __schedstat_set(tsk->stats.block_start, rq_clock(rq_of(cfs_rq))); } diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c index 6aaf0a3d6081..2fdf3d71428d 100644 --- a/kernel/sched/rt.c +++ b/kernel/sched/rt.c @@ -1375,7 +1375,7 @@ update_stats_dequeue_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se, __schedstat_set(p->stats.sleep_start, rq_clock(rq_of_rt_rq(rt_rq))); - if (state & TASK_UNINTERRUPTIBLE) + if (is_blocked_task_state(state)) __schedstat_set(p->stats.block_start, rq_clock(rq_of_rt_rq(rt_rq))); }