Message ID | 20230306154548.655799-1-oss@malat.biz |
---|---|
State | New |
Headers |
From: Petr Malat <oss@malat.biz>
To: linux-kernel@vger.kernel.org
Cc: paulmck@kernel.org, tglx@linutronix.de, bigeasy@linutronix.de, nsaenzju@redhat.com, frederic@kernel.org, Petr Malat <oss@malat.biz>
Subject: [PATCH] softirq: Do not loop if running under a real-time task
Date: Mon, 6 Mar 2023 16:45:48 +0100
Message-Id: <20230306154548.655799-1-oss@malat.biz>
X-Mailer: git-send-email 2.30.2
List-ID: <linux-kernel.vger.kernel.org> |
Series |
softirq: Do not loop if running under a real-time task
|
|
Commit Message
Petr Malat
March 6, 2023, 3:45 p.m. UTC
Softirq processing can be a source of a scheduling jitter if it executes
in a real-time task as in that case need_resched() is false unless there
is another runnable task with a higher priority. This is especially bad
if the softirq processing runs in a migration thread, which has priority
99 and usually runs for a short time.
One option would be to not restart the softirq processing if there is
another runnable task to allow the high prio task to finish and yield the
CPU, the second one is to not restart if softirq executes in a real-time
task. Usually, real-time tasks don't want to be interrupted, so implement
the second option.
Signed-off-by: Petr Malat <oss@malat.biz>
---
kernel/softirq.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Comments
On 2023-03-06 16:45:48 [+0100], Petr Malat wrote:
> Softirq processing can be a source of a scheduling jitter if it executes
> in a real-time task as in that case need_resched() is false unless there
> is another runnable task with a higher priority. This is especially bad
> if the softirq processing runs in a migration thread, which has priority
> 99 and usually runs for a short time.
>
> One option would be to not restart the softirq processing if there is
> another runnable task to allow the high prio task to finish and yield the
> CPU, the second one is to not restart if softirq executes in a real-time
> task. Usually, real-time tasks don't want to be interrupted, so implement
> the second option.

This affects only PREEMPT_RT, right?
I have plans to redo parts of it. You shouldn't enter ksoftirqd to
begin with. There is this ktimerd in v6.1 which mitigates this to some
point and I plan to extend it to also cover the sched-softirq.
Other than that, you are right in saying that the softirq must not
continue with a RT prio and that need_resched() is not visible here.
However ksoftirqd itself must be able to do loops unless the
need-resched flag is seen.

Since you mentioned migration thread, how often do you see this or how
does this trigger?

> Signed-off-by: Petr Malat <oss@malat.biz>

Sebastian
On Wed, Mar 08, 2023 at 10:14:58AM +0100, Sebastian Andrzej Siewior wrote:
> On 2023-03-06 16:45:48 [+0100], Petr Malat wrote:
> > Softirq processing can be a source of a scheduling jitter if it executes
> > in a real-time task as in that case need_resched() is false unless there
> > is another runnable task with a higher priority. This is especially bad
> > if the softirq processing runs in a migration thread, which has priority
> > 99 and usually runs for a short time.
> >
> > One option would be to not restart the softirq processing if there is
> > another runnable task to allow the high prio task to finish and yield the
> > CPU, the second one is to not restart if softirq executes in a real-time
> > task. Usually, real-time tasks don't want to be interrupted, so implement
> > the second option.
>
> This affects only PREEMPT_RT, right?

I have observed the issue on a 5.15 CONFIG_PREEMPT=y arm32 kernel.

> I have plans to redo parts of it. You shouldn't enter ksoftirqd to
> begin with. There is this ktimerd in v6.1 which mitigates this to some
> point and I plan to extend it to also cover the sched-softirq.
> Other than that, you are right in saying that the softirq must not
> continue with a RT prio and that need_resched() is not visible here.
> However ksoftirqd itself must be able to do loops unless the
> need-resched flag is seen.
>
> Since you mentioned migration thread, how often do you see this or how
> does this trigger?

I have seen only one occurrence for which I have a back trace available
(from hundreds of systems). I think that's because on my system it may
occur only if it hits the migration thread; otherwise there are more
runnable threads of the same priority and need_resched() breaks the loop.

I obtained the stack trace by making a debugging module which uses a
periodic timer to monitor active tasks and dumps the stack when it finds
something fishy.
This is what I got:
 [<bf84f559>] (hogger_handler [hogger]) from [<c04850ef>] (__hrtimer_run_queues+0x13f/0x2f4)
 [<c04850ef>] (__hrtimer_run_queues) from [<c04858a5>] (hrtimer_interrupt+0xc9/0x1c4)
 [<c04858a5>] (hrtimer_interrupt) from [<c0810533>] (arch_timer_handler_phys+0x27/0x2c)
 [<c0810533>] (arch_timer_handler_phys) from [<c046de3b>] (handle_percpu_devid_irq+0x5b/0x1e4)
 [<c046de3b>] (handle_percpu_devid_irq) from [<c0469a27>] (__handle_domain_irq+0x53/0x94)
 [<c0469a27>] (__handle_domain_irq) from [<c041e501>] (axxia_gic_handle_irq+0x16d/0x1bc)
 [<c041e501>] (axxia_gic_handle_irq) from [<c0400ad3>] (__irq_svc+0x53/0x94)
 Exception stack(0xc1595ca8 to 0xc1595cf0)
 [<c0400ad3>] (__irq_svc) from [<c098e404>] (_raw_spin_unlock_irqrestore+0x1c/0x3c)
 [<c098e404>] (_raw_spin_unlock_irqrestore) from [<c0446b6d>] (try_to_wake_up+0x1d9/0x5d0)
 [<c0446b6d>] (try_to_wake_up) from [<c0483d2d>] (call_timer_fn+0x31/0x16c)
 [<c0483d2d>] (call_timer_fn) from [<c048406f>] (run_timer_softirq+0x207/0x2d4)
 [<c048406f>] (run_timer_softirq) from [<c0401293>] (__do_softirq+0xd3/0x2f8)
 [<c0401293>] (__do_softirq) from [<c042876b>] (irq_exit+0x57/0x78)
 [<c042876b>] (irq_exit) from [<c0469a2b>] (__handle_domain_irq+0x57/0x94)
 [<c0469a2b>] (__handle_domain_irq) from [<c041e501>] (axxia_gic_handle_irq+0x16d/0x1bc)
 [<c041e501>] (axxia_gic_handle_irq) from [<c0400ad3>] (__irq_svc+0x53/0x94)
 Exception stack(0xc1595e78 to 0xc1595ec0)
 [<c0400ad3>] (__irq_svc) from [<c044d37c>] (active_load_balance_cpu_stop+0x1ec/0x234)
 [<c044d37c>] (active_load_balance_cpu_stop) from [<c04ac099>] (cpu_stopper_thread+0x69/0xd8)
 [<c04ac099>] (cpu_stopper_thread) from [<c0440b53>] (smpboot_thread_fn+0x9f/0x17c)
 [<c0440b53>] (smpboot_thread_fn) from [<c043ccf9>] (kthread+0x129/0x12c)
 [<c043ccf9>] (kthread) from [<c0400131>] (ret_from_fork+0x11/0x20)

I then looked into the code to see how softirqs could end up not being
offloaded to the thread, and the only explanation I have is the one I
described in the original mail.
BR, Petr
diff --git a/kernel/softirq.c b/kernel/softirq.c
index c8a6913c067d..6a66d28bf020 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -478,7 +478,8 @@ asmlinkage __visible void do_softirq(void)
 
 /*
  * We restart softirq processing for at most MAX_SOFTIRQ_RESTART times,
- * but break the loop if need_resched() is set or after 2 ms.
+ * but break the loop after 2 ms or if need_resched() is set or if we
+ * execute in a real-time task.
  * The MAX_SOFTIRQ_TIME provides a nice upper bound in most cases, but in
  * certain cases, such as stop_machine(), jiffies may cease to
  * increment and so we need the MAX_SOFTIRQ_RESTART limit as
@@ -589,6 +590,7 @@ asmlinkage __visible void __softirq_entry __do_softirq(void)
 	pending = local_softirq_pending();
 	if (pending) {
 		if (time_before(jiffies, end) && !need_resched() &&
+		    (current->prio >= MAX_RT_PRIO || current == __this_cpu_read(ksoftirqd)) &&
 		    --max_restart)
 			goto restart;
 
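For readers who want to experiment with the restart condition outside the kernel, the following is a minimal userspace sketch of the predicate the second hunk introduces. It is not kernel code: `struct softirq_ctx` and `should_restart()` are hypothetical names made up for this illustration, the boolean fields stand in for `time_before(jiffies, end)`, `need_resched()` and the `current == __this_cpu_read(ksoftirqd)` comparison, and `MAX_RT_PRIO` is assumed to be 100 with lower `prio` values meaning higher priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed kernel value: tasks with prio below this are real-time. */
#define MAX_RT_PRIO 100

/* Hypothetical stand-in for the state __do_softirq() consults. */
struct softirq_ctx {
	int  prio;          /* models current->prio; lower value = higher priority */
	bool is_ksoftirqd;  /* models current == __this_cpu_read(ksoftirqd) */
	bool time_left;     /* models time_before(jiffies, end) */
	bool need_resched;  /* models need_resched() */
	int  max_restart;   /* remaining restart budget */
};

/*
 * Mirrors the patched condition: restart only while time and budget
 * remain, no reschedule is pending, and the current task is either
 * not real-time or is ksoftirqd itself. As in the patch, the budget
 * is decremented only when all earlier checks pass.
 */
bool should_restart(struct softirq_ctx *c)
{
	assert(c->max_restart > 0);
	return c->time_left && !c->need_resched &&
	       (c->prio >= MAX_RT_PRIO || c->is_ksoftirqd) &&
	       --c->max_restart;
}
```

With this model, a normal task (prio 120) restarts and consumes one unit of budget, a real-time task that is not ksoftirqd never restarts and leaves the budget untouched, and ksoftirqd keeps looping even if it has been given a real-time priority.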