From patchwork Mon Jun 12 20:50:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 106877
From: "Paul E. McKenney"
To: peterz@infradead.org, jgross@suse.com, vschneid@redhat.com,
 yury.norov@gmail.com
Cc: linux-kernel@vger.kernel.org, imran.f.khan@oracle.com,
 kernel-team@meta.com, "Paul E . McKenney"
Subject: [PATCH csd-lock 1/2] smp: Reduce logging due to dump_stack of CSD
 waiters
Date: Mon, 12 Jun 2023 13:50:35 -0700
Message-Id: <20230612205036.292542-1-paulmck@kernel.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <3ee27fe5-cbea-46b6-adb0-48c4dde92b4f@paulmck-laptop>
References: <3ee27fe5-cbea-46b6-adb0-48c4dde92b4f@paulmck-laptop>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Imran Khan

If a waiter is waiting for CSD lock, its call stack will not change
between first and subsequent hang detection for the same CSD lock.

Therefore, do dump_stack only for first-time detection for a given
waiter. This avoids excessive logging on systems with hundreds of
CPUs where repetitive dump_stack from hundreds of CPUs would otherwise
flood the console.

Signed-off-by: Imran Khan
Cc: Peter Zijlstra
Cc: Juergen Gross
Cc: Valentin Schneider
Cc: Yury Norov
Signed-off-by: Paul E. McKenney
---
 kernel/smp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index ab3e5dad6cfe..b7ccba677a0a 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -248,7 +248,8 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
 			arch_send_call_function_single_ipi(cpu);
 		}
 	}
-	dump_stack();
+	if (firsttime)
+		dump_stack();
 	*ts1 = ts2;
 
 	return false;

From patchwork Mon Jun 12 20:50:36 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 106880
From: "Paul E. McKenney"
To: peterz@infradead.org, jgross@suse.com, vschneid@redhat.com,
 yury.norov@gmail.com
Cc: linux-kernel@vger.kernel.org, imran.f.khan@oracle.com,
 kernel-team@meta.com, "Paul E . McKenney"
Subject: [PATCH csd-lock 2/2] smp: Reduce NMI traffic from CSD waiters to CSD
 destination
Date: Mon, 12 Jun 2023 13:50:36 -0700
Message-Id: <20230612205036.292542-2-paulmck@kernel.org>
X-Mailer: git-send-email 2.40.1
In-Reply-To: <3ee27fe5-cbea-46b6-adb0-48c4dde92b4f@paulmck-laptop>
References: <3ee27fe5-cbea-46b6-adb0-48c4dde92b4f@paulmck-laptop>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

From: Imran Khan

On systems with hundreds of CPUs, if most of the CPUs detect a CSD hang,
then all of these waiting CPUs send an NMI to the destination CPU in
order to dump its backtrace.

Given enough NMIs, the destination CPU will spend much of its time
producing backtraces, thus further delaying that CPU's response to the
original CSD IPI. In the worst case, by the time the destination CPU is
done producing all of these backtrace NMIs, the CSD wait timeout will
have elapsed so that the waiters resend their backtrace NMIs, further
delaying forward progress.

Therefore, to avoid these delays, issue the backtrace NMI only from the
first waiter. The destination CPU's other waiters can make use of the
backtrace obtained from the first waiter's NMI.

Signed-off-by: Imran Khan
Cc: Peter Zijlstra
Cc: Juergen Gross
Cc: Valentin Schneider
Cc: Yury Norov
Signed-off-by: Paul E. McKenney
---
 kernel/smp.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index b7ccba677a0a..a1cd21ea8b30 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -43,6 +43,8 @@ static DEFINE_PER_CPU_ALIGNED(struct call_function_data, cfd_data);
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct llist_head, call_single_queue);
 
+static DEFINE_PER_CPU(atomic_t, trigger_backtrace) = ATOMIC_INIT(1);
+
 static void __flush_smp_call_function_queue(bool warn_cpu_offline);
 
 int smpcfd_prepare_cpu(unsigned int cpu)
@@ -242,7 +244,8 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
 			 *bug_id, !cpu_cur_csd ? "unresponsive" : "handling this request");
 	}
 	if (cpu >= 0) {
-		dump_cpu_task(cpu);
+		if (atomic_cmpxchg_acquire(&per_cpu(trigger_backtrace, cpu), 1, 0))
+			dump_cpu_task(cpu);
 		if (!cpu_cur_csd) {
 			pr_alert("csd: Re-sending CSD lock (#%d) IPI from CPU#%02d to CPU#%02d\n", *bug_id, raw_smp_processor_id(), cpu);
 			arch_send_call_function_single_ipi(cpu);
@@ -423,9 +426,14 @@ static void __flush_smp_call_function_queue(bool warn_cpu_offline)
 	struct llist_node *entry, *prev;
 	struct llist_head *head;
 	static bool warned;
+	atomic_t *tbt;
 
 	lockdep_assert_irqs_disabled();
 
+	/* Allow waiters to send backtrace NMI from here onwards */
+	tbt = this_cpu_ptr(&trigger_backtrace);
+	atomic_set_release(tbt, 1);
+
 	head = this_cpu_ptr(&call_single_queue);
 	entry = llist_del_all(head);
 	entry = llist_reverse_order(entry);
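
Reader's note on patch 2/2: the "only the first waiter sends the NMI"
gate can be tried outside the kernel. What follows is a minimal userspace
sketch in C11, not kernel code. It assumes a plain array of atomic_int
standing in for the per-CPU trigger_backtrace variable, and the names
NR_CPUS_SKETCH, init_flags(), try_claim_backtrace() and rearm_backtrace()
are hypothetical helpers invented for this illustration. The C11
compare-exchange returns a success flag, whereas the kernel's
atomic_cmpxchg_acquire() returns the old value; both are true only for
the caller that flips the flag from 1 to 0.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* One flag per destination CPU: 1 means "a backtrace may be requested". */
static atomic_int trigger_backtrace[NR_CPUS_SKETCH];

/* Arm every flag, mirroring the ATOMIC_INIT(1) initializer in the patch. */
static void init_flags(void)
{
	for (int cpu = 0; cpu < NR_CPUS_SKETCH; cpu++)
		atomic_init(&trigger_backtrace[cpu], 1);
}

/*
 * Waiter side: returns true only for the single caller that changes the
 * flag from 1 to 0. Later callers see 0 and skip the backtrace request.
 */
static bool try_claim_backtrace(int cpu)
{
	int expected = 1;

	return atomic_compare_exchange_strong_explicit(&trigger_backtrace[cpu],
						       &expected, 0,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/*
 * Destination side: re-arm the flag once the CPU starts draining its
 * queue, so that a later hang can trigger one more backtrace.
 */
static void rearm_backtrace(int cpu)
{
	atomic_store_explicit(&trigger_backtrace[cpu], 1,
			      memory_order_release);
}
```

With three waiters calling try_claim_backtrace() for the same destination
CPU, only the first call returns true. After rearm_backtrace() runs for
that CPU, the next call returns true again, which corresponds to the
atomic_set_release() added to __flush_smp_call_function_queue().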