From patchwork Fri Apr 28 17:00:05 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Imran Khan
X-Patchwork-Id: 88688
From: Imran Khan
To: peterz@infradead.org, paulmck@kernel.org, jgross@suse.com, vschneid@redhat.com, yury.norov@gmail.com, tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org
Subject: [RFC PATCH 1/2] smp: Reduce logging due to dump_stack of CSD waiters.
Date: Sat, 29 Apr 2023 03:00:05 +1000
Message-Id: <20230428170006.1241472-2-imran.f.khan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230428170006.1241472-1-imran.f.khan@oracle.com>
References: <20230428170006.1241472-1-imran.f.khan@oracle.com>

If a waiter is waiting for a CSD lock, its call stack will not change
between the first and subsequent hang detections for the same CSD lock.
So call dump_stack() for the waiter only on the first detection.

This avoids excessive logging on large-scale systems (with hundreds of
CPUs), where repeated dump_stack() output from hundreds of CPUs can
flood the console.

Signed-off-by: Imran Khan
Reviewed-by: Paul E. McKenney
---
 kernel/smp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index ab3e5dad6cfe9..b7ccba677a0a0 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -248,7 +248,8 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
 			arch_send_call_function_single_ipi(cpu);
 		}
 	}
-	dump_stack();
+	if (firsttime)
+		dump_stack();
 	*ts1 = ts2;
 
 	return false;

From patchwork Fri Apr 28 17:00:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Imran Khan
X-Patchwork-Id: 88689
From: Imran Khan
To: peterz@infradead.org, paulmck@kernel.org, jgross@suse.com, vschneid@redhat.com, yury.norov@gmail.com, tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org
Subject: [RFC PATCH 2/2] smp: Reduce NMI traffic from CSD waiters to CSD destination.
Date: Sat, 29 Apr 2023 03:00:06 +1000
Message-Id: <20230428170006.1241472-3-imran.f.khan@oracle.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230428170006.1241472-1-imran.f.khan@oracle.com>
References: <20230428170006.1241472-1-imran.f.khan@oracle.com>

On systems with hundreds of CPUs, if a few hundred or most of the CPUs
detect a CSD hang, then all of these waiters end up sending an NMI to
the destination CPU to dump its backtrace.

Depending on the number of such NMIs, the destination CPU can spend a
significant amount of time handling them, making it harder for this CPU
to address the pending CSDs in a timely manner. In the worst case, by
the time the destination CPU has handled all of these backtrace NMIs,
the CSD wait time may have elapsed again, so all of the waiters send
backtrace NMIs again and the cycle repeats.

To avoid this scenario, issue the backtrace NMI only from the first
waiter. The other waiters on the same CSD destination can make use of
the backtrace obtained via the first waiter's NMI.

Signed-off-by: Imran Khan
---
 kernel/smp.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index b7ccba677a0a0..a1cd21ea8b308 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -43,6 +43,8 @@ static DEFINE_PER_CPU_ALIGNED(struct call_function_data, cfd_data);
 
 static DEFINE_PER_CPU_SHARED_ALIGNED(struct llist_head, call_single_queue);
 
+static DEFINE_PER_CPU(atomic_t, trigger_backtrace) = ATOMIC_INIT(1);
+
 static void __flush_smp_call_function_queue(bool warn_cpu_offline);
 
 int smpcfd_prepare_cpu(unsigned int cpu)
@@ -242,7 +244,8 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
 				 *bug_id, !cpu_cur_csd ? "unresponsive" : "handling this request");
 	}
 	if (cpu >= 0) {
-		dump_cpu_task(cpu);
+		if (atomic_cmpxchg_acquire(&per_cpu(trigger_backtrace, cpu), 1, 0))
+			dump_cpu_task(cpu);
 		if (!cpu_cur_csd) {
 			pr_alert("csd: Re-sending CSD lock (#%d) IPI from CPU#%02d to CPU#%02d\n", *bug_id, raw_smp_processor_id(), cpu);
 			arch_send_call_function_single_ipi(cpu);
@@ -423,9 +426,14 @@ static void __flush_smp_call_function_queue(bool warn_cpu_offline)
 	struct llist_node *entry, *prev;
 	struct llist_head *head;
 	static bool warned;
+	atomic_t *tbt;
 
 	lockdep_assert_irqs_disabled();
 
+	/* Allow waiters to send backtrace NMI from here onwards */
+	tbt = this_cpu_ptr(&trigger_backtrace);
+	atomic_set_release(tbt, 1);
+
 	head = this_cpu_ptr(&call_single_queue);
 	entry = llist_del_all(head);
 	entry = llist_reverse_order(entry);
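
For illustration only, the one-shot gate used in patch 2/2 can be modelled
in user space with C11 atomics. This is a minimal sketch under stated
assumptions, not kernel code: SKETCH_NR_CPUS, waiter_should_dump() and
destination_rearm() are hypothetical names, a plain array stands in for the
per-CPU trigger_backtrace variable, and
atomic_compare_exchange_strong_explicit() stands in for
atomic_cmpxchg_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4	/* hypothetical CPU count, for the sketch only */

/* One gate per destination CPU; 1 means "a backtrace may be requested". */
static atomic_int trigger_backtrace[SKETCH_NR_CPUS] = { 1, 1, 1, 1 };

/*
 * Waiter side (hypothetical helper): returns true only for the first
 * waiter that targets @cpu since the gate was last re-armed.
 */
static bool waiter_should_dump(int cpu)
{
	int expected = 1;

	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	/* Try to swap 1 -> 0; only the caller that observed 1 succeeds. */
	return atomic_compare_exchange_strong_explicit(&trigger_backtrace[cpu],
						       &expected, 0,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/*
 * Destination side (hypothetical helper): re-arm the gate, mirroring the
 * atomic_set_release() added at the start of the queue flush.
 */
static void destination_rearm(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	atomic_store_explicit(&trigger_backtrace[cpu], 1, memory_order_release);
}
```

In this model, any number of waiters may call waiter_should_dump(cpu), but
only one of them gets true until destination_rearm(cpu) runs. The others
rely on the backtrace already requested by the winner.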