From patchwork Mon Jul 17 18:36:01 2023
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 121555
From: "Paul E. McKenney"
To: peterz@infradead.org, jgross@suse.com, vschneid@redhat.com, yury.norov@gmail.com
Cc: linux-kernel@vger.kernel.org, imran.f.khan@oracle.com, kernel-team@meta.com, "Paul E . McKenney"
Subject: [PATCH csd-lock 1/2] smp: Reduce logging due to dump_stack of CSD waiters
Date: Mon, 17 Jul 2023 11:36:01 -0700
Message-Id: <20230717183602.1099773-1-paulmck@kernel.org>
In-Reply-To: <96818440-a922-4b43-8871-50358e18b523@paulmck-laptop>
References: <96818440-a922-4b43-8871-50358e18b523@paulmck-laptop>

From: Imran Khan

If a waiter is waiting for a CSD lock, its call stack will not change
between the first and subsequent hang detections for the same CSD lock.
Therefore, do dump_stack only on the first detection for a given waiter.

This avoids excessive logging on systems with hundreds of CPUs, where
repetitive dump_stack output from hundreds of CPUs would otherwise flood
the console.

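The behavior described above can be modeled in user space roughly as
follows. This is only a sketch, not the kernel code: fake_dump_stack(),
struct csd_waiter, and report_csd_hang() are hypothetical stand-ins, and
the log message is shortened for illustration. It assumes that a waiter's
bug id stays zero until its first hang report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Counts how often the (hypothetical) stack dump was requested. */
static int stack_dumps;

/* Hypothetical stand-in for the kernel's dump_stack(). */
static void fake_dump_stack(void)
{
	stack_dumps++;
	puts("(stack trace would be printed here)");
}

/* Hypothetical per-wait state: bug_id is 0 until the first report. */
struct csd_waiter {
	int bug_id;
};

/*
 * Report one hang detection for waiter w. The short message is printed
 * every time, but the stack is dumped only on the first detection,
 * because the waiter's call stack cannot change while it keeps waiting.
 */
static void report_csd_hang(struct csd_waiter *w, int *bug_count)
{
	bool firsttime;

	assert(w && bug_count);
	firsttime = (w->bug_id == 0);
	if (firsttime)
		w->bug_id = ++*bug_count;

	printf("csd: %s non-responsive CSD lock (#%d)\n",
	       firsttime ? "Detected" : "Re-detected", w->bug_id);

	if (firsttime)
		fake_dump_stack();
}
```

Calling report_csd_hang() repeatedly for the same waiter prints one
message per call but requests only a single stack dump.
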
Signed-off-by: Imran Khan
Cc: Peter Zijlstra
Cc: Juergen Gross
Cc: Valentin Schneider
Cc: Yury Norov
Signed-off-by: Paul E. McKenney
---
 kernel/smp.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/kernel/smp.c b/kernel/smp.c
index 385179dae360..1d41a0cb54f1 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -259,7 +259,8 @@ static bool csd_lock_wait_toolong(struct __call_single_data *csd, u64 ts0, u64 *
 			arch_send_call_function_single_ipi(cpu);
 		}
 	}
-	dump_stack();
+	if (firsttime)
+		dump_stack();
 	*ts1 = ts2;
 
 	return false;

From patchwork Mon Jul 17 18:28:14 2023
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 121542
From: "Paul E. McKenney"
To: tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, john.stultz@linaro.org, sboyd@kernel.org, corbet@lwn.net, Mark.Rutland@arm.com, maz@kernel.org, kernel-team@meta.com, neeraju@codeaurora.org, ak@linux.intel.com, feng.tang@intel.com, zhengjun.xing@intel.com, daniel.lezcano@linaro.org, Yu Liao, "Paul E . McKenney"
Subject: [PATCH clocksource 2/2] x86/tsc: Extend watchdog check exemption to 4-Sockets platform
Date: Mon, 17 Jul 2023 11:28:14 -0700
Message-Id: <20230717182814.1099419-2-paulmck@kernel.org>

From: Feng Tang

There were reports again that the tsc clocksource on 4-socket x86
servers was wrongly judged as 'unstable' by 'jiffies' and other
watchdogs, and disabled [1][2].

Commit b50db7095fe0 ("x86/tsc: Disable clocksource watchdog for TSC
on qualified platorms") was introduced to deal with these false alarms
of tsc unstable issues, covering qualified platforms with 2 sockets or
fewer.

And from the history of chasing TSC issues, Thomas and Peter only saw
real TSC synchronization issues on 8-socket machines. So extend the
exemption to 4 sockets to fix the issue.

Rui also proposed another way to disable 'jiffies' as clocksource
watchdog [3], which can also solve the problem in [1] in an
architecture-independent way, but can't cure the problem in [2], whose
watchdog is HPET or PMTIMER, while 'jiffies' is mostly used as watchdog
in the boot phase.

'nr_online_nodes' is known to be inaccurate in cases such as platforms
with cpu-less memory nodes, sub-NUMA clustering enabled, fakenuma, the
kernel cmdline parameter 'maxcpus=', etc. The harmful case is the
'maxcpus' one, which could underestimate the package number and so
disable the watchdog, but the bright side is that it is mostly for
debug usage.
All these will be addressed in other patches, as discussed in thread [4].

[1]. https://lore.kernel.org/all/9d3bf570-3108-0336-9c52-9bee15767d29@huawei.com/
[2]. https://lore.kernel.org/lkml/06df410c-2177-4671-832f-339cff05b1d9@paulmck-laptop/
[3]. https://lore.kernel.org/all/bd5b97f89ab2887543fc262348d1c7cafcaae536.camel@intel.com/
[4]. https://lore.kernel.org/all/20221021062131.1826810-1-feng.tang@intel.com/

Reported-by: Yu Liao
Reported-by: Paul E. McKenney
Signed-off-by: Feng Tang
Signed-off-by: Paul E. McKenney
---
 arch/x86/kernel/tsc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 3425c6a943e4..15f97c0abc9d 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -1258,7 +1258,7 @@ static void __init check_system_tsc_reliable(void)
 	if (boot_cpu_has(X86_FEATURE_CONSTANT_TSC) &&
 	    boot_cpu_has(X86_FEATURE_NONSTOP_TSC) &&
 	    boot_cpu_has(X86_FEATURE_TSC_ADJUST) &&
-	    nr_online_nodes <= 2)
+	    nr_online_nodes <= 4)
 		tsc_disable_clocksource_watchdog();
 }
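
For readers who want to experiment with the condition changed by the
tsc.c hunk, it can be modeled in user space roughly as follows. This is
a sketch under stated assumptions, not kernel code: struct tsc_features,
MAX_EXEMPT_NODES, and tsc_watchdog_exempt() are hypothetical names, and
the three booleans stand in for the boot_cpu_has() feature checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags standing in for the three boot_cpu_has() checks. */
struct tsc_features {
	bool constant_tsc;	/* models X86_FEATURE_CONSTANT_TSC */
	bool nonstop_tsc;	/* models X86_FEATURE_NONSTOP_TSC */
	bool tsc_adjust;	/* models X86_FEATURE_TSC_ADJUST */
};

/* Hypothetical name for the node limit; the patch raises it from 2 to 4. */
#define MAX_EXEMPT_NODES 4

/*
 * Return true when the clocksource watchdog would be disabled for the
 * TSC: all three features are present and the number of online nodes
 * does not exceed the limit.
 */
static bool tsc_watchdog_exempt(const struct tsc_features *f,
				int nr_online_nodes)
{
	assert(f);
	return f->constant_tsc && f->nonstop_tsc && f->tsc_adjust &&
	       nr_online_nodes <= MAX_EXEMPT_NODES;
}
```

With all three features set, a 4-node system is exempt in this model,
while an 8-node system, or one lacking any of the features, is not.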