From patchwork Thu Jan 12 00:36:27 2023
X-Patchwork-Submitter: "Paul E. McKenney"
X-Patchwork-Id: 42210
Date: Wed, 11 Jan 2023 16:36:27 -0800
From: "Paul E. McKenney"
To: riel@surriel.com, davej@codemonkey.org.uk
Cc: linux-kernel@vger.kernel.org, kernel-team@meta.com
Subject: [PATCH diagnostic qspinlock] Diagnostics for excessive lock-drop wait loop time
Message-ID: <20230112003627.GA3133092@paulmck-ThinkPad-P17-Gen-1>
Reply-To: paulmck@kernel.org

We see systems stuck in the queued_spin_lock_slowpath() loop that waits
for the lock to become unlocked in the case where the current CPU has
set pending state.  Therefore, this not-for-mainline commit gives a
warning that includes the lock word state if the loop has been spinning
for more than 10 seconds.  It also adds a WARN_ON_ONCE() that complains
if the lock is not in pending state.

If this is to be placed in production, some reporting mechanism not
involving spinlocks is likely needed, for example, BPF, trace events,
or some combination thereof.

Signed-off-by: Paul E. McKenney

diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index ac5a3e6d3b564..be1440782c4b3 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -379,8 +379,22 @@ void __lockfunc queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	 * clear_pending_set_locked() implementations imply full
 	 * barriers.
 	 */
-	if (val & _Q_LOCKED_MASK)
-		atomic_cond_read_acquire(&lock->val, !(VAL & _Q_LOCKED_MASK));
+	if (val & _Q_LOCKED_MASK) {
+		int cnt = _Q_PENDING_LOOPS;
+		unsigned long j = jiffies + 10 * HZ;
+		struct qspinlock qval;
+		int val;
+
+		for (;;) {
+			val = atomic_read_acquire(&lock->val);
+			atomic_set(&qval.val, val);
+			WARN_ON_ONCE(!(val & _Q_PENDING_VAL));
+			if (!(val & _Q_LOCKED_MASK))
+				break;
+			if (!--cnt && !WARN(time_after(jiffies, j), "%s: Still pending and locked: %#x (%c%c%#x)\n", __func__, val, ".L"[!!qval.locked], ".P"[!!qval.pending], qval.tail))
+				cnt = _Q_PENDING_LOOPS;
+		}
+	}
 
 	/*
 	 * take ownership and clear the pending bit.
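
For anyone reading the resulting splat, the "%#x (%c%c%#x)" portion of the
warning can be decoded offline. The following is only a user-space sketch,
not part of the patch: the names Q_LOCKED_MASK, Q_PENDING_MASK, Q_TAIL_SHIFT,
split_lock_word() and format_lock_word() are hypothetical, and it assumes the
NR_CPUS < 16K lock-word layout (bits 0-7 locked byte, bits 8-15 pending byte,
bits 16-31 tail), which is the layout in which struct qspinlock exposes
separate locked, pending and tail fields.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical user-space mirrors of the lock-word layout, assuming
 * NR_CPUS < 16K: bits 0-7 locked byte, bits 8-15 pending byte,
 * bits 16-31 tail (index plus CPU number + 1).
 */
#define Q_LOCKED_MASK	0x000000ffu
#define Q_PENDING_MASK	0x0000ff00u
#define Q_PENDING_SHIFT	8
#define Q_TAIL_SHIFT	16

struct lock_word_fields {
	unsigned int locked;	/* nonzero if some CPU holds the lock */
	unsigned int pending;	/* nonzero if a CPU has set pending */
	unsigned int tail;	/* encoded tail of the MCS queue, 0 if empty */
};

/* Split a raw 32-bit lock word into its three fields. */
static struct lock_word_fields split_lock_word(uint32_t val)
{
	struct lock_word_fields f = {
		.locked = val & Q_LOCKED_MASK,
		.pending = (val & Q_PENDING_MASK) >> Q_PENDING_SHIFT,
		.tail = val >> Q_TAIL_SHIFT,
	};

	return f;
}

/*
 * Render the word the same way the WARN() format string in the patch
 * does: raw value, then 'L' or '.', 'P' or '.', then the tail in hex.
 * Returns the snprintf() result.
 */
static int format_lock_word(uint32_t val, char *buf, size_t len)
{
	struct lock_word_fields f = split_lock_word(val);

	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%#x (%c%c%#x)", (unsigned int)val,
			".L"[!!f.locked], ".P"[!!f.pending], f.tail);
}
```

Under these assumptions a word that is both locked and pending with an empty
queue splits into locked = 1, pending = 1, tail = 0, which is the stuck state
the diagnostic is meant to catch. A nonzero tail in the report would indicate
that other CPUs are also queued behind the pending waiter.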