From patchwork Fri Jun 23 17:12:31 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 112247 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp5966620vqr; Fri, 23 Jun 2023 11:33:46 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ6j/s2Hl25i28+eLjmRigMUbUDkJOuJlaAKubyvIj/vY3yoCiMl5qPCtWBYynOxElFwoOAM X-Received: by 2002:a05:6a00:1746:b0:668:9dca:13ac with SMTP id j6-20020a056a00174600b006689dca13acmr11252134pfc.7.1687545226286; Fri, 23 Jun 2023 11:33:46 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687545226; cv=none; d=google.com; s=arc-20160816; b=FZhGdM8r7RKW17XEJDi7LXQvLWJp3rF8x2Pqjtd17V4cKR9k1vuv9Jw9KWi0RTqbxP NdyL+cW1Z0xMpTDjpMvsQw9teQOFJd4El2BZd9Mm2yf7Uxm1Op++9ZsoTWCByTaadg4B uc6tXoVmGfalL6PpH6i8oiUAvRbO0t0S6SeUS4rqffIGiONFT5SB5ekAo2AlKZAw6fX1 oQi3SLvToXFSiePzW8q3XzA7V3QBpaswmeBTGzxP7y+W4vuZX1Q5dA+TZqOsw1RJkcQn 2lSsSesxeiwrU7+mZQLu8joD+4VEG6k2AlxzBQYMFSzv4+61NbW6bL/7xjrsP53Gx0BW Kt5g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=d174zlUL/lhF0vBTL/OICvsXlNACFuiYhQ7ttHm3uTo=; b=eXY3QuHyG5ITau5hEM0tjgoP5QmrdC7uMc+nBrnm30bf5PNIJUDR14trqCN9E6tkNW k264nZdK++tnUA7LF53ukeRHKOrewDxBSirDnpBRYeE6pfi006NLO6s4Y/GzbitWgCDU Heu4HxsWd9ErUT4nsIi6HZEc5MPMC62vMPw194Zw+uRoeDr57/4CmV2a1u9mGetSKBqu SOVmiZEznWwmHSVmKEugG3BdZhUF0hPyB/TNXDBcMlyPI2u61muSMA+/yd0dqYRzsaK3 v65PQ53jrG2imCkPFH1ErWtnS9TCHTg4mDCnTmgTfzeFj2rqQwLih6gnYTtgQJ/VUiB7 ytNQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=wUEukFRu; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id a13-20020aa795ad000000b00647d6c7075asi7247558pfk.109.2023.06.23.11.33.31; Fri, 23 Jun 2023 11:33:46 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=wUEukFRu; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232098AbjFWRND (ORCPT + 99 others); Fri, 23 Jun 2023 13:13:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230506AbjFWRMq (ORCPT ); Fri, 23 Jun 2023 13:12:46 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08DBA19A1 for ; Fri, 23 Jun 2023 10:12:43 -0700 (PDT) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1687540361; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=d174zlUL/lhF0vBTL/OICvsXlNACFuiYhQ7ttHm3uTo=; b=wUEukFRuQtWtID4ZOeOhNXoXQ773U6bquwTxJC7s8rOqzIlrm+gZSF7ZYmg1UhYmzJz4qm oD5Ysr/0zLXb9jT8ovdKF0Bpnlu+ruuv5VQLGVR9IpQu1tCXNMfKYOdKKjWpQ6kyj38jjk A3H8UVBPOl/2QXtQf3GaYd928rVYkoeof+erMEKQc7QMrkdtrWBK1pJnKT5YTlzAhx4C64 eK4rIHSgqMfZyHITIAvcGigQjrydcs2bQNIuo3JhRnlq9qSlzz3y5lPmbLHw+cWmTyNiTQ YRAqhj5dYBPvaNyVjqV0onfE/WhUArwF1Be/lz2+WhLL5yHn/M6an9SWQOwFHQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1687540361; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=d174zlUL/lhF0vBTL/OICvsXlNACFuiYhQ7ttHm3uTo=; b=crRCrHYhVULMoIjeuCQU0yIWeH3xxFHKwaD5PsqTuAc+qRz2en09xW9R2TKMpVhjpwL/DE JSx8UERIkzfTFABA== To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: "Luis Claudio R. Goncalves" , Andrew Morton , Boqun Feng , Ingo Molnar , John Ogness , Mel Gorman , Michal Hocko , Peter Zijlstra , Petr Mladek , Tetsuo Handa , Thomas Gleixner , Waiman Long , Will Deacon , Sebastian Andrzej Siewior , Tetsuo Handa Subject: [PATCH v2 1/2] seqlock: Do the lockdep annotation before locking in do_write_seqcount_begin_nested() Date: Fri, 23 Jun 2023 19:12:31 +0200 Message-Id: <20230623171232.892937-2-bigeasy@linutronix.de> In-Reply-To: <20230623171232.892937-1-bigeasy@linutronix.de> References: <20230623171232.892937-1-bigeasy@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769519423675632382?= X-GMAIL-MSGID: =?utf-8?q?1769519423675632382?= It was brought up by Tetsuo that the following sequence write_seqlock_irqsave() printk_deferred_enter() could lead to a deadlock if the lockdep annotation within write_seqlock_irqsave() triggers. The problem is that the sequence counter is incremented before the lockdep annotation is performed. The lockdep splat would then attempt to invoke printk() but the reader side, of the same seqcount, could have a tty_port::lock acquired waiting for the sequence number to become even again. The other lockdep annotations come before the actual locking because "we want to see the locking error before it happens". There is no reason why seqcount should be different here. Do the lockdep annotation first then perform the locking operation (the sequence increment). Fixes: 1ca7d67cf5d5a ("seqcount: Add lockdep functionality to seqcount/seqlock structures") Reported-by: Tetsuo Handa Link: https://lore.kernel.org/20230621130641.-5iueY1I@linutronix.de Signed-off-by: Sebastian Andrzej Siewior Acked-by: Mel Gorman --- include/linux/seqlock.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h index 3926e90279477..d778af83c8f36 100644 --- a/include/linux/seqlock.h +++ b/include/linux/seqlock.h @@ -512,8 +512,8 @@ do { \ static inline void do_write_seqcount_begin_nested(seqcount_t *s, int subclass) { - do_raw_write_seqcount_begin(s); seqcount_acquire(&s->dep_map, subclass, 0, _RET_IP_); + do_raw_write_seqcount_begin(s); } /** From patchwork Fri Jun 23 17:12:32 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sebastian Andrzej Siewior X-Patchwork-Id: 112245 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:994d:0:b0:3d9:f83d:47d9 with SMTP id k13csp5962152vqr; Fri, 23 Jun 2023 11:24:59 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ5bJxssfdbQC6DD+J1qYVfrmuxvacQ+uThrizwCOu02n9Uzgog8oNtDf4Xev95Sz0//ox8P X-Received: by 2002:a05:6358:1a8f:b0:131:d52f:ca6b with SMTP id gm15-20020a0563581a8f00b00131d52fca6bmr7153842rwb.24.1687544698954; Fri, 23 Jun 2023 11:24:58 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1687544698; cv=none; d=google.com; s=arc-20160816; b=0OMR0nnD/oY91LQmPlsRBXI7x1pDM4RdRwa10yqv+JGvpAPva+su9p+1vSm5ysPY3w 0CAS//2R4C5vktWUspKomCGm+lXHXP2phAUApy16FUMsa3dJtUdzyM++UJXcZuhbLS4C kqL9c8kQcWn4QRJctszGqn1FWrZACLNVzBu0CeWFCZEOSP/jyXZqK5p4O/JFfMIckpFh wWv6hWIq+ZS6y+zzBuInAPVYsQDpE4FIbOSLuLa/tDh13WqpbIqx4Hucjz0fH7W0K2/6 IzjVfwjo723hT01mev+zzFqXQO0vhs5xFnSVTNt2/ch4vr8NXRus3xQzLm88m6yja9cm aeww== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:dkim-signature :dkim-signature:from; bh=BMld1PknhyoTTWqIQWTi/haDnPaStkrULY18977v1W8=; b=NqRPB0naLHMd5VybLPoG8B8U80bsDoC20o/Yt7lvPAZ7LrYH7dpp6LDcmMKQLRg38H qMz0DOkEGwVfkMprEHAMwdUhtBskVeYI03x7ypyQ8t3hDJ8fasgBaI1LUH0vsecQKbKN 2YAgjlwxOFANvm5MDIVtZrkzGp95OeK0PJnkwbgwTkqgLmB6iyIJfviFkZjJSvj9soys raD9xCXfm0kxI0zf1P8Yl0hNWQZdEcTBBZB9/BWXsSYAvodxFMcY5EDUbx6WlIF1CVnH LmiV/LfeGDAEd3QIV4tEec0S0z3PrPttW8t75d/+dkzk4A1ycRwZpWslcZheRCpkkrnQ naig== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b="UooxLz/z"; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id i28-20020a633c5c000000b005572aaee706si7103541pgn.689.2023.06.23.11.24.42; Fri, 23 Jun 2023 11:24:58 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b="UooxLz/z"; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231761AbjFWRMx (ORCPT + 99 others); Fri, 23 Jun 2023 13:12:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42628 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230501AbjFWRMq (ORCPT ); Fri, 23 Jun 2023 13:12:46 -0400 Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 08BD01993 for ; Fri, 23 Jun 2023 10:12:43 -0700 (PDT) From: Sebastian Andrzej Siewior DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1687540361; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BMld1PknhyoTTWqIQWTi/haDnPaStkrULY18977v1W8=; b=UooxLz/zkoWhq+XAmNcit5ZQDI4n/Y+KVronARo5RtTZ+Tn6qjAbf+SOx/FC241sDV8f4t t8HJpjsAu06tlvLNMcBnutuGtTbknm/McJdfqacY7/9mKIMPXaChuyDfV91Lysn8JRgd04 FEajI9b521OGBvRhaLW4XirU9DI8WhB8c6njc7IoxIwm2p33Jxy1cOxv66rkXxgB5Chru+ A/FRmn+Hy1gok2d3NWwOZ5rM/Dtjk9URSGAuXs6LWjFMo+DbMSueXkd5UnHH+D8fM0FIeG T9YGtSeiToHljMH9F6NvMNkhsbUtBC+W2HnzhVsbBO2nnoCWRzHlsR620hmdWQ== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1687540361; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=BMld1PknhyoTTWqIQWTi/haDnPaStkrULY18977v1W8=; b=8kTgNXsUUa80s7tfPsheHF9UbSnMTZUFyjFjS52xpo0dV5r9dRvc1KbatV5CicN60n6keS OX2i42K9YrjV0vBg== To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: "Luis Claudio R. Goncalves" , Andrew Morton , Boqun Feng , Ingo Molnar , John Ogness , Mel Gorman , Michal Hocko , Peter Zijlstra , Petr Mladek , Tetsuo Handa , Thomas Gleixner , Waiman Long , Will Deacon , Sebastian Andrzej Siewior Subject: [PATCH v2 2/2] mm/page_alloc: Use write_seqlock_irqsave() instead write_seqlock() + local_irq_save(). Date: Fri, 23 Jun 2023 19:12:32 +0200 Message-Id: <20230623171232.892937-3-bigeasy@linutronix.de> In-Reply-To: <20230623171232.892937-1-bigeasy@linutronix.de> References: <20230623171232.892937-1-bigeasy@linutronix.de> MIME-Version: 1.0 X-Spam-Status: No, score=-4.4 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_MED,SPF_HELO_NONE, SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1769518870272293978?= X-GMAIL-MSGID: =?utf-8?q?1769518870272293978?= __build_all_zonelists() acquires zonelist_update_seq by first disabling interrupts via local_irq_save() and then acquiring the seqlock with write_seqlock(). This is troublesome and leads to problems on PREEMPT_RT. The problem is that the inner spinlock_t becomes a sleeping lock on PREEMPT_RT and must not be acquired with disabled interrupts. The API provides write_seqlock_irqsave() which does the right thing in one step. printk_deferred_enter() has to be invoked in non-migrate-able context to ensure that deferred printing is enabled and disabled on the same CPU. This is the case after zonelist_update_seq has been acquired. There was discussion on the first submission that the order should be: local_irq_disable(); printk_deferred_enter(); write_seqlock(); to avoid pitfalls like having an unaccounted printk() coming from write_seqlock_irqsave() before printk_deferred_enter() is invoked. The only origin of such a printk() can be a lockdep splat because the lockdep annotation happens after the sequence count is incremented. This is exceptional and subject to change. It was also pointed that PREEMPT_RT can be affected by the printk problem since its write_seqlock_irqsave() does not really disable interrupts. This isn't the case because PREEMPT_RT's printk implementation differs from the mainline implementation in two important aspects: - Printing happens in a dedicated threads and not at during the invocation of printk(). - In emergency cases where synchronous printing is used, a different driver is used which does not use tty_port::lock. Acquire zonelist_update_seq with write_seqlock_irqsave() and then defer printk output. Fixes: 1007843a91909 ("mm/page_alloc: fix potential deadlock on zonelist_update_seq seqlock") Signed-off-by: Sebastian Andrzej Siewior Acked-by: Michal Hocko --- mm/page_alloc.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-) diff --git a/mm/page_alloc.c b/mm/page_alloc.c index 47421bedc12b7..99b7e7d09c5c0 100644 --- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -5808,11 +5808,10 @@ static void __build_all_zonelists(void *data) unsigned long flags; /* - * Explicitly disable this CPU's interrupts before taking seqlock - * to prevent any IRQ handler from calling into the page allocator - * (e.g. GFP_ATOMIC) that could hit zonelist_iter_begin and livelock. + * The zonelist_update_seq must be acquired with irqsave because the + * reader can be invoked from IRQ with GFP_ATOMIC. */ - local_irq_save(flags); + write_seqlock_irqsave(&zonelist_update_seq, flags); /* * Explicitly disable this CPU's synchronous printk() before taking * seqlock to prevent any printk() from trying to hold port->lock, for @@ -5820,7 +5819,6 @@ static void __build_all_zonelists(void *data) * calling kmalloc(GFP_ATOMIC | __GFP_NOWARN) with port->lock held. */ printk_deferred_enter(); - write_seqlock(&zonelist_update_seq); #ifdef CONFIG_NUMA memset(node_load, 0, sizeof(node_load)); @@ -5857,9 +5855,8 @@ static void __build_all_zonelists(void *data) #endif } - write_sequnlock(&zonelist_update_seq); printk_deferred_exit(); - local_irq_restore(flags); + write_sequnlock_irqrestore(&zonelist_update_seq, flags); } static noinline void __init