From patchwork Tue Apr 4 15:47:16 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Rongwei Wang
X-Patchwork-Id: 79211
From: Rongwei Wang
To: akpm@linux-foundation.org, bagasdotme@gmail.com, willy@infradead.org
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org
Subject: [PATCH v2] mm/swap: fix swap_info_struct race between swapoff and get_swap_pages()
Date: Tue, 4 Apr 2023 23:47:16 +0800
Message-Id: <20230404154716.23058-1-rongwei.wang@linux.alibaba.com>
X-Mailer: git-send-email 2.32.0
In-Reply-To:
 <20230401221920.57986-1-rongwei.wang@linux.alibaba.com>
References: <20230401221920.57986-1-rongwei.wang@linux.alibaba.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The si->lock must be held when deleting the si from the available list.
Otherwise, another thread can re-add the si to the available list, which
can lead to memory corruption. The only place we have found where this
happens is in the swapoff path. The race can be described as follows:

core 0                       core 1
swapoff

del_from_avail_list(si)      waiting

try lock si->lock            acquire swap_avail_lock
                             and re-add si into
                             swap_avail_head

Core 0 then acquires si->lock but misses that si has already been added
again, and continues to clear SWP_WRITEOK, etc.

It is easy to trigger a massive number of warning messages inside
get_swap_pages() in some special cases, for example by calling
madvise(MADV_PAGEOUT) on blocks of touched memory concurrently while
running many swapon-swapoff operations (e.g. stress-ng-swap).

In the worst case, however, the scenario above can cause a panic. In
swapoff(), the memory used by si can be kept in swap_info[] after a swap
is turned off. This means the memory corruption does not show up
immediately, but only once the si is allocated and reset for a new swap
in the swapon path.
A panic message (with CONFIG_PLIST_DEBUG enabled):

------------[ cut here ]------------
top: 00000000e58a3003, n: 0000000013e75cda, p: 000000008cd4451a
prev: 0000000035b1e58a, n: 000000008cd4451a, p: 000000002150ee8d
next: 000000008cd4451a, n: 000000008cd4451a, p: 000000008cd4451a
WARNING: CPU: 21 PID: 1843 at lib/plist.c:60 plist_check_prev_next_node+0x50/0x70
Modules linked in: rfkill(E) crct10dif_ce(E)...
CPU: 21 PID: 1843 Comm: stress-ng Kdump: ... 5.10.134+
Hardware name: Alibaba Cloud ECS, BIOS 0.0.0 02/06/2015
pstate: 60400005 (nZCv daif +PAN -UAO -TCO BTYPE=--)
pc : plist_check_prev_next_node+0x50/0x70
lr : plist_check_prev_next_node+0x50/0x70
sp : ffff0018009d3c30
x29: ffff0018009d3c40 x28: ffff800011b32a98
x27: 0000000000000000 x26: ffff001803908000
x25: ffff8000128ea088 x24: ffff800011b32a48
x23: 0000000000000028 x22: ffff001800875c00
x21: ffff800010f9e520 x20: ffff001800875c00
x19: ffff001800fdc6e0 x18: 0000000000000030
x17: 0000000000000000 x16: 0000000000000000
x15: 0736076307640766 x14: 0730073007380731
x13: 0736076307640766 x12: 0730073007380731
x11: 000000000004058d x10: 0000000085a85b76
x9 : ffff8000101436e4 x8 : ffff800011c8ce08
x7 : 0000000000000000 x6 : 0000000000000001
x5 : ffff0017df9ed338 x4 : 0000000000000001
x3 : ffff8017ce62a000 x2 : ffff0017df9ed340
x1 : 0000000000000000 x0 : 0000000000000000
Call trace:
 plist_check_prev_next_node+0x50/0x70
 plist_check_head+0x80/0xf0
 plist_add+0x28/0x140
 add_to_avail_list+0x9c/0xf0
 _enable_swap_info+0x78/0xb4
 __do_sys_swapon+0x918/0xa10
 __arm64_sys_swapon+0x20/0x30
 el0_svc_common+0x8c/0x220
 do_el0_svc+0x2c/0x90
 el0_svc+0x1c/0x30
 el0_sync_handler+0xa8/0xb0
 el0_sync+0x148/0x180
irq event stamp: 2082270

Now, si->lock is taken before calling del_from_avail_list(), so that
other threads see the si being deleted and SWP_WRITEOK being cleared
together, and will not reinsert it.

This problem exists in versions after stable 5.10.y.

Cc: stable@vger.kernel.org
Tested-by: Yongchen Yin
Signed-off-by: Rongwei Wang
Reviewed-by: Aaron Lu
---
 mm/swapfile.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 62ba2bf577d7..2c718f45745f 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -679,6 +679,7 @@ static void __del_from_avail_list(struct swap_info_struct *p)
 {
 	int nid;
 
+	assert_spin_locked(&p->lock);
 	for_each_node(nid)
 		plist_del(&p->avail_lists[nid], &swap_avail_heads[nid]);
 }
@@ -2434,8 +2435,8 @@ SYSCALL_DEFINE1(swapoff, const char __user *, specialfile)
 		spin_unlock(&swap_lock);
 		goto out_dput;
 	}
-	del_from_avail_list(p);
 	spin_lock(&p->lock);
+	del_from_avail_list(p);
 	if (p->prio < 0) {
 		struct swap_info_struct *si = p;
 		int nid;
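
For readers who want to experiment with the ordering outside the kernel, the following is a minimal userspace sketch of the fixed sequence, using pthread mutexes in place of spinlocks. It is not kernel code. The names struct swap_dev, readd_if_writable(), swapoff_fixed(), readd_worker() and run_model() are hypothetical, and the sketch assumes that the re-adding path checks a "writable" flag while holding the per-device lock, which is what the description above implies but is simplified here.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct swap_info_struct; not kernel code. */
struct swap_dev {
	pthread_mutex_t lock;	/* models si->lock */
	bool writeok;		/* models the SWP_WRITEOK flag */
	bool on_avail_list;	/* models membership in swap_avail_heads[] */
};

/* Models swap_avail_lock, which protects the available list itself. */
static pthread_mutex_t avail_lock = PTHREAD_MUTEX_INITIALIZER;

/* Assumed re-add path: re-adds only while the device is still writable. */
static void readd_if_writable(struct swap_dev *d)
{
	pthread_mutex_lock(&d->lock);
	if (d->writeok) {
		pthread_mutex_lock(&avail_lock);
		d->on_avail_list = true;
		pthread_mutex_unlock(&avail_lock);
	}
	pthread_mutex_unlock(&d->lock);
}

/*
 * Fixed ordering: take the device lock first, then delete from the list
 * and clear the flag within the same critical section.
 */
static void swapoff_fixed(struct swap_dev *d)
{
	pthread_mutex_lock(&d->lock);
	pthread_mutex_lock(&avail_lock);
	d->on_avail_list = false;
	pthread_mutex_unlock(&avail_lock);
	d->writeok = false;
	pthread_mutex_unlock(&d->lock);
}

/* Concurrent thread that keeps trying to re-add the device. */
static void *readd_worker(void *arg)
{
	for (int i = 0; i < 10000; i++)
		readd_if_writable(arg);
	return NULL;
}

/* Returns how many rounds ended with the device back on the list. */
int run_model(int rounds)
{
	int stale = 0;

	for (int r = 0; r < rounds; r++) {
		struct swap_dev d = { .writeok = true, .on_avail_list = true };
		pthread_t t;
		int rc;

		pthread_mutex_init(&d.lock, NULL);
		rc = pthread_create(&t, NULL, readd_worker, &d);
		assert(rc == 0);
		swapoff_fixed(&d);
		pthread_join(t, NULL);
		if (d.on_avail_list)
			stale++;
		pthread_mutex_destroy(&d.lock);
	}
	return stale;
}
```

In this model, a re-add that runs before swapoff_fixed() is undone by the deletion, and a re-add that runs afterwards sees the flag already cleared, so run_model() is expected to return 0. Moving the list deletion in swapoff_fixed() ahead of the lock acquisition would reintroduce the window shown in the diagram.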