Message ID: 20230224141145.96814-1-ying.huang@intel.com
Headers:
From: Huang Ying <ying.huang@intel.com>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, Huang Ying <ying.huang@intel.com>, Hugh Dickins <hughd@google.com>, "Xu, Pengfei" <pengfei.xu@intel.com>, Christoph Hellwig <hch@lst.de>, Stefan Roesch <shr@devkernel.io>, Tejun Heo <tj@kernel.org>, Xin Hao <xhao@linux.alibaba.com>, Zi Yan <ziy@nvidia.com>, Yang Shi <shy828301@gmail.com>, Baolin Wang <baolin.wang@linux.alibaba.com>, Matthew Wilcox <willy@infradead.org>, Mike Kravetz <mike.kravetz@oracle.com>
Subject: [PATCH 0/3] migrate_pages: fix deadlock in batched synchronous migration
Date: Fri, 24 Feb 2023 22:11:42 +0800
Message-Id: <20230224141145.96814-1-ying.huang@intel.com>
Series: migrate_pages: fix deadlock in batched synchronous migration
Message
Huang, Ying
Feb. 24, 2023, 2:11 p.m. UTC
Two deadlock bugs were reported for the migrate_pages() batching series. Thanks, Hugh and Pengfei. Analysis shows that if we have locked some folios other than the one we are currently migrating, it is not safe in general to wait synchronously, for example, to wait for writeback to complete or to wait to lock a buffer head.

So patch 1/3 fixes the deadlock in a simple way: it disables batching support for synchronous migration. The change is straightforward and easy to understand. Patch 3/3 then re-introduces batching for synchronous migration by first optimistically trying to migrate the folios asynchronously as a batch, then falling back to migrating the failed folios synchronously, one by one. Tests show that this effectively restores the TLB-flushing batching performance for synchronous migration.

Best Regards,
Huang, Ying
Comments
On Fri, 24 Feb 2023 22:11:42 +0800 Huang Ying <ying.huang@intel.com> wrote:

> Two deadlock bugs were reported for the migrate_pages() batching
> series.

"migrate_pages(): batch TLB flushing"

> Thanks Hugh and Pengfei. Analysis shows that if we have
> locked some other folios except the one we are migrating, it's not
> safe in general to wait synchronously, for example, to wait the
> writeback to complete or wait to lock the buffer head.
>
> So 1/3 fixes the deadlock in a simple way, where the batching support
> for the synchronous migration is disabled. The change is
> straightforward and easy to be understood. While 3/3 re-introduce the
> batching for synchronous migration via trying to migrate
> asynchronously in batch optimistically, then fall back to migrate
> synchronously one by one for fail-to-migrate folios. Test shows that
> this can restore the TLB flushing batching performance for synchronous
> migration effectively.

If anyone backports the "migrate_pages(): batch TLB flushing" series into their kernels, they will want to know about such fixes, so we can help them by providing suitable Link: tags.

Such a Link: tag may also be helpful to people who are performing git bisection searches for some issue but who keep stumbling over the issues which this series addresses.

Being lazy, I slapped

Fixes: 6f7d760e86fa ("migrate_pages: move THP/hugetlb migration support check to simplify code")

on all three, as this was the final patch in that series. Inaccurate, but it means that these fixes will land in a suitable place if anyone needs them.
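For reference, the trailers being discussed would sit at the end of each patch's commit message. The Fixes: line below is the one quoted above; the Link: line is shown only as an example of the trailer format, constructed from this cover letter's Message-Id using the usual lore.kernel.org convention, not taken from an actual applied commit:

```
Fixes: 6f7d760e86fa ("migrate_pages: move THP/hugetlb migration support check to simplify code")
Link: https://lore.kernel.org/r/20230224141145.96814-1-ying.huang@intel.com
Signed-off-by: Huang Ying <ying.huang@intel.com>
```

With such trailers in place, a backporter can locate follow-up fixes for a commit by grepping the git history for its abbreviated hash in Fixes: lines.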
Andrew Morton <akpm@linux-foundation.org> writes:

> On Fri, 24 Feb 2023 22:11:42 +0800 Huang Ying <ying.huang@intel.com> wrote:
>
>> Two deadlock bugs were reported for the migrate_pages() batching
>> series.
>
> "migrate_pages(): batch TLB flushing"

Yes. I should have written it that way.

> If anyone backports the "migrate_pages(): batch TLB flushing" series
> into their kernels, they will want to know about such fixes. So we can
> help them by providing suitable Link: tags.
>
> Such a Link: may also be helpful to people who are performing git
> bisection searches for some issue but who keep stumbling over the
> issues which this series addresses.
>
> Being lazy, I slapped
>
> Fixes: 6f7d760e86fa ("migrate_pages: move THP/hugetlb migration support check to simplify code")
>
> on all three, as this was the final patch in that series. Inaccurate,
> but it means that these fixes will land in a suitable place if anyone
> needs them.

Sorry, I should have added the "Fixes:" tag. I will be more careful in the future, and I will add a proper "Link:" tag as well.

Best Regards,
Huang, Ying