From patchwork Wed Jul 12 06:01:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 118843
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, yuzhao@google.com, willy@infradead.org,
 david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 1/3] mm: add functions folio_in_range() and folio_within_vma()
Date: Wed, 12 Jul 2023 14:01:42 +0800
Message-Id: <20230712060144.3006358-2-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230712060144.3006358-1-fengwei.yin@intel.com>
References: <20230712060144.3006358-1-fengwei.yin@intel.com>

Add folio_in_range() to check whether a folio is mapped to a specific
VMA and whether the folio's mapping address lies within a given range.

Also add a helper, folio_within_vma(), built on folio_in_range(), to
check whether a folio lies within the range of the VMA.
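The address arithmetic behind the range check can be modelled in
userspace. What follows is only a sketch under stated assumptions, not
the kernel code: struct sketch_vma, struct sketch_folio,
sketch_folio_in_range() and SKETCH_PAGE_SHIFT are hypothetical
stand-ins, a 4 KiB page size is assumed, and the folio is described by
its page offset and page count instead of a real struct folio.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for the fields of vm_area_struct used here. */
struct sketch_vma {
	unsigned long vm_start;	/* first mapped byte address */
	unsigned long vm_end;	/* one past the last mapped byte */
	unsigned long vm_pgoff;	/* offset of vm_start, in pages */
};

/* Hypothetical stand-in for a folio: offset and size, both in pages. */
struct sketch_folio {
	unsigned long pgoff;
	unsigned long nr_pages;
};

static bool sketch_folio_in_range(const struct sketch_folio *folio,
				  const struct sketch_vma *vma,
				  unsigned long start, unsigned long end)
{
	unsigned long vma_pglen, addr;

	assert(vma->vm_end >= vma->vm_start);
	vma_pglen = (vma->vm_end - vma->vm_start) >> SKETCH_PAGE_SHIFT;

	/* Clamp the requested range to the VMA. */
	if (start < vma->vm_start)
		start = vma->vm_start;
	if (end > vma->vm_end)
		end = vma->vm_end;

	/* Reject folios whose first page is not covered by the VMA. */
	if (folio->pgoff < vma->vm_pgoff ||
	    folio->pgoff > vma->vm_pgoff + vma_pglen)
		return false;

	/* Translate the page offset into a virtual address. */
	addr = vma->vm_start +
	       ((folio->pgoff - vma->vm_pgoff) << SKETCH_PAGE_SHIFT);

	/* The whole folio must fit inside [start, end). */
	return addr >= start &&
	       addr + (folio->nr_pages << SKETCH_PAGE_SHIFT) <= end;
}
```

With a 16-page VMA at 0x100000 whose vm_pgoff is 16, a 4-page folio at
page offset 20 is in range, while one at page offset 30 is not, because
its last two pages fall beyond vm_end.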
Signed-off-by: Yin Fengwei
Reviewed-by: Yu Zhao
---
 mm/internal.h | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/mm/internal.h b/mm/internal.h
index 483add0bfb289..c7dd15d8de3ef 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -585,6 +585,38 @@ extern long faultin_vma_page_range(struct vm_area_struct *vma,
 				   bool write, int *locked);
 extern bool mlock_future_ok(struct mm_struct *mm, unsigned long flags,
 			       unsigned long bytes);
+
+static inline bool
+folio_in_range(struct folio *folio, struct vm_area_struct *vma,
+		unsigned long start, unsigned long end)
+{
+	pgoff_t pgoff, addr;
+	unsigned long vma_pglen = (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+
+	VM_WARN_ON_FOLIO(folio_test_ksm(folio), folio);
+	if (start < vma->vm_start)
+		start = vma->vm_start;
+
+	if (end > vma->vm_end)
+		end = vma->vm_end;
+
+	pgoff = folio_pgoff(folio);
+
+	/* if folio start address is not in vma range */
+	if (pgoff < vma->vm_pgoff || pgoff > vma->vm_pgoff + vma_pglen)
+		return false;
+
+	addr = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
+
+	return ((addr >= start) && (addr + folio_size(folio) <= end));
+}
+
+static inline bool
+folio_within_vma(struct folio *folio, struct vm_area_struct *vma)
+{
+	return folio_in_range(folio, vma, vma->vm_start, vma->vm_end);
+}
+
 /*
  * mlock_vma_folio() and munlock_vma_folio():
  * should be called with vma's mmap_lock held for read or write,

From patchwork Wed Jul 12 06:01:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 118841
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, yuzhao@google.com, willy@infradead.org,
 david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 2/3] mm: handle large folio when large folio in VM_LOCKED VMA range
Date: Wed, 12 Jul 2023 14:01:43 +0800
Message-Id: <20230712060144.3006358-3-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230712060144.3006358-1-fengwei.yin@intel.com>
References: <20230712060144.3006358-1-fengwei.yin@intel.com>

If a large folio is in the range of a VM_LOCKED VMA, it should be
mlocked so that page reclaim does not pick it. Otherwise reclaim may
split the large folio, and each of its pages would then be mlocked
again.

For a large folio that crosses the boundary of a VM_LOCKED VMA, it is
better not to mlock it. Then, if the system is under memory pressure,
such a large folio can be split and the pages outside the VM_LOCKED VMA
can be reclaimed.
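The mlock and munlock conditions described above can be written down as
two small predicates. This is a hedged userspace sketch, not the kernel
implementation: the SKETCH_VM_* flag values and the sketch_should_*()
names are hypothetical, and the "folio is entirely within the VMA"
result is passed in as a plain boolean instead of being computed from a
real folio.

```c
#include <stdbool.h>

/* Hypothetical flag bits standing in for VM_LOCKED and VM_SPECIAL. */
#define SKETCH_VM_LOCKED	(1u << 0)
#define SKETCH_VM_SPECIAL	(1u << 1)

/*
 * mlock only in a VM_LOCKED VMA without VM_SPECIAL bits, and only if
 * the folio is PMD-mapped, is not large, or lies entirely within the
 * VMA.
 */
static bool sketch_should_mlock(unsigned int vm_flags, bool pmd_mapped,
				bool large, bool within_vma)
{
	if ((vm_flags & (SKETCH_VM_LOCKED | SKETCH_VM_SPECIAL)) !=
	    SKETCH_VM_LOCKED)
		return false;

	return pmd_mapped || !large || within_vma;
}

/*
 * munlock is allowed for any folio mapped in a VM_LOCKED VMA, so that
 * a large folio unmapped piece by piece still gets munlocked.
 */
static bool sketch_should_munlock(unsigned int vm_flags)
{
	return (vm_flags & SKETCH_VM_LOCKED) != 0;
}
```

Under these rules a PTE-mapped large folio that crosses the VMA
boundary is never mlocked, but it can still be munlocked if it was
mlocked before the VMA was split.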
Signed-off-by: Yin Fengwei
---
 mm/internal.h | 11 ++++++++---
 mm/rmap.c     | 34 +++++++++++++++++++++++++++-------
 2 files changed, 35 insertions(+), 10 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index c7dd15d8de3ef..776141de2797a 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -643,7 +643,8 @@ static inline void mlock_vma_folio(struct folio *folio,
 	 * still be set while VM_SPECIAL bits are added: so ignore it then.
 	 */
 	if (unlikely((vma->vm_flags & (VM_LOCKED|VM_SPECIAL)) == VM_LOCKED) &&
-	    (compound || !folio_test_large(folio)))
+	    (compound || !folio_test_large(folio) ||
+	    folio_in_range(folio, vma, vma->vm_start, vma->vm_end)))
 		mlock_folio(folio);
 }
 
@@ -651,8 +652,12 @@ void munlock_folio(struct folio *folio);
 static inline void munlock_vma_folio(struct folio *folio,
 			struct vm_area_struct *vma, bool compound)
 {
-	if (unlikely(vma->vm_flags & VM_LOCKED) &&
-	    (compound || !folio_test_large(folio)))
+	/*
+	 * To handle the case that a mlocked large folio is unmapped from VMA
+	 * piece by piece, allow munlock the large folio which is partially
+	 * mapped to VMA.
+	 */
+	if (unlikely(vma->vm_flags & VM_LOCKED))
 		munlock_folio(folio);
 }
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 2668f5ea35342..455f415d8d9ca 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -803,6 +803,14 @@ struct folio_referenced_arg {
 	unsigned long vm_flags;
 	struct mem_cgroup *memcg;
 };
+
+static inline bool should_restore_mlock(struct folio *folio,
+		struct vm_area_struct *vma, bool pmd_mapped)
+{
+	return !folio_test_large(folio) ||
+			pmd_mapped || folio_within_vma(folio, vma);
+}
+
 /*
  * arg: folio_referenced_arg will be passed
  */
@@ -816,13 +824,25 @@ static bool folio_referenced_one(struct folio *folio,
 	while (page_vma_mapped_walk(&pvmw)) {
 		address = pvmw.address;
 
-		if ((vma->vm_flags & VM_LOCKED) &&
-		    (!folio_test_large(folio) || !pvmw.pte)) {
-			/* Restore the mlock which got missed */
-			mlock_vma_folio(folio, vma, !pvmw.pte);
-			page_vma_mapped_walk_done(&pvmw);
-			pra->vm_flags |= VM_LOCKED;
-			return false; /* To break the loop */
+		if (vma->vm_flags & VM_LOCKED) {
+			if (should_restore_mlock(folio, vma, !pvmw.pte)) {
+				/* Restore the mlock which got missed */
+				mlock_vma_folio(folio, vma, !pvmw.pte);
+				page_vma_mapped_walk_done(&pvmw);
+				pra->vm_flags |= VM_LOCKED;
+				return false; /* To break the loop */
+			} else {
+				/*
+				 * For large folio cross VMA boundaries, it's
+				 * expected to be picked by page reclaim. But
+				 * should skip reference of pages which are in
+				 * the range of VM_LOCKED vma. As page reclaim
+				 * should just count the reference of pages out
+				 * the range of VM_LOCKED vma.
+				 */
+				pra->mapcount--;
+				continue;
+			}
 		}
 
 		if (pvmw.pte) {

From patchwork Wed Jul 12 06:01:44 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 118844
From: Yin Fengwei
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 akpm@linux-foundation.org, yuzhao@google.com, willy@infradead.org,
 david@redhat.com, ryan.roberts@arm.com, shy828301@gmail.com
Cc: fengwei.yin@intel.com
Subject: [RFC PATCH v2 3/3] mm: mlock: update mlock_pte_range to handle large folio
Date: Wed, 12 Jul 2023 14:01:44 +0800
Message-Id: <20230712060144.3006358-4-fengwei.yin@intel.com>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230712060144.3006358-1-fengwei.yin@intel.com>
References: <20230712060144.3006358-1-fengwei.yin@intel.com>

The current kernel only mlocks base-size folios during the mlock
syscall. Add large folio support with the following rules:
- Only mlock a large folio when it is in the VM_LOCKED VMA range.
- If there is a COW folio, mlock the COW folio too, as the COW folio is
  also in the VM_LOCKED VMA range.
- munlock applies to a large folio that is either within the VMA range
  or crosses the VMA boundary.

The last rule handles the case where a large folio is mlocked and the
VMA is later split in the middle of that folio, so that the folio ends
up crossing a VMA boundary.

Signed-off-by: Yin Fengwei
---
 mm/mlock.c | 104 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 99 insertions(+), 5 deletions(-)

diff --git a/mm/mlock.c b/mm/mlock.c
index 0a0c996c5c214..f49e079066870 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -305,6 +305,95 @@ void munlock_folio(struct folio *folio)
 	local_unlock(&mlock_fbatch.lock);
 }
 
+static inline bool should_mlock_folio(struct folio *folio,
+					struct vm_area_struct *vma)
+{
+	if (vma->vm_flags & VM_LOCKED)
+		return (!folio_test_large(folio) ||
+				folio_within_vma(folio, vma));
+
+	/*
+	 * For unlock, allow munlock large folio which is partially
+	 * mapped to VMA. As it's possible that large folio is
+	 * mlocked and VMA is split later.
+	 *
+	 * During memory pressure, such kind of large folio can
+	 * be split. And the pages are not in VM_LOCKed VMA
+	 * can be reclaimed.
+	 */
+
+	return true;
+}
+
+static inline unsigned int get_folio_mlock_step(struct folio *folio,
+			pte_t pte, unsigned long addr, unsigned long end)
+{
+	unsigned int nr;
+
+	nr = folio_pfn(folio) + folio_nr_pages(folio) - pte_pfn(pte);
+	return min_t(unsigned int, nr, (end - addr) >> PAGE_SHIFT);
+}
+
+void mlock_folio_range(struct folio *folio, struct vm_area_struct *vma,
+		pte_t *pte, unsigned long addr, unsigned int nr)
+{
+	struct folio *cow_folio;
+	unsigned int step = 1;
+
+	mlock_folio(folio);
+	if (nr == 1)
+		return;
+
+	for (; nr > 0; pte += step, addr += (step << PAGE_SHIFT), nr -= step) {
+		pte_t ptent;
+
+		step = 1;
+		ptent = ptep_get(pte);
+
+		if (!pte_present(ptent))
+			continue;
+
+		cow_folio = vm_normal_folio(vma, addr, ptent);
+		if (!cow_folio || cow_folio == folio) {
+			continue;
+		}
+
+		mlock_folio(cow_folio);
+		step = get_folio_mlock_step(folio, ptent,
+				addr, addr + (nr << PAGE_SHIFT));
+	}
+}
+
+void munlock_folio_range(struct folio *folio, struct vm_area_struct *vma,
+		pte_t *pte, unsigned long addr, unsigned int nr)
+{
+	struct folio *cow_folio;
+	unsigned int step = 1;
+
+	munlock_folio(folio);
+	if (nr == 1)
+		return;
+
+	for (; nr > 0; pte += step, addr += (step << PAGE_SHIFT), nr -= step) {
+		pte_t ptent;
+
+		step = 1;
+		ptent = ptep_get(pte);
+
+		if (!pte_present(ptent))
+			continue;
+
+		cow_folio = vm_normal_folio(vma, addr, ptent);
+		if (!cow_folio || cow_folio == folio) {
+			continue;
+		}
+
+		munlock_folio(cow_folio);
+		step = get_folio_mlock_step(folio, ptent,
+				addr, addr + (nr << PAGE_SHIFT));
+	}
+}
+
 static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
 			   unsigned long end, struct mm_walk *walk)
 
@@ -314,6 +403,7 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
 	pte_t *start_pte, *pte;
 	pte_t ptent;
 	struct folio *folio;
+	unsigned int step = 1;
 
 	ptl = pmd_trans_huge_lock(pmd, vma);
 	if (ptl) {
@@ -329,24 +419,28 @@ static int mlock_pte_range(pmd_t *pmd, unsigned long addr,
 		goto out;
 	}
 
-	start_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	pte = start_pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
 	if (!start_pte) {
 		walk->action = ACTION_AGAIN;
 		return 0;
 	}
-	for (pte = start_pte; addr != end; pte++, addr += PAGE_SIZE) {
+
+	for (; addr != end; pte += step, addr += (step << PAGE_SHIFT)) {
+		step = 1;
 		ptent = ptep_get(pte);
 		if (!pte_present(ptent))
 			continue;
 		folio = vm_normal_folio(vma, addr, ptent);
 		if (!folio || folio_is_zone_device(folio))
 			continue;
-		if (folio_test_large(folio))
+		if (!should_mlock_folio(folio, vma))
 			continue;
+
+		step = get_folio_mlock_step(folio, ptent, addr, end);
 		if (vma->vm_flags & VM_LOCKED)
-			mlock_folio(folio);
+			mlock_folio_range(folio, vma, pte, addr, step);
 		else
-			munlock_folio(folio);
+			munlock_folio_range(folio, vma, pte, addr, step);
 	}
 	pte_unmap(start_pte);
 out:
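The stepping arithmetic used by get_folio_mlock_step() can be
illustrated with a small userspace model. This is only a sketch under
stated assumptions: sketch_mlock_step() and SKETCH_PAGE_SHIFT are
hypothetical names, the pfns are passed in as plain integers rather
than read from a folio and a pte, and a 4 KiB page size is assumed.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12UL	/* assumption: 4 KiB pages */

/*
 * Number of PTEs the walk may skip: the pages left in the folio
 * starting at pte_pfn, capped by the pages left before end.
 */
static unsigned int sketch_mlock_step(unsigned long folio_pfn,
				      unsigned long folio_nr_pages,
				      unsigned long pte_pfn,
				      unsigned long addr, unsigned long end)
{
	unsigned long in_folio, in_range;

	/* The pte must point into the folio; the range must be non-empty. */
	assert(pte_pfn >= folio_pfn && pte_pfn < folio_pfn + folio_nr_pages);
	assert(end > addr);

	in_folio = folio_pfn + folio_nr_pages - pte_pfn;
	in_range = (end - addr) >> SKETCH_PAGE_SHIFT;

	return (unsigned int)(in_folio < in_range ? in_folio : in_range);
}
```

For a 16-page folio whose fifth page is mapped at addr, 12 pages of the
folio remain. If only 8 pages are left before end, the walk advances by
8 PTEs; if end is further away, it advances by 12.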