From patchwork Fri Dec 30 01:12:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yang Yang X-Patchwork-Id: 37540 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp2668145wrt; Thu, 29 Dec 2022 17:15:48 -0800 (PST) X-Google-Smtp-Source: AMrXdXvpNh9VElZ+pGcy8Avd/7ZOL7G9qtFVkIPMdcxQv5AQHGia22U5Ganuo9uiOqmz0mSQXB6U X-Received: by 2002:a17:90a:7848:b0:226:228:3e4b with SMTP id y8-20020a17090a784800b0022602283e4bmr12796387pjl.6.1672362947852; Thu, 29 Dec 2022 17:15:47 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1672362947; cv=none; d=google.com; s=arc-20160816; b=02vdVEfO+w21x/fIWIJjLi4YcL4mIbseIAZO4bFdOUsYZ5GcKb0maeze3Xb4EB3Kzm OksUPW2HdsAqmwOr5NY/OXh0n8B5OIFFjDdVip6h3+a9qN1rfZ2uguRUmJ/YoIOiG835 U4TJq1RoSSJqKOnO9CgKx5rSNr13wf7DEJK3UDKk5dF6PUMTtQVC1Gilf2ZrAhTzIv01 ucT/dNtZsIv/CjB4sVaLo98SxMurlbpXU2z1mt1EibD2D1Mcbn4GZofx7Ti7NHc1Yzbl pBLl4pGPc2jMTrVr8tI1p25diqMkUjfigbQxAwAERg+jDxLLtS7OmviBaW1Z03WbuwJl qZKQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:subject:cc:to:from:mime-version:message-id:date; bh=Fn8CaapX2uBzNcHLvV2QH2LQpWpDs2Djiqkkl35bf/w=; b=BrN+yRRU59ZNgBe53wOo0x8Awfcf7rIxmEtq6poTTlQ2cqq7fkPsOxZvMW7Bhxxn4k DfMEF1wgzgtvjCIRGlgxKXC2UYuZ1FeKan47SnQitIc8btGlWYt2dt/yY3dxAHCw8EHp nneJSbDAGwWEEUIiW/6l1qrlatVMTq88/iEupa8uAPnLysMSUb2SJ9p5IiZSglffxcOq d6XTaXUFz7+v0fPNyTAC85jUcfcS/DDwTOfYVEAmNXgPgSbZaZXpqiNHoe2/cAMS4z0X W3l3LOy8MPADhZOncK6jQdAYOp4au7kcsy7c6dq78qHEvhAYwwT75g1nh9zV/xIF6HDX 1FfQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=zte.com.cn Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id ng10-20020a17090b1a8a00b002233c2053c8si21534914pjb.78.2022.12.29.17.15.35; Thu, 29 Dec 2022 17:15:47 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=zte.com.cn Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234024AbiL3BMy (ORCPT + 99 others); Thu, 29 Dec 2022 20:12:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34358 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229534AbiL3BMw (ORCPT ); Thu, 29 Dec 2022 20:12:52 -0500 Received: from mxhk.zte.com.cn (mxhk.zte.com.cn [63.216.63.40]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5786DFD0B for ; Thu, 29 Dec 2022 17:12:51 -0800 (PST) Received: from mse-fl2.zte.com.cn (unknown [10.5.228.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mxhk.zte.com.cn (FangMail) with ESMTPS id 4NjnLV0XKMz8R039; Fri, 30 Dec 2022 09:12:50 +0800 (CST) Received: from szxlzmapp03.zte.com.cn ([10.5.231.207]) by mse-fl2.zte.com.cn with SMTP id 2BU1CiXg057579; Fri, 30 Dec 2022 09:12:44 +0800 (+08) (envelope-from yang.yang29@zte.com.cn) Received: from mapi (szxlzmapp01[null]) by mapi (Zmail) with MAPI id mid14; Fri, 30 Dec 2022 09:12:44 +0800 (CST) Date: Fri, 30 Dec 2022 09:12:44 +0800 (CST) X-Zmail-TransId: 2b0363ae3b0c59fb2628 X-Mailer: Zmail v1.0 Message-ID: <202212300912449061763@zte.com.cn> Mime-Version: 1.0 From: To: Cc: , , , , , , , , Subject: =?utf-8?q?=5BPATCH_v5_1/6=5D_ksm=3A_abstract_the_function_try=5Fto?= =?utf-8?q?=5Fget=5Fold=5Frmap=5Fitem?= 
X-MAIL: mse-fl2.zte.com.cn 2BU1CiXg057579
X-Fangmail-Gw-Spam-Type: 0
X-FangMail-Miltered: at cgslv5.04-192.168.250.137.novalocal with ID 63AE3B12.000 by FangMail milter!
X-FangMail-Envelope: 1672362770/4NjnLV0XKMz8R039/63AE3B12.000/10.5.228.133/[10.5.228.133]/mse-fl2.zte.com.cn/
X-Fangmail-Anti-Spam-Filtered: true
X-Fangmail-MID-QID: 63AE3B12.000/4NjnLV0XKMz8R039
X-Spam-Status: No, score=0.6 required=5.0 tests=BAYES_00,RCVD_IN_MSPIKE_H2, SORTED_RECIPS,SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1753599650326877646?=
X-GMAIL-MSGID: =?utf-8?q?1753599650326877646?=

From: xu xin

Factor a new function, try_to_get_old_rmap_item(), out of
get_next_rmap_item(). The subsequent patches that count ksm_zero_pages
reuse this function. The patch improves the readability and
reusability of the KSM code.
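
For readers who want to see the list walk in isolation, here is a minimal
user-space sketch of what the factored-out helper does. It is not the kernel
code: the sketch_* names, the 4 KiB page mask and the plain free() that stands
in for remove_rmap_item_from_tree() plus free_rmap_item() are assumptions made
for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_PAGE_MASK (~0xfffUL)	/* assumption: 4 KiB pages */

/* Simplified stand-in for struct ksm_rmap_item (hypothetical). */
struct sketch_rmap_item {
	struct sketch_rmap_item *rmap_list;	/* next item, sorted by address */
	unsigned long address;			/* page address | low flag bits */
};

/* Hypothetical helper: allocate one zeroed item; the caller links it. */
static struct sketch_rmap_item *sketch_alloc(unsigned long address)
{
	struct sketch_rmap_item *item = calloc(1, sizeof(*item));

	assert(item != NULL);
	item->address = address;
	return item;
}

/*
 * Walk the sorted list from the cursor *rmap_list. Return the item whose
 * page address equals addr, or NULL if there is none. Items that sit
 * below addr are stale, so they are unlinked and freed on the way.
 */
static struct sketch_rmap_item *
sketch_try_to_get_old(unsigned long addr, struct sketch_rmap_item **rmap_list)
{
	while (*rmap_list) {
		struct sketch_rmap_item *item = *rmap_list;

		if ((item->address & SKETCH_PAGE_MASK) == addr)
			return item;
		if (item->address > addr)
			break;
		*rmap_list = item->rmap_list;
		free(item);
	}
	return NULL;
}
```

A caller that gets NULL back would then allocate a fresh item, which is the
part that stays in get_next_rmap_item() in the patch.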
Signed-off-by: xu xin
Cc: David Hildenbrand
Cc: Claudio Imbrenda
Cc: Xuexin Jiang
Reviewed-by: Xiaokai Ran
Reviewed-by: Yang Yang
---
 mm/ksm.c | 25 +++++++++++++++++++------
 1 file changed, 19 insertions(+), 6 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 83e2f74ae7da..5b0a7343ff4a 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2214,23 +2214,36 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
 	}
 }
 
-static struct ksm_rmap_item *get_next_rmap_item(struct ksm_mm_slot *mm_slot,
-					    struct ksm_rmap_item **rmap_list,
-					    unsigned long addr)
+static struct ksm_rmap_item *try_to_get_old_rmap_item(unsigned long addr,
+					 struct ksm_rmap_item **rmap_list)
 {
-	struct ksm_rmap_item *rmap_item;
-
 	while (*rmap_list) {
-		rmap_item = *rmap_list;
+		struct ksm_rmap_item *rmap_item = *rmap_list;
 		if ((rmap_item->address & PAGE_MASK) == addr)
 			return rmap_item;
 		if (rmap_item->address > addr)
 			break;
 		*rmap_list = rmap_item->rmap_list;
+		/* Running here indicates it's vma has been UNMERGEABLE */
 		remove_rmap_item_from_tree(rmap_item);
 		free_rmap_item(rmap_item);
 	}
 
+	return NULL;
+}
+
+static struct ksm_rmap_item *get_next_rmap_item(struct ksm_mm_slot *mm_slot,
+					    struct ksm_rmap_item **rmap_list,
+					    unsigned long addr)
+{
+	struct ksm_rmap_item *rmap_item;
+
+	/* Look up whether we have an old rmap_item matching addr */
+	rmap_item = try_to_get_old_rmap_item(addr, rmap_list);
+	if (rmap_item)
+		return rmap_item;
+
+	/* Need to allocate a new rmap_item */
 	rmap_item = alloc_rmap_item();
 	if (rmap_item) {
 		/* It has already been zeroed */

From patchwork Fri Dec 30 01:13:57 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Yang
X-Patchwork-Id: 37541
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp2668494wrt; Thu, 29 Dec 2022 17:17:00 -0800 (PST)
X-Google-Smtp-Source: AMrXdXtzE1kvFnPpKMYw/KFvJaDPxNRFYDO2sJKv6Nzjz4pGgkPP33MCm4ibSJwG4beF9nM/R/Rp
X-Received: by 2002:a05:6a20:1bc2:b0:ad:e765:9554 with SMTP id cv2-20020a056a201bc200b000ade7659554mr31855355pzb.55.1672363020402; Thu, 29 Dec 2022 17:17:00 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1672363020; cv=none; d=google.com; s=arc-20160816; b=xEAQpNbAEEoWzHwKOel6pdfX6lWfVl2l27K1bGtcZc5HWcJwmH54zZS9NXhsrIGvJz PlKfw/RIRZ34eq8qSgq3bWNXhgvFkXkmjfUm2OM1UHAKbcamK+6BUIEGjJAnLqn0hphy /wGoW9mmmlIYcDR883SmmokPBUs3EeJpjx4n8ggbiZRBO5FNYJIW41lw+9xjs9PmYpgS JkOuLMh6Ncs1/TADTT0L7ozDRkSVChDM9OYrEBTgatce/DHh+DAWQoYywK7vpomwKdYV Q91plaxTqVvgvUM/K64ob8BandOS4VKkWMQtSggtOXuDWxuIanH6zchSav4uBmsc21To 3gCQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:subject:cc:to:from:mime-version:message-id:date; bh=40t31l01hiP9u92oShLHjW2svT5tkL5iGUmHA1NCqTU=; b=rSPLjdRVbl48lfZhIYogeYYf9s5HccKy1zSJAv1xQ4Vc5GRXTuYTGRqTGbZIrl3Txk tznc8uHm/CzSL9D3PNm/sDsWuttgWAWAwGNujy8uPEAL41QoB07L77g5TBiW/geEzyrU Gwn1JJ26tp8M6ca/mrktLaBzOjs3M0TAJ92w7em4xtCV/OFy4mzHyVMW4/pnmvJhQTNM yc8YNcdwrYXH3mOHCNA4OGADaq88RdpZ6ApMcp0YvbOvieU3xuQh4fIfJYY8y3Cevrr7 SADXwBQcJS4XVB+gc4bdn27+VcJxZdRPg9UJHu9SJolbP7Aoj8EHtOu7EvhEGMk4Uihs Uwcw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=zte.com.cn Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id z12-20020a630a4c000000b0048e225a6ec4si20479885pgk.357.2022.12.29.17.16.48; Thu, 29 Dec 2022 17:17:00 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=zte.com.cn Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234146AbiL3BOH (ORCPT + 99 others); Thu, 29 Dec 2022 20:14:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34752 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234007AbiL3BOF (ORCPT ); Thu, 29 Dec 2022 20:14:05 -0500 Received: from mxct.zte.com.cn (mxct.zte.com.cn [183.62.165.209]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 65812FD0B for ; Thu, 29 Dec 2022 17:14:03 -0800 (PST) Received: from mse-fl2.zte.com.cn (unknown [10.5.228.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mxct.zte.com.cn (FangMail) with ESMTPS id 4NjnMs3QrRz501Qf; Fri, 30 Dec 2022 09:14:01 +0800 (CST) Received: from szxlzmapp07.zte.com.cn ([10.5.230.251]) by mse-fl2.zte.com.cn with SMTP id 2BU1DuKS058130; Fri, 30 Dec 2022 09:13:56 +0800 (+08) (envelope-from yang.yang29@zte.com.cn) Received: from mapi (szxlzmapp01[null]) by mapi (Zmail) with MAPI id mid14; Fri, 30 Dec 2022 09:13:57 +0800 (CST) Date: Fri, 30 Dec 2022 09:13:57 +0800 (CST) X-Zmail-TransId: 2b0363ae3b55650b66d5 X-Mailer: Zmail v1.0 Message-ID: <202212300913573751808@zte.com.cn> Mime-Version: 1.0 From: To: Cc: , , , , , , , , Subject: =?utf-8?q?=5BPATCH_v5_2/6=5D_ksm=3A_support_unsharing_zero_pages_pl?= =?utf-8?q?aced_by_KSM?= X-MAIL: 
mse-fl2.zte.com.cn 2BU1DuKS058130
X-Fangmail-Gw-Spam-Type: 0
X-FangMail-Miltered: at cgslv5.04-192.168.251.13.novalocal with ID 63AE3B59.000 by FangMail milter!
X-FangMail-Envelope: 1672362841/4NjnMs3QrRz501Qf/63AE3B59.000/10.5.228.133/[10.5.228.133]/mse-fl2.zte.com.cn/
X-Fangmail-Anti-Spam-Filtered: true
X-Fangmail-MID-QID: 63AE3B59.000/4NjnMs3QrRz501Qf
X-Spam-Status: No, score=0.6 required=5.0 tests=BAYES_00,SORTED_RECIPS, SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1753599726375752286?=
X-GMAIL-MSGID: =?utf-8?q?1753599726375752286?=

From: xu xin

use_zero_pages can be very useful, not only because of cache colouring
as described in the documentation, but also because it accelerates the
merging of empty pages (pages full of zeros) when there are plenty of
them: the time spent on page-by-page comparisons
(unstable_tree_search_insert) is saved.

However, when use_zero_pages is enabled, madvise(addr, len,
MADV_UNMERGEABLE) and other ways of triggering unsharing (such as
writing 2 to /sys/kernel/mm/ksm/run) do *not* actually unshare the
shared zeropage placed by KSM, which contradicts the MADV_UNMERGEABLE
documentation. Because these KSM-placed zero pages are outside KSM's
control, the KSM page counters do not expose how many zero pages KSM
has placed. (These special zero pages differ from zero pages that were
mapped initially, because zero pages mapped into MADV_UNMERGEABLE areas
are expected to be complete, unshared pages.)

To avoid blindly unsharing all shared zero pages in applicable VMAs,
the patch introduces a dedicated flag, ZERO_PAGE_FLAG, to mark the
rmap_items of those shared zero pages, and guarantees that these
rmap_items are not freed while the zero pages have not been written
to, so that only the *KSM-placed* zero pages are unshared.

The patch does not degrade the performance of use_zero_pages, as it
does not change the way that feature merges empty pages.

Signed-off-by: xu xin
Reported-by: David Hildenbrand
Cc: Claudio Imbrenda
Cc: Xuexin Jiang
Reviewed-by: Xiaokai Ran
Reviewed-by: Yang Yang
---
 mm/ksm.c | 141 +++++++++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 111 insertions(+), 30 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 5b0a7343ff4a..652c088f9786 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -214,6 +214,7 @@ struct ksm_rmap_item {
 #define SEQNR_MASK	0x0ff	/* low bits of unstable tree seqnr */
 #define UNSTABLE_FLAG	0x100	/* is a node of the unstable tree */
 #define STABLE_FLAG	0x200	/* is listed from the stable tree */
+#define ZERO_PAGE_FLAG	0x400	/* is zero page placed by KSM */
 
 /* The stable and unstable tree heads */
 static struct rb_root one_stable_tree[1] = { RB_ROOT };
@@ -420,6 +421,11 @@ static inline bool ksm_test_exit(struct mm_struct *mm)
 	return atomic_read(&mm->mm_users) == 0;
 }
 
+enum break_ksm_pmd_entry_return_flag {
+	HAVE_KSM_PAGE = 1,
+	HAVE_ZERO_PAGE
+};
+
 static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long next,
 			struct mm_walk *walk)
 {
@@ -427,6 +433,7 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
 	spinlock_t *ptl;
 	pte_t *pte;
 	int ret;
+	bool is_zero_page = false;
 
 	if (pmd_leaf(*pmd) || !pmd_present(*pmd))
 		return 0;
@@ -434,6 +441,8 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
 	pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
 	if (pte_present(*pte)) {
 		page = vm_normal_page(walk->vma, addr, *pte);
+		if (!page)
+			is_zero_page = is_zero_pfn(pte_pfn(*pte));
 	} else if (!pte_none(*pte)) {
 		swp_entry_t entry = pte_to_swp_entry(*pte);
 
@@ -444,7 +453,14 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
 		if (is_migration_entry(entry))
 			page = pfn_swap_entry_to_page(entry);
 	}
-	ret = page && PageKsm(page);
+
+	if (page && PageKsm(page))
+		ret = HAVE_KSM_PAGE;
+	else if (is_zero_page)
+		ret = HAVE_ZERO_PAGE;
+	else
+		ret = 0;
+
 	pte_unmap_unlock(pte, ptl);
 	return ret;
 }
@@ -466,19 +482,22 @@ static const struct mm_walk_ops break_ksm_ops = {
  * of the process that owns 'vma'. We also do not want to enforce
  * protection keys here anyway.
  */
-static int break_ksm(struct vm_area_struct *vma, unsigned long addr)
+static int break_ksm(struct vm_area_struct *vma, unsigned long addr,
+				     bool unshare_zero_page)
 {
 	vm_fault_t ret = 0;
 
 	do {
-		int ksm_page;
+		int walk_result;
 
 		cond_resched();
-		ksm_page = walk_page_range_vma(vma, addr, addr + 1,
+		walk_result = walk_page_range_vma(vma, addr, addr + 1,
 					       &break_ksm_ops, NULL);
-		if (WARN_ON_ONCE(ksm_page < 0))
-			return ksm_page;
-		if (!ksm_page)
+		if (WARN_ON_ONCE(walk_result < 0))
+			return walk_result;
+		if (!walk_result)
+			return 0;
+		if (walk_result == HAVE_ZERO_PAGE && !unshare_zero_page)
 			return 0;
 		ret = handle_mm_fault(vma, addr,
 				      FAULT_FLAG_UNSHARE | FAULT_FLAG_REMOTE,
@@ -539,7 +558,7 @@ static void break_cow(struct ksm_rmap_item *rmap_item)
 	mmap_read_lock(mm);
 	vma = find_mergeable_vma(mm, addr);
 	if (vma)
-		break_ksm(vma, addr);
+		break_ksm(vma, addr, false);
 	mmap_read_unlock(mm);
 }
 
@@ -764,6 +783,30 @@ static struct page *get_ksm_page(struct ksm_stable_node *stable_node,
 	return NULL;
 }
 
+/*
+ * Clear the rmap_item's ZERO_PAGE_FLAG.
+ * Called when a KSM-placed zero page is unshared or written to.
+ */
+static inline void clean_rmap_item_zero_flag(struct ksm_rmap_item *rmap_item)
+{
+	if (rmap_item->address & ZERO_PAGE_FLAG)
+		rmap_item->address &= PAGE_MASK;
+}
+
+/* Only called when rmap_item is going to be freed */
+static inline void unshare_zero_pages(struct ksm_rmap_item *rmap_item)
+{
+	struct vm_area_struct *vma;
+
+	if (rmap_item->address & ZERO_PAGE_FLAG) {
+		vma = vma_lookup(rmap_item->mm, rmap_item->address);
+		if (vma && !ksm_test_exit(rmap_item->mm))
+			break_ksm(vma, rmap_item->address, true);
+	}
+	/* Clear the flag last, after the check above has used it. */
+	clean_rmap_item_zero_flag(rmap_item);
+}
+
 /*
  * Removing rmap_item from stable or unstable tree.
  * This function will clean the information from the stable/unstable tree.
@@ -824,6 +867,7 @@ static void remove_trailing_rmap_items(struct ksm_rmap_item **rmap_list)
 		struct ksm_rmap_item *rmap_item = *rmap_list;
 		*rmap_list = rmap_item->rmap_list;
 		remove_rmap_item_from_tree(rmap_item);
+		unshare_zero_pages(rmap_item);
 		free_rmap_item(rmap_item);
 	}
 }
@@ -853,7 +897,7 @@ static int unmerge_ksm_pages(struct vm_area_struct *vma,
 		if (signal_pending(current))
 			err = -ERESTARTSYS;
 		else
-			err = break_ksm(vma, addr);
+			err = break_ksm(vma, addr, false);
 	}
 	return err;
 }
@@ -2044,6 +2088,39 @@ static void stable_tree_append(struct ksm_rmap_item *rmap_item,
 	rmap_item->mm->ksm_merging_pages++;
 }
 
+static int try_to_merge_with_kernel_zero_page(struct ksm_rmap_item *rmap_item,
+				       struct page *page)
+{
+	struct mm_struct *mm = rmap_item->mm;
+	int err = 0;
+
+	/*
+	 * The rmap_item should not carry ZERO_PAGE_FLAG here: on one hand,
+	 * get_next_rmap_item() does not return the rmap_items of zero
+	 * pages; on the other hand, even if the zero page was written to
+	 * and became an anonymous page, the flag has already been cleared
+	 * after stable_tree_search().
+	 */
+	if (!WARN_ON_ONCE(rmap_item->address & ZERO_PAGE_FLAG)) {
+		struct vm_area_struct *vma;
+
+		mmap_read_lock(mm);
+		vma = find_mergeable_vma(mm, rmap_item->address);
+		if (vma) {
+			err = try_to_merge_one_page(vma, page,
+					ZERO_PAGE(rmap_item->address));
+			if (!err)
+				rmap_item->address |= ZERO_PAGE_FLAG;
+		} else {
+			/* If the vma is out of date, we do not need to continue. */
+			err = 0;
+		}
+		mmap_read_unlock(mm);
+	}
+
+	return err;
+}
+
 /*
  * cmp_and_merge_page - first see if page can be merged into the stable tree;
  * if not, compare checksum to previous and if it's the same, see if page can
@@ -2055,7 +2132,6 @@ static void stable_tree_append(struct ksm_rmap_item *rmap_item,
  */
 static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_item)
 {
-	struct mm_struct *mm = rmap_item->mm;
 	struct ksm_rmap_item *tree_rmap_item;
 	struct page *tree_page = NULL;
 	struct ksm_stable_node *stable_node;
@@ -2092,6 +2168,7 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
 	}
 
 	remove_rmap_item_from_tree(rmap_item);
+	clean_rmap_item_zero_flag(rmap_item);
 
 	if (kpage) {
 		if (PTR_ERR(kpage) == -EBUSY)
@@ -2128,29 +2205,16 @@ static void cmp_and_merge_page(struct page *page, struct ksm_rmap_item *rmap_ite
 	 * Same checksum as an empty page. We attempt to merge it with the
 	 * appropriate zero page if the user enabled this via sysfs.
 	 */
-	if (ksm_use_zero_pages && (checksum == zero_checksum)) {
-		struct vm_area_struct *vma;
-
-		mmap_read_lock(mm);
-		vma = find_mergeable_vma(mm, rmap_item->address);
-		if (vma) {
-			err = try_to_merge_one_page(vma, page,
-					ZERO_PAGE(rmap_item->address));
-		} else {
+	if (ksm_use_zero_pages) {
+		if (checksum == zero_checksum)
 			/*
-			 * If the vma is out of date, we do not need to
-			 * continue.
+			 * In case of failure, the page was not really empty, so we
+			 * need to continue. Otherwise we're done.
 			 */
-			err = 0;
-		}
-		mmap_read_unlock(mm);
-		/*
-		 * In case of failure, the page was not really empty, so we
-		 * need to continue. Otherwise we're done.
-		 */
-		if (!err)
-			return;
+			if (!try_to_merge_with_kernel_zero_page(rmap_item, page))
+				return;
 	}
+
 	tree_rmap_item =
 		unstable_tree_search_insert(rmap_item, page, &tree_page);
 	if (tree_rmap_item) {
@@ -2226,6 +2290,7 @@ static struct ksm_rmap_item *try_to_get_old_rmap_item(unsigned long addr,
 		*rmap_list = rmap_item->rmap_list;
 		/* Running here indicates it's vma has been UNMERGEABLE */
 		remove_rmap_item_from_tree(rmap_item);
+		unshare_zero_pages(rmap_item);
 		free_rmap_item(rmap_item);
 	}
 
@@ -2350,6 +2415,22 @@ static struct ksm_rmap_item *scan_get_next_rmap_item(struct page **page)
 			}
 			if (is_zone_device_page(*page))
 				goto next_page;
+			if (is_zero_pfn(page_to_pfn(*page))) {
+				/*
+				 * To monitor KSM zero pages, which have become
+				 * non-anonymous, keep each zero page's rmap_item by
+				 * walking ksm_scan.rmap_list with
+				 * try_to_get_old_rmap_item(). Otherwise the next call
+				 * to get_next_rmap_item() would free these rmap_items:
+				 * it frees every "skipped" rmap_item because it treats
+				 * the corresponding area as UNMERGEABLE.
+				 */
+				rmap_item = try_to_get_old_rmap_item(ksm_scan.address,
+						ksm_scan.rmap_list);
+				if (rmap_item && (rmap_item->address & ZERO_PAGE_FLAG))
+					ksm_scan.rmap_list = &rmap_item->rmap_list;
+				goto next_page;
+			}
 			if (PageAnon(*page)) {
 				flush_anon_page(vma, *page, ksm_scan.address);
 				flush_dcache_page(*page);

From patchwork Fri Dec 30 01:15:14 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Yang
X-Patchwork-Id: 37543
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp2669347wrt; Thu, 29 Dec 2022 17:19:51 -0800 (PST)
X-Google-Smtp-Source: AMrXdXsohcptIvXqg2gY0akwfNgEsAUUamU1O04B3bHre98KRgp/saEr78TllPWpb0yxJhdZZlxw
X-Received: by 2002:a17:907:7da1:b0:7c0:d6b6:1ee9 with SMTP id oz33-20020a1709077da100b007c0d6b61ee9mr30843539ejc.11.1672363190874; Thu, 29 Dec 2022 17:19:50 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1672363190; cv=none; d=google.com; s=arc-20160816; b=aj/xFa/ftE9e/qAiQgdX+lk+OAdYsyA7uusYnOg8URln2cqX7uNurdGaoziBZb86m2 GZRElBmMlolE2MhNaUKlPtDVl7CecmjTkElbV+ZVLLhGzoT7+MzBXvTUYnlH/N/jxUjo CuNxTIpSJUn+niofBGo+cCbRlcEK1iWjGDHPethrEfoEpZ11gBKsCFwHs3yD2ZZjttLg iXhl/OdcHBIlxdwcLRy80TJDDSTzSL1+ViMmadug27bRB5S5iZcDOd2JeW8IhUS0JNYQ y1zwlcwVWntotb0YfDSuHHp5wkPGlTlcB5Wd2E9jkGjZt4YKGJBumkwHrsP2lJXHL01k k8Mg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:subject:cc:to:from:mime-version:message-id:date; bh=vJ+WhRMHBEmLRP1qcZM3trDK4cV64+QE7FKeYlR0Q4I=; b=NJeZIUfeHSg1MMs1Jk9W5JAjiA+wEmHNS8Qh7Tt9PdCluW8Skvy1uK9l32CS9crW25 Xt29DKm9R8NZtb5xCsKb4nV8gXOgP3oWnzOUrnduJVxtuEmaZwYZoBoc0Pel69TLi6I/ dfW99RpF1745eX98lKnv4hsSjpttOVO5vYf/Ny59I0TiNdVBy+sKYqz05IwOVgnobDsB dxVjxgtJUNHDETf8UG4ome7a8/m6vj/tII8sMWXT2q/ZVp6PnCv3p7aYO5/LeFrz2uvi XQs8p9UNl1bVtS6W0R8E/zyFk6sWAKKnADlTt7IET6aUaABfWJasgsky7BM60CmJIcln 4msw==
ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of
linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=zte.com.cn Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id di15-20020a170906730f00b0080de94d6270si18392627ejc.304.2022.12.29.17.19.27; Thu, 29 Dec 2022 17:19:50 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=zte.com.cn Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234141AbiL3BPX (ORCPT + 99 others); Thu, 29 Dec 2022 20:15:23 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234086AbiL3BPU (ORCPT ); Thu, 29 Dec 2022 20:15:20 -0500 Received: from mxhk.zte.com.cn (mxhk.zte.com.cn [63.216.63.35]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 60E1417409 for ; Thu, 29 Dec 2022 17:15:19 -0800 (PST) Received: from mse-fl1.zte.com.cn (unknown [10.5.228.132]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mxhk.zte.com.cn (FangMail) with ESMTPS id 4NjnPK4sdDz5PkHg; Fri, 30 Dec 2022 09:15:17 +0800 (CST) Received: from szxlzmapp06.zte.com.cn ([10.5.230.252]) by mse-fl1.zte.com.cn with SMTP id 2BU1FDjd083722; Fri, 30 Dec 2022 09:15:13 +0800 (+08) (envelope-from yang.yang29@zte.com.cn) Received: from mapi (szxlzmapp01[null]) by mapi (Zmail) with MAPI id mid14; Fri, 30 Dec 2022 09:15:14 +0800 (CST) Date: Fri, 30 Dec 2022 09:15:14 +0800 (CST) X-Zmail-TransId: 
2b0363ae3ba2067bb0e7
X-Mailer: Zmail v1.0
Message-ID: <202212300915147801864@zte.com.cn>
Mime-Version: 1.0
From:
To:
Cc: , , , , , , , ,
Subject: [PATCH v5 3/6] ksm: count all zero pages placed by KSM
X-MAIL: mse-fl1.zte.com.cn 2BU1FDjd083722
X-Fangmail-Gw-Spam-Type: 0
X-FangMail-Miltered: at cgslv5.04-192.168.250.138.novalocal with ID 63AE3BA5.000 by FangMail milter!
X-FangMail-Envelope: 1672362917/4NjnPK4sdDz5PkHg/63AE3BA5.000/10.5.228.132/[10.5.228.132]/mse-fl1.zte.com.cn/
X-Fangmail-Anti-Spam-Filtered: true
X-Fangmail-MID-QID: 63AE3BA5.000/4NjnPK4sdDz5PkHg
X-Spam-Status: No, score=0.6 required=5.0 tests=BAYES_00,SORTED_RECIPS, SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?=
X-GMAIL-THRID: =?utf-8?q?1753599905151833052?=
X-GMAIL-MSGID: =?utf-8?q?1753599905151833052?=

From: xu xin

As pages_sharing and pages_shared do not include the zero pages merged
by KSM, we cannot tell how many pages are KSM-placed zero pages when
use_zero_pages is enabled, so KSM is not transparent about all the
pages it has actually merged. In the early days of use_zero_pages,
zero pages could not be unshared by means such as MADV_UNMERGEABLE, so
it was hard to count how many times one of those zero pages was later
unmerged.

Now that KSM-placed zero pages can be unshared accurately, we can
easily count both how many times a page full of zeroes was merged with
the zero page and how many times one of those pages was then unmerged.
This helps to estimate memory demand for the case where each and every
shared page gets unshared.

So add zero_pages_sharing under /sys/kernel/mm/ksm/ to show the number
of all zero pages placed by KSM.
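
As an illustration of the accounting described above, the following
user-space sketch pairs every flag set with a counter increment and every
flag clear with a decrement. It only mirrors the idea; the sketch_* names and
the 4 KiB page mask are assumptions for the example, and the real code
operates on struct ksm_rmap_item inside mm/ksm.c.

```c
#include <assert.h>

#define SKETCH_PAGE_MASK	(~0xfffUL)	/* assumption: 4 KiB pages */
#define SKETCH_ZERO_PAGE_FLAG	0x400UL		/* same bit value as ZERO_PAGE_FLAG */

/* Hypothetical stand-in: the page address with flag bits in the low bits. */
struct sketch_rmap_item {
	unsigned long address;
};

/* Stand-in for the global counter that the sysfs file would report. */
static unsigned long sketch_zero_pages_sharing;

/* Called after a successful merge with the zero page. */
static void sketch_mark_zero_page(struct sketch_rmap_item *item)
{
	assert(!(item->address & SKETCH_ZERO_PAGE_FLAG));
	item->address |= SKETCH_ZERO_PAGE_FLAG;
	sketch_zero_pages_sharing++;
}

/* Called on unshare or write; does nothing if the flag is not set. */
static void sketch_clean_zero_flag(struct sketch_rmap_item *item)
{
	if (item->address & SKETCH_ZERO_PAGE_FLAG) {
		sketch_zero_pages_sharing--;
		item->address &= SKETCH_PAGE_MASK;
	}
}
```

Because the decrement only happens while the flag is set, clearing an item
twice cannot push the counter below the number of flagged items.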
Signed-off-by: xu xin
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: Xuexin Jiang
Reviewed-by: Xiaokai Ran
Reviewed-by: Yang Yang

v4->v5: fix warning
mm/ksm.c:3238:9: warning: no previous prototype for 'zero_pages_sharing_show' [-Wmissing-prototypes].
---
 mm/ksm.c | 19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/mm/ksm.c b/mm/ksm.c
index 652c088f9786..72c0722be280 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -276,6 +276,9 @@ static unsigned int zero_checksum __read_mostly;
 /* Whether to merge empty (zeroed) pages with actual zero pages */
 static bool ksm_use_zero_pages __read_mostly;
 
+/* The number of zero pages placed by KSM use_zero_pages */
+static unsigned long ksm_zero_pages_sharing;
+
 #ifdef CONFIG_NUMA
 /* Zeroed when merging across nodes is not allowed */
 static unsigned int ksm_merge_across_nodes = 1;
@@ -789,8 +792,10 @@ static struct page *get_ksm_page(struct ksm_stable_node *stable_node,
  */
 static inline void clean_rmap_item_zero_flag(struct ksm_rmap_item *rmap_item)
 {
-	if (rmap_item->address & ZERO_PAGE_FLAG)
+	if (rmap_item->address & ZERO_PAGE_FLAG) {
+		ksm_zero_pages_sharing--;
 		rmap_item->address &= PAGE_MASK;
+	}
 }
 
 /* Only called when rmap_item is going to be freed */
@@ -2109,8 +2114,10 @@ static int try_to_merge_with_kernel_zero_page(struct ksm_rmap_item *rmap_item,
 		if (vma) {
 			err = try_to_merge_one_page(vma, page,
 					ZERO_PAGE(rmap_item->address));
-			if (!err)
+			if (!err) {
 				rmap_item->address |= ZERO_PAGE_FLAG;
+				ksm_zero_pages_sharing++;
+			}
 		} else {
 			/* If the vma is out of date, we do not need to continue. */
 			err = 0;
@@ -3228,6 +3235,13 @@ static ssize_t pages_volatile_show(struct kobject *kobj,
 }
 KSM_ATTR_RO(pages_volatile);
 
+static ssize_t zero_pages_sharing_show(struct kobject *kobj,
+				struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%lu\n", ksm_zero_pages_sharing);
+}
+KSM_ATTR_RO(zero_pages_sharing);
+
 static ssize_t stable_node_dups_show(struct kobject *kobj,
 				     struct kobj_attribute *attr, char *buf)
 {
@@ -3283,6 +3297,7 @@ static struct attribute *ksm_attrs[] = {
 	&pages_sharing_attr.attr,
 	&pages_unshared_attr.attr,
 	&pages_volatile_attr.attr,
+	&zero_pages_sharing_attr.attr,
 	&full_scans_attr.attr,
 #ifdef CONFIG_NUMA
 	&merge_across_nodes_attr.attr,

From patchwork Fri Dec 30 01:16:29 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Yang
X-Patchwork-Id: 37542
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a5d:4e01:0:0:0:0:0 with SMTP id p1csp2669191wrt; Thu, 29 Dec 2022 17:19:17 -0800 (PST)
X-Google-Smtp-Source: AMrXdXvDHe8/mSRTLY1l5MUlhN5/l1LkUg1XOkM22jTxyrgphUv4oL8ZSB0DqjecuigmJfVEzh2i
X-Received: by 2002:a17:907:d68b:b0:7c1:691a:6d2c with SMTP id wf11-20020a170907d68b00b007c1691a6d2cmr36003529ejc.7.1672363157315; Thu, 29 Dec 2022 17:19:17 -0800 (PST)
ARC-Seal: i=1; a=rsa-sha256; t=1672363157; cv=none; d=google.com; s=arc-20160816; b=g0nQGQ+h55ffWlNidlTjUsPT4qVRkeaujbR3SfLsqwQpHQd88b81UeHSjIjIRfJRO6 P7ulkXVqy6FpjvRsuWAvtcQG4Mws3s/HUyGqO+KEtd0xNaSlfBnCxl6c7fhAa6xy3xQw m6rLoUYUqhd4tZpqvEh2BXEgbl3yY3X9Fk4mEsx2ne3S03tlLQ/J37PwbgDRaZVCrrga KcNY1LeXuxLCPL3E0oazy4AZQTGibSXMznAiCXtw0cuk0pFTkPU/26cXjgxDGQO3l3pl /0ZK1Eq41C55HD+g2h1MJj7ZH/DMgCr+PUusex2AK0JVYqwaGHE1uM7QLonPrbTTfzHx J9sg==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:subject:cc:to:from:mime-version:message-id:date; bh=lcOvOXzlofwtDdJWJO/YaawEgINJw6332MG6HHQ7ykY=; b=ss1SR+/9qWKp5mLdzLQa66fZ+QF3Jr4gEaRmCfb1Y43Gzxwj+HaJEfPYSl1a+2w+Am
BQWo/THbWv2U83GOUXDgs9gNmTndk0/Y+LTSVEhXyn2E5payRyh1EiLfkq4mPlIoi0vT fxwwG7NT5qFihTvhT0pyH7ybpAdIjZr7NO9J6lEYfO8eKPcKoO0A0xLozD1g065HoA9Q xURCQDIdWVekma2KGqYTv7rjs7Ggy2mmgv/rrfOb2Rfmt6nuuehBVu5df/dEvGm6xrxa hNuZAP+D3DspO4ZW2kljSnXGIirHdLxPqIb7sGoCFy0GEn+AYDFuMyWJ2UrenKaZhmfb 2D1w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=zte.com.cn Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id ji11-20020a170907980b00b0084514612c28si16798031ejc.612.2022.12.29.17.18.53; Thu, 29 Dec 2022 17:19:17 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=zte.com.cn Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234216AbiL3BQh (ORCPT + 99 others); Thu, 29 Dec 2022 20:16:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36002 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234169AbiL3BQg (ORCPT ); Thu, 29 Dec 2022 20:16:36 -0500 Received: from mxct.zte.com.cn (mxct.zte.com.cn [183.62.165.209]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CD716FB for ; Thu, 29 Dec 2022 17:16:34 -0800 (PST) Received: from mse-fl1.zte.com.cn (unknown [10.5.228.132]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mxct.zte.com.cn (FangMail) with ESMTPS id 4NjnQn2W1mz501Qh; Fri, 30 Dec 2022 09:16:33 +0800 (CST) 
Received: from szxlzmapp06.zte.com.cn ([10.5.230.252]) by mse-fl1.zte.com.cn with SMTP id 2BU1GSAF084154; Fri, 30 Dec 2022 09:16:28 +0800 (+08) (envelope-from yang.yang29@zte.com.cn) Received: from mapi (szxlzmapp01[null]) by mapi (Zmail) with MAPI id mid14; Fri, 30 Dec 2022 09:16:29 +0800 (CST) Date: Fri, 30 Dec 2022 09:16:29 +0800 (CST) X-Zmail-TransId: 2b0363ae3bed7a8bf824 X-Mailer: Zmail v1.0 Message-ID: <202212300916292181912@zte.com.cn> Mime-Version: 1.0 From: To: Cc: , , , , , , , , Subject: =?utf-8?q?=5BPATCH_v5_4/6=5D_ksm=3A_count_zero_pages_for_each_proce?= =?utf-8?q?ss?= X-MAIL: mse-fl1.zte.com.cn 2BU1GSAF084154 X-Fangmail-Gw-Spam-Type: 0 X-FangMail-Miltered: at cgslv5.04-192.168.251.13.novalocal with ID 63AE3BF1.000 by FangMail milter! X-FangMail-Envelope: 1672362993/4NjnQn2W1mz501Qh/63AE3BF1.000/10.5.228.132/[10.5.228.132]/mse-fl1.zte.com.cn/ X-Fangmail-Anti-Spam-Filtered: true X-Fangmail-MID-QID: 63AE3BF1.000/4NjnQn2W1mz501Qh X-Spam-Status: No, score=0.6 required=5.0 tests=BAYES_00,SORTED_RECIPS, SPF_HELO_NONE,SPF_PASS,UNPARSEABLE_RELAY autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1753599869979645050?= X-GMAIL-MSGID: =?utf-8?q?1753599869979645050?= From: xu xin As the number of ksm zero pages is not included in ksm_merging_pages per process when enabling use_zero_pages, it's unclear of how many actual pages are merged by KSM. To let users accurately estimate their memory demands when unsharing KSM zero-pages, it's necessary to show KSM zero- pages per process. since unsharing zero pages placed by KSM accurately is achieved, then tracking empty pages merging and unmerging is not a difficult thing any longer. Since we already have /proc//ksm_stat, just add the information of zero_pages_sharing in it. 

Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: Xuexin Jiang
Cc: Xiaokai Ran
Cc: Yang Yang
Signed-off-by: xu xin
---
 fs/proc/base.c           | 1 +
 include/linux/mm_types.h | 7 ++++++-
 mm/ksm.c                 | 2 ++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9e479d7d202b..ac9ebe972be0 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -3207,6 +3207,7 @@ static int proc_pid_ksm_stat(struct seq_file *m, struct pid_namespace *ns,
 	mm = get_task_mm(task);
 	if (mm) {
 		seq_printf(m, "ksm_rmap_items %lu\n", mm->ksm_rmap_items);
+		seq_printf(m, "zero_pages_sharing %lu\n", mm->ksm_zero_pages_sharing);
 		mmput(mm);
 	}

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 4e1031626403..5c734ebc1890 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -776,7 +776,7 @@ struct mm_struct {
 #ifdef CONFIG_KSM
 		/*
 		 * Represent how many pages of this process are involved in KSM
-		 * merging.
+		 * merging (not including ksm_zero_pages_sharing).
 		 */
 		unsigned long ksm_merging_pages;
 		/*
@@ -784,6 +784,11 @@ struct mm_struct {
 		 * including merged and not merged.
 		 */
 		unsigned long ksm_rmap_items;
+		/*
+		 * Represent how many empty pages are merged with kernel zero
+		 * pages when enabling KSM use_zero_pages.
+		 */
+		unsigned long ksm_zero_pages_sharing;
 #endif
 #ifdef CONFIG_LRU_GEN
 		struct {
diff --git a/mm/ksm.c b/mm/ksm.c
index 72c0722be280..083f5d125373 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -794,6 +794,7 @@ static inline void clean_rmap_item_zero_flag(struct ksm_rmap_item *rmap_item)
 {
 	if (rmap_item->address & ZERO_PAGE_FLAG) {
 		ksm_zero_pages_sharing--;
+		rmap_item->mm->ksm_zero_pages_sharing--;
 		rmap_item->address &= PAGE_MASK;
 	}
 }
@@ -2117,6 +2118,7 @@ static int try_to_merge_with_kernel_zero_page(struct ksm_rmap_item *rmap_item,
 			if (!err) {
 				rmap_item->address |= ZERO_PAGE_FLAG;
 				ksm_zero_pages_sharing++;
+				rmap_item->mm->ksm_zero_pages_sharing++;
 			}
 		} else {
 			/* If the vma is out of date, we do not need to continue. */

From patchwork Fri Dec 30 01:17:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Yang
X-Patchwork-Id: 37545
Date: Fri, 30 Dec 2022 09:17:28 +0800 (CST)
Message-ID: <202212300917284911971@zte.com.cn>
Subject: [PATCH v5 5/6] ksm: add zero_pages_sharing documentation

From: xu xin

When use_zero_pages is enabled, pages_sharing alone no longer
represents how much memory KSM actually saves; the sum
zero_pages_sharing + pages_sharing does. Add a description of
zero_pages_sharing.

Cc: Xiaokai Ran
Cc: Yang Yang
Cc: Jiang Xuexin
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Signed-off-by: xu xin
---
 Documentation/admin-guide/mm/ksm.rst | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/Documentation/admin-guide/mm/ksm.rst b/Documentation/admin-guide/mm/ksm.rst
index fb6ba2002a4b..f160f9487a90 100644
--- a/Documentation/admin-guide/mm/ksm.rst
+++ b/Documentation/admin-guide/mm/ksm.rst
@@ -173,6 +173,13 @@ stable_node_chains
         the number of KSM pages that hit the ``max_page_sharing`` limit
 stable_node_dups
         number of duplicated KSM pages
+zero_pages_sharing
+        how many empty pages are sharing kernel zero page(s) instead of
+        with each other as it would happen normally. Only effective when
+        enabling ``use_zero_pages`` knob.
+
+When enabling ``use_zero_pages``, the sum of ``pages_sharing`` +
+``zero_pages_sharing`` represents how much really saved by KSM.

 A high ratio of ``pages_sharing`` to ``pages_shared`` indicates good
 sharing, but a high ratio of ``pages_unshared`` to ``pages_sharing``

From patchwork Fri Dec 30 01:18:47 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Yang
X-Patchwork-Id: 37544
Date: Fri, 30 Dec 2022 09:18:47 +0800 (CST)
Message-ID: <202212300918477352037@zte.com.cn>
Subject: [PATCH v5 6/6] selftest: add testing unsharing and counting ksm zero page

From: xu xin

Add a function test_unmerge_zero_pages() to test the unsharing and
counting of KSM-placed zero pages introduced by this patch series.

test_unmerge_zero_pages() covers three sub-tests:
1) whether the KSM zero page count reacts correctly to COW (copy on
   write);
2) whether the KSM zero page count reacts correctly to unmerging;
3) whether KSM zero pages are really unmerged.

Signed-off-by: xu xin
Cc: Claudio Imbrenda
Cc: David Hildenbrand
Cc: Xuexin Jiang
Reviewed-by: Xiaokai Ran
Reviewed-by: Yang Yang

v4->v5:
fix "} while (end_scans < start_scans + 20);" to
"} while (end_scans < start_scans + 2);" in wait_two_full_scans().
---
 tools/testing/selftests/vm/ksm_functional_tests.c | 103 +++++++++++++++++++++-
 1 file changed, 99 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/vm/ksm_functional_tests.c b/tools/testing/selftests/vm/ksm_functional_tests.c
index b11b7e5115dc..b792798a54c4 100644
--- a/tools/testing/selftests/vm/ksm_functional_tests.c
+++ b/tools/testing/selftests/vm/ksm_functional_tests.c
@@ -27,6 +27,8 @@

 static int ksm_fd;
 static int ksm_full_scans_fd;
+static int ksm_zero_pages_fd;
+static int ksm_use_zero_pages_fd;
 static int pagemap_fd;
 static size_t pagesize;

@@ -57,6 +59,22 @@ static bool range_maps_duplicates(char *addr, unsigned long size)
 	return false;
 }

+static bool check_ksm_zero_pages_count(unsigned long zero_size)
+{
+	unsigned long pages_expected = zero_size / (4 * KiB);
+	char buf[20];
+	ssize_t read_size;
+	unsigned long ksm_zero_pages;
+
+	read_size = pread(ksm_zero_pages_fd, buf, sizeof(buf) - 1, 0);
+	if (read_size < 0)
+		return -errno;
+	buf[read_size] = 0;
+	ksm_zero_pages = strtol(buf, NULL, 10);
+
+	return ksm_zero_pages == pages_expected;
+}
+
 static long ksm_get_full_scans(void)
 {
 	char buf[10];
@@ -70,15 +88,12 @@ static long ksm_get_full_scans(void)
 	return strtol(buf, NULL, 10);
 }

-static int ksm_merge(void)
+static int wait_two_full_scans(void)
 {
 	long start_scans, end_scans;

-	/* Wait for two full scans such that any possible merging happened. */
 	start_scans = ksm_get_full_scans();
 	if (start_scans < 0)
-		return start_scans;
-	if (write(ksm_fd, "1", 1) != 1)
 		return -errno;
 	do {
 		end_scans = ksm_get_full_scans();
@@ -89,6 +104,34 @@ static int ksm_merge(void)
 	return 0;
 }

+static inline int ksm_merge(void)
+{
+	/* Wait for two full scans such that any possible merging happened. */
+	if (write(ksm_fd, "1", 1) != 1)
+		return -errno;
+	return wait_two_full_scans();
+}
+
+static inline int make_cow(char *map, char val, unsigned long size)
+{
+
+	memset(map, val, size);
+	return wait_two_full_scans();
+}
+
+static int unmerge_zero_page(char *start, unsigned long size)
+{
+	int ret;
+
+	ret = madvise(start, size, MADV_UNMERGEABLE);
+	if (ret) {
+		ksft_test_result_fail("MADV_UNMERGEABLE failed\n");
+		return ret;
+	}
+
+	return wait_two_full_scans();
+}
+
 static char *mmap_and_merge_range(char val, unsigned long size)
 {
 	char *map;
@@ -146,6 +189,56 @@ static void test_unmerge(void)
 	munmap(map, size);
 }

+static void test_unmerge_zero_pages(void)
+{
+	const unsigned int size = 2 * MiB;
+	char *map;
+
+	ksft_print_msg("[RUN] %s\n", __func__);
+
+	/* Confirm the interfaces*/
+	ksm_zero_pages_fd = open("/sys/kernel/mm/ksm/zero_pages_sharing", O_RDONLY);
+	if (ksm_zero_pages_fd < 0) {
+		ksft_test_result_skip("open(\"/sys/kernel/mm/ksm/zero_pages_sharing\") failed\n");
+		return;
+	}
+	ksm_use_zero_pages_fd = open("/sys/kernel/mm/ksm/use_zero_pages", O_RDWR);
+	if (ksm_use_zero_pages_fd < 0) {
+		ksft_test_result_skip("open \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
+		return;
+	}
+	if (write(ksm_use_zero_pages_fd, "1", 1) != 1) {
+		ksft_test_result_skip("write \"/sys/kernel/mm/ksm/use_zero_pages\" failed\n");
+		return;
+	}
+
+	/* Mmap zero pages*/
+	map = mmap_and_merge_range(0x00, size);
+
+	/* Case 1: make Writing on ksm zero pages (COW) */
+	if (make_cow(map, 0xcf, size / 2)) {
+		ksft_test_result_fail("COW failed\n");
+		goto unmap;
+	}
+	ksft_test_result(check_ksm_zero_pages_count(size / 2),
+			"zero page count react to cow\n");
+
+	/* Case 2: Call madvise(xxx, MADV_UNMERGEABLE)*/
+	if (unmerge_zero_page(map + size / 2, size / 4)) {
+		ksft_test_result_fail("unmerge_zero_page failed\n");
+		goto unmap;
+	}
+	ksft_test_result(check_ksm_zero_pages_count(size / 4),
+			"zero page count react to unmerge\n");
+
+	/*Check if ksm pages are really unmerged */
+	ksft_test_result(!range_maps_duplicates(map + size / 2, size / 4),
+			"KSM zero pages were unmerged\n");
+
+unmap:
+	munmap(map, size);
+}
+
 static void test_unmerge_discarded(void)
 {
 	const unsigned int size = 2 * MiB;
@@ -261,11 +354,13 @@ int main(int argc, char **argv)
 	ksm_full_scans_fd = open("/sys/kernel/mm/ksm/full_scans", O_RDONLY);
 	if (ksm_full_scans_fd < 0)
 		ksft_exit_skip("open(\"/sys/kernel/mm/ksm/full_scans\") failed\n");
+
 	pagemap_fd = open("/proc/self/pagemap", O_RDONLY);
 	if (pagemap_fd < 0)
 		ksft_exit_skip("open(\"/proc/self/pagemap\") failed\n");

 	test_unmerge();
+	test_unmerge_zero_pages();
 	test_unmerge_discarded();
 #ifdef __NR_userfaultfd
 	test_unmerge_uffd_wp();
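
The selftest in patch 6/6 checks only the global sysfs counter, while patch 4/6 also adds a per-process "zero_pages_sharing %lu" line to /proc/<pid>/ksm_stat. The following is a minimal userspace sketch of how that per-process value might be read, assuming the "key value" line format emitted by the seq_printf() calls in patch 4/6. The helper names ksm_stat_parse() and ksm_stat_read_self() are hypothetical and are not part of this series.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the series): look for a line of the
 * form "key value" in a ksm_stat-style buffer and store the value in *out.
 * Returns 0 on success, -ENOENT if the key is absent, and -EINVAL if the
 * value is not a plain unsigned decimal number.
 */
static int ksm_stat_parse(const char *text, const char *key, unsigned long *out)
{
	const char *line;
	size_t klen;

	assert(text && key && out);
	klen = strlen(key);

	for (line = text; line && *line; ) {
		const char *next = strchr(line, '\n');

		/* Match the whole key, followed by exactly one space. */
		if (!strncmp(line, key, klen) && line[klen] == ' ') {
			const char *num = line + klen + 1;
			unsigned long val;
			char *end;

			/* Reject signs and whitespace that strtoul() would accept. */
			if (!isdigit((unsigned char)*num))
				return -EINVAL;
			errno = 0;
			val = strtoul(num, &end, 10);
			if (errno || (*end != '\n' && *end != '\0'))
				return -EINVAL;
			*out = val;
			return 0;
		}
		line = next ? next + 1 : NULL;
	}
	return -ENOENT;
}

/*
 * Hypothetical helper: read /proc/self/ksm_stat and extract one field.
 * Assumes the whole file fits in the local buffer.
 */
static int ksm_stat_read_self(const char *key, unsigned long *out)
{
	char buf[512];
	size_t n;
	FILE *f = fopen("/proc/self/ksm_stat", "r");

	if (!f)
		return -errno;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return ksm_stat_parse(buf, key, out);
}
```

A caller would pass "zero_pages_sharing" as the key. On a kernel without this series the lookup is expected to fail with -ENOENT, which a test could treat as a skip rather than a failure.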