From patchwork Wed May 24 05:57:11 2023
X-Patchwork-Submitter: Yang Yang
X-Patchwork-Id: 98308
From: Yang Yang
To: akpm@linux-foundation.org, david@redhat.com
Cc: yang.yang29@zte.com.cn, imbrenda@linux.ibm.com, jiang.xuexin@zte.com.cn,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org, ran.xiaokai@zte.com.cn,
    xu.xin.sc@gmail.com, xu.xin16@zte.com.cn
Subject: [PATCH v9 1/5] ksm: support unsharing KSM-placed zero pages
Date: Wed, 24 May 2023 13:57:11 +0800
Message-Id: <20230524055711.20387-1-yang.yang29@zte.com.cn>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <202305241351365661923@zte.com.cn>
References: <202305241351365661923@zte.com.cn>
From: xu xin

When use_zero_pages of KSM is enabled, madvise(addr, len, MADV_UNMERGEABLE)
and other unsharing triggers (such as writing 2 to /sys/kernel/mm/ksm/run)
will *not* actually unshare the shared zeropages placed by KSM, which
contradicts the MADV_UNMERGEABLE documentation. Because these KSM-placed
zero pages are outside the control of KSM, the related KSM page counts do
not reflect how many zero pages were placed by KSM. (These special zero
pages differ from the initially mapped zero pages: a zero page mapped into
a MADV_UNMERGEABLE area is expected to be a complete, unshared page.)

To avoid blindly unsharing all shared zero pages in applicable VMAs, this
patch uses pte_mkdirty (which is architecture-dependent) to mark KSM-placed
zero pages. Thus, MADV_UNMERGEABLE will only unshare those KSM-placed zero
pages. In addition, later patches reuse this mechanism to reliably identify
KSM-placed zero pages so they can be accounted for properly (e.g., when
calculating a KSM profit that includes zeropages).

This patch does not degrade the performance of use_zero_pages, since it
does not change how empty pages are merged under that feature.
Signed-off-by: xu xin
Suggested-by: David Hildenbrand
Cc: Claudio Imbrenda
Cc: Xuexin Jiang
Reviewed-by: Xiaokai Ran
Reviewed-by: Yang Yang
Reviewed-by: David Hildenbrand
---
 include/linux/ksm.h |  8 ++++++++
 mm/ksm.c            | 11 ++++++++---
 2 files changed, 16 insertions(+), 3 deletions(-)

diff --git a/include/linux/ksm.h b/include/linux/ksm.h
index 899a314bc487..4fd5f4a50bac 100644
--- a/include/linux/ksm.h
+++ b/include/linux/ksm.h
@@ -26,6 +26,12 @@ int ksm_disable(struct mm_struct *mm);
 int __ksm_enter(struct mm_struct *mm);
 void __ksm_exit(struct mm_struct *mm);
 
+/*
+ * To identify zeropages that were mapped by KSM, we reuse the dirty bit
+ * in the PTE. If the PTE is dirty, the zeropage was mapped by KSM when
+ * deduplicating memory.
+ */
+#define is_ksm_zero_pte(pte)	(is_zero_pfn(pte_pfn(pte)) && pte_dirty(pte))
 
 static inline int ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm)
 {
@@ -95,6 +101,8 @@ static inline void ksm_exit(struct mm_struct *mm)
 {
 }
 
+#define is_ksm_zero_pte(pte)	0
+
 #ifdef CONFIG_MEMORY_FAILURE
 static inline void collect_procs_ksm(struct page *page,
 				     struct list_head *to_kill, int force_early)
diff --git a/mm/ksm.c b/mm/ksm.c
index 0156bded3a66..f31c789406b1 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -447,7 +447,8 @@ static int break_ksm_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long nex
 		if (is_migration_entry(entry))
 			page = pfn_swap_entry_to_page(entry);
 	}
-	ret = page && PageKsm(page);
+	/* return 1 if the page is a normal KSM page or a KSM-placed zero page */
+	ret = (page && PageKsm(page)) || is_ksm_zero_pte(*pte);
 	pte_unmap_unlock(pte, ptl);
 	return ret;
 }
@@ -1220,8 +1221,12 @@ static int replace_page(struct vm_area_struct *vma, struct page *page,
 		page_add_anon_rmap(kpage, vma, addr, RMAP_NONE);
 		newpte = mk_pte(kpage, vma->vm_page_prot);
 	} else {
-		newpte = pte_mkspecial(pfn_pte(page_to_pfn(kpage),
-					       vma->vm_page_prot));
+		/*
+		 * Use pte_mkdirty to mark the zero page mapped by KSM, and then
+		 * we can easily track all KSM-placed zero pages by checking if
+		 * the dirty bit in the zero page's PTE is set.
+		 */
+		newpte = pte_mkdirty(pte_mkspecial(pfn_pte(page_to_pfn(kpage), vma->vm_page_prot)));
		/*
		 * We're replacing an anonymous page with a zero page, which is
		 * not anonymous. We need to do proper accounting otherwise we