From patchwork Tue Sep 26 19:49:48 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 145143
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: riel@surriel.com, hannes@cmpxchg.org, mhocko@kernel.org,
 roman.gushchin@linux.dev, shakeelb@google.com, muchun.song@linux.dev,
 tj@kernel.org, lizefan.x@bytedance.com, shuah@kernel.org,
 mike.kravetz@oracle.com, yosryahmed@google.com, linux-mm@kvack.org,
 kernel-team@meta.com, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: [PATCH 1/2] hugetlb: memcg: account hugetlb-backed memory in memory controller
Date: Tue, 26 Sep 2023 12:49:48 -0700
Message-Id: <20230926194949.2637078-2-nphamcs@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230926194949.2637078-1-nphamcs@gmail.com>
References: <20230926194949.2637078-1-nphamcs@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, hugetlb memory usage is not accounted for in the memory
controller, which could lead to memory overprotection for cgroups with
hugetlb-backed memory. This has been observed in our production system.

This patch rectifies this issue by charging the memcg when the hugetlb
folio is allocated, and uncharging when the folio is freed (analogous to
the hugetlb controller).

Signed-off-by: Nhat Pham
---
 fs/hugetlbfs/inode.c       |  2 +-
 include/linux/hugetlb.h    |  6 ++++--
 include/linux/memcontrol.h |  8 ++++++++
 mm/hugetlb.c               | 23 ++++++++++++++++------
 mm/memcontrol.c            | 40 ++++++++++++++++++++++++++++++++++++++
 5 files changed, 70 insertions(+), 9 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 60fce26ff937..034967319955 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -902,7 +902,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
 		 * to keep reservation accounting consistent.
 		 */
 		hugetlb_set_vma_policy(&pseudo_vma, inode, index);
-		folio = alloc_hugetlb_folio(&pseudo_vma, addr, 0);
+		folio = alloc_hugetlb_folio(&pseudo_vma, addr, 0, true);
 		hugetlb_drop_vma_policy(&pseudo_vma);
 		if (IS_ERR(folio)) {
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index a30686e649f7..9b73db1605a2 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -713,7 +713,8 @@ struct huge_bootmem_page {

 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
-				unsigned long addr, int avoid_reserve);
+				unsigned long addr, int avoid_reserve,
+				bool restore_reserve_on_memcg_failure);
 struct folio *alloc_hugetlb_folio_nodemask(struct hstate *h, int preferred_nid,
 				nodemask_t *nmask, gfp_t gfp_mask);
 struct folio *alloc_hugetlb_folio_vma(struct hstate *h, struct vm_area_struct *vma,
@@ -1016,7 +1017,8 @@ static inline int isolate_or_dissolve_huge_page(struct page *page,

 static inline struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 					   unsigned long addr,
-					   int avoid_reserve)
+					   int avoid_reserve,
+					   bool restore_reserve_on_memcg_failure)
 {
 	return NULL;
 }
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index e0cfab58ab71..8094679c99dd 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -677,6 +677,8 @@ static inline int mem_cgroup_charge(struct folio *folio, struct mm_struct *mm,
 	return __mem_cgroup_charge(folio, mm, gfp);
 }

+int mem_cgroup_hugetlb_charge_folio(struct folio *folio, gfp_t gfp);
+
 int mem_cgroup_swapin_charge_folio(struct folio *folio, struct mm_struct *mm,
 				  gfp_t gfp, swp_entry_t entry);
 void mem_cgroup_swapin_uncharge_swap(swp_entry_t entry);
@@ -1251,6 +1253,12 @@ static inline int mem_cgroup_charge(struct folio *folio,
 	return 0;
 }

+static inline int mem_cgroup_hugetlb_charge_folio(struct folio *folio,
+		gfp_t gfp)
+{
+	return 0;
+}
+
 static inline int mem_cgroup_swapin_charge_folio(struct folio *folio,
 			struct mm_struct *mm, gfp_t gfp, swp_entry_t entry)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index de220e3ff8be..ff88ea4df11a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1902,6 +1902,7 @@ void free_huge_folio(struct folio *folio)
 				     pages_per_huge_page(h), folio);
 	hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
 					  pages_per_huge_page(h), folio);
+	mem_cgroup_uncharge(folio);
 	if (restore_reserve)
 		h->resv_huge_pages++;

@@ -3004,7 +3005,8 @@ int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list)
 }

 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
-				    unsigned long addr, int avoid_reserve)
+				    unsigned long addr, int avoid_reserve,
+				    bool restore_reserve_on_memcg_failure)
 {
 	struct hugepage_subpool *spool = subpool_vma(vma);
 	struct hstate *h = hstate_vma(vma);
@@ -3119,6 +3121,15 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
 			hugetlb_cgroup_uncharge_folio_rsvd(hstate_index(h),
 					pages_per_huge_page(h), folio);
 	}
+
+	/* undo allocation if memory controller disallows it. */
+	if (mem_cgroup_hugetlb_charge_folio(folio, GFP_KERNEL)) {
+		if (restore_reserve_on_memcg_failure)
+			restore_reserve_on_error(h, vma, addr, folio);
+		folio_put(folio);
+		return ERR_PTR(-ENOMEM);
+	}
+
 	return folio;

 out_uncharge_cgroup:
@@ -5179,7 +5190,7 @@ int copy_hugetlb_page_range(struct mm_struct *dst, struct mm_struct *src,
 				spin_unlock(src_ptl);
 				spin_unlock(dst_ptl);
 				/* Do not use reserve as it's private owned */
-				new_folio = alloc_hugetlb_folio(dst_vma, addr, 1);
+				new_folio = alloc_hugetlb_folio(dst_vma, addr, 1, false);
 				if (IS_ERR(new_folio)) {
 					folio_put(pte_folio);
 					ret = PTR_ERR(new_folio);
@@ -5656,7 +5667,7 @@ static vm_fault_t hugetlb_wp(struct mm_struct *mm, struct vm_area_struct *vma,
 	 * be acquired again before returning to the caller, as expected.
 	 */
 	spin_unlock(ptl);
-	new_folio = alloc_hugetlb_folio(vma, haddr, outside_reserve);
+	new_folio = alloc_hugetlb_folio(vma, haddr, outside_reserve, true);

 	if (IS_ERR(new_folio)) {
 		/*
@@ -5930,7 +5941,7 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm,
 							VM_UFFD_MISSING);
 		}

-		folio = alloc_hugetlb_folio(vma, haddr, 0);
+		folio = alloc_hugetlb_folio(vma, haddr, 0, true);
 		if (IS_ERR(folio)) {
 			/*
 			 * Returning error will result in faulting task being
@@ -6352,7 +6363,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 			goto out;
 		}

-		folio = alloc_hugetlb_folio(dst_vma, dst_addr, 0);
+		folio = alloc_hugetlb_folio(dst_vma, dst_addr, 0, true);
 		if (IS_ERR(folio)) {
 			ret = -ENOMEM;
 			goto out;
@@ -6394,7 +6405,7 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,
 			goto out;
 		}

-		folio = alloc_hugetlb_folio(dst_vma, dst_addr, 0);
+		folio = alloc_hugetlb_folio(dst_vma, dst_addr, 0, false);
 		if (IS_ERR(folio)) {
 			folio_put(*foliop);
 			ret = -ENOMEM;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index d1a322a75172..e7ae63f14120 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -7050,6 +7050,46 @@ int __mem_cgroup_charge(struct folio *folio, struct mm_struct *mm, gfp_t gfp)
 	return ret;
 }

+static struct mem_cgroup *get_mem_cgroup_from_current(void)
+{
+	struct mem_cgroup *memcg;
+
+again:
+	rcu_read_lock();
+	memcg = mem_cgroup_from_task(current);
+	if (!css_tryget(&memcg->css)) {
+		rcu_read_unlock();
+		goto again;
+	}
+	rcu_read_unlock();
+	return memcg;
+}
+
+/**
+ * mem_cgroup_hugetlb_charge_folio - Charge a newly allocated hugetlb folio.
+ * @folio: folio to charge.
+ * @gfp: reclaim mode
+ *
+ * This function charges an allocated hugetlb folio to the memcg of the
+ * current task.
+ *
+ * Returns 0 on success. Otherwise, an error code is returned.
+ */
+int mem_cgroup_hugetlb_charge_folio(struct folio *folio, gfp_t gfp)
+{
+	struct mem_cgroup *memcg;
+	int ret;
+
+	if (mem_cgroup_disabled())
+		return 0;
+
+	memcg = get_mem_cgroup_from_current();
+	ret = charge_memcg(folio, memcg, gfp);
+	mem_cgroup_put(memcg);
+
+	return ret;
+}
+
 /**
  * mem_cgroup_swapin_charge_folio - Charge a newly allocated folio for swapin.
  * @folio: folio to charge.

From patchwork Tue Sep 26 19:49:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 145037
From: Nhat Pham
To: akpm@linux-foundation.org
Cc: riel@surriel.com, hannes@cmpxchg.org, mhocko@kernel.org,
 roman.gushchin@linux.dev, shakeelb@google.com, muchun.song@linux.dev,
 tj@kernel.org, lizefan.x@bytedance.com, shuah@kernel.org,
 mike.kravetz@oracle.com, yosryahmed@google.com, linux-mm@kvack.org,
 kernel-team@meta.com, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org
Subject: [PATCH 2/2] selftests: add a selftest to verify hugetlb usage in memcg
Date: Tue, 26 Sep 2023 12:49:49 -0700
Message-Id: <20230926194949.2637078-3-nphamcs@gmail.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20230926194949.2637078-1-nphamcs@gmail.com>
References: <20230926194949.2637078-1-nphamcs@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

This patch adds a new kselftest to demonstrate and verify the new hugetlb
memcg accounting behavior.

Signed-off-by: Nhat Pham
---
 MAINTAINERS                                   |   2 +
 tools/testing/selftests/cgroup/.gitignore     |   1 +
 tools/testing/selftests/cgroup/Makefile       |   2 +
 .../selftests/cgroup/test_hugetlb_memcg.c     | 222 ++++++++++++++++++
 4 files changed, 227 insertions(+)
 create mode 100644 tools/testing/selftests/cgroup/test_hugetlb_memcg.c

diff --git a/MAINTAINERS b/MAINTAINERS
index bf0f54c24f81..ce9f40bcc2ba 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -5269,6 +5269,7 @@ S:	Maintained
 F:	mm/memcontrol.c
 F:	mm/swap_cgroup.c
 F:	tools/testing/selftests/cgroup/memcg_protection.m
+F:	tools/testing/selftests/cgroup/test_hugetlb_memcg.c
 F:	tools/testing/selftests/cgroup/test_kmem.c
 F:	tools/testing/selftests/cgroup/test_memcontrol.c

@@ -9652,6 +9653,7 @@ F:	include/linux/hugetlb.h
 F:	mm/hugetlb.c
 F:	mm/hugetlb_vmemmap.c
 F:	mm/hugetlb_vmemmap.h
+F:	tools/testing/selftests/cgroup/test_hugetlb_memcg.c

 HVA ST MEDIA DRIVER
 M:	Jean-Christophe Trotin
diff --git a/tools/testing/selftests/cgroup/.gitignore b/tools/testing/selftests/cgroup/.gitignore
index af8c3f30b9c1..2732e0b29271 100644
--- a/tools/testing/selftests/cgroup/.gitignore
+++ b/tools/testing/selftests/cgroup/.gitignore
@@ -7,4 +7,5 @@ test_kill
 test_cpu
 test_cpuset
 test_zswap
+test_hugetlb_memcg
 wait_inotify
diff --git a/tools/testing/selftests/cgroup/Makefile b/tools/testing/selftests/cgroup/Makefile
index c27f05f6ce9b..00b441928909 100644
--- a/tools/testing/selftests/cgroup/Makefile
+++ b/tools/testing/selftests/cgroup/Makefile
@@ -14,6 +14,7 @@ TEST_GEN_PROGS += test_kill
 TEST_GEN_PROGS += test_cpu
 TEST_GEN_PROGS += test_cpuset
 TEST_GEN_PROGS += test_zswap
+TEST_GEN_PROGS += test_hugetlb_memcg

 LOCAL_HDRS += $(selfdir)/clone3/clone3_selftests.h $(selfdir)/pidfd/pidfd.h

@@ -27,3 +28,4 @@ $(OUTPUT)/test_kill: cgroup_util.c
 $(OUTPUT)/test_cpu: cgroup_util.c
 $(OUTPUT)/test_cpuset: cgroup_util.c
 $(OUTPUT)/test_zswap: cgroup_util.c
+$(OUTPUT)/test_hugetlb_memcg: cgroup_util.c
diff --git a/tools/testing/selftests/cgroup/test_hugetlb_memcg.c b/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
new file mode 100644
index 000000000000..9651f6af6914
--- /dev/null
+++ b/tools/testing/selftests/cgroup/test_hugetlb_memcg.c
@@ -0,0 +1,222 @@
+// SPDX-License-Identifier: GPL-2.0
+#define _GNU_SOURCE
+
+#include <linux/limits.h>
+#include <sys/mman.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <fcntl.h>
+#include "../kselftest.h"
+#include "cgroup_util.h"
+
+#define ADDR ((void *)(0x0UL))
+#define FLAGS (MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB)
+/* mapping 8 MBs == 4 hugepages */
+#define LENGTH (8UL*1024*1024)
+#define PROTECTION (PROT_READ | PROT_WRITE)
+
+/* borrowed from mm/hmm-tests.c */
+static long get_hugepage_size(void)
+{
+	int fd;
+	char buf[2048];
+	int len;
+	char *p, *q, *path = "/proc/meminfo", *tag = "Hugepagesize:";
+	long val;
+
+	fd = open(path, O_RDONLY);
+	if (fd < 0) {
+		/* Error opening the file */
+		return -1;
+	}
+
+	len = read(fd, buf, sizeof(buf));
+	close(fd);
+	if (len < 0) {
+		/* Error in reading the file */
+		return -1;
+	}
+	if (len == sizeof(buf)) {
+		/* Error file is too large */
+		return -1;
+	}
+	buf[len] = '\0';
+
+	/* Search for a tag if provided */
+	if (tag) {
+		p = strstr(buf, tag);
+		if (!p)
+			return -1; /* looks like the line we want isn't there */
+		p += strlen(tag);
+	} else
+		p = buf;
+
+	val = strtol(p, &q, 0);
+	if (*q != ' ') {
+		/* Error parsing the file */
+		return -1;
+	}
+
+	return val;
+}
+
+static int set_file(const char *path, long value)
+{
+	FILE *file;
+	int ret;
+
+	file = fopen(path, "w");
+	if (!file)
+		return -1;
+	ret = fprintf(file, "%ld\n", value);
+	fclose(file);
+	return ret;
+}
+
+static int set_nr_hugepages(long value)
+{
+	return set_file("/proc/sys/vm/nr_hugepages", value);
+}
+
+static unsigned int check_first(char *addr)
+{
+	return *(unsigned int *)addr;
+}
+
+static void write_data(char *addr)
+{
+	unsigned long i;
+
+	for (i = 0; i < LENGTH; i++)
+		*(addr + i) = (char)i;
+}
+
+static int hugetlb_test_program(const char *cgroup, void *arg)
+{
+	char *test_group = (char *)arg;
+	void *addr;
+	long old_current, expected_current, current;
+	int ret = EXIT_FAILURE;
+
+	old_current = cg_read_long(test_group, "memory.current");
+
+	addr = mmap(ADDR, LENGTH, PROTECTION, FLAGS, 0, 0);
+	if (addr == MAP_FAILED) {
+		ksft_print_msg("fail to mmap.\n");
+		return EXIT_FAILURE;
+	}
+	current = cg_read_long(test_group, "memory.current");
+	if (current - old_current >= MB(2)) {
+		ksft_print_msg("mmap should not increase hugepage usage.\n");
+		goto out_failed_munmap;
+	}
+	old_current = current;
+
+	/* read the first page */
+	check_first(addr);
+	expected_current = old_current + MB(2);
+	current = cg_read_long(test_group, "memory.current");
+	if (!values_close(expected_current, current, 1)) {
+		ksft_print_msg("memory usage should increase by around 2MB.\n");
+		goto out_failed_munmap;
+	}
+
+	/* write to the whole range */
+	write_data(addr);
+	current = cg_read_long(test_group, "memory.current");
+	expected_current = old_current + MB(8);
+	if (!values_close(expected_current, current, 1)) {
+		ksft_print_msg("memory usage should increase by around 8MB.\n");
+		goto out_failed_munmap;
+	}
+
+	/* unmap the whole range */
+	munmap(addr, LENGTH);
+	current = cg_read_long(test_group, "memory.current");
+	expected_current = old_current;
+	if (!values_close(expected_current, current, 1)) {
+		ksft_print_msg("memory usage should go back down.\n");
+		return ret;
+	}
+
+	ret = EXIT_SUCCESS;
+	return ret;
+
+out_failed_munmap:
+	munmap(addr, LENGTH);
+	return ret;
+}
+
+static int test_hugetlb_memcg(char *root)
+{
+	int ret = KSFT_FAIL;
+	char *test_group;
+	long old_current, expected_current, current;
+
+	test_group = cg_name(root, "hugetlb_memcg_test");
+
+	if (!test_group || cg_create(test_group)) {
+		ksft_print_msg("fail to create cgroup.\n");
+		goto out;
+	}
+
+	if (cg_write(test_group, "memory.max", "100M")) {
+		ksft_print_msg("fail to set cgroup memory limit.\n");
+		goto out;
+	}
+
+	/* disable swap */
+	if (cg_write(test_group, "memory.swap.max", "0")) {
+		ksft_print_msg("fail to disable swap.\n");
+		goto out;
+	}
+	old_current = cg_read_long(test_group, "memory.current");
+
+	set_nr_hugepages(20);
+	current = cg_read_long(test_group, "memory.current");
+	expected_current = old_current;
+	if (!values_close(expected_current, current, 10)) {
+		ksft_print_msg(
+			"memory usage should not increase after setting nr_hugepages.\n");
+		goto out;
+	}
+
+	if (!cg_run(test_group, hugetlb_test_program, (void *)test_group))
+		ret = KSFT_PASS;
+out:
+	cg_destroy(test_group);
+	free(test_group);
+	return ret;
+}
+
+int main(int argc, char **argv)
+{
+	char root[PATH_MAX];
+	int ret = EXIT_SUCCESS;
+
+	/* Unit is kB! */
+	if (get_hugepage_size() != 2048) {
+		ksft_print_msg("test_hugetlb_memcg requires 2MB hugepages\n");
+		ksft_test_result_skip("test_hugetlb_memcg\n");
+		return ret;
+	}
+
+	if (cg_find_unified_root(root, sizeof(root)))
+		ksft_exit_skip("cgroup v2 isn't mounted\n");
+
+	switch (test_hugetlb_memcg(root)) {
+	case KSFT_PASS:
+		ksft_test_result_pass("test_hugetlb_memcg\n");
+		break;
+	case KSFT_SKIP:
+		ksft_test_result_skip("test_hugetlb_memcg\n");
+		break;
+	default:
+		ret = EXIT_FAILURE;
+		ksft_test_result_fail("test_hugetlb_memcg\n");
+		break;
+	}
+
+	return ret;
+}
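
Note for readers of the selftest: the memory.current deltas it checks can be worked out by hand. The sketch below is not part of the series. The names SKETCH_MB(), values_close_sketch() and expected_current_sketch() are hypothetical, and the tolerance formula is only an assumption modelled on how the test calls values_close(expected, actual, percent); the real helper in cgroup_util may be defined differently.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the MB() macro used by the selftest. */
#define SKETCH_MB(x) ((long)(x) << 20)

/*
 * Assumed tolerance check: nonzero when a and b differ by at most
 * err percent of their sum. This only illustrates the idea behind
 * values_close(); it is not copied from cgroup_util.
 */
static int values_close_sketch(long a, long b, int err)
{
	return labs(a - b) <= (a + b) / 100 * err;
}

/*
 * Expected memory.current (in bytes) after faulting in nr_pages
 * hugepages, given the hugepage size in kB as /proc/meminfo reports it.
 */
static long expected_current_sketch(long base, long nr_pages, long page_kb)
{
	return base + nr_pages * page_kb * 1024;
}
```

With 2048 kB hugepages, touching the first page should add about 2MB to memory.current, and writing the whole 8MB mapping should bring the total increase to about 8MB (4 hugepages), which matches the two checks in hugetlb_test_program().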