Message ID | 20231130023652.50284-1-sj@kernel.org |
---|---|
Series | mm/damon: let users feed and tame/auto-tune DAMOS |
Message
SeongJae Park
Nov. 30, 2023, 2:36 a.m. UTC
Introduce Aim-oriented Feedback-driven DAMOS Aggressiveness Auto-tuning. It makes DAMOS tune itself based on simple, periodic user feedback.

Patchset Changelog
==================

From RFC
(https://lore.kernel.org/damon/20231112194607.61399-1-sj@kernel.org/)
- Wordsmith commit messages and cover letter

Background: DAMOS Control Difficulty
====================================

DAMOS helps users easily implement access pattern aware system operations. However, controlling DAMOS in the wild is not that easy.

The basic way to control DAMOS is specifying the target access pattern. In this approach, the user is assumed to understand the access pattern and the characteristics of the system and the workloads well. Though there are useful tools for that, gaining such understanding takes time and effort, depending on the complexity and the dynamism of the system and the workloads. After all, the access pattern consists of three ranges, namely the size, the access rate, and the age of the regions. That means users need to tune six parameters, which is not a simple task.

One of the worst cases would be DAMOS being too aggressive, like a berserker, and therefore consuming too much system resource and making unwanted, radical system operations. To let users avoid such cases, DAMOS allows users to set an upper limit on a scheme's aggressiveness, namely the DAMOS quota. DAMOS then makes its best effort under the limit by prioritizing regions based on their access pattern. For example, users can ask DAMOS to page out up to 100 MiB of memory regions per second. DAMOS then pages out colder regions (those not accessed for a longer time) first, under that limit. This allows users to set the target access pattern somewhat naively, with wide ranges, and focus on tuning only one parameter, the quota. In other words, the number of parameters to tune is reduced from six to one.

Still, the optimum value for the quota depends on the characteristics of the system and the workloads, so finding it is not simple either. The number of parameters to tune can also grow again if the user needs to run multiple schemes.
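The quota-based control described above can be summarized with a small sketch. The below C snippet is only an illustration of the idea under made-up names (struct region, page_out(), apply_scheme()); it is not DAMON's actual implementation.

    /*
     * Illustrative only: spend a per-interval page-out budget (the quota)
     * on the coldest regions first.  Names are made up; this is not
     * DAMON's actual implementation.
     */
    #include <stddef.h>
    #include <stdlib.h>

    struct region {
        unsigned long size;       /* bytes */
        unsigned int nr_accesses; /* observed access frequency */
        unsigned int age;         /* how long this pattern has lasted */
    };

    /* Colder (less accessed, older) regions are better page-out targets. */
    static int colder_first(const void *a, const void *b)
    {
        const struct region *ra = a, *rb = b;

        if (ra->nr_accesses != rb->nr_accesses)
            return ra->nr_accesses < rb->nr_accesses ? -1 : 1;
        return ra->age > rb->age ? -1 : (ra->age < rb->age ? 1 : 0);
    }

    static void page_out(struct region *r) { (void)r; /* placeholder action */ }

    /* Page out the coldest regions first until 'quota' bytes are charged. */
    static void apply_scheme(struct region *regions, size_t nr,
                             unsigned long quota)
    {
        unsigned long charged = 0;

        qsort(regions, nr, sizeof(*regions), colder_first);
        for (size_t i = 0; i < nr; i++) {
            if (charged + regions[i].size > quota)
                break;
            page_out(&regions[i]);
            charged += regions[i].size;
        }
    }

The point of the mechanism is that the user picks only the quota; prioritizing what to act on under that quota is DAMOS's job.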
Aim-oriented Feedback-driven DAMOS Aggressiveness Auto Tuning
=============================================================

Users would use DAMOS because they want to achieve something with it. They likely have measurable metrics representing that achievement, target values for the metrics such as SLOs, and they continuously measure those anyway. Since the additional cost of obtaining the information is therefore nearly zero, it could be useful for DAMOS to use it to understand how appropriately its current aggressiveness is set, and to adjust itself so that the metric value moves closer to the target.

Based on this idea, we introduce a new way of tuning DAMOS with nearly zero additional effort, namely Aim-oriented Feedback-driven DAMOS Aggressiveness Auto Tuning. It asks users to provide feedback representing how well DAMOS is doing relative to the users' aim. Then DAMOS adjusts its aggressiveness, specifically the quota that bounds its best-effort operation, based on the current level of aggressiveness and the users' feedback.

Implementation
--------------

The implementation asks users to represent the feedback as score numbers. The scores could be anything, including user-space-specific metrics such as the latency or throughput of particular user-space workloads, and system metrics such as the free memory ratio, memory pressure stall time (PSI), or the active-to-inactive LRU lists size ratio. The feedback scores and the aggressiveness of the given DAMOS scheme are assumed to be positively proportional, though. Selecting metrics that satisfy the assumption is the users' responsibility.

The core logic uses the below simple feedback loop algorithm to calculate the next aggressiveness level of the scheme from the current aggressiveness level and the current feedback (target_score and current_score). It calculates the compensation for next aggressiveness as a proportion of current aggressiveness and distance to the target score. As a result, it arrives at the near-goal state in a short time using big steps when it's far from the goal, but avoids making unnecessarily radical changes that could turn out to be a bad decision using small steps when it's near to the goal.

    f(n) = max(1, f(n - 1) * ((target_score - current_score) / target_score + 1))

Note that the compensation value becomes negative when the scheme is over-achieving the goal. That's why the feedback metric and the aggressiveness of the scheme should be positively proportional. The distance-adaptive adjustment of the tuning speed is thus achieved in a simple way.
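As a concrete reading of the formula, the below user-space C sketch derives the next aggressiveness (here, a byte quota) from the current one and the two scores. It only mirrors the formula; next_quota() and the example numbers are hypothetical, not DAMON's in-kernel API.

    /*
     * Sketch of the feedback loop above: the step taken is proportional
     * to the distance between the current and the target scores.
     * Not DAMON's actual code; names and numbers are only illustrative.
     */
    #include <stdio.h>

    static unsigned long next_quota(unsigned long quota, long target_score,
                                    long current_score)
    {
        /* compensation: share of the quota proportional to the gap */
        long compensation = (long)quota * (target_score - current_score) /
                            target_score;
        long next = (long)quota + compensation;

        return next < 1 ? 1 : next;   /* f(n) = max(1, ...) */
    }

    int main(void)
    {
        /* e.g., goal: 0.5% PSI, currently observing 0.3% (scores in 0.1%) */
        unsigned long quota = 100UL << 20;   /* 100 MiB/s page-out quota */

        quota = next_quota(quota, 5, 3);     /* far from the goal: +40% */
        printf("next quota: %lu bytes\n", quota);
        quota = next_quota(quota, 5, 6);     /* over-achieving: shrink */
        printf("next quota: %lu bytes\n", quota);
        return 0;
    }

When far from the goal the step is large, and as current_score approaches target_score the step shrinks, as described above.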
Example Use Cases
-----------------

If users want to reduce the memory footprint of the system as much as possible, as long as the time spent handling the resulting memory pressure stays within a threshold, they could use a DAMOS scheme that reclaims cold memory regions while aiming for a small amount of memory pressure stall time.

If users want the active/inactive LRU lists well balanced to reduce the performance impact of possible future memory pressure, they could use two schemes. The first one would be set to locate hot pages in the active LRU list, aiming for a specific active-to-inactive LRU list size ratio, say, 70%. The second one would be set to locate cold pages in the inactive LRU list, aiming for a specific inactive-to-active LRU list size ratio, say, 30%. DAMOS will then balance the two schemes based on the goals and the feedback.

This aim-oriented auto-tuning could also be useful for general access-aware system operations that require such balancing, such as system memory auto scaling[3] and tiered memory management[4]. Note that these two example usages are not supported by the current DAMOS implementation; they would require additional DAMOS action development.

Evaluation: proactive reclamation aiming at subtle memory pressure
------------------------------------------------------------------

To show whether the implementation works as expected, we prepared four different system configurations on AWS i3.metal instances. The first setup (original) runs the workload without any DAMOS scheme. The second setup (not-tuned) runs the workload with a virtual address space-based proactive reclamation scheme that pages out memory regions not accessed for five seconds or more. The third setup (offline-tuned) runs the same proactive reclamation DAMOS scheme, but tuned for each workload offline using our previous user-space-driven automatic tuning approach, namely DAMOOS[1]. The fourth and final setup (AFDAA) runs the same scheme as the 'not-tuned' setup, but aims to keep 0.5% of 'some' memory pressure stall time (PSI) for the last 10 seconds, using the aim-oriented auto-tuning.

For each setup, we run realistic workloads from the PARSEC3 and SPLASH-2X benchmark suites. For each run, we measure the RSS and runtime of the workload, and the 'some' memory pressure stall time (PSI) of the system. We repeat the runs five times and use the averaged measurements.

For simple comparison of the results, we normalize the measurements to those of 'original'. For PSI, though, the measurement for 'original' was zero, so we normalize the value to that of the 'not-tuned' scheme's result. The normalized results are shown below.

                 Not-tuned           Offline-tuned       AFDAA
    RSS          0.622688178226118   0.787950678944904   0.740093483278979
    runtime      1.11767826657912    1.0564674983585     1.0910833880499
    PSI          1                   0.727521443794069   0.308498846350299

The 'not-tuned' scheme achieves about 38.7% memory saving but incurs about 11.7% runtime slowdown. The 'offline-tuned' scheme achieves about 22.2% memory saving with about 5.5% runtime slowdown, and about 28.2% memory pressure stall time saving. AFDAA achieves about 26% memory saving with about 9.1% runtime slowdown, and about 69.1% memory pressure stall time saving. We repeated this test multiple times and got consistent results. AFDAA is now integrated in our daily DAMON performance test setup.

Apparently the aggressiveness of the 'AFDAA' setup lies somewhere between those of the 'not-tuned' and 'offline-tuned' setups, since its memory saving and runtime overhead fall between those of the other two. In fact, we set the memory pressure stall time goal aiming for this middle level of aggressiveness. The differences in those two metrics are not significant, though. However, AFDAA shows a significant saving of the memory pressure stall time, which was the goal of the auto-tuning, over the other two variants. Hence, we conclude the automatic tuning is working as expected.

Please note that the AFDAA setup is only for this evaluation, and was therefore intentionally set a bit aggressively. It might not be appropriate for production environments. The test code is also available[2], so you can reproduce it on your system and workloads.

Patches Sequence
================

The first four patches implement the core logic and the user interfaces for the auto-tuning. The first patch implements the core logic and the API for DAMOS users in the kernel space. The second patch implements basic file operations of the DAMON sysfs directories and files that will be used for setting the goals and providing the feedback. The third patch connects the quota goals file inputs to the DAMOS core logic. Finally, the fourth patch implements a dedicated DAMOS sysfs command for efficiently committing the quota goals feedback.

Two patches for simple tests of the logic and interfaces follow. The fifth patch implements a unit test for the core logic. The sixth patch implements a selftest for the DAMON sysfs interface for the goals.

Finally, three documentation patches follow. The seventh patch documents the design of the feature. The eighth patch updates the ABI document for the new sysfs files. The ninth and final patch updates the usage document for the feature.
References
==========

[1] DAOS paper: https://www.amazon.science/publications/daos-data-access-aware-operating-system
[2] Evaluation code: https://github.com/damonitor/damon-tests/commit/3f884e61193f0166b8724554b6d06b0c449a712d
[3] Memory auto scaling RFC idea: https://lore.kernel.org/damon/20231112195114.61474-1-sj@kernel.org/
[4] DAMON-based tiered memory management RFC idea: https://lore.kernel.org/damon/20231112195602.61525-1-sj@kernel.org/

SeongJae Park (9):
  mm/damon/core: implement goal-oriented feedback-driven quota auto-tuning
  mm/damon/sysfs-schemes: implement files for scheme quota goals setup
  mm/damon/sysfs-schemes: commit damos quota goals user input to DAMOS
  mm/damon/sysfs-schemes: implement a command for scheme quota goals only commit
  mm/damon/core-test: add a unit test for the feedback loop algorithm
  selftests/damon: test quota goals directory
  Docs/mm/damon/design: document DAMOS quota auto tuning
  Docs/ABI/damon: document DAMOS quota goals
  Docs/admin-guide/mm/damon/usage: document for quota goals

 .../ABI/testing/sysfs-kernel-mm-damon        |  33 ++-
 Documentation/admin-guide/mm/damon/usage.rst |  48 +++-
 Documentation/mm/damon/design.rst            |  13 +
 include/linux/damon.h                        |  19 ++
 mm/damon/core-test.h                         |  32 +++
 mm/damon/core.c                              |  68 ++++-
 mm/damon/sysfs-common.h                      |   3 +
 mm/damon/sysfs-schemes.c                     | 272 +++++++++++++++++-
 mm/damon/sysfs.c                             |  27 ++
 tools/testing/selftests/damon/sysfs.sh       |  27 ++
 10 files changed, 517 insertions(+), 25 deletions(-)

base-commit: b4e0245a831a402cae8634a4dc277a04830ff07a
Comments
On Thu, 30 Nov 2023 02:36:43 +0000 SeongJae Park <sj@kernel.org> wrote:

> The core logic uses the below simple feedback loop algorithm to
> calculate the next aggressiveness level of the scheme from the current
> aggressiveness level and the current feedback (target_score and
> current_score). It calculates the compensation for next aggressiveness
> as a proportion of current aggressiveness and distance to the target
> score. As a result, it arrives at the near-goal state in a short time
> using big steps when it's far from the goal, but avoids making
> unnecessarily radical changes that could turn out to be a bad decision
> using small steps when its near to the goal.

fwiw, the above is a "proportional controller". MGLRU has, in vmscan.c, a PID controller (proportional, integral, derivative). PID controllers have better accuracy (the integral feedback) and better stability (the derivative feedback).

Generalizing MGLRU's PID controller might be somewhat challenging!
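For readers less familiar with the control-theory terms, the below toy C sketch contrasts the proportional update discussed above with a generic PID update. It is only an illustration of the concepts: it is not MGLRU's vmscan.c controller, and the gains (kp, ki, kd) and the names are made up.

    /*
     * Toy contrast between a proportional-only update and a generic PID
     * update.  Not MGLRU's controller; gains and names are illustrative.
     */
    struct pid_state {
        double integral;    /* accumulated error (I term) */
        double prev_error;  /* previous error, for the derivative (D term) */
    };

    /* Proportional only: step size scales with the distance to the goal. */
    static double p_update(double output, double target, double current)
    {
        double error = (target - current) / target;

        return output * (1.0 + error);
    }

    /*
     * PID: the integral term removes steady-state error (better accuracy),
     * the derivative term damps oscillation (better stability).
     */
    static double pid_update(struct pid_state *s, double output,
                             double target, double current,
                             double kp, double ki, double kd)
    {
        double error = (target - current) / target;
        double derivative = error - s->prev_error;

        s->integral += error;
        s->prev_error = error;
        return output * (1.0 + kp * error + ki * s->integral + kd * derivative);
    }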