From patchwork Mon Nov 20 17:46:58 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: andrey.konovalov@linux.dev
X-Patchwork-Id: 16820
From: andrey.konovalov@linux.dev
To: Andrew Morton
Cc: Andrey Konovalov, Marco Elver, Alexander Potapenko, Dmitry Vyukov,
    Vlastimil Babka, kasan-dev@googlegroups.com, Evgenii Stepanov,
    Oscar Salvador, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Andrey Konovalov
Subject: [PATCH v4 00/22] stackdepot: allow evicting stack traces
Date: Mon, 20 Nov 2023 18:46:58 +0100
X-Mailing-List: linux-kernel@vger.kernel.org

From: Andrey Konovalov

Currently, the stack depot grows indefinitely until it reaches its
capacity. Once that happens, the stack depot stops saving new stack
traces.

This creates a problem for using the stack depot for in-field testing
and in production.

For such uses, an ideal stack trace storage should:

1. Allow saving fresh stack traces on systems with a long uptime while
   limiting the amount of memory used to store the traces;
2. Have a low performance impact.

Implementing #1 in the stack depot is impossible with the current
keep-forever approach. This series aims to address that. Issue #2 is
left to be addressed in a future series.

This series changes the stack depot implementation to allow evicting
unneeded stack traces from the stack depot. The users of the stack depot
can do that via the new stack_depot_save_flags(STACK_DEPOT_FLAG_GET) and
stack_depot_put APIs.

Internal changes to the stack depot code include:

1. Storing stack traces in fixed-frame-sized slots (vs precisely-sized
   slots in the current implementation); the slot size is controlled via
   CONFIG_STACKDEPOT_MAX_FRAMES (default: 64 frames);
2. Keeping available slots in a freelist (vs keeping an offset to the
   next free slot);
3. Using a read/write lock for synchronization (vs a lock-free approach
   combined with a spinlock).

This series also integrates the eviction functionality into KASAN: the
tag-based modes evict stack traces when the corresponding entry leaves
the stack ring, and Generic KASAN evicts stack traces for objects once
those leave the quarantine.

With KASAN, despite wasting some space on rounding up the size of each
stack record, the total memory consumed by the stack depot gets
saturated due to the eviction of irrelevant stack traces from the stack
depot.

With the tag-based KASAN modes, the average total amount of memory used
for stack traces becomes ~0.5 MB (with the current default stack ring
size of 32k entries and the default CONFIG_STACKDEPOT_MAX_FRAMES of 64).

With Generic KASAN, the stack traces take up ~1 MB per 1 GB of RAM (as
the quarantine's size depends on the amount of RAM).

However, with KMSAN, the stack depot ends up using ~4x more memory per
stack trace than before. Thus, for KMSAN, the stack depot capacity is
increased accordingly. KMSAN uses a lot of RAM for shadow memory anyway,
so the increased stack depot memory usage will not make a significant
difference.

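As a rough illustration of why memory use saturates once eviction is in
place, here is a minimal userspace C model of fixed-size slots, a
freelist, and refcounted get/put. It is a sketch under stated
assumptions, not the code in lib/stackdepot.c: the names depot_init,
depot_save_get, depot_put and NR_SLOTS are hypothetical, a linear scan
stands in for the hash table, there is no locking, and truncating
over-long traces is a simplification made by this model.

```c
/* Userspace model only; all names here are hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FRAMES 64 /* mirrors the CONFIG_STACKDEPOT_MAX_FRAMES default */
#define NR_SLOTS   4  /* tiny capacity, chosen only for illustration */

struct record {
	struct record *next_free;         /* freelist link, used while free */
	unsigned int refcount;            /* 0 means the slot is free */
	unsigned int nr_frames;
	unsigned long frames[MAX_FRAMES]; /* fixed-size slot, rounded up */
};

static struct record slots[NR_SLOTS];
static struct record *freelist;

/* Put every slot on the freelist. */
void depot_init(void)
{
	freelist = NULL;
	for (size_t i = 0; i < NR_SLOTS; i++) {
		slots[i].refcount = 0;
		slots[i].next_free = freelist;
		freelist = &slots[i];
	}
}

/* Save a trace and take a reference. Returns a 1-based handle, 0 if full. */
unsigned int depot_save_get(const unsigned long *frames, unsigned int nr)
{
	assert(frames != NULL);
	if (nr > MAX_FRAMES)
		nr = MAX_FRAMES; /* model assumption: truncate to the slot size */

	/* A linear scan stands in for the hash table lookup. */
	for (size_t i = 0; i < NR_SLOTS; i++) {
		struct record *r = &slots[i];

		if (r->refcount && r->nr_frames == nr &&
		    !memcmp(r->frames, frames, nr * sizeof(*frames))) {
			r->refcount++;
			return (unsigned int)i + 1;
		}
	}

	if (!freelist)
		return 0; /* no free slot: the trace is not saved */

	struct record *r = freelist;

	freelist = r->next_free;
	r->refcount = 1;
	r->nr_frames = nr;
	memcpy(r->frames, frames, nr * sizeof(*frames));
	return (unsigned int)(r - slots) + 1;
}

/* Drop a reference; the last put returns the slot to the freelist. */
void depot_put(unsigned int handle)
{
	if (handle == 0 || handle > NR_SLOTS)
		return;

	struct record *r = &slots[handle - 1];

	if (r->refcount == 0)
		return; /* already free; ignore the unbalanced put */
	if (--r->refcount == 0) {
		r->next_free = freelist;
		freelist = r;
	}
}
```

In this model, saving the same trace twice returns the same handle and
bumps the refcount. Once every holder has called depot_put, the slot
goes back on the freelist and the next new trace reuses it, so the
number of occupied slots is bounded by the number of live references
rather than by uptime.
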
Other users of the stack depot do not save stack traces as often as
KASAN and KMSAN. Thus, the increased memory usage is taken as an
acceptable trade-off. In the future, these other users can take
advantage of the eviction API to limit the memory waste.

There is no measurable boot time performance impact of these changes for
KASAN on x86-64. I haven't done any tests for arm64 modes (the stack
depot without performance optimizations is not suitable for the intended
use of those anyway), but I expect a similar result. Obtaining and
copying stack trace frames when saving them into the stack depot is what
takes the most time.

This series does not yet provide a way to configure the maximum size of
the stack depot externally (e.g. via a command-line parameter). This
will be added in a separate series, possibly together with the
performance improvement changes.

---

Changes v3->v4:
- Rebase onto 6.7-rc2.
- Fix lockdep annotation in depot_fetch_stack.
- New patch: "kasan: use stack_depot_put for Generic mode" (was sent for
  review separately but now merged into this series).
- New patch: "lib/stackdepot: print disabled message only if truly
  disabled" (was sent for review separately but now merged into this
  series).
- New patch: "lib/stackdepot: adjust DEPOT_POOLS_CAP for KMSAN".

Changes v2->v3:
- Fix null-ptr-deref by using the proper number of entries for
  initializing the stack table when alloc_large_system_hash()
  auto-calculates the number (see patch #12).
- Keep STACKDEPOT/STACKDEPOT_ALWAYS_INIT Kconfig options not
  configurable by users.
- Use lockdep_assert_held_read annotation in depot_fetch_stack.
- WARN_ON invalid flags in stack_depot_save_flags.
- Move the "../slab.h" include in mm/kasan/report_tags.c into the right
  patch.
- Various comment fixes.

Changes v1->v2:
- Rework API to stack_depot_save_flags(STACK_DEPOT_FLAG_GET) +
  stack_depot_put.
- Add CONFIG_STACKDEPOT_MAX_FRAMES Kconfig option.
- Switch stack depot to using list_head's.
- Assorted minor changes, see the commit message for each patch.

Andrey Konovalov (22):
  lib/stackdepot: print disabled message only if truly disabled
  lib/stackdepot: check disabled flag when fetching
  lib/stackdepot: simplify __stack_depot_save
  lib/stackdepot: drop valid bit from handles
  lib/stackdepot: add depot_fetch_stack helper
  lib/stackdepot: use fixed-sized slots for stack records
  lib/stackdepot: fix and clean-up atomic annotations
  lib/stackdepot: rework helpers for depot_alloc_stack
  lib/stackdepot: rename next_pool_required to new_pool_required
  lib/stackdepot: store next pool pointer in new_pool
  lib/stackdepot: store free stack records in a freelist
  lib/stackdepot: use read/write lock
  lib/stackdepot: use list_head for stack record links
  kmsan: use stack_depot_save instead of __stack_depot_save
  lib/stackdepot, kasan: add flags to __stack_depot_save and rename
  lib/stackdepot: add refcount for records
  lib/stackdepot: allow users to evict stack traces
  kasan: remove atomic accesses to stack ring entries
  kasan: check object_size in kasan_complete_mode_report_info
  kasan: use stack_depot_put for tag-based modes
  kasan: use stack_depot_put for Generic mode
  lib/stackdepot: adjust DEPOT_POOLS_CAP for KMSAN

 include/linux/stackdepot.h |  59 ++++-
 lib/Kconfig                |  10 +
 lib/stackdepot.c           | 452 ++++++++++++++++++++++++-------------
 mm/kasan/common.c          |   8 +-
 mm/kasan/generic.c         |  27 ++-
 mm/kasan/kasan.h           |   2 +-
 mm/kasan/quarantine.c      |  26 ++-
 mm/kasan/report_tags.c     |  27 +--
 mm/kasan/tags.c            |  24 +-
 mm/kmsan/core.c            |   7 +-
 10 files changed, 427 insertions(+), 215 deletions(-)