From patchwork Wed Jan 17 16:14:43 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 188886
From: Josh Poimboeuf
To: Linus Torvalds, Jeff Layton, Chuck Lever, Shakeel Butt, Roman Gushchin, Johannes Weiner, Michal Hocko
Cc: linux-kernel@vger.kernel.org, Jens Axboe, Tejun Heo, Vasily Averin, Michal Koutny, Waiman Long, Muchun Song, Jiri Kosina,
 cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH RFC 1/4] fs/locks: Fix file lock cache accounting, again
Date: Wed, 17 Jan 2024 08:14:43 -0800
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

A container can exceed its memcg limits by allocating a bunch of file
locks.

This bug was originally fixed by commit 0f12156dff28 ("memcg: enable
accounting for file lock caches"), but was later reverted by commit
3754707bcc3e ("Revert "memcg: enable accounting for file lock caches"")
due to performance issues.

Unfortunately those performance issues were never addressed and the bug
has remained unfixed for over two years.

Fix it by default but allow users to disable it with a cmdline option
(flock_accounting=off).

Signed-off-by: Josh Poimboeuf
---
 .../admin-guide/kernel-parameters.txt | 17 +++++++++++
 fs/locks.c                            | 30 +++++++++++++++++--
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 6ee0f9a5da70..91987b06bc52 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1527,6 +1527,23 @@
 			See Documentation/admin-guide/sysctl/net.rst for
 			fb_tunnels_only_for_init_ns
 
+	flock_accounting=
+			[KNL] Enable/disable accounting for kernel
+			memory allocations related to file locks.
+			Format: { on | off }
+			Default: on
+			on:	Enable kernel memory accounting for file
+				locks. This prevents task groups from
+				exceeding their memcg allocation limits.
+				However, it may cause slowdowns in the
+				flock() system call.
+			off:	Disable kernel memory accounting for
+				file locks. This may allow a rogue task
+				to DoS the system by forcing the kernel
+				to allocate memory beyond the task
+				group's memcg limits. Not recommended
+				unless you have trusted user space.
+
 	floppy=		[HW]
 			See Documentation/admin-guide/blockdev/floppy.rst.
 
diff --git a/fs/locks.c b/fs/locks.c
index cc7c117ee192..235ac56c557d 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2905,15 +2905,41 @@ static int __init proc_locks_init(void)
 fs_initcall(proc_locks_init);
 #endif
 
+static bool flock_accounting __ro_after_init = true;
+
+static int __init flock_accounting_cmdline(char *str)
+{
+	if (!str)
+		return -EINVAL;
+
+	if (!strcmp(str, "off"))
+		flock_accounting = false;
+	else if (!strcmp(str, "on"))
+		flock_accounting = true;
+	else
+		return -EINVAL;
+
+	return 0;
+}
+early_param("flock_accounting", flock_accounting_cmdline);
+
+#define FLOCK_ACCOUNTING_MSG "WARNING: File lock accounting is disabled, container-triggered host memory exhaustion possible!\n"
+
 static int __init filelock_init(void)
 {
 	int i;
+	slab_flags_t flags = SLAB_PANIC;
+
+	if (!flock_accounting)
+		pr_err(FLOCK_ACCOUNTING_MSG);
+	else
+		flags |= SLAB_ACCOUNT;
 
 	flctx_cache = kmem_cache_create("file_lock_ctx",
-			sizeof(struct file_lock_context), 0, SLAB_PANIC, NULL);
+			sizeof(struct file_lock_context), 0, flags, NULL);
 
 	filelock_cache = kmem_cache_create("file_lock_cache",
-			sizeof(struct file_lock), 0, SLAB_PANIC, NULL);
+			sizeof(struct file_lock), 0, flags, NULL);
 
 	for_each_possible_cpu(i) {
 		struct file_lock_list_struct *fll = per_cpu_ptr(&file_lock_list, i);

From patchwork Wed Jan 17 16:14:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 188889
From: Josh Poimboeuf
To: Linus Torvalds, Jeff Layton, Chuck Lever, Shakeel Butt, Roman Gushchin, Johannes Weiner, Michal Hocko
Cc: linux-kernel@vger.kernel.org, Jens Axboe, Tejun Heo, Vasily Averin, Michal Koutny, Waiman Long, Muchun Song, Jiri Kosina,
 cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH RFC 2/4] fs/locks: Add CONFIG_FLOCK_ACCOUNTING
Date: Wed, 17 Jan 2024 08:14:44 -0800
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List:
 linux-kernel@vger.kernel.org

Allow flock cache accounting to be disabled at build time.

Signed-off-by: Josh Poimboeuf
---
 fs/Kconfig | 15 +++++++++++++++
 fs/locks.c |  2 +-
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index a3159831ba98..591f54a03059 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -129,6 +129,21 @@ config FILE_LOCKING
 	  for filesystems like NFS and for the flock() system call.
 	  Disabling this option saves about 11k.
 
+config FLOCK_ACCOUNTING
+	bool "Enable kernel memory accounting for file locks" if EXPERT
+	depends on FILE_LOCKING
+	default y
+	help
+	  This option enables kernel memory accounting for file locks. This
+	  prevents task groups from exceeding their memcg allocation limits.
+	  However, it may cause slowdowns in the flock() system call.
+
+	  Disabling this option is not recommended as it may allow a rogue task
+	  to DoS the system by forcing the kernel to allocate memory beyond the
+	  task group's memcg limits.
+
+	  If unsure, say Y.
+
 source "fs/crypto/Kconfig"
 
 source "fs/verity/Kconfig"
diff --git a/fs/locks.c b/fs/locks.c
index 235ac56c557d..e2799a18c4e8 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2905,7 +2905,7 @@ static int __init proc_locks_init(void)
 fs_initcall(proc_locks_init);
 #endif
 
-static bool flock_accounting __ro_after_init = true;
+static bool flock_accounting __ro_after_init = IS_ENABLED(CONFIG_FLOCK_ACCOUNTING);
 
 static int __init flock_accounting_cmdline(char *str)
 {

From patchwork Wed Jan 17 16:14:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Josh Poimboeuf
X-Patchwork-Id: 188890
From: Josh Poimboeuf
To: Linus Torvalds, Jeff Layton, Chuck Lever, Shakeel Butt, Roman Gushchin, Johannes Weiner, Michal Hocko
Cc: linux-kernel@vger.kernel.org, Jens Axboe, Tejun Heo, Vasily Averin, Michal Koutny, Waiman Long, Muchun Song, Jiri Kosina,
 cgroups@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH RFC 3/4] mitigations: Expand 'mitigations=off' to include optional software mitigations
Date: Wed, 17 Jan 2024 08:14:45 -0800
X-Mailer: git-send-email 2.43.0
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

The 'mitigations=off' cmdline option disables all CPU mitigations at
runtime. It's intended for users who are running with trusted user
space and don't want the performance impact associated with all the
mitigations.

Up until now, it was only used for CPU mitigations. However, there can
also be optional software mitigations which have performance impact.

Expand 'mitigations=' to include optional software mitigations. After
all there's nothing in the "mitigations" name which limits it to CPU
vulnerabilities.

In theory we could introduce separate {cpu,sw}_mitigations= options, but
for the time being there's no need to separate them out. It's simpler
to have them combined since the use case of "I have trusted user space
and don't want the performance impacts of unneeded mitigations" is the
same, regardless of the source of the bug.

Move the interfaces around and rename them to reflect the new broader
impact of mitigations=off.

No functional changes.

Signed-off-by: Josh Poimboeuf
---
 .../admin-guide/kernel-parameters.txt | 27 ++++++----
 arch/arm64/kernel/cpufeature.c        |  2 +-
 arch/arm64/kernel/proton-pack.c       |  6 +--
 arch/powerpc/kernel/security.c        | 14 +++---
 arch/s390/kernel/nospec-branch.c      |  2 +-
 arch/x86/kernel/cpu/bugs.c            | 35 ++++++-------
 arch/x86/kvm/mmu/mmu.c                |  2 +-
 arch/x86/mm/pti.c                     |  3 +-
 include/linux/bpf.h                   |  5 +-
 include/linux/cpu.h                   |  3 --
 include/linux/mitigations.h           |  4 ++
 kernel/Makefile                       |  3 +-
 kernel/cpu.c                          | 43 ----------------
 kernel/mitigations.c                  | 50 +++++++++++++++++++
 14 files changed, 109 insertions(+), 90 deletions(-)
 create mode 100644 include/linux/mitigations.h
 create mode 100644 kernel/mitigations.c

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 91987b06bc52..24e873351368 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3391,16 +3391,23 @@
 			https://repo.or.cz/w/linux-2.6/mini2440.git
 
 	mitigations=
-			[X86,PPC,S390,ARM64] Control optional mitigations for
-			CPU vulnerabilities. This is a set of curated,
-			arch-independent options, each of which is an
-			aggregation of existing arch-specific options.
+			[KNL] Control optional mitigations for CPU
+			vulnerabilities and performance-impacting
+			software vulnerabilities. This is a set of
+			curated, arch-independent options, each of which
+			is an aggregation of existing arch-specific
+			options.
 
 			off
-				Disable all optional CPU mitigations. This
-				improves system performance, but it may also
-				expose users to several CPU vulnerabilities.
-				Equivalent to: if nokaslr then kpti=0 [ARM64]
+				Disable all optional mitigations. This
+				improves system performance, but may also
+				expose users to several vulnerabilities.
+
+				Equivalent to:
+
+				CPU mitigations:
+				----------------
+				if nokaslr then kpti=0 [ARM64]
 					       gather_data_sampling=off [X86]
 					       kvm.nx_huge_pages=off [X86]
 					       l1tf=off [X86]
@@ -3426,7 +3433,7 @@
 					       kvm.nx_huge_pages=force.
 
 			auto (default)
-				Mitigate all CPU vulnerabilities, but leave SMT
+				Enable all optional mitigations, but leave SMT
 				enabled, even if it's vulnerable. This is for
 				users who don't want to be surprised by SMT
 				getting disabled across kernel upgrades, or who
@@ -3434,7 +3441,7 @@
 				Equivalent to: (default behavior)
 
 			auto,nosmt
-				Mitigate all CPU vulnerabilities, disabling SMT
+				Enable all optional mitigations, disabling SMT
 				if needed. This is for users who always want to
 				be fully mitigated, even if it means losing SMT.
 				Equivalent to: l1tf=flush,nosmt [X86]
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index 01a4c1d7fc09..ae37898e5b1a 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1719,7 +1719,7 @@ static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
 		}
 	}
 
-	if (cpu_mitigations_off() && !__kpti_forced) {
+	if (mitigations_off() && !__kpti_forced) {
 		str = "mitigations=off";
 		__kpti_forced = -1;
 	}
diff --git a/arch/arm64/kernel/proton-pack.c b/arch/arm64/kernel/proton-pack.c
index 6268a13a1d58..00242edf1885 100644
--- a/arch/arm64/kernel/proton-pack.c
+++ b/arch/arm64/kernel/proton-pack.c
@@ -91,7 +91,7 @@ early_param("nospectre_v2", parse_spectre_v2_param);
 
 static bool spectre_v2_mitigations_off(void)
 {
-	bool ret = __nospectre_v2 || cpu_mitigations_off();
+	bool ret = __nospectre_v2 || mitigations_off();
 
 	if (ret)
 		pr_info_once("spectre-v2 mitigation disabled by command line option\n");
@@ -421,7 +421,7 @@ early_param("ssbd", parse_spectre_v4_param);
  */
 static bool spectre_v4_mitigations_off(void)
 {
-	bool ret = cpu_mitigations_off() ||
+	bool ret = mitigations_off() ||
 		   __spectre_v4_policy == SPECTRE_V4_POLICY_MITIGATION_DISABLED;
 
 	if (ret)
@@ -1000,7 +1000,7 @@ void spectre_bhb_enable_mitigation(const struct arm64_cpu_capabilities *entry)
 		/* No point mitigating Spectre-BHB alone. */
 	} else if (!IS_ENABLED(CONFIG_MITIGATE_SPECTRE_BRANCH_HISTORY)) {
 		pr_info_once("spectre-bhb mitigation disabled by compile time option\n");
-	} else if (cpu_mitigations_off() || __nospectre_bhb) {
+	} else if (mitigations_off() || __nospectre_bhb) {
 		pr_info_once("spectre-bhb mitigation disabled by command line option\n");
 	} else if (supports_ecbhb(SCOPE_LOCAL_CPU)) {
 		state = SPECTRE_MITIGATED;
diff --git a/arch/powerpc/kernel/security.c b/arch/powerpc/kernel/security.c
index 4856e1a5161c..52cf79b5d87a 100644
--- a/arch/powerpc/kernel/security.c
+++ b/arch/powerpc/kernel/security.c
@@ -64,7 +64,7 @@ void __init setup_barrier_nospec(void)
 	enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
 		 security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR);
 
-	if (!no_nospec && !cpu_mitigations_off())
+	if (!no_nospec && !mitigations_off())
 		enable_barrier_nospec(enable);
 }
 
@@ -135,7 +135,7 @@ early_param("nospectre_v2", handle_nospectre_v2);
 #ifdef CONFIG_PPC_E500
 void __init setup_spectre_v2(void)
 {
-	if (no_spectrev2 || cpu_mitigations_off())
+	if (no_spectrev2 || mitigations_off())
 		do_btb_flush_fixups();
 	else
 		btb_flush_enabled = true;
@@ -331,7 +331,7 @@ void setup_stf_barrier(void)
 
 	stf_enabled_flush_types = type;
 
-	if (!no_stf_barrier && !cpu_mitigations_off())
+	if (!no_stf_barrier && !mitigations_off())
 		stf_barrier_enable(enable);
 }
 
@@ -530,7 +530,7 @@ void setup_count_cache_flush(void)
 {
 	bool enable = true;
 
-	if (no_spectrev2 || cpu_mitigations_off()) {
+	if (no_spectrev2 || mitigations_off()) {
 		if (security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED) ||
 		    security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED))
 			pr_warn("Spectre v2 mitigations not fully under software control, can't disable\n");
@@ -700,13 +700,13 @@ void setup_rfi_flush(enum l1d_flush_type types, bool enable)
 
 	enabled_flush_types = types;
 
-	if (!cpu_mitigations_off() && !no_rfi_flush)
+	if (!mitigations_off() && !no_rfi_flush)
 		rfi_flush_enable(enable);
 }
 
 void setup_entry_flush(bool enable)
 {
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		return;
 
 	if (!no_entry_flush)
@@ -715,7 +715,7 @@ void setup_entry_flush(bool enable)
 
 void setup_uaccess_flush(bool enable)
 {
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		return;
 
 	if (!no_uaccess_flush)
diff --git a/arch/s390/kernel/nospec-branch.c b/arch/s390/kernel/nospec-branch.c
index d1b16d83e49a..75ec4ad4198b 100644
--- a/arch/s390/kernel/nospec-branch.c
+++ b/arch/s390/kernel/nospec-branch.c
@@ -59,7 +59,7 @@ early_param("nospectre_v2", nospectre_v2_setup_early);
 
 void __init nospec_auto_detect(void)
 {
-	if (test_facility(156) || cpu_mitigations_off()) {
+	if (test_facility(156) || mitigations_off()) {
 		/*
 		 * The machine supports etokens.
 		 * Disable expolines and disable nobp.
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index bb0ab8466b91..45d4c2664011 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -243,7 +244,7 @@ static const char * const mds_strings[] = {
 
 static void __init mds_select_mitigation(void)
 {
-	if (!boot_cpu_has_bug(X86_BUG_MDS) || cpu_mitigations_off()) {
+	if (!boot_cpu_has_bug(X86_BUG_MDS) || mitigations_off()) {
 		mds_mitigation = MDS_MITIGATION_OFF;
 		return;
 	}
@@ -255,7 +256,7 @@ static void __init mds_select_mitigation(void)
 		static_branch_enable(&mds_user_clear);
 
 		if (!boot_cpu_has(X86_BUG_MSBDS_ONLY) &&
-		    (mds_nosmt || cpu_mitigations_auto_nosmt()))
+		    (mds_nosmt || mitigations_auto_nosmt()))
 			cpu_smt_disable(false);
 	}
 }
@@ -317,7 +318,7 @@ static void __init taa_select_mitigation(void)
 		return;
 	}
 
-	if (cpu_mitigations_off()) {
+	if (mitigations_off()) {
 		taa_mitigation = TAA_MITIGATION_OFF;
 		return;
 	}
@@ -358,7 +359,7 @@ static void __init taa_select_mitigation(void)
 	 */
 	static_branch_enable(&mds_user_clear);
 
-	if (taa_nosmt || cpu_mitigations_auto_nosmt())
+	if (taa_nosmt || mitigations_auto_nosmt())
 		cpu_smt_disable(false);
 }
 
@@ -408,7 +409,7 @@ static void __init mmio_select_mitigation(void)
 
 	if (!boot_cpu_has_bug(X86_BUG_MMIO_STALE_DATA) ||
 	     boot_cpu_has_bug(X86_BUG_MMIO_UNKNOWN) ||
-	     cpu_mitigations_off()) {
+	     mitigations_off()) {
 		mmio_mitigation = MMIO_MITIGATION_OFF;
 		return;
 	}
@@ -451,7 +452,7 @@ static void __init mmio_select_mitigation(void)
 	else
 		mmio_mitigation = MMIO_MITIGATION_UCODE_NEEDED;
 
-	if (mmio_nosmt || cpu_mitigations_auto_nosmt())
+	if (mmio_nosmt || mitigations_auto_nosmt())
 		cpu_smt_disable(false);
 }
 
@@ -481,7 +482,7 @@ early_param("mmio_stale_data", mmio_stale_data_parse_cmdline);
 
 static void __init md_clear_update_mitigation(void)
 {
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		return;
 
 	if (!static_key_enabled(&mds_user_clear))
@@ -611,7 +612,7 @@ static void __init srbds_select_mitigation(void)
 		srbds_mitigation = SRBDS_MITIGATION_HYPERVISOR;
 	else if (!boot_cpu_has(X86_FEATURE_SRBDS_CTRL))
 		srbds_mitigation = SRBDS_MITIGATION_UCODE_NEEDED;
-	else if (cpu_mitigations_off() || srbds_off)
+	else if (mitigations_off() || srbds_off)
 		srbds_mitigation = SRBDS_MITIGATION_OFF;
 
 	update_srbds_msr();
@@ -742,7 +743,7 @@ static void __init gds_select_mitigation(void)
 		goto out;
 	}
 
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		gds_mitigation = GDS_MITIGATION_OFF;
 	/* Will verify below that mitigation _can_ be disabled */
 
@@ -841,7 +842,7 @@ static bool smap_works_speculatively(void)
 
 static void __init spectre_v1_select_mitigation(void)
 {
-	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1) || cpu_mitigations_off()) {
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1) || mitigations_off()) {
 		spectre_v1_mitigation = SPECTRE_V1_MITIGATION_NONE;
 		return;
 	}
@@ -974,7 +975,7 @@ static void __init retbleed_select_mitigation(void)
 {
 	bool mitigate_smt = false;
 
-	if (!boot_cpu_has_bug(X86_BUG_RETBLEED) || cpu_mitigations_off())
+	if (!boot_cpu_has_bug(X86_BUG_RETBLEED) || mitigations_off())
 		return;
 
 	switch (retbleed_cmd) {
@@ -1068,7 +1069,7 @@ static void __init retbleed_select_mitigation(void)
 	}
 
 	if (mitigate_smt && !boot_cpu_has(X86_FEATURE_STIBP) &&
-	    (retbleed_nosmt || cpu_mitigations_auto_nosmt()))
+	    (retbleed_nosmt || mitigations_auto_nosmt()))
 		cpu_smt_disable(false);
 
 	/*
@@ -1391,7 +1392,7 @@ static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
 	int ret, i;
 
 	if (cmdline_find_option_bool(boot_command_line, "nospectre_v2") ||
-	    cpu_mitigations_off())
+	    mitigations_off())
 		return SPECTRE_V2_CMD_NONE;
 
 	ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
@@ -1885,7 +1886,7 @@ static enum ssb_mitigation_cmd __init ssb_parse_cmdline(void)
 	int ret, i;
 
 	if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable") ||
-	    cpu_mitigations_off()) {
+	    mitigations_off()) {
 		return SPEC_STORE_BYPASS_CMD_NONE;
 	} else {
 		ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable",
@@ -2283,9 +2284,9 @@ static void __init l1tf_select_mitigation(void)
 	if (!boot_cpu_has_bug(X86_BUG_L1TF))
 		return;
 
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		l1tf_mitigation = L1TF_MITIGATION_OFF;
-	else if (cpu_mitigations_auto_nosmt())
+	else if (mitigations_auto_nosmt())
 		l1tf_mitigation = L1TF_MITIGATION_FLUSH_NOSMT;
 
 	override_cache_bits(&boot_cpu_data);
@@ -2410,7 +2411,7 @@ static void __init srso_select_mitigation(void)
 {
 	bool has_microcode = boot_cpu_has(X86_FEATURE_IBPB_BRTYPE);
 
-	if (cpu_mitigations_off())
+	if (mitigations_off())
 		return;
 
 	if (!boot_cpu_has_bug(X86_BUG_SRSO)) {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 0b1f991b9a31..f0d105f740ed 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6819,7 +6819,7 @@ static int get_nx_huge_pages(char *buffer, const struct kernel_param *kp)
 static bool get_nx_auto_mode(void)
 {
 	/* Return true when CPU has the bug, and mitigations are ON */
-	return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT) && !cpu_mitigations_off();
+	return boot_cpu_has_bug(X86_BUG_ITLB_MULTIHIT) && !mitigations_off();
 }
 
 static void __set_nx_huge_pages(bool val)
diff --git
a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c index 669ba1c345b3..16a63c241e1e 100644 --- a/arch/x86/mm/pti.c +++ b/arch/x86/mm/pti.c @@ -28,6 +28,7 @@ #include #include #include +#include #include #include @@ -84,7 +85,7 @@ void __init pti_check_boottime_disable(void) return; } - if (cpu_mitigations_off()) + if (mitigations_off()) pti_mode = PTI_FORCE_OFF; if (pti_mode == PTI_FORCE_OFF) { pti_print_if_insecure("disabled on command line."); diff --git a/include/linux/bpf.h b/include/linux/bpf.h index e30100597d0a..04356b9fa82a 100644 --- a/include/linux/bpf.h +++ b/include/linux/bpf.h @@ -30,6 +30,7 @@ #include #include #include +#include struct bpf_verifier_env; struct bpf_verifier_log; @@ -2214,12 +2215,12 @@ static inline bool bpf_allow_uninit_stack(void) static inline bool bpf_bypass_spec_v1(void) { - return cpu_mitigations_off() || perfmon_capable(); + return mitigations_off() || perfmon_capable(); } static inline bool bpf_bypass_spec_v4(void) { - return cpu_mitigations_off() || perfmon_capable(); + return mitigations_off() || perfmon_capable(); } int bpf_map_new_fd(struct bpf_map *map, int flags); diff --git a/include/linux/cpu.h b/include/linux/cpu.h index fc8094419084..b8c81d924a62 100644 --- a/include/linux/cpu.h +++ b/include/linux/cpu.h @@ -212,7 +212,4 @@ void cpuhp_report_idle_dead(void); static inline void cpuhp_report_idle_dead(void) { } #endif /* #ifdef CONFIG_HOTPLUG_CPU */ -extern bool cpu_mitigations_off(void); -extern bool cpu_mitigations_auto_nosmt(void); - #endif /* _LINUX_CPU_H_ */ diff --git a/include/linux/mitigations.h b/include/linux/mitigations.h new file mode 100644 index 000000000000..5acc80d49230 --- /dev/null +++ b/include/linux/mitigations.h @@ -0,0 +1,4 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +extern bool mitigations_off(void); +extern bool mitigations_auto_nosmt(void); diff --git a/kernel/Makefile b/kernel/Makefile index ce105a5558fc..d1514432bbc7 100644 --- a/kernel/Makefile +++ b/kernel/Makefile @@ -10,7 +10,8 @@ obj-y = fork.o 
exec_domain.o panic.o \ extable.o params.o \ kthread.o sys_ni.o nsproxy.o \ notifier.o ksysfs.o cred.o reboot.o \ - async.o range.o smpboot.o ucount.o regset.o ksyms_common.o + async.o range.o smpboot.o ucount.o regset.o ksyms_common.o \ + mitigations.o obj-$(CONFIG_USERMODE_DRIVER) += usermode_driver.o obj-$(CONFIG_MULTIUSER) += groups.o diff --git a/kernel/cpu.c b/kernel/cpu.c index e6ec3ba4950b..e273478cd437 100644 --- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -3195,46 +3195,3 @@ void __init boot_cpu_hotplug_init(void) this_cpu_write(cpuhp_state.state, CPUHP_ONLINE); this_cpu_write(cpuhp_state.target, CPUHP_ONLINE); } - -/* - * These are used for a global "mitigations=" cmdline option for toggling - * optional CPU mitigations. - */ -enum cpu_mitigations { - CPU_MITIGATIONS_OFF, - CPU_MITIGATIONS_AUTO, - CPU_MITIGATIONS_AUTO_NOSMT, -}; - -static enum cpu_mitigations cpu_mitigations __ro_after_init = - CPU_MITIGATIONS_AUTO; - -static int __init mitigations_parse_cmdline(char *arg) -{ - if (!strcmp(arg, "off")) - cpu_mitigations = CPU_MITIGATIONS_OFF; - else if (!strcmp(arg, "auto")) - cpu_mitigations = CPU_MITIGATIONS_AUTO; - else if (!strcmp(arg, "auto,nosmt")) - cpu_mitigations = CPU_MITIGATIONS_AUTO_NOSMT; - else - pr_crit("Unsupported mitigations=%s, system may still be vulnerable\n", - arg); - - return 0; -} -early_param("mitigations", mitigations_parse_cmdline); - -/* mitigations=off */ -bool cpu_mitigations_off(void) -{ - return cpu_mitigations == CPU_MITIGATIONS_OFF; -} -EXPORT_SYMBOL_GPL(cpu_mitigations_off); - -/* mitigations=auto,nosmt */ -bool cpu_mitigations_auto_nosmt(void) -{ - return cpu_mitigations == CPU_MITIGATIONS_AUTO_NOSMT; -} -EXPORT_SYMBOL_GPL(cpu_mitigations_auto_nosmt); diff --git a/kernel/mitigations.c b/kernel/mitigations.c new file mode 100644 index 000000000000..2828a755a719 --- /dev/null +++ b/kernel/mitigations.c @@ -0,0 +1,50 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include +#include +#include +#include +#include + +enum mitigations 
{ + MITIGATIONS_OFF, + MITIGATIONS_AUTO, + MITIGATIONS_AUTO_NOSMT, +}; + +static enum mitigations mitigations __ro_after_init = + MITIGATIONS_AUTO; + +/* + * The "mitigations=" cmdline option is for toggling optional CPU or software + * mitigations which may impact performance. Mitigations should only be turned + * off if user space and VMs are running trusted code. + */ +static int __init mitigations_parse_cmdline(char *arg) +{ + if (!strcmp(arg, "off")) + mitigations = MITIGATIONS_OFF; + else if (!strcmp(arg, "auto")) + mitigations = MITIGATIONS_AUTO; + else if (!strcmp(arg, "auto,nosmt")) + mitigations = MITIGATIONS_AUTO_NOSMT; + else + pr_crit("Unsupported mitigations=%s, system may still be vulnerable\n", arg); + + return 0; +} +early_param("mitigations", mitigations_parse_cmdline); + +/* mitigations=off */ +bool mitigations_off(void) +{ + return mitigations == MITIGATIONS_OFF; +} +EXPORT_SYMBOL_GPL(mitigations_off); + +/* mitigations=auto,nosmt */ +bool mitigations_auto_nosmt(void) +{ + return mitigations == MITIGATIONS_AUTO_NOSMT; +} +EXPORT_SYMBOL_GPL(mitigations_auto_nosmt); From patchwork Wed Jan 17 16:14:46 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Josh Poimboeuf X-Patchwork-Id: 188887 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:6358:414:b0:176:5f3:c5eb with SMTP id 20csp779492rwd; Wed, 17 Jan 2024 08:16:53 -0800 (PST) X-Google-Smtp-Source: AGHT+IHTMCg9l90lrtvu8cKwcmtMzMEB5QlGX72ATYthv1CLQXVvNAVxOWVRPIcXAvqudTcv18Ud X-Received: by 2002:a05:622a:1710:b0:429:a14e:12c1 with SMTP id h16-20020a05622a171000b00429a14e12c1mr12125903qtk.27.1705508212951; Wed, 17 Jan 2024 08:16:52 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1705508212; cv=pass; d=google.com; s=arc-20160816; b=bN4qfRHRFjniTGve6EqbaifIS0uZIvsfTI8DAiWSmv57/sYcvVqOus6qSoJJm5w+TB gq0rlmK4TkmUEighIqAkpTSZobY0CHRlyci6aW1PBec8QHRqeAgEQ7bEP3UgWWpGKm1p 
Ck6RGM8R7EkLnaGDbpGbKboIuqcGbZMMOaLRo+Q9RUvNLz6BtgiKLMnqCNDsTJQalU7e /QZSr/n2MXRXZ9TeC5imYuo51t4e9Mox5bipbWLK0EtBa2gmheLgzjHjNb5DEA09aklp fAWLzc62dMGNc5f3ZjJ2xukLJ7nW7JtPA4zim9dfCT24GQ7ztOoHz7+m1R7zG8N0k5gj iNhw== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=Kmbez58j5lCnAqbtgiv72eK9zQgDnqcShPxrJ1mOWnM=; fh=mzZ5OKkyWbFHsrLyCorOBXM9JoluhtmgZuUWxRX6aw0=; b=swPsi2ambz04mQtet+1cB+F4tSNO8YECJG7TiH9bgV5IB7/Db18p8r3OZxpUI52hEV rYa8ye14WJWUWODyeAjDpO9AqcxznTTEZDh6K2vTlsu8vOPHWRJFj8XkUY9eqSZ3oXVG jJLlcGSqTxcXyD1EkW4MyVj2RmkbT0o6dArMhhK5Oibke1UGluJ9QvcaeNU5l4fyPDeM q4vOcnpaOOMoKnLQnZc1ZkHrVSBgY2rfWWACxrE4WCsfcUipQ923qoXNyGSTngRNzLuq AqQKpxMInItI5UVphDV42Yy583WHMkUD7EjSuPAnzveWb8wHbXNNty/Kxtb4L9dc3dOk kv+Q== ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=maEHAzHm; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-29203-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-29203-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. 
[2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id e18-20020ac85992000000b00429986ca00asi11930878qte.511.2024.01.17.08.16.52 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 Jan 2024 08:16:52 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-29203-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=k20201202 header.b=maEHAzHm; arc=pass (i=1 dkim=pass dkdomain=kernel.org); spf=pass (google.com: domain of linux-kernel+bounces-29203-ouuuleilei=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-29203-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 5B38C1C257A3 for ; Wed, 17 Jan 2024 16:16:27 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id A4B2B23752; Wed, 17 Jan 2024 16:15:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="maEHAzHm" Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E403822638; Wed, 17 Jan 2024 16:15:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705508111; cv=none; 
b=PIgYJawRp8KGcLl1nPR7I2ndMWRaRL/807XYJOcQWTgOYaxXm6OMH593HvTt1fXQuxTpoa7GLJDVrEHBmCn44k62d/PAD/crmO9O1mDkj8X3eimWJUcE0MlPjKoH/0VWV7oN9jYsY1R/fwiS5DqmKR1LGApfl1B/i2v29PC8vhk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1705508111; c=relaxed/simple; bh=jDzObRYwRkapJySKK0oWQtxYQUq2OVssGctw+1duV94=; h=Received:DKIM-Signature:From:To:Cc:Subject:Date:Message-ID: X-Mailer:In-Reply-To:References:MIME-Version: Content-Transfer-Encoding; b=OTy5/2m4f+xv2yh+ryQzqaccw+hp68wgY/9EtZWhg0sTteIdBN/I5H8GxkpCiLGIA40I3M0jYzKfwCae7O28tf9/HwJdss9yiHHRqeB8Fyp10OeH72/8qRf2SJ8JSmNdgopbTXxe/TBFXdpY5/qnbalzwhh9o76Y37YXfPsuo+0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=maEHAzHm; arc=none smtp.client-ip=10.30.226.201 Received: by smtp.kernel.org (Postfix) with ESMTPSA id CF1BAC43609; Wed, 17 Jan 2024 16:15:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1705508110; bh=jDzObRYwRkapJySKK0oWQtxYQUq2OVssGctw+1duV94=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=maEHAzHmX242xFhjDsTXdUbPh+/313gooww6/wGhfud2C6V+ivwG3IIUyyDsblUjI 05Cfc0FLTgiLtets7cdvt9U1BAaDYytmPqjoHp9v4TnGh8hJFgappYvNSadYIZD/lD Ma/df36cpC/lrmGPKu+YGfwmLsCZBQVQEtwaL1F7tdtCu8EBEZBhvc7RrCn1L8LYwJ fvIikk7HgNpBwTIlWV3cNxJgTC0GW6JTKfU1AfajNo4VedSGPAITaQkqyC23exjI6M Yll3SXrambsT2BAdjL4GzfrmGfa7A3h8IX0XJECuRirErY6rcQYAcJ2Vo4ChBB1lwD Yw7CgleyUXxhg== From: Josh Poimboeuf To: Linus Torvalds , Jeff Layton , Chuck Lever , Shakeel Butt , Roman Gushchin , Johannes Weiner , Michal Hocko Cc: linux-kernel@vger.kernel.org, Jens Axboe , Tejun Heo , Vasily Averin , Michal Koutny , Waiman Long , Muchun Song , Jiri Kosina , cgroups@vger.kernel.org, linux-mm@kvack.org Subject: [PATCH RFC 4/4] mitigations: Add flock cache accounting to 'mitigations=off' Date: Wed, 17 Jan 2024 08:14:46 -0800 Message-ID: 
<3e803d5aee5dd1f4c738f0de1e839e6cfcb9dc41.1705507931.git.jpoimboe@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1788354980055765088 X-GMAIL-MSGID: 1788354980055765088 Allow flock cache accounting to be disabled with 'mitigations=off', as it fits the profile for that option: trusted user space combined with a performance-impacting mitigation. Also, for consistency with the other CONFIG_MITIGATION_* options, rename CONFIG_FLOCK_ACCOUNTING to CONFIG_MITIGATION_FLOCK_ACCOUNTING. Signed-off-by: Josh Poimboeuf --- Documentation/admin-guide/kernel-parameters.txt | 4 ++++ fs/Kconfig | 2 +- fs/locks.c | 5 +++-- 3 files changed, 8 insertions(+), 3 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 24e873351368..b31fe7433b48 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -3427,6 +3427,10 @@ ssbd=force-off [ARM64] tsx_async_abort=off [X86] + Software mitigations: + --------------------- + flock_accounting=off [KNL] + Exceptions: This does not have any effect on kvm.nx_huge_pages when diff --git a/fs/Kconfig b/fs/Kconfig index 591f54a03059..4345b79d3b40 100644 --- a/fs/Kconfig +++ b/fs/Kconfig @@ -129,7 +129,7 @@ config FILE_LOCKING for filesystems like NFS and for the flock() system call. Disabling this option saves about 11k. 
 
-config FLOCK_ACCOUNTING
+config MITIGATION_FLOCK_ACCOUNTING
 	bool "Enable kernel memory accounting for file locks" if EXPERT
 	depends on FILE_LOCKING
 	default y
diff --git a/fs/locks.c b/fs/locks.c
index e2799a18c4e8..fd4157ccd504 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -64,6 +64,7 @@
 #include
 #include
 #include
+#include
 
 #define CREATE_TRACE_POINTS
 #include
@@ -2905,7 +2906,7 @@ static int __init proc_locks_init(void)
 fs_initcall(proc_locks_init);
 #endif
 
-static bool flock_accounting __ro_after_init = IS_ENABLED(CONFIG_FLOCK_ACCOUNTING);
+static bool flock_accounting __ro_after_init = IS_ENABLED(CONFIG_MITIGATION_FLOCK_ACCOUNTING);
 
 static int __init flock_accounting_cmdline(char *str)
 {
@@ -2930,7 +2931,7 @@ static int __init filelock_init(void)
 	int i;
 	slab_flags_t flags = SLAB_PANIC;
 
-	if (!flock_accounting)
+	if (mitigations_off() || !flock_accounting)
 		pr_err(FLOCK_ACCOUNTING_MSG);
 	else
 		flags |= SLAB_ACCOUNT;
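
For reviewers who want to sanity-check the resulting behavior, the combined effect of the "mitigations=" parser and the filelock_init() condition can be modeled in user space. This is only a sketch, not kernel code: parse_mitigations() stands in for mitigations_parse_cmdline(), the __init/__ro_after_init annotations and pr_crit() are replaced with plain C, and flock_cache_accounted() is a hypothetical helper that does not exist in the patch. It just returns whether SLAB_ACCOUNT would end up in the slab flags.

```c
/* Userspace model only; names mirror the patch but this is not kernel code. */
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

enum mitigations {
	MITIGATIONS_OFF,
	MITIGATIONS_AUTO,
	MITIGATIONS_AUTO_NOSMT,
};

/* Default matches the patch: "auto" unless the command line says otherwise. */
enum mitigations mitigations = MITIGATIONS_AUTO;

/*
 * Stand-in for mitigations_parse_cmdline(). As in the patch, an
 * unsupported value is reported and leaves the current state unchanged.
 */
int parse_mitigations(const char *arg)
{
	if (!strcmp(arg, "off"))
		mitigations = MITIGATIONS_OFF;
	else if (!strcmp(arg, "auto"))
		mitigations = MITIGATIONS_AUTO;
	else if (!strcmp(arg, "auto,nosmt"))
		mitigations = MITIGATIONS_AUTO_NOSMT;
	else
		fprintf(stderr, "Unsupported mitigations=%s\n", arg);

	return 0;
}

/* mitigations=off */
bool mitigations_off(void)
{
	return mitigations == MITIGATIONS_OFF;
}

/*
 * Hypothetical helper (not in the patch): true when filelock_init()
 * would take the else branch and OR SLAB_ACCOUNT into the slab flags.
 */
bool flock_cache_accounted(bool flock_accounting)
{
	return !(mitigations_off() || !flock_accounting);
}
```

With the default state, flock_cache_accounted(true) is true. After parse_mitigations("off") it is false regardless of the flock_accounting setting, which is the behavior the commit message describes.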