Message ID | 20231010092133.4093612-1-hi@alyssa.is |
---|---|
State | New |
Headers |
From: Alyssa Ross <hi@alyssa.is> To: Alexander Viro <viro@zeniv.linux.org.uk>, Christian Brauner <brauner@kernel.org> Cc: Kees Cook <keescook@chromium.org>, Tetsuo Handa <penguin-kernel@I-love.SAKURA.ne.jp>, Eric Biederman <ebiederm@xmission.com>, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: [PATCH] exec: allow executing block devices Date: Tue, 10 Oct 2023 09:21:33 +0000 Message-ID: <20231010092133.4093612-1-hi@alyssa.is> X-Mailer: git-send-email 2.42.0 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit List-ID: <linux-kernel.vger.kernel.org> |
Series |
exec: allow executing block devices
|
|
Commit Message
Alyssa Ross
Oct. 10, 2023, 9:21 a.m. UTC
As far as I can tell, the S_ISREG() check is there to prevent
executing files where that would be nonsensical, like directories,
fifos, or sockets. But the semantics for executing a block device are
quite obvious: the block device acts just like a regular file.

My use case is having a common VM image that takes a configurable
payload to run. The payload will always be a single ELF file.

I could share the file with virtio-fs, or I could create a disk image
containing a filesystem containing the payload, but both of those add
unnecessary layers of indirection when all I need to do is share a
single executable blob with the VM. Sharing it as a block device is
the most natural thing to do, aside from the (arbitrary, as far as I
can tell) restriction on executing block devices. (The only slight
complexity is that I need to ensure that my payload size is rounded up
to a whole number of sectors, but that's trivial and fast in
comparison to e.g. generating a filesystem image.)

Signed-off-by: Alyssa Ross <hi@alyssa.is>
---
fs/exec.c | 6 ++++--
fs/namei.c | 2 +-
2 files changed, 5 insertions(+), 3 deletions(-)
base-commit: 94f6f0550c625fab1f373bb86a6669b45e9748b3
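The sector rounding mentioned in the commit message could look roughly like the sketch below. It is only an illustration: the helper name round_up_to_sector() and the choice of sector size are assumptions, not part of the patch, and the real sector size depends on the device.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Round size up to the next multiple of sector_size.
 * Hypothetical helper: sector_size is supplied by the caller (512 is a
 * common value, but that is an assumption) and must be non-zero.
 */
uint64_t round_up_to_sector(uint64_t size, uint64_t sector_size)
{
	assert(sector_size != 0);

	uint64_t rem = size % sector_size;

	/* Already aligned sizes are returned unchanged. */
	return rem ? size + (sector_size - rem) : size;
}
```

The payload file would then be padded out to the returned size before being exposed to the VM as a block device.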
Comments
On Tue, Oct 10, 2023 at 09:21:33AM +0000, Alyssa Ross wrote:
> As far as I can tell, the S_ISREG() check is there to prevent
> executing files where that would be nonsensical, like directories,
> fifos, or sockets. But the semantics for executing a block device are
> quite obvious: the block device acts just like a regular file.
>
> My use case is having a common VM image that takes a configurable
> payload to run. The payload will always be a single ELF file.
>
> I could share the file with virtio-fs, or I could create a disk image
> containing a filesystem containing the payload, but both of those add
> unnecessary layers of indirection when all I need to do is share a
> single executable blob with the VM. Sharing it as a block device is
> the most natural thing to do, aside from the (arbitrary, as far as I
> can tell) restriction on executing block devices. (The only slight
> complexity is that I need to ensure that my payload size is rounded up
> to a whole number of sectors, but that's trivial and fast in
> comparison to e.g. generating a filesystem image.)
>
> Signed-off-by: Alyssa Ross <hi@alyssa.is>

Hi,

Thanks for the suggestion! I would prefer to not change this rather core
behavior in the kernel for a few reasons, but it mostly revolves around
both user and developer expectations and the resulting fragility.

For users, this hasn't been possible in the past, so if we make it
possible, what situations are suddenly exposed on systems that are trying
to very carefully control their execution environments?

For developers, this ends up exercising code areas that have never been
tested, and could lead to unexpected conditions. For example,
deny_write_access() is explicitly documented as "for regular files".
Perhaps it accidentally works with block devices, but this would need
much more careful examination, etc.

And while looking at this from a design perspective, it looks like a
layering violation: roughly speaking, the kernel executes files, from
filesystems, from block devices. Bypassing layers tends to lead to
troublesome bugs and other weird problems.

I wonder, though, if you can already get what you need through other
existing mechanisms that aren't too much more hassle? For example,
what about having a tool that creates a memfd from a block device and
executes that? The memfd code has been used in a lot of odd exec corner
cases in the past...

-Kees
Kees Cook <keescook@chromium.org> writes:

> On Tue, Oct 10, 2023 at 09:21:33AM +0000, Alyssa Ross wrote:
>> As far as I can tell, the S_ISREG() check is there to prevent
>> executing files where that would be nonsensical, like directories,
>> fifos, or sockets. But the semantics for executing a block device are
>> quite obvious: the block device acts just like a regular file.
>>
>> My use case is having a common VM image that takes a configurable
>> payload to run. The payload will always be a single ELF file.
>>
>> I could share the file with virtio-fs, or I could create a disk image
>> containing a filesystem containing the payload, but both of those add
>> unnecessary layers of indirection when all I need to do is share a
>> single executable blob with the VM. Sharing it as a block device is
>> the most natural thing to do, aside from the (arbitrary, as far as I
>> can tell) restriction on executing block devices. (The only slight
>> complexity is that I need to ensure that my payload size is rounded up
>> to a whole number of sectors, but that's trivial and fast in
>> comparison to e.g. generating a filesystem image.)
>>
>> Signed-off-by: Alyssa Ross <hi@alyssa.is>
>
> Hi,
>
> Thanks for the suggestion! I would prefer to not change this rather core
> behavior in the kernel for a few reasons, but it mostly revolves around
> both user and developer expectations and the resulting fragility.
>
> For users, this hasn't been possible in the past, so if we make it
> possible, what situations are suddenly exposed on systems that are trying
> to very carefully control their execution environments?

I expect very few, considering it's still necessary to have root chmod
the block device to make it executable.

> For developers, this ends up exercising code areas that have never been
> tested, and could lead to unexpected conditions. For example,
> deny_write_access() is explicitly documented as "for regular files".
> Perhaps it accidentally works with block devices, but this would need
> much more careful examination, etc.
>
> And while looking at this from a design perspective, it looks like a
> layering violation: roughly speaking, the kernel executes files, from
> filesystems, from block devices. Bypassing layers tends to lead to
> troublesome bugs and other weird problems.
>
> I wonder, though, if you can already get what you need through other
> existing mechanisms that aren't too much more hassle? For example,
> what about having a tool that creates a memfd from a block device and
> executes that? The memfd code has been used in a lot of odd exec corner
> cases in the past...

Is it possible to have a file-backed memfd? Strange name if so!
On Wed, Oct 11, 2023 at 07:38:39AM +0000, Alyssa Ross wrote:
> Is it possible to have a file-backed memfd? Strange name if so!
Not that I'm aware, but a program could just read the ELF from the block
device and stick it in a memfd and execute the result.
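A minimal userspace sketch of that suggestion follows. It is one possible reading of "read the ELF into a memfd and execute it", not code from this thread: the function names copy_to_memfd() and exec_via_memfd() and the memfd name "payload" are made up for illustration. It assumes the device holds nothing but the (padded) payload, since the whole device is copied, and that the payload is an ELF binary rather than a script, because the memfd is opened close-on-exec.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Copy everything readable from src_fd into a fresh memfd.
 * Returns the memfd on success, or -1 with errno set.
 */
int copy_to_memfd(int src_fd)
{
	char buf[65536];
	int saved;

	assert(src_fd >= 0);

	int mfd = memfd_create("payload", MFD_CLOEXEC);
	if (mfd < 0)
		return -1;

	for (;;) {
		ssize_t n = read(src_fd, buf, sizeof(buf));
		if (n == 0)
			break;			/* end of device/file */
		if (n < 0) {
			if (errno == EINTR)
				continue;	/* retry interrupted read */
			goto fail;
		}
		/* write() may be short, so loop until the chunk is stored */
		for (ssize_t off = 0; off < n; ) {
			ssize_t w = write(mfd, buf + off, (size_t)(n - off));
			if (w < 0) {
				if (errno == EINTR)
					continue;
				goto fail;
			}
			off += w;
		}
	}
	return mfd;

fail:
	saved = errno;
	close(mfd);
	errno = saved;
	return -1;
}

/*
 * Open path (for example a block device node), copy it into a memfd
 * and execute the copy. Returns only on failure: -1 with errno set.
 */
int exec_via_memfd(const char *path, char *const argv[], char *const envp[])
{
	int src = open(path, O_RDONLY | O_CLOEXEC);
	if (src < 0)
		return -1;

	int mfd = copy_to_memfd(src);
	int saved = errno;
	close(src);
	if (mfd < 0) {
		errno = saved;
		return -1;
	}

	fexecve(mfd, argv, envp);	/* replaces the process on success */
	saved = errno;
	close(mfd);
	errno = saved;
	return -1;
}
```

A small launcher inside the VM could call exec_via_memfd() with the device path and the desired argv. The cost compared with the patch is one full copy of the payload into memory before it runs.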
Hello,

kernel test robot noticed "kernel-selftests.exec.non-regular.fail" on:

commit: f086dcc88a64a2022314af666bd15d64c6748d27 ("[PATCH] exec: allow executing block devices")
url: https://github.com/intel-lab-lkp/linux/commits/Alyssa-Ross/exec-allow-executing-block-devices/20231010-172704
patch link: https://lore.kernel.org/all/20231010092133.4093612-1-hi@alyssa.is/
patch subject: [PATCH] exec: allow executing block devices

in testcase: kernel-selftests
version: kernel-selftests-x86_64-60acb023-1_20230329
with following parameters:

	group: group-01

compiler: gcc-12
test machine: 36 threads 1 sockets Intel(R) Core(TM) i9-10980XE CPU @ 3.00GHz (Cascade Lake) with 32G memory

(please refer to attached dmesg/kmsg for entire log/backtrace)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202310201132.ec34d76b-oliver.sang@intel.com

# timeout set to 300
# selftests: exec: non-regular
# TAP version 13
# 1..6
# # Starting 6 tests from 6 test cases.
# # RUN file.S_IFLNK.exec_errno ...
# # OK file.S_IFLNK.exec_errno
# ok 1 file.S_IFLNK.exec_errno
# # RUN file.S_IFDIR.exec_errno ...
# # OK file.S_IFDIR.exec_errno
# ok 2 file.S_IFDIR.exec_errno
# # RUN file.S_IFBLK.exec_errno ...
# # non-regular.c:166:exec_errno:Expected errno (6) == variant->expected (13)
# # exec_errno: Test failed at step #4
# # FAIL file.S_IFBLK.exec_errno
# not ok 3 file.S_IFBLK.exec_errno
# # RUN file.S_IFCHR.exec_errno ...
# # OK file.S_IFCHR.exec_errno
# ok 4 file.S_IFCHR.exec_errno
# # RUN file.S_IFIFO.exec_errno ...
# # OK file.S_IFIFO.exec_errno
# ok 5 file.S_IFIFO.exec_errno
# # RUN sock.exec_errno ...
# # OK sock.exec_errno
# ok 6 sock.exec_errno
# # FAILED: 5 / 6 tests passed.
# # Totals: pass:5 fail:1 xfail:0 xpass:0 skip:0 error:0
not ok 5 selftests: exec: non-regular # exit=1

The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20231020/202310201132.ec34d76b-oliver.sang@intel.com
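For reading the failure above: on Linux, errno 13 is EACCES and errno 6 is ENXIO. The selftest expects execve() on a block device node to be refused with EACCES; with the patch applied the attempt evidently gets past that check and fails later with ENXIO, presumably because the test's device node has no real device behind it. The sketch below shows the kind of check involved. It is not the selftest's code, and exec_errno() here is a hypothetical helper that borrows the test's name.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/*
 * Try to execute path and return the errno that execve() reports.
 * Only meaningful for paths where execve() is expected to fail: on
 * success the calling process is replaced and this never returns.
 */
int exec_errno(const char *path)
{
	assert(path != NULL);

	char *const argv[] = { (char *)path, NULL };
	char *const envp[] = { NULL };

	errno = 0;
	execve(path, argv, envp);
	return errno;
}
```

For example, exec_errno("/") returns EACCES, because a directory is not a regular file; the selftest checks the same errno for each non-regular file type it creates.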
diff --git a/fs/exec.c b/fs/exec.c
index 6518e33ea813..e29a9f16da5f 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -148,7 +148,8 @@ SYSCALL_DEFINE1(uselib, const char __user *, library)
 	 * and check again at the very end too.
 	 */
 	error = -EACCES;
-	if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
+	if (WARN_ON_ONCE((!S_ISREG(file_inode(file)->i_mode) &&
+			  !S_ISBLK(file_inode(file)->i_mode)) ||
 			 path_noexec(&file->f_path)))
 		goto exit;
 
@@ -931,7 +932,8 @@ static struct file *do_open_execat(int fd, struct filename *name, int flags)
 	 * and check again at the very end too.
 	 */
 	err = -EACCES;
-	if (WARN_ON_ONCE(!S_ISREG(file_inode(file)->i_mode) ||
+	if (WARN_ON_ONCE((!S_ISREG(file_inode(file)->i_mode) &&
+			  !S_ISBLK(file_inode(file)->i_mode)) ||
 			 path_noexec(&file->f_path)))
 		goto exit;
 
diff --git a/fs/namei.c b/fs/namei.c
index 567ee547492b..60c89321604a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3254,7 +3254,7 @@ static int may_open(struct mnt_idmap *idmap, const struct path *path,
 		fallthrough;
 	case S_IFIFO:
 	case S_IFSOCK:
-		if (acc_mode & MAY_EXEC)
+		if ((inode->i_mode & S_IFMT) != S_IFBLK && (acc_mode & MAY_EXEC))
 			return -EACCES;
 		flag &= ~O_TRUNC;
 		break;