Message ID | 20240221210626.155534-1-adrian.ratiu@collabora.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel+bounces-75494-ouuuleilei=gmail.com@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a05:693c:2685:b0:108:e6aa:91d0 with SMTP id mn5csp1298115dyc; Wed, 21 Feb 2024 13:07:11 -0800 (PST) X-Forwarded-Encrypted: i=3; AJvYcCXyVFgdkmNCY4vJRa7NCnt19zCgLgwnWaVOCpEzhwUzJjbtyFbvSHeKZbyo6FFedTWDMgZ4C7yemzNpOx1II54rtdUr7A== X-Google-Smtp-Source: AGHT+IExRHzcZH6p6Dun7igWVDsmsPgLLNGUB/Vy7GXvorVoUrx07rQhzUyF5xKAJcIztyL2Eecw X-Received: by 2002:a05:622a:1b91:b0:42c:74f8:1221 with SMTP id bp17-20020a05622a1b9100b0042c74f81221mr26729938qtb.33.1708549631461; Wed, 21 Feb 2024 13:07:11 -0800 (PST) ARC-Seal: i=2; a=rsa-sha256; t=1708549631; cv=pass; d=google.com; s=arc-20160816; b=L80YDx709hrThvOSmlevN2fFFsbQpoMIwjD+m93uSf1K/O0a4hugz0x0KsMr9WAosf 3v5DCFNjsXC8fLvWBqVRu1LuFyexELMdty2jJGJB9yD3NU9o13Cp2bcP5N+KwIvYTse0 aGRPHAI8Xb7wLuY6yM7GftZ45KlF2pGrpFMRhYz0cUDD9oWigLg4lnGcyjH1M4jJnQfq aSQYpB7VHByefVYadPjNta/wU/23YnBIMA2+zFqk48JDrvFY1w/rAdEREec99NaIr2va gcvefaZT1rPDkVxt1zVNNQ22i4BE9VYaVsQG2MFP1cL8fh/nsLE8hhbVc56u4azZIxld CNSg== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:message-id:date:subject:cc:to :from:dkim-signature; bh=/zzex0Md3xlvrCgcQ1O6HrtJFLIRYcBC92lrcH3OCpY=; fh=RRitfpq+InK+G4SnrnIsHasq3GrzSKig5AkEbBMFW/g=; b=SryKcXUgkxGQFSRjSOv5iFRAFB55sJzhbLTzoomI487LBVMiwoUq+EuBmSj6qetRpR 0SibsRVN0/Wxbjm5Ix7aPZLTOXdZP2j80CkbQa7FIQFUHvZVwm4XFn1rnWJ0wNvzGaDE VN+5/JkP6pLNLx54cLwSf6ctVZG+YLHplhuF8ukIgpkE522dR6IHhoQYxE8Wvs8qtKKF LL9v1qnTqmpRRndCAHcT4APQBfT/r9JEQ7AS1d63lK80Zc09zapOLIR7lSGNsR/4lrnR ynHD8zvc/ecnuicp6Oh4ntHDV5/b8nW69159rgQjx5Xuld1DeBQUgBb5MwcAkf6gBT/Y KyXA==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b=Rfw6Bzj3; arc=pass (i=1 spf=pass spfdomain=collabora.com dkim=pass dkdomain=collabora.com dmarc=pass fromdomain=collabora.com); spf=pass (google.com: domain of linux-kernel+bounces-75494-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-75494-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [147.75.199.223]) by mx.google.com with ESMTPS id f11-20020ac8498b000000b0042dd382928fsi10943383qtq.356.2024.02.21.13.07.11 for <ouuuleilei@gmail.com> (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 21 Feb 2024 13:07:11 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-75494-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) client-ip=147.75.199.223; Authentication-Results: mx.google.com; dkim=pass header.i=@collabora.com header.s=mail header.b=Rfw6Bzj3; arc=pass (i=1 spf=pass spfdomain=collabora.com dkim=pass dkdomain=collabora.com dmarc=pass fromdomain=collabora.com); spf=pass (google.com: domain of linux-kernel+bounces-75494-ouuuleilei=gmail.com@vger.kernel.org designates 147.75.199.223 as permitted sender) smtp.mailfrom="linux-kernel+bounces-75494-ouuuleilei=gmail.com@vger.kernel.org"; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=collabora.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id 314C01C21796 for <ouuuleilei@gmail.com>; Wed, 21 Feb 2024 21:07:11 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 15B7C127B52; Wed, 21 Feb 2024 21:06:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b="Rfw6Bzj3" Received: from madrid.collaboradmins.com (madrid.collaboradmins.com [46.235.227.194]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 38895126F3A; Wed, 21 Feb 2024 21:06:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=46.235.227.194 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708549596; cv=none; b=BtmEEpaz339wLeSQ0QohQc3TRhQTC2yTgjW/TtE5755mDAwyo55PU/ZnROObHju8ZlswN6FUQMrV41ExVI6r9CtuCEhTDQ67djkmqbNDeppIAc1821Q17xpfnt5lI9cO52cn2jedYFGt32q3qbJu5aydHZdqPWiQjCYPvWu2nW0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1708549596; c=relaxed/simple; bh=8VXcAAJto8svyUnZ97pB7BNBTXBuRwzYNmSFpJXmiLs=; h=From:To:Cc:Subject:Date:Message-ID:MIME-Version; b=j6148Rj+X7jhOBgmDLSPuy2PM1NBHbX0Bku6gh/XMS667Bt359Xe+VezS7Cq8NiLGeTeQwmb+eUlIVPey1TMEreMZVjvvP+GVy1RLOiR1haigNQDl5JG1sg5t9N1sXsAHJ/0pJDaY9ZDTVLgJzWsU4LT2Sy3z62fAKmq61KyJC4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=collabora.com; spf=pass smtp.mailfrom=collabora.com; dkim=pass (2048-bit key) header.d=collabora.com header.i=@collabora.com header.b=Rfw6Bzj3; arc=none smtp.client-ip=46.235.227.194 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=collabora.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=collabora.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=collabora.com; s=mail; t=1708549593; bh=8VXcAAJto8svyUnZ97pB7BNBTXBuRwzYNmSFpJXmiLs=; h=From:To:Cc:Subject:Date:From; b=Rfw6Bzj3tefn3WJ80NhDISRYJyhzmubBd3g7vnQA01bWTgOvBAuF4u3KfJgDFJ6bo PHAcLX+0Td8RDueLE4WCAlRg9HTPaW2Y8jfyIwFUyYRz8TBoR0aWp7m0qXafnDdeYT joTS2suyoGcf9sKky0E7nFeMtDHOi94M+GFwdiHtjWKfHkGXziB0WhdMXaHHtefe8P 7Dvm/fT6c/i6/MRtdw6krH/lsKGeJYgKg75YTelZ0H1hVr9LorZbY3cZ7tIf0aCUjO DtacEfTSJWuWccTgP4xSjcZaCMZWGDHSFfA8/mw+qf67bJBe+dJGNV4kdjsxgmAAwd 7D7Eic7Z3uqqg== Received: from localhost.localdomain (cola.collaboradmins.com [195.201.22.229]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: aratiu) by madrid.collaboradmins.com (Postfix) with ESMTPSA id ED19D37820D3; Wed, 21 Feb 2024 21:06:32 +0000 (UTC) From: Adrian Ratiu <adrian.ratiu@collabora.com> To: linux-security-module@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: linux-kernel@vger.kernel.org, kernel@collabora.com, Adrian Ratiu <adrian.ratiu@collabora.com>, Guenter Roeck <groeck@chromium.org>, Doug Anderson <dianders@chromium.org>, Mike Frysinger <vapier@chromium.org> Subject: [PATCH] proc: allow restricting /proc/pid/mem writes Date: Wed, 21 Feb 2024 23:06:26 +0200 Message-ID: <20240221210626.155534-1-adrian.ratiu@collabora.com> X-Mailer: git-send-email 2.43.0 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: <linux-kernel.vger.kernel.org> List-Subscribe: <mailto:linux-kernel+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:linux-kernel+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1791544138068785581 X-GMAIL-MSGID: 1791544138068785581 |
Series |
proc: allow restricting /proc/pid/mem writes
|
|
Commit Message
Adrian Ratiu
Feb. 21, 2024, 9:06 p.m. UTC
Prior to v2.6.39 write access to /proc/<pid>/mem was restricted, after which it got allowed in commit 198214a7ee50 ("proc: enable writing to /proc/pid/mem"). Famous last words from that patch: "no longer a security hazard". :) Afterwards exploits appeared started causing drama like [1]. The /proc/*/mem exploits can be rather sophisticated like [2] which installed an arbitrary payload from noexec storage into a running process then exec'd it, which itself could include an ELF loader to run arbitrary code off noexec storage. As part of hardening against these types of attacks, distrbutions can restrict /proc/*/mem to only allow writes when they makes sense, like in case of debuggers which have ptrace permissions, as they are able to access memory anyway via PTRACE_POKEDATA and friends. Dropping the mode bits disables write access for non-root users. Trying to `chmod` the paths back fails as the kernel rejects it. For users with CAP_DAC_OVERRIDE (usually just root) we have to disable the mem_write callback to avoid bypassing the mode bits. Writes can be used to bypass permissions on memory maps, even if a memory region is mapped r-x (as is a program's executable pages), the process can open its own /proc/self/mem file and write to the pages directly. Even if seccomp filters block mmap/mprotect calls with W|X perms, they often cannot block open calls as daemons want to read/write their own runtime state and seccomp filters cannot check file paths. Write calls also can't be blocked in general via seccomp. Since the mem file is part of the dynamic /proc/<pid>/ space, we can't run chmod once at boot to restrict it (and trying to react to every process and run chmod doesn't scale, and the kernel no longer allows chmod on any of these paths). SELinux could be used with a rule to cover all /proc/*/mem files, but even then having multiple ways to deny an attack is useful in case on layer fails. [1] https://lwn.net/Articles/476947/ [2] https://issues.chromium.org/issues/40089045 Based on an initial patch by Mike Frysinger <vapier@chromium.org>. Cc: Guenter Roeck <groeck@chromium.org> Cc: Doug Anderson <dianders@chromium.org> Signed-off-by: Mike Frysinger <vapier@chromium.org> Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> --- Tested on next-20240220. I would really like to avoid depending on CONFIG_MEMCG which is required for the struct mm_stryct "owner" pointer. Any suggestions how check the ptrace owner without MEMCG? --- fs/proc/base.c | 26 ++++++++++++++++++++++++-- security/Kconfig | 13 +++++++++++++ 2 files changed, 37 insertions(+), 2 deletions(-)
Comments
Hi, On Wed, Feb 21, 2024 at 1:06 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > Prior to v2.6.39 write access to /proc/<pid>/mem was restricted, > after which it got allowed in commit 198214a7ee50 ("proc: enable > writing to /proc/pid/mem"). Famous last words from that patch: > "no longer a security hazard". :) > > Afterwards exploits appeared started causing drama like [1]. The > /proc/*/mem exploits can be rather sophisticated like [2] which > installed an arbitrary payload from noexec storage into a running > process then exec'd it, which itself could include an ELF loader > to run arbitrary code off noexec storage. > > As part of hardening against these types of attacks, distrbutions > can restrict /proc/*/mem to only allow writes when they makes sense, > like in case of debuggers which have ptrace permissions, as they > are able to access memory anyway via PTRACE_POKEDATA and friends. > > Dropping the mode bits disables write access for non-root users. > Trying to `chmod` the paths back fails as the kernel rejects it. > > For users with CAP_DAC_OVERRIDE (usually just root) we have to > disable the mem_write callback to avoid bypassing the mode bits. > > Writes can be used to bypass permissions on memory maps, even if a > memory region is mapped r-x (as is a program's executable pages), > the process can open its own /proc/self/mem file and write to the > pages directly. > > Even if seccomp filters block mmap/mprotect calls with W|X perms, > they often cannot block open calls as daemons want to read/write > their own runtime state and seccomp filters cannot check file paths. > Write calls also can't be blocked in general via seccomp. > > Since the mem file is part of the dynamic /proc/<pid>/ space, we > can't run chmod once at boot to restrict it (and trying to react > to every process and run chmod doesn't scale, and the kernel no > longer allows chmod on any of these paths). > > SELinux could be used with a rule to cover all /proc/*/mem files, > but even then having multiple ways to deny an attack is useful in > case on layer fails. > > [1] https://lwn.net/Articles/476947/ > [2] https://issues.chromium.org/issues/40089045 > > Based on an initial patch by Mike Frysinger <vapier@chromium.org>. > > Cc: Guenter Roeck <groeck@chromium.org> > Cc: Doug Anderson <dianders@chromium.org> > Signed-off-by: Mike Frysinger <vapier@chromium.org> > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> > --- > Tested on next-20240220. > > I would really like to avoid depending on CONFIG_MEMCG which is > required for the struct mm_stryct "owner" pointer. > > Any suggestions how check the ptrace owner without MEMCG? > --- > fs/proc/base.c | 26 ++++++++++++++++++++++++-- > security/Kconfig | 13 +++++++++++++ > 2 files changed, 37 insertions(+), 2 deletions(-) Thanks for posting this! This looks reasonable to me, but I'm nowhere near an expert on this so I won't add a Reviewed-by tag. This feels like the kind of thing that Kees might be interested in reviewing, so adding him to the "To" list.
On Mon, Feb 26, 2024 at 09:10:54AM -0800, Doug Anderson wrote: > Hi, > > On Wed, Feb 21, 2024 at 1:06 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > > > Prior to v2.6.39 write access to /proc/<pid>/mem was restricted, > > after which it got allowed in commit 198214a7ee50 ("proc: enable > > writing to /proc/pid/mem"). Famous last words from that patch: > > "no longer a security hazard". :) > > > > Afterwards exploits appeared started causing drama like [1]. The > > /proc/*/mem exploits can be rather sophisticated like [2] which > > installed an arbitrary payload from noexec storage into a running > > process then exec'd it, which itself could include an ELF loader > > to run arbitrary code off noexec storage. > > > > As part of hardening against these types of attacks, distrbutions > > can restrict /proc/*/mem to only allow writes when they makes sense, > > like in case of debuggers which have ptrace permissions, as they > > are able to access memory anyway via PTRACE_POKEDATA and friends. > > > > Dropping the mode bits disables write access for non-root users. > > Trying to `chmod` the paths back fails as the kernel rejects it. > > > > For users with CAP_DAC_OVERRIDE (usually just root) we have to > > disable the mem_write callback to avoid bypassing the mode bits. > > > > Writes can be used to bypass permissions on memory maps, even if a > > memory region is mapped r-x (as is a program's executable pages), > > the process can open its own /proc/self/mem file and write to the > > pages directly. > > > > Even if seccomp filters block mmap/mprotect calls with W|X perms, > > they often cannot block open calls as daemons want to read/write > > their own runtime state and seccomp filters cannot check file paths. > > Write calls also can't be blocked in general via seccomp. > > > > Since the mem file is part of the dynamic /proc/<pid>/ space, we > > can't run chmod once at boot to restrict it (and trying to react > > to every process and run chmod doesn't scale, and the kernel no > > longer allows chmod on any of these paths). > > > > SELinux could be used with a rule to cover all /proc/*/mem files, > > but even then having multiple ways to deny an attack is useful in > > case on layer fails. > > > > [1] https://lwn.net/Articles/476947/ > > [2] https://issues.chromium.org/issues/40089045 > > > > Based on an initial patch by Mike Frysinger <vapier@chromium.org>. > > > > Cc: Guenter Roeck <groeck@chromium.org> > > Cc: Doug Anderson <dianders@chromium.org> > > Signed-off-by: Mike Frysinger <vapier@chromium.org> This should have a "Co-developed-by: Mike..." tag, since you're making changes and not just passing it along directly. > > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> > > --- > > Tested on next-20240220. > > > > I would really like to avoid depending on CONFIG_MEMCG which is > > required for the struct mm_stryct "owner" pointer. > > > > Any suggestions how check the ptrace owner without MEMCG? > > --- > > fs/proc/base.c | 26 ++++++++++++++++++++++++-- > > security/Kconfig | 13 +++++++++++++ > > 2 files changed, 37 insertions(+), 2 deletions(-) > > Thanks for posting this! This looks reasonable to me, but I'm nowhere > near an expert on this so I won't add a Reviewed-by tag. > > This feels like the kind of thing that Kees might be interested in > reviewing, so adding him to the "To" list. I'd love to make /proc/$pid/mem more strict. A few comments: > [...] > + if (ptracer_capable(current, mm->user_ns) && It really looks like you're trying to do a form of ptrace_may_access(), but _without_ the introspection exception? Also, using "current" in the write path can lead to problems[1], so this should somehow use file->f_cred, or limit write access during the open() instead? > [...] > +config SECURITY_PROC_MEM_RESTRICT_WRITES Instead of a build-time CONFIG, I'd prefer a boot-time config (or a sysctl, but that's be harder given the perms). That this is selectable by distro users, etc, and they don't need to rebuild their kernel to benefit from it. Jann Horn has tried to restrict access to this file in the past as well, so he may have some additional advice about it. -Kees [1] https://docs.kernel.org/security/credentials.html#open-file-credentials
[sorry for the duplicate, fixing Jann's email address] On Mon, Feb 26, 2024 at 09:10:54AM -0800, Doug Anderson wrote: > Hi, > > On Wed, Feb 21, 2024 at 1:06 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > > > Prior to v2.6.39 write access to /proc/<pid>/mem was restricted, > > after which it got allowed in commit 198214a7ee50 ("proc: enable > > writing to /proc/pid/mem"). Famous last words from that patch: > > "no longer a security hazard". :) > > > > Afterwards exploits appeared started causing drama like [1]. The > > /proc/*/mem exploits can be rather sophisticated like [2] which > > installed an arbitrary payload from noexec storage into a running > > process then exec'd it, which itself could include an ELF loader > > to run arbitrary code off noexec storage. > > > > As part of hardening against these types of attacks, distrbutions > > can restrict /proc/*/mem to only allow writes when they makes sense, > > like in case of debuggers which have ptrace permissions, as they > > are able to access memory anyway via PTRACE_POKEDATA and friends. > > > > Dropping the mode bits disables write access for non-root users. > > Trying to `chmod` the paths back fails as the kernel rejects it. > > > > For users with CAP_DAC_OVERRIDE (usually just root) we have to > > disable the mem_write callback to avoid bypassing the mode bits. > > > > Writes can be used to bypass permissions on memory maps, even if a > > memory region is mapped r-x (as is a program's executable pages), > > the process can open its own /proc/self/mem file and write to the > > pages directly. > > > > Even if seccomp filters block mmap/mprotect calls with W|X perms, > > they often cannot block open calls as daemons want to read/write > > their own runtime state and seccomp filters cannot check file paths. > > Write calls also can't be blocked in general via seccomp. > > > > Since the mem file is part of the dynamic /proc/<pid>/ space, we > > can't run chmod once at boot to restrict it (and trying to react > > to every process and run chmod doesn't scale, and the kernel no > > longer allows chmod on any of these paths). > > > > SELinux could be used with a rule to cover all /proc/*/mem files, > > but even then having multiple ways to deny an attack is useful in > > case on layer fails. > > > > [1] https://lwn.net/Articles/476947/ > > [2] https://issues.chromium.org/issues/40089045 > > > > Based on an initial patch by Mike Frysinger <vapier@chromium.org>. > > > > Cc: Guenter Roeck <groeck@chromium.org> > > Cc: Doug Anderson <dianders@chromium.org> > > Signed-off-by: Mike Frysinger <vapier@chromium.org> This should have a "Co-developed-by: Mike..." tag, since you're making changes and not just passing it along directly. > > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> > > --- > > Tested on next-20240220. > > > > I would really like to avoid depending on CONFIG_MEMCG which is > > required for the struct mm_stryct "owner" pointer. > > > > Any suggestions how check the ptrace owner without MEMCG? > > --- > > fs/proc/base.c | 26 ++++++++++++++++++++++++-- > > security/Kconfig | 13 +++++++++++++ > > 2 files changed, 37 insertions(+), 2 deletions(-) > > Thanks for posting this! This looks reasonable to me, but I'm nowhere > near an expert on this so I won't add a Reviewed-by tag. > > This feels like the kind of thing that Kees might be interested in > reviewing, so adding him to the "To" list. I'd love to make /proc/$pid/mem more strict. A few comments: > [...] > + if (ptracer_capable(current, mm->user_ns) && It really looks like you're trying to do a form of ptrace_may_access(), but _without_ the introspection exception? Also, using "current" in the write path can lead to problems[1], so this should somehow use file->f_cred, or limit write access during the open() instead? > [...] > +config SECURITY_PROC_MEM_RESTRICT_WRITES Instead of a build-time CONFIG, I'd prefer a boot-time config (or a sysctl, but that's be harder given the perms). That this is selectable by distro users, etc, and they don't need to rebuild their kernel to benefit from it. Jann Horn has tried to restrict access to this file in the past as well, so he may have some additional advice about it. -Kees [1] https://docs.kernel.org/security/credentials.html#open-file-credentials
(lemme try this again as plain text) On Mon, Feb 26, 2024 at 2:24 PM Kees Cook <keescook@chromium.org> wrote: > On Mon, Feb 26, 2024 at 09:10:54AM -0800, Doug Anderson wrote: > > On Wed, Feb 21, 2024 at 1:06 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > + if (ptracer_capable(current, mm->user_ns) && > > It really looks like you're trying to do a form of ptrace_may_access(), > but _without_ the introspection exception? to be clear, we want the check to be "ptracer is attached, and the process attempting the write is the ptracer", not "does the writer pass ptrace access checks". the latter opens up more angles, including shellcode self-modification, that we don't want. the only use case we have for writable mem files is for debuggers, and those should already be attached. -mike
Hello On Monday, February 26, 2024 21:24 EET, Kees Cook <keescook@chromium.org> wrote: > [sorry for the duplicate, fixing Jann's email address] > > On Mon, Feb 26, 2024 at 09:10:54AM -0800, Doug Anderson wrote: > > Hi, > > > > On Wed, Feb 21, 2024 at 1:06 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > > > > > Prior to v2.6.39 write access to /proc/<pid>/mem was restricted, > > > after which it got allowed in commit 198214a7ee50 ("proc: enable > > > writing to /proc/pid/mem"). Famous last words from that patch: > > > "no longer a security hazard". :) > > > > > > Afterwards exploits appeared started causing drama like [1]. The > > > /proc/*/mem exploits can be rather sophisticated like [2] which > > > installed an arbitrary payload from noexec storage into a running > > > process then exec'd it, which itself could include an ELF loader > > > to run arbitrary code off noexec storage. > > > > > > As part of hardening against these types of attacks, distrbutions > > > can restrict /proc/*/mem to only allow writes when they makes sense, > > > like in case of debuggers which have ptrace permissions, as they > > > are able to access memory anyway via PTRACE_POKEDATA and friends. > > > > > > Dropping the mode bits disables write access for non-root users. > > > Trying to `chmod` the paths back fails as the kernel rejects it. > > > > > > For users with CAP_DAC_OVERRIDE (usually just root) we have to > > > disable the mem_write callback to avoid bypassing the mode bits. > > > > > > Writes can be used to bypass permissions on memory maps, even if a > > > memory region is mapped r-x (as is a program's executable pages), > > > the process can open its own /proc/self/mem file and write to the > > > pages directly. > > > > > > Even if seccomp filters block mmap/mprotect calls with W|X perms, > > > they often cannot block open calls as daemons want to read/write > > > their own runtime state and seccomp filters cannot check file paths. > > > Write calls also can't be blocked in general via seccomp. > > > > > > Since the mem file is part of the dynamic /proc/<pid>/ space, we > > > can't run chmod once at boot to restrict it (and trying to react > > > to every process and run chmod doesn't scale, and the kernel no > > > longer allows chmod on any of these paths). > > > > > > SELinux could be used with a rule to cover all /proc/*/mem files, > > > but even then having multiple ways to deny an attack is useful in > > > case on layer fails. > > > > > > [1] https://lwn.net/Articles/476947/ > > > [2] https://issues.chromium.org/issues/40089045 > > > > > > Based on an initial patch by Mike Frysinger <vapier@chromium.org>. > > > > > > Cc: Guenter Roeck <groeck@chromium.org> > > > Cc: Doug Anderson <dianders@chromium.org> > > > Signed-off-by: Mike Frysinger <vapier@chromium.org> > > This should have a "Co-developed-by: Mike..." tag, since you're making > changes and not just passing it along directly. Thanks, I'll address this in v2. > > > > Signed-off-by: Adrian Ratiu <adrian.ratiu@collabora.com> > > > --- > > > Tested on next-20240220. > > > > > > I would really like to avoid depending on CONFIG_MEMCG which is > > > required for the struct mm_stryct "owner" pointer. > > > > > > Any suggestions how check the ptrace owner without MEMCG? > > > --- > > > fs/proc/base.c | 26 ++++++++++++++++++++++++-- > > > security/Kconfig | 13 +++++++++++++ > > > 2 files changed, 37 insertions(+), 2 deletions(-) > > > > Thanks for posting this! This looks reasonable to me, but I'm nowhere > > near an expert on this so I won't add a Reviewed-by tag. > > > > This feels like the kind of thing that Kees might be interested in > > reviewing, so adding him to the "To" list. > > I'd love to make /proc/$pid/mem more strict. A few comments: > > > [...] > > + if (ptracer_capable(current, mm->user_ns) && > > It really looks like you're trying to do a form of ptrace_may_access(), > but _without_ the introspection exception? > > Also, using "current" in the write path can lead to problems[1], so this > should somehow use file->f_cred, or limit write access during the open() > instead? I think Mike explained pretty well why we need to check if current already is a ptracer. The point you raise is valid (thanks again) so we need to check a bit earlier, during open(). > > > [...] > > +config SECURITY_PROC_MEM_RESTRICT_WRITES > > Instead of a build-time CONFIG, I'd prefer a boot-time config (or a > sysctl, but that's be harder given the perms). That this is selectable > by distro users, etc, and they don't need to rebuild their kernel to > benefit from it. Ack, I'll implement a cmdline arg in v2. > > Jann Horn has tried to restrict access to this file in the past as well, > so he may have some additional advice about it. I'll leave this a few more days in case others have more ideas, then will send v2 and also add Jann to the "To:" list. > > -Kees > > [1] https://docs.kernel.org/security/credentials.html#open-file-credentials > > -- > Kees Cook
Hi, On Mon, Feb 26, 2024 at 2:33 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > > > [...] > > > +config SECURITY_PROC_MEM_RESTRICT_WRITES > > > > Instead of a build-time CONFIG, I'd prefer a boot-time config (or a > > sysctl, but that's be harder given the perms). That this is selectable > > by distro users, etc, and they don't need to rebuild their kernel to > > benefit from it. > > Ack, I'll implement a cmdline arg in v2. Any objections to doing both? Have a CONFIG option for a default and a cmdline to override it? This way if a distro wants to restrict writes by default then don't need to jam more stuff into the kernel command line. -Doug
On Mon, Feb 26, 2024 at 02:37:29PM -0800, Doug Anderson wrote: > Hi, > > On Mon, Feb 26, 2024 at 2:33 PM Adrian Ratiu <adrian.ratiu@collabora.com> wrote: > > > > > > [...] > > > > +config SECURITY_PROC_MEM_RESTRICT_WRITES > > > > > > Instead of a build-time CONFIG, I'd prefer a boot-time config (or a > > > sysctl, but that's be harder given the perms). That this is selectable > > > by distro users, etc, and they don't need to rebuild their kernel to > > > benefit from it. > > > > Ack, I'll implement a cmdline arg in v2. > > Any objections to doing both? Have a CONFIG option for a default and a > cmdline to override it? This way if a distro wants to restrict writes > by default then don't need to jam more stuff into the kernel command > line. For an example, take a look at randomize_kstack_offset and CONFIG_RANDOMIZE_KSTACK_OFFSET_DEFAULT.
diff --git a/fs/proc/base.c b/fs/proc/base.c index 98a031ac2648..e4d6829c5d1a 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -152,6 +152,12 @@ struct pid_entry { NULL, &proc_pid_attr_operations, \ { .lsmid = LSMID }) +#ifdef CONFIG_SECURITY_PROC_MEM_RESTRICT_WRITES +# define PROC_PID_MEM_MODE S_IRUSR +#else +# define PROC_PID_MEM_MODE (S_IRUSR|S_IWUSR) +#endif + /* * Count the number of hardlinks for the pid_entry table, excluding the . * and .. links. @@ -899,7 +905,24 @@ static ssize_t mem_read(struct file *file, char __user *buf, static ssize_t mem_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { +#ifdef CONFIG_SECURITY_PROC_MEM_RESTRICT_WRITES + /* Allow processes already ptracing the target process */ +#ifdef CONFIG_MEMCG + struct mm_struct *mm = file->private_data; + + rcu_read_lock(); + if (ptracer_capable(current, mm->user_ns) && + current == ptrace_parent(mm->owner)) { + rcu_read_unlock(); + return mem_rw(file, (char __user *)buf, count, ppos, 1); + } + rcu_read_unlock(); +#endif + + return -EACCES; +#else return mem_rw(file, (char __user*)buf, count, ppos, 1); +#endif } loff_t mem_lseek(struct file *file, loff_t offset, int orig) @@ -3281,7 +3303,7 @@ static const struct pid_entry tgid_base_stuff[] = { #ifdef CONFIG_NUMA REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations), #endif - REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations), + REG("mem", PROC_PID_MEM_MODE, proc_mem_operations), LNK("cwd", proc_cwd_link), LNK("root", proc_root_link), LNK("exe", proc_exe_link), @@ -3631,7 +3653,7 @@ static const struct pid_entry tid_base_stuff[] = { #ifdef CONFIG_NUMA REG("numa_maps", S_IRUGO, proc_pid_numa_maps_operations), #endif - REG("mem", S_IRUSR|S_IWUSR, proc_mem_operations), + REG("mem", PROC_PID_MEM_MODE, proc_mem_operations), LNK("cwd", proc_cwd_link), LNK("root", proc_root_link), LNK("exe", proc_exe_link), diff --git a/security/Kconfig b/security/Kconfig index 412e76f1575d..4082a07a33e5 100644 --- a/security/Kconfig +++ b/security/Kconfig @@ -19,6 +19,19 @@ config SECURITY_DMESG_RESTRICT If you are unsure how to answer this question, answer N. +config SECURITY_PROC_MEM_RESTRICT_WRITES + bool "Restrict /proc/<pid>/mem write access" + default n + help + This restricts writes to /proc/<pid>/mem, except when the current + process ptraces the /proc/<pid>/mem task, because a ptracer already + has write access to the tracee memory. + + Write access to this file allows bypassing memory map permissions, + such as modifying read-only code. + + If you are unsure how to answer this question, answer N. + config SECURITY bool "Enable different security models" depends on SYSFS