Message ID | 20230517150321.2890206-4-revest@chromium.org |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp1212800vqo; Wed, 17 May 2023 08:14:57 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ65nnTJ+OiVy+cOoNT5HoZe0eDuiqKMrNEp3DQerRvRpAfHjuOVyTRcveY9NEIXplUTXFOq X-Received: by 2002:a05:6a00:1808:b0:64a:6cad:d840 with SMTP id y8-20020a056a00180800b0064a6cadd840mr1566125pfa.25.1684336497045; Wed, 17 May 2023 08:14:57 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1684336497; cv=none; d=google.com; s=arc-20160816; b=R1qohyuvvWUQl9AD2cxEm66AWKLZx9jeSv0ob/GAsFjl9/aqUmCGrnzoMQobx8w0um P+wTfygpZa1Pf+f1c4zSsLReiQ+yguRlK+39uOQkzK+tWW1Rys8LhMyJJARH0Sr4It83 1M/BCYGKyjON4vZNpgtSLBbFVUrHTUfvv+LvWIEV5/0dukWhFsS8hx7GtuYCufP8VCtM P19UekiuJ3cp/+FjyQ/sqj3hx/rU3vgX5OiA9jshdLqxN2srKgSD5JmLrE565akOn+gP lkI3FcPdMx9pZAt/H2NgRAzdv1RsuzNckIvV/NsrGp7L3IBfDlqwLNZxXoVLaJapgeAQ 9MqQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=oipxQEZj91stE4t+lxud7GTslgXbD/udBSL61+uH1L8=; b=COBxNnlJFLqra/V2A18dMNgkJ+cVrXP2KtfEXfseQBX6bffLKyiqq01ZvqA0weKwXJ H49EBN3wrSD3ZrNqhIAIZvDLMqJFzCSgI2Di0ZNQZjlUeyeb8FpyMPKjSgAJoB9sjHfy elSI/GgLzXUBqJvjS1Bj0YJajE/gng+LWfxP6cNZcsX+oylkJaRNywrdEiSp8lZutiiA zaOKl+l9CimZ3NjKNFVoC1+j6WT44Mw3YNq8qqBJpv4UUbNiCl8lIY2ZrLgdzG1b3JVn n178gRfSpejgjkzxs7+gARqjSV7rZM1KMD3aKZ6gQcgz+FxYVPDN+7zu+wlQw4TJSygY Lrpg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=nYRCTEvO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y187-20020a638ac4000000b00524f3526a91si18065348pgd.16.2023.05.17.08.14.43; Wed, 17 May 2023 08:14:57 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@chromium.org header.s=google header.b=nYRCTEvO; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=chromium.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231675AbjEQPFS (ORCPT <rfc822;abdi.embedded@gmail.com> + 99 others); Wed, 17 May 2023 11:05:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59810 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231152AbjEQPEj (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Wed, 17 May 2023 11:04:39 -0400 Received: from mail-wr1-x42b.google.com (mail-wr1-x42b.google.com [IPv6:2a00:1450:4864:20::42b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC9887EE7 for <linux-kernel@vger.kernel.org>; Wed, 17 May 2023 08:04:20 -0700 (PDT) Received: by mail-wr1-x42b.google.com with SMTP id ffacd0b85a97d-30644c18072so607480f8f.2 for <linux-kernel@vger.kernel.org>; Wed, 17 May 2023 08:04:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; t=1684335831; x=1686927831; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=oipxQEZj91stE4t+lxud7GTslgXbD/udBSL61+uH1L8=; b=nYRCTEvOWhb0FzekCMnJ6+34uErlW9tdW8D7eUgNnDXRCvoUVAdioTe4bDYStyLbtZ +wOkz4pBqp/FKbfovlnRGHcbww4zHsdW1pobdsCTi7FymG/8pUeZAv4ix9bN6fTKR5K0 yeoouszxWZdws1CxpHmjxNcnk9WQ5iIHC5Qms= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1684335831; x=1686927831; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=oipxQEZj91stE4t+lxud7GTslgXbD/udBSL61+uH1L8=; b=Gx9noDoJncCDjnI2jPkT310rwAwaX0iUYQdKvP+lHzIZnptNn+lmq3G5TGdML5znoK kLkUFsGFnkCRGBB4ps6yIHXbW7573AjqVEvvv+pZqD3KVouWQ2Wp9AH8Rz1stTwurEjO SmPXagSA1YxDFXf7xBVxvbrBfBEKoPE2mkWuW+Okp8VxaEgF45x94d3q4RkACBMyYpHY L8ml1DzKEC45N85mjQJCQeaPP9dv0Qgj1njqh77Llv2PwDMwLCD17CzvQ/qVUvEnXBpn wk7/WHTXG96SIqlAu97x2Xp4Sjpyn6xRHZSn9EXAi2ft0nv6rPVgTJyJDNT0KGznHyb4 3pIg== X-Gm-Message-State: AC+VfDwyqt5I4Ni9s6X4GBSbBfMrAtywlz20xmfBKfhChLj7XT0izKZR DZzYxsPdKyzUI5eZ54mIy9VMePHe8nVHcD3tIH0= X-Received: by 2002:a5d:4cc9:0:b0:2fb:7099:6070 with SMTP id c9-20020a5d4cc9000000b002fb70996070mr866319wrt.47.1684335831152; Wed, 17 May 2023 08:03:51 -0700 (PDT) Received: from revest.zrh.corp.google.com ([2a00:79e0:9d:6:e223:a0c2:d2c:c371]) by smtp.gmail.com with ESMTPSA id e17-20020adffd11000000b003047ea78b42sm3038211wrr.43.2023.05.17.08.03.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 17 May 2023 08:03:50 -0700 (PDT) From: Florent Revest <revest@chromium.org> To: linux-kernel@vger.kernel.org, linux-mm@kvack.org Cc: akpm@linux-foundation.org, catalin.marinas@arm.com, anshuman.khandual@arm.com, joey.gouly@arm.com, mhocko@suse.com, keescook@chromium.org, david@redhat.com, peterx@redhat.com, izbyshev@ispras.ru, broonie@kernel.org, szabolcs.nagy@arm.com, kpsingh@kernel.org, gthelen@google.com, toiwoton@gmail.com, Florent Revest <revest@chromium.org> Subject: [PATCH v2 3/5] mm: Make PR_MDWE_REFUSE_EXEC_GAIN an unsigned long Date: Wed, 17 May 2023 17:03:19 +0200 Message-ID: <20230517150321.2890206-4-revest@chromium.org> X-Mailer: git-send-email 2.40.1.606.ga4b1b128d6-goog In-Reply-To: <20230517150321.2890206-1-revest@chromium.org> References: <20230517150321.2890206-1-revest@chromium.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_PASS,T_SCC_BODY_TEXT_LINE,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1766154826321129656?= X-GMAIL-MSGID: =?utf-8?q?1766154826321129656?= |
Series |
MDWE without inheritance
|
|
Commit Message
Florent Revest
May 17, 2023, 3:03 p.m. UTC
Alexey pointed out that defining a prctl flag as an int is a footgun
because, under some circumstances, when used as a flag to prctl, it can
be casted to long with garbage upper bits which would result in
unexpected behaviors.
This patch changes the constant to a UL to eliminate these
possibilities.
Signed-off-by: Florent Revest <revest@chromium.org>
Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru>
---
include/uapi/linux/prctl.h | 2 +-
tools/include/uapi/linux/prctl.h | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
Comments
On 17.05.23 17:03, Florent Revest wrote: > Alexey pointed out that defining a prctl flag as an int is a footgun > because, under some circumstances, when used as a flag to prctl, it can > be casted to long with garbage upper bits which would result in > unexpected behaviors. > > This patch changes the constant to a UL to eliminate these > possibilities. > > Signed-off-by: Florent Revest <revest@chromium.org> > Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> > --- > include/uapi/linux/prctl.h | 2 +- > tools/include/uapi/linux/prctl.h | 2 +- > 2 files changed, 2 insertions(+), 2 deletions(-) > > diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h > index f23d9a16507f..6e9af6cbc950 100644 > --- a/include/uapi/linux/prctl.h > +++ b/include/uapi/linux/prctl.h > @@ -283,7 +283,7 @@ struct prctl_mm_map { > > /* Memory deny write / execute */ > #define PR_SET_MDWE 65 > -# define PR_MDWE_REFUSE_EXEC_GAIN 1 > +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) > > #define PR_GET_MDWE 66 > > diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h > index 759b3f53e53f..6e6563e97fef 100644 > --- a/tools/include/uapi/linux/prctl.h > +++ b/tools/include/uapi/linux/prctl.h > @@ -283,7 +283,7 @@ struct prctl_mm_map { > > /* Memory deny write / execute */ > #define PR_SET_MDWE 65 > -# define PR_MDWE_REFUSE_EXEC_GAIN 1 > +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) > > #define PR_GET_MDWE 66 > Both are changing existing uapi, so you'll already have existing user space using the old values, that your kernel code has to deal with no?
On 2023-05-22 11:55, David Hildenbrand wrote: > On 17.05.23 17:03, Florent Revest wrote: >> Alexey pointed out that defining a prctl flag as an int is a footgun >> because, under some circumstances, when used as a flag to prctl, it >> can >> be casted to long with garbage upper bits which would result in >> unexpected behaviors. >> >> This patch changes the constant to a UL to eliminate these >> possibilities. >> >> Signed-off-by: Florent Revest <revest@chromium.org> >> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> >> --- >> include/uapi/linux/prctl.h | 2 +- >> tools/include/uapi/linux/prctl.h | 2 +- >> 2 files changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h >> index f23d9a16507f..6e9af6cbc950 100644 >> --- a/include/uapi/linux/prctl.h >> +++ b/include/uapi/linux/prctl.h >> @@ -283,7 +283,7 @@ struct prctl_mm_map { >> /* Memory deny write / execute */ >> #define PR_SET_MDWE 65 >> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >> #define PR_GET_MDWE 66 >> diff --git a/tools/include/uapi/linux/prctl.h >> b/tools/include/uapi/linux/prctl.h >> index 759b3f53e53f..6e6563e97fef 100644 >> --- a/tools/include/uapi/linux/prctl.h >> +++ b/tools/include/uapi/linux/prctl.h >> @@ -283,7 +283,7 @@ struct prctl_mm_map { >> /* Memory deny write / execute */ >> #define PR_SET_MDWE 65 >> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >> #define PR_GET_MDWE 66 >> > > Both are changing existing uapi, so you'll already have existing user > space using the old values, that your kernel code has to deal with no? I'm the one who suggested this change, so I feel the need to clarify. For any existing 64-bit user space code using the kernel and the uapi headers before this patch and doing the wrong prctl(PR_SET_MDWE, PR_MDWE_REFUSE_EXEC_GAIN) call instead of the correct prctl(PR_SET_MDWE, (unsigned long)PR_MDWE_REFUSE_EXEC_GAIN), there are two possibilities when prctl() implementation extracts the second argument via va_arg(op, unsigned long): * It gets lucky, and the upper 32 bits of the argument are zero. The call does what is expected by the user. * The upper 32 bits are non-zero junk. The flags argument is rejected by the kernel, and the call fails with EINVAL (unexpectedly for the user). This change is intended to affect only the second case, and only after the program is recompiled with the new uapi headers. The currently wrong, but naturally-looking prctl(PR_SET_MDWE, PR_MDWE_REFUSE_EXEC_GAIN) call becomes correct. The kernel ABI is unaffected by this change, since it has been defined in terms of unsigned long from the start. Thanks, Alexey
On 22.05.23 12:35, Alexey Izbyshev wrote: > On 2023-05-22 11:55, David Hildenbrand wrote: >> On 17.05.23 17:03, Florent Revest wrote: >>> Alexey pointed out that defining a prctl flag as an int is a footgun >>> because, under some circumstances, when used as a flag to prctl, it >>> can >>> be casted to long with garbage upper bits which would result in >>> unexpected behaviors. >>> >>> This patch changes the constant to a UL to eliminate these >>> possibilities. >>> >>> Signed-off-by: Florent Revest <revest@chromium.org> >>> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> >>> --- >>> include/uapi/linux/prctl.h | 2 +- >>> tools/include/uapi/linux/prctl.h | 2 +- >>> 2 files changed, 2 insertions(+), 2 deletions(-) >>> >>> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h >>> index f23d9a16507f..6e9af6cbc950 100644 >>> --- a/include/uapi/linux/prctl.h >>> +++ b/include/uapi/linux/prctl.h >>> @@ -283,7 +283,7 @@ struct prctl_mm_map { >>> /* Memory deny write / execute */ >>> #define PR_SET_MDWE 65 >>> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >>> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >>> #define PR_GET_MDWE 66 >>> diff --git a/tools/include/uapi/linux/prctl.h >>> b/tools/include/uapi/linux/prctl.h >>> index 759b3f53e53f..6e6563e97fef 100644 >>> --- a/tools/include/uapi/linux/prctl.h >>> +++ b/tools/include/uapi/linux/prctl.h >>> @@ -283,7 +283,7 @@ struct prctl_mm_map { >>> /* Memory deny write / execute */ >>> #define PR_SET_MDWE 65 >>> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >>> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >>> #define PR_GET_MDWE 66 >>> >> >> Both are changing existing uapi, so you'll already have existing user >> space using the old values, that your kernel code has to deal with no? > > I'm the one who suggested this change, so I feel the need to clarify. > > For any existing 64-bit user space code using the kernel and the uapi > headers before this patch and doing the wrong prctl(PR_SET_MDWE, > PR_MDWE_REFUSE_EXEC_GAIN) call instead of the correct prctl(PR_SET_MDWE, > (unsigned long)PR_MDWE_REFUSE_EXEC_GAIN), there are two possibilities > when prctl() implementation extracts the second argument via va_arg(op, > unsigned long): > > * It gets lucky, and the upper 32 bits of the argument are zero. The > call does what is expected by the user. > > * The upper 32 bits are non-zero junk. The flags argument is rejected by > the kernel, and the call fails with EINVAL (unexpectedly for the user). > > This change is intended to affect only the second case, and only after > the program is recompiled with the new uapi headers. The currently > wrong, but naturally-looking prctl(PR_SET_MDWE, > PR_MDWE_REFUSE_EXEC_GAIN) call becomes correct. > > The kernel ABI is unaffected by this change, since it has been defined > in terms of unsigned long from the start. The thing I'm concerned about is the following: old user space (that would fail) on new kernel where we define some upper 32bit to actually have a meaning (where it would succeed with wrong semantics). IOW, can we ever really "use" these upper 32bit, or should we instead only consume the lower 32bit in the kernel and effectively ignore the upper 32bit? I guess the feature is not that old, so having many existing user space applications is unlikely. Which raises the question if we want to tag this here with a "Fixes" and eventually cc stable (hmm ...)?
On 2023-05-22 19:22, David Hildenbrand wrote: > On 22.05.23 12:35, Alexey Izbyshev wrote: >> On 2023-05-22 11:55, David Hildenbrand wrote: >>> On 17.05.23 17:03, Florent Revest wrote: >>>> Alexey pointed out that defining a prctl flag as an int is a footgun >>>> because, under some circumstances, when used as a flag to prctl, it >>>> can >>>> be casted to long with garbage upper bits which would result in >>>> unexpected behaviors. >>>> >>>> This patch changes the constant to a UL to eliminate these >>>> possibilities. >>>> >>>> Signed-off-by: Florent Revest <revest@chromium.org> >>>> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> >>>> --- >>>> include/uapi/linux/prctl.h | 2 +- >>>> tools/include/uapi/linux/prctl.h | 2 +- >>>> 2 files changed, 2 insertions(+), 2 deletions(-) >>>> >>>> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h >>>> index f23d9a16507f..6e9af6cbc950 100644 >>>> --- a/include/uapi/linux/prctl.h >>>> +++ b/include/uapi/linux/prctl.h >>>> @@ -283,7 +283,7 @@ struct prctl_mm_map { >>>> /* Memory deny write / execute */ >>>> #define PR_SET_MDWE 65 >>>> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >>>> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >>>> #define PR_GET_MDWE 66 >>>> diff --git a/tools/include/uapi/linux/prctl.h >>>> b/tools/include/uapi/linux/prctl.h >>>> index 759b3f53e53f..6e6563e97fef 100644 >>>> --- a/tools/include/uapi/linux/prctl.h >>>> +++ b/tools/include/uapi/linux/prctl.h >>>> @@ -283,7 +283,7 @@ struct prctl_mm_map { >>>> /* Memory deny write / execute */ >>>> #define PR_SET_MDWE 65 >>>> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >>>> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >>>> #define PR_GET_MDWE 66 >>>> >>> >>> Both are changing existing uapi, so you'll already have existing user >>> space using the old values, that your kernel code has to deal with >>> no? >> >> I'm the one who suggested this change, so I feel the need to clarify. >> >> For any existing 64-bit user space code using the kernel and the uapi >> headers before this patch and doing the wrong prctl(PR_SET_MDWE, >> PR_MDWE_REFUSE_EXEC_GAIN) call instead of the correct >> prctl(PR_SET_MDWE, >> (unsigned long)PR_MDWE_REFUSE_EXEC_GAIN), there are two possibilities >> when prctl() implementation extracts the second argument via >> va_arg(op, >> unsigned long): >> >> * It gets lucky, and the upper 32 bits of the argument are zero. The >> call does what is expected by the user. >> >> * The upper 32 bits are non-zero junk. The flags argument is rejected >> by >> the kernel, and the call fails with EINVAL (unexpectedly for the >> user). >> >> This change is intended to affect only the second case, and only after >> the program is recompiled with the new uapi headers. The currently >> wrong, but naturally-looking prctl(PR_SET_MDWE, >> PR_MDWE_REFUSE_EXEC_GAIN) call becomes correct. >> >> The kernel ABI is unaffected by this change, since it has been defined >> in terms of unsigned long from the start. > > The thing I'm concerned about is the following: old user space (that > would fail) on new kernel where we define some upper 32bit to actually > have a meaning (where it would succeed with wrong semantics). > > IOW, can we ever really "use" these upper 32bit, or should we instead > only consume the lower 32bit in the kernel and effectively ignore the > upper 32bit? > I see, thanks. But I think this question is mostly independent from this patch. The patch removes a footgun, but it doesn't change the flags check in the kernel, and nothing stops the user from doing int flags = PR_MDWE_REFUSE_EXEC_GAIN; prctl(PR_SET_MDWE, flags); So we have to decide whether to ignore the upper 32 bits or not even if this patch is not applied (actually *had to* when PR_SET_MDWE op was being added). Possible arguments for ignoring them: * Upper 32 bits can't be passed on 32-bit targets via the current prctl() interface, so a change that adds meaning to them would have to be both 64-bit-specific and unable to use another prctl() argument instead. That seems unlikely. * It's not hard to accidentally pass int to prctl() even after this patch, so making technically wrong user code work as intended could be a good thing. * A similar footgun exists for ILP32 ABIs (e.g. x32) on a lower level: while prctl(PR_SET_MDWE, 1) is fine there because long is 32-bit, the syscall interface is still 64-bit, so e.g. syscall(SYS_prctl, PR_SET_MDWE, -1) could, depending on syscall() implementation, sign-extend -1 to 64 bits and pass 64 set bits instead of 32 to the kernel. Possible arguments for checking them: * Code like "prctl(PR_SET_MDWE, 1)" is UB on 64-bit platforms. If the compiler notices that (e.g. if somebody ever manages to build a program and a libc together with LTO), it's allowed to make things much worse than just passing junk. Allowing the user to detect at least some of such calls now by checking for junk could be better. * I have the impression that the kernel security community prefers strict argument validation. * PR_SET_MDWE is a new op added in 6.3, so we don't have lots of legacy code that is known to pass junk in the upper 32 bits and must be kept working (i.e. failing) in the same way in a potential future kernel that assigns meaning to those bits. My preference would be to keep checking the upper 32 bits. Florent, what do you think? > I guess the feature is not that old, so having many existing user > space applications is unlikely. > > Which raises the question if we want to tag this here with a "Fixes" > and eventually cc stable (hmm ...)? Yes, IMO the faster we propagate this change, the better. Thanks, Alexey
On 22.05.23 20:58, Alexey Izbyshev wrote: > On 2023-05-22 19:22, David Hildenbrand wrote: >> On 22.05.23 12:35, Alexey Izbyshev wrote: >>> On 2023-05-22 11:55, David Hildenbrand wrote: >>>> On 17.05.23 17:03, Florent Revest wrote: >>>>> Alexey pointed out that defining a prctl flag as an int is a footgun >>>>> because, under some circumstances, when used as a flag to prctl, it >>>>> can >>>>> be casted to long with garbage upper bits which would result in >>>>> unexpected behaviors. >>>>> >>>>> This patch changes the constant to a UL to eliminate these >>>>> possibilities. >>>>> >>>>> Signed-off-by: Florent Revest <revest@chromium.org> >>>>> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> >>>>> --- >>>>> include/uapi/linux/prctl.h | 2 +- >>>>> tools/include/uapi/linux/prctl.h | 2 +- >>>>> 2 files changed, 2 insertions(+), 2 deletions(-) >>>>> >>>>> diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h >>>>> index f23d9a16507f..6e9af6cbc950 100644 >>>>> --- a/include/uapi/linux/prctl.h >>>>> +++ b/include/uapi/linux/prctl.h >>>>> @@ -283,7 +283,7 @@ struct prctl_mm_map { >>>>> /* Memory deny write / execute */ >>>>> #define PR_SET_MDWE 65 >>>>> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >>>>> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >>>>> #define PR_GET_MDWE 66 >>>>> diff --git a/tools/include/uapi/linux/prctl.h >>>>> b/tools/include/uapi/linux/prctl.h >>>>> index 759b3f53e53f..6e6563e97fef 100644 >>>>> --- a/tools/include/uapi/linux/prctl.h >>>>> +++ b/tools/include/uapi/linux/prctl.h >>>>> @@ -283,7 +283,7 @@ struct prctl_mm_map { >>>>> /* Memory deny write / execute */ >>>>> #define PR_SET_MDWE 65 >>>>> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >>>>> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >>>>> #define PR_GET_MDWE 66 >>>>> >>>> >>>> Both are changing existing uapi, so you'll already have existing user >>>> space using the old values, that your kernel code has to deal with >>>> no? >>> >>> I'm the one who suggested this change, so I feel the need to clarify. >>> >>> For any existing 64-bit user space code using the kernel and the uapi >>> headers before this patch and doing the wrong prctl(PR_SET_MDWE, >>> PR_MDWE_REFUSE_EXEC_GAIN) call instead of the correct >>> prctl(PR_SET_MDWE, >>> (unsigned long)PR_MDWE_REFUSE_EXEC_GAIN), there are two possibilities >>> when prctl() implementation extracts the second argument via >>> va_arg(op, >>> unsigned long): >>> >>> * It gets lucky, and the upper 32 bits of the argument are zero. The >>> call does what is expected by the user. >>> >>> * The upper 32 bits are non-zero junk. The flags argument is rejected >>> by >>> the kernel, and the call fails with EINVAL (unexpectedly for the >>> user). >>> >>> This change is intended to affect only the second case, and only after >>> the program is recompiled with the new uapi headers. The currently >>> wrong, but naturally-looking prctl(PR_SET_MDWE, >>> PR_MDWE_REFUSE_EXEC_GAIN) call becomes correct. >>> >>> The kernel ABI is unaffected by this change, since it has been defined >>> in terms of unsigned long from the start. >> >> The thing I'm concerned about is the following: old user space (that >> would fail) on new kernel where we define some upper 32bit to actually >> have a meaning (where it would succeed with wrong semantics). >> >> IOW, can we ever really "use" these upper 32bit, or should we instead >> only consume the lower 32bit in the kernel and effectively ignore the >> upper 32bit? >> > I see, thanks. But I think this question is mostly independent from this > patch. The patch removes a footgun, but it doesn't change the flags > check in the kernel, and nothing stops the user from doing > > int flags = PR_MDWE_REFUSE_EXEC_GAIN; > prctl(PR_SET_MDWE, flags); > > So we have to decide whether to ignore the upper 32 bits or not even if > this patch is not applied (actually *had to* when PR_SET_MDWE op was > being added). Well, an alternative to this patch would be to say "well, for this prctl we ignore any upper 32bit. Then, this change would not be needed. Yes, I also don't like that :) Bu unrelated, I looked at some other random prctl. #define SUID_DUMP_USER 1 And in kernel/sys.c: case PR_SET_DUMPABLE: if (arg2 != SUID_DUMP_DISABLE && arg2 != SUID_DUMP_USER) ... Wouldn't that also suffer from the same issue, or how is this different? Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We need arg2 -> arg5 to be 0. But wouldn't the following also just pass a 0 "int" ? prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0) I'm easily confused by such (va_args) things, so sorry for the dummy questions.
On 2023-05-23 12:12, David Hildenbrand wrote: > On 22.05.23 20:58, Alexey Izbyshev wrote: >> On 2023-05-22 19:22, David Hildenbrand wrote: >>> On 22.05.23 12:35, Alexey Izbyshev wrote: >>>> On 2023-05-22 11:55, David Hildenbrand wrote: >>>>> On 17.05.23 17:03, Florent Revest wrote: >>>>>> Alexey pointed out that defining a prctl flag as an int is a >>>>>> footgun >>>>>> because, under some circumstances, when used as a flag to prctl, >>>>>> it >>>>>> can >>>>>> be casted to long with garbage upper bits which would result in >>>>>> unexpected behaviors. >>>>>> >>>>>> This patch changes the constant to a UL to eliminate these >>>>>> possibilities. >>>>>> >>>>>> Signed-off-by: Florent Revest <revest@chromium.org> >>>>>> Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> >>>>>> --- >>>>>> include/uapi/linux/prctl.h | 2 +- >>>>>> tools/include/uapi/linux/prctl.h | 2 +- >>>>>> 2 files changed, 2 insertions(+), 2 deletions(-) >>>>>> >>>>>> diff --git a/include/uapi/linux/prctl.h >>>>>> b/include/uapi/linux/prctl.h >>>>>> index f23d9a16507f..6e9af6cbc950 100644 >>>>>> --- a/include/uapi/linux/prctl.h >>>>>> +++ b/include/uapi/linux/prctl.h >>>>>> @@ -283,7 +283,7 @@ struct prctl_mm_map { >>>>>> /* Memory deny write / execute */ >>>>>> #define PR_SET_MDWE 65 >>>>>> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >>>>>> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >>>>>> #define PR_GET_MDWE 66 >>>>>> diff --git a/tools/include/uapi/linux/prctl.h >>>>>> b/tools/include/uapi/linux/prctl.h >>>>>> index 759b3f53e53f..6e6563e97fef 100644 >>>>>> --- a/tools/include/uapi/linux/prctl.h >>>>>> +++ b/tools/include/uapi/linux/prctl.h >>>>>> @@ -283,7 +283,7 @@ struct prctl_mm_map { >>>>>> /* Memory deny write / execute */ >>>>>> #define PR_SET_MDWE 65 >>>>>> -# define PR_MDWE_REFUSE_EXEC_GAIN 1 >>>>>> +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) >>>>>> #define PR_GET_MDWE 66 >>>>>> >>>>> >>>>> Both are changing existing uapi, so you'll already have existing >>>>> user >>>>> space using the old values, that your kernel code has to deal with >>>>> no? >>>> >>>> I'm the one who suggested this change, so I feel the need to >>>> clarify. >>>> >>>> For any existing 64-bit user space code using the kernel and the >>>> uapi >>>> headers before this patch and doing the wrong prctl(PR_SET_MDWE, >>>> PR_MDWE_REFUSE_EXEC_GAIN) call instead of the correct >>>> prctl(PR_SET_MDWE, >>>> (unsigned long)PR_MDWE_REFUSE_EXEC_GAIN), there are two >>>> possibilities >>>> when prctl() implementation extracts the second argument via >>>> va_arg(op, >>>> unsigned long): >>>> >>>> * It gets lucky, and the upper 32 bits of the argument are zero. The >>>> call does what is expected by the user. >>>> >>>> * The upper 32 bits are non-zero junk. The flags argument is >>>> rejected >>>> by >>>> the kernel, and the call fails with EINVAL (unexpectedly for the >>>> user). >>>> >>>> This change is intended to affect only the second case, and only >>>> after >>>> the program is recompiled with the new uapi headers. The currently >>>> wrong, but naturally-looking prctl(PR_SET_MDWE, >>>> PR_MDWE_REFUSE_EXEC_GAIN) call becomes correct. >>>> >>>> The kernel ABI is unaffected by this change, since it has been >>>> defined >>>> in terms of unsigned long from the start. >>> >>> The thing I'm concerned about is the following: old user space (that >>> would fail) on new kernel where we define some upper 32bit to >>> actually >>> have a meaning (where it would succeed with wrong semantics). >>> >>> IOW, can we ever really "use" these upper 32bit, or should we instead >>> only consume the lower 32bit in the kernel and effectively ignore the >>> upper 32bit? >>> >> I see, thanks. But I think this question is mostly independent from >> this >> patch. The patch removes a footgun, but it doesn't change the flags >> check in the kernel, and nothing stops the user from doing >> >> int flags = PR_MDWE_REFUSE_EXEC_GAIN; >> prctl(PR_SET_MDWE, flags); >> >> So we have to decide whether to ignore the upper 32 bits or not even >> if >> this patch is not applied (actually *had to* when PR_SET_MDWE op was >> being added). > > Well, an alternative to this patch would be to say "well, for this > prctl we ignore any upper 32bit. Then, this change would not be > needed. Yes, I also don't like that :) > > Bu unrelated, I looked at some other random prctl. > > #define SUID_DUMP_USER 1 > > And in kernel/sys.c: > > case PR_SET_DUMPABLE: > if (arg2 != SUID_DUMP_DISABLE && arg2 != SUID_DUMP_USER) > ... > > Wouldn't that also suffer from the same issue, or how is this > different? > Yes, it is the same issue, so e.g. prctl(PR_SET_DUMPABLE, SUID_DUMP_DISABLE ) may wrongly fail with EINVAL on 64-bit targets. > Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We > need arg2 -> arg5 to be 0. But wouldn't the following also just pass a > 0 "int" ? > > prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0) > Yes, this is not reliable on 64-bit targets too. The simplest fix is to use "0L", as done in MDWE self-tests (but many other tests get this wrong). Florent also expressed surprise[1] that we don't see a lot of failures due to such issues, and I tried to provide some reasons. To elaborate on the x86-64 thing, for prctl(PR_SET_DUMPABLE, 0) the compiler will likely generate "xorl %esi, %esi" to pass zero, but this instruction will also clear the upper 32 bits of %rsi, so the problem is masked (and I believe CPU vendors are motivated to do such zeroing to reduce false dependencies). But this zeroing is not required by the ABI, so in a more complex situation junk might get through. Real-world examples of very similar breakage in variadic functions involving NULL sentinels are mentioned in [2] (the musl bug report is [3]). In short, musl defined NULL as plain 0 for C++, so when people do e.g. execl("/bin/true", "true", NULL), junk might prevent detection of the sentinel in execl() impl. (Though if the sentinel is passed via stack because there are a lot of preceding arguments, the breakage becomes more apparent because auto-zeroing of registers doesn't come into play anymore.) > > I'm easily confused by such (va_args) things, so sorry for the dummy > questions. This stuff *is* confusing, and note that Linux man pages don't even tell that prctl() is actually declared as a variadic function (and for ptrace() this is mentioned only in the notes, but not in its signature). Thanks, Alexey [1] https://lore.kernel.org/lkml/3a38319a3b241e578729ffa5484ad24b@ispras.ru [2] https://ewontfix.com/11 [3] https://www.openwall.com/lists/musl/2013/01/09/1
On Tue, May 23, 2023 at 11:12:37AM +0200, David Hildenbrand wrote: > Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We need arg2 > -> arg5 to be 0. But wouldn't the following also just pass a 0 "int" ? > > prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0) > > I'm easily confused by such (va_args) things, so sorry for the dummy > questions. Isn't the prctl() prototype in the user headers defined with the first argument as int while the rest as unsigned long? At least from the man page: int prctl(int option, unsigned long arg2, unsigned long arg3, unsigned long arg4, unsigned long arg5); So there are no va_args tricks (which confuse me as well). Any int passed to arg[2-5] would be converted by the compiler to an unsigned long before being passed to the kernel. So I think the change in this patch is harmless as the conversion is happening anyway. (well, unless I completely missed what the problem is)
On 2023-05-23 16:07, Catalin Marinas wrote: > On Tue, May 23, 2023 at 11:12:37AM +0200, David Hildenbrand wrote: >> Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We >> need arg2 >> -> arg5 to be 0. But wouldn't the following also just pass a 0 "int" ? >> >> prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0) >> >> I'm easily confused by such (va_args) things, so sorry for the dummy >> questions. > > Isn't the prctl() prototype in the user headers defined with the first > argument as int while the rest as unsigned long? At least from the man > page: > > int prctl(int option, unsigned long arg2, unsigned long arg3, > unsigned long arg4, unsigned long arg5); > > So there are no va_args tricks (which confuse me as well). > I have explicitly mentioned the problem with man pages in my response to David[1]. Quoting myself: > This stuff *is* confusing, and note that Linux man pages don't even > tell that prctl() is actually declared as a variadic function (and for ptrace() this is mentioned only in the notes, but not in its signature). The reality: * glibc: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/sys/prctl.h;h=821aeefc1339b35210e8918ecfe9833ed2792626;hb=glibc-2.37#l42 * musl: https://git.musl-libc.org/cgit/musl/tree/include/sys/prctl.h?h=v1.2.4#n180 Though there is a test in the kernel that does define its own prototype, avoiding the issue: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/sched/cs_prctl_test.c?h=v6.3#n77 Thanks, Alexey [1] https://lore.kernel.org/lkml/7c572622c0d8e283fc880fe3f4ffac27@ispras.ru//lkml/7c572622c0d8e283fc880fe3f4ffac27@ispras.ru > Any int passed to arg[2-5] would be converted by the compiler to an > unsigned long before being passed to the kernel. So I think the change > in this patch is harmless as the conversion is happening anyway. > > (well, unless I completely missed what the problem is)
On Tue, May 23, 2023 at 04:25:45PM +0300, Alexey Izbyshev wrote: > On 2023-05-23 16:07, Catalin Marinas wrote: > > On Tue, May 23, 2023 at 11:12:37AM +0200, David Hildenbrand wrote: > > > Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We > > > need arg2 > > > -> arg5 to be 0. But wouldn't the following also just pass a 0 "int" ? > > > > > > prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0) > > > > > > I'm easily confused by such (va_args) things, so sorry for the dummy > > > questions. > > > > Isn't the prctl() prototype in the user headers defined with the first > > argument as int while the rest as unsigned long? At least from the man > > page: > > > > int prctl(int option, unsigned long arg2, unsigned long arg3, > > unsigned long arg4, unsigned long arg5); > > > > So there are no va_args tricks (which confuse me as well). > > > I have explicitly mentioned the problem with man pages in my response to > David[1]. Quoting myself: > > > This stuff *is* confusing, and note that Linux man pages don't even tell > that prctl() is actually declared as a variadic function (and for > ptrace() this is mentioned only in the notes, but not in its signature). Ah, thanks for the clarification (I somehow missed your reply). > The reality: > > * glibc: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/sys/prctl.h;h=821aeefc1339b35210e8918ecfe9833ed2792626;hb=glibc-2.37#l42 > > * musl: > https://git.musl-libc.org/cgit/musl/tree/include/sys/prctl.h?h=v1.2.4#n180 > > Though there is a test in the kernel that does define its own prototype, > avoiding the issue: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/sched/cs_prctl_test.c?h=v6.3#n77 At least for glibc, it seems that there is a conversion to unsigned long: https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/prctl.c#l28 unsigned long int arg2 = va_arg (arg, unsigned long int); (does va_arg expand to an actual cast?) If the libc passes a 32-bit to a kernel ABI that expects 64-bit, I think it's a user-space bug and not a kernel ABI issue.
>> Wouldn't that also suffer from the same issue, or how is this >> different? >> > Yes, it is the same issue, so e.g. prctl(PR_SET_DUMPABLE, > SUID_DUMP_DISABLE ) may wrongly fail with EINVAL on 64-bit targets. > >> Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We >> need arg2 -> arg5 to be 0. But wouldn't the following also just pass a >> 0 "int" ? >> >> prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0) >> > Yes, this is not reliable on 64-bit targets too. The simplest fix is to > use "0L", as done in MDWE self-tests (but many other tests get this > wrong). Oh, it's even worse than I thought, then. :) Even in our selftest most of $ git grep prctl tools/testing/selftests/ | grep "0" gets it wrong. > > Florent also expressed surprise[1] that we don't see a lot of failures > due to such issues, and I tried to provide some reasons. To elaborate on Yes, I'm also surprised! > the x86-64 thing, for prctl(PR_SET_DUMPABLE, 0) the compiler will likely > generate "xorl %esi, %esi" to pass zero, but this instruction will also > clear the upper 32 bits of %rsi, so the problem is masked (and I believe > CPU vendors are motivated to do such zeroing to reduce false > dependencies). But this zeroing is not required by the ABI, so in a more > complex situation junk might get through. :/ > > Real-world examples of very similar breakage in variadic functions > involving NULL sentinels are mentioned in [2] (the musl bug report is > [3]). In short, musl defined NULL as plain 0 for C++, so when people do > e.g. execl("/bin/true", "true", NULL), junk might prevent detection of > the sentinel in execl() impl. (Though if the sentinel is passed via > stack because there are a lot of preceding arguments, the breakage > becomes more apparent because auto-zeroing of registers doesn't come > into play anymore.) Yes, I heard about the "fun" with NULL already. Thanks for the musl pointer. And thanks for the confirmation/explanation. > >> >> I'm easily confused by such (va_args) things, so sorry for the dummy >> questions. > > This stuff *is* confusing, and note that Linux man pages don't even tell > that prctl() is actually declared as a variadic function (and for > ptrace() this is mentioned only in the notes, but not in its signature). Agreed, that's easy to miss (and probably many people missed it). Anyhow, for this patch as is (although it feels like drops in the ocean after our discussion) Reviewed-by: David Hildenbrand <david@redhat.com>
On Wed, May 17, 2023 at 05:03:19PM +0200, Florent Revest wrote: > Alexey pointed out that defining a prctl flag as an int is a footgun > because, under some circumstances, when used as a flag to prctl, it can > be casted to long with garbage upper bits which would result in > unexpected behaviors. > > This patch changes the constant to a UL to eliminate these > possibilities. > > Signed-off-by: Florent Revest <revest@chromium.org> > Suggested-by: Alexey Izbyshev <izbyshev@ispras.ru> FWIW, I'm fine with this patch and I don't think it introduces an ABI change. Acked-by: Catalin Marinas <catalin.marinas@arm.com>
On 2023-05-23 17:09, Catalin Marinas wrote: > On Tue, May 23, 2023 at 04:25:45PM +0300, Alexey Izbyshev wrote: >> On 2023-05-23 16:07, Catalin Marinas wrote: >> > On Tue, May 23, 2023 at 11:12:37AM +0200, David Hildenbrand wrote: >> > > Also, how is passing "0"s to e.g., PR_GET_THP_DISABLE reliable? We >> > > need arg2 >> > > -> arg5 to be 0. But wouldn't the following also just pass a 0 "int" ? >> > > >> > > prctl(PR_GET_THP_DISABLE, 0, 0, 0, 0) >> > > >> > > I'm easily confused by such (va_args) things, so sorry for the dummy >> > > questions. >> > >> > Isn't the prctl() prototype in the user headers defined with the first >> > argument as int while the rest as unsigned long? At least from the man >> > page: >> > >> > int prctl(int option, unsigned long arg2, unsigned long arg3, >> > unsigned long arg4, unsigned long arg5); >> > >> > So there are no va_args tricks (which confuse me as well). >> > >> I have explicitly mentioned the problem with man pages in my response >> to >> David[1]. Quoting myself: >> >> > This stuff *is* confusing, and note that Linux man pages don't even tell >> that prctl() is actually declared as a variadic function (and for >> ptrace() this is mentioned only in the notes, but not in its >> signature). > > Ah, thanks for the clarification (I somehow missed your reply). > >> The reality: >> >> * glibc: >> https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/sys/prctl.h;h=821aeefc1339b35210e8918ecfe9833ed2792626;hb=glibc-2.37#l42 >> >> * musl: >> https://git.musl-libc.org/cgit/musl/tree/include/sys/prctl.h?h=v1.2.4#n180 >> >> Though there is a test in the kernel that does define its own >> prototype, >> avoiding the issue: >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/sched/cs_prctl_test.c?h=v6.3#n77 > > At least for glibc, it seems that there is a conversion to unsigned > long: > > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/prctl.c#l28 > > unsigned long int arg2 = va_arg (arg, unsigned long int); > > (does va_arg expand to an actual cast?) > No, this not a conversion or a cast in the sense that I think you mean it. What happens in the situation discussed in this thread is the following (assuming the argument is passed via a register, which is typical for initial variadic arguments on 64-bit targets): * User calls prctl(op, 0) on a 64-bit target. * The second argument is an int. * The compiler generates code to pass an int (32 bits) via a 64-bit register. The compiler is NOT required to clear the upper 32 bits of the register, so they might contain arbitrary junk in a general case. * The prctl() implementation calls va_arg(arg, unsigned long) (as in your quote). * The compiler extracts the full 64-bit value of the same register (which in our case might contain junk in the upper 32 bits). * This extracted 64-bit value is then passed to the system call. So... > If the libc passes a 32-bit to a kernel ABI that expects 64-bit, I > think > it's a user-space bug and not a kernel ABI issue. ... the problem happens not at the user/kernel boundary, but in prctl() call/implementation in user space. But yes, it's still a user-space bug and not a kernel ABI issue. The David's question, as I understand it, was whether we want to keep such buggy code that happens to pass junk failing with EINVAL in future kernels or not. If we do want to keep it failing, we can never assign any meaning to the upper 32 bits of the second prctl() argument for PR_SET_MDWE op. Thanks, Alexey
The 05/23/2023 15:09, Catalin Marinas wrote: > At least for glibc, it seems that there is a conversion to unsigned > long: > > https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/prctl.c#l28 > > unsigned long int arg2 = va_arg (arg, unsigned long int); > > (does va_arg expand to an actual cast?) this is not a conversion. at this point the argument is already corrupt since an int was passed instead of unsigned long, usually it's just wrong top 32bit, but in theory arg passing for int and long could be completely different. > > If the libc passes a 32-bit to a kernel ABI that expects 64-bit, I think > it's a user-space bug and not a kernel ABI issue. this is point 3 in an earlier mail i wrote about varargs (we ran into vararg issues during morello porting but it affects all 64bit targets): https://lkml.org/lkml/2022/10/31/386 IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
On Mon, May 22, 2023 at 8:58 PM Alexey Izbyshev <izbyshev@ispras.ru> wrote: > > My preference would be to keep checking the upper 32 bits. Florent, what > do you think? I don't mind. I can do this as part of v3. :) > > Which raises the question if we want to tag this here with a "Fixes" > > and eventually cc stable (hmm ...)? > > Yes, IMO the faster we propagate this change, the better. Okay, will do
On Tue, May 23, 2023 at 4:10 PM David Hildenbrand <david@redhat.com> wrote: > > >> I'm easily confused by such (va_args) things, so sorry for the dummy > >> questions. > > > > This stuff *is* confusing, and note that Linux man pages don't even tell > > that prctl() is actually declared as a variadic function (and for > > ptrace() this is mentioned only in the notes, but not in its signature). > > Agreed, that's easy to miss (and probably many people missed it). > > > Anyhow, for this patch as is (although it feels like drops in the ocean > after our discussion) > > Reviewed-by: David Hildenbrand <david@redhat.com> Thanks everyone for the review and the exchange ! :)
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h index f23d9a16507f..6e9af6cbc950 100644 --- a/include/uapi/linux/prctl.h +++ b/include/uapi/linux/prctl.h @@ -283,7 +283,7 @@ struct prctl_mm_map { /* Memory deny write / execute */ #define PR_SET_MDWE 65 -# define PR_MDWE_REFUSE_EXEC_GAIN 1 +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) #define PR_GET_MDWE 66 diff --git a/tools/include/uapi/linux/prctl.h b/tools/include/uapi/linux/prctl.h index 759b3f53e53f..6e6563e97fef 100644 --- a/tools/include/uapi/linux/prctl.h +++ b/tools/include/uapi/linux/prctl.h @@ -283,7 +283,7 @@ struct prctl_mm_map { /* Memory deny write / execute */ #define PR_SET_MDWE 65 -# define PR_MDWE_REFUSE_EXEC_GAIN 1 +# define PR_MDWE_REFUSE_EXEC_GAIN (1UL << 0) #define PR_GET_MDWE 66