Message ID | 20230330155707.3106228-2-peterx@redhat.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp1249040vqo; Thu, 30 Mar 2023 09:20:47 -0700 (PDT) X-Google-Smtp-Source: AKy350a2WhB+n8l25YzEogQPh2vfCYN/+wWXj16aKK6qdrDvLVe1mcohQEUGa1bbU4icCr7spHJ0 X-Received: by 2002:a17:90b:1d86:b0:23b:3f18:a8fe with SMTP id pf6-20020a17090b1d8600b0023b3f18a8femr27218813pjb.31.1680193247323; Thu, 30 Mar 2023 09:20:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1680193247; cv=none; d=google.com; s=arc-20160816; b=AzgRRacGe/xXYA0K0qumPrO58gJgLApJQR0FVU0u+AWXLH012WDzf39MXnevmRMIDv zGHOdIaK5w+Hj+dQ4mMzKGCRcC0d/uhVmxujpIrn8onhDuGNcRu7n76Ubn4tD2gufHnF tGINSvVOhsdsj090rEoWSN5TWk2ozf8a8KBSFgP7/ObRtp5WXZeSTdg5L0f8+CpnuFuS NO2c6tTz6Z/TUcSOr/qlQYwqAIC8gdhDJFOBFWxbjkLecjn8sklTYWCEzung89VkF/fT TYcC3SUXGvawolaNYF/owcRQt69wGZGpxvnkLFLi0+mBGQoC0ySdfHrscV7jTM5EouSc X+6g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=sP9rqjignnWh2VSPIiMl4HPMQh9/IMP3JUMCK79G4dM=; b=AccF+OZ4d4iK32/tP8dh3pxiTPnSGLsAxYOcM1ccsnt+wzpyco2DPBnRERBQKi6GzQ OJOlzN1+NoGEKl6y1viZVZXJHs8dQglZxv/jWrCmfwDr/geD3/zIShiR4RUjodkTDzpy D8WN8Z/vMyF9ViVD4XsB0jpLB3jAtms2e7kCaw8v5DngPCqXMrAtzUQYLXOsGVNOKkEE Zvc7AH044uzPfeHYsVT2v2wCSES0Yr2DDtPv8uKek1MqhyuGxPkEquBhv7H6eEGkJoAV WXrAu0PXC5T3XtXt5pOk4xEJyTfwhKtPDhJDSfQE9hFHk24OR56cVjBB9/j6jBU8hSRj dPsw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=eMCUvD3e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t10-20020a17090a6a0a00b0023f14c9a5fesi4181468pjj.100.2023.03.30.09.20.33; Thu, 30 Mar 2023 09:20:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=eMCUvD3e; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232874AbjC3P6K (ORCPT <rfc822;rua109.linux@gmail.com> + 99 others); Thu, 30 Mar 2023 11:58:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:49992 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233501AbjC3P6E (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 30 Mar 2023 11:58:04 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6933B10CF for <linux-kernel@vger.kernel.org>; Thu, 30 Mar 2023 08:57:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1680191839; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=sP9rqjignnWh2VSPIiMl4HPMQh9/IMP3JUMCK79G4dM=; b=eMCUvD3eYSZZ+FKilEItqt1NS+/NsoI6qIuR8ZAmwdr86TJL9TbGbpblRWwMUE2rUPq8pr WXYperdklGGfr4NBOz6d7FrapK0wSqghgVmISjHHaoJbkNtuE/0INoY7zaAZiDJc5+a18J YLRQbUpBwxpILWifXMc/N5zRlIMKOe4= Received: from mail-qk1-f198.google.com (mail-qk1-f198.google.com [209.85.222.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-563-AZLvEGlFPpeN5WRBTqjq0Q-1; Thu, 30 Mar 2023 11:57:17 -0400 X-MC-Unique: AZLvEGlFPpeN5WRBTqjq0Q-1 Received: by mail-qk1-f198.google.com with SMTP id 72-20020a37064b000000b007467c5d3abeso9069881qkg.19 for <linux-kernel@vger.kernel.org>; Thu, 30 Mar 2023 08:57:17 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1680191832; x=1682783832; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=sP9rqjignnWh2VSPIiMl4HPMQh9/IMP3JUMCK79G4dM=; b=i0GshvcBPzUahTTFTz2t/WBoCOGrDI61mIusz6wfi4olvzrEFHtwR5J4XB/f5ihFsa aP9SmuM6azLKTEFzXBYW44ACo2Do09+Z3h4mtm+pFvDkCjERIoa6uHZe+lxX4pgqnSln oE04fYMbj7Jia7jjccQn8aKCoFQ3HFq6O4a8s6h3jvxKKcnBiBVNbx8/BtqSBBlVAEu0 3wwE9UU3HskXHo6pMhwKPqd0m5Lf0vJ216tnDVwjY9JI6byQ4mxZ+9e6dY5AxtDecAFT cnxM2i76xdBSL++CEJa2VA5cCXswLkKT4+j2D+3KcffuaFsIFFqzRV0olslNnO9K+eZ+ yBpA== X-Gm-Message-State: AO0yUKXCcTIYI70Kt7UMCltnQwWJFCGNSVCNDOy7rwQfiESmQ+OB6qUt mXqhLy4waicgIO6qpGPEBK3gLxSrCi+wcEkt8lfg+KBG5izdq36MY6WF+TQUT+mQ3ZKhc93s8J1 AnSbgtOVI71L/gzZqADa8Qau8 X-Received: by 2002:a05:622a:1981:b0:3dc:483f:9c82 with SMTP id u1-20020a05622a198100b003dc483f9c82mr36684514qtc.0.1680191832432; Thu, 30 Mar 2023 08:57:12 -0700 (PDT) X-Received: by 2002:a05:622a:1981:b0:3dc:483f:9c82 with SMTP id u1-20020a05622a198100b003dc483f9c82mr36684482qtc.0.1680191832185; Thu, 30 Mar 2023 08:57:12 -0700 (PDT) Received: from x1n.redhat.com (bras-base-aurron9127w-grc-40-70-52-229-124.dsl.bell.ca. [70.52.229.124]) by smtp.gmail.com with ESMTPSA id o10-20020a05620a0d4a00b0074281812276sm13059380qkl.97.2023.03.30.08.57.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 30 Mar 2023 08:57:10 -0700 (PDT) From: Peter Xu <peterx@redhat.com> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: peterx@redhat.com, Mike Kravetz <mike.kravetz@oracle.com>, Andrew Morton <akpm@linux-foundation.org>, Andrea Arcangeli <aarcange@redhat.com>, Mike Rapoport <rppt@linux.vnet.ibm.com>, Axel Rasmussen <axelrasmussen@google.com>, Nadav Amit <nadav.amit@gmail.com>, Leonardo Bras Soares Passos <lsoaresp@redhat.com>, David Hildenbrand <david@redhat.com>, linux-stable <stable@vger.kernel.org> Subject: [PATCH 01/29] Revert "userfaultfd: don't fail on unrecognized features" Date: Thu, 30 Mar 2023 11:56:39 -0400 Message-Id: <20230330155707.3106228-2-peterx@redhat.com> X-Mailer: git-send-email 2.39.1 In-Reply-To: <20230330155707.3106228-1-peterx@redhat.com> References: <20230330155707.3106228-1-peterx@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-0.2 required=5.0 tests=DKIMWL_WL_HIGH,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1761810314880139031?= X-GMAIL-MSGID: =?utf-8?q?1761810314880139031?= |
Series |
selftests/mm: Split / Refactor userfault test
|
|
Commit Message
Peter Xu
March 30, 2023, 3:56 p.m. UTC
This is a proposal to revert commit 914eedcb9ba0ff53c33808.
I found this when writting a simple UFFDIO_API test to be the first unit
test in this set. Two things breaks with the commit:
- UFFDIO_API check was lost and missing. According to man page, the
kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This
check is needed if the api version will be extended in the future, or
user app won't be able to identify which is a new kernel.
- Feature flags checks were removed, which means UFFDIO_API with a
feature that does not exist will also succeed. According to the man
page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if
unknown features passed in.
Link: https://lore.kernel.org/r/20220722201513.1624158-1-axelrasmussen@google.com
Cc: Axel Rasmussen <axelrasmussen@google.com>
Cc: linux-stable <stable@vger.kernel.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
fs/userfaultfd.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
Comments
On 30.03.23 17:56, Peter Xu wrote: > This is a proposal to revert commit 914eedcb9ba0ff53c33808. > > I found this when writting a simple UFFDIO_API test to be the first unit > test in this set. Two things breaks with the commit: > > - UFFDIO_API check was lost and missing. According to man page, the > kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This > check is needed if the api version will be extended in the future, or > user app won't be able to identify which is a new kernel. Agreed. > > - Feature flags checks were removed, which means UFFDIO_API with a > feature that does not exist will also succeed. According to the man > page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if > unknown features passed in. > Agreed. I understand the motivation of the original commit, but it should not have changed existing checks/functionality. Introducing a different way to enable such functionality on explicit request would be better. But maybe simple feature probing (is X support? is Y supported? is Z supported) might be easier without requiring ABI changes. I assume we better add Fixes: 914eedcb9ba0 ("userfaultfd: don't fail on unrecognized features") Acked-by: David Hildenbrand <david@redhat.com> > Link: https://lore.kernel.org/r/20220722201513.1624158-1-axelrasmussen@google.com > Cc: Axel Rasmussen <axelrasmussen@google.com> > Cc: linux-stable <stable@vger.kernel.org> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > fs/userfaultfd.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 8395605790f6..3b2a41c330e6 100644 > --- a/fs/userfaultfd.c > +++ b/fs/userfaultfd.c > @@ -1977,8 +1977,10 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx, > ret = -EFAULT; > if (copy_from_user(&uffdio_api, buf, sizeof(uffdio_api))) > goto out; > - /* Ignore unsupported features (userspace built against newer kernel) */ > - features = uffdio_api.features & UFFD_API_FEATURES; > + features = uffdio_api.features; > + ret = -EINVAL; > + if (uffdio_api.api != UFFD_API || (features & ~UFFD_API_FEATURES)) > + goto err_out; > ret = -EPERM; > if ((features & UFFD_FEATURE_EVENT_FORK) && !capable(CAP_SYS_PTRACE)) > goto err_out;
On Thu, Mar 30, 2023 at 8:57 AM Peter Xu <peterx@redhat.com> wrote: > > This is a proposal to revert commit 914eedcb9ba0ff53c33808. > > I found this when writting a simple UFFDIO_API test to be the first unit > test in this set. Two things breaks with the commit: > > - UFFDIO_API check was lost and missing. According to man page, the > kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This > check is needed if the api version will be extended in the future, or > user app won't be able to identify which is a new kernel. 100% agreed, this was a mistake. > > - Feature flags checks were removed, which means UFFDIO_API with a > feature that does not exist will also succeed. According to the man > page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if > unknown features passed in. I still strongly disagree with reverting this part, my feeling is still that doing so makes things more complicated for no reason. Re: David's point, it's clearly wrong to change semantics so a thing that used to work now fails. But this instead makes it more permissive - existing userspace programs continue to work as-is, but *also* one can achieve the same thing more simply (combine probing + configuration into one step). I don't see any problem with that, generally. But, if David and others don't find my argument convincing, it isn't the end of the world. It just means I have to go update my userspace code to be a bit more complicated. :) I also still think the man page is incorrect or at least incomplete no matter what we do here; we should be sure to update it as a follow-up. > > Link: https://lore.kernel.org/r/20220722201513.1624158-1-axelrasmussen@google.com > Cc: Axel Rasmussen <axelrasmussen@google.com> > Cc: linux-stable <stable@vger.kernel.org> > Signed-off-by: Peter Xu <peterx@redhat.com> > --- > fs/userfaultfd.c | 6 ++++-- > 1 file changed, 4 insertions(+), 2 deletions(-) > > diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c > index 8395605790f6..3b2a41c330e6 100644 > --- a/fs/userfaultfd.c > +++ b/fs/userfaultfd.c > @@ -1977,8 +1977,10 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx, > ret = -EFAULT; > if (copy_from_user(&uffdio_api, buf, sizeof(uffdio_api))) > goto out; > - /* Ignore unsupported features (userspace built against newer kernel) */ > - features = uffdio_api.features & UFFD_API_FEATURES; > + features = uffdio_api.features; > + ret = -EINVAL; > + if (uffdio_api.api != UFFD_API || (features & ~UFFD_API_FEATURES)) > + goto err_out; > ret = -EPERM; > if ((features & UFFD_FEATURE_EVENT_FORK) && !capable(CAP_SYS_PTRACE)) > goto err_out; > -- > 2.39.1 >
On Thu, Mar 30, 2023 at 08:31:30PM +0200, David Hildenbrand wrote: > On 30.03.23 17:56, Peter Xu wrote: > > This is a proposal to revert commit 914eedcb9ba0ff53c33808. > > > > I found this when writting a simple UFFDIO_API test to be the first unit > > test in this set. Two things breaks with the commit: > > > > - UFFDIO_API check was lost and missing. According to man page, the > > kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This > > check is needed if the api version will be extended in the future, or > > user app won't be able to identify which is a new kernel. > > Agreed. > > > > > - Feature flags checks were removed, which means UFFDIO_API with a > > feature that does not exist will also succeed. According to the man > > page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if > > unknown features passed in. > > > > Agreed. > > I understand the motivation of the original commit, but it should not have > changed existing checks/functionality. Introducing a different way to enable > such functionality on explicit request would be better. But maybe simple > feature probing (is X support? is Y supported? is Z supported) might be > easier without requiring ABI changes. Yes, I mentioned a similar "proposal" of UFFDIO_FEATURES here too, simply returning the feature bitmask before UFFDIO_API: https://lore.kernel.org/all/ZCSUTSbAcwBINiNk@x1n/ But I think current way is still fine; so maybe we'd just not bother. > > I assume we better add > > Fixes: 914eedcb9ba0 ("userfaultfd: don't fail on unrecognized features") Yes I'll add it. > Acked-by: David Hildenbrand <david@redhat.com> Thanks,
On Thu, Mar 30, 2023 at 12:04:09PM -0700, Axel Rasmussen wrote: > On Thu, Mar 30, 2023 at 8:57 AM Peter Xu <peterx@redhat.com> wrote: > > > > This is a proposal to revert commit 914eedcb9ba0ff53c33808. > > > > I found this when writting a simple UFFDIO_API test to be the first unit > > test in this set. Two things breaks with the commit: > > > > - UFFDIO_API check was lost and missing. According to man page, the > > kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This > > check is needed if the api version will be extended in the future, or > > user app won't be able to identify which is a new kernel. > > 100% agreed, this was a mistake. > > > > > - Feature flags checks were removed, which means UFFDIO_API with a > > feature that does not exist will also succeed. According to the man > > page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if > > unknown features passed in. > > I still strongly disagree with reverting this part, my feeling is > still that doing so makes things more complicated for no reason. > > Re: David's point, it's clearly wrong to change semantics so a thing > that used to work now fails. But this instead makes it more permissive > - existing userspace programs continue to work as-is, but *also* one > can achieve the same thing more simply (combine probing + > configuration into one step). I don't see any problem with that, > generally. > > But, if David and others don't find my argument convincing, it isn't > the end of the world. It just means I have to go update my userspace > code to be a bit more complicated. :) I'd say it's fine if it's your own program that you can in full control easily. :) Sorry again for not noticing that earlier. There's one reason that we may consider keeping the behavior. IMHO it is when there're major softwares that uses the "wrong" ABI (let's say so; because it's not following the man pages). If you're aware any such major softwares (especially open sourced) will break due to this revert patch, please shoot. > > I also still think the man page is incorrect or at least incomplete no > matter what we do here; we should be sure to update it as a follow-up. So far it looks still fine if with this revert. Maybe I overlooked somewhere? I'll add this into my todo, but with low priority. If you have suggestion already on how to improve the man page please do so before me!
On Thu, Mar 30, 2023 at 3:27 PM Peter Xu <peterx@redhat.com> wrote: > > On Thu, Mar 30, 2023 at 12:04:09PM -0700, Axel Rasmussen wrote: > > On Thu, Mar 30, 2023 at 8:57 AM Peter Xu <peterx@redhat.com> wrote: > > > > > > This is a proposal to revert commit 914eedcb9ba0ff53c33808. > > > > > > I found this when writting a simple UFFDIO_API test to be the first unit > > > test in this set. Two things breaks with the commit: > > > > > > - UFFDIO_API check was lost and missing. According to man page, the > > > kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This > > > check is needed if the api version will be extended in the future, or > > > user app won't be able to identify which is a new kernel. > > > > 100% agreed, this was a mistake. > > > > > > > > - Feature flags checks were removed, which means UFFDIO_API with a > > > feature that does not exist will also succeed. According to the man > > > page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if > > > unknown features passed in. > > > > I still strongly disagree with reverting this part, my feeling is > > still that doing so makes things more complicated for no reason. > > > > Re: David's point, it's clearly wrong to change semantics so a thing > > that used to work now fails. But this instead makes it more permissive > > - existing userspace programs continue to work as-is, but *also* one > > can achieve the same thing more simply (combine probing + > > configuration into one step). I don't see any problem with that, > > generally. > > > > But, if David and others don't find my argument convincing, it isn't > > the end of the world. It just means I have to go update my userspace > > code to be a bit more complicated. :) > > I'd say it's fine if it's your own program that you can in full control > easily. :) Sorry again for not noticing that earlier. > > There's one reason that we may consider keeping the behavior. IMHO it is > when there're major softwares that uses the "wrong" ABI (let's say so; > because it's not following the man pages). If you're aware any such major > softwares (especially open sourced) will break due to this revert patch, > please shoot. Well, I did find one example, criu: https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/uffd.c#L266 It doesn't do the two-step probing process, it looks to me like it does what my patch was intending users to do: It just asks for the requested features up-front, without any probing. And then it returns the subset of features it *actually* got, ostensibly so the caller can compare that vs. what it requested. Then again, it looks like the caller doesn't *actually* compare the features it got vs. what it asked for. I don't know enough about criu to know if this is a bug, or if they actually just don't care. https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/uffd.c#L312 > > > > > I also still think the man page is incorrect or at least incomplete no > > matter what we do here; we should be sure to update it as a follow-up. > > So far it looks still fine if with this revert. Maybe I overlooked > somewhere? > > I'll add this into my todo, but with low priority. If you have suggestion > already on how to improve the man page please do so before me! My thinking is that it describes the bits and pieces, but doesn't explicitly describe end-to-end how to configure a userfaultfd using the two-step probing approach. (Or state that this is actually *required*, unless you only want to set features=0 in any case.) Maybe it will be easiest if I just send a patch myself with what I'm thinking, and we can see what folks think. Always easier to just look at a patch vs. talking about it in the abstract. :) > > -- > Peter Xu >
On Fri, 31 Mar 2023 at 17:52, Axel Rasmussen <axelrasmussen@google.com> wrote: > > On Thu, Mar 30, 2023 at 3:27 PM Peter Xu <peterx@redhat.com> wrote: > > > > On Thu, Mar 30, 2023 at 12:04:09PM -0700, Axel Rasmussen wrote: > > > On Thu, Mar 30, 2023 at 8:57 AM Peter Xu <peterx@redhat.com> wrote: > > > > > > > > This is a proposal to revert commit 914eedcb9ba0ff53c33808. > > > > > > > > I found this when writting a simple UFFDIO_API test to be the first unit > > > > test in this set. Two things breaks with the commit: > > > > > > > > - UFFDIO_API check was lost and missing. According to man page, the > > > > kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This > > > > check is needed if the api version will be extended in the future, or > > > > user app won't be able to identify which is a new kernel. > > > > > > > > - Feature flags checks were removed, which means UFFDIO_API with a > > > > feature that does not exist will also succeed. According to the man > > > > page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if > > > > unknown features passed in. If features/flags are not checked in kernel, and the kernel doesn't return an error on an unknown flag/error, that makes the syscall non-extendable, meaning that adding any new feature may break existing software, which doesn't sanitize them properly. https://lwn.net/Articles/588444/ See a bunch of painful exercises from syscalls with numbers in the end: https://lwn.net/Articles/792628/ To adding an additional setsockopt() because an old one didn't have sanity checks for flags: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8917a777be3b (not the best example, as the new setsockopt() didn't check flags for sanity as well (sic!), but that's near the code I work on now) This is even documented nowadays: https://www.kernel.org/doc/html/latest/process/adding-syscalls.html#designing-the-api-planning-for-extension ...and everyone knows what happens when you blame userspace for breaking by not doing what you would have expected it to do: https://lkml.org/lkml/2012/12/23/75 [..] > > There's one reason that we may consider keeping the behavior. IMHO it is > > when there're major softwares that uses the "wrong" ABI (let's say so; > > because it's not following the man pages). If you're aware any such major > > softwares (especially open sourced) will break due to this revert patch, > > please shoot. > > Well, I did find one example, criu: > https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/uffd.c#L266 Mike can speak better than me about uffd, but AFAICS, CRIU correctly detects features with kerneldat/kdat: https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/kerndat.c#L1235 So, doing a sane thing in kernel shouldn't break CRIU (at least here). Thanks, Dmitry
On Fri, Mar 31, 2023 at 11:08 AM Dmitry Safonov <0x7f454c46@gmail.com> wrote: > > On Fri, 31 Mar 2023 at 17:52, Axel Rasmussen <axelrasmussen@google.com> wrote: > > > > On Thu, Mar 30, 2023 at 3:27 PM Peter Xu <peterx@redhat.com> wrote: > > > > > > On Thu, Mar 30, 2023 at 12:04:09PM -0700, Axel Rasmussen wrote: > > > > On Thu, Mar 30, 2023 at 8:57 AM Peter Xu <peterx@redhat.com> wrote: > > > > > > > > > > This is a proposal to revert commit 914eedcb9ba0ff53c33808. > > > > > > > > > > I found this when writting a simple UFFDIO_API test to be the first unit > > > > > test in this set. Two things breaks with the commit: > > > > > > > > > > - UFFDIO_API check was lost and missing. According to man page, the > > > > > kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This > > > > > check is needed if the api version will be extended in the future, or > > > > > user app won't be able to identify which is a new kernel. > > > > > > > > > > - Feature flags checks were removed, which means UFFDIO_API with a > > > > > feature that does not exist will also succeed. According to the man > > > > > page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if > > > > > unknown features passed in. > > If features/flags are not checked in kernel, and the kernel doesn't > return an error on > an unknown flag/error, that makes the syscall non-extendable, meaning > that adding > any new feature may break existing software, which doesn't sanitize > them properly. > https://lwn.net/Articles/588444/ I don't think the same problem applies here. In the case of syscalls, the problem is the only way the kernel can communicate is by the EINVAL return value. Without the check, if a call succeeds the caller can't tell: was the flag supported + applied, or unrecognized + ignored? With UFFDIO_API (we aren't talking about userfaultfd(2) itself), when you pass in a set of flags, we return the subset of flags which were enabled, in addition to the return code. So via that mechanism, one is "able to check whether it is running on a kernel where [userfaultfd] supports [the feature]" as the article describes - the only difference is, the caller must check the returned set of features, instead of checking for an error code. I don't think it's exactly *how* userspace can check that's important, but rather *that* it can check. Another important difference: I have a hard time imagining a case where adding a new feature could break userspace, even with my approach, but let's say for the sake of argument one arises in the future. Unlike normal syscalls, we have the UFFD_API version check, so we have the option of incrementing that to separate users relying on the old behavior, from users willing to deal with the new behavior. (Syscalls can kind of replicate this by adding a new syscall, like clone() vs clone2(), but I think that's messier than the API version check being built-in to the API.) > > See a bunch of painful exercises from syscalls with numbers in the end: > https://lwn.net/Articles/792628/ > To adding an additional setsockopt() because an old one didn't have > sanity checks for flags: > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=8917a777be3b > (not the best example, as the new setsockopt() didn't check flags for > sanity as well (sic!), > but that's near the code I work on now) > > This is even documented nowadays: > https://www.kernel.org/doc/html/latest/process/adding-syscalls.html#designing-the-api-planning-for-extension > > ...and everyone knows what happens when you blame userspace for breaking by > not doing what you would have expected it to do: > https://lkml.org/lkml/2012/12/23/75 100% agreed. :) > > [..] > > > There's one reason that we may consider keeping the behavior. IMHO it is > > > when there're major softwares that uses the "wrong" ABI (let's say so; > > > because it's not following the man pages). If you're aware any such major > > > softwares (especially open sourced) will break due to this revert patch, > > > please shoot. > > > > Well, I did find one example, criu: > > https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/uffd.c#L266 > > Mike can speak better than me about uffd, but AFAICS, CRIU correctly detects > features with kerneldat/kdat: > https://github.com/checkpoint-restore/criu/blob/criu-dev/criu/kerndat.c#L1235 Ah, right, this is the simplest case where no optional features are asked for. So, it's not a great example; this particular case would look the same regardless of what the kernel does. > > So, doing a sane thing in kernel shouldn't break CRIU (at least here). > > Thanks, > Dmitry
On 30.03.23 21:04, Axel Rasmussen wrote: > On Thu, Mar 30, 2023 at 8:57 AM Peter Xu <peterx@redhat.com> wrote: >> >> This is a proposal to revert commit 914eedcb9ba0ff53c33808. >> >> I found this when writting a simple UFFDIO_API test to be the first unit >> test in this set. Two things breaks with the commit: >> >> - UFFDIO_API check was lost and missing. According to man page, the >> kernel should reject ioctl(UFFDIO_API) if uffdio_api.api != 0xaa. This >> check is needed if the api version will be extended in the future, or >> user app won't be able to identify which is a new kernel. > > 100% agreed, this was a mistake. > >> >> - Feature flags checks were removed, which means UFFDIO_API with a >> feature that does not exist will also succeed. According to the man >> page, we should (and it makes sense) to reject ioctl(UFFDIO_API) if >> unknown features passed in. > > I still strongly disagree with reverting this part, my feeling is > still that doing so makes things more complicated for no reason. > > Re: David's point, it's clearly wrong to change semantics so a thing > that used to work now fails. But this instead makes it more permissive > - existing userspace programs continue to work as-is, but *also* one > can achieve the same thing more simply (combine probing + > configuration into one step). I don't see any problem with that, > generally. > > But, if David and others don't find my argument convincing, it isn't > the end of the world. It just means I have to go update my userspace > code to be a bit more complicated. :) I'd probably find it more convincing if we'd started out with that approach ;) . User space would have to deal with the behavior of old kernels either way already? IOW, old kernels would reject the new flags, new kernels would not reject them but mask them out. So changing that behavior after the effects is somewhat suboptimal IMHO ... and rather makes things more complicated.
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 8395605790f6..3b2a41c330e6 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1977,8 +1977,10 @@ static int userfaultfd_api(struct userfaultfd_ctx *ctx, ret = -EFAULT; if (copy_from_user(&uffdio_api, buf, sizeof(uffdio_api))) goto out; - /* Ignore unsupported features (userspace built against newer kernel) */ - features = uffdio_api.features & UFFD_API_FEATURES; + features = uffdio_api.features; + ret = -EINVAL; + if (uffdio_api.api != UFFD_API || (features & ~UFFD_API_FEATURES)) + goto err_out; ret = -EPERM; if ((features & UFFD_FEATURE_EVENT_FORK) && !capable(CAP_SYS_PTRACE)) goto err_out;