Message ID | 20230512003102.3149737-1-peterx@redhat.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:b0ea:0:b0:3b6:4342:cba0 with SMTP id b10csp4759786vqo; Thu, 11 May 2023 17:33:12 -0700 (PDT) X-Google-Smtp-Source: ACHHUZ43eLI/nQO/nhsiDPiBwPlrabipo63qRwRousZZ4swWyJT7xDuID9tjCUkXbV/LYCQgmF0z X-Received: by 2002:a05:6a20:72a6:b0:100:57cd:cdf5 with SMTP id o38-20020a056a2072a600b0010057cdcdf5mr20760685pzk.7.1683851592237; Thu, 11 May 2023 17:33:12 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1683851592; cv=none; d=google.com; s=arc-20160816; b=vWAoz8OpvUXvppOpu6lSJiSD7eoBTVwSitRgAJrAxNdjt/e5bk4rbdb+RE7qN7dREs xNoQBCLp6SunoNIkIo2XVSsf3o70Qw7/hOaE4P+iVM2Qfpk8Wdsg7Xrb2JzNcLqm1gpd Zl2ipEp5aWVZp3GRrVxxHsYNLF8HbtLwWQFtC4c1dhVifQKriegkiphVoUfPy6jKulG1 /Q6U/5YMcXq2lqKULMXjE4WF3zKBLGoPh/B+t1wvXQLM/V5Q5+7/09uGPF/qQIp+JApH +BKflRIU0WdHXXW9TwXTk67hPe9FO+sl3aZKfJePtX2Ju7Tr+QhmUUQw2VLNrNz8DHKJ i2ug== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :message-id:date:subject:cc:to:from:dkim-signature; bh=GadWHwp50KH9hxb9lEESltKioaCu8cw3tBG7j+NI1MU=; b=sXx+bA2iE/H4yCgtNlppoACanXA6+t3Yvzo/1vZD6GA+PKA8V63eCr80e1R3/JvbS0 FYi7s2kUCbK0CWCYrmQaHmhM+liKluJOGrOkZOHAQzpPGLWAKH/HOywQwSXgne1K/sXv tm6JCyLocrlRhmg3XZ1103GnhDu6FKQ4kcHrhMcY6hQyIlovlPkaWSw/+SUgGeEtUZKa 0qs7OirO7mcnjZ7DwmsCEHJz+4N86ldsDLVBA5TRv6e0BIus5GsYOI4ERfg2RJHLtCgA 7vYhm7Zbs4D5Dad3YHudorfRqVrP+u7eKNjNGkR4Fq5f+qnuNB6XLrAfli0qq7vtcZpa oERw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=bi6qAr9G; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w18-20020aa79a12000000b006293f8330fcsi8830682pfj.322.2023.05.11.17.33.00; Thu, 11 May 2023 17:33:12 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=bi6qAr9G; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239589AbjELAbz (ORCPT <rfc822;peekingduck44@gmail.com> + 99 others); Thu, 11 May 2023 20:31:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42476 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229654AbjELAbx (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 11 May 2023 20:31:53 -0400 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 567F255AC for <linux-kernel@vger.kernel.org>; Thu, 11 May 2023 17:31:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1683851467; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding; bh=GadWHwp50KH9hxb9lEESltKioaCu8cw3tBG7j+NI1MU=; b=bi6qAr9GuWIyuG+j1CiPwuKJqvA98vklEQD/QWWJcsEhCUF9MmjskXUc4Gm6Aynt1da5z1 yp37wfNKpOHfFHO9exzo+NVrktlutvXHsFwtZnx7QHuP33CmgCN7x1Xe4XJP+OCEb/7M5J VPlc6g200uPAtKWD7rbtI0fVk+BHt3k= Received: from mail-qk1-f197.google.com (mail-qk1-f197.google.com [209.85.222.197]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-673-SuA8UH45OsaiXJVRmf4XDw-1; Thu, 11 May 2023 20:31:06 -0400 X-MC-Unique: SuA8UH45OsaiXJVRmf4XDw-1 Received: by mail-qk1-f197.google.com with SMTP id af79cd13be357-74e23e33f80so96671785a.0 for <linux-kernel@vger.kernel.org>; Thu, 11 May 2023 17:31:06 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1683851465; x=1686443465; h=content-transfer-encoding:mime-version:message-id:date:subject:cc :to:from:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=GadWHwp50KH9hxb9lEESltKioaCu8cw3tBG7j+NI1MU=; b=fk5eM8HifheCEp8zhWJsmV1A+jWssqUBAKMzEanQAQ7ZRi5n39A4jRMangtFf+tr5i zm+k08THoZBdnHgDzB2X+aGnYCoYeWgq3qACT+FJGQ7QGiipA9Ad8nNKavyff51Z16vA 4ijTtRQyQTFjtQvLfm/GKtUoHmlS2b4FvKd/3HG3CMcjApF1KvzhK8zipgTFX9ZB3sZ6 HZoRcdcLxarXsYoEcnbgHcefhSU35z8YOEr3606AMg4+N7LDhCOf3GFAK1rXehkWuJH0 k8j/EzstEJN8APooCld0VLXo8Ij3jjWN/X21vkoKdnWhHwvlba/XHBhdOYBEeYv29xVs TJ7Q== X-Gm-Message-State: AC+VfDznqKsu7NEqBPlgSXQ3Tb/QzrFgiNQ6sMIw30gUXJfUjowkiIFX loC59e3TaBWZQHTCuJapFsA+MQVCBNKQJRsVSnPrMV5sE+Bc7FCPcHF691qON2tISUBqLdHE2kw j1CmY5854bIRT8h4/AKf6Bya+f1f0JyK+ X-Received: by 2002:ac8:5c91:0:b0:3f2:38d:5c84 with SMTP id r17-20020ac85c91000000b003f2038d5c84mr38416138qta.4.1683851464974; Thu, 11 May 2023 17:31:04 -0700 (PDT) X-Received: by 2002:ac8:5c91:0:b0:3f2:38d:5c84 with SMTP id r17-20020ac85c91000000b003f2038d5c84mr38416117qta.4.1683851464664; Thu, 11 May 2023 17:31:04 -0700 (PDT) Received: from x1n.. (bras-base-aurron9127w-grc-62-70-24-86-62.dsl.bell.ca. [70.24.86.62]) by smtp.gmail.com with ESMTPSA id v21-20020a05622a189500b003ef38277445sm1498590qtc.16.2023.05.11.17.31.03 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 11 May 2023 17:31:04 -0700 (PDT) From: Peter Xu <peterx@redhat.com> To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: John Hubbard <jhubbard@nvidia.com>, Jason Gunthorpe <jgg@nvidia.com>, Andrew Morton <akpm@linux-foundation.org>, "Kirill A . Shutemov" <kirill@shutemov.name>, Lorenzo Stoakes <lstoakes@gmail.com>, peterx@redhat.com, David Hildenbrand <david@redhat.com> Subject: [PATCH] mm/gup: Fixes FOLL_UNLOCKABLE against FOLL_NOWAIT Date: Thu, 11 May 2023 20:31:02 -0400 Message-Id: <20230512003102.3149737-1-peterx@redhat.com> X-Mailer: git-send-email 2.39.1 MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1765646367288775774?= X-GMAIL-MSGID: =?utf-8?q?1765646367288775774?= |
Series |
mm/gup: Fixes FOLL_UNLOCKABLE against FOLL_NOWAIT
|
|
Commit Message
Peter Xu
May 12, 2023, 12:31 a.m. UTC
This is a follow up on f04740f54594 ("mm/gup: add FOLL_UNLOCKABLE").
FOLL_NOWAIT is the gup alias of FAULT_FLAG_RETRY_NOWAIT, which means when
FOLL_NOWAIT is set we definitely don't want to release the mmap read lock
when faulting. It's against the meaning of the newly introduced flag
FOLL_UNLOCKABLE.
E.g., with current code we could at last have FAULT_FLAG_RETRY_NOWAIT set
even if with a FOLL_UNLOCKABLE gup which doesn't make a lot of sense.
Code-wise, it _seems_ all still fine, because when NOWAIT+UNLOCKABLE both
set it'll be the same as old NOWAIT plus FAULT_FLAG_KILLABLE (since luckily
both of them leverage ALLOW_RETRY OTOH), which I don't see a major issue so
far. So not copying stable or attaching fixes, as there's no immediate
issue found. Still better clarify the use.
Since at it, the same commit added unconditional FOLL_UNLOCKABLE in
faultin_vma_page_range(), which is code-wise correct becuase the helper
only has one user right now and it always has "locked" set. However it can
be abused if someone reuse faultin_vma_page_range() in other call sites in
the future. Add a sanity check for that, also add the missing comment for
UNLOCKABLE.
Cc: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
---
This is something I found when I was reading the code alongside only. I
hope I didn't miss something.
---
mm/gup.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
Comments
On Thu, May 11, 2023 at 08:31:02PM -0400, Peter Xu wrote: > E.g., with current code we could at last have FAULT_FLAG_RETRY_NOWAIT set > even if with a FOLL_UNLOCKABLE gup which doesn't make a lot of > sense. I would say NOWAIT and UNLOCKABLE are different things. UNLOCKABLE says the mmap sem is allowed to be unlocked, which is true, and NOWAIT says it shouldn't "wait" (which is something more nebulous than just sleep). In FOLL_ flag terms it would be fine if the mmap sem was unlocked while doing NOWAIT - even though the fault hanlder will not doe this. The only caller is fine with this too. !UNLOCKABLE literally means not to ever drop the mmap lock which is not something KVM needs at all. So I'd say it is fine as is. A caller should never assume that calling an unlocked function or passing null locked means that the mmap sem won't be unlocked while running indirectly because of other GUP flags. If it wants this behavior it needs to ask for it explicitly with a locked GUP call and a NULL locked. > Since at it, the same commit added unconditional FOLL_UNLOCKABLE in > faultin_vma_page_range(), which is code-wise correct becuase the helper > only has one user right now and it always has "locked" set. Not quite, it is correct because that is the API contract of this function. The caller must provide a non-NULL locked and non-NULL locked at the external interfaces always mean it can be unlocked while running. > However it can be abused if someone reuse faultin_vma_page_range() > in other call sites in the future. Add a sanity check for that, > also add the missing comment for UNLOCKABLE. Then we have bigger problems because the API has become confusing if a non-NULL locked somehow means 'don't ever unlock', but only sometimes. Jason
On Fri, May 12, 2023 at 02:06:36PM -0300, Jason Gunthorpe wrote: > On Thu, May 11, 2023 at 08:31:02PM -0400, Peter Xu wrote: > > > E.g., with current code we could at last have FAULT_FLAG_RETRY_NOWAIT set > > even if with a FOLL_UNLOCKABLE gup which doesn't make a lot of > > sense. > > I would say NOWAIT and UNLOCKABLE are different things. UNLOCKABLE > says the mmap sem is allowed to be unlocked, which is true, and NOWAIT > says it shouldn't "wait" (which is something more nebulous than just > sleep). In FOLL_ flag terms it would be fine if the mmap sem was > unlocked while doing NOWAIT - even though the fault hanlder will not > doe this. > > The only caller is fine with this too. > > !UNLOCKABLE literally means not to ever drop the mmap lock which is > not something KVM needs at all. The problem is FOLL_NOWAIT implies FAULT_FLAG_RETRY_NOWAIT internally. Then we'll have FAULT_FLAG_RETRY_NOWAIT+FAULT_FLAG_KILLABLE which makes it very confusing, because RETRY_NOWAIT means we never release mmap lock or retry, then KILL means "if we wait, allow us to be killed". Considering FOLL_UNLOCKABLE is an internal flag while FOLL_NOWAIT a public (even if only with a single caller...), I'd still think it makes more sense and cleaner to just remove FOLL_UNLOCKABLE if FOLL_NOWAIT, no? Again, nothing to blame for previous commit (I explained in the commit message too that we don't need fixes, but simply a cleanup), but it seems removing this confusion of NOWAIT+UNLOCKABLE could be helpful to me. > > So I'd say it is fine as is. A caller should never assume that calling > an unlocked function or passing null locked means that the mmap sem > won't be unlocked while running indirectly because of other GUP > flags. If it wants this behavior it needs to ask for it explicitly > with a locked GUP call and a NULL locked. > > > Since at it, the same commit added unconditional FOLL_UNLOCKABLE in > > faultin_vma_page_range(), which is code-wise correct becuase the helper > > only has one user right now and it always has "locked" set. > > Not quite, it is correct because that is the API contract of this > function. The caller must provide a non-NULL locked and non-NULL > locked at the external interfaces always mean it can be unlocked while > running. Hmm yes, that's the contract. But then it makes more sense to assert on that contract (by checking locked)? How about I rework the commit message but keep the change (which literally only add the assertion)?
On Mon, May 15, 2023 at 11:48:17AM -0400, Peter Xu wrote: > The problem is FOLL_NOWAIT implies FAULT_FLAG_RETRY_NOWAIT internally. > > Then we'll have FAULT_FLAG_RETRY_NOWAIT+FAULT_FLAG_KILLABLE which makes it > very confusing, because RETRY_NOWAIT means we never release mmap lock or > retry, then KILL means "if we wait, allow us to be killed". I don't know if it is so confusing, the flags still make sense when composed together even if one is a NOP. > Considering FOLL_UNLOCKABLE is an internal flag while FOLL_NOWAIT a public > (even if only with a single caller...), I'd still think it makes more sense > and cleaner to just remove FOLL_UNLOCKABLE if FOLL_NOWAIT, no? I don't really like it.. The FOLL_ flags are supposed to be statements about what the caller is expecting. In this context the caller is clearly perfectly happy with unlocking the mmap sem during operation. It should set the flag. That the underlying code can't possibly do that when FOLL_NOWAIT is set too doesn't really matter to the caller. If there is something that needs tidying I'd say it is adjusting the FAULT_FLAG logic to not make the combination, but I don't think it is actually that confusing. "don't sleep" & "allow kill if you do sleep" are still logically combinable operations. Jason
diff --git a/mm/gup.c b/mm/gup.c index 90d9b65ff35c..202097627667 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1621,6 +1621,9 @@ long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start, VM_BUG_ON_VMA(end > vma->vm_end, vma); mmap_assert_locked(mm); + /* We'll do unconditional FOLL_UNLOCKABLE */ + VM_WARN_ON_ONCE(!locked); + /* * FOLL_TOUCH: Mark page accessed and thereby young; will also mark * the page dirty with FOLL_WRITE -- which doesn't make a @@ -1629,6 +1632,7 @@ long faultin_vma_page_range(struct vm_area_struct *vma, unsigned long start, * FOLL_HWPOISON: Return -EHWPOISON instead of -EFAULT when we hit * a poisoned page. * !FOLL_FORCE: Require proper access permissions. + * FOLL_UNLOCKABLE: Allow the fault to unlock mmap read lock */ gup_flags = FOLL_TOUCH | FOLL_HWPOISON | FOLL_UNLOCKABLE; if (write) @@ -2334,10 +2338,13 @@ EXPORT_SYMBOL(get_user_pages); long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages, struct page **pages, unsigned int gup_flags) { + unsigned int extra = FOLL_TOUCH; int locked = 0; - if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, - FOLL_TOUCH | FOLL_UNLOCKABLE)) + if (!(gup_flags & FOLL_NOWAIT)) + extra |= FOLL_UNLOCKABLE; + + if (!is_valid_gup_args(pages, NULL, NULL, &gup_flags, extra)) return -EINVAL; return __get_user_pages_locked(current->mm, start, nr_pages, pages,