Message ID | 20221107161740.144456-17-david@redhat.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:6687:0:0:0:0:0 with SMTP id l7csp2152242wru; Mon, 7 Nov 2022 08:22:58 -0800 (PST) X-Google-Smtp-Source: AA0mqf5jALfPGW//EuWn3DE5FtYayvsIBqlKkDOg+IT1dihvfKcLNEB28SwNdnjeuGGU6NVQWMAf X-Received: by 2002:a17:902:a98a:b0:188:86ff:a888 with SMTP id bh10-20020a170902a98a00b0018886ffa888mr2298358plb.124.1667838178442; Mon, 07 Nov 2022 08:22:58 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1667838178; cv=none; d=google.com; s=arc-20160816; b=HurVTHLFDDnmphF++xWHxahr4PPxQzoItbJSZEhHubZFddJreCMZnSd3ADyUACS8Ix OlVMtjOohsTXM6wCW6A2UJdqq3FQuFXHqetRprMCkloAbr+sfMXYKS94vCx/R6s6wtwy TUrtAWv8CYhp9qyvi2puETH+BPc7BS7cdpaQjWNhp4C0v2TqL+luACJiogok3BzO2FQf cfm0z9QQ3XAkdbNW8FETk6Ytrj7W+CNQq8NwZ+py2zPv6dvp+xV0kAiUTZQF/0V0Mlzb NK/eBPsfkS/K0nqsgi5sP1z4/Jqg599ywIii0ucFgr9stS5S5Sa/tRaXpETGdchGxKdt 1OiA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=UbbCBm1eBZ2zIHN3XWo9t4J+cAWKhV3lkE1Len3Ewko=; b=v/XEWJ51C0rYSh2QcGogKaiortSBtabukF7RDnVOxcoOP8OrbX+66dLtnUEsJ8g/Lf RpuoB20t++lkxf8MsJ7fry4TetP44ii0JwDM9VGp3cwSJ7zMxdkafY+Id3ry9vBtz56x yZkA9hWlOSdv/dmG/bNKlxzp6+SD8k3S/ZUN1pDB6GKNyrqOErFOAeamgh6R4C9i7vmz BMyuOHz76j4sU7ZRxBm91SGGSQZQS1sg7YSqhWUbl2m/FSg6AK9N/1qwjqrPb/4hCU2P hcZcVyWasdez8ldqaYXDDrMFaYa6VAaSTmEL1NyZikO0lOwanlraUbuv6lv6KKCLKUVN 5iFg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Kium0O9P; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id b12-20020a170902bd4c00b00186c56d9829si1019951plx.328.2022.11.07.08.22.44; Mon, 07 Nov 2022 08:22:58 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=Kium0O9P; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231966AbiKGQV5 (ORCPT <rfc822;hjfbswb@gmail.com> + 99 others); Mon, 7 Nov 2022 11:21:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43286 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232040AbiKGQVA (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Mon, 7 Nov 2022 11:21:00 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 23A2F17898 for <linux-kernel@vger.kernel.org>; Mon, 7 Nov 2022 08:19:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1667837977; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=UbbCBm1eBZ2zIHN3XWo9t4J+cAWKhV3lkE1Len3Ewko=; b=Kium0O9PEHW90zf1HwDFYb853eSD+zF1isjhXq1W2ip2+VpRBGxv+eo6qWo9A5Kl9jOTO3 7jLmfALRLKkyec+nd8EddVCifgphCanYxHwu5GTTANfwt0p4f76D0wtb/YPTfBcQ4P7JtJ UEKX1OAuhKzJwrIBFjZJgU3IB2pvwRQ= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-271-EWhylD7YPUCsp45VLDkKVw-1; Mon, 07 Nov 2022 11:19:33 -0500 X-MC-Unique: EWhylD7YPUCsp45VLDkKVw-1 Received: from smtp.corp.redhat.com (int-mx09.intmail.prod.int.rdu2.redhat.com [10.11.54.9]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id D5D6038149A7; Mon, 7 Nov 2022 16:19:31 +0000 (UTC) Received: from t480s.redhat.com (unknown [10.39.195.106]) by smtp.corp.redhat.com (Postfix) with ESMTP id 421834B3FC6; Mon, 7 Nov 2022 16:19:26 +0000 (UTC) From: David Hildenbrand <david@redhat.com> To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, etnaviv@lists.freedesktop.org, dri-devel@lists.freedesktop.org, linux-arm-kernel@lists.infradead.org, linux-samsung-soc@vger.kernel.org, linux-rdma@vger.kernel.org, linux-media@vger.kernel.org, linux-kselftest@vger.kernel.org, David Hildenbrand <david@redhat.com>, Linus Torvalds <torvalds@linux-foundation.org>, Andrew Morton <akpm@linux-foundation.org>, Jason Gunthorpe <jgg@ziepe.ca>, John Hubbard <jhubbard@nvidia.com>, Peter Xu <peterx@redhat.com>, Greg Kroah-Hartman <gregkh@linuxfoundation.org>, Andrea Arcangeli <aarcange@redhat.com>, Hugh Dickins <hughd@google.com>, Nadav Amit <namit@vmware.com>, Vlastimil Babka <vbabka@suse.cz>, Matthew Wilcox <willy@infradead.org>, Mike Kravetz <mike.kravetz@oracle.com>, Muchun Song <songmuchun@bytedance.com>, Lucas Stach <l.stach@pengutronix.de>, David Airlie <airlied@gmail.com>, Oded Gabbay <ogabbay@kernel.org>, Arnd Bergmann <arnd@arndb.de>, Tomasz Figa <tfiga@chromium.org>, Marek Szyprowski <m.szyprowski@samsung.com>, Mauro Carvalho Chehab <mchehab@kernel.org> Subject: [PATCH RFC 16/19] mm/frame-vector: remove FOLL_FORCE usage Date: Mon, 7 Nov 2022 17:17:37 +0100 Message-Id: <20221107161740.144456-17-david@redhat.com> In-Reply-To: <20221107161740.144456-1-david@redhat.com> References: <20221107161740.144456-1-david@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 3.1 on 10.11.54.9 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1748855086108647234?= X-GMAIL-MSGID: =?utf-8?q?1748855086108647234?= |
Series |
mm/gup: remove FOLL_FORCE usage from drivers (reliable R/O long-term pinning)
|
|
Commit Message
David Hildenbrand
Nov. 7, 2022, 4:17 p.m. UTC
FOLL_FORCE is really only for debugger access. According to commit
707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always
writable"), the pinned pages are always writable.
FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove
it.
Cc: Tomasz Figa <tfiga@chromium.org>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
drivers/media/common/videobuf2/frame_vector.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Comments
Hi David, On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: > > FOLL_FORCE is really only for debugger access. According to commit > 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always > writable"), the pinned pages are always writable. Actually that patch is only a workaround to temporarily disable support for read-only pages as they seemed to suffer from some corruption issues in the retrieved user pages. We expect to support read-only pages as hardware input after. That said, FOLL_FORCE doesn't sound like the right thing even in that case, but I don't know the background behind it being added here in the first place. +Hans Verkuil +Marek Szyprowski do you happen to remember anything about it? Best regards, Tomasz > > FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove > it. > > Cc: Tomasz Figa <tfiga@chromium.org> > Cc: Marek Szyprowski <m.szyprowski@samsung.com> > Cc: Mauro Carvalho Chehab <mchehab@kernel.org> > Signed-off-by: David Hildenbrand <david@redhat.com> > --- > drivers/media/common/videobuf2/frame_vector.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c > index 542dde9d2609..062e98148c53 100644 > --- a/drivers/media/common/videobuf2/frame_vector.c > +++ b/drivers/media/common/videobuf2/frame_vector.c > @@ -50,7 +50,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, > start = untagged_addr(start); > > ret = pin_user_pages_fast(start, nr_frames, > - FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, > + FOLL_WRITE | FOLL_LONGTERM, > (struct page **)(vec->ptrs)); > if (ret > 0) { > vec->got_ref = true; > -- > 2.38.1 >
On 08.11.22 05:45, Tomasz Figa wrote: > Hi David, Hi Tomasz, thanks for looking into this! > > On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >> >> FOLL_FORCE is really only for debugger access. According to commit >> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >> writable"), the pinned pages are always writable. > As reference, the cover letter of the series: https://lkml.kernel.org/r/20221107161740.144456-1-david@redhat.com > Actually that patch is only a workaround to temporarily disable > support for read-only pages as they seemed to suffer from some > corruption issues in the retrieved user pages. We expect to support > read-only pages as hardware input after. That said, FOLL_FORCE doesn't > sound like the right thing even in that case, but I don't know the > background behind it being added here in the first place. +Hans > Verkuil +Marek Szyprowski do you happen to remember anything about it? Maybe I mis-interpreted 707947247e95; re-reading it again, I am not quite sure what the actual problem is and how it relates to GUP FOLL_WRITE handling. FOLL_FORCE already was in place before 707947247e95 and should be removed. What I understood is "Just always call vb2_create_framevec() with FOLL_WRITE to always allow writing to the buffers.". If the pinned page is never written to via the obtained GUP reference: * FOLL_WRITE should not be set * FOLL_FORCE should not be set * We should not dirty the page when unpinning If the pinned page may be written to via the obtained GUP reference: * FOLL_WRITE should be set * FOLL_FORCE should not be set * We should dirty the page when unpinning If the function is called for both, we should think about doing it conditional based on a "write" variable, like pre-707947247e95 did. @Hans, any insight?
Hi Tomasz, David, On 11/8/22 05:45, Tomasz Figa wrote: > Hi David, > > On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >> >> FOLL_FORCE is really only for debugger access. According to commit >> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >> writable"), the pinned pages are always writable. > > Actually that patch is only a workaround to temporarily disable > support for read-only pages as they seemed to suffer from some > corruption issues in the retrieved user pages. We expect to support > read-only pages as hardware input after. That said, FOLL_FORCE doesn't > sound like the right thing even in that case, but I don't know the > background behind it being added here in the first place. +Hans > Verkuil +Marek Szyprowski do you happen to remember anything about it? I tracked the use of 'force' all the way back to the first git commit (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the reason is lost in the mists of time. I'm not sure if the 'force' argument of get_user_pages() at that time even meant the same as FOLL_FORCE today. From what I can tell it has just been faithfully used ever since, but I have my doubt that anyone understands the reason behind it since it was never explained. Looking at this old LWN article https://lwn.net/Articles/28548/ suggests that it might be related to calling get_user_pages for write buffers (non-zero write argument) where you also want to be able to read from the buffer. That is certainly something that some drivers need to do post-capture fixups. But 'force' was also always set for read buffers, and I don't know if that was something that was actually needed, or just laziness. I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still allow drivers to read from the buffer? Regards, Hans > > Best regards, > Tomasz > >> >> FOLL_FORCE in this case seems to be a legacy leftover. Let's just remove >> it. >> >> Cc: Tomasz Figa <tfiga@chromium.org> >> Cc: Marek Szyprowski <m.szyprowski@samsung.com> >> Cc: Mauro Carvalho Chehab <mchehab@kernel.org> >> Signed-off-by: David Hildenbrand <david@redhat.com> >> --- >> drivers/media/common/videobuf2/frame_vector.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c >> index 542dde9d2609..062e98148c53 100644 >> --- a/drivers/media/common/videobuf2/frame_vector.c >> +++ b/drivers/media/common/videobuf2/frame_vector.c >> @@ -50,7 +50,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, >> start = untagged_addr(start); >> >> ret = pin_user_pages_fast(start, nr_frames, >> - FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, >> + FOLL_WRITE | FOLL_LONGTERM, >> (struct page **)(vec->ptrs)); >> if (ret > 0) { >> vec->got_ref = true; >> -- >> 2.38.1 >>
On 22.11.22 13:25, Hans Verkuil wrote: > Hi Tomasz, David, > > On 11/8/22 05:45, Tomasz Figa wrote: >> Hi David, >> >> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >>> >>> FOLL_FORCE is really only for debugger access. According to commit >>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >>> writable"), the pinned pages are always writable. >> >> Actually that patch is only a workaround to temporarily disable >> support for read-only pages as they seemed to suffer from some >> corruption issues in the retrieved user pages. We expect to support >> read-only pages as hardware input after. That said, FOLL_FORCE doesn't >> sound like the right thing even in that case, but I don't know the >> background behind it being added here in the first place. +Hans >> Verkuil +Marek Szyprowski do you happen to remember anything about it? > > I tracked the use of 'force' all the way back to the first git commit > (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the > reason is lost in the mists of time. > > I'm not sure if the 'force' argument of get_user_pages() at that time > even meant the same as FOLL_FORCE today. From what I can tell it has just > been faithfully used ever since, but I have my doubt that anyone understands > the reason behind it since it was never explained. > > Looking at this old LWN article https://lwn.net/Articles/28548/ suggests > that it might be related to calling get_user_pages for write buffers > (non-zero write argument) where you also want to be able to read from the > buffer. That is certainly something that some drivers need to do post-capture > fixups. > > But 'force' was also always set for read buffers, and I don't know if that > was something that was actually needed, or just laziness. > > I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still > allow drivers to read from the buffer? Yes. The only problematic corner case I can imagine is if someone has a VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin user space pages as a read buffer. We'd specify now FOLL_WRITE without FOLL_FORCE and GUP would reject that: write access without write permissions is invalid. There would be no way around "fixing" this implementation to not specify FOLL_WRITE when only reading from user-space pages. Not sure what the implications are regarding that corruption that was mentioned in 707947247e95. Having said that, I assume such a scenario is unlikely -- but you might know better how user space usually uses this interface. There would be three options: 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and leave the implementation as is for now. 3) Remove FOLL_FORCE and fixup the implementation to only specify FOLL_WRITE if the pages will actually get written to. 3) would most probably ideal, however, I am no expert on that code and can't do it (707947247e95 confuses me). So naive me would go with 2) first.
On 11/22/22 13:38, David Hildenbrand wrote: > On 22.11.22 13:25, Hans Verkuil wrote: >> Hi Tomasz, David, >> >> On 11/8/22 05:45, Tomasz Figa wrote: >>> Hi David, >>> >>> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >>>> >>>> FOLL_FORCE is really only for debugger access. According to commit >>>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >>>> writable"), the pinned pages are always writable. >>> >>> Actually that patch is only a workaround to temporarily disable >>> support for read-only pages as they seemed to suffer from some >>> corruption issues in the retrieved user pages. We expect to support >>> read-only pages as hardware input after. That said, FOLL_FORCE doesn't >>> sound like the right thing even in that case, but I don't know the >>> background behind it being added here in the first place. +Hans >>> Verkuil +Marek Szyprowski do you happen to remember anything about it? >> >> I tracked the use of 'force' all the way back to the first git commit >> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the >> reason is lost in the mists of time. >> >> I'm not sure if the 'force' argument of get_user_pages() at that time >> even meant the same as FOLL_FORCE today. From what I can tell it has just >> been faithfully used ever since, but I have my doubt that anyone understands >> the reason behind it since it was never explained. >> >> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests >> that it might be related to calling get_user_pages for write buffers >> (non-zero write argument) where you also want to be able to read from the >> buffer. That is certainly something that some drivers need to do post-capture >> fixups. >> >> But 'force' was also always set for read buffers, and I don't know if that >> was something that was actually needed, or just laziness. >> >> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still >> allow drivers to read from the buffer? > > Yes. The only problematic corner case I can imagine is if someone has a > VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin > user space pages as a read buffer. We'd specify now FOLL_WRITE without > FOLL_FORCE and GUP would reject that: write access without write > permissions is invalid. I do not believe this will be an issue. > > There would be no way around "fixing" this implementation to not specify > FOLL_WRITE when only reading from user-space pages. Not sure what the > implications are regarding that corruption that was mentioned in > 707947247e95. Before 707947247e95 the FOLL_WRITE flag was only set for write buffers (i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output, DMA_TO_DEVICE). In the video output case there should never be any need for drivers to write to the buffer to the best of my knowledge. But I have had some complaints about that commit that it causes problems in some scenarios, and it has been on my todo list for quite some time now to dig deeper into this. I probably should prioritize this for this or next week. > > Having said that, I assume such a scenario is unlikely -- but you might > know better how user space usually uses this interface. There would be > three options: > > 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. > 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and > leave the implementation as is for now. > 3) Remove FOLL_FORCE and fixup the implementation to only specify > FOLL_WRITE if the pages will actually get written to. > > 3) would most probably ideal, however, I am no expert on that code and > can't do it (707947247e95 confuses me). So naive me would go with 2) first. > Option 3 would be best. And 707947247e95 confuses me as well, and I actually wrote it :-) I am wondering whether it was addressed at the right level, but as I said, I need to dig a bit deeper into this. Regards, Hans
On 22.11.22 15:07, Hans Verkuil wrote: > On 11/22/22 13:38, David Hildenbrand wrote: >> On 22.11.22 13:25, Hans Verkuil wrote: >>> Hi Tomasz, David, >>> >>> On 11/8/22 05:45, Tomasz Figa wrote: >>>> Hi David, >>>> >>>> On Tue, Nov 8, 2022 at 1:19 AM David Hildenbrand <david@redhat.com> wrote: >>>>> >>>>> FOLL_FORCE is really only for debugger access. According to commit >>>>> 707947247e95 ("media: videobuf2-vmalloc: get_userptr: buffers are always >>>>> writable"), the pinned pages are always writable. >>>> >>>> Actually that patch is only a workaround to temporarily disable >>>> support for read-only pages as they seemed to suffer from some >>>> corruption issues in the retrieved user pages. We expect to support >>>> read-only pages as hardware input after. That said, FOLL_FORCE doesn't >>>> sound like the right thing even in that case, but I don't know the >>>> background behind it being added here in the first place. +Hans >>>> Verkuil +Marek Szyprowski do you happen to remember anything about it? >>> >>> I tracked the use of 'force' all the way back to the first git commit >>> (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the >>> reason is lost in the mists of time. >>> >>> I'm not sure if the 'force' argument of get_user_pages() at that time >>> even meant the same as FOLL_FORCE today. From what I can tell it has just >>> been faithfully used ever since, but I have my doubt that anyone understands >>> the reason behind it since it was never explained. >>> >>> Looking at this old LWN article https://lwn.net/Articles/28548/ suggests >>> that it might be related to calling get_user_pages for write buffers >>> (non-zero write argument) where you also want to be able to read from the >>> buffer. That is certainly something that some drivers need to do post-capture >>> fixups. >>> >>> But 'force' was also always set for read buffers, and I don't know if that >>> was something that was actually needed, or just laziness. >>> >>> I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still >>> allow drivers to read from the buffer? >> >> Yes. The only problematic corner case I can imagine is if someone has a >> VMA without write permissions (no PROT_WRITE/VM_WRITE) and wants to pin >> user space pages as a read buffer. We'd specify now FOLL_WRITE without >> FOLL_FORCE and GUP would reject that: write access without write >> permissions is invalid. > > I do not believe this will be an issue. > >> >> There would be no way around "fixing" this implementation to not specify >> FOLL_WRITE when only reading from user-space pages. Not sure what the >> implications are regarding that corruption that was mentioned in >> 707947247e95. > > Before 707947247e95 the FOLL_WRITE flag was only set for write buffers > (i.e. video capture, DMA_FROM_DEVICE), not for read buffers (video output, > DMA_TO_DEVICE). In the video output case there should never be any need > for drivers to write to the buffer to the best of my knowledge. > > But I have had some complaints about that commit that it causes problems > in some scenarios, and it has been on my todo list for quite some time now > to dig deeper into this. I probably should prioritize this for this or > next week. > >> >> Having said that, I assume such a scenario is unlikely -- but you might >> know better how user space usually uses this interface. There would be >> three options: >> >> 1) Leave the FOLL_FORCE hack in for now, which I *really* want to avoid. >> 2) Remove FOLL_FORCE and see if anybody even notices (this patch) and >> leave the implementation as is for now. >> 3) Remove FOLL_FORCE and fixup the implementation to only specify >> FOLL_WRITE if the pages will actually get written to. >> >> 3) would most probably ideal, however, I am no expert on that code and >> can't do it (707947247e95 confuses me). So naive me would go with 2) first. >> > > Option 3 would be best. And 707947247e95 confuses me as well, and I actually > wrote it :-) I am wondering whether it was addressed at the right level, but > as I said, I need to dig a bit deeper into this. Cool, let me know if I can help!
On Tue, Nov 22, 2022 at 4:25 AM Hans Verkuil <hverkuil@xs4all.nl> wrote: > > I tracked the use of 'force' all the way back to the first git commit > (2.6.12-rc1) in the very old video-buf.c. So it is very, very old and the > reason is lost in the mists of time. Well, not entirely. For archaeology reasons, I went back to the old BK history, which exists as a git conversion in https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/ and there you can actually find it. Not with a lot of explanations, though - it's commit b7649ef789 ("[PATCH] videobuf update"): This updates the video-buf.c module (helper module for video buffer management). Some memory management fixes, also some adaptions to the final v4l2 api. but it went from err = get_user_pages(current,current->mm, - data, dma->nr_pages, - rw == READ, 0, /* don't force */ + data & PAGE_MASK, dma->nr_pages, + rw == READ, 1, /* force */ dma->pages, NULL); in that commit. So it goes back to October 2002. > Looking at this old LWN article https://lwn.net/Articles/28548/ suggests > that it might be related to calling get_user_pages for write buffers The timing roughly matches. > I assume that removing FOLL_FORCE from 'FOLL_FORCE|FOLL_WRITE' will still > allow drivers to read from the buffer? The issue with some of the driver hacks has been that - they only want to read, and the buffer may be read-only - they then used FOLL_WRITE despite that, because they want to break COW (due to the issue that David is now fixing with his series) - but that means that the VM layer says "nope, you can't write to this read-only user mapping" - ... and then they use FOLL_FORCE to say "yes, I can". iOW, the FOLL_FORCE may be entirely due to an (incorrect, but historically needed) FOLL_WRITE. Linus
diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c index 542dde9d2609..062e98148c53 100644 --- a/drivers/media/common/videobuf2/frame_vector.c +++ b/drivers/media/common/videobuf2/frame_vector.c @@ -50,7 +50,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, start = untagged_addr(start); ret = pin_user_pages_fast(start, nr_frames, - FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, + FOLL_WRITE | FOLL_LONGTERM, (struct page **)(vec->ptrs)); if (ret > 0) { vec->got_ref = true;