From patchwork Thu Nov 17 14:54:45 2022
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 21700
Subject: [RFC PATCH 1/4] mm: Move FOLL_* defs to mm_types.h
From: David Howells
To: Al Viro
Cc: Matthew Wilcox, John Hubbard, Christoph Hellwig, Jeff Layton,
    dhowells@redhat.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org
Date: Thu, 17 Nov 2022 14:54:45 +0000
Message-ID: <166869688542.3723671.10243929000823258622.stgit@warthog.procyon.org.uk>
In-Reply-To: <166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk>
References: <166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk>
User-Agent: StGit/1.5

Move FOLL_* definitions to linux/mm_types.h to make them more accessible
without having to drag in all of linux/mm.h and everything that it drags
in too[1].

Suggested-by: Matthew Wilcox
Signed-off-by: David Howells
cc: John Hubbard
cc: Al Viro
cc: linux-mm@kvack.org
cc: linux-fsdevel@vger.kernel.org
Link: https://lore.kernel.org/linux-fsdevel/Y1%2FhSO+7kAJhGShG@casper.infradead.org/ [1]
Reviewed-by: John Hubbard
Reviewed-by: Christoph Hellwig
---
 include/linux/mm.h       | 74 ----------------------------------------
 include/linux/mm_types.h | 73 +++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 73 insertions(+), 74 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8bbcccbc5565..7a7a287818ad 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2941,80 +2941,6 @@ static inline vm_fault_t vmf_error(int err)
 struct page *follow_page(struct vm_area_struct *vma, unsigned long address,
 			 unsigned int foll_flags);
 
-#define FOLL_WRITE	0x01	/* check pte is writable */
-#define FOLL_TOUCH	0x02	/* mark page accessed */
-#define FOLL_GET	0x04	/* do get_page on page */
-#define FOLL_DUMP	0x08	/* give error on hole if it would be zero */
-#define FOLL_FORCE	0x10	/* get_user_pages read/write w/o permission */
-#define FOLL_NOWAIT	0x20	/* if a disk transfer is needed, start the IO
-				 * and return without waiting upon it */
-#define FOLL_NOFAULT	0x80	/* do not fault in pages */
-#define FOLL_HWPOISON	0x100	/* check page is hwpoisoned */
-#define FOLL_MIGRATION	0x400	/* wait for page to replace migration entry */
-#define FOLL_TRIED	0x800	/* a retry, previous pass started an IO */
-#define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
-#define FOLL_ANON	0x8000	/* don't do file mappings */
-#define FOLL_LONGTERM	0x10000	/* mapping lifetime is indefinite: see below */
-#define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
-#define FOLL_PIN	0x40000	/* pages must be released via unpin_user_page */
-#define FOLL_FAST_ONLY	0x80000	/* gup_fast: prevent fall-back to slow gup */
-
-/*
- * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each
- * other. Here is what they mean, and how to use them:
- *
- * FOLL_LONGTERM indicates that the page will be held for an indefinite time
- * period _often_ under userspace control.
This is in contrast to - * iov_iter_get_pages(), whose usages are transient. - * - * FIXME: For pages which are part of a filesystem, mappings are subject to the - * lifetime enforced by the filesystem and we need guarantees that longterm - * users like RDMA and V4L2 only establish mappings which coordinate usage with - * the filesystem. Ideas for this coordination include revoking the longterm - * pin, delaying writeback, bounce buffer page writeback, etc. As FS DAX was - * added after the problem with filesystems was found FS DAX VMAs are - * specifically failed. Filesystem pages are still subject to bugs and use of - * FOLL_LONGTERM should be avoided on those pages. - * - * FIXME: Also NOTE that FOLL_LONGTERM is not supported in every GUP call. - * Currently only get_user_pages() and get_user_pages_fast() support this flag - * and calls to get_user_pages_[un]locked are specifically not allowed. This - * is due to an incompatibility with the FS DAX check and - * FAULT_FLAG_ALLOW_RETRY. - * - * In the CMA case: long term pins in a CMA region would unnecessarily fragment - * that region. And so, CMA attempts to migrate the page before pinning, when - * FOLL_LONGTERM is specified. - * - * FOLL_PIN indicates that a special kind of tracking (not just page->_refcount, - * but an additional pin counting system) will be invoked. This is intended for - * anything that gets a page reference and then touches page data (for example, - * Direct IO). This lets the filesystem know that some non-file-system entity is - * potentially changing the pages' data. In contrast to FOLL_GET (whose pages - * are released via put_page()), FOLL_PIN pages must be released, ultimately, by - * a call to unpin_user_page(). - * - * FOLL_PIN is similar to FOLL_GET: both of these pin pages. They use different - * and separate refcounting mechanisms, however, and that means that each has - * its own acquire and release mechanisms: - * - * FOLL_GET: get_user_pages*() to acquire, and put_page() to release. - * - * FOLL_PIN: pin_user_pages*() to acquire, and unpin_user_pages to release. - * - * FOLL_PIN and FOLL_GET are mutually exclusive for a given function call. - * (The underlying pages may experience both FOLL_GET-based and FOLL_PIN-based - * calls applied to them, and that's perfectly OK. This is a constraint on the - * callers, not on the pages.) - * - * FOLL_PIN should be set internally by the pin_user_pages*() APIs, never - * directly by the caller. That's in order to help avoid mismatches when - * releasing pages: get_user_pages*() pages must be released via put_page(), - * while pin_user_pages*() pages must be released via unpin_user_page(). - * - * Please see Documentation/core-api/pin_user_pages.rst for more information. - */ - static inline int vm_fault_to_errno(vm_fault_t vm_fault, int foll_flags) { if (vm_fault & VM_FAULT_OOM) diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h index 500e536796ca..0c80a5ad6e6a 100644 --- a/include/linux/mm_types.h +++ b/include/linux/mm_types.h @@ -1003,4 +1003,77 @@ enum fault_flag { typedef unsigned int __bitwise zap_flags_t; +/* + * FOLL_PIN and FOLL_LONGTERM may be used in various combinations with each + * other. Here is what they mean, and how to use them: + * + * FOLL_LONGTERM indicates that the page will be held for an indefinite time + * period _often_ under userspace control. This is in contrast to + * iov_iter_get_pages(), whose usages are transient. 
+ * + * FIXME: For pages which are part of a filesystem, mappings are subject to the + * lifetime enforced by the filesystem and we need guarantees that longterm + * users like RDMA and V4L2 only establish mappings which coordinate usage with + * the filesystem. Ideas for this coordination include revoking the longterm + * pin, delaying writeback, bounce buffer page writeback, etc. As FS DAX was + * added after the problem with filesystems was found FS DAX VMAs are + * specifically failed. Filesystem pages are still subject to bugs and use of + * FOLL_LONGTERM should be avoided on those pages. + * + * FIXME: Also NOTE that FOLL_LONGTERM is not supported in every GUP call. + * Currently only get_user_pages() and get_user_pages_fast() support this flag + * and calls to get_user_pages_[un]locked are specifically not allowed. This + * is due to an incompatibility with the FS DAX check and + * FAULT_FLAG_ALLOW_RETRY. + * + * In the CMA case: long term pins in a CMA region would unnecessarily fragment + * that region. And so, CMA attempts to migrate the page before pinning, when + * FOLL_LONGTERM is specified. + * + * FOLL_PIN indicates that a special kind of tracking (not just page->_refcount, + * but an additional pin counting system) will be invoked. This is intended for + * anything that gets a page reference and then touches page data (for example, + * Direct IO). This lets the filesystem know that some non-file-system entity is + * potentially changing the pages' data. In contrast to FOLL_GET (whose pages + * are released via put_page()), FOLL_PIN pages must be released, ultimately, by + * a call to unpin_user_page(). + * + * FOLL_PIN is similar to FOLL_GET: both of these pin pages. They use different + * and separate refcounting mechanisms, however, and that means that each has + * its own acquire and release mechanisms: + * + * FOLL_GET: get_user_pages*() to acquire, and put_page() to release. + * + * FOLL_PIN: pin_user_pages*() to acquire, and unpin_user_pages to release. + * + * FOLL_PIN and FOLL_GET are mutually exclusive for a given function call. + * (The underlying pages may experience both FOLL_GET-based and FOLL_PIN-based + * calls applied to them, and that's perfectly OK. This is a constraint on the + * callers, not on the pages.) + * + * FOLL_PIN should be set internally by the pin_user_pages*() APIs, never + * directly by the caller. That's in order to help avoid mismatches when + * releasing pages: get_user_pages*() pages must be released via put_page(), + * while pin_user_pages*() pages must be released via unpin_user_page(). + * + * Please see Documentation/core-api/pin_user_pages.rst for more information. 
+ */
+#define FOLL_WRITE	0x01	/* check pte is writable */
+#define FOLL_TOUCH	0x02	/* mark page accessed */
+#define FOLL_GET	0x04	/* do get_page on page */
+#define FOLL_DUMP	0x08	/* give error on hole if it would be zero */
+#define FOLL_FORCE	0x10	/* get_user_pages read/write w/o permission */
+#define FOLL_NOWAIT	0x20	/* if a disk transfer is needed, start the IO
+				 * and return without waiting upon it */
+#define FOLL_NOFAULT	0x80	/* do not fault in pages */
+#define FOLL_HWPOISON	0x100	/* check page is hwpoisoned */
+#define FOLL_MIGRATION	0x400	/* wait for page to replace migration entry */
+#define FOLL_TRIED	0x800	/* a retry, previous pass started an IO */
+#define FOLL_REMOTE	0x2000	/* we are working on non-current tsk/mm */
+#define FOLL_ANON	0x8000	/* don't do file mappings */
+#define FOLL_LONGTERM	0x10000	/* mapping lifetime is indefinite: see below */
+#define FOLL_SPLIT_PMD	0x20000	/* split huge pmd before returning */
+#define FOLL_PIN	0x40000	/* pages must be released via unpin_user_page */
+#define FOLL_FAST_ONLY	0x80000	/* gup_fast: prevent fall-back to slow gup */
+
 #endif /* _LINUX_MM_TYPES_H */
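[With the flags living in linux/mm_types.h, a header-only consumer of the
FOLL_* values no longer needs the full linux/mm.h include chain.  The
fragment below is a hypothetical illustration of that effect, not part of
the patch; the header guard and helper name are invented.]

/* hypothetical-example.h - illustration only, not part of the patch set */
#ifndef _HYPOTHETICAL_EXAMPLE_H
#define _HYPOTHETICAL_EXAMPLE_H

#include <linux/mm_types.h>	/* FOLL_* are visible here; no need for mm.h */

/* Decide how pages obtained with these flags should later be released. */
static inline bool example_cleanup_is_unpin(unsigned int gup_flags)
{
	return gup_flags & FOLL_PIN;
}

#endif /* _HYPOTHETICAL_EXAMPLE_H */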
From patchwork Thu Nov 17 14:54:54 2022
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 21701
Subject: [RFC PATCH 2/4] iov_iter: Add a function to extract a page list from an iterator
From: David Howells
To: Al Viro
Cc: Christoph Hellwig, John Hubbard, Matthew Wilcox, Jeff Layton,
    dhowells@redhat.com, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Date: Thu, 17 Nov 2022 14:54:54 +0000
Message-ID: <166869689451.3723671.18242195992447653092.stgit@warthog.procyon.org.uk>
In-Reply-To: <166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk>
References: <166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk>
User-Agent: StGit/1.5

Add a function, iov_iter_extract_pages(), to extract a list of pages from
an iterator.  The pages may be returned with a reference added or a pin
added or neither, depending on the type of iterator and the direction of
transfer.

An additional function, iov_iter_extract_mode(), is also provided so that
the mode of retention that will be employed for an iterator can be
queried - and therefore how the caller should dispose of the pages later.

There are three cases:

 (1) Transfer *into* an ITER_IOVEC or ITER_UBUF iterator.

     Extracted pages will have pins obtained on them (but not references)
     so that fork() doesn't CoW the pages incorrectly whilst the I/O is in
     progress.

     iov_iter_extract_mode() will return FOLL_PIN for this case.  The
     caller should use something like unpin_user_page() to dispose of the
     page.

 (2) Transfer is *out of* an ITER_IOVEC or ITER_UBUF iterator.

     Extracted pages will have references obtained on them, but not pins.

     iov_iter_extract_mode() will return FOLL_GET.  The caller should use
     something like put_page() for page disposal.

 (3) Any other sort of iterator.

     No refs or pins are obtained on the pages; the assumption is made
     that the caller will manage page retention.

     iov_iter_extract_mode() will return 0.  The pages don't need
     additional disposal.

A sketch of how a caller might consume these three modes is shown below.
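[For illustration only - this is not part of the patch.  A rough sketch of
how a caller could pair iov_iter_extract_pages() with
iov_iter_extract_mode() to release pages correctly for each of the three
cases; the function name and the page-count cap are invented, and the page
array is assumed to be allocated by the helper and freed with kvfree().]

/* Hypothetical caller sketch: extract part of the iterator for an I/O and
 * release the pages afterwards according to the retention mode.
 */
static ssize_t example_do_io(struct iov_iter *iter)
{
	struct page **pages = NULL;	/* NULL: let the helper allocate it */
	unsigned int cleanup = iov_iter_extract_mode(iter);
	size_t offset;
	ssize_t len;
	int i, npages;

	len = iov_iter_extract_pages(iter, &pages, iov_iter_count(iter),
				     256, &offset);
	if (len <= 0)
		return len;
	npages = DIV_ROUND_UP(offset + len, PAGE_SIZE);

	/* ... hand pages[0..npages) plus offset to the transport here ... */

	for (i = 0; i < npages; i++) {
		if (cleanup == FOLL_PIN)
			unpin_user_page(pages[i]);	/* case (1) */
		else if (cleanup == FOLL_GET)
			put_page(pages[i]);		/* case (2) */
		/* case (3): nothing was taken, nothing to drop */
	}
	kvfree(pages);		/* array was allocated by the helper */
	return len;
}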
Signed-off-by: David Howells cc: Al Viro cc: Christoph Hellwig cc: John Hubbard cc: Matthew Wilcox cc: linux-fsdevel@vger.kernel.org cc: linux-mm@kvack.org Link: https://lore.kernel.org/r/166722777971.2555743.12953624861046741424.stgit@warthog.procyon.org.uk/ --- include/linux/uio.h | 29 ++++ lib/iov_iter.c | 333 +++++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 362 insertions(+) diff --git a/include/linux/uio.h b/include/linux/uio.h index 2e3134b14ffd..329e36d41f0a 100644 --- a/include/linux/uio.h +++ b/include/linux/uio.h @@ -351,4 +351,33 @@ static inline void iov_iter_ubuf(struct iov_iter *i, unsigned int direction, }; } +ssize_t iov_iter_extract_pages(struct iov_iter *i, struct page ***pages, + size_t maxsize, unsigned int maxpages, + size_t *offset0); + +/** + * iov_iter_extract_mode - Indicate how pages from the iterator will be retained + * @iter: The iterator + * + * Examine the indicator and indicate with FOLL_PIN, FOLL_GET or 0 as to how, + * if at all, pages extracted from the iterator will be retained by the + * extraction function. + * + * FOLL_GET indicates that the pages will have a reference taken on them that + * the caller must put. This can be done for DMA/async DIO write from a page. + * + * FOLL_PIN indicates that the pages will have a pin placed in them that the + * caller must unpin. This is must be done for DMA/async DIO read to a page to + * avoid CoW problems in fork. + * + * 0 indicates that no measures are taken and that it's up to the caller to + * retain the pages. + */ +static inline unsigned int iov_iter_extract_mode(struct iov_iter *iter) +{ + if (user_backed_iter(iter)) + return iter->data_source ? FOLL_GET : FOLL_PIN; + return 0; +} + #endif diff --git a/lib/iov_iter.c b/lib/iov_iter.c index c3ca28ca68a6..17f63f4d499b 100644 --- a/lib/iov_iter.c +++ b/lib/iov_iter.c @@ -1892,3 +1892,336 @@ void iov_iter_restore(struct iov_iter *i, struct iov_iter_state *state) i->iov -= state->nr_segs - i->nr_segs; i->nr_segs = state->nr_segs; } + +/* + * Extract a list of contiguous pages from an ITER_PIPE iterator. This does + * not get references of its own on the pages, nor does it get a pin on them. + * If there's a partial page, it adds that first and will then allocate and add + * pages into the pipe to make up the buffer space to the amount required. + * + * The caller must hold the pipe locked and only transferring into a pipe is + * supported. + */ +static ssize_t iov_iter_extract_pipe_pages(struct iov_iter *i, + struct page ***pages, size_t maxsize, + unsigned int maxpages, + size_t *offset0) +{ + unsigned int nr, offset, chunk, j; + struct page **p; + size_t left; + + if (!sanity(i)) + return -EFAULT; + + offset = pipe_npages(i, &nr); + if (!nr) + return -EFAULT; + *offset0 = offset; + + maxpages = min_t(size_t, nr, maxpages); + maxpages = want_pages_array(pages, maxsize, offset, maxpages); + if (!maxpages) + return -ENOMEM; + p = *pages; + + left = maxsize; + for (j = 0; j < maxpages; j++) { + struct page *page = append_pipe(i, left, &offset); + if (!page) + break; + chunk = min_t(size_t, left, PAGE_SIZE - offset); + left -= chunk; + *p++ = page; + } + if (!j) + return -EFAULT; + return maxsize - left; +} + +/* + * Extract a list of contiguous pages from an ITER_XARRAY iterator. This does not + * get references on the pages, nor does it get a pin on them. 
+ */ +static ssize_t iov_iter_extract_xarray_pages(struct iov_iter *i, + struct page ***pages, size_t maxsize, + unsigned int maxpages, + size_t *offset0) +{ + struct page *page, **p; + unsigned int nr = 0, offset; + loff_t pos = i->xarray_start + i->iov_offset; + pgoff_t index = pos >> PAGE_SHIFT; + XA_STATE(xas, i->xarray, index); + + offset = pos & ~PAGE_MASK; + *offset0 = offset; + + maxpages = want_pages_array(pages, maxsize, offset, maxpages); + if (!maxpages) + return -ENOMEM; + p = *pages; + + rcu_read_lock(); + for (page = xas_load(&xas); page; page = xas_next(&xas)) { + if (xas_retry(&xas, page)) + continue; + + /* Has the page moved or been split? */ + if (unlikely(page != xas_reload(&xas))) { + xas_reset(&xas); + continue; + } + + p[nr++] = find_subpage(page, xas.xa_index); + if (nr == maxpages) + break; + } + rcu_read_unlock(); + + maxsize = min_t(size_t, nr * PAGE_SIZE - offset, maxsize); + i->iov_offset += maxsize; + i->count -= maxsize; + return maxsize; +} + +/* + * Extract a list of contiguous pages from an ITER_BVEC iterator. This does + * not get references on the pages, nor does it get a pin on them. + */ +static ssize_t iov_iter_extract_bvec_pages(struct iov_iter *i, + struct page ***pages, size_t maxsize, + unsigned int maxpages, + size_t *offset0) +{ + struct page **p, *page; + size_t skip = i->iov_offset, offset; + int k; + + maxsize = min(maxsize, i->bvec->bv_len - skip); + skip += i->bvec->bv_offset; + page = i->bvec->bv_page + skip / PAGE_SIZE; + offset = skip % PAGE_SIZE; + *offset0 = offset; + + maxpages = want_pages_array(pages, maxsize, offset, maxpages); + if (!maxpages) + return -ENOMEM; + p = *pages; + for (k = 0; k < maxpages; k++) + p[k] = page + k; + + maxsize = min_t(size_t, maxsize, maxpages * PAGE_SIZE - offset); + i->count -= maxsize; + i->iov_offset += maxsize; + if (i->iov_offset == i->bvec->bv_len) { + i->iov_offset = 0; + i->bvec++; + i->nr_segs--; + } + return maxsize; +} + +/* + * Get the first segment from an ITER_UBUF or ITER_IOVEC iterator. The + * iterator must not be empty. + */ +static unsigned long iov_iter_extract_first_user_segment(const struct iov_iter *i, + size_t *size) +{ + size_t skip; + long k; + + if (iter_is_ubuf(i)) + return (unsigned long)i->ubuf + i->iov_offset; + + for (k = 0, skip = i->iov_offset; k < i->nr_segs; k++, skip = 0) { + size_t len = i->iov[k].iov_len - skip; + + if (unlikely(!len)) + continue; + if (*size > len) + *size = len; + return (unsigned long)i->iov[k].iov_base + skip; + } + BUG(); // if it had been empty, we wouldn't get called +} + +/* + * Extract a list of contiguous pages from a user iterator and get references + * on them. This should only be used iff the iterator is user-backed + * (IOBUF/UBUF) and data is being transferred out of the buffer described by + * the iterator (ie. this is the source). + * + * The pages are returned with incremented refcounts that the caller must undo + * once the transfer is complete, but no additional pins are obtained. + * + * This is only safe to be used where background IO/DMA is not going to be + * modifying the buffer, and so won't cause a problem with CoW on fork. 
+ */ +static ssize_t iov_iter_extract_user_pages_and_get(struct iov_iter *i, + struct page ***pages, + size_t maxsize, + unsigned int maxpages, + size_t *offset0) +{ + unsigned long addr; + unsigned int gup_flags = FOLL_GET; + size_t offset; + int res; + + if (WARN_ON_ONCE(iov_iter_rw(i) != WRITE)) + return -EFAULT; + + if (i->nofault) + gup_flags |= FOLL_NOFAULT; + + addr = iov_iter_extract_first_user_segment(i, &maxsize); + *offset0 = offset = addr % PAGE_SIZE; + addr &= PAGE_MASK; + maxpages = want_pages_array(pages, maxsize, offset, maxpages); + if (!maxpages) + return -ENOMEM; + res = get_user_pages_fast(addr, maxpages, gup_flags, *pages); + if (unlikely(res <= 0)) + return res; + maxsize = min_t(size_t, maxsize, res * PAGE_SIZE - offset); + iov_iter_advance(i, maxsize); + return maxsize; +} + +/* + * Extract a list of contiguous pages from a user iterator and get a pin on + * each of them. This should only be used iff the iterator is user-backed + * (IOBUF/UBUF) and data is being transferred into the buffer described by the + * iterator (ie. this is the destination). + * + * It does not get refs on the pages, but the pages must be unpinned by the + * caller once the transfer is complete. + * + * This is safe to be used where background IO/DMA *is* going to be modifying + * the buffer; using a pin rather than a ref makes sure that CoW happens + * correctly in the parent during fork. + */ +static ssize_t iov_iter_extract_user_pages_and_pin(struct iov_iter *i, + struct page ***pages, + size_t maxsize, + unsigned int maxpages, + size_t *offset0) +{ + unsigned long addr; + unsigned int gup_flags = FOLL_PIN | FOLL_WRITE; + size_t offset; + int res; + + if (WARN_ON_ONCE(iov_iter_rw(i) != READ)) + return -EFAULT; + + if (i->nofault) + gup_flags |= FOLL_NOFAULT; + + addr = first_iovec_segment(i, &maxsize); + *offset0 = offset = addr % PAGE_SIZE; + addr &= PAGE_MASK; + maxpages = want_pages_array(pages, maxsize, offset, maxpages); + if (!maxpages) + return -ENOMEM; + res = pin_user_pages_fast(addr, maxpages, gup_flags, *pages); + if (unlikely(res <= 0)) + return res; + maxsize = min_t(size_t, maxsize, res * PAGE_SIZE - offset); + iov_iter_advance(i, maxsize); + return maxsize; +} + +static ssize_t iov_iter_extract_user_pages(struct iov_iter *i, + struct page ***pages, size_t maxsize, + unsigned int maxpages, + size_t *offset0) +{ + switch (iov_iter_extract_mode(i)) { + case FOLL_GET: + return iov_iter_extract_user_pages_and_get(i, pages, maxsize, + maxpages, offset0); + case FOLL_PIN: + return iov_iter_extract_user_pages_and_pin(i, pages, maxsize, + maxpages, offset0); + default: + BUG(); + } +} + +/** + * iov_iter_extract_pages - Extract a list of contiguous pages from an iterator + * @i: The iterator to extract from + * @pages: Where to return the list of pages + * @maxsize: The maximum amount of iterator to extract + * @maxpages: The maximum size of the list of pages + * @offset0: Where to return the starting offset into (*@pages)[0] + * + * Extract a list of contiguous pages from the current point of the iterator, + * advancing the iterator. The maximum number of pages and the maximum amount + * of page contents can be set. + * + * If *@pages is NULL, a page list will be allocated to the required size and + * *@pages will be set to its base. If *@pages is not NULL, it will be assumed + * that the caller allocated a page list at least @maxpages in size and this + * will be filled in. 
+ *
+ * Extra refs or pins on the pages may be obtained as follows:
+ *
+ *  (*) If the iterator is user-backed (ITER_IOVEC/ITER_UBUF) and data is to be
+ *      transferred /OUT OF/ the described buffer, refs will be taken on the
+ *      pages, but pins will not be added.  This can be used for DMA from a
+ *      page; it cannot be used for DMA to a page, as it may cause page-COW
+ *      problems in fork.
+ *
+ *  (*) If the iterator is user-backed (ITER_IOVEC/ITER_UBUF) and data is to be
+ *      transferred /INTO/ the described buffer, pins will be added to the
+ *      pages, but refs will not be taken.  This must be used for DMA to a
+ *      page.
+ *
+ *  (*) If the iterator is ITER_PIPE, this must describe a destination for the
+ *      data.  Additional pages may be allocated and added to the pipe (which
+ *      will hold the refs), but neither refs nor pins will be obtained for the
+ *      caller.  The caller must hold the pipe lock.
+ *
+ *  (*) If the iterator is ITER_BVEC or ITER_XARRAY, the pages are merely
+ *      listed; no extra refs or pins are obtained.
+ *
+ * Note also:
+ *
+ *  (*) Use with ITER_KVEC is not supported as that may refer to memory that
+ *      doesn't have associated page structs.
+ *
+ *  (*) Use with ITER_DISCARD is not supported as that has no content.
+ *
+ * On success, the function sets *@pages to the new pagelist, if allocated, and
+ * sets *offset0 to the offset into the first page and returns the amount of
+ * buffer space represented by the page list.
+ *
+ * It may also return -ENOMEM and -EFAULT.
+ */
+ssize_t iov_iter_extract_pages(struct iov_iter *i, struct page ***pages,
+			       size_t maxsize, unsigned int maxpages,
+			       size_t *offset0)
+{
+	maxsize = min_t(size_t, min_t(size_t, maxsize, i->count), MAX_RW_COUNT);
+	if (!maxsize)
+		return 0;
+
+	if (likely(user_backed_iter(i)))
+		return iov_iter_extract_user_pages(i, pages, maxsize,
+						   maxpages, offset0);
+	if (iov_iter_is_bvec(i))
+		return iov_iter_extract_bvec_pages(i, pages, maxsize,
+						   maxpages, offset0);
+	if (iov_iter_is_pipe(i))
+		return iov_iter_extract_pipe_pages(i, pages, maxsize,
+						   maxpages, offset0);
+	if (iov_iter_is_xarray(i))
+		return iov_iter_extract_xarray_pages(i, pages, maxsize,
+						     maxpages, offset0);
+	return -EFAULT;
+}
+EXPORT_SYMBOL(iov_iter_extract_pages);
From patchwork Thu Nov 17 14:55:03 2022
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 21713
Subject: [RFC PATCH 3/4] netfs: Add a function to extract a UBUF or IOVEC into a BVEC iterator
From: David Howells
To: Al Viro
Cc: Jeff Layton, Steve French, Shyam Prasad N, Rohith Surabattula,
    Christoph Hellwig, Matthew Wilcox, dhowells@redhat.com,
    linux-cachefs@redhat.com, linux-cifs@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Thu, 17 Nov 2022 14:55:03 +0000
Message-ID: <166869690376.3723671.8813331570219190705.stgit@warthog.procyon.org.uk>
In-Reply-To: <166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk>
References: <166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk>
User-Agent: StGit/1.5

Add a function to extract the pages from a user-space supplied iterator
(UBUF- or IOVEC-type) into a BVEC-type iterator, pinning the pages as we
go.

This is useful in three situations:

 (1) A userspace thread may have a sibling that unmaps or remaps the
     process's VM during the operation, changing the assignment of the
     pages and potentially causing an error.  Pinning the pages keeps some
     pages around, even if this occurs; further, we find out at the point
     of extraction if EFAULT is going to be incurred.

 (2) Pages might get swapped out/discarded if not pinned, so we want to pin
     them to avoid the reload causing a deadlock due to a DIO from/to an
     mmapped region on the same file.

 (3) The iterator may get passed to sendmsg() by the filesystem.  If a
     fault occurs, we may get a short write to a TCP stream that's then
     tricky to recover from.

We assume that other types of iterator (eg. BVEC-, KVEC- and XARRAY-type)
are constructed only by kernel internals and that the pages are pinned in
those cases.  DISCARD- and PIPE-type iterators aren't DIO'able.

Signed-off-by: David Howells
cc: Jeff Layton
cc: Steve French
cc: Shyam Prasad N
cc: Rohith Surabattula
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/Makefile     |  1 +
 fs/netfs/iterator.c   | 94 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/netfs.h |  2 +
 3 files changed, 97 insertions(+)
 create mode 100644 fs/netfs/iterator.c

diff --git a/fs/netfs/Makefile b/fs/netfs/Makefile
index f684c0cd1ec5..386d6fb92793 100644
--- a/fs/netfs/Makefile
+++ b/fs/netfs/Makefile
@@ -3,6 +3,7 @@
 netfs-y := \
 	buffered_read.o \
 	io.o \
+	iterator.o \
 	main.o \
 	objects.o

diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
new file mode 100644
index 000000000000..c11d05a66a4a
--- /dev/null
+++ b/fs/netfs/iterator.c
@@ -0,0 +1,94 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Iterator helpers.
+ *
+ * Copyright (C) 2022 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com) + */ + +#include +#include +#include +#include +#include "internal.h" + +/** + * netfs_extract_user_iter - Extract the pages from a user iterator into a bvec + * @orig: The original iterator + * @orig_len: The amount of iterator to copy + * @new: The iterator to be set up + * + * Extract the page fragments from the given amount of the source iterator and + * build up a second iterator that refers to all of those bits. This allows + * the original iterator to disposed of. + * + * On success, the number of elements in the bvec is returned and the original + * iterator will have been advanced by the amount extracted. + */ +ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len, + struct iov_iter *new) +{ + struct bio_vec *bv = NULL; + struct page **pages; + unsigned int cur_npages; + unsigned int max_pages; + unsigned int npages = 0; + unsigned int i; + ssize_t ret; + size_t count = orig_len, offset, len; + size_t bv_size, pg_size; + + if (WARN_ON_ONCE(!iter_is_ubuf(orig) && !iter_is_iovec(orig))) + return -EIO; + + max_pages = iov_iter_npages(orig, INT_MAX); + bv_size = array_size(max_pages, sizeof(*bv)); + bv = kvmalloc(bv_size, GFP_KERNEL); + if (!bv) + return -ENOMEM; + + /* Put the page list at the end of the bvec list storage. bvec + * elements are larger than page pointers, so as long as we work + * 0->last, we should be fine. + */ + pg_size = array_size(max_pages, sizeof(*pages)); + pages = (void *)bv + bv_size - pg_size; + + while (count && npages < max_pages) { + ret = iov_iter_extract_pages(orig, &pages, count, + max_pages - npages, &offset); + if (ret < 0) { + pr_err("Couldn't get user pages (rc=%zd)\n", ret); + break; + } + + if (ret > count) { + pr_err("get_pages rc=%zd more than %zu\n", ret, count); + break; + } + + count -= ret; + ret += offset; + cur_npages = DIV_ROUND_UP(ret, PAGE_SIZE); + + if (npages + cur_npages > max_pages) { + pr_err("Out of bvec array capacity (%u vs %u)\n", + npages + cur_npages, max_pages); + break; + } + + for (i = 0; i < cur_npages; i++) { + len = ret > PAGE_SIZE ? 
PAGE_SIZE : ret;
+			bv[npages + i].bv_page = *pages++;
+			bv[npages + i].bv_offset = offset;
+			bv[npages + i].bv_len = len - offset;
+			ret -= len;
+			offset = 0;
+		}
+
+		npages += cur_npages;
+	}
+
+	iov_iter_bvec(new, iov_iter_rw(orig), bv, npages, orig_len - count);
+	return npages;
+}
+EXPORT_SYMBOL(netfs_extract_user_iter);

diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index f2402ddeafbf..5f6ad0246946 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -288,6 +288,8 @@ void netfs_get_subrequest(struct netfs_io_subrequest *subreq,
 void netfs_put_subrequest(struct netfs_io_subrequest *subreq, bool was_async,
 			  enum netfs_sreq_ref_trace what);
 void netfs_stats_show(struct seq_file *);
+ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
+				struct iov_iter *new);
 
 /**
  * netfs_inode - Get the netfs inode context from the inode
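[By way of illustration, and not part of the series: a filesystem preparing
an asynchronous direct-I/O write might use the new helper roughly as
follows.  The function name is invented, and the cleanup obligations are
only noted in comments.]

/* Hypothetical sketch: convert the caller's user-backed iterator into a
 * BVEC iterator over pinned pages before queuing async I/O, so the request
 * no longer depends on the submitting task's address space.
 */
static int example_prep_async_write(struct iov_iter *user_iter, size_t len,
				    struct iov_iter *stable_iter)
{
	ssize_t nbv;

	nbv = netfs_extract_user_iter(user_iter, len, stable_iter);
	if (nbv < 0)
		return nbv;

	/* stable_iter is now a BVEC iterator over nbv page fragments; it can
	 * be handed to sendmsg() or retried without touching userspace.  On
	 * completion the pages must be released according to
	 * iov_iter_extract_mode() on the original iterator, and the bvec
	 * array freed.
	 */
	return 0;
}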
From patchwork Thu Nov 17 14:55:13 2022
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 21702
Subject: [RFC PATCH 4/4] netfs: Add a function to extract an iterator into a scatterlist
From: David Howells
To: Al Viro
Cc: Jeff Layton, Steve French, Shyam Prasad N, Rohith Surabattula,
    Christoph Hellwig, Matthew Wilcox, dhowells@redhat.com,
    linux-cachefs@redhat.com, linux-cifs@vger.kernel.org,
    linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Date: Thu, 17 Nov 2022 14:55:13 +0000
Message-ID: <166869691313.3723671.10714823767342163891.stgit@warthog.procyon.org.uk>
In-Reply-To: <166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk>
References: <166869687556.3723671.10061142538708346995.stgit@warthog.procyon.org.uk>
User-Agent: StGit/1.5

Provide a function for filling in a scatterlist from the list of pages
contained in an iterator.

If the iterator is UBUF- or IOVEC-type, the pages have a ref taken on
them.

If the iterator is BVEC-, KVEC- or XARRAY-type, no ref is taken on the
pages and it is left to the caller to manage their lifetime.  It cannot be
assumed that a ref can be validly taken, particularly in the case of a
KVEC iterator.

Signed-off-by: David Howells
cc: Jeff Layton
cc: Steve French
cc: Shyam Prasad N
cc: Rohith Surabattula
cc: linux-cachefs@redhat.com
cc: linux-cifs@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/netfs/iterator.c   | 252 +++++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/netfs.h |   3 +
 mm/vmalloc.c          |   1 +
 3 files changed, 256 insertions(+)

diff --git a/fs/netfs/iterator.c b/fs/netfs/iterator.c
index c11d05a66a4a..62485416cc3d 100644
--- a/fs/netfs/iterator.c
+++ b/fs/netfs/iterator.c
@@ -7,7 +7,9 @@
 #include
 #include
+#include
 #include
+#include
 #include
 #include "internal.h"
 
@@ -92,3 +94,253 @@ ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len,
 	return npages;
 }
 EXPORT_SYMBOL(netfs_extract_user_iter);
+
+/*
+ * Extract and pin up to sg_max pages from UBUF- or IOVEC-class iterators and
+ * add them to the scatterlist.
+ */ +static ssize_t netfs_extract_user_to_sg(struct iov_iter *iter, + ssize_t maxsize, + struct sg_table *sgtable, + unsigned int sg_max) +{ + struct scatterlist *sg = sgtable->sgl + sgtable->nents; + struct page **pages; + unsigned int npages; + ssize_t ret = 0, res; + size_t len, off; + + /* We decant the page list into the tail of the scatterlist */ + pages = (void *)sgtable->sgl + array_size(sg_max, sizeof(struct scatterlist)); + pages -= sg_max; + + do { + res = iov_iter_get_pages2(iter, pages, maxsize, sg_max, &off); + if (res < 0) + goto failed; + + len = res; + maxsize -= len; + ret += len; + npages = DIV_ROUND_UP(off + len, PAGE_SIZE); + sg_max -= npages; + + for (; npages < 0; npages--) { + struct page *page = *pages; + size_t seg = min_t(size_t, PAGE_SIZE - off, len); + + *pages++ = NULL; + sg_set_page(sg, page, len, off); + sgtable->nents++; + sg++; + len -= seg; + off = 0; + } + } while (maxsize > 0 && sg_max > 0); + + return ret; + +failed: + while (sgtable->nents > sgtable->orig_nents) + put_page(sg_page(&sgtable->sgl[--sgtable->nents])); + return res; +} + +/* + * Extract up to sg_max pages from a BVEC-type iterator and add them to the + * scatterlist. The pages are not pinned. + */ +static ssize_t netfs_extract_bvec_to_sg(struct iov_iter *iter, + ssize_t maxsize, + struct sg_table *sgtable, + unsigned int sg_max) +{ + const struct bio_vec *bv = iter->bvec; + struct scatterlist *sg = sgtable->sgl + sgtable->nents; + unsigned long start = iter->iov_offset; + unsigned int i; + ssize_t ret = 0; + + for (i = 0; i < iter->nr_segs; i++) { + size_t off, len; + + len = bv[i].bv_len; + if (start >= len) { + start -= len; + continue; + } + + len = min_t(size_t, maxsize, len - start); + off = bv[i].bv_offset + start; + + sg_set_page(sg, bv[i].bv_page, len, off); + sgtable->nents++; + sg++; + sg_max--; + + ret += len; + maxsize -= len; + if (maxsize <= 0 || sg_max == 0) + break; + start = 0; + } + + if (ret > 0) + iov_iter_advance(iter, ret); + return ret; +} + +/* + * Extract up to sg_max pages from a KVEC-type iterator and add them to the + * scatterlist. This can deal with vmalloc'd buffers as well as kmalloc'd or + * static buffers. The pages are not pinned. + */ +static ssize_t netfs_extract_kvec_to_sg(struct iov_iter *iter, + ssize_t maxsize, + struct sg_table *sgtable, + unsigned int sg_max) +{ + const struct kvec *kv = iter->kvec; + struct scatterlist *sg = sgtable->sgl + sgtable->nents; + unsigned long start = iter->iov_offset; + unsigned int i; + ssize_t ret = 0; + + for (i = 0; i < iter->nr_segs; i++) { + struct page *page; + unsigned long kaddr; + size_t off, len, seg; + + len = kv[i].iov_len; + if (start >= len) { + start -= len; + continue; + } + + kaddr = (unsigned long)kv[i].iov_base + start; + off = kaddr & ~PAGE_MASK; + len = min_t(size_t, maxsize, len - start); + kaddr &= PAGE_MASK; + + maxsize -= len; + ret += len; + do { + seg = min_t(size_t, len, PAGE_SIZE - off); + if (is_vmalloc_or_module_addr((void *)kaddr)) + page = vmalloc_to_page((void *)kaddr); + else + page = virt_to_page(kaddr); + + sg_set_page(sg, page, len, off); + sgtable->nents++; + sg++; + sg_max--; + + len -= seg; + kaddr += PAGE_SIZE; + off = 0; + } while (len > 0 && sg_max > 0); + + if (maxsize <= 0 || sg_max == 0) + break; + start = 0; + } + + if (ret > 0) + iov_iter_advance(iter, ret); + return ret; +} + +/* + * Extract up to sg_max folios from an XARRAY-type iterator and add them to + * the scatterlist. The pages are not pinned. 
+ */ +static ssize_t netfs_extract_xarray_to_sg(struct iov_iter *iter, + ssize_t maxsize, + struct sg_table *sgtable, + unsigned int sg_max) +{ + struct scatterlist *sg = sgtable->sgl + sgtable->nents; + struct xarray *xa = iter->xarray; + struct folio *folio; + loff_t start = iter->xarray_start + iter->iov_offset; + pgoff_t index = start / PAGE_SIZE; + ssize_t ret = 0; + size_t offset, len; + XA_STATE(xas, xa, index); + + rcu_read_lock(); + + xas_for_each(&xas, folio, ULONG_MAX) { + if (xas_retry(&xas, folio)) + continue; + if (WARN_ON(xa_is_value(folio))) + break; + if (WARN_ON(folio_test_hugetlb(folio))) + break; + + offset = offset_in_folio(folio, start); + len = min_t(size_t, maxsize, folio_size(folio) - offset); + + sg_set_page(sg, folio_page(folio, 0), len, offset); + sgtable->nents++; + sg++; + sg_max--; + + maxsize -= len; + ret += len; + if (maxsize <= 0 || sg_max == 0) + break; + } + + rcu_read_unlock(); + if (ret > 0) + iov_iter_advance(iter, ret); + return ret; +} + +/** + * netfs_extract_iter_to_sg - Extract pages from an iterator and add ot an sglist + * @iter: The iterator to extract from + * @maxsize: The amount of iterator to copy + * @sgtable: The scatterlist table to fill in + * @sg_max: Maximum number of elements in @sgtable that may be filled + * + * Extract the page fragments from the given amount of the source iterator and + * add them to a scatterlist that refers to all of those bits, to a maximum + * addition of @sg_max elements. + * + * The pages referred to by UBUF- and IOVEC-type iterators are extracted and + * pinned; BVEC-, KVEC- and XARRAY-type are extracted but aren't pinned; PIPE- + * and DISCARD-type are not supported. + * + * No end mark is placed on the scatterlist; that's left to the caller. + * + * If successul, @sgtable->nents is updated to include the number of elements + * added and the number of bytes added is returned. @sgtable->orig_nents is + * left unaltered. 
+ */ +ssize_t netfs_extract_iter_to_sg(struct iov_iter *iter, size_t maxsize, + struct sg_table *sgtable, unsigned int sg_max) +{ + if (maxsize == 0) + return 0; + + switch (iov_iter_type(iter)) { + case ITER_UBUF: + case ITER_IOVEC: + return netfs_extract_user_to_sg(iter, maxsize, sgtable, sg_max); + case ITER_BVEC: + return netfs_extract_bvec_to_sg(iter, maxsize, sgtable, sg_max); + case ITER_KVEC: + return netfs_extract_kvec_to_sg(iter, maxsize, sgtable, sg_max); + case ITER_XARRAY: + return netfs_extract_xarray_to_sg(iter, maxsize, sgtable, sg_max); + default: + pr_err("netfs_extract_iter_to_sg(%u) unsupported\n", + iov_iter_type(iter)); + WARN_ON_ONCE(1); + return -EIO; + } +} +EXPORT_SYMBOL(netfs_extract_iter_to_sg); diff --git a/include/linux/netfs.h b/include/linux/netfs.h index 5f6ad0246946..21771dd594a1 100644 --- a/include/linux/netfs.h +++ b/include/linux/netfs.h @@ -290,6 +290,9 @@ void netfs_put_subrequest(struct netfs_io_subrequest *subreq, void netfs_stats_show(struct seq_file *); ssize_t netfs_extract_user_iter(struct iov_iter *orig, size_t orig_len, struct iov_iter *new); +struct sg_table; +ssize_t netfs_extract_iter_to_sg(struct iov_iter *iter, size_t len, + struct sg_table *sgtable, unsigned int sg_max); /** * netfs_inode - Get the netfs inode context from the inode diff --git a/mm/vmalloc.c b/mm/vmalloc.c index ccaa461998f3..b13ac142685b 100644 --- a/mm/vmalloc.c +++ b/mm/vmalloc.c @@ -653,6 +653,7 @@ int is_vmalloc_or_module_addr(const void *x) #endif return is_vmalloc_addr(x); } +EXPORT_SYMBOL_GPL(is_vmalloc_or_module_addr); /* * Walk a vmap address to the struct page it maps. Huge vmap mappings will
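[To close, a small hypothetical sketch, not part of the series, of how a
caller with a preallocated scatterlist might drive
netfs_extract_iter_to_sg().  The function name is invented, the sg table is
assumed to have been set up by the caller with room for max_sgs entries,
and the end mark is applied by the caller, as the kernel-doc above notes.]

/* Hypothetical sketch: turn the source iterator of a request into a
 * scatterlist for an API (e.g. crypto) that consumes sg lists.  The caller
 * owns sgt->sgl, sized for at least max_sgs entries.
 */
static int example_build_sg(struct iov_iter *iter, size_t len,
			    struct sg_table *sgt, unsigned int max_sgs)
{
	ssize_t added;

	sg_init_table(sgt->sgl, max_sgs);
	sgt->nents = 0;

	added = netfs_extract_iter_to_sg(iter, len, sgt, max_sgs);
	if (added < 0)
		return added;	/* e.g. -EIO for unsupported iterator types */

	if (sgt->nents)
		sg_mark_end(&sgt->sgl[sgt->nents - 1]);

	/* For UBUF/IOVEC sources the pages now hold refs that must be put
	 * once the operation completes; other iterator types are left to
	 * the caller to keep alive.
	 */
	return 0;
}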