From patchwork Wed Sep 13 16:56:40 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 139248
From: David Howells
To: Al Viro, Linus Torvalds
Cc: David Howells, Jens Axboe, Christoph Hellwig, Christian Brauner,
 David Laight, Matthew Wilcox, Jeff Layton,
 linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-mm@kvack.org, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: [PATCH v4 05/13] iov: Move iterator functions to a header file
Date: Wed, 13 Sep 2023 17:56:40 +0100
Message-ID: <20230913165648.2570623-6-dhowells@redhat.com>
In-Reply-To: <20230913165648.2570623-1-dhowells@redhat.com>
References: <20230913165648.2570623-1-dhowells@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Move the iterator functions to a header file so that other operations that
need to scan over an iterator can be added. For instance, the rbd driver
could use this to scan a buffer to see if it is all zeros and libceph could
use this to generate a crc.

Signed-off-by: David Howells
cc: Alexander Viro
cc: Jens Axboe
cc: Christoph Hellwig
cc: Christian Brauner
cc: Matthew Wilcox
cc: Linus Torvalds
cc: David Laight
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---
 include/linux/iov_iter.h | 261 +++++++++++++++++++++++++++++++++++++++
 lib/iov_iter.c           | 197 +----------------------------
 2 files changed, 262 insertions(+), 196 deletions(-)
 create mode 100644 include/linux/iov_iter.h

diff --git a/include/linux/iov_iter.h b/include/linux/iov_iter.h
new file mode 100644
index 000000000000..836854847cdf
--- /dev/null
+++ b/include/linux/iov_iter.h
@@ -0,0 +1,261 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/* I/O iterator iteration building functions.
+ *
+ * Copyright (C) 2023 Red Hat, Inc. All Rights Reserved.
+ * Written by David Howells (dhowells@redhat.com)
+ */
+
+#ifndef _LINUX_IOV_ITER_H
+#define _LINUX_IOV_ITER_H
+
+#include
+#include
+
+typedef size_t (*iov_step_f)(void *iter_base, size_t progress, size_t len,
+			     void *priv, void *priv2);
+typedef size_t (*iov_ustep_f)(void __user *iter_base, size_t progress, size_t len,
+			      void *priv, void *priv2);
+
+/*
+ * Handle ITER_UBUF.
+ */
+static __always_inline
+size_t iterate_ubuf(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+		    iov_ustep_f step)
+{
+	void __user *base = iter->ubuf;
+	size_t progress = 0, remain;
+
+	remain = step(base + iter->iov_offset, 0, len, priv, priv2);
+	progress = len - remain;
+	iter->iov_offset += progress;
+	return progress;
+}
+
+/*
+ * Handle ITER_IOVEC.
+ */
+static __always_inline
+size_t iterate_iovec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+		     iov_ustep_f step)
+{
+	const struct iovec *p = iter->__iov;
+	size_t progress = 0, skip = iter->iov_offset;
+
+	do {
+		size_t remain, consumed;
+		size_t part = min(len, p->iov_len - skip);
+
+		if (likely(part)) {
+			remain = step(p->iov_base + skip, progress, part, priv, priv2);
+			consumed = part - remain;
+			progress += consumed;
+			skip += consumed;
+			len -= consumed;
+			if (skip < p->iov_len)
+				break;
+		}
+		p++;
+		skip = 0;
+	} while (len);
+
+	iter->nr_segs -= p - iter->__iov;
+	iter->__iov = p;
+	iter->iov_offset = skip;
+	return progress;
+}
+
+/*
+ * Handle ITER_KVEC.
+ */
+static __always_inline
+size_t iterate_kvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+		    iov_step_f step)
+{
+	const struct kvec *p = iter->kvec;
+	size_t progress = 0, skip = iter->iov_offset;
+
+	do {
+		size_t remain, consumed;
+		size_t part = min(len, p->iov_len - skip);
+
+		if (likely(part)) {
+			remain = step(p->iov_base + skip, progress, part, priv, priv2);
+			consumed = part - remain;
+			progress += consumed;
+			skip += consumed;
+			len -= consumed;
+			if (skip < p->iov_len)
+				break;
+		}
+		p++;
+		skip = 0;
+	} while (len);
+
+	iter->nr_segs -= p - iter->kvec;
+	iter->kvec = p;
+	iter->iov_offset = skip;
+	return progress;
+}
+
+/*
+ * Handle ITER_BVEC.
+ */
+static __always_inline
+size_t iterate_bvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+		    iov_step_f step)
+{
+	const struct bio_vec *p = iter->bvec;
+	size_t progress = 0, skip = iter->iov_offset;
+
+	do {
+		size_t remain, consumed;
+		size_t offset = p->bv_offset + skip, part;
+		void *kaddr = kmap_local_page(p->bv_page + offset / PAGE_SIZE);
+
+		part = min3(len,
+			    (size_t)(p->bv_len - skip),
+			    (size_t)(PAGE_SIZE - offset % PAGE_SIZE));
+		remain = step(kaddr + offset % PAGE_SIZE, progress, part, priv, priv2);
+		kunmap_local(kaddr);
+		consumed = part - remain;
+		len -= consumed;
+		progress += consumed;
+		skip += consumed;
+		if (skip >= p->bv_len) {
+			skip = 0;
+			p++;
+		}
+		if (remain)
+			break;
+	} while (len);
+
+	iter->nr_segs -= p - iter->bvec;
+	iter->bvec = p;
+	iter->iov_offset = skip;
+	return progress;
+}
+
+/*
+ * Handle ITER_XARRAY.
+ */
+static __always_inline
+size_t iterate_xarray(struct iov_iter *iter, size_t len, void *priv, void *priv2,
+		      iov_step_f step)
+{
+	struct folio *folio;
+	size_t progress = 0;
+	loff_t start = iter->xarray_start + iter->iov_offset;
+	pgoff_t index = start / PAGE_SIZE;
+	XA_STATE(xas, iter->xarray, index);
+
+	rcu_read_lock();
+	xas_for_each(&xas, folio, ULONG_MAX) {
+		size_t remain, consumed, offset, part, flen;
+
+		if (xas_retry(&xas, folio))
+			continue;
+		if (WARN_ON(xa_is_value(folio)))
+			break;
+		if (WARN_ON(folio_test_hugetlb(folio)))
+			break;
+
+		offset = offset_in_folio(folio, start + progress);
+		flen = min(folio_size(folio) - offset, len);
+
+		while (flen) {
+			void *base = kmap_local_folio(folio, offset);
+
+			part = min_t(size_t, flen,
+				     PAGE_SIZE - offset_in_page(offset));
+			remain = step(base, progress, part, priv, priv2);
+			kunmap_local(base);
+
+			consumed = part - remain;
+			progress += consumed;
+			len -= consumed;
+
+			if (remain || len == 0)
+				goto out;
+			flen -= consumed;
+			offset += consumed;
+		}
+	}
+
+out:
+	rcu_read_unlock();
+	iter->iov_offset += progress;
+	return progress;
+}
+
+/**
+ * iterate_and_advance2 - Iterate over an iterator
+ * @iter: The iterator to iterate over.
+ * @len: The amount to iterate over.
+ * @priv: Data for the step functions.
+ * @priv2: More data for the step functions.
+ * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
+ * @step: Function for other iterators; given kernel addresses.
+ *
+ * Iterate over the next part of an iterator, up to the specified length. The
+ * buffer is presented in segments, which for kernel iteration are broken up by
+ * physical pages and mapped, with the mapped address being presented.
+ *
+ * Two step functions, @step and @ustep, must be provided: @step is given
+ * mapped kernel addresses and @ustep is given user addresses, which have the
+ * potential to fault since no pinning is performed.
+ *
+ * The step functions are passed the address and length of the segment, @priv,
+ * @priv2 and the amount of data so far iterated over (which can, for example,
+ * be added to @priv to point to the right part of a second buffer). The step
+ * functions should return the amount of the segment they didn't process
+ * (i.e. 0 indicates complete processing).
+ *
+ * This function returns the amount of data processed (i.e. 0 means nothing
+ * was processed and the value of @len means processed to completion).
+ */
+static __always_inline
+size_t iterate_and_advance2(struct iov_iter *iter, size_t len, void *priv,
+			    void *priv2, iov_ustep_f ustep, iov_step_f step)
+{
+	size_t progress;
+
+	if (unlikely(iter->count < len))
+		len = iter->count;
+	if (unlikely(!len))
+		return 0;
+
+	if (likely(iter_is_ubuf(iter)))
+		progress = iterate_ubuf(iter, len, priv, priv2, ustep);
+	else if (likely(iter_is_iovec(iter)))
+		progress = iterate_iovec(iter, len, priv, priv2, ustep);
+	else if (iov_iter_is_bvec(iter))
+		progress = iterate_bvec(iter, len, priv, priv2, step);
+	else if (iov_iter_is_kvec(iter))
+		progress = iterate_kvec(iter, len, priv, priv2, step);
+	else if (iov_iter_is_xarray(iter))
+		progress = iterate_xarray(iter, len, priv, priv2, step);
+	else
+		progress = len;
+	iter->count -= progress;
+	return progress;
+}
+
+/**
+ * iterate_and_advance - Iterate over an iterator
+ * @iter: The iterator to iterate over.
+ * @len: The amount to iterate over.
+ * @priv: Data for the step functions.
+ * @ustep: Function for UBUF/IOVEC iterators; given __user addresses.
+ * @step: Function for other iterators; given kernel addresses.
+ *
+ * As iterate_and_advance2(), but priv2 is always NULL.
+ */
+static __always_inline
+size_t iterate_and_advance(struct iov_iter *iter, size_t len, void *priv,
+			   iov_ustep_f ustep, iov_step_f step)
+{
+	return iterate_and_advance2(iter, len, priv, NULL, ustep, step);
+}
+
+#endif /* _LINUX_IOV_ITER_H */
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index b3ce6fa5f7a5..65374ee91ecd 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -13,202 +13,7 @@
 #include
 #include
 #include
-
-typedef size_t (*iov_step_f)(void *iter_base, size_t progress, size_t len,
-			     void *priv, void *priv2);
-typedef size_t (*iov_ustep_f)(void __user *iter_base, size_t progress, size_t len,
-			      void *priv, void *priv2);
-
-static __always_inline
-size_t iterate_ubuf(struct iov_iter *iter, size_t len, void *priv, void *priv2,
-		    iov_ustep_f step)
-{
-	void __user *base = iter->ubuf;
-	size_t progress = 0, remain;
-
-	remain = step(base + iter->iov_offset, 0, len, priv, priv2);
-	progress = len - remain;
-	iter->iov_offset += progress;
-	return progress;
-}
-
-static __always_inline
-size_t iterate_iovec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
-		     iov_ustep_f step)
-{
-	const struct iovec *p = iter->__iov;
-	size_t progress = 0, skip = iter->iov_offset;
-
-	do {
-		size_t remain, consumed;
-		size_t part = min(len, p->iov_len - skip);
-
-		if (likely(part)) {
-			remain = step(p->iov_base + skip, progress, part, priv, priv2);
-			consumed = part - remain;
-			progress += consumed;
-			skip += consumed;
-			len -= consumed;
-			if (skip < p->iov_len)
-				break;
-		}
-		p++;
-		skip = 0;
-	} while (len);
-
-	iter->__iov = p;
-	iter->nr_segs -= p - iter->__iov;
-	iter->iov_offset = skip;
-	return progress;
-}
-
-static __always_inline
-size_t iterate_kvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
-		    iov_step_f step)
-{
-	const struct kvec *p = iter->kvec;
-	size_t progress = 0, skip = iter->iov_offset;
-
-	do {
-		size_t remain, consumed;
-		size_t part = min(len, p->iov_len - skip);
-
-		if (likely(part)) {
-			remain = step(p->iov_base + skip, progress, part, priv, priv2);
-			consumed = part - remain;
-			progress += consumed;
-			skip += consumed;
-			len -= consumed;
-			if (skip < p->iov_len)
-				break;
-		}
-		p++;
-		skip = 0;
-	} while (len);
-
-	iter->nr_segs -= p - iter->kvec;
-	iter->kvec = p;
-	iter->iov_offset = skip;
-	return progress;
-}
-
-static __always_inline
-size_t iterate_bvec(struct iov_iter *iter, size_t len, void *priv, void *priv2,
-		    iov_step_f step)
-{
-	const struct bio_vec *p = iter->bvec;
-	size_t progress = 0, skip = iter->iov_offset;
-
-	do {
-		size_t remain, consumed;
-		size_t offset = p->bv_offset + skip, part;
-		void *kaddr = kmap_local_page(p->bv_page + offset / PAGE_SIZE);
-
-		part = min3(len,
-			    (size_t)(p->bv_len - skip),
-			    (size_t)(PAGE_SIZE - offset % PAGE_SIZE));
-		remain = step(kaddr + offset % PAGE_SIZE, progress, part, priv, priv2);
-		kunmap_local(kaddr);
-		consumed = part - remain;
-		len -= consumed;
-		progress += consumed;
-		skip += consumed;
-		if (skip >= p->bv_len) {
-			skip = 0;
-			p++;
-		}
-		if (remain)
-			break;
-	} while (len);
-
-	iter->nr_segs -= p - iter->bvec;
-	iter->bvec = p;
-	iter->iov_offset = skip;
-	return progress;
-}
-
-static __always_inline
-size_t iterate_xarray(struct iov_iter *iter, size_t len, void *priv, void *priv2,
-		      iov_step_f step)
-{
-	struct folio *folio;
-	size_t progress = 0;
-	loff_t start = iter->xarray_start + iter->iov_offset;
-	pgoff_t index = start / PAGE_SIZE;
-	XA_STATE(xas, iter->xarray, index);
-
-	rcu_read_lock();
-	xas_for_each(&xas, folio, ULONG_MAX) {
-		size_t remain, consumed, offset, part, flen;
-
-		if (xas_retry(&xas, folio))
-			continue;
-		if (WARN_ON(xa_is_value(folio)))
-			break;
-		if (WARN_ON(folio_test_hugetlb(folio)))
-			break;
-
-		offset = offset_in_folio(folio, start + progress);
-		flen = min(folio_size(folio) - offset, len);
-
-		while (flen) {
-			void *base = kmap_local_folio(folio, offset);
-
-			part = min_t(size_t, flen,
-				     PAGE_SIZE - offset_in_page(offset));
-			remain = step(base, progress, part, priv, priv2);
-			kunmap_local(base);
-
-			consumed = part - remain;
-			progress += consumed;
-			len -= consumed;
-
-			if (remain || len == 0)
-				goto out;
-			flen -= consumed;
-			offset += consumed;
-		}
-	}
-
-out:
-	rcu_read_unlock();
-	iter->iov_offset += progress;
-	return progress;
-}
-
-static __always_inline
-size_t iterate_and_advance2(struct iov_iter *iter, size_t len, void *priv,
-			    void *priv2, iov_ustep_f ustep, iov_step_f step)
-{
-	size_t progress;
-
-	if (unlikely(iter->count < len))
-		len = iter->count;
-	if (unlikely(!len))
-		return 0;
-
-	if (likely(iter_is_ubuf(iter)))
-		progress = iterate_ubuf(iter, len, priv, priv2, ustep);
-	else if (likely(iter_is_iovec(iter)))
-		progress = iterate_iovec(iter, len, priv, priv2, ustep);
-	else if (iov_iter_is_bvec(iter))
-		progress = iterate_bvec(iter, len, priv, priv2, step);
-	else if (iov_iter_is_kvec(iter))
-		progress = iterate_kvec(iter, len, priv, priv2, step);
-	else if (iov_iter_is_xarray(iter))
-		progress = iterate_xarray(iter, len, priv, priv2, step);
-	else
-		progress = len;
-	iter->count -= progress;
-	return progress;
-}
-
-static __always_inline
-size_t iterate_and_advance(struct iov_iter *iter, size_t len, void *priv,
-			   iov_ustep_f ustep, iov_step_f step)
-{
-	return iterate_and_advance2(iter, len, priv, NULL, ustep, step);
-}
+#include
 
 static __always_inline
 size_t copy_to_user_iter(void __user *iter_to, size_t progress,