From patchwork Tue Jan 31 18:28:46 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Howells X-Patchwork-Id: 51004 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:adf:eb09:0:0:0:0:0 with SMTP id s9csp62934wrn; Tue, 31 Jan 2023 10:32:01 -0800 (PST) X-Google-Smtp-Source: AMrXdXsZDDp+E8QARcTyFebAeQn3hzwqr1BlX0US2wdyxhmkSOMABm6nzYPkRWAOSZkYE3Y+yaGS X-Received: by 2002:a17:902:e54e:b0:195:eea0:5712 with SMTP id n14-20020a170902e54e00b00195eea05712mr50146057plf.38.1675189921673; Tue, 31 Jan 2023 10:32:01 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1675189921; cv=none; d=google.com; s=arc-20160816; b=bAz4+zIyp9gZjW8IwFl9g10/w9tvU+brZi5hs8EovtPRpB7gc5WdTCj1jbc8EmEppH E7TTLrsdhCqfgRhsXnc+v72z6xjqKygeVejzXj5JQ25nkNjOEcolSbOWVSOP8q/WFNNS 1tbkPvwY6fHxA/5Hg0q/OkwTNbbGe1z4MCzdaZBpFHojuv4AHCz6WTSvc0TNc+nhfOhX WwvCQ13Pj1bQL9hG/Iul827LZ3CyYDoqfo0TYabdqvvbWWJQiD7LOoFKAr4bB+Ixig3X FcH2dsF6aeQGagfihjbEfcgrwrhjal+3sAK0dbMRy0tHF0Qf9wanHUfaTJ4ACo6rAahw S6UQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=fK9UANJIw+SgKwPMH1h7QmOCsXxDJO7CfeS4m/u44DQ=; b=dgm++LEErgIwSyn96NPtxvtbUa425NGQswHhMSS1+fpLlCW2lIPl6GkD84Tmih6BCh o4UtFitIbBgpGB6dk3cRHG7cEqIRcLzeeLARWPl22rT/HHHVupMkhD8hr7ARmi37ybKT 6pLDEeBNrDgA3GzPuKGrMvQp2GwZNBomF+Jzvspn/DTOPswR0pxawCRE54lw7dCGLeB+ v/feRDO9JX2e1Yn2ch0mwvJFq01hRdIbfoZG1w7wUf2ora4/p2y96zX7VdyR5VdqeGb8 sIw7xKVXEyx8U7hqJmjURo4CcdF5pnViIQa2oCjCORcGeykTK2Ekh2RHJyw6jUfej2wo fnfA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=IOdhSDvp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id n12-20020a170902d2cc00b0019329cd26dcsi10009095plc.442.2023.01.31.10.31.47; Tue, 31 Jan 2023 10:32:01 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@redhat.com header.s=mimecast20190719 header.b=IOdhSDvp; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231520AbjAaSav (ORCPT + 99 others); Tue, 31 Jan 2023 13:30:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:54028 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231509AbjAaSae (ORCPT ); Tue, 31 Jan 2023 13:30:34 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9DCF758976 for ; Tue, 31 Jan 2023 10:29:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1675189781; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fK9UANJIw+SgKwPMH1h7QmOCsXxDJO7CfeS4m/u44DQ=; b=IOdhSDvp9VtREcYdvjzPafQxoAN6GwqkhX7oBVTYe0TF0ZMexuGrMXp7NygvDJX4InGZlr 7Yf8To3sH+BE6OY+KNVz79P1LBAUi7Q8O1BmRsT8Gtxf1htoRLHc5CHyqLScHKEfwLOwpM qJUpUvxorTTEdSvT6mI5VA+/6XpB7OI= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-455-qIHfuhcfP6KtFfwlJJioYw-1; Tue, 31 Jan 2023 13:29:33 -0500 X-MC-Unique: qIHfuhcfP6KtFfwlJJioYw-1 Received: from smtp.corp.redhat.com (int-mx01.intmail.prod.int.rdu2.redhat.com [10.11.54.1]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 541D418A650B; Tue, 31 Jan 2023 18:29:07 +0000 (UTC) Received: from warthog.procyon.org.uk.com (unknown [10.33.36.97]) by smtp.corp.redhat.com (Postfix) with ESMTP id AF61C40C2064; Tue, 31 Jan 2023 18:29:05 +0000 (UTC) From: David Howells To: Steve French Cc: David Howells , Al Viro , Shyam Prasad N , Rohith Surabattula , Tom Talpey , Stefan Metzmacher , Christoph Hellwig , Matthew Wilcox , Jeff Layton , linux-cifs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Steve French Subject: [PATCH 03/12] cifs: Implement splice_read to pass down ITER_BVEC not ITER_PIPE Date: Tue, 31 Jan 2023 18:28:46 +0000 Message-Id: <20230131182855.4027499-4-dhowells@redhat.com> In-Reply-To: <20230131182855.4027499-1-dhowells@redhat.com> References: <20230131182855.4027499-1-dhowells@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.1 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1756563947083524295?= X-GMAIL-MSGID: =?utf-8?q?1756563947083524295?= Provide cifs_splice_read() to use a bvec rather than an pipe iterator as the latter cannot so easily be split and advanced, which is necessary to pass an iterator down to the bottom levels. Upstream cifs gets around this problem by using iov_iter_get_pages() to prefill the pipe and then passing the list of pages down. This is done by: (1) Bulk-allocate a bunch of pages to carry as much of the requested amount of data as possible, but without overrunning the available slots in the pipe and add them to an ITER_BVEC. (2) Synchronously call ->read_iter() to read into the buffer. (3) Discard any unused pages. (4) Load the remaining pages into the pipe in order and advance the head pointer. Signed-off-by: David Howells cc: Steve French cc: Shyam Prasad N cc: Rohith Surabattula cc: Jeff Layton cc: Al Viro cc: linux-cifs@vger.kernel.org Link: https://lore.kernel.org/r/166732028113.3186319.1793644937097301358.stgit@warthog.procyon.org.uk/ # rfc --- fs/cifs/cifsfs.c | 12 +++---- fs/cifs/cifsfs.h | 3 ++ fs/cifs/file.c | 92 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/splice.c | 1 + 4 files changed, 102 insertions(+), 6 deletions(-) diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index 10e00c624922..3c57e8b11692 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -1358,7 +1358,7 @@ const struct file_operations cifs_file_ops = { .fsync = cifs_fsync, .flush = cifs_flush, .mmap = cifs_file_mmap, - .splice_read = generic_file_splice_read, + .splice_read = cifs_splice_read, .splice_write = iter_file_splice_write, .llseek = cifs_llseek, .unlocked_ioctl = cifs_ioctl, @@ -1378,7 +1378,7 @@ const struct file_operations cifs_file_strict_ops = { .fsync = cifs_strict_fsync, .flush = cifs_flush, .mmap = cifs_file_strict_mmap, - .splice_read = generic_file_splice_read, + .splice_read = cifs_splice_read, .splice_write = iter_file_splice_write, .llseek = cifs_llseek, .unlocked_ioctl = cifs_ioctl, @@ -1398,7 +1398,7 @@ const struct file_operations cifs_file_direct_ops = { .fsync = cifs_fsync, .flush = cifs_flush, .mmap = cifs_file_mmap, - .splice_read = generic_file_splice_read, + .splice_read = cifs_splice_read, .splice_write = iter_file_splice_write, .unlocked_ioctl = cifs_ioctl, .copy_file_range = cifs_copy_file_range, @@ -1416,7 +1416,7 @@ const struct file_operations cifs_file_nobrl_ops = { .fsync = cifs_fsync, .flush = cifs_flush, .mmap = cifs_file_mmap, - .splice_read = generic_file_splice_read, + .splice_read = cifs_splice_read, .splice_write = iter_file_splice_write, .llseek = cifs_llseek, .unlocked_ioctl = cifs_ioctl, @@ -1434,7 +1434,7 @@ const struct file_operations cifs_file_strict_nobrl_ops = { .fsync = cifs_strict_fsync, .flush = cifs_flush, .mmap = cifs_file_strict_mmap, - .splice_read = generic_file_splice_read, + .splice_read = cifs_splice_read, .splice_write = iter_file_splice_write, .llseek = cifs_llseek, .unlocked_ioctl = cifs_ioctl, @@ -1452,7 +1452,7 @@ const struct file_operations cifs_file_direct_nobrl_ops = { .fsync = cifs_fsync, .flush = cifs_flush, .mmap = cifs_file_mmap, - .splice_read = generic_file_splice_read, + .splice_read = cifs_splice_read, .splice_write = iter_file_splice_write, .unlocked_ioctl = cifs_ioctl, .copy_file_range = cifs_copy_file_range, diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h index 1705c76529d8..2e979d2f4e36 100644 --- a/fs/cifs/cifsfs.h +++ b/fs/cifs/cifsfs.h @@ -100,6 +100,9 @@ extern ssize_t cifs_strict_readv(struct kiocb *iocb, struct iov_iter *to); extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from); extern ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *from); extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from); +extern ssize_t cifs_splice_read(struct file *in, loff_t *ppos, + struct pipe_inode_info *pipe, size_t len, + unsigned int flags); extern int cifs_flock(struct file *pfile, int cmd, struct file_lock *plock); extern int cifs_lock(struct file *, int, struct file_lock *); extern int cifs_fsync(struct file *, loff_t, loff_t, int); diff --git a/fs/cifs/file.c b/fs/cifs/file.c index 22dfc1f8b4f1..30d01b236f77 100644 --- a/fs/cifs/file.c +++ b/fs/cifs/file.c @@ -5273,3 +5273,95 @@ const struct address_space_operations cifs_addr_ops_smallbuf = { .launder_folio = cifs_launder_folio, .migrate_folio = filemap_migrate_folio, }; + +/* + * Splice data from a file into a pipe. + */ +ssize_t cifs_splice_read(struct file *file, loff_t *ppos, + struct pipe_inode_info *pipe, size_t len, + unsigned int flags) +{ + LIST_HEAD(pages); + struct iov_iter to; + struct bio_vec *bv; + struct kiocb kiocb; + struct page *page; + unsigned int head; + ssize_t ret; + size_t used, npages, chunk, remain, reclaim; + int i; + + /* Work out how much data we can actually add into the pipe */ + used = pipe_occupancy(pipe->head, pipe->tail); + npages = max_t(ssize_t, pipe->max_usage - used, 0); + len = min_t(size_t, len, npages * PAGE_SIZE); + npages = DIV_ROUND_UP(len, PAGE_SIZE); + + bv = kmalloc(array_size(npages, sizeof(bv[0])), GFP_KERNEL); + if (!bv) + return -ENOMEM; + + npages = alloc_pages_bulk_list(GFP_USER, npages, &pages); + if (!npages) { + kfree(bv); + return -ENOMEM; + } + + remain = len = min_t(size_t, len, npages * PAGE_SIZE); + + for (i = 0; i < npages; i++) { + chunk = min_t(size_t, PAGE_SIZE, remain); + page = list_first_entry(&pages, struct page, lru); + list_del_init(&page->lru); + bv[i].bv_page = page; + bv[i].bv_offset = 0; + bv[i].bv_len = chunk; + remain -= chunk; + } + + /* Do the I/O */ + iov_iter_bvec(&to, READ, bv, npages, len); + init_sync_kiocb(&kiocb, file); + kiocb.ki_pos = *ppos; + ret = call_read_iter(file, &kiocb, &to); + + reclaim = npages * PAGE_SIZE; + remain = 0; + if (ret > 0) { + reclaim -= ret; + remain = ret; + *ppos = kiocb.ki_pos; + file_accessed(file); + } else if (ret < 0) { + /* + * callers of ->splice_read() expect -EAGAIN on + * "can't put anything in there", rather than -EFAULT. + */ + if (ret == -EFAULT) + ret = -EAGAIN; + } + + /* Free any pages that didn't get touched at all. */ + for (; reclaim >= PAGE_SIZE; reclaim -= PAGE_SIZE) + __free_page(bv[--npages].bv_page); + + /* Push the remaining pages into the pipe. */ + head = pipe->head; + for (i = 0; i < npages; i++) { + struct pipe_buffer *buf = &pipe->bufs[head & (pipe->ring_size - 1)]; + + chunk = min_t(size_t, remain, PAGE_SIZE); + *buf = (struct pipe_buffer) { + .ops = &default_pipe_buf_ops, + .page = bv[i].bv_page, + .offset = 0, + .len = chunk, + }; + head++; + remain -= chunk; + } + pipe->head = head; + + kfree(bv); + return ret; +} diff --git a/fs/splice.c b/fs/splice.c index 5969b7a1d353..95435b5cca2a 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -330,6 +330,7 @@ const struct pipe_buf_operations default_pipe_buf_ops = { .try_steal = generic_pipe_buf_try_steal, .get = generic_pipe_buf_get, }; +EXPORT_SYMBOL(default_pipe_buf_ops); /* Pipe buffer operations for a socket and similar. */ const struct pipe_buf_operations nosteal_pipe_buf_ops = {