Message ID | 20231013160423.2218093-12-dhowells@redhat.com |
---|---|
State | New |
Headers |
From: David Howells <dhowells@redhat.com>
To: Jeff Layton <jlayton@kernel.org>, Steve French <smfrench@gmail.com>
Cc: David Howells <dhowells@redhat.com>, Matthew Wilcox <willy@infradead.org>, Marc Dionne <marc.dionne@auristor.com>, Paulo Alcantara <pc@manguebit.com>, Shyam Prasad N <sprasad@microsoft.com>, Tom Talpey <tom@talpey.com>, Dominique Martinet <asmadeus@codewreck.org>, Ilya Dryomov <idryomov@gmail.com>, Christian Brauner <christian@brauner.io>, linux-afs@lists.infradead.org, linux-cifs@vger.kernel.org, linux-nfs@vger.kernel.org, ceph-devel@vger.kernel.org, v9fs@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-cachefs@redhat.com
Subject: [RFC PATCH 11/53] netfs: Add support for DIO buffering
Date: Fri, 13 Oct 2023 17:03:40 +0100
Message-ID: <20231013160423.2218093-12-dhowells@redhat.com>
In-Reply-To: <20231013160423.2218093-1-dhowells@redhat.com>
References: <20231013160423.2218093-1-dhowells@redhat.com> |
Series |
netfs, afs, cifs: Delegate high-level I/O to netfslib
|
|
Commit Message
David Howells
Oct. 13, 2023, 4:03 p.m. UTC
Add a bvec array pointer and an iterator to netfs_io_request for either
holding a copy of a DIO iterator or a list of all the bits of buffer
pointed to by a DIO iterator.

There are two problems: Firstly, if an iovec-class iov_iter is passed to
->read_iter() or ->write_iter(), this cannot be passed directly to
kernel_sendmsg() or kernel_recvmsg() as that may cause locking recursion if
a fault is generated, so we need to keep track of the pages involved
separately.

Secondly, if the I/O is asynchronous, we must copy the iov_iter describing
the buffer before returning to the caller as it may be immediately
deallocated.

Signed-off-by: David Howells <dhowells@redhat.com>
cc: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
cc: linux-fsdevel@vger.kernel.org
cc: linux-mm@kvack.org
---
fs/netfs/objects.c | 10 ++++++++++
include/linux/netfs.h | 3 +++
2 files changed, 13 insertions(+)
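
The second problem described in the commit message (an asynchronous request outliving the caller's iterator) can be illustrated outside the kernel. What follows is a minimal user-space sketch, not netfs code: `struct async_req`, `async_req_capture()` and `async_req_release()` are hypothetical names, and POSIX `struct iovec` stands in for the segment array behind an iov_iter. It only shows why the descriptors need to be copied before the submitting call returns.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

/* Hypothetical stand-in for an asynchronous request; not a netfs type. */
struct async_req {
	struct iovec *segs;	/* Private heap copy of the caller's segments */
	size_t nr_segs;		/* Number of elements in segs[] */
};

/*
 * Snapshot the caller's segment array so that the request no longer depends
 * on the caller's storage.  Only the descriptors are copied, not the data
 * buffers they point at.  Returns 0 on success, -1 on allocation failure.
 */
int async_req_capture(struct async_req *req,
		      const struct iovec *caller_segs, size_t nr_segs)
{
	assert(req != NULL);

	req->segs = NULL;
	req->nr_segs = 0;
	if (nr_segs == 0)
		return 0;

	/* calloc() checks the count * size multiplication for overflow. */
	req->segs = calloc(nr_segs, sizeof(*req->segs));
	if (!req->segs)
		return -1;

	memcpy(req->segs, caller_segs, nr_segs * sizeof(*req->segs));
	req->nr_segs = nr_segs;
	return 0;
}

/* Drop the private copy once the request has completed. */
void async_req_release(struct async_req *req)
{
	assert(req != NULL);

	free(req->segs);
	req->segs = NULL;
	req->nr_segs = 0;
}
```

A caller could build the `struct iovec` array on its stack, call `async_req_capture()`, and return immediately; the request still holds valid descriptors. The data buffers themselves must stay alive separately, which in this patch appears to be the role of the pages tracked in direct_bv[].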
Comments
On Fri, 2023-10-13 at 17:03 +0100, David Howells wrote:
> Add a bvec array pointer and an iterator to netfs_io_request for either
> holding a copy of a DIO iterator or a list of all the bits of buffer
> pointed to by a DIO iterator.
>
> There are two problems: Firstly, if an iovec-class iov_iter is passed to
> ->read_iter() or ->write_iter(), this cannot be passed directly to
> kernel_sendmsg() or kernel_recvmsg() as that may cause locking recursion if
> a fault is generated, so we need to keep track of the pages involved
> separately.
>
> Secondly, if the I/O is asynchronous, we must copy the iov_iter describing
> the buffer before returning to the caller as it may be immediately
> deallocated.
>
> Signed-off-by: David Howells <dhowells@redhat.com>
> cc: Jeff Layton <jlayton@kernel.org>
> cc: linux-cachefs@redhat.com
> cc: linux-fsdevel@vger.kernel.org
> cc: linux-mm@kvack.org
> ---
>  fs/netfs/objects.c    | 10 ++++++++++
>  include/linux/netfs.h |  3 +++
>  2 files changed, 13 insertions(+)
>
> diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
> index 8e92b8401aaa..4396318081bf 100644
> --- a/fs/netfs/objects.c
> +++ b/fs/netfs/objects.c
> @@ -78,6 +78,7 @@ static void netfs_free_request(struct work_struct *work)
>  {
>  	struct netfs_io_request *rreq =
>  		container_of(work, struct netfs_io_request, work);
> +	unsigned int i;
>
>  	trace_netfs_rreq(rreq, netfs_rreq_trace_free);
>  	netfs_proc_del_rreq(rreq);
> @@ -86,6 +87,15 @@ static void netfs_free_request(struct work_struct *work)
>  		rreq->netfs_ops->free_request(rreq);
>  	if (rreq->cache_resources.ops)
>  		rreq->cache_resources.ops->end_operation(&rreq->cache_resources);
> +	if (rreq->direct_bv) {
> +		for (i = 0; i < rreq->direct_bv_count; i++) {
> +			if (rreq->direct_bv[i].bv_page) {
> +				if (rreq->direct_bv_unpin)
> +					unpin_user_page(rreq->direct_bv[i].bv_page);
> +			}
> +		}
> +		kvfree(rreq->direct_bv);
> +	}
>  	kfree_rcu(rreq, rcu);
>  	netfs_stat_d(&netfs_n_rh_rreq);
>  }
> diff --git a/include/linux/netfs.h b/include/linux/netfs.h
> index bd0437088f0e..66479a61ad00 100644
> --- a/include/linux/netfs.h
> +++ b/include/linux/netfs.h
> @@ -191,7 +191,9 @@ struct netfs_io_request {
>  	struct list_head	subrequests;	/* Contributory I/O operations */
>  	struct iov_iter		iter;		/* Unencrypted-side iterator */
>  	struct iov_iter		io_iter;	/* I/O (Encrypted-side) iterator */
> +	struct bio_vec		*direct_bv;	/* DIO buffer list (when handling iovec-iter) */
>  	void			*netfs_priv;	/* Private data for the netfs */
> +	unsigned int		direct_bv_count; /* Number of elements in bv[] */

nit: "number of elements in direct_bv[]"

Also, just for better readability, can you swap direct_bv and
netfs_priv? Then at least the array and count are together.

>  	unsigned int		debug_id;
>  	unsigned int		rsize;		/* Maximum read size (0 for none) */
>  	atomic_t		nr_outstanding;	/* Number of ops in progress */
> @@ -200,6 +202,7 @@ struct netfs_io_request {
>  	size_t			len;		/* Length of the request */
>  	short			error;		/* 0 or error that occurred */
>  	enum netfs_io_origin	origin;		/* Origin of the request */
> +	bool			direct_bv_unpin; /* T if direct_bv[] must be unpinned */
>  	loff_t			i_size;		/* Size of the file */
>  	loff_t			start;		/* Start position */
>  	pgoff_t			no_unlock_folio; /* Don't unlock this folio after read */
>
Jeff Layton <jlayton@kernel.org> wrote:

> > +	struct bio_vec		*direct_bv;	/* DIO buffer list (when handling iovec-iter) */
> >  	void			*netfs_priv;	/* Private data for the netfs */
> > +	unsigned int		direct_bv_count; /* Number of elements in bv[] */
>
> nit: "number of elements in direct_bv[]"
>
> Also, just for better readability, can you swap direct_bv and
> netfs_priv? Then at least the array and count are together.

Yeah - and stick a __counted_by() on too.

David
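
The layout agreed on in this exchange (array pointer and count adjacent, with the count annotated) might look roughly like the user-space sketch below. It is an illustration only: `struct layout_request`, `struct layout_bvec`, `layout_bvec_at()` and the `COUNTED_BY()` macro are hypothetical names, and `COUNTED_BY()` is deliberately a no-op stand-in for the kernel's __counted_by(), since compiler support for the attribute on pointer members is not assumed here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical no-op stand-in for the kernel's __counted_by() annotation. */
#define COUNTED_BY(member)

struct layout_page;	/* Opaque stand-in for struct page */

/* Reduced stand-in for struct bio_vec. */
struct layout_bvec {
	struct layout_page *bv_page;
	unsigned int bv_len;
	unsigned int bv_offset;
};

/* Only the fields discussed in the review; everything else is omitted. */
struct layout_request {
	void *netfs_priv;		/* Private data for the netfs */
	struct layout_bvec *direct_bv	/* DIO buffer list */
		COUNTED_BY(direct_bv_count);
	unsigned int direct_bv_count;	/* Number of elements in direct_bv[] */
	unsigned int debug_id;
};

/*
 * Bounds-checked accessor: the runtime equivalent of what the count
 * annotation expresses.  Returns NULL for a missing array or a bad index.
 */
struct layout_bvec *layout_bvec_at(struct layout_request *req, unsigned int i)
{
	assert(req != NULL);

	if (!req->direct_bv || i >= req->direct_bv_count)
		return NULL;
	return &req->direct_bv[i];
}
```

Keeping the pointer and its count next to each other makes the pairing obvious to a reader, and the annotation records the same relationship for tooling that understands it.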
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index 8e92b8401aaa..4396318081bf 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -78,6 +78,7 @@ static void netfs_free_request(struct work_struct *work)
 {
 	struct netfs_io_request *rreq =
 		container_of(work, struct netfs_io_request, work);
+	unsigned int i;

 	trace_netfs_rreq(rreq, netfs_rreq_trace_free);
 	netfs_proc_del_rreq(rreq);
@@ -86,6 +87,15 @@ static void netfs_free_request(struct work_struct *work)
 		rreq->netfs_ops->free_request(rreq);
 	if (rreq->cache_resources.ops)
 		rreq->cache_resources.ops->end_operation(&rreq->cache_resources);
+	if (rreq->direct_bv) {
+		for (i = 0; i < rreq->direct_bv_count; i++) {
+			if (rreq->direct_bv[i].bv_page) {
+				if (rreq->direct_bv_unpin)
+					unpin_user_page(rreq->direct_bv[i].bv_page);
+			}
+		}
+		kvfree(rreq->direct_bv);
+	}
 	kfree_rcu(rreq, rcu);
 	netfs_stat_d(&netfs_n_rh_rreq);
 }
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index bd0437088f0e..66479a61ad00 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -191,7 +191,9 @@ struct netfs_io_request {
 	struct list_head	subrequests;	/* Contributory I/O operations */
 	struct iov_iter		iter;		/* Unencrypted-side iterator */
 	struct iov_iter		io_iter;	/* I/O (Encrypted-side) iterator */
+	struct bio_vec		*direct_bv;	/* DIO buffer list (when handling iovec-iter) */
 	void			*netfs_priv;	/* Private data for the netfs */
+	unsigned int		direct_bv_count; /* Number of elements in bv[] */
 	unsigned int		debug_id;
 	unsigned int		rsize;		/* Maximum read size (0 for none) */
 	atomic_t		nr_outstanding;	/* Number of ops in progress */
@@ -200,6 +202,7 @@ struct netfs_io_request {
 	size_t			len;		/* Length of the request */
 	short			error;		/* 0 or error that occurred */
 	enum netfs_io_origin	origin;		/* Origin of the request */
+	bool			direct_bv_unpin; /* T if direct_bv[] must be unpinned */
 	loff_t			i_size;		/* Size of the file */
 	loff_t			start;		/* Start position */
 	pgoff_t			no_unlock_folio; /* Don't unlock this folio after read */
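
The cleanup added to netfs_free_request() can be modelled in user space to make the ownership rules easier to follow. This is a sketch under stated assumptions, not the kernel implementation: `struct model_page`, `struct model_bvec`, `struct model_request`, `model_unpin_page()` and `model_free_direct_bv()` are hypothetical stand-ins, a plain counter replaces the real page pin accounting, and `free()` replaces kvfree(). The check of the unpin flag is hoisted out of the loop, which should behave the same as the nested form in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct page: just a pin counter. */
struct model_page {
	int pin_count;
};

/* Hypothetical stand-in for struct bio_vec (page pointer only). */
struct model_bvec {
	struct model_page *bv_page;
};

/* The three fields the patch adds, in isolation. */
struct model_request {
	struct model_bvec *direct_bv;	/* DIO buffer list, or NULL */
	unsigned int direct_bv_count;	/* Number of elements in direct_bv[] */
	bool direct_bv_unpin;		/* True if the pages must be unpinned */
};

/* Stand-in for unpin_user_page(): drop one pin. */
void model_unpin_page(struct model_page *page)
{
	assert(page->pin_count > 0);
	page->pin_count--;
}

/* Mirror of the cleanup hunk: unpin if required, then free the array. */
void model_free_direct_bv(struct model_request *req)
{
	unsigned int i;

	if (!req->direct_bv)
		return;

	if (req->direct_bv_unpin) {
		for (i = 0; i < req->direct_bv_count; i++) {
			/* Slots may be empty; skip those. */
			if (req->direct_bv[i].bv_page)
				model_unpin_page(req->direct_bv[i].bv_page);
		}
	}

	free(req->direct_bv);
	req->direct_bv = NULL;
	req->direct_bv_count = 0;
}
```

The array itself is always freed, while the pages are only released when the request took pins on them; a request whose buffer pages were not pinned leaves the page counters untouched.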