Message ID: 20230603204939.1598818-2-AVKrasnov@sberdevices.ru
State: New
Headers:
From: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
To: Stefan Hajnoczi <stefanha@redhat.com>, Stefano Garzarella <sgarzare@redhat.com>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, Bobby Eshleman <bobby.eshleman@bytedance.com>
Cc: kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, kernel@sberdevices.ru, oxffffaa@gmail.com, avkrasnov@sberdevices.ru, Arseniy Krasnov <AVKrasnov@sberdevices.ru>
Subject: [RFC PATCH v4 01/17] vsock/virtio: read data from non-linear skb
Date: Sat, 3 Jun 2023 23:49:23 +0300
Message-ID: <20230603204939.1598818-2-AVKrasnov@sberdevices.ru>
In-Reply-To: <20230603204939.1598818-1-AVKrasnov@sberdevices.ru>
References: <20230603204939.1598818-1-AVKrasnov@sberdevices.ru>
Series: vsock: MSG_ZEROCOPY flag support
Commit Message
Arseniy Krasnov
June 3, 2023, 8:49 p.m. UTC
This is a preparation patch for non-linear skbuff handling. It replaces
direct calls of 'memcpy_to_msg()' with 'skb_copy_datagram_iter()'. The
main advantage of the latter is that it can handle the paged part of the
skb by using 'kmap()' on each page; if the skb has no pages, it behaves
like a simple copy to the iov iterator. This patch also adds a new field
to the control block of the skb - this value holds the current offset in
the skb from which the next portion of data is read (regardless of
whether the skb is linear or not). The idea is that
'skb_copy_datagram_iter()' handles both types of skb internally - it
just needs an offset from which to copy data from the given skb. This
offset is incremented on each read from the skb. This approach avoids
special handling of non-linear skbs:
1) We can't call 'skb_pull()' on them, because it updates the 'data'
   pointer.
2) We would also need to update 'data_len' on each read from such an
   skb.
Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>
---
include/linux/virtio_vsock.h | 1 +
net/vmw_vsock/virtio_transport_common.c | 26 +++++++++++++++++--------
2 files changed, 19 insertions(+), 8 deletions(-)
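The offset-tracking idea described in the commit message can be sketched
outside the kernel. The following is a minimal userspace illustration, not
code from this patch: a per-buffer control block stores a read offset, the
copy helper starts from that offset and advances it, and the buffer counts
as consumed once the offset reaches its length. All names here ('struct
buf', 'buf_cb', 'buf_read') are invented for the example.

#include <stdio.h>
#include <string.h>

/* Per-buffer "control block" that only tracks how far we have read. */
struct buf_cb {
	unsigned int frag_off;
};

struct buf {
	struct buf_cb cb;
	unsigned int len;
	const char *data;
};

/* Copy up to 'want' bytes starting at cb.frag_off, then advance the offset.
 * The buffer itself is never modified, unlike an skb_pull()-style consume.
 */
static size_t buf_read(struct buf *b, char *dst, size_t want)
{
	size_t avail = b->len - b->cb.frag_off;
	size_t bytes = want < avail ? want : avail;

	memcpy(dst, b->data + b->cb.frag_off, bytes);
	b->cb.frag_off += bytes;
	return bytes;
}

int main(void)
{
	struct buf b = { .cb = { 0 }, .len = 11, .data = "hello vsock" };
	char out[16] = { 0 };

	buf_read(&b, out, 5);
	printf("first read : %.5s (frag_off=%u)\n", out, b.cb.frag_off);

	buf_read(&b, out, sizeof(out) - 1);
	printf("second read: %s (frag_off=%u)\n", out, b.cb.frag_off);

	/* Fully consumed once frag_off == len, mirroring the patch's check. */
	printf("consumed: %s\n", b.cb.frag_off == b.len ? "yes" : "no");
	return 0;
}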
Comments
On Sat, Jun 03, 2023 at 11:49:23PM +0300, Arseniy Krasnov wrote:
> This is a preparation patch for non-linear skbuff handling. It replaces
> direct calls of 'memcpy_to_msg()' with 'skb_copy_datagram_iter()'.
[...]
> Signed-off-by: Arseniy Krasnov <AVKrasnov@sberdevices.ru>

LGTM.

Reviewed-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
On Sat, Jun 03, 2023 at 11:49:23PM +0300, Arseniy Krasnov wrote:
>This is a preparation patch for non-linear skbuff handling. It replaces
>direct calls of 'memcpy_to_msg()' with 'skb_copy_datagram_iter()'.

[...]

>@@ -414,24 +417,28 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> 		skb = skb_peek(&vvs->rx_queue);
>
> 		bytes = len - total;
>-		if (bytes > skb->len)
>-			bytes = skb->len;
>+		if (bytes > skb->len - VIRTIO_VSOCK_SKB_CB(skb)->frag_off)
>+			bytes = skb->len - VIRTIO_VSOCK_SKB_CB(skb)->frag_off;

What about storing `VIRTIO_VSOCK_SKB_CB(skb)->frag_off` in a variable?
More for readability than optimization, which I hope the compiler
already does on its own.

The rest LGTM.

Stefano
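For reference, the suggested change could look roughly like the fragment
below. This is only a sketch of the idea, not a revision actually posted in
this thread, and the local variable name 'frag_off' is hypothetical.

	/* Possible rework of the dequeue hunk: cache the control-block
	 * offset in a local variable instead of repeating the
	 * VIRTIO_VSOCK_SKB_CB(skb)->frag_off expression.
	 */
	u32 frag_off = VIRTIO_VSOCK_SKB_CB(skb)->frag_off;

	bytes = len - total;
	if (bytes > skb->len - frag_off)
		bytes = skb->len - frag_off;

	spin_unlock_bh(&vvs->rx_lock);

	err = skb_copy_datagram_iter(skb, frag_off, &msg->msg_iter, bytes);
	if (err)
		goto out;

	spin_lock_bh(&vvs->rx_lock);

	total += bytes;
	/* Write back through the macro so the stored offset advances. */
	VIRTIO_VSOCK_SKB_CB(skb)->frag_off += bytes;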
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index c58453699ee9..17dbb7176e37 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -12,6 +12,7 @@
 struct virtio_vsock_skb_cb {
 	bool reply;
 	bool tap_delivered;
+	u32 frag_off;
 };
 
 #define VIRTIO_VSOCK_SKB_CB(skb) ((struct virtio_vsock_skb_cb *)((skb)->cb))
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index b769fc258931..5819a9cd4515 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -355,7 +355,7 @@ virtio_transport_stream_do_peek(struct vsock_sock *vsk,
 	spin_lock_bh(&vvs->rx_lock);
 
 	skb_queue_walk_safe(&vvs->rx_queue, skb, tmp) {
-		off = 0;
+		off = VIRTIO_VSOCK_SKB_CB(skb)->frag_off;
 
 		if (total == len)
 			break;
@@ -370,7 +370,10 @@ virtio_transport_stream_do_peek(struct vsock_sock *vsk,
 		 */
 		spin_unlock_bh(&vvs->rx_lock);
 
-		err = memcpy_to_msg(msg, skb->data + off, bytes);
+		err = skb_copy_datagram_iter(skb, off,
+					     &msg->msg_iter,
+					     bytes);
+
 		if (err)
 			goto out;
 
@@ -414,24 +417,28 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
 		skb = skb_peek(&vvs->rx_queue);
 
 		bytes = len - total;
-		if (bytes > skb->len)
-			bytes = skb->len;
+		if (bytes > skb->len - VIRTIO_VSOCK_SKB_CB(skb)->frag_off)
+			bytes = skb->len - VIRTIO_VSOCK_SKB_CB(skb)->frag_off;
 
 		/* sk_lock is held by caller so no one else can dequeue.
 		 * Unlock rx_lock since memcpy_to_msg() may sleep.
 		 */
 		spin_unlock_bh(&vvs->rx_lock);
 
-		err = memcpy_to_msg(msg, skb->data, bytes);
+		err = skb_copy_datagram_iter(skb,
+					     VIRTIO_VSOCK_SKB_CB(skb)->frag_off,
+					     &msg->msg_iter, bytes);
+
 		if (err)
 			goto out;
 
 		spin_lock_bh(&vvs->rx_lock);
 
 		total += bytes;
-		skb_pull(skb, bytes);
 
-		if (skb->len == 0) {
+		VIRTIO_VSOCK_SKB_CB(skb)->frag_off += bytes;
+
+		if (skb->len == VIRTIO_VSOCK_SKB_CB(skb)->frag_off) {
 			u32 pkt_len = le32_to_cpu(virtio_vsock_hdr(skb)->len);
 
 			virtio_transport_dec_rx_pkt(vvs, pkt_len);
@@ -503,7 +510,10 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
 			 */
 			spin_unlock_bh(&vvs->rx_lock);
 
-			err = memcpy_to_msg(msg, skb->data, bytes_to_copy);
+			err = skb_copy_datagram_iter(skb, 0,
+						     &msg->msg_iter,
+						     bytes_to_copy);
+
 			if (err) {
 				/* Copy of message failed. Rest of
 				 * fragments will be freed without copy.