Message ID | 20230118-support-vsock-sockmap-connectible-v2-0-58ffafde0965@bytedance.com |
---|---|
Headers | From: Bobby Eshleman <bobby.eshleman@bytedance.com> To: Stefan Hajnoczi <stefanha@redhat.com>, Stefano Garzarella <sgarzare@redhat.com>, "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, "David S. Miller" <davem@davemloft.net>, Eric Dumazet <edumazet@google.com>, Jakub Kicinski <kuba@kernel.org>, Paolo Abeni <pabeni@redhat.com>, Andrii Nakryiko <andrii@kernel.org>, Mykola Lysenko <mykolal@fb.com>, Alexei Starovoitov <ast@kernel.org>, Daniel Borkmann <daniel@iogearbox.net>, Martin KaFai Lau <martin.lau@linux.dev>, Song Liu <song@kernel.org>, Yonghong Song <yhs@fb.com>, John Fastabend <john.fastabend@gmail.com>, KP Singh <kpsingh@kernel.org>, Stanislav Fomichev <sdf@google.com>, Hao Luo <haoluo@google.com>, Jiri Olsa <jolsa@kernel.org>, Shuah Khan <shuah@kernel.org> Cc: Bobby Eshleman <bobbyeshleman@gmail.com>, Bobby Eshleman <bobby.eshleman@bytedance.com>, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kselftest@vger.kernel.org, jakub@cloudflare.com, hdanton@sina.com, cong.wang@bytedance.com Subject: [PATCH RFC net-next v2 0/3] vsock: add support for sockmap Date: Mon, 30 Jan 2023 20:35:11 -0800 Message-Id: <20230118-support-vsock-sockmap-connectible-v2-0-58ffafde0965@bytedance.com> |
Series | vsock: add support for sockmap |
Message
Bobby Eshleman
Jan. 31, 2023, 4:35 a.m. UTC
Add support for sockmap to vsock.
We're testing usage of vsock as a way to redirect guest-local UDS requests to
the host and this patch series greatly improves the performance of such a
setup.
Compared to copying packets via userspace, this improves throughput by 121% in
basic testing.
Tested as follows.
Setup: guest unix dgram sender -> guest vsock redirector -> host vsock server
Threads: 1
Payload: 64k
No sockmap:
- 76.3 MB/s
- The guest vsock redirector was
"socat VSOCK-CONNECT:2:1234 UNIX-RECV:/path/to/sock"
Using sockmap (this patch):
- 168.8 MB/s (+121%)
- The guest redirector was a simple sockmap echo server,
redirecting unix ingress to vsock 2:1234 egress.
- Same sender and server programs
*Note: these numbers are from RFC v1
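The +121% figure can be checked from the two throughputs reported above. The following is only a small arithmetic sketch; the function name `throughput_gain_percent` is made up for illustration and is not part of the series.

```c
#include <assert.h>

/*
 * Relative throughput gain, in percent, of `with_sockmap` over `baseline`.
 * Both arguments must use the same unit (MB/s in the numbers above).
 * Hypothetical helper for illustration only.
 */
double throughput_gain_percent(double baseline, double with_sockmap)
{
	assert(baseline > 0.0); /* a zero baseline has no meaningful ratio */
	return (with_sockmap - baseline) / baseline * 100.0;
}
```

Calling `throughput_gain_percent(76.3, 168.8)` gives roughly 121.2, which rounds to the 121% quoted in the cover letter.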
Only the virtio transport has been tested. The loopback transport was used in
writing bpf/selftests, but not thoroughly tested otherwise.
This series requires the skb patch.
Changes in v2:
- vsock/bpf: rename vsock_dgram_* -> vsock_*
- vsock/bpf: change sk_psock_{get,put} and {lock,release}_sock() order to
minimize slock hold time
- vsock/bpf: use "new style" wait
- vsock/bpf: fix bug in wait log
- vsock/bpf: add check that recvmsg sk_type is one of dgram, seqpacket, or stream.
Return error if not one of the three.
- virtio/vsock: comment __skb_recv_datagram() usage
- virtio/vsock: do not init copied in read_skb()
- vsock/bpf: add ifdef guard around struct proto in dgram_recvmsg()
- selftests/bpf: add vsock loopback config for aarch64
- selftests/bpf: add vsock loopback config for s390x
- selftests/bpf: remove vsock device from vmtest.sh qemu machine
- selftests/bpf: remove CONFIG_VIRTIO_VSOCKETS=y from config.x86_64
- vsock/bpf: move transport-related (e.g., if (!vsk->transport)) checks out of
fast path
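For readers unfamiliar with the recvmsg type check listed in the changes above, here is a minimal userspace sketch of that kind of check. It is not the code from the patch: the helper name `check_recvmsg_sk_type` and the choice of `-EOPNOTSUPP` as the error value are assumptions made for illustration.

```c
#include <errno.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: accept only the three socket types named in the
 * changelog (dgram, seqpacket, stream) and reject anything else.
 * Returns 0 on success or a negative errno value. The specific error
 * code is an assumption; the cover letter only says "Return error".
 */
int check_recvmsg_sk_type(int sk_type)
{
	switch (sk_type) {
	case SOCK_DGRAM:
	case SOCK_SEQPACKET:
	case SOCK_STREAM:
		return 0;
	default:
		return -EOPNOTSUPP;
	}
}
```

With this sketch, `check_recvmsg_sk_type(SOCK_STREAM)` returns 0, while an unsupported type such as `SOCK_RAW` yields a negative error.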
Signed-off-by: Bobby Eshleman <bobby.eshleman@bytedance.com>
---
Bobby Eshleman (3):
      vsock: support sockmap
      selftests/bpf: add vsock to vmtest.sh
      selftests/bpf: Add a test case for vsock sockmap
drivers/vhost/vsock.c | 1 +
include/linux/virtio_vsock.h | 1 +
include/net/af_vsock.h | 17 ++
net/vmw_vsock/Makefile | 1 +
net/vmw_vsock/af_vsock.c | 55 ++++++-
net/vmw_vsock/virtio_transport.c | 2 +
net/vmw_vsock/virtio_transport_common.c | 24 +++
net/vmw_vsock/vsock_bpf.c | 175 +++++++++++++++++++++
net/vmw_vsock/vsock_loopback.c | 2 +
tools/testing/selftests/bpf/config.aarch64 | 2 +
tools/testing/selftests/bpf/config.s390x | 3 +
tools/testing/selftests/bpf/config.x86_64 | 3 +
.../selftests/bpf/prog_tests/sockmap_listen.c | 163 +++++++++++++++++++
13 files changed, 443 insertions(+), 6 deletions(-)
---
base-commit: d83115ce337a632f996e44c9f9e18cadfcf5a094
change-id: 20230118-support-vsock-sockmap-connectible-2e1297d2111a
Best regards,
Comments
On Sun, Feb 05, 2023 at 12:08:49PM -0800, Cong Wang wrote:
> On Mon, Jan 30, 2023 at 08:35:11PM -0800, Bobby Eshleman wrote:
> > Add support for sockmap to vsock.
> >
> > We're testing usage of vsock as a way to redirect guest-local UDS requests to
> > the host and this patch series greatly improves the performance of such a
> > setup.
> >
> > Compared to copying packets via userspace, this improves throughput by 121% in
> > basic testing.
> >
> > Tested as follows.
> >
> > Setup: guest unix dgram sender -> guest vsock redirector -> host vsock server
> > Threads: 1
> > Payload: 64k
> > No sockmap:
> > - 76.3 MB/s
> > - The guest vsock redirector was
> >   "socat VSOCK-CONNECT:2:1234 UNIX-RECV:/path/to/sock"
> > Using sockmap (this patch):
> > - 168.8 MB/s (+121%)
> > - The guest redirector was a simple sockmap echo server,
> >   redirecting unix ingress to vsock 2:1234 egress.
> > - Same sender and server programs
> >
> > *Note: these numbers are from RFC v1
> >
> > Only the virtio transport has been tested. The loopback transport was used in
> > writing bpf/selftests, but not thoroughly tested otherwise.
> >
> > This series requires the skb patch.
> >
>
> Looks good to me. Definitely good to go as non-RFC.
>
> Thanks.

Thank you for the review.

Best,
Bobby
On Mon, Jan 30, 2023 at 08:35:11PM -0800, Bobby Eshleman wrote:
> Add support for sockmap to vsock.
>
> We're testing usage of vsock as a way to redirect guest-local UDS requests to
> the host and this patch series greatly improves the performance of such a
> setup.
>
> Compared to copying packets via userspace, this improves throughput by 121% in
> basic testing.
>
> Tested as follows.
>
> Setup: guest unix dgram sender -> guest vsock redirector -> host vsock server
> Threads: 1
> Payload: 64k
> No sockmap:
> - 76.3 MB/s
> - The guest vsock redirector was
>   "socat VSOCK-CONNECT:2:1234 UNIX-RECV:/path/to/sock"
> Using sockmap (this patch):
> - 168.8 MB/s (+121%)
> - The guest redirector was a simple sockmap echo server,
>   redirecting unix ingress to vsock 2:1234 egress.
> - Same sender and server programs
>
> *Note: these numbers are from RFC v1
>
> Only the virtio transport has been tested. The loopback transport was used in
> writing bpf/selftests, but not thoroughly tested otherwise.
>
> This series requires the skb patch.
>

Looks good to me. Definitely good to go as non-RFC.

Thanks.
Hi Bobby,
sorry for my late reply, but I have been offline these days.
I came back a few days ago and had to work off some accumulated work :-)

On Mon, Jan 30, 2023 at 08:35:11PM -0800, Bobby Eshleman wrote:
>Add support for sockmap to vsock.
>
>We're testing usage of vsock as a way to redirect guest-local UDS requests to
>the host and this patch series greatly improves the performance of such a
>setup.
>
>Compared to copying packets via userspace, this improves throughput by 121% in
>basic testing.
>
>Tested as follows.
>
>Setup: guest unix dgram sender -> guest vsock redirector -> host vsock server
>Threads: 1
>Payload: 64k
>No sockmap:
>- 76.3 MB/s
>- The guest vsock redirector was
>  "socat VSOCK-CONNECT:2:1234 UNIX-RECV:/path/to/sock"
>Using sockmap (this patch):
>- 168.8 MB/s (+121%)
>- The guest redirector was a simple sockmap echo server,
>  redirecting unix ingress to vsock 2:1234 egress.
>- Same sender and server programs
>
>*Note: these numbers are from RFC v1
>
>Only the virtio transport has been tested. The loopback transport was used in
>writing bpf/selftests, but not thoroughly tested otherwise.
>
>This series requires the skb patch.
>
>Changes in v2:
>- vsock/bpf: rename vsock_dgram_* -> vsock_*
>- vsock/bpf: change sk_psock_{get,put} and {lock,release}_sock() order to
>  minimize slock hold time
>- vsock/bpf: use "new style" wait
>- vsock/bpf: fix bug in wait log
>- vsock/bpf: add check that recvmsg sk_type is one dgram, seqpacket, or stream.
>  Return error if not one of the three.
>- virtio/vsock: comment __skb_recv_datagram() usage
>- virtio/vsock: do not init copied in read_skb()
>- vsock/bpf: add ifdef guard around struct proto in dgram_recvmsg()
>- selftests/bpf: add vsock loopback config for aarch64
>- selftests/bpf: add vsock loopback config for s390x
>- selftests/bpf: remove vsock device from vmtest.sh qemu machine
>- selftests/bpf: remove CONFIG_VIRTIO_VSOCKETS=y from config.x86_64
>- vsock/bpf: move transport-related (e.g., if (!vsk->transport)) checks out of
>  fast path

The series looks in a good shape. I left some small comments on the
first patch, but I think the next version could be without RFC, so we
can receive some feedbacks from net/bpf maintainers.

Great job!

Thanks,
Stefano