From patchwork Wed Jul 19 00:50:04 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Bobby Eshleman
X-Patchwork-Id: 12241
From: Bobby Eshleman
Subject: [PATCH RFC net-next v5 00/14] virtio/vsock: support datagrams
Date: Wed, 19 Jul 2023 00:50:04 +0000
Message-Id: <20230413-b4-vsock-dgram-v5-0-581bd37fdb26@bytedance.com>
To: Stefan Hajnoczi, Stefano Garzarella, "Michael S. Tsirkin",
 Jason Wang, Xuan Zhuo, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, "K. Y. Srinivasan", Haiyang Zhang, Wei Liu, Dexuan Cui,
 Bryan Tan, Vishnu Dasa, VMware PV-Drivers Reviewers
Cc: Dan Carpenter, Simon Horman, Krasnov Arseniy, kvm@vger.kernel.org,
 virtualization@lists.linux-foundation.org, netdev@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org,
 bpf@vger.kernel.org, Bobby Eshleman, Jiang Wang
X-Mailer: b4 0.12.2
X-Mailing-List: linux-kernel@vger.kernel.org

Hey all!

This series introduces support for datagrams to virtio/vsock.

It is a spin-off (and smaller version) of this series from the summer:
  https://lore.kernel.org/all/cover.1660362668.git.bobby.eshleman@bytedance.com/

Please note that this is an RFC and should not be merged until
associated changes are made to the virtio specification, which will
follow after discussion from this series.

Another aside, the v4 of the series has only been mildly tested with a
run of tools/testing/vsock/vsock_test. Some code likely needs cleaning
up, but I'm hoping to get some of the design choices agreed upon before
spending too much time making it pretty.

This series first supports datagrams in a basic form for virtio, and
then optimizes the sendpath for all datagram transports.

The result is a very fast datagram communication protocol that
outperforms even UDP on multi-queue virtio-net w/ vhost on a variety
of multi-threaded workload samples.
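
As a rough illustration of what datagram vsock looks like from
userspace, here is a minimal sender sketch. It is not part of the
series or of tools/testing/vsock: the helper names vsock_addr_fill()
and vsock_dgram_send_once() are made up for this example. It relies
only on the existing AF_VSOCK socket API (struct sockaddr_vm from
<linux/vm_sockets.h>). Whether socket() or sendto() succeeds depends on
the running kernel having a transport that supports SOCK_DGRAM.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <linux/vm_sockets.h>

/* Hypothetical helper: fill in a vsock address for (cid, port). */
void vsock_addr_fill(struct sockaddr_vm *addr, unsigned int cid,
		     unsigned int port)
{
	assert(addr != NULL);	/* passing NULL is a programming error */

	memset(addr, 0, sizeof(*addr));
	addr->svm_family = AF_VSOCK;
	addr->svm_cid = cid;
	addr->svm_port = port;
}

/*
 * Hypothetical helper: send a single datagram to (cid, port).
 *
 * Returns the number of bytes handed to the socket, or -1 with errno
 * set (for example when no loaded transport supports datagrams).
 */
ssize_t vsock_dgram_send_once(unsigned int cid, unsigned int port,
			      const void *buf, size_t len)
{
	struct sockaddr_vm addr;
	ssize_t ret;
	int saved_errno;
	int fd;

	fd = socket(AF_VSOCK, SOCK_DGRAM, 0);
	if (fd < 0)
		return -1;

	vsock_addr_fill(&addr, cid, port);
	ret = sendto(fd, buf, len, 0, (struct sockaddr *)&addr, sizeof(addr));

	/* close() may overwrite errno, so preserve the sendto() result. */
	saved_errno = errno;
	close(fd);
	errno = saved_errno;

	return ret;
}
```

From a guest, calling vsock_dgram_send_once(VMADDR_CID_HOST, 1234,
"ping", 4) would attempt to send four bytes to port 1234 on the host.
The port number is arbitrary and chosen only for the example.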

For those that are curious, some summary data comparing UDP and VSOCK
DGRAM (N=5):

    vCPUS: 16
    virtio-net queues: 16
    payload size: 4KB
    Setup: bare metal + vm (non-nested)

    UDP: 287.59 MB/s
    VSOCK DGRAM: 509.2 MB/s

Some notes about the implementation...

This datagram implementation forces datagrams to self-throttle according
to the threshold set by sk_sndbuf. It behaves similarly to the credits
used by streams in its effect on throughput and memory consumption, but
it is not influenced by the receiving socket as credits are.

The device drops packets silently.

As discussed previously, this series introduces datagrams and defers
fairness to future work. See discussion in v2 for more context around
datagrams, fairness, and this implementation.

Signed-off-by: Bobby Eshleman
---
Changes in v5:
- teach vhost to drop dgram when a datagram exceeds the receive buffer
- now uses MSG_ERRQUEUE and depends on Arseniy's zerocopy patch:
  "vsock: read from socket's error queue"
- replace multiple ->dgram_* callbacks with single ->dgram_addr_init()
  callback
- refactor virtio dgram skb allocator to reduce conflicts w/ zerocopy
  series
- add _fallback/_FALLBACK suffix to dgram transport variables/macros
- add WARN_ONCE() for table_size / VSOCK_HASH issue
- add static to vsock_find_bound_socket_common
- dedupe code in vsock_dgram_sendmsg() using module_got var
- drop concurrent sendmsg() for dgram and defer to future series
- Add more tests
  - test EHOSTUNREACH in errqueue
  - test stream + dgram address collision
- improve clarity of dgram msg bounds test code
- Link to v4: https://lore.kernel.org/r/20230413-b4-vsock-dgram-v4-0-0cebbb2ae899@bytedance.com

Changes in v4:
- style changes
  - vsock: use sk_vsock(vsk) in vsock_dgram_recvmsg instead of
    &sk->vsk
  - vsock: fix xmas tree declaration
  - vsock: fix spacing issues
  - virtio/vsock: virtio_transport_recv_dgram returns void because err
    unused
- sparse analysis warnings/errors
  - virtio/vsock: fix uninitialized skerr on destroy
  - virtio/vsock: fix uninitialized err var on goto out
  - vsock: fix declarations that need static
  - vsock: fix __rcu annotation order
- bugs
  - vsock: fix null ptr in remote_info code
  - vsock/dgram: make transport_dgram a fallback instead of first
    priority
  - vsock: remove redundant rcu read lock acquire in getname()
- tests
  - add more tests (message bounds and more)
  - add vsock_dgram_bind() helper
  - add vsock_dgram_connect() helper

Changes in v3:
- Support multi-transport dgram, changing logic in connect/bind
  to support VMCI case
- Support per-pkt transport lookup for sendto() case
- Fix dgram_allow() implementation
- Fix dgram feature bit number (now it is 3)
- Fix binding so dgram and connectible (cid,port) spaces are
  non-overlapping
- RCU protect transport ptr so connect() calls never leave
  a lockless read of the transport and remote_addr are always
  in sync
- Link to v2: https://lore.kernel.org/r/20230413-b4-vsock-dgram-v2-0-079cc7cee62e@bytedance.com

---
Bobby Eshleman (13):
      af_vsock: generalize vsock_dgram_recvmsg() to all transports
      af_vsock: refactor transport lookup code
      af_vsock: support multi-transport datagrams
      af_vsock: generalize bind table functions
      af_vsock: use a separate dgram bind table
      virtio/vsock: add VIRTIO_VSOCK_TYPE_DGRAM
      virtio/vsock: add common datagram send path
      af_vsock: add vsock_find_bound_dgram_socket()
      virtio/vsock: add common datagram recv path
      virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
      vhost/vsock: implement datagram support
      vsock/loopback: implement datagram support
      virtio/vsock: implement datagram support

Jiang Wang (1):
      test/vsock: add vsock dgram tests

 drivers/vhost/vsock.c                   |  64 ++-
 include/linux/virtio_vsock.h            |  10 +-
 include/net/af_vsock.h                  |  14 +-
 include/uapi/linux/virtio_vsock.h       |   2 +
 net/vmw_vsock/af_vsock.c                | 281 ++++++++++---
 net/vmw_vsock/hyperv_transport.c        |  13 -
 net/vmw_vsock/virtio_transport.c        |  26 +-
 net/vmw_vsock/virtio_transport_common.c | 190 +++++++--
 net/vmw_vsock/vmci_transport.c          |  60 +--
 net/vmw_vsock/vsock_loopback.c          |  10 +-
 tools/testing/vsock/util.c              | 141 ++++++-
 tools/testing/vsock/util.h              |   6 +
 tools/testing/vsock/vsock_test.c        | 680 ++++++++++++++++++++++++++++++++
 13 files changed, 1320 insertions(+), 177 deletions(-)
---
base-commit: 37cadc266ebdc7e3531111c2b3304fa01b2131e8
change-id: 20230413-b4-vsock-dgram-3b6eba6a64e5

Best regards,