From patchwork Thu Aug 3 14:04:26 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?=
X-Patchwork-Id: 13089
From: "huangjie.albert"
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
 pabeni@redhat.com
Cc: "huangjie.albert", Alexei Starovoitov, Daniel Borkmann,
 Jesper Dangaard Brouer, John Fastabend, Björn Töpel, Magnus Karlsson,
 Maciej Fijalkowski, Jonathan Lemon, Pavel Begunkov, Yunsheng Lin,
 Kees Cook, Richard Gobert,
 netdev@vger.kernel.org (open list:NETWORKING DRIVERS),
 linux-kernel@vger.kernel.org (open list),
 bpf@vger.kernel.org (open list:XDP (eXpress Data Path))
Subject: [RFC Optimizing veth xsk performance 00/10]
Date: Thu, 3 Aug 2023 22:04:26 +0800
Message-Id: <20230803140441.53596-1-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

AF_XDP is a kernel bypass technology that can greatly improve performance.
However, for virtual devices like veth, even with the use of AF_XDP sockets,
there are still many additional software paths that consume CPU resources.
This patch series focuses on optimizing the performance of AF_XDP sockets
for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
Patch 5 introduces a tx queue and tx napi for packet transmission, patch 9
primarily implements zero-copy, and patch 10 adds support for batch sending
of IPv4 UDP packets. These optimizations significantly shorten the software
path and support checksum offload.

I tested these features with the topology shown below:

veth<-->veth-peer                                           veth1-peer<--->veth1
     1      |                                                    |      7
            |2                                                  6|
            |                                                    |
          bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
                    3                    4                 5
        (machine1)                                        (machine2)

An AF_XDP socket is attached to veth and to veth1, and packets are sent to
the physical NIC (eth0).

veth:(172.17.0.2/24)
bridge:(172.17.0.1/24)
eth0:(192.168.156.66/24)

eth1(172.17.0.2/24)
bridge1:(172.17.0.1/24)
eth0:(192.168.156.88/24)

After setting the default route, SNAT, and DNAT, we can run tests to get
the performance results.

Packets are sent from veth to veth1.

AF_XDP test tool:
link: https://github.com/cclinuxer/libxudp
send (veth):
./objs/xudpperf send --dst 192.168.156.88:6002 -l 1300
recv (veth1):
./objs/xudpperf recv --src 172.17.0.2:6002

UDP test tool: iperf3
send (veth):
iperf3 -c 192.168.156.88 -p 6002 -l 1300 -b 60G -u
recv (veth1):
iperf3 -s -p 6002

Performance (tested with the libxdp lib):
UDP                              : 250 Kpps (with 100% cpu)
AF_XDP no zerocopy + no batch    : 480 Kpps (with ksoftirqd 100% cpu)
AF_XDP with zerocopy + no batch  : 540 Kpps (with ksoftirqd 100% cpu)
AF_XDP with batch + zerocopy     : 1.5 Mpps (with ksoftirqd 15% cpu)

With AF_XDP batching, the libxdp user-space program becomes the bottleneck,
so the softirq does not reach its limit.

This is just an RFC patch series, and some code details still need further
consideration. Please review this proposal.

Thanks!

huangjie.albert (10):
  veth: Implement ethtool's get_ringparam() callback
  xsk: add dma_check_skip for skipping dma check
  veth: add support for send queue
  xsk: add xsk_tx_completed_addr function
  veth: use send queue tx napi to xmit xsk tx desc
  veth: add ndo_xsk_wakeup callback for veth
  sk_buff: add destructor_arg_xsk_pool for zero copy
  xdp: add xdp_mem_type MEM_TYPE_XSK_BUFF_POOL_TX
  veth: support zero copy for af xdp
  veth: af_xdp tx batch support for ipv4 udp

 drivers/net/veth.c          | 729 +++++++++++++++++++++++++++++++++++-
 include/linux/skbuff.h      |   1 +
 include/net/xdp.h           |   1 +
 include/net/xdp_sock_drv.h  |   1 +
 include/net/xsk_buff_pool.h |   1 +
 net/xdp/xsk.c               |   6 +
 net/xdp/xsk_buff_pool.c     |   3 +-
 net/xdp/xsk_queue.h         |  11 +
 8 files changed, 751 insertions(+), 2 deletions(-)
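
As a rough sanity check on the performance figures reported in this cover
letter, the sketch below converts each packet rate into a speedup over the
plain UDP baseline and into an approximate payload bandwidth. It is only an
illustration: the packet rates are copied from the results above, but
treating "-l 1300" as 1300 bytes of UDP payload per packet is an assumption
(the cover letter does not say so for xudpperf), and the helper names are
hypothetical.

```python
# Packet rates copied from the cover letter's results (packets per second).
RESULTS_PPS = {
    "UDP": 250_000,
    "AF_XDP no zerocopy + no batch": 480_000,
    "AF_XDP with zerocopy + no batch": 540_000,
    "AF_XDP with batch + zerocopy": 1_500_000,
}

# Assumption: "-l 1300" means 1300 bytes of UDP payload per packet.
PAYLOAD_BYTES = 1300


def speedup(pps: int, baseline_pps: int) -> float:
    """Return the packet rate relative to the baseline packet rate."""
    return pps / baseline_pps


def payload_gbps(pps: int, payload_bytes: int = PAYLOAD_BYTES) -> float:
    """Return payload bandwidth in Gbit/s, ignoring all header overhead."""
    return pps * payload_bytes * 8 / 1e9


if __name__ == "__main__":
    baseline = RESULTS_PPS["UDP"]
    for name, pps in RESULTS_PPS.items():
        print(f"{name:32s} {speedup(pps, baseline):5.2f}x "
              f"{payload_gbps(pps):6.2f} Gbit/s")
```

Under that assumption, the batch + zerocopy case is 6x the UDP baseline and
carries about 15.6 Gbit/s of payload, while zerocopy alone improves on the
copy-mode AF_XDP path by a factor of 1.125.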