From patchwork Thu Aug 3 14:04:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?= X-Patchwork-Id: 130700 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1243801vqx; Thu, 3 Aug 2023 08:59:31 -0700 (PDT) X-Google-Smtp-Source: APBJJlEYjwBaH7CYjOxB2Wo4MTA3/dfq49NPMUQFdyAGp7HO1qL1D7/HLLmMPsvo3uJGau4ALo2r X-Received: by 2002:a17:90b:1e43:b0:268:e30e:e92f with SMTP id pi3-20020a17090b1e4300b00268e30ee92fmr12437631pjb.18.1691078370836; Thu, 03 Aug 2023 08:59:30 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691078370; cv=none; d=google.com; s=arc-20160816; b=i5pAn1psWAgVpOefKiOB2B2t+dWnw0+Cln+Ow21KC8WZPUHg4fJE9K7MtH558VsXMw DubrYBS8arDOS+H6w0S29DFJXvajRhs5rg/Xora6AZC5Te6EmMY+7+t7l5FX7snI6Unt cKF1c+/LExd3Z9ezFnzW3fON37vJRPQRoCeWiBfGNP6DY5IlgKx8JtYIGRefqQ8YGTuf ri9cp+TbgSRtb5yTwsb94xnyprBcs9VbIcYC3+/uts3a2/SQwWC6dKGL9wM8za+8WmsW 7zoyyncxUZNiHT0NXzUKMX2vVxcBAlqHTbW10ONQf7+Mk8wuZw1k1+eXGJeqcAYFEzE4 mA6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=7lti4noGvWp81QMD78qgE1vMb8ru0CmeNy8GFo0ds5M=; fh=AUa4WPgyk+2UDLoa4XXEC4v1iQCfVrxHGKq4C2BjbIc=; b=0hEYUjhjzqOS8kFlakFfRRem9Qys1WcOsufpuBVEJFHvfiiZp/d3tAPzFDXJot3+jl BQG/ijDgKxrr7gtUF6t6ItwbCtd6aH521INc41AjNHgqR/kcCWhTOoLNvWZt2QZuBVB4 y1IK36zc/CrZRaiBl5hq5mK6h7TjyLJu6jVHviNw5NAm+j61HMHyvGF8+ofmxM3aesxi R8KfCboVO1GpXvIUvbIq3/+Skv3W1Oqcupcfk2TLYyO5b+3znYED5iq8n7XT6k+6AG8A c/zE/a2GxsD83Y5S/GvDWa10hsmoNq8hyQ9RWl+7+yh3mJ76yo2aY4u08DFGCcxFT+Gm oTwg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b="ARPb/1sW"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) 
smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id lj7-20020a17090b344700b0026826148914si183763pjb.32.2023.08.03.08.59.16; Thu, 03 Aug 2023 08:59:30 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b="ARPb/1sW"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236324AbjHCOHG (ORCPT + 99 others); Thu, 3 Aug 2023 10:07:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52674 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235799AbjHCOGs (ORCPT ); Thu, 3 Aug 2023 10:06:48 -0400 Received: from mail-pl1-x62d.google.com (mail-pl1-x62d.google.com [IPv6:2607:f8b0:4864:20::62d]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3E0963C25 for ; Thu, 3 Aug 2023 07:05:31 -0700 (PDT) Received: by mail-pl1-x62d.google.com with SMTP id d9443c01a7336-1bb775625e2so7268545ad.1 for ; Thu, 03 Aug 2023 07:05:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071521; x=1691676321; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7lti4noGvWp81QMD78qgE1vMb8ru0CmeNy8GFo0ds5M=; b=ARPb/1sWh5T9p5GWjB03DuP/6QZsEazyqs/OXDGx5lkxhGk5XSHzLq7ODIElIYrFOb w6NmJppncUr5m/HHnhC+ubn3OXKSO93n7CunQ91Kqvw7Tf48agwl/sfkzVFunkSBsUwL 
JDSPtcxynczQTY4LUWfzEj8vza7wxX8VrrPUBZDCl9dsKp1h0oShS+rTDk01lxfzOOiw y6/9vPodJQo4JciEVHY9A/tNsegQ1GZOZrbkayOHoEORYyBxV7vd9Endd5Bcsumt2EBl c0qPe6AsKe7fZ1nTpg/XssSjVOlfsWqyaGzoMubvQPNbG7/icyQChTqyeTV/5C8talsZ Co2g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071521; x=1691676321; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7lti4noGvWp81QMD78qgE1vMb8ru0CmeNy8GFo0ds5M=; b=LVUvMh/N1OyEo325jm2XgPtHi+obj91DFiPu8vN5zzxHQ1IBSkTXRd7CiHOuq/k0OX fs2E900aZ4qx7YaC4SmHhlmIRQn2VWGQSMnZWadvPJxI2Qin7mlhRT+ix9RaaYdqo4A5 IKwq+ZizWZc+tro9nC+OW0PnVCwKXg2ZKT11EU5r5bshSWVXS4a0uhiLRh85AfW00Whi Qkw7uatliGRfs/aXP29bxJMt13G5IP8mtzVveePgXm4j5kdZWGeKecG9IGArv5xFSWkQ vnmhIXtMTKzHnq0vyMmEvYt2iBd+oM3ipc1MCOoEvwTG//1Pd+RoZwgGNOi2/0JW5+6k 4cvw== X-Gm-Message-State: ABy/qLagPxku/X5JVCHhfo2/Ls0ePYcDrBuzXxjyjelE29NBxwzdY3gr RWE5rp4txnATuaLmQTQx9nZx1A== X-Received: by 2002:a17:902:6943:b0:1b8:6984:f5e5 with SMTP id k3-20020a170902694300b001b86984f5e5mr18474453plt.12.1691071521455; Thu, 03 Aug 2023 07:05:21 -0700 (PDT) Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.05.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:05:20 -0700 (PDT) From: "huangjie.albert" To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov , Menglong Dong , Yunsheng Lin , Richard Gobert , netdev@vger.kernel.org (open list:NETWORKING DRIVERS), linux-kernel@vger.kernel.org (open list), bpf@vger.kernel.org (open list:XDP (eXpress Data Path)) Subject: [RFC Optimizing veth 
xsk performance 01/10] veth: Implement ethtool's get_ringparam() callback Date: Thu, 3 Aug 2023 22:04:27 +0800 Message-Id: <20230803140441.53596-2-huangjie.albert@bytedance.com> X-Mailer: git-send-email 2.37.1 (Apple Git-137.1) In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com> References: <20230803140441.53596-1-huangjie.albert@bytedance.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_NONE, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773224194149712539 X-GMAIL-MSGID: 1773224194149712539 Some xsk libraries call the get_ringparam() API to get the queue length when initializing the xsk umem. Implement it in veth so that those scenarios work properly. 
Signed-off-by: huangjie.albert --- drivers/net/veth.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index 614f3e3efab0..c2b431a7a017 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -255,6 +255,17 @@ static void veth_get_channels(struct net_device *dev, static int veth_set_channels(struct net_device *dev, struct ethtool_channels *ch); +static void veth_get_ringparam(struct net_device *dev, + struct ethtool_ringparam *ring, + struct kernel_ethtool_ringparam *kernel_ring, + struct netlink_ext_ack *extack) +{ + ring->rx_max_pending = VETH_RING_SIZE; + ring->tx_max_pending = VETH_RING_SIZE; + ring->rx_pending = VETH_RING_SIZE; + ring->tx_pending = VETH_RING_SIZE; +} + static const struct ethtool_ops veth_ethtool_ops = { .get_drvinfo = veth_get_drvinfo, .get_link = ethtool_op_get_link, @@ -265,6 +276,7 @@ static const struct ethtool_ops veth_ethtool_ops = { .get_ts_info = ethtool_op_get_ts_info, .get_channels = veth_get_channels, .set_channels = veth_set_channels, + .get_ringparam = veth_get_ringparam, }; /* general routines */ From patchwork Thu Aug 3 14:04:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?= X-Patchwork-Id: 130658 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1182722vqx; Thu, 3 Aug 2023 07:21:20 -0700 (PDT) X-Google-Smtp-Source: APBJJlErA+avRXETQ0GR2DQuMNxYcVpiodMTZJfPYrB5V6zXgVkapfnaXkn3wlB388s2iAtd3T19 X-Received: by 2002:aa7:d5ca:0:b0:522:2019:2020 with SMTP id d10-20020aa7d5ca000000b0052220192020mr7134565eds.17.1691072480188; Thu, 03 Aug 2023 07:21:20 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691072480; cv=none; d=google.com; s=arc-20160816; b=bzegPd0E5+yjFzT+L/wZIVq9/mIHFXHRgDbpfU/0iVsSkTA6aTLTLNaLVUYcU0Xvc1 PzludMwbijEOC6AwEjn3LwkfaPIIJoFRKrTcW+gvmyeAAMmWsZqTPLhFXIM2B9WUTgYV 
s3aDMsSsP81xec5xzLnz1jaAriXWBX4rdP1lYZ4LsfTx8HZVTeKxSClIv6awJdhwgVZr ArsVQctiRnsYtNShzxrPLYnbWCqtY4ndRQJRpFY26AJlLAsVqY+49tx9x9jJ0EtHznDg 1aDUtslMqpB3ufYKd8upF2x7aOdehM7FiqX3T9aK5+VaF/Nya/u+w/BcZa3Pgv6OJ6Mc kLPA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=U+EKFuLLGda3oiEd1qoWkJ3WKIdEe7yI+fFIgxoez3I=; fh=pQMjDmBpkga//+9OKrKQl28a8lyKhz4s+dSSuCJIV1I=; b=cLm6zktH1WWMmmkdk6IyF/I4W2waNVoQ0/+PBa5/R1AbFuamq+eWXQOWKyYL/M3EQd 1NxadRYVMFXykh/KRJZ8kBnUE9Mo9nSxqoMAFJe0lasEM1TDG4fkTU9Kqo/omFXMRtFI AnoG1GGKdGyQbG52wLsor2SP1R9599Q70wjuJcISZ+Xq8R49B/vSjS3gDlkLXugp0wZs rU5RB4ATfx8/jh+7NojoAWozaWz8jS8Oi+52xhxy0cizhxiNr9tr9xLSnMLrGiBnh1gJ YM5M3tJwUuZ0azUIzFqZ7MwFLPVRmp5kezkIH6qQjeoz6anp5D5y5cQRWD+U+yVwfZUs Ioog== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b="fwe4/jwU"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id d11-20020a50fb0b000000b0052228f8c22csi11962735edq.58.2023.08.03.07.20.52; Thu, 03 Aug 2023 07:21:20 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b="fwe4/jwU"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235494AbjHCOHJ (ORCPT + 99 others); Thu, 3 Aug 2023 10:07:09 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48234 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233957AbjHCOGv (ORCPT ); Thu, 3 Aug 2023 10:06:51 -0400 Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 28EA4422A for ; Thu, 3 Aug 2023 07:05:35 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1bba48b0bd2so6835505ad.3 for ; Thu, 03 Aug 2023 07:05:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071534; x=1691676334; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=U+EKFuLLGda3oiEd1qoWkJ3WKIdEe7yI+fFIgxoez3I=; b=fwe4/jwUmizIlQByLG/O10dPGK2dq5O1KyDOna7TN71l452ImeRZyualxarybGdLTE KSINW/D7yJGI/aFLQkhxK3+64JQiodpZvCwoGQg9YTTRbW5UIBKGDIGqSCDZc0josq2d 2P9Mp7v9JLZYEnKdld7C+HuTI+F9QlP+06UlKoisZWQYVZTBEVOxhHgulrtl2LX2dsLj T14TWHGsmOO/0dAKlV2/RtTPFXi7/aWi6ccZCUGBq76s4ZGEvQgdo9wugFVEtweVmFCk 
koL+jVBuohHproWlNCAlhWsk57vd+MbRFaTzJhLqjDR+JpizjbmPbOzaXkgS2AvXJWJo PDtQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071534; x=1691676334; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=U+EKFuLLGda3oiEd1qoWkJ3WKIdEe7yI+fFIgxoez3I=; b=BYo54lsZlZjxW7uQxz/tcpXWf4OjgXMIL7rt6yFgBBOLeHOcVf4WSNcMdizREqNuIt yyb5ipNO10S7J7hmhvZLWp6Sj3ns5amRKWMjLz7PwHiABtMxerMTcNSyqjJ59nIE+BJa B4WFSSChzv28Xzl5FMPB8ZJkCxS7uBrPncUNUaaM7PW+ms/xL5Aj85mphBQbLI7HhCuv tb764DDI9kTVHLz3wTMwBaCDLLnArYqVbuT30pOCKNeREcMSkl4ciiglrMoO2+UYeMmL LCyYRmKo76gp/Im9N1Csv8USKxEv228j5A2WBmrSMomLCJ838mLcTFsHPx4xZf67biZE 5a4g== X-Gm-Message-State: ABy/qLZwBYvjH7F10AFG3bYZ5W1LxpCso2sqjzySijWIC6ocgVcCG7BT R5djjrK3eAI+pvI60jKL42uEBw== X-Received: by 2002:a17:902:e811:b0:1bc:2c79:c6b5 with SMTP id u17-20020a170902e81100b001bc2c79c6b5mr7539181plg.4.1691071533982; Thu, 03 Aug 2023 07:05:33 -0700 (PDT) Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.05.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:05:33 -0700 (PDT) From: "huangjie.albert" To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov , Shmulik Ladkani , Kees Cook , Richard Gobert , Yunsheng Lin , netdev@vger.kernel.org (open list:NETWORKING DRIVERS), linux-kernel@vger.kernel.org (open list), bpf@vger.kernel.org (open list:XDP (eXpress Data Path)) Subject: [RFC Optimizing veth xsk performance 02/10] xsk: add dma_check_skip for skipping dma check Date: Thu, 3 Aug 2023 22:04:28 +0800 Message-Id: 
<20230803140441.53596-3-huangjie.albert@bytedance.com> X-Mailer: git-send-email 2.37.1 (Apple Git-137.1) In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com> References: <20230803140441.53596-1-huangjie.albert@bytedance.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773218016703609648 X-GMAIL-MSGID: 1773218016703609648 For a virtual net device such as veth, there is no need to do the DMA check if we support zero copy. Add this flag after unaligned, because there is a 4-byte hole there, as shown by pahole -V ./net/xdp/xsk_buff_pool.o: ----------- ... /* --- cacheline 3 boundary (192 bytes) --- */ u32 chunk_size; /* 192 4 */ u32 frame_len; /* 196 4 */ u8 cached_need_wakeup; /* 200 1 */ bool uses_need_wakeup; /* 201 1 */ bool dma_need_sync; /* 202 1 */ bool unaligned; /* 203 1 */ /* XXX 4 bytes hole, try to pack */ void * addrs; /* 208 8 */ spinlock_t cq_lock; /* 216 4 */ ... ----------- Signed-off-by: huangjie.albert --- include/net/xsk_buff_pool.h | 1 + net/xdp/xsk_buff_pool.c | 3 ++- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h index b0bdff26fc88..fe31097dc11b 100644 --- a/include/net/xsk_buff_pool.h +++ b/include/net/xsk_buff_pool.h @@ -81,6 +81,7 @@ struct xsk_buff_pool { bool uses_need_wakeup; bool dma_need_sync; bool unaligned; + bool dma_check_skip; void *addrs; /* Mutual exclusion of the completion ring in the SKB mode. 
Two cases to protect: * NAPI TX thread and sendmsg error paths in the SKB destructor callback and when diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c index b3f7b310811e..ed251b8e8773 100644 --- a/net/xdp/xsk_buff_pool.c +++ b/net/xdp/xsk_buff_pool.c @@ -85,6 +85,7 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs, XDP_PACKET_HEADROOM; pool->umem = umem; pool->addrs = umem->addrs; + pool->dma_check_skip = false; INIT_LIST_HEAD(&pool->free_list); INIT_LIST_HEAD(&pool->xskb_list); INIT_LIST_HEAD(&pool->xsk_tx_list); @@ -202,7 +203,7 @@ int xp_assign_dev(struct xsk_buff_pool *pool, if (err) goto err_unreg_pool; - if (!pool->dma_pages) { + if (!pool->dma_pages && !pool->dma_check_skip) { WARN(1, "Driver did not DMA map zero-copy buffers"); err = -EINVAL; goto err_unreg_xsk; From patchwork Thu Aug 3 14:04:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?= X-Patchwork-Id: 130752 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1290955vqx; Thu, 3 Aug 2023 10:13:40 -0700 (PDT) X-Google-Smtp-Source: APBJJlEbJVYpOj0aZIxjZIHcfwzYG8YlXTDIW7v1omXMpoYMf3OuF19N7UMl6lxcQX14kko1OFyv X-Received: by 2002:a05:6e02:1d0a:b0:346:77f4:e22d with SMTP id i10-20020a056e021d0a00b0034677f4e22dmr22481475ila.6.1691082819763; Thu, 03 Aug 2023 10:13:39 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691082819; cv=none; d=google.com; s=arc-20160816; b=UcwJ5D7PiqcdNBlM43J0gMpWZlfjYVE8sWapjWQIfIMAifu4r7RU6QOi3KHncMpM6e BWjughdNPjFHcLsygSbsENbq2vorTvUJvlHlMLluTcnUMAd4FfnJPDHONV26x5SS+zlE fBgSg8QgRo/sHdNdaNBoufpq80fvZxESzbuJql/f/Wa0eAxqiZlXrSiwm2bQqISLb3Mr X0Uoocz+OL39BjQiLOU24aC7LkOEEgvb8184PV2WtFLnXj5adUTn/n/4sJtGSQLP2O2D YxhuawWgFwK0jzc6MzhnQ3sRVXiF1zuYIXmfpTPvgXrX4kFY6TQ4qvoircX+6R2jEwAU zt2Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; 
h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=6Lk7Clg45RdUie/KjcmE3+28g2kd4eecHFcFfnkQv4M=; fh=VmDIXGzgGAt+mNxTrP6+sl6fho9mCQcJ7z//e5pew+A=; b=EMa3leDXIcMVPqV3a9bb1TaQUDhYdmilybUWr0U4Na7fdtpjLWTVjYZDgeW3vPhNRf +4MfzBNrA405VrsBep40NzRDYAPZdqq3wf4zdQxpcmjH8VsSYkwO005bA7yLiir0uMsF BZq6i2Iu1Fhy6SMG+0VEKIMpmVboU5XBpkUl7VGukAvut8coTpXOMhXWWTa7WvPRcmL5 5aY2CVaKlA8uKTwS1ThlL5dcgacOE4Zfe2usImF+MkR2VpiMVrXHt+qI3Pp0QZTg0YIn qKoZW0qGqurAros8aUIFcpn9rZztu7ATE2b9gl8mlasfIAvdr1PbKpjRBgztmBYycvXp yfhA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=asWSJ+14; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id c8-20020a17090a020800b00263f3c1bb86si303731pjc.158.2023.08.03.10.13.26; Thu, 03 Aug 2023 10:13:39 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=asWSJ+14; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235653AbjHCOH0 (ORCPT + 99 others); Thu, 3 Aug 2023 10:07:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52586 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235486AbjHCOHB (ORCPT ); Thu, 
3 Aug 2023 10:07:01 -0400 Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BD66044AE for ; Thu, 3 Aug 2023 07:05:47 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1bba48b0bd2so6837605ad.3 for ; Thu, 03 Aug 2023 07:05:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071545; x=1691676345; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6Lk7Clg45RdUie/KjcmE3+28g2kd4eecHFcFfnkQv4M=; b=asWSJ+14SkxdbE8UXoYgIqvov0BFM79fmeO9euJ1Wv5XRqpkHwShFP6NIXKgvBYS39 KY1uMbrWNQbSjJupGWVrrURhvJ8kmrkqWIv6ZEFPZTTKBxHUS5kikgwuA3z8c5IUvEuo pwfcOp99klDueriStK3YRP+G2A3ta6IrwdNffYwsY1QRFXnKH7rVeHr4XVx9ICdMcs8u TYhqMuVKoTXjFoBzzxm/+LJENMac9MTDwKIUuXzmbia+bqx/NXWlVFRC025fJ/SNhI7G 0sUAAhVdNnFRuvikv0BqaV30V85oEqKXByDdz6upJq+kqp/EKzh4Br8vam8A46D6ers7 qYNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071545; x=1691676345; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6Lk7Clg45RdUie/KjcmE3+28g2kd4eecHFcFfnkQv4M=; b=fuiISlavSALpClyHugZLb2O5AbiWQGwGCBgdbiJ6+PnRn8H2okPJtHPoEm+aR5A9Qf c1lCW/wJppL/F1Xwm3/xgRkUnKIdc9BU5nQKC5JgmqTVju9oEvzyEgzrHQ7X8YgnRQU/ 0c0P0DRLueQFDvGEfZ+srnxb46KF0eqTmuQ94R2Jr1sq5J1w8rrvjIqB74x/NFadGqOe 88qFeMOJLiULo/wkfoQ6wLDTS+hcaKmVd9vqWeLaBajCcD2058ozPPumgnQmQ/kAbrex poaQjQGN6ijs8BIZvG/HM94t91Xnl8P08uo5gZUAjIykSdSi25e6C+lFRMRcPBPGKGsP DPbg== X-Gm-Message-State: ABy/qLbcZmyBc78xxHLY6bMnkdbCab/U4rVDvjDO8yQ66DlD80SOq1dr 8DjPySrj982jlTZ7q6/DFOMgyg== X-Received: by 2002:a17:903:22c1:b0:1b8:a936:1905 with SMTP id y1-20020a17090322c100b001b8a9361905mr19934085plg.38.1691071545362; Thu, 03 Aug 2023 07:05:45 -0700 (PDT) Received: from 
C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.05.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:05:44 -0700 (PDT) From: "huangjie.albert" To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov , Kees Cook , Richard Gobert , Yunsheng Lin , netdev@vger.kernel.org (open list:NETWORKING DRIVERS), linux-kernel@vger.kernel.org (open list), bpf@vger.kernel.org (open list:XDP (eXpress Data Path)) Subject: [RFC Optimizing veth xsk performance 03/10] veth: add support for send queue Date: Thu, 3 Aug 2023 22:04:29 +0800 Message-Id: <20230803140441.53596-4-huangjie.albert@bytedance.com> X-Mailer: git-send-email 2.37.1 (Apple Git-137.1) In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com> References: <20230803140441.53596-1-huangjie.albert@bytedance.com> MIME-Version: 1.0 X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED, SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: INBOX X-GMAIL-THRID: 1773228859062962037 X-GMAIL-MSGID: 1773228859062962037 In order to support native AF_XDP for veth, we need send queue support for NAPI TX. An upcoming patch will make use of it. 
Signed-off-by: huangjie.albert --- drivers/net/veth.c | 29 +++++++++++++++++++++++++++++ 1 file changed, 29 insertions(+) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index c2b431a7a017..63c3ebe4c5d0 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -56,6 +56,11 @@ struct veth_rq_stats { struct u64_stats_sync syncp; }; +struct veth_sq_stats { + struct veth_stats vs; + struct u64_stats_sync syncp; +}; + struct veth_rq { struct napi_struct xdp_napi; struct napi_struct __rcu *napi; /* points to xdp_napi when the latter is initialized */ @@ -69,11 +74,25 @@ struct veth_rq { struct page_pool *page_pool; }; +struct veth_sq { + struct napi_struct xdp_napi; + struct net_device *dev; + struct xdp_mem_info xdp_mem; + struct veth_sq_stats stats; + u32 queue_index; + /* this is for xsk */ + struct { + struct xsk_buff_pool __rcu *pool; + u32 last_cpu; + } xsk; +}; + struct veth_priv { struct net_device __rcu *peer; atomic64_t dropped; struct bpf_prog *_xdp_prog; struct veth_rq *rq; + struct veth_sq *sq; unsigned int requested_headroom; }; @@ -1495,6 +1514,15 @@ static int veth_alloc_queues(struct net_device *dev) u64_stats_init(&priv->rq[i].stats.syncp); } + priv->sq = kcalloc(dev->num_tx_queues, sizeof(*priv->sq), GFP_KERNEL); + if (!priv->sq) + return -ENOMEM; + + for (i = 0; i < dev->num_tx_queues; i++) { + priv->sq[i].dev = dev; + u64_stats_init(&priv->sq[i].stats.syncp); + } + return 0; } @@ -1503,6 +1531,7 @@ static void veth_free_queues(struct net_device *dev) struct veth_priv *priv = netdev_priv(dev); kfree(priv->rq); + kfree(priv->sq); } static int veth_dev_init(struct net_device *dev) From patchwork Thu Aug 3 14:04:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?= X-Patchwork-Id: 130725 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1259739vqx; Thu, 3 Aug 2023 09:21:10 -0700 (PDT) 
X-Google-Smtp-Source: APBJJlEhhNp9Gfjlnwi1gt4Y1hVjthCuKOCQf378bfpOy8Cu/kQBlPEIJFJJJ1MKUN6+4Q5djTc6 X-Received: by 2002:a05:6a20:1456:b0:137:e09b:21a6 with SMTP id a22-20020a056a20145600b00137e09b21a6mr23104724pzi.27.1691079670627; Thu, 03 Aug 2023 09:21:10 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691079670; cv=none; d=google.com; s=arc-20160816; b=hkhf1CMGBtIX9IDGNI+BsdwKXcd8+rox4t8X2SDwrZbJDAYmYBzv8pxW6EZVoQwCRa XGk/6QW+FC061rCW9qlzVGywtYGPZQbCO5Wj17S972T5bGidR+P1m5qbtxeUghdhqgse jRpR+132l5cYnDp7fslv/y+defW3VzhQipdyv9lZLQAGcW2MWbGDDioN9irUtiS6TXJX YKpUz6EXCvCcaERreyiwHpiMPV1lBSfMohACAMLYXqR6y6t9H68yArhGHyxYZIa7bgzt o8lOlZr+NkSbf/xTAUyoD4Ox0CkDDwtxCaxC78BpJL8ltIlLQ+YhO9uQyL3dph1pcvi7 qggA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=gL5Z8CGNWAetROxW2z+SYhypMB/ajEP4EJq25cVNZ/U=; fh=/aqO8VYO6ujDSzwSfLjrHYKPMxBvpYC7bUkGlUfMJjc=; b=J0T2lFWPLSoOFnNrORqG+JsJy27/Vj6uDclC2zGBtZCfngPasymBNf6XuM5oMf4yER tlo/aEvM7sjb4HmIgebUB0kcNWgq/tn/TAeaVq8VGvBZAiBGHcg3QDndo5v2Lb9FG2oK Wjrb3EsDd0zrHCtUNIliVejjW+0e8CfXVKQ7PFU7dUCKwfgKGSB3lc/O46vC5cQW0JzN X3jk12wk4UQplwavqXXTqoyH55UOTJlgQWVc3alU2ygbX3kPc5HxkE+jCKZd0yPsQjlO pP5c+qEdX13YS0zJ0P6bazx0ldhXlnz0Wk4bgLK+rqImZjRwASKvC+pFgCXD6u0Zbf+S GDlw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=bihjjANS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id k63-20020a638442000000b0055e6076752dsi104699pgd.729.2023.08.03.09.20.56; Thu, 03 Aug 2023 09:21:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=bihjjANS; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234803AbjHCOHb (ORCPT + 99 others); Thu, 3 Aug 2023 10:07:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53612 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235534AbjHCOHB (ORCPT ); Thu, 3 Aug 2023 10:07:01 -0400 Received: from mail-pl1-x630.google.com (mail-pl1-x630.google.com [IPv6:2607:f8b0:4864:20::630]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 822751BE4 for ; Thu, 3 Aug 2023 07:05:57 -0700 (PDT) Received: by mail-pl1-x630.google.com with SMTP id d9443c01a7336-1bbc06f830aso7390505ad.0 for ; Thu, 03 Aug 2023 07:05:57 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071557; x=1691676357; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=gL5Z8CGNWAetROxW2z+SYhypMB/ajEP4EJq25cVNZ/U=; b=bihjjANSY7lExRM+eNzd60fiHXxilGR3q6QxUzPHDL32kYYtKnJpLWNJk1NXvt6QQf 1pz3H9+q910ZhOR8HFBpRcNh6COunDjRS0T8nQsdCCUXi2N66puUcGVJzjxIvoznbUzg ZjCXnAuf0z/AuAGOW696eb2lF2TDa7J1cKthkrLJXtaZhGUfcWxa618oDyWBfhFuQ9lf 5ao7kmH4KaZB/dHyeGdbu9iYN+b2r8Ge/Ni/XjQi/VzdgQRyH83sYvUkGstzHgIyd4Ie 
bQlh8M3HEMPgB2jpqkCdeTNqEz1rbRxRaeNHUcqK58QOA8wadmleYL9CtxY2V08NMqCl dbZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071557; x=1691676357; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gL5Z8CGNWAetROxW2z+SYhypMB/ajEP4EJq25cVNZ/U=; b=bTrTfnYa17CtQJd5+VD1GRemlNYRJVsqqq+1FihHpLfQA5Q3Zrq4y3N4hIwXNQnGFV 4kJZmWriN+P0AyB0OLiFi2BGkhRYCXsKu1ohDPjREwA/xz8OdWCnBAV+BacwVtDkYNmg jZMkMNFf4QaLZYAPBp97Ykw8TVwz4bGq+Q46eZYFRqB1JlGLY91vfSuyfffub1Eyw+x2 0JPMsk4J2aZY9ocBAYWlsYqAj0nt/N0fKudnbjuYqr0/soQ/K5V3kzSBFa0ic1XQhLOE 17nQa29J4j+TkllkLRaVIjaeSO2x0Fv6d9jvm5cch2C5pqz0N4lEZbGKdF1fyq1DVbrS xA6Q== X-Gm-Message-State: ABy/qLbEGhi3WJBY1eUaaiqtUxMMN+fBU7nY7Qgd3xu+kpoKmUVDiCJd Ij2ALJzljHZJf/kzFyHsT1hjzA== X-Received: by 2002:a17:902:d506:b0:1b8:c8bc:c81b with SMTP id b6-20020a170902d50600b001b8c8bcc81bmr23436241plg.21.1691071556771; Thu, 03 Aug 2023 07:05:56 -0700 (PDT) Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.05.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:05:56 -0700 (PDT) From: "huangjie.albert" To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov , Yunsheng Lin , Menglong Dong , Richard Gobert , netdev@vger.kernel.org (open list:NETWORKING DRIVERS), linux-kernel@vger.kernel.org (open list), bpf@vger.kernel.org (open list:XDP (eXpress Data Path)) Subject: [RFC Optimizing veth xsk performance 04/10] xsk: add xsk_tx_completed_addr function Date: Thu, 3 Aug 2023 22:04:30 +0800 Message-Id: 
<20230803140441.53596-5-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com>
References: <20230803140441.53596-1-huangjie.albert@bytedance.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,
	SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham
	autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
	lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1773225556342339332
X-GMAIL-MSGID: 1773225556342339332

Return desc to the cq by using the descriptor address.

Signed-off-by: huangjie.albert
---
 include/net/xdp_sock_drv.h |  1 +
 net/xdp/xsk.c              |  6 ++++++
 net/xdp/xsk_queue.h        | 11 +++++++++++
 3 files changed, 18 insertions(+)

diff --git a/include/net/xdp_sock_drv.h b/include/net/xdp_sock_drv.h
index 1f6fc8c7a84c..5220454bff5c 100644
--- a/include/net/xdp_sock_drv.h
+++ b/include/net/xdp_sock_drv.h
@@ -15,6 +15,7 @@
 #ifdef CONFIG_XDP_SOCKETS
 
 void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries);
+void xsk_tx_completed_addr(struct xsk_buff_pool *pool, u64 addr);
 bool xsk_tx_peek_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc);
 u32 xsk_tx_peek_release_desc_batch(struct xsk_buff_pool *pool, u32 max);
 void xsk_tx_release(struct xsk_buff_pool *pool);
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 4f1e0599146e..b2b8aa7b0bcf 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -396,6 +396,12 @@ void xsk_tx_completed(struct xsk_buff_pool *pool, u32 nb_entries)
 }
 EXPORT_SYMBOL(xsk_tx_completed);
 
+void xsk_tx_completed_addr(struct xsk_buff_pool *pool, u64 addr)
+{
+	xskq_prod_submit_addr(pool->cq, addr);
+}
+EXPORT_SYMBOL(xsk_tx_completed_addr);
+
 void xsk_tx_release(struct xsk_buff_pool *pool)
 {
 	struct xdp_sock *xs;
diff --git a/net/xdp/xsk_queue.h b/net/xdp/xsk_queue.h
index 13354a1e4280..a494d1dcb1c3 100644
--- a/net/xdp/xsk_queue.h
+++ b/net/xdp/xsk_queue.h
@@ -428,6 +428,17 @@ static inline void __xskq_prod_submit(struct xsk_queue *q, u32 idx)
 	smp_store_release(&q->ring->producer, idx); /* B, matches C */
 }
 
+
+static inline void xskq_prod_submit_addr(struct xsk_queue *q, u64 addr)
+{
+	struct xdp_umem_ring *ring = (struct xdp_umem_ring *)q->ring;
+	u32 idx = q->ring->producer;
+
+	ring->desc[idx++ & q->ring_mask] = addr;
+
+	__xskq_prod_submit(q, idx);
+}
+
 static inline void xskq_prod_submit(struct xsk_queue *q)
 {
 	__xskq_prod_submit(q, q->cached_prod);

From patchwork Thu Aug 3 14:04:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?=
X-Patchwork-Id: 130723
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1258819vqx;
	Thu, 3 Aug 2023 09:19:33 -0700 (PDT)
X-Google-Smtp-Source: APBJJlGTkF014hanF8skt2SxyZvhLEDtWzp09rL60PucZj0StGemQ1SFkMlX4G5BBXJfB3khMt47
X-Received: by 2002:a17:903:1cc:b0:1bb:59a0:3d34 with SMTP id e12-20020a17090301cc00b001bb59a03d34mr16292361plh.30.1691079572707;
	Thu, 03 Aug 2023 09:19:32 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1691079572; cv=none; d=google.com; s=arc-20160816;
	b=P0lhWnE7lTrVQCvrBG6DEss0/tuoOCrRDjXDnP5mdeIlCbV5PAIMZ4tfP2Zo198afi
	qg7ZxqB9/ISrRFeEjCEDX1r5gpwKVyC+AjdkWF7Jypna/TTXd60XxDRQkGARPCZLbn5W
	NyP9JSskhFzNXs5eO+CMNApE9vwlYVk/xQuY2S0lpWBJX8kt0R/m3bN8TnzxRgnaIY7x
	aFLUPmNvB5KohVRj0Ko/ZnJMrDLOQ3fmC4BHwd474MYzyviPx+JWhak0wq6bwdsA7LED
	VEmPNnhoGMYk5GWkSSJ+xn6oeOlh10HCOEhoJbE1AjWSQCff5vol4X2fiFoPsLSf+1Ks
	6Eig==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
	h=list-id:precedence:content-transfer-encoding:mime-version
	:references:in-reply-to:message-id:date:subject:cc:to:from
	:dkim-signature;
	bh=ldmbgmFCPgTqW2fPQLY5TbtgDmSxsPryWGSwRaZKrPQ=;
fh=1AjVp6CX4JUgAXegYUyT8j3MkJJ9uhodBXekeZMnqcU=; b=WJF5/wXoR6UMn33qHlWSqdzZXxrZzHucJA1fHHbVWTUAldBsF1/NHcgZuoLqsRR4hJ KJKAhXojwA3w9kO5ZxX9ReeA1cu0ODvl/opTF9pzBYdST1PvwUO7/KjJ11dVCyH5ljQg WlM+32mKYq6Q+C9Gma3kN+usdb04k+98hxUKbhXuer5cb4/FFajYt1ODffZZ+Uz7Ohyo jaEU4Tt5AnYgReWBs0jzjKcVkpIjJFXylJYXSpruzvL7R+cSiDbU8F8iRrfQ0E0ypKdj 2i8e3RYfyFcUhMw+pM6E+ThqpOy/dJgdGVwpoeqL1llI0PGe1H3jvDEGUiM3GymZnzID fpNQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=fgb+z1UH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id q5-20020a17090311c500b001b89bfd0c2csi90826plh.647.2023.08.03.09.19.18; Thu, 03 Aug 2023 09:19:32 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=fgb+z1UH; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236219AbjHCOHL (ORCPT + 99 others); Thu, 3 Aug 2023 10:07:11 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53162 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234346AbjHCOGw (ORCPT ); Thu, 3 Aug 2023 10:06:52 -0400 Received: from mail-pl1-x62f.google.com (mail-pl1-x62f.google.com [IPv6:2607:f8b0:4864:20::62f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 
738AA4696 for ; Thu, 3 Aug 2023 07:06:09 -0700 (PDT) Received: by mail-pl1-x62f.google.com with SMTP id d9443c01a7336-1bb8a89b975so6890855ad.1 for ; Thu, 03 Aug 2023 07:06:09 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071568; x=1691676368; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ldmbgmFCPgTqW2fPQLY5TbtgDmSxsPryWGSwRaZKrPQ=; b=fgb+z1UHH51nnAt8NwEDrRzGqf1c0PuIYH3dTKXFyhE9MPizHTukjzM/X6LgJi8uwz zjSph5k0AUXsRkWV2F2Ci7jSTppkWbcDuQeH0adRzCeJdiI8gCyc7V11IBEWdPn2YOd5 txT7HJzEQZu+rpfM+KM9HIvR1Ijdh+0QasGlgWWUmXH1wsPh3ur4qjiHRwzvsYWqGL6V jGgNY2rZdZ3/jJzDn6ovpYavvEw/9n5UJxh22VGbCGZ1XBlQQPoHubgF5bUkbJaCS8dj VIS0HHCiFUUO9K0iGt2GVI0tQ5THSmrMh71g0ExCpLKJRWJHu2y1TSjI337RTIBh5VBy Wqrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071568; x=1691676368; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ldmbgmFCPgTqW2fPQLY5TbtgDmSxsPryWGSwRaZKrPQ=; b=XzC1w8A3EnPD6bauK6PnqhM0mvSlhKTQKzKzx4tMIHYBbcKcy0FEfxEuak1LL0Oht2 a7pRkEAiBPVr8FUBJjeQNi7uDNyb1dAWV5Rc8kFqgLfdaypDiecFIu32uWwg6rahtadx wLOQY+hs1ExOznf+FLwfYXDI9qfD+s4pFD+Gfr6xCKeIRJh2xk/g9PD/mfVly5i0W2cF IaNUex2QaOw9aqIp+ryCAb90MuYDpwT1YIHU/WH74V1LOY9BNXDDYRw2txwrmNCVDxOp JDRYFoGrKOAdxdVLEdap4ZuUEPeyRHEC+N5X4Y2ObAeyGJOmejOwf6Lnby0x93kgV8mf K9Yw== X-Gm-Message-State: ABy/qLZKSr3kwmF1YYfeybeYG8z2d1d0EwaIXNItJBZqX7ZtXKXS0a/D SpY1VufvkF42uv5jXwnVtcE8Ow== X-Received: by 2002:a17:902:d2cd:b0:1bc:239:a7e3 with SMTP id n13-20020a170902d2cd00b001bc0239a7e3mr15632332plc.44.1691071568602; Thu, 03 Aug 2023 07:06:08 -0700 (PDT) Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.06.01 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:06:08 -0700 (PDT)
From: "huangjie.albert"
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com
Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann ,
	Jesper Dangaard Brouer , John Fastabend ,
	=?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson ,
	Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov , Kees Cook ,
	Menglong Dong , Richard Gobert , Yunsheng Lin ,
	netdev@vger.kernel.org (open list:NETWORKING DRIVERS),
	linux-kernel@vger.kernel.org (open list),
	bpf@vger.kernel.org (open list:XDP (eXpress Data Path))
Subject: [RFC Optimizing veth xsk performance 05/10] veth: use send queue tx napi to xmit xsk tx desc
Date: Thu, 3 Aug 2023 22:04:31 +0800
Message-Id: <20230803140441.53596-6-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com>
References: <20230803140441.53596-1-huangjie.albert@bytedance.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,
	SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham
	autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
	lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1773225454241271703
X-GMAIL-MSGID: 1773225454241271703

Signed-off-by: huangjie.albert
---
 drivers/net/veth.c | 265 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 264 insertions(+), 1 deletion(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 63c3ebe4c5d0..944761807ca4 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -27,6 +27,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #define DRV_NAME "veth"
 #define DRV_VERSION "1.0"
@@ -1061,6 +1063,176 @@ static int veth_poll(struct napi_struct *napi, int budget)
 	return done;
 }
 
+static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool, int budget)
+{
+	struct veth_priv *priv, *peer_priv;
+	struct net_device *dev, *peer_dev;
+	struct veth_rq *peer_rq;
+	struct veth_stats peer_stats = {};
+	struct veth_stats stats = {};
+	struct veth_xdp_tx_bq bq;
+	struct xdp_desc desc;
+	void *xdpf;
+	int done = 0;
+
+	bq.count = 0;
+	dev = sq->dev;
+	priv = netdev_priv(dev);
+	peer_dev = priv->peer;
+	peer_priv = netdev_priv(peer_dev);
+
+	/* todo: queue index must be set before this */
+	peer_rq = &peer_priv->rq[sq->queue_index];
+
+	/* set the xsk wakeup flag; todo: decide where to clear it */
+	if (xsk_uses_need_wakeup(xsk_pool))
+		xsk_set_tx_need_wakeup(xsk_pool);
+
+	while (budget-- > 0) {
+		unsigned int truesize = 0;
+		struct xdp_frame *p_frame;
+		struct page *page;
+		void *new_addr;
+		void *addr;
+
+		/*
+		 * get a desc from the xsk pool
+		 */
+		if (!xsk_tx_peek_desc(xsk_pool, &desc)) {
+			break;
+		}
+
+		/*
+		 * Get the xmit address:
+		 * desc.addr is an offset, so convert it to a real virtual address
+		 */
+		addr = xsk_buff_raw_get_data(xsk_pool, desc.addr);
+
+		/* the data cannot all fit in one page */
+		truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) + desc.len + sizeof(struct xdp_frame);
+		if (truesize > PAGE_SIZE) {
+			stats.xdp_drops++;
+			xsk_tx_completed_addr(xsk_pool, desc.addr);
+			continue;
+		}
+
+		page = dev_alloc_page();
+		if (!page) {
+			/*
+			 * allocation failed: complete the descriptor and count the drop
+			 */
+			xsk_tx_completed_addr(xsk_pool, desc.addr);
+			stats.xdp_drops++;
+			break;
+		}
+		new_addr = page_to_virt(page);
+
+		p_frame = new_addr;
+		new_addr += sizeof(struct xdp_frame);
+		p_frame->data = new_addr;
+		p_frame->len = desc.len;
+
+		/* frame_sz must be set to the page size because struct skb_shared_info is large;
+		 * otherwise, when the skb is built in veth_xdp_rcv_one, skb->tail may exceed skb->end and trigger skb_panic
+		 */
+		p_frame->headroom = 0;
+		p_frame->metasize = 0;
+		p_frame->frame_sz = PAGE_SIZE;
+		p_frame->flags = 0;
+		p_frame->mem.type = MEM_TYPE_PAGE_SHARED;
+		memcpy(p_frame->data, addr, p_frame->len);
+		xsk_tx_completed_addr(xsk_pool, desc.addr);
+
+		/* if the peer has an xdp prog, just send the frame to the peer */
+		p_frame = veth_xdp_rcv_one(peer_rq, p_frame, &bq, &peer_stats);
+		/* if there is no xdp prog on this queue, convert to an skb to xmit */
+		if (p_frame) {
+			xdpf = p_frame;
+			veth_xdp_rcv_bulk_skb(peer_rq, &xdpf, 1, &bq, &peer_stats);
+			p_frame = NULL;
+		}
+
+		stats.xdp_bytes += desc.len;
+
+		done++;
+	}
+
+	/* release: move the consumer and wake up the producer */
+	if (done) {
+		napi_schedule(&peer_rq->xdp_napi);
+		xsk_tx_release(xsk_pool);
+	}
+
+
+
+	/* just for peer rq */
+	if (peer_stats.xdp_tx > 0)
+		veth_xdp_flush(peer_rq, &bq);
+	if (peer_stats.xdp_redirect > 0)
+		xdp_do_flush();
+
+	/* update peer rq stats, or maybe we do not need to do this */
+	u64_stats_update_begin(&peer_rq->stats.syncp);
+	peer_rq->stats.vs.xdp_redirect += peer_stats.xdp_redirect;
+	peer_rq->stats.vs.xdp_packets += done;
+	peer_rq->stats.vs.xdp_bytes += stats.xdp_bytes;
+	peer_rq->stats.vs.xdp_drops += peer_stats.xdp_drops;
+	peer_rq->stats.vs.rx_drops += peer_stats.rx_drops;
+	peer_rq->stats.vs.xdp_tx += peer_stats.xdp_tx;
+	u64_stats_update_end(&peer_rq->stats.syncp);
+
+	/* update sq stats */
+	u64_stats_update_begin(&sq->stats.syncp);
+	sq->stats.vs.xdp_packets += done;
+	sq->stats.vs.xdp_bytes += stats.xdp_bytes;
+	sq->stats.vs.xdp_drops += stats.xdp_drops;
+	u64_stats_update_end(&sq->stats.syncp);
+
+	return done;
+}
+
+static int veth_poll_tx(struct napi_struct *napi, int budget)
+{
+	struct veth_sq *sq = container_of(napi, struct veth_sq, xdp_napi);
+	struct xsk_buff_pool *pool;
+	int done = 0;
+	xdp_set_return_frame_no_direct();
+
+	sq->xsk.last_cpu = smp_processor_id();
+
+	/* xmit for tx queue */
+	rcu_read_lock();
+	pool = rcu_dereference(sq->xsk.pool);
+	if (pool) {
+		done = veth_xsk_tx_xmit(sq, pool, budget);
+	}
+	rcu_read_unlock();
+
+	if (done < budget) {
+		/* if done < budget, the tx ring has no more descriptors */
+		napi_complete_done(napi, done);
+	}
+
+	xdp_clear_return_frame_no_direct();
+
+	return done;
+}
+
+
+static int veth_napi_add_tx(struct net_device *dev)
+{
+	struct veth_priv *priv = netdev_priv(dev);
+	int i;
+
+	for (i = 0; i < dev->real_num_rx_queues; i++) {
+		struct veth_sq *sq = &priv->sq[i];
+		netif_napi_add(dev, &sq->xdp_napi, veth_poll_tx);
+		napi_enable(&sq->xdp_napi);
+	}
+
+	return 0;
+}
+
 static int veth_create_page_pool(struct veth_rq *rq)
 {
 	struct page_pool_params pp_params = {
@@ -1153,6 +1325,19 @@ static void veth_napi_del_range(struct net_device *dev, int start, int end)
 	}
 }
 
+static void veth_napi_del_tx(struct net_device *dev)
+{
+	struct veth_priv *priv = netdev_priv(dev);
+	int i;
+
+	for (i = 0; i < dev->real_num_rx_queues; i++) {
+		struct veth_sq *sq = &priv->sq[i];
+
+		napi_disable(&sq->xdp_napi);
+		__netif_napi_del(&sq->xdp_napi);
+	}
+}
+
 static void veth_napi_del(struct net_device *dev)
 {
 	veth_napi_del_range(dev, 0, dev->real_num_rx_queues);
@@ -1360,7 +1545,7 @@ static void veth_set_xdp_features(struct net_device *dev)
 		struct veth_priv *priv_peer = netdev_priv(peer);
 		xdp_features_t val = NETDEV_XDP_ACT_BASIC |
 				     NETDEV_XDP_ACT_REDIRECT |
-				     NETDEV_XDP_ACT_RX_SG;
+				     NETDEV_XDP_ACT_RX_SG | NETDEV_XDP_ACT_XSK_ZEROCOPY;
 
 		if (priv_peer->_xdp_prog || veth_gro_requested(peer))
 			val |= NETDEV_XDP_ACT_NDO_XMIT |
@@ -1737,11 +1922,89 @@ static int veth_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 	return err;
 }
 
+static int veth_xsk_pool_enable(struct net_device *dev, struct xsk_buff_pool *pool, u16 qid)
+{
+	struct veth_priv *peer_priv;
+	struct veth_priv *priv = netdev_priv(dev);
+	struct net_device *peer_dev = priv->peer;
+	int err = 0;
+
+	if (qid >= dev->real_num_tx_queues)
+		return -EINVAL;
+
+	if (!peer_dev)
+		return -EINVAL;
+
+	/* veth does no DMA, so skip the DMA check in xsk zero copy */
+	pool->dma_check_skip = true;
+
+	peer_priv = netdev_priv(peer_dev);
+	/*
+	 * enable peer tx xdp here; this side's
+	 * xdp is enabled by veth_xdp_set
+	 * todo: we need to check whether this side has already enabled xdp;
+	 * it may not have an xdp prog
+	 */
+	if (!(peer_priv->_xdp_prog) && (!veth_gro_requested(peer_dev))) {
+		/* peer should enable napi */
+		err = veth_napi_enable(peer_dev);
+		if (err)
+			return err;
+	}
+
+	/* This path is already protected by rtnl_lock, so rcu_assign_pointer
+	 * is safe.
+	 */
+	rcu_assign_pointer(priv->sq[qid].xsk.pool, pool);
+
+	veth_napi_add_tx(dev);
+
+	return err;
+}
+
+static int veth_xsk_pool_disable(struct net_device *dev, u16 qid)
+{
+	struct veth_priv *peer_priv;
+	struct veth_priv *priv = netdev_priv(dev);
+	struct net_device *peer_dev = priv->peer;
+	int err = 0;
+
+	if (qid >= dev->real_num_tx_queues)
+		return -EINVAL;
+
+	if (!peer_dev)
+		return -EINVAL;
+
+	peer_priv = netdev_priv(peer_dev);
+
+	/* todo: this may fail */
+	if (!(peer_priv->_xdp_prog) && (!veth_gro_requested(peer_dev))) {
+		/* disable peer napi */
+		veth_napi_del(peer_dev);
+	}
+
+	veth_napi_del_tx(dev);
+
+	rcu_assign_pointer(priv->sq[qid].xsk.pool, NULL);
+	return err;
+}
+
+/* enable or disable the xsk pool on a queue */
+static int veth_xsk_pool_setup(struct net_device *dev, struct netdev_bpf *xdp)
+{
+	if (xdp->xsk.pool)
+		return veth_xsk_pool_enable(dev, xdp->xsk.pool, xdp->xsk.queue_id);
+	else
+		return veth_xsk_pool_disable(dev, xdp->xsk.queue_id);
+}
+
 static int veth_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 {
 	switch (xdp->command) {
 	case XDP_SETUP_PROG:
 		return veth_xdp_set(dev, xdp->prog, xdp->extack);
+	case XDP_SETUP_XSK_POOL:
+		return veth_xsk_pool_setup(dev, xdp);
 	default:
 		return -EINVAL;
 	}

From patchwork Thu Aug 3 14:04:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?=
X-Patchwork-Id: 130693
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1238130vqx;
	Thu, 3 Aug 2023 08:48:49 -0700 (PDT)
X-Google-Smtp-Source:
AGHT+IFJkPJR9MSDoeDkNtdtG38dDfq30VtFGqTUeqeBUvbhroditrxiaZRKijzwByR6PfYAvo+A X-Received: by 2002:a17:902:8488:b0:1bc:48d7:f2a2 with SMTP id c8-20020a170902848800b001bc48d7f2a2mr776633plo.19.1691077728577; Thu, 03 Aug 2023 08:48:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691077728; cv=none; d=google.com; s=arc-20160816; b=LXacYkgdsuXmPnLBP7QfKL5GGszZBUaUEiT8DcjNrYRXAqG1zfsOvC8uI8VZ/7Bvnz B75WMFpuoJAgRCoZW0NiwdzDHhWksNZn5Q4R05aKValiNleroRMVSfEPwIHttNO4elaH 4tYPELyThapDSNanoM0UHhsoeUMRbm74SAJG9NxBtq0Al/tR6oFRMlzPiJXeCqNEv15C K00WbwHai0hyq5uzUSsWgficj6rex7gzgI7wkcWyhZBkz66yJBUvwJ7VHcdQbv28G3K6 +NkWzujrZYEKLwmjaFaSaf8/gfye5ADWnFLdk8+aW2eFL2KHOVARup5wmTpIRjF24eE0 Zkeg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=uAdtU7SLip58WY/jMpbQZS/1BOrBTm47cvrvRETW+jo=; fh=pQMjDmBpkga//+9OKrKQl28a8lyKhz4s+dSSuCJIV1I=; b=y4n5z+3N2NTwSDLLXbKoxY+LJXmNBih8ORp+ch1Z1kKyuCAunW/TEET2ZHC2DnLuRK LItPnICTQICsjRtz0cOOOJGF37ruAD9I9mPAp02MUUA8cAXr5bCK3QSyiQ2sZP6f210r 6voI7Wt56ZbNYh1yQVIv5qK12h+/vlFj9EkIbIDHQHRmmINPMOohTJ9Vp0ZfTe7EYG0w umEf8liwbXgj8w2FNrI2xieEUrRAW6VmZ9sZF/HGKOOsYnYcxEwZCFlKjICrp1QTyOcQ W+y/KNh1d+cqSFMnLyA1hk3R0j+bjuOz4gdWWRWEjjRfqNXbkBxPtksVUBukk6tOO6UF TYag== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b="gje/ZNVd"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id l11-20020a170902e2cb00b001b231cb6f22si57873plc.111.2023.08.03.08.48.08; Thu, 03 Aug 2023 08:48:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b="gje/ZNVd"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236541AbjHCOHp (ORCPT + 99 others); Thu, 3 Aug 2023 10:07:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52736 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234598AbjHCOHT (ORCPT ); Thu, 3 Aug 2023 10:07:19 -0400 Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4330D46AC for ; Thu, 3 Aug 2023 07:06:21 -0700 (PDT) Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1bbc64f9a91so8987615ad.0 for ; Thu, 03 Aug 2023 07:06:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071580; x=1691676380; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=uAdtU7SLip58WY/jMpbQZS/1BOrBTm47cvrvRETW+jo=; b=gje/ZNVdV37PsuBXcn9JcvDF1Ybdczh0zKLeYvcGuYmGxjT6hPC415oZTPIOtzx83g 7FFCFIT6Kmor5VCwN/i3qktqe92oFscFmNOAH082Ja7YR86GHFserclV4lc3CJNE1JVd +JiXjBg0V2wOrubU8ZMJeKXuoUv0Ozj0un1MiQjMdSXoHWHZZSf8vRs5OnkLjIX36jjU ig5yH157Hd/7+McdeYRCdJOIPawm3lbgYqB1DJYMWZGMAML63qPftJeILVLdeNK3uyOL 
pQHZQMLoxkAmw95IlOXouBPXN7IMMwa9psr71neaanIiaQcBn8UqW9AyIOJ3Td1T4F+e 6mDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071580; x=1691676380; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=uAdtU7SLip58WY/jMpbQZS/1BOrBTm47cvrvRETW+jo=; b=eiV3SsjyqBDXvvqFwe9QFTCKYNA0c9Mcd+1XEcBIUkHLhjMkvubMqzlV6Y/PHZghLR BBiYzK8xY4kfH+4qIS3F6YFqccdgjF/dIuPq29Z1WXvyUsdkf6kIsuxAaQGzkbX65VR1 GEZnHvLKjYFH4DOFeWT1F7RJvk6MAj5XmQRdhwXcnLwyNpGgsjEmSziC5XChjR53e+HM TH8VNhU/RmMpNl8o8/5C2jDpaDSb5xll6e3ojUILSA9DOoEgVgCVpBzZ7d3S2TqegJp7 mIzpwbIGGiEVMRpEFYnH0MTyY8ZZeNMVW4o2mBsB1uAYrfOjc48DJggY0epa4dWlpBQw +wrg== X-Gm-Message-State: ABy/qLZSQ3NPtex4x6ectbaTjmMpbhUoGdfeJFhImJ6bDYGdsiD0LGTP 5Rw4zJtX3+rg9x0YMZ4v5crpDQ== X-Received: by 2002:a17:902:e807:b0:1b9:e091:8037 with SMTP id u7-20020a170902e80700b001b9e0918037mr23334397plg.30.1691071580298; Thu, 03 Aug 2023 07:06:20 -0700 (PDT) Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.06.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:06:19 -0700 (PDT) From: "huangjie.albert" To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov , Shmulik Ladkani , Kees Cook , Richard Gobert , Yunsheng Lin , netdev@vger.kernel.org (open list:NETWORKING DRIVERS), linux-kernel@vger.kernel.org (open list), bpf@vger.kernel.org (open list:XDP (eXpress Data Path)) Subject: [RFC Optimizing veth xsk performance 06/10] veth: add ndo_xsk_wakeup callback for veth Date: Thu, 3 Aug 2023 22:04:32 +0800 Message-Id: 
<20230803140441.53596-7-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com>
References: <20230803140441.53596-1-huangjie.albert@bytedance.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,
	SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable
	autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
	lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1773223520009216454
X-GMAIL-MSGID: 1773223520009216454

Signed-off-by: huangjie.albert
---
 drivers/net/veth.c | 40 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 40 insertions(+)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 944761807ca4..600225e27e9e 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1840,6 +1840,45 @@ static void veth_set_rx_headroom(struct net_device *dev, int new_hr)
 	rcu_read_unlock();
 }
 
+static void veth_xsk_remote_trigger_napi(void *info)
+{
+	struct veth_sq *sq = info;
+
+	napi_schedule(&sq->xdp_napi);
+}
+
+static int veth_xsk_wakeup(struct net_device *dev, u32 qid, u32 flag)
+{
+	struct veth_priv *priv;
+	struct veth_sq *sq;
+	u32 last_cpu, cur_cpu;
+
+	if (!netif_running(dev))
+		return -ENETDOWN;
+
+	if (qid >= dev->real_num_rx_queues)
+		return -EINVAL;
+
+	priv = netdev_priv(dev);
+	sq = &priv->sq[qid];
+
+	if (napi_if_scheduled_mark_missed(&sq->xdp_napi))
+		return 0;
+
+	last_cpu = sq->xsk.last_cpu;
+	cur_cpu = get_cpu();
+
+	/* raise a napi */
+	if (last_cpu == cur_cpu) {
+		napi_schedule(&sq->xdp_napi);
+	} else {
+		smp_call_function_single(last_cpu, veth_xsk_remote_trigger_napi, sq, true);
+	}
+
+	put_cpu();
+	return 0;
+}
+
 static int veth_xdp_set(struct net_device *dev, struct bpf_prog *prog,
 			struct netlink_ext_ack *extack)
 {
@@ -2054,6 +2093,7 @@ static const struct net_device_ops veth_netdev_ops = {
 	.ndo_set_rx_headroom = veth_set_rx_headroom,
 	.ndo_bpf = veth_xdp,
 	.ndo_xdp_xmit = veth_ndo_xdp_xmit,
+	.ndo_xsk_wakeup = veth_xsk_wakeup,
 	.ndo_get_peer_dev = veth_peer_dev,
 };
 

From patchwork Thu Aug 3 14:04:33 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?=
X-Patchwork-Id: 130739
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1280320vqx;
	Thu, 3 Aug 2023 09:58:14 -0700 (PDT)
X-Google-Smtp-Source: AGHT+IHllIFHoDbeu50Y5bg1+ABku8mZyxU/gOoTnw+948as7xPflwbygwsY6Xqw8iUCB17pF73h
X-Received: by 2002:a05:6a20:3caa:b0:13f:8855:d5a0 with SMTP id b42-20020a056a203caa00b0013f8855d5a0mr1670964pzj.50.1691081893756;
	Thu, 03 Aug 2023 09:58:13 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1691081893; cv=none; d=google.com; s=arc-20160816;
	b=wZrCjmZFiELp4LeuuguZi059NN3KP+BJM+/46OUiQcdVbuJXJNju8kwjrMDe78qLZ+
	QZujXRvEWEeZQEOeRHnA/C7kh0IBlCUx85R/EhjJSeAezIcjN+bb7yHh8wwNbhAdzm+Q
	b5JespWmNteRKNaRBiBBpnD3SRlhD+gGuOTF3rz/RVr+1HR+mOV1dA1Kls2//JOk+Fdb
	kAyEbYOBsvxkvQM0AFGzuH+FC4t4UuDtWYxgte7yhObjNYk1pYXUHdVKL3QKpDyel81y
	gJibaoQh7XdmYxOdftONerEHcNJ4W3BsCKqJqtW+2hRuZv2vH75oU8P83123D8ZMdgmr
	ehng==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816;
	h=list-id:precedence:content-transfer-encoding:mime-version
	:references:in-reply-to:message-id:date:subject:cc:to:from
	:dkim-signature;
	bh=Eg+CmrWjgB1kgvmO7SAfesuSxaP3dPOXxrKbkUe7IUY=;
	fh=pQMjDmBpkga//+9OKrKQl28a8lyKhz4s+dSSuCJIV1I=;
	b=DE7HvcXagPPi2YYIJOekyXsbfOWOitE2km8cA0aVYdZTacBwHZ6YWL+knJ3J3fJVTf
	WerJdIrOOJDgrMhgjXIwE2RDhgWZtTsmgczl9N00USw8mfja0widkSPCZXGwxugmxTjR
	8LtThnZ0vpTnDK/arMknyKqtBHZ6hHiGzIh8Fc3gOPk6Cku0usP7WZNBvzSLKMJH+T5g
	gA9pfn8cjP1cH/Fm0bc/rh8ZLYhZSjvzDRNr3qm6w1S4iLWsQiGuUYS3Neu5ztMaCjra
D9jZE2b9h54zoUCkfYdg9xtgWO/ewjqqOSzg1Q8MylSyC0fR8QBZvZY0eK1dnBKl1g8p n+/A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=XrqyvZDx; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id kp3-20020a170903280300b001b88997ababsi146612plb.412.2023.08.03.09.58.00; Thu, 03 Aug 2023 09:58:13 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=XrqyvZDx; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S235983AbjHCOHv (ORCPT + 99 others); Thu, 3 Aug 2023 10:07:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53392 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236015AbjHCOHY (ORCPT ); Thu, 3 Aug 2023 10:07:24 -0400 Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8ACE546BC for ; Thu, 3 Aug 2023 07:06:32 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id d9443c01a7336-1b8ad356f03so6965685ad.1 for ; Thu, 03 Aug 2023 07:06:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071592; x=1691676392; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Eg+CmrWjgB1kgvmO7SAfesuSxaP3dPOXxrKbkUe7IUY=; b=XrqyvZDxSgiavPWsBCbSRHg+ou1GVyTzy2qzcAS64qQGp+5sGIgEphDFf+c7iQsHOn sZUqX8uA285/7n1CGhGV341ljQbfmvQZRJA2l/46yYxYNTZNY//6jbTn4Sz5r00eY15x uP68lQ9F0N2yR9rt4u2yYkcTx+bCa4Z5wEMIUI8iFf5eXqYrRcPeDEfmOkp7L+DROPOW 49sQrOMYJDe2u70gPeP+xU/GOuRU3LH+bIrm3AOp+0kMDIpIgOwC7BAiBy+nnWtFq7XK uOJx0krx1qjTykH3Pfsu4sp+nJH4RWaU59BpMVaGHqHStjFrTTCi4SYIYS+lZoPloPYZ yZAg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071592; x=1691676392; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Eg+CmrWjgB1kgvmO7SAfesuSxaP3dPOXxrKbkUe7IUY=; b=lxFxFIaxc4ZV3mqu8hTD5uAoKx8MDVOIwVzl14vPEWQ+SO/jXes56loXU+3U9IPfXr hEhY4yTN9WaByW7lUxrwQphSK4mKFLUGTm5PObDZGQj0VTfoEVS64RgtvxTGAYjY0Z0t UfizUkopdeTac0+WoIx3YEKXZB1/s7IPI/0e4uXkJU5HJuHOElZpPBgdedYE1RwfPGQC AOV8S9tKR2zjFnffpTQpot4FH7xl5hdbI+ud9DTqQpH/8WR/xkj1oJQyyufhOUyfUsJS 5YbxeFdbZB221k332DfAeEWVKY2DOD8eXy1OscEYUM789OIGGEsIgPprGK0gUiEM6KpQ kImA== X-Gm-Message-State: ABy/qLaedCLlZrgl9t1pZfHdPATxOJXZNOk9/wvjFYdv8N7xe6SSqfTt tyiE0Yus1Ps6wMlwHJSjwD8Ybg== X-Received: by 2002:a17:903:120a:b0:1bb:9bc8:d230 with SMTP id l10-20020a170903120a00b001bb9bc8d230mr20412138plh.23.1691071591969; Thu, 03 Aug 2023 07:06:31 -0700 (PDT) Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.06.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:06:31 -0700 (PDT) From: "huangjie.albert" To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , 
	=?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson ,
	Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov ,
	Shmulik Ladkani , Kees Cook , Richard Gobert , Yunsheng Lin ,
	netdev@vger.kernel.org (open list:NETWORKING DRIVERS),
	linux-kernel@vger.kernel.org (open list),
	bpf@vger.kernel.org (open list:XDP (eXpress Data Path))
Subject: [RFC Optimizing veth xsk performance 07/10] sk_buff: add destructor_arg_xsk_pool for zero copy
Date: Thu, 3 Aug 2023 22:04:33 +0800
Message-Id: <20230803140441.53596-8-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com>
References: <20230803140441.53596-1-huangjie.albert@bytedance.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
	DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,
	SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable
	autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
	lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1773227887919036433
X-GMAIL-MSGID: 1773227887919036433

This member is added so that a dummy device can support zero copy.

Signed-off-by: huangjie.albert
---
 include/linux/skbuff.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 16a49ba534e4..fa9577d233a4 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -592,6 +592,7 @@ struct skb_shared_info {
 	/* Intermediate layers must ensure that destructor_arg
 	 * remains valid until skb destructor */
 	void * destructor_arg;
+	void * destructor_arg_xsk_pool; /* just for dummy device xsk zero copy */
 
 	/* must be last field, see pskb_expand_head() */
 	skb_frag_t frags[MAX_SKB_FRAGS];

From patchwork Thu Aug 3 14:04:34 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?= X-Patchwork-Id: 130686 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1227410vqx; Thu, 3 Aug 2023 08:30:56 -0700 (PDT) X-Google-Smtp-Source: APBJJlGl152kDoEYILmJFOa2HlWuYdQof7+jQbwBpBU8KJpUAfUZw4+AG+axwODCTY2pzNxIpn2h X-Received: by 2002:a17:907:2ce9:b0:993:d536:3cb7 with SMTP id hz9-20020a1709072ce900b00993d5363cb7mr7304469ejc.11.1691076656103; Thu, 03 Aug 2023 08:30:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691076656; cv=none; d=google.com; s=arc-20160816; b=x8vDoHSMDL44vV2hdNR2/vgaB+L+vgo0qepP6rGeoCzKfMZ3qpOs1WEu4tZzt0a2Vr AVrfVPE8YRT+aXobjL1YeEq1RDrN82Q58VNgpL+ye8xAN5k7XuP7XnMG+Bo+Vq+lMCnI oKIWE0UQg+EqaYUUdf89nt3qTq9fEdEMfgp80N2k4QQIVqqEEkYnqX/H0L2bqJ/u6BnX xAfl9iuGcd59wa0+ocqxS35sxKTlATbJaxLYoSJ73sNd2KNJHM7oSycECyddWBaEYkWb COKedrZh3ro9kKE7mtn9d3/DeBRvePimtHcQjDZxsFgGInuqhxQTozW0j7eTiFPZ/WAB 91Mg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=TLnjc8jPq2PoD5M5aiFgUeVBrc9aIeJJlwbo6fwSV28=; fh=dg990XKbH+C4xTsBhEdr0dwTMKS/BXfsG30JvJVxzb8=; b=TX8MsgJoxo0ULkoDqi8Deo7J1RGcU3ftk3lWF//415+4EXV72XSegg9GIgAJHMz3uU dDeuPO3hOkl/6jV+NzOFw43kYZWAUcNMF4wXDXsGy9TBFzWv3MSgwCasfCRgifqQmCX3 5rIqw6QIT3/uGhMoIUsi2fCnM1dKq4EsUMAgI/rRGr6FGz/PU/NFWYnvE2K6tWoRmi1/ 2PlVRWaFXklXOSANGBL1ylvVjS2/a/PIzbBzkIlFRmEWv2eiDpi92lFmn2aZrOr4QEI5 1Z+c+i0aejDHYTVkfC9GsO4IanDGxv60GU8ojhylEf4Q3iXcfa9sEIw8PUlULAr4L0yk Tbgw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=PniuI4mh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email 
(out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s6-20020a1709067b8600b00997d1bed609si8102711ejo.550.2023.08.03.08.30.29; Thu, 03 Aug 2023 08:30:56 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=PniuI4mh; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233979AbjHCOJO (ORCPT + 99 others); Thu, 3 Aug 2023 10:09:14 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52820 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S235797AbjHCOHn (ORCPT ); Thu, 3 Aug 2023 10:07:43 -0400 Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 903EB4224 for ; Thu, 3 Aug 2023 07:06:44 -0700 (PDT) Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1b9c5e07c1bso8731885ad.2 for ; Thu, 03 Aug 2023 07:06:44 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071604; x=1691676404; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=TLnjc8jPq2PoD5M5aiFgUeVBrc9aIeJJlwbo6fwSV28=; b=PniuI4mhE+9Hr5kVvw8WodCwDYHCknlrSoi4oY/F/4MMl9qgwDvgyPHh76hzQ6TSjH Vd0kAhHQsPBcV7x4jj/bvR4YBGspwCSImxmOJnMrRL7xOC6KwcEj9UDlYCeE9sLTaJh5 a8SJArDhjkzS9FNXvbI3C6V+aHuAWEdph5x0qqqWF352VlalvDF0dWL3rqWtUR1xQ0yu 7kasjrpqgsxNGpDDdDWhB2qbQGwLM4HYCXkVaBmBqQm4ke9nl12IufmKSnFbwO+bgZWO 
UEWrwVu6BztJAvACi1qE0AXwr+3xmtO4RtI/OJr2LdfmbI53BnQ9b9o8HUYub4rpIsjD Ex7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071604; x=1691676404; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=TLnjc8jPq2PoD5M5aiFgUeVBrc9aIeJJlwbo6fwSV28=; b=FaiGp+Czur+9+OP5YgyeQFUD7EXloNBy68w/wE5JQkj1PiLkKRbHyrhteSsg5Ov2w8 I5XU0bIL0XAKo1WuZbK9DriOpouhkHerYaUpdL/xBobRgxcmHHgeLaioWvmA54GpycIj cYKwVjJqvkozIjI/l45O03MV2k/Zact9WJq1UkKcpywyH3OIqdCp+27b2P6pxTTSxjQq A0h7CbXZWAzkEwmEwBSCH2HgmCw55zPBUD6WWzjteVdGamYRt49P/aODVAiXvoj/qOH8 8G7m9Z5IS+kT2kN1MhC5+lqRUDliDCESC/WVZYmhOLn0rfbKXbK2B9l0yUeIPkTGkrCL KclQ== X-Gm-Message-State: ABy/qLZtrt4hxKn7D+S5QN8dqPoO9XE3h6+ry3twd7qQsOdB7pKHjwLL lymK8tQ7bFscrBqYeAJoKZxX3Q== X-Received: by 2002:a17:903:22c8:b0:1bb:b91b:2b40 with SMTP id y8-20020a17090322c800b001bbb91b2b40mr22927532plg.60.1691071603910; Thu, 03 Aug 2023 07:06:43 -0700 (PDT) Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.06.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:06:43 -0700 (PDT) From: "huangjie.albert" To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov , Kees Cook , Shmulik Ladkani , Richard Gobert , Yunsheng Lin , netdev@vger.kernel.org (open list:NETWORKING DRIVERS), linux-kernel@vger.kernel.org (open list), bpf@vger.kernel.org (open list:XDP (eXpress Data Path)) Subject: [RFC Optimizing veth xsk performance 08/10] xdp: add xdp_mem_type MEM_TYPE_XSK_BUFF_POOL_TX Date: Thu, 3 Aug 2023 22:04:34 +0800 Message-Id: 
 <20230803140441.53596-9-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com>
References: <20230803140441.53596-1-huangjie.albert@bytedance.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,
 SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable
 autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1773222395290435353
X-GMAIL-MSGID: 1773222395290435353

This xdp mem type will be used for zero copy in a later patch.

Signed-off-by: huangjie.albert
---
 include/net/xdp.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/net/xdp.h b/include/net/xdp.h
index d1c5381fc95f..cb1621b5a0c9 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -42,6 +42,7 @@ enum xdp_mem_type {
 	MEM_TYPE_PAGE_ORDER0,     /* Orig XDP full page model */
 	MEM_TYPE_PAGE_POOL,
 	MEM_TYPE_XSK_BUFF_POOL,
+	MEM_TYPE_XSK_BUFF_POOL_TX,
 	MEM_TYPE_MAX,
 };
 

From patchwork Thu Aug 3 14:04:35 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?=
X-Patchwork-Id: 130721
Return-Path:
Delivered-To: ouuuleilei@gmail.com
Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1257275vqx;
 Thu, 3 Aug 2023 09:17:10 -0700 (PDT)
X-Google-Smtp-Source: APBJJlFibil2yvixVk83hv/JSDElfhpoyFdhXc+jY2rphVm7a6UpqstNew/UaycqfIKDnACShzDM
X-Received: by 2002:a05:6870:a2ce:b0:1b0:3637:2bbe with SMTP id
 w14-20020a056870a2ce00b001b036372bbemr22677923oak.54.1691079430121;
 Thu, 03 Aug 2023 09:17:10 -0700 (PDT)
ARC-Seal: i=1; a=rsa-sha256; t=1691079430; cv=none;
 d=google.com; s=arc-20160816;
b=xB8WJ/xTm63xl+3HWrTT5V7G4NgEtSjUo1eNH154wH4zc8Ref2+1m7nlgbfyUiKePq tq7hP8ysCHZdGjwUoRn1Zi4JUMwFTaaeuc6DBLrLIEil/IkYIIgi/lh70sA0t6IqXGsc UwiTq73KUPuv1QROh0X52dqWbDwZuOB8afsRGv1dgv4U9lmmUSgiykU/8m1mfirdtGbf gyJpQiFCWAurts/aKP5swoYNkTRqNaRCJQ8MY7osI0jiVOsD+0IdP5rNK3NzcVFO32Lr a2EbbxdYdK+Lvsqvl3i2aQByvjnwhs/jzEhON09VyQOwXAPaaEYEyszBGwmKp214X1sq BuEw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=owtOoOyNNyHXuBh5jsdR6sP92r7iqlMI4HArUfBEUKw=; fh=g4uBolO3oPEe9/DEEo6mvafqNfqpb5cy3Mv12uv7tV0=; b=GM1dpHgchxVuED1GpFn82d3yQuT7+HeCRy4pezOptuAtvmOUpvP01kFRgCOe4iWpks AYH6gKvJNhyF4ocwKxYMVFmJAYPOC775FUBuw9PdFG3ZcrUEtAGAIF4oAHO63rnv1N4F xmqcw7yJGKvtse4NDScj2z9GFqD8E6KIvv/sxiuEVlU0LuZNWN0MaWQ7cykKBrx9BJPY S6QMpxm8xbeJPHAtGbfQpar4flXw366tJMPyuYwTjNhV5o4tJ1biME7n4ShbKl6BK3an kT5q+9Tdb/o16YCxhDMo664z+UWJVlPAFq7K5aZ9yUivpF1+8hEd2avd3oEuCWS5uBuE QWOA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b="Ghs/tD2b"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. 
[2620:137:e000::1:20]) by mx.google.com with ESMTP id nv12-20020a17090b1b4c00b00262ef440ed4si3667493pjb.27.2023.08.03.09.16.54; Thu, 03 Aug 2023 09:17:10 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b="Ghs/tD2b"; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236525AbjHCOJd (ORCPT + 99 others); Thu, 3 Aug 2023 10:09:33 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:53104 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236643AbjHCOH6 (ORCPT ); Thu, 3 Aug 2023 10:07:58 -0400 Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B9A531706 for ; Thu, 3 Aug 2023 07:06:56 -0700 (PDT) Received: by mail-pl1-x631.google.com with SMTP id d9443c01a7336-1b8b2b60731so6882235ad.2 for ; Thu, 03 Aug 2023 07:06:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1691071616; x=1691676416; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=owtOoOyNNyHXuBh5jsdR6sP92r7iqlMI4HArUfBEUKw=; b=Ghs/tD2bqRAzRE4qNjDbjZ2jqi4Aud2qJLVOS1R3Mn88/gE9N324cixR+sHK93osyp kISFnDgXvzvcrT9htphzgxZPqyDOOydh3MGFg0+3skcYy7YW0UgwejBFFqHoE510WuH4 sFCXuh4V1WWmQnAyX2ZkAfhjFe6RKRym9yHoJyr/F9WBTvHWGi5Fk5PcjLbyqUFLQtxt /Mb2H98BsMj1+apKPqpzLhPSfVFPZyo8iu6ITI8us/EfGT2OT3+7ELiVbaLkQi8wj9lF 
mMxas9KXjr51l0KO8jbXfO/aY1egn8YB8+1ATEyujoYTK0aj92t+hL/hw7Db5hxZpLTj dLmw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1691071616; x=1691676416; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=owtOoOyNNyHXuBh5jsdR6sP92r7iqlMI4HArUfBEUKw=; b=Af14aiDjDNnpaTRt5YHgttVJevylBz1B2XZVhERuumYBjuAriMtMxtF5lc4YeMtjqA dRY6vb/EPoyCS9a8LTCr+ONAylXxSZrmntn+Gife+X1GIG7Qkz3JhfU866rnsWQoZNzt MWuGZoMmuy/vLYo+QWYERFHH1KzRAiuie2Vn8RfhE0vda6oafmleVn45e9C7qShC2bYA mZetaRMVAgiXEHCC5OI429LMEjklbh1PizPGneAOwgQ6tNsMfvDuItZ7DqcgbwDywfzn Qc7n+ZXSo7bcZNnjuDl6AJm9dUV53atilMohqYC+L3cGAkBREBtXP4jQq1jfGY0PHkkY Yaow== X-Gm-Message-State: ABy/qLZGuhDWpE3IsHjinqkRmXPY/jteiDoO1MQxPhbsAQN1AzCOZTLa cjas6wh0rituUhzcpaTfQSt/AA== X-Received: by 2002:a17:902:f686:b0:1bb:673f:36ae with SMTP id l6-20020a170902f68600b001bb673f36aemr19669038plg.15.1691071616012; Thu, 03 Aug 2023 07:06:56 -0700 (PDT) Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8]) by smtp.gmail.com with ESMTPSA id ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.06.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 03 Aug 2023 07:06:55 -0700 (PDT) From: "huangjie.albert" To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org, pabeni@redhat.com Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann , Jesper Dangaard Brouer , John Fastabend , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski , Jonathan Lemon , Pavel Begunkov , Yunsheng Lin , Kees Cook , Richard Gobert , netdev@vger.kernel.org (open list:NETWORKING DRIVERS), linux-kernel@vger.kernel.org (open list), bpf@vger.kernel.org (open list:XDP (eXpress Data Path)) Subject: [RFC Optimizing veth xsk performance 09/10] veth: support zero copy for af xdp Date: Thu, 3 Aug 2023 22:04:35 +0800 Message-Id: 
 <20230803140441.53596-10-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com>
References: <20230803140441.53596-1-huangjie.albert@bytedance.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,
 SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable
 autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1773225304275753830
X-GMAIL-MSGID: 1773225304275753830

The following conditions need to be satisfied to achieve zero-copy:
1. The tx desc has enough space to store the xdp_frame and skb_shared_info.
2. The memory address pointed to by the tx desc is within a page.

Tested zero copy with libxdp.
Performance:
                      | MSS (bytes) | Packet rate (PPS)
AF_XDP                | 1300        | 480k
AF_XDP with zero copy | 1300        | 540k

Signed-off-by: huangjie.albert
---
 drivers/net/veth.c | 207 ++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 178 insertions(+), 29 deletions(-)

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 600225e27e9e..e4f1a8345f42 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -103,6 +103,11 @@ struct veth_xdp_tx_bq {
 	unsigned int count;
 };
 
+struct veth_seg_info {
+	u32 segs;
+	u64 desc[] ____cacheline_aligned_in_smp;
+};
+
 /*
  * ethtool interface
  */
@@ -645,6 +650,100 @@ static int veth_xdp_tx(struct veth_rq *rq, struct xdp_buff *xdp,
 	return 0;
 }
 
+static struct sk_buff *veth_build_skb(void *head, int headroom, int len,
+				      int buflen)
+{
+	struct sk_buff *skb;
+
+	skb = build_skb(head, buflen);
+	if (!skb)
+		return NULL;
+
+	skb_reserve(skb, headroom);
+	skb_put(skb, len);
+
+	return skb;
+}
+
+static void veth_xsk_destruct_skb(struct sk_buff *skb)
+{
+	struct
veth_seg_info *seg_info = (struct veth_seg_info *)skb_shinfo(skb)->destructor_arg; + struct xsk_buff_pool *pool = (struct xsk_buff_pool *)skb_shinfo(skb)->destructor_arg_xsk_pool; + unsigned long flags; + u32 index = 0; + u64 addr; + + /* release cq */ + spin_lock_irqsave(&pool->cq_lock, flags); + for (index = 0; index < seg_info->segs; index++) { + addr = (u64)(long)seg_info->desc[index]; + xsk_tx_completed_addr(pool, addr); + } + spin_unlock_irqrestore(&pool->cq_lock, flags); + + kfree(seg_info); + skb_shinfo(skb)->destructor_arg = NULL; + skb_shinfo(skb)->destructor_arg_xsk_pool = NULL; +} + +static struct sk_buff *veth_build_skb_zerocopy(struct net_device *dev, struct xsk_buff_pool *pool, + struct xdp_desc *desc) +{ + struct veth_seg_info *seg_info; + struct sk_buff *skb; + struct page *page; + void *hard_start; + u32 len, ts; + void *buffer; + int headroom; + u64 addr; + u32 index; + + addr = desc->addr; + len = desc->len; + buffer = xsk_buff_raw_get_data(pool, addr); + ts = pool->unaligned ? 
len : pool->chunk_size; + + headroom = offset_in_page(buffer); + + /* offset in umem pool buffer */ + addr = buffer - pool->addrs; + + /* get the page of the desc */ + page = pool->umem->pgs[addr >> PAGE_SHIFT]; + + /* in order to avoid to get freed by kfree_skb */ + get_page(page); + + hard_start = page_to_virt(page); + + skb = veth_build_skb(hard_start, headroom, len, ts); + seg_info = (struct veth_seg_info *)kmalloc(struct_size(seg_info, desc, MAX_SKB_FRAGS), GFP_KERNEL); + if (!seg_info) + { + printk("here must to deal with\n"); + } + + /* later we will support gso for this */ + index = skb_shinfo(skb)->gso_segs; + seg_info->desc[index] = desc->addr; + seg_info->segs = ++index; + + skb->truesize += ts; + skb->dev = dev; + skb_shinfo(skb)->destructor_arg = (void *)(long)seg_info; + skb_shinfo(skb)->destructor_arg_xsk_pool = (void *)(long)pool; + skb->destructor = veth_xsk_destruct_skb; + + /* set the mac header */ + skb->protocol = eth_type_trans(skb, dev); + + /* to do, add skb to sock. 
may be there is no need to do for this + * refcount_add(ts, &xs->sk.sk_wmem_alloc); + */ + return skb; +} + static struct xdp_frame *veth_xdp_rcv_one(struct veth_rq *rq, struct xdp_frame *frame, struct veth_xdp_tx_bq *bq, @@ -1063,6 +1162,20 @@ static int veth_poll(struct napi_struct *napi, int budget) return done; } +/* if buffer contain in a page */ +static inline bool buffer_in_page(void *buffer, u32 len) +{ + u32 offset; + + offset = offset_in_page(buffer); + + if(PAGE_SIZE - offset >= len) { + return true; + } else { + return false; + } +} + static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool, int budget) { struct veth_priv *priv, *peer_priv; @@ -1073,6 +1186,9 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool, struct veth_xdp_tx_bq bq; struct xdp_desc desc; void *xdpf; + struct sk_buff *skb = NULL; + bool zc = xsk_pool->umem->zc; + u32 xsk_headroom = xsk_pool->headroom; int done = 0; bq.count = 0; @@ -1102,12 +1218,6 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool, break; } - /* - * Get a xmit addr - * desc.addr is a offset, so we should to convert to real virtual address - */ - addr = xsk_buff_raw_get_data(xsk_pool, desc.addr); - /* can not hold all data in a page */ truesize = SKB_DATA_ALIGN(sizeof(struct skb_shared_info)) + desc.len + sizeof(struct xdp_frame); if (truesize > PAGE_SIZE) { @@ -1116,16 +1226,39 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool, continue; } - page = dev_alloc_page(); - if (!page) { - /* - * error , release xdp frame and increase drops - */ - xsk_tx_completed_addr(xsk_pool, desc.addr); - stats.xdp_drops++; - break; + /* + * Get a xmit addr + * desc.addr is a offset, so we should to convert to real virtual address + */ + addr = xsk_buff_raw_get_data(xsk_pool, desc.addr); + + /* + * in order to support zero copy, headroom must have enough space to hold xdp_frame + */ + if (zc && (xsk_headroom < 
sizeof(struct xdp_frame))) + zc = false; + + /* + * if desc not contain in a page, also do not support zero copy + */ + if (!buffer_in_page(addr, desc.len)) + zc = false; + + if (zc) { + /* headroom is reserved for xdp_frame */ + new_addr = addr - sizeof(struct xdp_frame); + } else { + page = dev_alloc_page(); + if (!page) { + /* + * error , release xdp frame and increase drops + */ + xsk_tx_completed_addr(xsk_pool, desc.addr); + stats.xdp_drops++; + break; + } + new_addr = page_to_virt(page); } - new_addr = page_to_virt(page); p_frame = new_addr; new_addr += sizeof(struct xdp_frame); @@ -1137,19 +1270,37 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool, */ p_frame->headroom = 0; p_frame->metasize = 0; - p_frame->frame_sz = PAGE_SIZE; p_frame->flags = 0; - p_frame->mem.type = MEM_TYPE_PAGE_SHARED; - memcpy(p_frame->data, addr, p_frame->len); - xsk_tx_completed_addr(xsk_pool, desc.addr); - - /* if peer have xdp prog, if it has ,just send to peer */ - p_frame = veth_xdp_rcv_one(peer_rq, p_frame, &bq, &peer_stats); - /* if no xdp with this queue, convert to skb to xmit*/ - if (p_frame) { - xdpf = p_frame; - veth_xdp_rcv_bulk_skb(peer_rq, &xdpf, 1, &bq, &peer_stats); - p_frame = NULL; + + if (zc) { + p_frame->frame_sz = xsk_pool->frame_len; + /* to do: if there is a xdp, how to recycle the tx desc */ + p_frame->mem.type = MEM_TYPE_XSK_BUFF_POOL_TX; + /* no need to copy address for af+xdp */ + p_frame = veth_xdp_rcv_one(peer_rq, p_frame, &bq, &peer_stats); + if (p_frame) { + skb = veth_build_skb_zerocopy(peer_dev, xsk_pool, &desc); + if (skb) { + napi_gro_receive(&peer_rq->xdp_napi, skb); + skb = NULL; + } else { + xsk_tx_completed_addr(xsk_pool, desc.addr); + } + } + } else { + p_frame->frame_sz = PAGE_SIZE; + p_frame->mem.type = MEM_TYPE_PAGE_SHARED; + memcpy(p_frame->data, addr, p_frame->len); + xsk_tx_completed_addr(xsk_pool, desc.addr); + + /* if peer have xdp prog, if it has ,just send to peer */ + p_frame = 
veth_xdp_rcv_one(peer_rq, p_frame, &bq, &peer_stats); + /* if no xdp with this queue, convert to skb to xmit*/ + if (p_frame) { + xdpf = p_frame; + veth_xdp_rcv_bulk_skb(peer_rq, &xdpf, 1, &bq, &peer_stats); + p_frame = NULL; + } } stats.xdp_bytes += desc.len; @@ -1163,8 +1314,6 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool, xsk_tx_release(xsk_pool); } - - /* just for peer rq */ if (peer_stats.xdp_tx > 0) veth_xdp_flush(peer_rq, &bq); From patchwork Thu Aug 3 14:04:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: =?utf-8?b?6buE5p2w?= X-Patchwork-Id: 130662 Return-Path: Delivered-To: ouuuleilei@gmail.com Received: by 2002:a59:9f41:0:b0:3e4:2afc:c1 with SMTP id v1csp1187375vqx; Thu, 3 Aug 2023 07:28:52 -0700 (PDT) X-Google-Smtp-Source: APBJJlHArHZLJ0gpW9447qMtHekhz0X0JqWL1nkE/olgjZ/ojSC9Xn6xMXiNsNc31gybxhfwskOs X-Received: by 2002:a05:6a00:22d3:b0:687:536a:2e5b with SMTP id f19-20020a056a0022d300b00687536a2e5bmr8675960pfj.26.1691072931895; Thu, 03 Aug 2023 07:28:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691072931; cv=none; d=google.com; s=arc-20160816; b=MT97iuHJgmhwNVVnZIyYujStgY1Lz8CRag1ypbDYyMA7rJnzSExsFLh7NrZjpP49d+ kedOrUFXAuqsK9jeW9YoOM4yBXyfNdm8bo2rQV5XCJdM37mRhv19ZgD1EjigoliGQhlp i5AybTT4jVdja/xpd6vfZge98rxvgk5wUmBbW+s4DVRzdabNItsnUArFfA94OUM6MPYA xvWHgJ41/wqnExL7t0HNGDdKLwdq6a+82BjJHitGR5Htz4uEudGgy77cKPCx5WqxAi70 DSuMSGyYsh/IClclPLlfHpA+7YjKWQETVCtHZuPHwt9KkeIZr5epssQq/oECYpxSqCse pP/g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=/2IExsisHLl5wetq8sgZJvv6YFTcLdIhC4TMC2ESGVY=; fh=g4uBolO3oPEe9/DEEo6mvafqNfqpb5cy3Mv12uv7tV0=; b=zuMEj7aiFEUv52Y909rFLSRiZmVtGiSN9Gz3UN+48UUA9/5ChkaUvJy4aLAreGCo1b AoG3CybEJQJeX/dRS3Sx2rcNsvZ8llPpNZ5LrBi2USU79OPR4/+GosdwfqeopZ9ms7TE 
ot3XDyeweBP15KAFvVqbfcEkOkEi49Ds4wWhlh6WGchS1i2UU0Xuotg/2lbun7pMCtrt azcRVA2cb+vztn//oWAAFmnrNhANdHkTDVOGb31WqwNoZy8oxkPXJTq+4NRsQQLuAT+8 kARoaAO/Zej+l+zl4PXyZq8nHWC2nxMYrBgoq/c5pBtCAofCTISYCunj3Jj3VjSzEp5I fvjg== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=Oo5SDTLn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id y36-20020a056a00182400b0066883879b57si13250364pfa.51.2023.08.03.07.28.30; Thu, 03 Aug 2023 07:28:51 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@bytedance.com header.s=google header.b=Oo5SDTLn; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=bytedance.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236263AbjHCOJW (ORCPT + 99 others); Thu, 3 Aug 2023 10:09:22 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52710 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236737AbjHCOIG (ORCPT ); Thu, 3 Aug 2023 10:08:06 -0400 Received: from mail-pl1-x62a.google.com (mail-pl1-x62a.google.com [IPv6:2607:f8b0:4864:20::62a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ACCBF212B for ; Thu, 3 Aug 2023 07:07:08 -0700 (PDT) Received: by mail-pl1-x62a.google.com with SMTP id d9443c01a7336-1bbf0f36ce4so6971065ad.0 for ; Thu, 03 Aug 2023 07:07:08 -0700 (PDT) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=bytedance.com; s=google; t=1691071628; x=1691676428;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:cc:to:from:from:to:cc:subject:date
 :message-id:reply-to;
 bh=/2IExsisHLl5wetq8sgZJvv6YFTcLdIhC4TMC2ESGVY=;
 b=Oo5SDTLnzyXY8qDJYDdgxiSm5KInG31WNp+6s7Aw/7vgUIpls7XV7PDYHeze+UBg5n
 DNl2lunWdtdOaDHXuIxE5JrJyhPZkrGSa5Td1Oqf49qpwWw8PUrIhrB7fyBHyv9odtE/
 2qhsRMSNlJy/L5ER8K/Vh3MdsTTXMR3Gnygwt5BAmPJvMOF3dflvGX+gQaOJXTqq16QM
 P5ltuB+AVoHAG8A95Q946y5BIfsKc/I92jLgMKQ0AYzhnBfHhIGGNmOqSujdFbxrxmja
 ioZjmD0Md1lh9M3POXnCgEhPxjLWOm1sgM3OhCN/Kk0TToqHwyEuF4xyR9Y7y+KH8Otp
 jFCw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20221208; t=1691071628; x=1691676428;
 h=content-transfer-encoding:mime-version:references:in-reply-to
 :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc
 :subject:date:message-id:reply-to;
 bh=/2IExsisHLl5wetq8sgZJvv6YFTcLdIhC4TMC2ESGVY=;
 b=IkXa5jgT5nSfz1BMj6BQo5gzuTP/LyA3YLVsF52qsd/2lbe5vadHAUOu5myo4sG86e
 dZ/rL7aTBbW/2hpELYrEniwdhWJBQqe5VeuegcggPJdBLeRv4ITXA4ExgcPQ6jLlIH9/
 ZwKotULfqe4Z6i7WexOQlLsv9HXqWBHROLsXuznfxgGhwm+Q6p9dvMrBY4JlGevouuPI
 AZ/ZolmY3aSEh2BZ57KWxZLzUsbjgkdtbpV/4rGT9oSj8iRBDyF7qLTMKPYieZEYXi2q
 tawt0YE0OrUirE4idNLrhqcJoz7BTcf+OCY7feJJfeCHySm9jedVtM50Ilrd2JwenbQC
 Ttmg==
X-Gm-Message-State: ABy/qLahbLS9cCTPluT9scEOpp7G7vy4wVUnQD0Gg9ZKjZ79ILoSQUdi
 RH99rs3IFrSYp292LnyvhnnL+w==
X-Received: by 2002:a17:902:ea08:b0:1bb:893e:5df5 with SMTP id
 s8-20020a170902ea0800b001bb893e5df5mr22673401plg.34.1691071627752;
 Thu, 03 Aug 2023 07:07:07 -0700 (PDT)
Received: from C02FG34NMD6R.bytedance.net ([2001:c10:ff04:0:1000::8])
 by smtp.gmail.com with ESMTPSA id
 ji11-20020a170903324b00b001b8a897cd26sm14367485plb.195.2023.08.03.07.07.00
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Thu, 03 Aug 2023 07:07:07 -0700 (PDT)
From: "huangjie.albert"
To: davem@davemloft.net, edumazet@google.com, kuba@kernel.org,
 pabeni@redhat.com
Cc: "huangjie.albert" , Alexei Starovoitov , Daniel Borkmann ,
 Jesper Dangaard Brouer , John Fastabend ,
 =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Magnus Karlsson , Maciej Fijalkowski ,
 Jonathan Lemon , Pavel Begunkov , Yunsheng Lin , Kees Cook ,
 Richard Gobert ,
 netdev@vger.kernel.org (open list:NETWORKING DRIVERS),
 linux-kernel@vger.kernel.org (open list),
 bpf@vger.kernel.org (open list:XDP (eXpress Data Path))
Subject: [RFC Optimizing veth xsk performance 10/10] veth: af_xdp tx batch support for ipv4 udp
Date: Thu, 3 Aug 2023 22:04:36 +0800
Message-Id: <20230803140441.53596-11-huangjie.albert@bytedance.com>
X-Mailer: git-send-email 2.37.1 (Apple Git-137.1)
In-Reply-To: <20230803140441.53596-1-huangjie.albert@bytedance.com>
References: <20230803140441.53596-1-huangjie.albert@bytedance.com>
MIME-Version: 1.0
X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,RCVD_IN_DNSWL_BLOCKED,
 SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE autolearn=ham
 autolearn_force=no version=3.4.6
X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on
 lindbergh.monkeyblade.net
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
X-getmail-retrieved-from-mailbox: INBOX
X-GMAIL-THRID: 1773218490858234238
X-GMAIL-MSGID: 1773218490858234238

A typical topology is shown below:

veth<--------veth-peer
        1        |
                 |2
                 |
              bridge<------->eth0(such as mlnx5 NIC)

When af_xdp sends packets from veth to a physical NIC, they have to
traverse several software paths, so we can borrow from the kernel GSO
implementation: when af_xdp transmits from veth, aggregate the packets
and send one large packet from the veth virtual NIC to the physical NIC.

Performance (tested with the libxdp library):
AF_XDP without batch : 480 Kpps (with ksoftirqd 100% cpu)
AF_XDP with batch    : 1.5 Mpps (with ksoftirqd 15% cpu)

With af_xdp batch, the libxdp user-space program becomes the bottleneck,
so the softirq does not reach its limit.

Signed-off-by: huangjie.albert --- drivers/net/veth.c | 264 ++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 249 insertions(+), 15 deletions(-) diff --git a/drivers/net/veth.c b/drivers/net/veth.c index e4f1a8345f42..b0dbd21089c8 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -29,6 +29,7 @@ #include #include #include +#include #define DRV_NAME "veth" #define DRV_VERSION "1.0" @@ -103,6 +104,18 @@ struct veth_xdp_tx_bq { unsigned int count; }; +struct veth_gso_tuple { + __u8 protocol; + __be32 saddr; + __be32 daddr; + __be16 source; + __be16 dest; + __be16 gso_size; + __be16 gso_segs; + bool gso_enable; + bool gso_flush; +}; + struct veth_seg_info { u32 segs; u64 desc[] ____cacheline_aligned_in_smp; @@ -650,6 +663,84 @@ static int veth_xdp_tx(struct veth_rq *rq, struct xdp_buff *xdp, return 0; } +static struct sk_buff *veth_build_gso_head_skb(struct net_device *dev, char *buff, u32 tot_len, u32 headroom, u32 iph_len, u32 th_len) +{ + struct sk_buff *skb = NULL; + int err = 0; + + skb = alloc_skb(tot_len, GFP_KERNEL); + if (unlikely(!skb)) + return NULL; + + /* header room contains the eth header */ + skb_reserve(skb, headroom - ETH_HLEN); + + skb_put(skb, ETH_HLEN + iph_len + th_len); + + skb_shinfo(skb)->gso_segs = 0; + + err = skb_store_bits(skb, 0, buff, ETH_HLEN + iph_len + th_len); + if (unlikely(err)) { + kfree_skb(skb); + return NULL; + } + + skb->protocol = eth_type_trans(skb, dev); + skb->network_header = skb->mac_header + ETH_HLEN; + skb->transport_header = skb->network_header + iph_len; + skb->ip_summed = CHECKSUM_PARTIAL; + + return skb; +} + +static inline bool gso_segment_match(struct veth_gso_tuple *gso_tuple, struct iphdr *iph, struct udphdr *udph) +{ + if (gso_tuple->protocol == iph->protocol && + gso_tuple->saddr == iph->saddr && + gso_tuple->daddr == iph->daddr && + gso_tuple->source == udph->source && + gso_tuple->dest == udph->dest && + gso_tuple->gso_size == ntohs(udph->len)) + { + gso_tuple->gso_flush = false; + 
return true;
+	} else {
+		gso_tuple->gso_flush = true;
+		return false;
+	}
+}
+
+static inline void gso_tuple_init(struct veth_gso_tuple *gso_tuple, struct iphdr *iph, struct udphdr *udph)
+{
+	gso_tuple->protocol = iph->protocol;
+	gso_tuple->saddr = iph->saddr;
+	gso_tuple->daddr = iph->daddr;
+	gso_tuple->source = udph->source;
+	gso_tuple->dest = udph->dest;
+	gso_tuple->gso_flush = false;
+	gso_tuple->gso_size = ntohs(udph->len);
+	gso_tuple->gso_segs = 0;
+}
+
+/* only ipv4 udp support gso now */
+static inline bool ip_hdr_gso_check(unsigned char *buff, u32 len)
+{
+	struct iphdr *iph;
+
+	if (len <= (ETH_HLEN + sizeof(*iph)))
+		return false;
+
+	iph = (struct iphdr *)(buff + ETH_HLEN);
+
+	/*
+	 * check for ip headers, if the data support gso
+	 */
+	if (iph->ihl < 5 || iph->version != 4 || len < (iph->ihl * 4 + ETH_HLEN) || iph->protocol != IPPROTO_UDP)
+		return false;
+
+	return true;
+}
+
 static struct sk_buff *veth_build_skb(void *head, int headroom, int len,
 				      int buflen)
 {
@@ -686,8 +777,8 @@ static void veth_xsk_destruct_skb(struct sk_buff *skb)
 	skb_shinfo(skb)->destructor_arg_xsk_pool = NULL;
 }
 
-static struct sk_buff *veth_build_skb_zerocopy(struct net_device *dev, struct xsk_buff_pool *pool,
-					      struct xdp_desc *desc)
+static struct sk_buff *veth_build_skb_zerocopy_normal(struct net_device *dev,
+		struct xsk_buff_pool *pool, struct xdp_desc *desc)
 {
 	struct veth_seg_info *seg_info;
 	struct sk_buff *skb;
@@ -698,45 +789,133 @@ static struct sk_buff *veth_build_skb_zerocopy(struct net_device *dev, struct xs
 	int headroom;
 	u64 addr;
 	u32 index;
-
 	addr = desc->addr;
 	len = desc->len;
 	buffer = xsk_buff_raw_get_data(pool, addr);
 	ts = pool->unaligned ? len : pool->chunk_size;
-
 	headroom = offset_in_page(buffer);
-
 	/* offset in umem pool buffer */
 	addr = buffer - pool->addrs;
-
 	/* get the page of the desc */
 	page = pool->umem->pgs[addr >> PAGE_SHIFT];
-
 	/* in order to avoid to get freed by kfree_skb */
 	get_page(page);
-
 	hard_start = page_to_virt(page);
-
 	skb = veth_build_skb(hard_start, headroom, len, ts);
 	seg_info = (struct veth_seg_info *)kmalloc(struct_size(seg_info, desc, MAX_SKB_FRAGS), GFP_KERNEL);
 	if (!seg_info) {
 		printk("here must to deal with\n");
 	}
-
 	/* later we will support gso for this */
 	index = skb_shinfo(skb)->gso_segs;
 	seg_info->desc[index] = desc->addr;
 	seg_info->segs = ++index;
-
 	skb->truesize += ts;
 	skb->dev = dev;
 	skb_shinfo(skb)->destructor_arg = (void *)(long)seg_info;
 	skb_shinfo(skb)->destructor_arg_xsk_pool = (void *)(long)pool;
 	skb->destructor = veth_xsk_destruct_skb;
-
 	/* set the mac header */
 	skb->protocol = eth_type_trans(skb, dev);
 
+	/* to do, add skb to sock. may be there is no need to do for this
+	 * refcount_add(ts, &xs->sk.sk_wmem_alloc);
+	 */
+	return skb;
+}
+
+static struct sk_buff *veth_build_skb_zerocopy_gso(struct net_device *dev, struct xsk_buff_pool *pool,
+		struct xdp_desc *desc, struct veth_gso_tuple *gso_tuple, struct sk_buff *prev_skb)
+{
+	u32 hr, len, ts, index, iph_len, th_len, data_offset, data_len, tot_len;
+	struct veth_seg_info *seg_info;
+	void *buffer;
+	struct udphdr *udph;
+	struct iphdr *iph;
+	struct sk_buff *skb;
+	struct page *page;
+	int hh_len = 0;
+	u64 addr;
+
+	addr = desc->addr;
+	len = desc->len;
+
+	/* l2 reserved len */
+	hh_len = LL_RESERVED_SPACE(dev);
+	hr = max(NET_SKB_PAD, L1_CACHE_ALIGN(hh_len));
+
+	/* data points to eth header */
+	buffer = (unsigned char *)xsk_buff_raw_get_data(pool, addr);
+
+	iph = (struct iphdr *)(buffer + ETH_HLEN);
+	iph_len = iph->ihl * 4;
+
+	udph = (struct udphdr *)(buffer + ETH_HLEN + iph_len);
+	th_len = sizeof(struct udphdr);
+
+	if (gso_tuple->gso_flush)
+		gso_tuple_init(gso_tuple, iph, udph);
+
+	ts = pool->unaligned ? len : pool->chunk_size;
+
+	data_offset = offset_in_page(buffer) + ETH_HLEN + iph_len + th_len;
+	data_len = len - (ETH_HLEN + iph_len + th_len);
+
+	/* head is null or this is a new 5 tuple */
+	if (NULL == prev_skb || !gso_segment_match(gso_tuple, iph, udph)) {
+		tot_len = hr + iph_len + th_len;
+		skb = veth_build_gso_head_skb(dev, buffer, tot_len, hr, iph_len, th_len);
+		if (!skb) {
+			/* to do: handle here for skb */
+			return NULL;
+		}
+
+		/* store information for gso */
+		seg_info = (struct veth_seg_info *)kmalloc(struct_size(seg_info, desc, MAX_SKB_FRAGS), GFP_KERNEL);
+		if (!seg_info) {
+			/* to do */
+			kfree_skb(skb);
+			return NULL;
+		}
+	} else {
+		skb = prev_skb;
+		skb_shinfo(skb)->gso_type = SKB_GSO_UDP_L4 | SKB_GSO_PARTIAL;
+		skb_shinfo(skb)->gso_size = data_len;
+		skb->ip_summed = CHECKSUM_PARTIAL;
+
+		/* max segment is MAX_SKB_FRAGS */
+		if(skb_shinfo(skb)->gso_segs >= MAX_SKB_FRAGS - 1) {
+			gso_tuple->gso_flush = true;
+		}
+		seg_info = (struct veth_seg_info *)skb_shinfo(skb)->destructor_arg;
+	}
+
+	/* offset in umem pool buffer */
+	addr = buffer - pool->addrs;
+
+	/* get the page of the desc */
+	page = pool->umem->pgs[addr >> PAGE_SHIFT];
+
+	/* in order to avoid to get freed by kfree_skb */
+	get_page(page);
+
+	/* desc.data can not hold in two */
+	skb_fill_page_desc(skb, skb_shinfo(skb)->gso_segs, page, data_offset, data_len);
+
+	skb->len += data_len;
+	skb->data_len += data_len;
+	skb->truesize += ts;
+	skb->dev = dev;
+
+	/* later we will support gso for this */
+	index = skb_shinfo(skb)->gso_segs;
+	seg_info->desc[index] = desc->addr;
+	seg_info->segs = ++index;
+	skb_shinfo(skb)->gso_segs++;
+
+	skb_shinfo(skb)->destructor_arg = (void *)(long)seg_info;
+	skb_shinfo(skb)->destructor_arg_xsk_pool = (void *)(long)pool;
+	skb->destructor = veth_xsk_destruct_skb;
 
 	/* to do, add skb to sock. may be there is no need to do for this
 	 * refcount_add(ts, &xs->sk.sk_wmem_alloc);
@@ -744,6 +923,22 @@ static struct sk_buff *veth_build_skb_zerocopy(struct net_device *dev, struct xs
 	return skb;
 }
 
+static inline struct sk_buff *veth_build_skb_zerocopy(struct net_device *dev, struct xsk_buff_pool *pool,
+		struct xdp_desc *desc, struct veth_gso_tuple *gso_tuple, struct sk_buff *prev_skb)
+{
+	void *buffer;
+
+	buffer = xsk_buff_raw_get_data(pool, desc->addr);
+	if (ip_hdr_gso_check(buffer, desc->len)) {
+		gso_tuple->gso_enable = true;
+		return veth_build_skb_zerocopy_gso(dev, pool, desc, gso_tuple, prev_skb);
+	} else {
+		gso_tuple->gso_flush = false;
+		gso_tuple->gso_enable = false;
+		return veth_build_skb_zerocopy_normal(dev, pool, desc);
+	}
+}
+
 static struct xdp_frame *veth_xdp_rcv_one(struct veth_rq *rq,
 					  struct xdp_frame *frame,
 					  struct veth_xdp_tx_bq *bq,
@@ -1176,16 +1371,33 @@ static inline bool buffer_in_page(void *buffer, u32 len)
 	}
 }
 
+static inline void veth_skb_gso_check_update(struct sk_buff *skb)
+{
+	struct iphdr *iph = ip_hdr(skb);
+	struct udphdr *uh = udp_hdr(skb);
+	int ip_tot_len = skb->len;
+	int udp_len = skb->len - (skb->transport_header - skb->network_header);
+	iph->tot_len = htons(ip_tot_len);
+	ip_send_check(iph);
+	uh->len = htons(udp_len);
+	uh->check = 0;
+
+	/* udp4 checksum update */
+	udp4_hwcsum(skb, iph->saddr, iph->daddr);
+}
+
 static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool, int budget)
 {
 	struct veth_priv *priv, *peer_priv;
 	struct net_device *dev, *peer_dev;
+	struct veth_gso_tuple gso_tuple;
 	struct veth_rq *peer_rq;
 	struct veth_stats peer_stats = {};
 	struct veth_stats stats = {};
 	struct veth_xdp_tx_bq bq;
 	struct xdp_desc desc;
 	void *xdpf;
+	struct sk_buff *prev_skb = NULL;
 	struct sk_buff *skb = NULL;
 	bool zc = xsk_pool->umem->zc;
 	u32 xsk_headroom = xsk_pool->headroom;
@@ -1200,6 +1412,8 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool,
 	/* todo: queue index must set before this */
 	peer_rq = &peer_priv->rq[sq->queue_index];
 
+	memset(&gso_tuple, 0, sizeof(gso_tuple));
+
 	/* set xsk wake up flag, to do: where to disable */
 	if (xsk_uses_need_wakeup(xsk_pool))
 		xsk_set_tx_need_wakeup(xsk_pool);
@@ -1279,12 +1493,26 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool,
 			/* no need to copy address for af+xdp */
 			p_frame = veth_xdp_rcv_one(peer_rq, p_frame, &bq, &peer_stats);
 			if (p_frame) {
-				skb = veth_build_skb_zerocopy(peer_dev, xsk_pool, &desc);
-				if (skb) {
+				skb = veth_build_skb_zerocopy(peer_dev, xsk_pool, &desc, &gso_tuple, prev_skb);
+				if (!gso_tuple.gso_enable) {
 					napi_gro_receive(&peer_rq->xdp_napi, skb);
 					skb = NULL;
 				} else {
-					xsk_tx_completed_addr(xsk_pool, desc.addr);
+					if (prev_skb && gso_tuple.gso_flush) {
+						veth_skb_gso_check_update(prev_skb);
+						napi_gro_receive(&peer_rq->xdp_napi, prev_skb);
+
+						if (prev_skb == skb) {
+							skb = NULL;
+							prev_skb = NULL;
+						} else {
+							prev_skb = skb;
+						}
+					} else if (NULL == skb){
+						xsk_tx_completed_addr(xsk_pool, desc.addr);
+					} else {
+						prev_skb = skb;
+					}
 				}
 			}
 		} else {
@@ -1308,6 +1536,12 @@ static int veth_xsk_tx_xmit(struct veth_sq *sq, struct xsk_buff_pool *xsk_pool,
 		done++;
 	}
 
+	/* gso skb */
+	if (NULL!=skb) {
+		veth_skb_gso_check_update(skb);
+		napi_gro_receive(&peer_rq->xdp_napi, skb);
+	}
+
 	/* release, move consumer,and wakeup the producer */
 	if (done) {
 		napi_schedule(&peer_rq->xdp_napi);