From patchwork Fri Dec 8 00:52:41 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mina Almasry
X-Patchwork-Id: 175499
Date: Thu, 7 Dec 2023 16:52:41 -0800
In-Reply-To: <20231208005250.2910004-1-almasrymina@google.com>
References: <20231208005250.2910004-1-almasrymina@google.com>
X-Mailer: git-send-email 2.43.0.472.g3155946c3a-goog
Message-ID: <20231208005250.2910004-11-almasrymina@google.com>
Subject: [net-next v1 10/16] page_pool: don't release iov on elevanted refcount
From: Mina Almasry
To: Shailend Chand, netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org,
 linux-arch@vger.kernel.org, linux-kselftest@vger.kernel.org,
 bpf@vger.kernel.org, linux-media@vger.kernel.org,
 dri-devel@lists.freedesktop.org
Cc: Mina Almasry, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, Jonathan Corbet, Jeroen de Borst, Praveen Kaligineedi,
 Jesper Dangaard Brouer, Ilias Apalodimas, Arnd Bergmann, David Ahern,
 Willem de Bruijn, Shuah Khan, Sumit Semwal, Christian König,
 Yunsheng Lin, Harshitha Ramamurthy, Shakeel Butt
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Currently a page is considered for page_pool recycling only once: the
first time __page_pool_put_page() is called on it. This works because,
in practice, the net stack holds only 1 reference to each skb frag.
When __page_pool_put_page() is called, the page therefore has a single
reference from the net stack (provided the driver is not holding extra
references for its own recycling), and so the page is recycled.

However, this is not compatible with devmem TCP. For devmem TCP, the
net stack holds 2 references on each frag: 1 reference is held by the
skb, and the second is held on behalf of the user until they call
SO_DEVMEM_DONTNEED.

This causes a bug in page_pool recycling: when the skb is freed, the
reference count goes from 2 to 1, the page_pool sees a pending
reference and releases the page, and so no devmem iovs are ever
recycled.

To fix this, don't release iovs on elevated refcount.

Signed-off-by: Mina Almasry
---
 net/core/page_pool.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/net/core/page_pool.c b/net/core/page_pool.c
index f0148d66371b..dc2a148f5b06 100644
--- a/net/core/page_pool.c
+++ b/net/core/page_pool.c
@@ -731,6 +731,29 @@ __page_pool_put_page(struct page_pool *pool, struct page *page,
 		/* Page found as candidate for recycling */
 		return page;
 	}
+
+	if (page_is_page_pool_iov(page)) {
+		/* With devmem TCP and ppiovs, we can't release pages if the
+		 * refcount is > 1. This is because the net stack holds
+		 * 2 references:
+		 * - 1 for the skb, and
+		 * - 1 for the user until they call SO_DEVMEM_DONTNEED.
+		 * Releasing pages for elevated refcounts completely disables
+		 * page_pool recycling. Instead, simply don't release pages and
+		 * the next call to napi_pp_put_page() via SO_DEVMEM_DONTNEED
+		 * will consider the page again for recycling. As a result,
+		 * devmem TCP is incompatible with drivers doing refcnt based
+		 * recycling unless those drivers:
+		 *
+		 * - don't call skb_mark_for_recycle()
+		 * - are sure to release the last reference with
+		 *   page_pool_put_full_page() to consider the page for
+		 *   page_pool recycling.
+		 */
+		page_pool_page_put_many(page, 1);
+		return NULL;
+	}
+
 	/* Fallback/non-XDP mode: API user have elevated refcnt.
 	 *
 	 * Many drivers split up the page into fragments, and some