From patchwork Mon Dec 4 08:50:41 2023
X-Patchwork-Submitter: Souradeep Chakrabarti
X-Patchwork-Id: 173139
From: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
To: kys@microsoft.com, haiyangz@microsoft.com, wei.liu@kernel.org,
    decui@microsoft.com, davem@davemloft.net, edumazet@google.com,
    kuba@kernel.org, pabeni@redhat.com, longli@microsoft.com,
    yury.norov@gmail.com, leon@kernel.org, cai.huoqing@linux.dev,
    ssengar@linux.microsoft.com, vkuznets@redhat.com, tglx@linutronix.de,
    linux-hyperv@vger.kernel.org, netdev@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-rdma@vger.kernel.org
Cc: schakrabarti@microsoft.com, paulros@microsoft.com,
    Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
Subject: [PATCH V4 net-next] net: mana: Assigning IRQ affinity on HT cores
Date: Mon, 4 Dec 2023 00:50:41 -0800
Message-Id: <1701679841-9359-1-git-send-email-schakrabarti@linux.microsoft.com>

The existing MANA design assigns an IRQ to every CPU, including sibling
hyper-threads.
This may cause multiple IRQs to be active simultaneously on the same
core, which can reduce network performance with RSS.

Improve performance by assigning IRQs to non-sibling CPUs in the local
NUMA node.

Signed-off-by: Souradeep Chakrabarti <schakrabarti@linux.microsoft.com>
---
V3 -> V4:
* Used the for_each_numa_hop_mask() macro and simplified the code.
  Thanks to Yury Norov for the suggestion.
* Added code to assign the HWC IRQ separately in mana_gd_setup_irqs().

V2 -> V3:
* Created a helper function to get the next NUMA node with CPUs.
* Added error checks for unsuccessful memory allocations.
* Fixed some comments in the code.

V1 -> V2:
* Simplified the code by removing filter_mask_list and using avail_cpus.
* Fixed an infinite-loop issue when there are NUMA nodes with no CPUs.
* Started from the local NUMA node instead of node 0.
* Removed uses of BUG_ON.
* Placed cpus_read_lock() in the parent function so that num_online_cpus()
  cannot change before the function finishes the affinity assignment.
---
 .../net/ethernet/microsoft/mana/gdma_main.c   | 70 +++++++++++++++++--
 1 file changed, 63 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ethernet/microsoft/mana/gdma_main.c b/drivers/net/ethernet/microsoft/mana/gdma_main.c
index 6367de0c2c2e..2194a53cce10 100644
--- a/drivers/net/ethernet/microsoft/mana/gdma_main.c
+++ b/drivers/net/ethernet/microsoft/mana/gdma_main.c
@@ -1243,15 +1243,57 @@ void mana_gd_free_res_map(struct gdma_resource *r)
 	r->size = 0;
 }
 
+static int irq_setup(int *irqs, int nvec, int start_numa_node)
+{
+	int i = 0, cpu, err = 0;
+	const struct cpumask *node_cpumask;
+	unsigned int next_node = start_numa_node;
+	cpumask_var_t visited_cpus, node_cpumask_temp;
+
+	if (!zalloc_cpumask_var(&visited_cpus, GFP_KERNEL)) {
+		err = -ENOMEM;
+		return err;
+	}
+	if (!zalloc_cpumask_var(&node_cpumask_temp, GFP_KERNEL)) {
+		err = -ENOMEM;
+		return err;
+	}
+	rcu_read_lock();
+	for_each_numa_hop_mask(node_cpumask, next_node) {
+		cpumask_copy(node_cpumask_temp, node_cpumask);
+		for_each_cpu(cpu, node_cpumask_temp) {
+			cpumask_andnot(node_cpumask_temp, node_cpumask_temp,
+				       topology_sibling_cpumask(cpu));
+			irq_set_affinity_and_hint(irqs[i], cpumask_of(cpu));
+			if (++i == nvec)
+				goto free_mask;
+			cpumask_set_cpu(cpu, visited_cpus);
+			if (cpumask_empty(node_cpumask_temp)) {
+				cpumask_copy(node_cpumask_temp, node_cpumask);
+				cpumask_andnot(node_cpumask_temp, node_cpumask_temp,
+					       visited_cpus);
+				cpu = 0;
+			}
+		}
+	}
+free_mask:
+	rcu_read_unlock();
+	free_cpumask_var(visited_cpus);
+	free_cpumask_var(node_cpumask_temp);
+	return err;
+}
+
 static int mana_gd_setup_irqs(struct pci_dev *pdev)
 {
-	unsigned int max_queues_per_port = num_online_cpus();
 	struct gdma_context *gc = pci_get_drvdata(pdev);
+	unsigned int max_queues_per_port;
 	struct gdma_irq_context *gic;
 	unsigned int max_irqs, cpu;
-	int nvec, irq;
+	int nvec, *irqs, irq;
 	int err, i = 0, j;
 
+	cpus_read_lock();
+	max_queues_per_port = num_online_cpus();
 	if (max_queues_per_port > MANA_MAX_NUM_QUEUES)
 		max_queues_per_port = MANA_MAX_NUM_QUEUES;
 
@@ -1261,6 +1303,11 @@ static int mana_gd_setup_irqs(struct pci_dev *pdev)
 	nvec = pci_alloc_irq_vectors(pdev, 2, max_irqs, PCI_IRQ_MSIX);
 	if (nvec < 0)
 		return nvec;
+	irqs = kmalloc_array(max_queues_per_port, sizeof(int), GFP_KERNEL);
+	if (!irqs) {
+		err = -ENOMEM;
+		goto free_irq_vector;
+	}
 
 	gc->irq_contexts = kcalloc(nvec, sizeof(struct gdma_irq_context),
 				   GFP_KERNEL);
@@ -1287,21 +1334,28 @@ static int mana_gd_setup_irqs(struct pci_dev *pdev)
 			goto free_irq;
 		}
 
-		err = request_irq(irq, mana_gd_intr, 0, gic->name, gic);
+		if (!i) {
+			err = request_irq(irq, mana_gd_intr, 0, gic->name, gic);
+			cpu = cpumask_local_spread(i, gc->numa_node);
+			irq_set_affinity_and_hint(irq, cpumask_of(cpu));
+		} else {
+			irqs[i - 1] = irq;
+			err = request_irq(irqs[i - 1], mana_gd_intr, 0, gic->name, gic);
+		}
 		if (err)
 			goto free_irq;
-
-		cpu = cpumask_local_spread(i, gc->numa_node);
-		irq_set_affinity_and_hint(irq, cpumask_of(cpu));
 	}
 
+	err = irq_setup(irqs, max_queues_per_port, gc->numa_node);
+	if (err)
+		goto free_irq;
 	err = mana_gd_alloc_res_map(nvec, &gc->msix_resource);
 	if (err)
 		goto free_irq;
 
 	gc->max_num_msix = nvec;
 	gc->num_msix_usable = nvec;
-
+	cpus_read_unlock();
 	return 0;
 
 free_irq:
@@ -1314,8 +1368,10 @@ static int mana_gd_setup_irqs(struct pci_dev *pdev)
 	}
 
 	kfree(gc->irq_contexts);
+	kfree(irqs);
 	gc->irq_contexts = NULL;
 free_irq_vector:
+	cpus_read_unlock();
 	pci_free_irq_vectors(pdev);
 	return err;
 }
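
For readers who want to see the spreading policy the new irq_setup() helper
aims for, below is a minimal, hypothetical user-space sketch (not part of the
patch). It simulates the intended rule on a made-up topology of 2 NUMA nodes,
2 cores per node, and 2 hyper-threads per core: the cpu_node[] and
cpu_sibling[] tables stand in for the kernel's NUMA hop masks and
topology_sibling_cpumask(), and every name and number in it is an assumption
for illustration only. The in-kernel helper iterates cumulative hop masks via
for_each_numa_hop_mask(), so its exact visiting order can differ; the sketch
only models the intent that each physical core gets one IRQ before any
hyper-thread sibling is used, starting from the local node.

	/* Hypothetical simulation of the intended IRQ spreading policy. */
	#include <stdio.h>

	#define NR_CPUS   8   /* assumed: 2 nodes x 2 cores x 2 hyper-threads */
	#define NR_NODES  2
	#define NR_IRQS   6

	/* assumed topology tables: cpu -> node, cpu -> HT sibling */
	static const int cpu_node[NR_CPUS]    = { 0, 0, 0, 0, 1, 1, 1, 1 };
	static const int cpu_sibling[NR_CPUS] = { 1, 0, 3, 2, 5, 4, 7, 6 };

	int main(void)
	{
		int assigned[NR_CPUS] = { 0 };  /* 1 if this CPU already has an IRQ */
		int local_node = 0;             /* stand-in for gc->numa_node */
		int irq = 0;

		for (int hop = 0; hop < NR_NODES && irq < NR_IRQS; hop++) {
			int node = (local_node + hop) % NR_NODES;
			int cpu;

			/* First pass: at most one IRQ per physical core. */
			for (cpu = 0; cpu < NR_CPUS && irq < NR_IRQS; cpu++) {
				if (cpu_node[cpu] != node || assigned[cpu] ||
				    assigned[cpu_sibling[cpu]])
					continue;
				printf("IRQ %d -> CPU %d (node %d, core)\n",
				       irq, cpu, node);
				assigned[cpu] = 1;
				irq++;
			}
			/* Second pass: remaining HT siblings of this node. */
			for (cpu = 0; cpu < NR_CPUS && irq < NR_IRQS; cpu++) {
				if (cpu_node[cpu] != node || assigned[cpu])
					continue;
				printf("IRQ %d -> CPU %d (node %d, sibling)\n",
				       irq, cpu, node);
				assigned[cpu] = 1;
				irq++;
			}
		}
		return 0;
	}

With six IRQs and node 0 as the local node, this prints assignments to CPUs 0
and 2 (node-0 cores), then 1 and 3 (their siblings), and finally 4 and 6
(node-1 cores), which is the ordering the helper above is intended to produce.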