Message ID | 1677785686-2152-1-git-send-email-ssengar@linux.microsoft.com |
---|---|
State | New |
Headers |
Return-Path: <linux-kernel-owner@vger.kernel.org> Delivered-To: ouuuleilei@gmail.com Received: by 2002:a5d:5915:0:0:0:0:0 with SMTP id v21csp35235wrd; Thu, 2 Mar 2023 11:40:23 -0800 (PST) X-Google-Smtp-Source: AK7set8mK4UT1uz6T1AXu62M5l17/cUdNmqv5x4JmVjqYFsSxU5FOuLkXQWiKj6f5j+g3SAGum6p X-Received: by 2002:aa7:d885:0:b0:4c5:7a47:9234 with SMTP id u5-20020aa7d885000000b004c57a479234mr600770edq.8.1677786023284; Thu, 02 Mar 2023 11:40:23 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1677786023; cv=none; d=google.com; s=arc-20160816; b=BqumMFPRlKg3YvEndeESRr228ZY65JhVTGfzTM6ZSxvcA6xCnUaJoN/6m9H4jipTRC f5aqr1N42TEDa1igy6SftgqfW2NJPAYtfaYzgAOBdDr7l8hJnZq/eaLmg+naoSebGDhq xzAZ889ZOCEib+t8V7gjuAosq0oL2H41gN0MNilvnQXzygT4y1yuOqiqc4XCzS6zZZ6K jNxotfG3/EM0HHu7PE2asDB8pyu9f1o+546imy6SPljp6gD3jPkCjrqirfdYiA0XMswn S0jQOgugRx6cY69rbIZ7Du+nEmWQs2mCp/SND5YsTgXvpP53C7n9fJrd2UiejahdhfsP hLVg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:message-id:date:subject:to:from:dkim-signature :dkim-filter; bh=6Tvtuiro6Rs44YifCyzt6GKgNmrargfr4Wcp7VYJ/Rw=; b=FlFjJHSdwjpLO54yUJP6Yg4Pn1XU+Cm0IXBHzgRl25MEn3tEPlYifmkPWERNAwBVZ8 SaBp3tPXNnwl5qiq5zACe2685jMolP89JP9j/I2Ci96yLDewPia5NyBVbURVpA1U3XpA PN/SpunsBfnlY1X5rlvmbpnD1v1QW1PND/Tl1wEYqwLthw0vwwSXiHdlsbmtvXdTiU2G wCth1KvaEtNx1jYo1WGHAFdsS/HFeRi3ECz8Z+rb/NOtB3cCv0AHhO5JF5EqhgvWvDXV OhJcrTNzidYHmfP6+wb3daZx0PNp7WX/ih+weo2ErWHlwbkEng5X88isTkbpVyVQn8n4 2ihA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.microsoft.com header.s=default header.b=dm4J+fjR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.microsoft.com Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id p15-20020aa7d30f000000b004bd5e278481si443449edq.628.2023.03.02.11.39.58; Thu, 02 Mar 2023 11:40:23 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.microsoft.com header.s=default header.b=dm4J+fjR; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.microsoft.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229850AbjCBTez (ORCPT <rfc822;davidbtadokoro@gmail.com> + 99 others); Thu, 2 Mar 2023 14:34:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229591AbjCBTev (ORCPT <rfc822;linux-kernel@vger.kernel.org>); Thu, 2 Mar 2023 14:34:51 -0500 Received: from linux.microsoft.com (linux.microsoft.com [13.77.154.182]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 15F4D1B542 for <linux-kernel@vger.kernel.org>; Thu, 2 Mar 2023 11:34:51 -0800 (PST) Received: from linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net (linux.microsoft.com [13.77.154.182]) by linux.microsoft.com (Postfix) with ESMTPSA id 6EF0920B9C3D; Thu, 2 Mar 2023 11:34:50 -0800 (PST) DKIM-Filter: OpenDKIM Filter v2.11.0 linux.microsoft.com 6EF0920B9C3D DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.microsoft.com; s=default; t=1677785690; bh=6Tvtuiro6Rs44YifCyzt6GKgNmrargfr4Wcp7VYJ/Rw=; h=From:To:Subject:Date:From; b=dm4J+fjRRLotjP7dvmezTypnoJLOxIvEVLDJq9AoVZ+cNBO4FhmFp5ZYrBi6hLEys zDH8HgC0bm+0LgUJM6YwI+LMSbVvz5BQKwDQkS9RlImGnLjzmew1WcLDmTR/Pld7Dx RV65/T76wxnm/wVu3o4K+QTTPmrAJlVq8fzIQ3s0= From: Saurabh Sengar <ssengar@linux.microsoft.com> To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@linux.intel.com, x86@kernel.org, hpa@zytor.com, johan+linaro@kernel.org, isaku.yamahata@intel.com, mikelley@microsoft.com, linux-kernel@vger.kernel.org Subject: [PATCH] x86/ioapic: Don't return 0 as valid virq Date: Thu, 2 Mar 2023 11:34:46 -0800 Message-Id: <1677785686-2152-1-git-send-email-ssengar@linux.microsoft.com> X-Mailer: git-send-email 1.8.3.1 X-Spam-Status: No, score=-19.8 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_MED, SPF_HELO_PASS,SPF_PASS,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: <linux-kernel.vger.kernel.org> X-Mailing-List: linux-kernel@vger.kernel.org X-getmail-retrieved-from-mailbox: =?utf-8?q?INBOX?= X-GMAIL-THRID: =?utf-8?q?1759286156825745519?= X-GMAIL-MSGID: =?utf-8?q?1759286156825745519?= |
Series |
x86/ioapic: Don't return 0 as valid virq
|
|
Commit Message
Saurabh Singh Sengar
March 2, 2023, 7:34 p.m. UTC
Zero is invalid virq and should't be returned as a valid value for
lower irq bound. If IO-APIC and gsi_top are not initialized return
the 'from' value as virq.
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
---
arch/x86/kernel/apic/io_apic.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
Comments
On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote: > Zero is invalid virq and should't be returned as a valid value for > lower irq bound. If IO-APIC and gsi_top are not initialized return > the 'from' value as virq. > > Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com> > --- > arch/x86/kernel/apic/io_apic.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c > index a868b76cd3d4..000cc6159b8b 100644 > --- a/arch/x86/kernel/apic/io_apic.c > +++ b/arch/x86/kernel/apic/io_apic.c > @@ -2483,9 +2483,11 @@ unsigned int arch_dynirq_lower_bound(unsigned int from) > /* > * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use > * gsi_top if ioapic_dynirq_base hasn't been initialized yet. > + * > + * Incase gsi_top is also not initialized return @from. > */ > if (!ioapic_initialized) > - return gsi_top; > + return gsi_top ? : from; > /* > * For DT enabled machines ioapic_dynirq_base is irrelevant and not > * updated. So simply return @from if ioapic_dynirq_base == 0. > -- > 2.34.1 Hi Maintainers, May I get your feedback on this patch. Regards, Saurabh
On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote: > Zero is invalid virq and should't be returned as a valid value for > lower irq bound. If IO-APIC and gsi_top are not initialized return Why isn't gsi_top initialized? What is this fixing? Don't be afraid to do git annotate arch/x86/kernel/apic/io_apic.c and see which commit added this. This one: 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled machines") Now add the folks from this commit to Cc and tell them why in your case gsi_top is not initialized and what they're breaking by doing that. The more your commit message explains *why* you're fixing something, the better it is for the maintainers/reviewers to actually know what to do. Right now I'm reading this and I'm thinking, random, unjustified change. Ignore. Ok? Thx.
Cc: rahul.tanwar@linux.intel.com, andriy.shevchenko@intel.com Thanks for you comments, please see my responses below. > -----Original Message----- > From: Borislav Petkov <bp@alien8.de> > Sent: Monday, March 13, 2023 2:10 AM > To: Saurabh Sengar <ssengar@linux.microsoft.com> > Cc: tglx@linutronix.de; mingo@redhat.com; dave.hansen@linux.intel.com; > x86@kernel.org; hpa@zytor.com; johan+linaro@kernel.org; > isaku.yamahata@intel.com; Michael Kelley (LINUX) > <mikelley@microsoft.com>; linux-kernel@vger.kernel.org > Subject: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq > > On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote: > > Zero is invalid virq and should't be returned as a valid value for > > lower irq bound. If IO-APIC and gsi_top are not initialized return > > Why isn't gsi_top initialized? > > What is this fixing? In the absence of a device tree node for IO-APIC, IO-APIC is not registered, resulting in uninitialized gsi_top. And in such cases arch_dynirq_lower_bound will return 0. Returning 0 from this function will allow interrupts to have 0 assigned as valid irq, which is wrong. In case gsi_top is 0, lower bound of irq should be derived from 'hint' value passed to function as 'from'. I can add above info in commit message, please let me know if anything more to be added. To be specific in our system which is a guest VM we don't need IO-APIC and hence there is no device tree node for it. It is observed that we get irq 0 assigned to PCI-MSI. > > Don't be afraid to do > > git annotate arch/x86/kernel/apic/io_apic.c > > and see which commit added this. This one: > > 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT enabled > machines") > > Now add the folks from this commit to Cc and tell them why in your case > gsi_top is not initialized and what they're breaking by doing that. Thanks. I will add "Fixes:" and "Cc:" tag in next version. > > The more your commit message explains *why* you're fixing something, the > better it is for the maintainers/reviewers to actually know what to do. > > Right now I'm reading this and I'm thinking, random, unjustified change. > Ignore. > > Ok? > > Thx. > > -- > Regards/Gruss, > Boris. > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl > e.kernel.org%2Ftglx%2Fnotes-about- > netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C6e1e0e21051c4 > 9c1cfe008db233a0376%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0% > 7C638142504360574969%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA > wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C > %7C&sdata=OLdgb1AuLbLvlzucgNFBQEEK6G%2FsFV%2BO2TqT%2FNCujJU%3 > D&reserved=0
I just see mail to rahul.tanwar@linux.intel.com is undelivered, shall I still add it in "Cc:" ? Please let me know what we usually do in such cases. Regards, Saurabh > -----Original Message----- > From: Saurabh Singh Sengar <ssengar@microsoft.com> > Sent: Monday, March 13, 2023 9:00 AM > To: Borislav Petkov <bp@alien8.de>; Saurabh Sengar > <ssengar@linux.microsoft.com> > Cc: tglx@linutronix.de; mingo@redhat.com; dave.hansen@linux.intel.com; > x86@kernel.org; hpa@zytor.com; johan+linaro@kernel.org; > isaku.yamahata@intel.com; Michael Kelley (LINUX) > <mikelley@microsoft.com>; linux-kernel@vger.kernel.org; > rahul.tanwar@linux.intel.com; andriy.shevchenko@intel.com > Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq > > Cc: rahul.tanwar@linux.intel.com, andriy.shevchenko@intel.com > > Thanks for you comments, please see my responses below. > > > -----Original Message----- > > From: Borislav Petkov <bp@alien8.de> > > Sent: Monday, March 13, 2023 2:10 AM > > To: Saurabh Sengar <ssengar@linux.microsoft.com> > > Cc: tglx@linutronix.de; mingo@redhat.com; dave.hansen@linux.intel.com; > > x86@kernel.org; hpa@zytor.com; johan+linaro@kernel.org; > > isaku.yamahata@intel.com; Michael Kelley (LINUX) > > <mikelley@microsoft.com>; linux-kernel@vger.kernel.org > > Subject: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid > > virq > > > > On Thu, Mar 02, 2023 at 11:34:46AM -0800, Saurabh Sengar wrote: > > > Zero is invalid virq and should't be returned as a valid value for > > > lower irq bound. If IO-APIC and gsi_top are not initialized return > > > > Why isn't gsi_top initialized? > > > > What is this fixing? > > In the absence of a device tree node for IO-APIC, IO-APIC is not registered, > resulting in uninitialized gsi_top. And in such cases arch_dynirq_lower_bound > will return 0. Returning 0 from this function will allow interrupts to have 0 > assigned as valid irq, which is wrong. In case gsi_top is 0, lower bound of irq > should be derived from 'hint' value passed to function as 'from'. > > I can add above info in commit message, please let me know if anything > more to be added. > > To be specific in our system which is a guest VM we don't need IO-APIC and > hence there is no device tree node for it. It is observed that we get irq 0 > assigned to PCI-MSI. > > > > > Don't be afraid to do > > > > git annotate arch/x86/kernel/apic/io_apic.c > > > > and see which commit added this. This one: > > > > 3e5bedc2c258 ("x86/apic: Fix arch_dynirq_lower_bound() bug for DT > > enabled > > machines") > > > > Now add the folks from this commit to Cc and tell them why in your > > case gsi_top is not initialized and what they're breaking by doing that. > > Thanks. I will add "Fixes:" and "Cc:" tag in next version. > > > > > The more your commit message explains *why* you're fixing something, > > the better it is for the maintainers/reviewers to actually know what to do. > > > > Right now I'm reading this and I'm thinking, random, unjustified change. > > Ignore. > > > > Ok? > > > > Thx. > > > > -- > > Regards/Gruss, > > Boris. > > > > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeop > > > l%2F&data=05%7C01%7Cssengar%40microsoft.com%7C0595a41023f849c5ee > b308db > > > 23732cd0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C638142749 > 8888650 > > > 08%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzI > iLCJBTiI > > > 6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C&sdata=qpOM5MMYpUof > VOaNsp8HxTmv > > %2B80iVn5rFfzNQTlTwLw%3D&reserved=0 > > e.kernel.org%2Ftglx%2Fnotes-about- > > > netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C6e1e0e21051c4 > > > 9c1cfe008db233a0376%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0% > > > 7C638142504360574969%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA > > > wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C > > > %7C&sdata=OLdgb1AuLbLvlzucgNFBQEEK6G%2FsFV%2BO2TqT%2FNCujJU%3 > > D&reserved=0
On Mon, Mar 13, 2023 at 03:37:44AM +0000, Saurabh Singh Sengar wrote: > I just see mail to rahul.tanwar@linux.intel.com is undelivered, shall I still add it in "Cc:" ? > Please let me know what we usually do in such cases. You were posting just fine inline, why do you have to top-post now? If it is undelivered, then the address is probably invalid now so drop it. Thx.
On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote: > To be specific in our system which is a guest VM we don't need IO-APIC and hence > there is no device tree node for it. It is observed that we get irq 0 assigned to PCI-MSI. This should be added to your commit message: what guest VM is that and why should the kernel support it. Why doesn't it need an IO-APIC and why does the current code need to be changed just for your guest VM? What else needs to be changed so that your VM works? Where is that VM's documentation and why can't that VM be fixed *not* to need kernel changes? IOW, why can't that VM emulate an IO-APIC like the others do...
> -----Original Message----- > From: Borislav Petkov <bp@alien8.de> > Sent: Monday, March 13, 2023 4:44 PM > To: Saurabh Singh Sengar <ssengar@microsoft.com> > Cc: Saurabh Sengar <ssengar@linux.microsoft.com>; tglx@linutronix.de; > mingo@redhat.com; dave.hansen@linux.intel.com; x86@kernel.org; > hpa@zytor.com; johan+linaro@kernel.org; isaku.yamahata@intel.com; > Michael Kelley (LINUX) <mikelley@microsoft.com>; linux- > kernel@vger.kernel.org; rahul.tanwar@linux.intel.com; > andriy.shevchenko@intel.com > Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq > > On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote: > > To be specific in our system which is a guest VM we don't need IO-APIC > > and hence there is no device tree node for it. It is observed that we get irq 0 > assigned to PCI-MSI. > > This should be added to your commit message: what guest VM is that and > why should the kernel support it. Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux documentation is in Documentation/virt/hyperv/. In commit I wanted to mention that any system which is not registering IO-APIC will have this issue. But I am fine to mention specifically about the issue I am facing. As part of your next comment, I have explained the issue in detail if that is good, I can put that as commit message. > > Why doesn't it need an IO-APIC and why does the current code need to be > changed just for your guest VM? For Hyper-V Virtual Machines, few platforms don't have any devices to be hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which assigns interrupts to PCIe devices. In such platforms IO-APIC is not registered which causes gsi_top value to remain at 0 and not get properly assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound function in io_apic.c decides the lower bound of irq numbers based on gsi_top. Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first virq number because gsi_top is still 0. 0 being invalid virq is ignored by MSI irq domain and results allocation of the same PCIe MSI twice. CPU0 CPU1 0: 2 0 Hyper-V PCIe MSI 1073741824-edge 1: 69 0 Hyper-V PCIe MSI 1073741824-edge nvme0q0 To avoid this issue, if IO-APIC and gsi_top are not initialized, return the hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0. This will also be identical to the behaviour of weak arch_dynirq_lower_bound function defined in kernel/softirq.c. > > What else needs to be changed so that your VM works? This is the only change required. > > Where is that VM's documentation and why can't that VM be fixed *not* to > need kernel changes? IOW, why can't that VM emulate an IO-APIC like the > others do... Documentation is mentioned above. As there is no need of IO-APIC there is no need emulating it. Please let me know if there is any further clarification required. > > -- > Regards/Gruss, > Boris. > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl > e.kernel.org%2Ftglx%2Fnotes-about- > netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C817c78e7bb324 > 8cd73b708db23b41c2a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0% > 7C638143028755917117%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA > wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C > %7C&sdata=3N5Mkl2gjMPHKOJGykZ3LvM6h%2FfD86dXLTQo3VH0Svc%3D&re > served=0
> -----Original Message----- > From: Saurabh Singh Sengar <ssengar@microsoft.com> > Sent: Tuesday, March 14, 2023 3:54 PM > To: Borislav Petkov <bp@alien8.de> > Cc: Saurabh Sengar <ssengar@linux.microsoft.com>; tglx@linutronix.de; > mingo@redhat.com; dave.hansen@linux.intel.com; x86@kernel.org; > hpa@zytor.com; johan+linaro@kernel.org; isaku.yamahata@intel.com; > Michael Kelley (LINUX) <mikelley@microsoft.com>; linux- > kernel@vger.kernel.org; rahul.tanwar@linux.intel.com; > andriy.shevchenko@intel.com > Subject: RE: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq > > > > > -----Original Message----- > > From: Borislav Petkov <bp@alien8.de> > > Sent: Monday, March 13, 2023 4:44 PM > > To: Saurabh Singh Sengar <ssengar@microsoft.com> > > Cc: Saurabh Sengar <ssengar@linux.microsoft.com>; tglx@linutronix.de; > > mingo@redhat.com; dave.hansen@linux.intel.com; x86@kernel.org; > > hpa@zytor.com; johan+linaro@kernel.org; isaku.yamahata@intel.com; > > Michael Kelley (LINUX) <mikelley@microsoft.com>; linux- > > kernel@vger.kernel.org; rahul.tanwar@linux.intel.com; > > andriy.shevchenko@intel.com > > Subject: Re: [EXTERNAL] Re: [PATCH] x86/ioapic: Don't return 0 as valid virq > > > > On Mon, Mar 13, 2023 at 03:29:32AM +0000, Saurabh Singh Sengar wrote: > > > To be specific in our system which is a guest VM we don't need IO-APIC > > > and hence there is no device tree node for it. It is observed that we get irq > 0 > > assigned to PCI-MSI. > > > > This should be added to your commit message: what guest VM is that and > > why should the kernel support it. > > Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux > documentation is in Documentation/virt/hyperv/. > > In commit I wanted to mention that any system which is not registering IO- > APIC > will have this issue. But I am fine to mention specifically about the issue I am > facing. > As part of your next comment, I have explained the issue in detail if that is > good, I > can put that as commit message. > > > > > Why doesn't it need an IO-APIC and why does the current code need to be > > changed just for your guest VM? > > For Hyper-V Virtual Machines, few platforms don't have any devices to be > hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which > assigns interrupts to PCIe devices. In such platforms IO-APIC is not > registered which causes gsi_top value to remain at 0 and not get properly > assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC > flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound > function in io_apic.c decides the lower bound of irq numbers based on > gsi_top. > > Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first > virq number because gsi_top is still 0. 0 being invalid virq is ignored by > MSI irq domain and results allocation of the same PCIe MSI twice. > > CPU0 CPU1 > 0: 2 0 Hyper-V PCIe MSI > 1073741824-edge > 1: 69 0 Hyper-V PCIe MSI > 1073741824-edge nvme0q0 > > To avoid this issue, if IO-APIC and gsi_top are not initialized, return the > hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0. > This will also be identical to the behaviour of weak arch_dynirq_lower_bound > function defined in kernel/softirq.c. Hi Borislav, Have you had an opportunity to review the above commit message ? Kindly provide me with your opinion on whether it meets your expectations. Regards, Saurabh > > > > > What else needs to be changed so that your VM works? > > This is the only change required. > > > > > Where is that VM's documentation and why can't that VM be fixed *not* to > > need kernel changes? IOW, why can't that VM emulate an IO-APIC like the > > others do... > > Documentation is mentioned above. As there is no need of IO-APIC there is > no need emulating it. > > Please let me know if there is any further clarification required. > > > > > -- > > Regards/Gruss, > > Boris. > > > > > https://nam06.safelinks.protection.outlook.com/?url=https%3A%2F%2Fpeopl > %2F&data=05%7C01%7Cssengar%40microsoft.com%7C84362c605bf04e6d56 > 6a08db247630d0%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0%7C63 > 8143862345546032%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMD > AiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C%7C > &sdata=7iCmUXeDyD%2Fqv%2BO63N6HG%2FPrS9HWP3yaGClT7X2RB0c%3D > &reserved=0 > > e.kernel.org%2Ftglx%2Fnotes-about- > > > netiquette&data=05%7C01%7Cssengar%40microsoft.com%7C817c78e7bb324 > > > 8cd73b708db23b41c2a%7C72f988bf86f141af91ab2d7cd011db47%7C1%7C0% > > > 7C638143028755917117%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjA > > > wMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000%7C%7C > > > %7C&sdata=3N5Mkl2gjMPHKOJGykZ3LvM6h%2FfD86dXLTQo3VH0Svc%3D&re > > served=0
On Tue, Mar 14 2023 at 10:23, Saurabh Singh Sengar wrote: >> This should be added to your commit message: what guest VM is that and >> why should the kernel support it. > > Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux > documentation is in Documentation/virt/hyperv/. > > In commit I wanted to mention that any system which is not registering > IO-APIC will have this issue. But I am fine to mention specifically > about the issue I am facing. As part of your next comment, I have > explained the issue in detail if that is good, I can put that as > commit message. >> >> Why doesn't it need an IO-APIC and why does the current code need to be >> changed just for your guest VM? > > For Hyper-V Virtual Machines, few platforms don't have any devices to be > hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which > assigns interrupts to PCIe devices. In such platforms IO-APIC is not > registered which causes gsi_top value to remain at 0 and not get properly > assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC > flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound > function in io_apic.c decides the lower bound of irq numbers based on gsi_top. > > Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first > virq number because gsi_top is still 0. 0 being invalid virq is ignored by > MSI irq domain and results allocation of the same PCIe MSI twice. > > CPU0 CPU1 > 0: 2 0 Hyper-V PCIe MSI 1073741824-edge > 1: 69 0 Hyper-V PCIe MSI 1073741824-edge nvme0q0 > > To avoid this issue, if IO-APIC and gsi_top are not initialized, return the > hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0. > This will also be identical to the behaviour of weak arch_dynirq_lower_bound > function defined in kernel/softirq.c. I find this mightly confusing. Something like this perhaps: Subject: x86/ioapic: Don't return 0 from arch_dynirq_lower_bound() arch_dynirq_lower_bound() is invoked by the core interrupt code to retrieve the lowest possible Linux interrupt number for dynamically allocated interrupts like MSI. The x86 implementation uses this to exclude the IO/APIC GSI space. This works correctly as long as there is an IO/APIC registered, but returns 0 if not. This has been observed in VMs where the BIOS does not advertise an IO/APIC. 0 is an invalid interrupt number except for the legacy timer interrupt on x86. The return value is unchecked in the core code, so it ends up to allocate interrupt number 0 which is subsequently considered to be invalid by the caller, e.g. the MSI allocation code. The function has already a check for 0 in the case that an IO/APIC is registered, but ioapic_dynirq_base is 0 in case of device tree setups. Consolidate this and zero check for both ioapic_dynirq_base and gsi_top, which is used in the case that no IO/APIC is registered. And then make the code to look like the below, which makes it very clear what this is about. Thanks, tglx --- --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2477,17 +2477,22 @@ static int io_apic_get_redir_entries(int unsigned int arch_dynirq_lower_bound(unsigned int from) { + unsigned int ret; + /* * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use * gsi_top if ioapic_dynirq_base hasn't been initialized yet. */ - if (!ioapic_initialized) - return gsi_top; + ret = ioapic_dynirq_base ? : gsi_top; + /* - * For DT enabled machines ioapic_dynirq_base is irrelevant and not - * updated. So simply return @from if ioapic_dynirq_base == 0. + * For DT enabled machines ioapic_dynirq_base is irrelevant and + * always 0. gsi_top can be 0 if there is no IO/APIC registered. + * + * 0 is an invalid interrupt number for dynamic allocations. Return + * @from instead. */ - return ioapic_dynirq_base ? : from; + return ret ? : from; } #ifdef CONFIG_X86_32
On Fri, Mar 24, 2023 at 04:39:07PM +0100, Thomas Gleixner wrote: > On Tue, Mar 14 2023 at 10:23, Saurabh Singh Sengar wrote: > >> This should be added to your commit message: what guest VM is that and > >> why should the kernel support it. > > > > Guest VM is a linux VM running as child partition on Hyper-V. Hyper-v Linux > > documentation is in Documentation/virt/hyperv/. > > > > In commit I wanted to mention that any system which is not registering > > IO-APIC will have this issue. But I am fine to mention specifically > > about the issue I am facing. As part of your next comment, I have > > explained the issue in detail if that is good, I can put that as > > commit message. > >> > >> Why doesn't it need an IO-APIC and why does the current code need to be > >> changed just for your guest VM? > > > > For Hyper-V Virtual Machines, few platforms don't have any devices to be > > hooked to IO-APIC. Although it has Hyper-V based MSI over VMBus which > > assigns interrupts to PCIe devices. In such platforms IO-APIC is not > > registered which causes gsi_top value to remain at 0 and not get properly > > assigned. Moreover, due to the inability to disable CONFIG_X86_IO_APIC > > flag, the io-apic code still gets compiled. Thus, arch_dynirq_lower_bound > > function in io_apic.c decides the lower bound of irq numbers based on gsi_top. > > > > Later when PCIe-MSI attempts to allocate interrupts, it gets 0 as the first > > virq number because gsi_top is still 0. 0 being invalid virq is ignored by > > MSI irq domain and results allocation of the same PCIe MSI twice. > > > > CPU0 CPU1 > > 0: 2 0 Hyper-V PCIe MSI 1073741824-edge > > 1: 69 0 Hyper-V PCIe MSI 1073741824-edge nvme0q0 > > > > To avoid this issue, if IO-APIC and gsi_top are not initialized, return the > > hint value passed as 'from' value to arch_dynirq_lower_bound instead of 0. > > This will also be identical to the behaviour of weak arch_dynirq_lower_bound > > function defined in kernel/softirq.c. > > I find this mightly confusing. Something like this perhaps: > > Subject: x86/ioapic: Don't return 0 from arch_dynirq_lower_bound() > > arch_dynirq_lower_bound() is invoked by the core interrupt code to > retrieve the lowest possible Linux interrupt number for dynamically > allocated interrupts like MSI. > > The x86 implementation uses this to exclude the IO/APIC GSI space. > This works correctly as long as there is an IO/APIC registered, but > returns 0 if not. This has been observed in VMs where the BIOS does > not advertise an IO/APIC. > > 0 is an invalid interrupt number except for the legacy timer interrupt > on x86. The return value is unchecked in the core code, so it ends up > to allocate interrupt number 0 which is subsequently considered to be > invalid by the caller, e.g. the MSI allocation code. > > The function has already a check for 0 in the case that an IO/APIC is > registered, but ioapic_dynirq_base is 0 in case of device tree setups. > > Consolidate this and zero check for both ioapic_dynirq_base and gsi_top, > which is used in the case that no IO/APIC is registered. > > And then make the code to look like the below, which makes it very > clear what this is about. > > Thanks, > > tglx > --- > --- a/arch/x86/kernel/apic/io_apic.c > +++ b/arch/x86/kernel/apic/io_apic.c > @@ -2477,17 +2477,22 @@ static int io_apic_get_redir_entries(int > > unsigned int arch_dynirq_lower_bound(unsigned int from) > { > + unsigned int ret; > + > /* > * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use > * gsi_top if ioapic_dynirq_base hasn't been initialized yet. > */ > - if (!ioapic_initialized) > - return gsi_top; > + ret = ioapic_dynirq_base ? : gsi_top; > + > /* > - * For DT enabled machines ioapic_dynirq_base is irrelevant and not > - * updated. So simply return @from if ioapic_dynirq_base == 0. > + * For DT enabled machines ioapic_dynirq_base is irrelevant and > + * always 0. gsi_top can be 0 if there is no IO/APIC registered. > + * > + * 0 is an invalid interrupt number for dynamic allocations. Return > + * @from instead. > */ > - return ioapic_dynirq_base ? : from; > + return ret ? : from; > } > > #ifdef CONFIG_X86_32 > Thanks you for your valuable suggestions. Commit message and code looks much better now. I will send the V2 with your "Co-Developed-by" tag, I hope its fine with you. Regards, Saurabh
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c index a868b76cd3d4..000cc6159b8b 100644 --- a/arch/x86/kernel/apic/io_apic.c +++ b/arch/x86/kernel/apic/io_apic.c @@ -2483,9 +2483,11 @@ unsigned int arch_dynirq_lower_bound(unsigned int from) /* * dmar_alloc_hwirq() may be called before setup_IO_APIC(), so use * gsi_top if ioapic_dynirq_base hasn't been initialized yet. + * + * Incase gsi_top is also not initialized return @from. */ if (!ioapic_initialized) - return gsi_top; + return gsi_top ? : from; /* * For DT enabled machines ioapic_dynirq_base is irrelevant and not * updated. So simply return @from if ioapic_dynirq_base == 0.