[RFC,v3,03/21] ACPI: processor: Register CPUs that are online, but not described in the DSDT
Message ID | E1rDOg2-00Dvjk-RI@rmk-PC.armlinux.org.uk |
---|---|
State | New |
Headers |
In-Reply-To: <ZXmn46ptis59F0CO@shell.armlinux.org.uk>
References: <ZXmn46ptis59F0CO@shell.armlinux.org.uk>
From: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
To: linux-pm@vger.kernel.org, loongarch@lists.linux.dev, linux-acpi@vger.kernel.org, linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-riscv@lists.infradead.org, kvmarm@lists.linux.dev, x86@kernel.org, acpica-devel@lists.linuxfoundation.org, linux-csky@vger.kernel.org, linux-doc@vger.kernel.org, linux-ia64@vger.kernel.org, linux-parisc@vger.kernel.org
Cc: Salil Mehta <salil.mehta@huawei.com>, Jean-Philippe Brucker <jean-philippe@linaro.org>, jianyong.wu@arm.com, justin.he@arm.com, James Morse <james.morse@arm.com>
Subject: [PATCH RFC v3 03/21] ACPI: processor: Register CPUs that are online, but not described in the DSDT
Message-Id: <E1rDOg2-00Dvjk-RI@rmk-PC.armlinux.org.uk>
Sender: Russell King <rmk@armlinux.org.uk>
Date: Wed, 13 Dec 2023 12:49:26 +0000
List-ID: <linux-kernel.vger.kernel.org> |
Series | ACPI/arm64: add support for virtual cpu hotplug |
Commit Message
Russell King (Oracle)
Dec. 13, 2023, 12:49 p.m. UTC
From: James Morse <james.morse@arm.com>

ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
says "Each processor in the system must be declared in the ACPI
namespace"). Having two descriptions allows firmware authors to get
this wrong.

If CPUs are described in the MADT/APIC, they will be brought online
early during boot. Once the register_cpu() calls are moved to ACPI,
they will be based on the DSDT description of the CPUs. When CPUs are
missing from the DSDT description, they will end up online, but not
registered.

Add a helper that runs after acpi_init() has completed to register
CPUs that are online, but weren't found in the DSDT. Any CPU that
is registered by this code triggers a firmware-bug warning and kernel
taint.

Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
is configured.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
---
 drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
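The behaviour described in the commit message can be modelled in user space. The following is a minimal sketch under stated assumptions, not kernel code: struct cpu_model, MODEL_NR_CPUS and model_register_missing_cpus() are hypothetical names invented for illustration, and the boolean arrays stand in for the kernel's online mask and for get_cpu_device() returning non-NULL. It only tries to show the intended effect: CPUs that are online but unregistered get registered, with one warning and a taint.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define MODEL_NR_CPUS 8	/* arbitrary size for the model */

/* Hypothetical stand-in for kernel state; not a kernel structure. */
struct cpu_model {
	bool online[MODEL_NR_CPUS];	/* brought up from the MADT description */
	bool registered[MODEL_NR_CPUS];	/* has a CPU device, i.e. get_cpu_device() != NULL */
	bool tainted;			/* models TAINT_FIRMWARE_WORKAROUND */
	int warnings;			/* messages printed; pr_err_once() allows one */
};

/*
 * Model of the fallback: register every CPU that is online but has no
 * device yet. Returns the number of CPUs registered by the fallback.
 */
static int model_register_missing_cpus(struct cpu_model *m, bool acpi_disabled)
{
	int fixed = 0;

	assert(m != NULL);

	/* With ACPI disabled the fallback does nothing, as in the patch. */
	if (acpi_disabled)
		return 0;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!m->online[cpu] || m->registered[cpu])
			continue;

		/* Warn only for the first offender, like pr_err_once(). */
		if (m->warnings == 0) {
			fprintf(stderr,
				"[Firmware Bug]: CPU %d has no ACPI namespace description!\n",
				cpu);
			m->warnings++;
		}
		m->tainted = true;		/* add_taint() in the patch */
		m->registered[cpu] = true;	/* arch_register_cpu() in the patch */
		fixed++;
	}

	return fixed;
}
```

Calling the function on a model where CPUs 0 to 2 are online and only CPU 0 is registered would register CPUs 1 and 2, print a single message and set the taint flag; a second call would find nothing left to do.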
Comments
On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
>
> From: James Morse <james.morse@arm.com>
>
> ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> says "Each processor in the system must be declared in the ACPI
> namespace"). Having two descriptions allows firmware authors to get
> this wrong.
>
> If CPUs are described in the MADT/APIC, they will be brought online
> early during boot. Once the register_cpu() calls are moved to ACPI,
> they will be based on the DSDT description of the CPUs. When CPUs are
> missing from the DSDT description, they will end up online, but not
> registered.
>
> Add a helper that runs after acpi_init() has completed to register
> CPUs that are online, but weren't found in the DSDT. Any CPU that
> is registered by this code triggers a firmware-bug warning and kernel
> taint.
>
> Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> is configured.

So why is this a kernel problem?

> Signed-off-by: James Morse <james.morse@arm.com>
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> Reviewed-by: Gavin Shan <gshan@redhat.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> ---
>  drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 6a542e0ce396..0511f2bc10bc 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -791,6 +791,25 @@ void __init acpi_processor_init(void)
>  	acpi_pcc_cpufreq_init();
>  }
>
> +static int __init acpi_processor_register_missing_cpus(void)
> +{
> +	int cpu;
> +
> +	if (acpi_disabled)
> +		return 0;
> +
> +	for_each_online_cpu(cpu) {
> +		if (!get_cpu_device(cpu)) {
> +			pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
> +			add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> +			arch_register_cpu(cpu);

Which part of this code is related to ACPI?

> +		}
> +	}
> +
> +	return 0;
> +}
> +subsys_initcall_sync(acpi_processor_register_missing_cpus);
> +
>  #ifdef CONFIG_ACPI_PROCESSOR_CSTATE
>  /**
>   * acpi_processor_claim_cst_control - Request _CST control from the platform.
> --
On Mon, Dec 18, 2023 at 09:22:03PM +0100, Rafael J. Wysocki wrote:
> On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> >
> > From: James Morse <james.morse@arm.com>
> >
> > ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> > in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> > says "Each processor in the system must be declared in the ACPI
> > namespace"). Having two descriptions allows firmware authors to get
> > this wrong.
> >
> > If CPUs are described in the MADT/APIC, they will be brought online
> > early during boot. Once the register_cpu() calls are moved to ACPI,
> > they will be based on the DSDT description of the CPUs. When CPUs are
> > missing from the DSDT description, they will end up online, but not
> > registered.
> >
> > Add a helper that runs after acpi_init() has completed to register
> > CPUs that are online, but weren't found in the DSDT. Any CPU that
> > is registered by this code triggers a firmware-bug warning and kernel
> > taint.
> >
> > Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> > is configured.
>
> So why is this a kernel problem?

So what are you proposing should be the behaviour here? What this
statement seems to be saying is that QEMU as it exists today only
describes the first CPU in DSDT.

As this patch series changes when arch_register_cpu() gets called (as
described in the paragraph above) we obviously need to preserve the
_existing_ behaviour to avoid causing regressions. So, if changing the
kernel causes user visible regressions (e.g. sysfs entries to
disappear) then it obviously _is_ a kernel problem that needs to be
solved.

We can't say "well fix QEMU then" without invoking the wrath of Linus.

> > Signed-off-by: James Morse <james.morse@arm.com>
> > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > ---
> >  drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 6a542e0ce396..0511f2bc10bc 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -791,6 +791,25 @@ void __init acpi_processor_init(void)
> >  	acpi_pcc_cpufreq_init();
> >  }
> >
> > +static int __init acpi_processor_register_missing_cpus(void)
> > +{
> > +	int cpu;
> > +
> > +	if (acpi_disabled)
> > +		return 0;
> > +
> > +	for_each_online_cpu(cpu) {
> > +		if (!get_cpu_device(cpu)) {
> > +			pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
> > +			add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> > +			arch_register_cpu(cpu);
>
> Which part of this code is related to ACPI?

That's a good question, and I suspect it would be more suited to being
placed in drivers/base/cpu.c except for the problem that the error
message refers to ACPI.

As long as we keep the acpi_disabled test, I guess that's fine.
cpu_dev_register_generic() there already tests acpi_disabled.
On Mon, 15 Jan 2024 11:06:29 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Mon, Dec 18, 2023 at 09:22:03PM +0100, Rafael J. Wysocki wrote:
> > On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > >
> > > From: James Morse <james.morse@arm.com>
> > >
> > > ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> > > in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> > > says "Each processor in the system must be declared in the ACPI
> > > namespace"). Having two descriptions allows firmware authors to get
> > > this wrong.
> > >
> > > If CPUs are described in the MADT/APIC, they will be brought online
> > > early during boot. Once the register_cpu() calls are moved to ACPI,
> > > they will be based on the DSDT description of the CPUs. When CPUs are
> > > missing from the DSDT description, they will end up online, but not
> > > registered.
> > >
> > > Add a helper that runs after acpi_init() has completed to register
> > > CPUs that are online, but weren't found in the DSDT. Any CPU that
> > > is registered by this code triggers a firmware-bug warning and kernel
> > > taint.
> > >
> > > Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> > > is configured.
> >
> > So why is this a kernel problem?
>
> So what are you proposing should be the behaviour here? What this
> statement seems to be saying is that QEMU as it exists today only
> describes the first CPU in DSDT.

This confuses me somewhat, because I'm far from sure which machines this
is true for in QEMU. I'm guessing it's a legacy thing with
some old distro version of QEMU - so we'll have to paper over it anyway
but for current QEMU I'm not sure it's true.

Helpfully there are a bunch of ACPI table tests so I've been checking
through all the multi CPU cases.

CPU hotplug not enabled.
pc/DSDT.dimmpxm - 4x Processor entries. -smp 4
pc/DSDT.acpihmat - 2x Processor entries. -smp 2
q35/DSDT.acpihmat - 2x Processor entries. -smp 2
virt/DSDT.acpihmatvirt - 4x ACPI0007 entries -smp 4
q35/DSDT.acpihmat-noinitiator - 4 x Processor () entries -smp 4
virt/DSDT.topology - 8x ACPI0007 entries

I've also looked at the code and we have various types of
CPU hotplug on x86 but they all build appropriate numbers of
Processor() entries in DSDT.
Arm likewise seems to build the right number of ACPI0007 entries
(and doesn't yet have CPU HP support).

If anyone can add a reference on why this is needed that would be very
helpful.

>
> As this patch series changes when arch_register_cpu() gets called (as
> described in the paragraph above) we obviously need to preserve the
> _existing_ behaviour to avoid causing regressions. So, if changing the
> kernel causes user visible regressions (e.g. sysfs entries to
> disappear) then it obviously _is_ a kernel problem that needs to be
> solved.
>
> We can't say "well fix QEMU then" without invoking the wrath of Linus.

Overall I'm fine with the defensive nature of this patch as there 'might'
be firmware out there with this problem - I just can't establish that
there is! If anyone else recalls the history of this then give a shout.
I vaguely wondered if this was an ia64 thing but nope, QEMU never
generated tables for ia64 before dropping support back in QEMU 2.11

>
> > > Signed-off-by: James Morse <james.morse@arm.com>
> > > Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > > Reviewed-by: Gavin Shan <gshan@redhat.com>
> > > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
> > > ---
> > >  drivers/acpi/acpi_processor.c | 19 +++++++++++++++++++
> > >  1 file changed, 19 insertions(+)
> > >
> > > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > > index 6a542e0ce396..0511f2bc10bc 100644
> > > --- a/drivers/acpi/acpi_processor.c
> > > +++ b/drivers/acpi/acpi_processor.c
> > > @@ -791,6 +791,25 @@ void __init acpi_processor_init(void)
> > >  	acpi_pcc_cpufreq_init();
> > >  }
> > >
> > > +static int __init acpi_processor_register_missing_cpus(void)
> > > +{
> > > +	int cpu;
> > > +
> > > +	if (acpi_disabled)
> > > +		return 0;
> > > +
> > > +	for_each_online_cpu(cpu) {
> > > +		if (!get_cpu_device(cpu)) {
> > > +			pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
> > > +			add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
> > > +			arch_register_cpu(cpu);
> >
> > Which part of this code is related to ACPI?
>
> That's a good question, and I suspect it would be more suited to being
> placed in drivers/base/cpu.c except for the problem that the error
> message refers to ACPI.
>
> As long as we keep the acpi_disabled test, I guess that's fine.
> cpu_dev_register_generic() there already tests acpi_disabled.
>
Moving it seems fine to me.

Jonathan
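The table check described in the message above (comparing the number of CPU declarations in a DSDT with the -smp count) could be automated along the following lines. This is only a rough sketch under assumptions: it works on disassembled DSDT text, it assumes that legacy declarations appear as "Processor (" and that processor devices carry the string "ACPI0007" in quotes as their _HID, and the function names are hypothetical. A plain substring count is crude and is not a substitute for a real ASL parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Count non-overlapping occurrences of needle in text. */
static size_t count_occurrences(const char *text, const char *needle)
{
	size_t n = 0;
	size_t len = strlen(needle);

	assert(len > 0);	/* an empty needle would never advance */

	for (const char *p = strstr(text, needle); p != NULL;
	     p = strstr(p + len, needle))
		n++;

	return n;
}

/*
 * Count CPU declarations in disassembled DSDT text. Assumption: legacy
 * declarations look like "Processor (" and processor devices use the
 * quoted _HID string "ACPI0007".
 */
static size_t count_cpu_declarations(const char *dsl)
{
	return count_occurrences(dsl, "Processor (") +
	       count_occurrences(dsl, "\"ACPI0007\"");
}

/* True when the table declares exactly as many CPUs as -smp requested. */
static bool declarations_match_smp(const char *dsl, size_t smp)
{
	return count_cpu_declarations(dsl) == smp;
}
```

For a table that declares two Processor objects, declarations_match_smp(text, 2) would hold, which corresponds to the "2x Processor entries. -smp 2" rows in the list above.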
On Mon, Jan 22, 2024 at 5:02 PM Jonathan Cameron <Jonathan.Cameron@huawei.com> wrote:
>
> On Mon, 15 Jan 2024 11:06:29 +0000
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
>
> > On Mon, Dec 18, 2023 at 09:22:03PM +0100, Rafael J. Wysocki wrote:
> > > On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > > >
> > > > From: James Morse <james.morse@arm.com>
> > > >
> > > > ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> > > > in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> > > > says "Each processor in the system must be declared in the ACPI
> > > > namespace"). Having two descriptions allows firmware authors to get
> > > > this wrong.
> > > >
> > > > If CPUs are described in the MADT/APIC, they will be brought online
> > > > early during boot. Once the register_cpu() calls are moved to ACPI,
> > > > they will be based on the DSDT description of the CPUs. When CPUs are
> > > > missing from the DSDT description, they will end up online, but not
> > > > registered.
> > > >
> > > > Add a helper that runs after acpi_init() has completed to register
> > > > CPUs that are online, but weren't found in the DSDT. Any CPU that
> > > > is registered by this code triggers a firmware-bug warning and kernel
> > > > taint.
> > > >
> > > > Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> > > > is configured.
> > >
> > > So why is this a kernel problem?
> >
> > So what are you proposing should be the behaviour here? What this
> > statement seems to be saying is that QEMU as it exists today only
> > describes the first CPU in DSDT.
>
> This confuses me somewhat, because I'm far from sure which machines this
> is true for in QEMU. I'm guessing it's a legacy thing with
> some old distro version of QEMU - so we'll have to paper over it anyway
> but for current QEMU I'm not sure it's true.
>
> Helpfully there are a bunch of ACPI table tests so I've been checking
> through all the multi CPU cases.
>
> CPU hotplug not enabled.
> pc/DSDT.dimmpxm - 4x Processor entries. -smp 4
> pc/DSDT.acpihmat - 2x Processor entries. -smp 2
> q35/DSDT.acpihmat - 2x Processor entries. -smp 2
> virt/DSDT.acpihmatvirt - 4x ACPI0007 entries -smp 4
> q35/DSDT.acpihmat-noinitiator - 4 x Processor () entries -smp 4
> virt/DSDT.topology - 8x ACPI0007 entries
>
> I've also looked at the code and we have various types of
> CPU hotplug on x86 but they all build appropriate numbers of
> Processor() entries in DSDT.
> Arm likewise seems to build the right number of ACPI0007 entries
> (and doesn't yet have CPU HP support).
>
> If anyone can add a reference on why this is needed that would be very
> helpful.

Yes, it would.

Personally, I would prefer to assume that it is not necessary until it
turns out that (1) there is firmware with this issue actually in use
and (2) updating the firmware in question to follow the specification
is not practical.

Otherwise, we'd make it easier to ship non-compliant firmware for no
good reason.
On Mon, Jan 22, 2024 at 05:22:46PM +0100, Rafael J. Wysocki wrote:
> On Mon, Jan 22, 2024 at 5:02 PM Jonathan Cameron
> <Jonathan.Cameron@huawei.com> wrote:
> >
> > On Mon, 15 Jan 2024 11:06:29 +0000
> > "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> >
> > > On Mon, Dec 18, 2023 at 09:22:03PM +0100, Rafael J. Wysocki wrote:
> > > > On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > > > >
> > > > > From: James Morse <james.morse@arm.com>
> > > > >
> > > > > ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
> > > > > in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
> > > > > says "Each processor in the system must be declared in the ACPI
> > > > > namespace"). Having two descriptions allows firmware authors to get
> > > > > this wrong.
> > > > >
> > > > > If CPUs are described in the MADT/APIC, they will be brought online
> > > > > early during boot. Once the register_cpu() calls are moved to ACPI,
> > > > > they will be based on the DSDT description of the CPUs. When CPUs are
> > > > > missing from the DSDT description, they will end up online, but not
> > > > > registered.
> > > > >
> > > > > Add a helper that runs after acpi_init() has completed to register
> > > > > CPUs that are online, but weren't found in the DSDT. Any CPU that
> > > > > is registered by this code triggers a firmware-bug warning and kernel
> > > > > taint.
> > > > >
> > > > > Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
> > > > > is configured.
> > > >
> > > > So why is this a kernel problem?
> > >
> > > So what are you proposing should be the behaviour here? What this
> > > statement seems to be saying is that QEMU as it exists today only
> > > describes the first CPU in DSDT.
> >
> > This confuses me somewhat, because I'm far from sure which machines this
> > is true for in QEMU. I'm guessing it's a legacy thing with
> > some old distro version of QEMU - so we'll have to paper over it anyway
> > but for current QEMU I'm not sure it's true.
> >
> > Helpfully there are a bunch of ACPI table tests so I've been checking
> > through all the multi CPU cases.
> >
> > CPU hotplug not enabled.
> > pc/DSDT.dimmpxm - 4x Processor entries. -smp 4
> > pc/DSDT.acpihmat - 2x Processor entries. -smp 2
> > q35/DSDT.acpihmat - 2x Processor entries. -smp 2
> > virt/DSDT.acpihmatvirt - 4x ACPI0007 entries -smp 4
> > q35/DSDT.acpihmat-noinitiator - 4 x Processor () entries -smp 4
> > virt/DSDT.topology - 8x ACPI0007 entries
> >
> > I've also looked at the code and we have various types of
> > CPU hotplug on x86 but they all build appropriate numbers of
> > Processor() entries in DSDT.
> > Arm likewise seems to build the right number of ACPI0007 entries
> > (and doesn't yet have CPU HP support).
> >
> > If anyone can add a reference on why this is needed that would be very
> > helpful.
>
> Yes, it would.
>
> Personally, I would prefer to assume that it is not necessary until it
> turns out that (1) there is firmware with this issue actually in use
> and (2) updating the firmware in question to follow the specification
> is not practical.
>
> Otherwise, we'd make it easier to ship non-compliant firmware for no
> good reason.

If Salil can't come up with a reason, then I'm in favour of dropping
the patch like already done for patch 2. If the code change serves no
useful purpose, there's no point in making the change.
Hi

> On 23 Jan 2024, at 08:27, Jonathan Cameron <jonathan.cameron@huawei.com> wrote:
>
> On Mon, 22 Jan 2024 17:30:05 +0000
> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
>
>> On Mon, Jan 22, 2024 at 05:22:46PM +0100, Rafael J. Wysocki wrote:
>>> On Mon, Jan 22, 2024 at 5:02 PM Jonathan Cameron
>>> <Jonathan.Cameron@huawei.com> wrote:
>>>>
>>>> On Mon, 15 Jan 2024 11:06:29 +0000
>>>> "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
>>>>
>>>>> On Mon, Dec 18, 2023 at 09:22:03PM +0100, Rafael J. Wysocki wrote:
>>>>>> On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
>>>>>>>
>>>>>>> From: James Morse <james.morse@arm.com>
>>>>>>>
>>>>>>> ACPI has two descriptions of CPUs, one in the MADT/APIC table, the other
>>>>>>> in the DSDT. Both are required. (ACPI 6.5's 8.4 "Declaring Processors"
>>>>>>> says "Each processor in the system must be declared in the ACPI
>>>>>>> namespace"). Having two descriptions allows firmware authors to get
>>>>>>> this wrong.
>>>>>>>
>>>>>>> If CPUs are described in the MADT/APIC, they will be brought online
>>>>>>> early during boot. Once the register_cpu() calls are moved to ACPI,
>>>>>>> they will be based on the DSDT description of the CPUs. When CPUs are
>>>>>>> missing from the DSDT description, they will end up online, but not
>>>>>>> registered.
>>>>>>>
>>>>>>> Add a helper that runs after acpi_init() has completed to register
>>>>>>> CPUs that are online, but weren't found in the DSDT. Any CPU that
>>>>>>> is registered by this code triggers a firmware-bug warning and kernel
>>>>>>> taint.
>>>>>>>
>>>>>>> Qemu TCG only describes the first CPU in the DSDT, unless cpu-hotplug
>>>>>>> is configured.
>>>>>>
>>>>>> So why is this a kernel problem?
>>>>>
>>>>> So what are you proposing should be the behaviour here? What this
>>>>> statement seems to be saying is that QEMU as it exists today only
>>>>> describes the first CPU in DSDT.
>>>>
>>>> This confuses me somewhat, because I'm far from sure which machines this
>>>> is true for in QEMU. I'm guessing it's a legacy thing with
>>>> some old distro version of QEMU - so we'll have to paper over it anyway
>>>> but for current QEMU I'm not sure it's true.
>>>>
>>>> Helpfully there are a bunch of ACPI table tests so I've been checking
>>>> through all the multi CPU cases.
>>>>
>>>> CPU hotplug not enabled.
>>>> pc/DSDT.dimmpxm - 4x Processor entries. -smp 4
>>>> pc/DSDT.acpihmat - 2x Processor entries. -smp 2
>>>> q35/DSDT.acpihmat - 2x Processor entries. -smp 2
>>>> virt/DSDT.acpihmatvirt - 4x ACPI0007 entries -smp 4
>>>> q35/DSDT.acpihmat-noinitiator - 4 x Processor () entries -smp 4
>>>> virt/DSDT.topology - 8x ACPI0007 entries
>>>>
>>>> I've also looked at the code and we have various types of
>>>> CPU hotplug on x86 but they all build appropriate numbers of
>>>> Processor() entries in DSDT.
>>>> Arm likewise seems to build the right number of ACPI0007 entries
>>>> (and doesn't yet have CPU HP support).
>>>>
>>>> If anyone can add a reference on why this is needed that would be very
>>>> helpful.
>>>
>>> Yes, it would.
>>>
>>> Personally, I would prefer to assume that it is not necessary until it
>>> turns out that (1) there is firmware with this issue actually in use
>>> and (2) updating the firmware in question to follow the specification
>>> is not practical.
>>>
>>> Otherwise, we'd make it easier to ship non-compliant firmware for no
>>> good reason.
>>
>> If Salil can't come up with a reason, then I'm in favour of dropping
>> the patch like already done for patch 2. If the code change serves no
>> useful purpose, there's no point in making the change.
>>
>
> Salil's out today, but I've messaged him to follow up later in the week.
>
> It 'might' be the odd cold plug path where QEMU half comes up, then extra
> CPUs are added, then it boots. (used by some orchestration frameworks)
> I don't have a set up for that and I won't get to creating one today anyway
> (we all love start of the year planning workshops!)
>
> I've +CC'd a few people have run tests on the various iterations of this
> work in the past. Maybe one of them can shed some light on this?
>

IIUC, this patch covers a scenario for non compliant firmware and in which my
tests for AArch64 using RFC v2 have been unable to trigger its error message
so far.

This does not mean, however, this patch should not be taken forward though.
It seems benevolent enough detecting non compliant firmware and still proceed
while having whoever uses that firmware to get to know that.

I'm not sure, however, whether the reference to a specific VMM should be in
the commit message though. That might not be anything to do with the kernel
so a more meaningful rewrite on this separation of concerns could be useful.

Miguel

> Jonathan
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 6a542e0ce396..0511f2bc10bc 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -791,6 +791,25 @@ void __init acpi_processor_init(void)
 	acpi_pcc_cpufreq_init();
 }
 
+static int __init acpi_processor_register_missing_cpus(void)
+{
+	int cpu;
+
+	if (acpi_disabled)
+		return 0;
+
+	for_each_online_cpu(cpu) {
+		if (!get_cpu_device(cpu)) {
+			pr_err_once(FW_BUG "CPU %u has no ACPI namespace description!\n", cpu);
+			add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
+			arch_register_cpu(cpu);
+		}
+	}
+
+	return 0;
+}
+subsys_initcall_sync(acpi_processor_register_missing_cpus);
+
 #ifdef CONFIG_ACPI_PROCESSOR_CSTATE
 /**
  * acpi_processor_claim_cst_control - Request _CST control from the platform.
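
The patch uses subsys_initcall_sync() so that the helper runs after acpi_init(), which is registered with subsys_initcall(). The ordering guarantee can be pictured with a small user-space model; this is a sketch under assumptions rather than how the kernel implements initcalls. The enum, the struct and model_run_initcalls() are hypothetical names, only four of the kernel's levels are modelled, and the only property carried over is that every call at one level finishes before the next level starts.

```c
#include <assert.h>
#include <stddef.h>

/* A subset of initcall levels, in the order the kernel runs them. */
enum model_level {
	MODEL_ARCH,
	MODEL_SUBSYS,		/* subsys_initcall(), e.g. acpi_init() */
	MODEL_SUBSYS_SYNC,	/* subsys_initcall_sync(), the new helper */
	MODEL_DEVICE,
	MODEL_NR_LEVELS
};

/* Hypothetical record of one registered initcall. */
struct model_initcall {
	const char *name;
	enum model_level level;
};

/*
 * Record the order in which the calls would run: level by level, and in
 * registration order within a level. "order" must have room for n
 * entries. Returns the number of entries written.
 */
static size_t model_run_initcalls(const struct model_initcall *calls,
				  size_t n, const char **order)
{
	size_t ran = 0;

	assert(calls != NULL && order != NULL);

	for (int level = 0; level < MODEL_NR_LEVELS; level++)
		for (size_t i = 0; i < n; i++)
			if ((int)calls[i].level == level)
				order[ran++] = calls[i].name;

	return ran;
}
```

Even if the helper is listed before acpi_init() in the input array, the model places it after acpi_init() in the output, which is the property the commit message depends on.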