[RFC,v3,02/21] ACPI: processor: Add support for processors described as container packages

Message ID E1rDOfx-00Dvje-MS@rmk-PC.armlinux.org.uk
State New
Headers
Series ACPI/arm64: add support for virtual cpu hotplug |

Commit Message

Russell King (Oracle) Dec. 13, 2023, 12:49 p.m. UTC
  From: James Morse <james.morse@arm.com>

ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
5.2.12:

"Starting with ACPI Specification 6.3, the use of the Processor() object
was deprecated. Only legacy systems should continue with this usage. On
the Itanium architecture only, a _UID is provided for the Processor()
that is a string object. This usage of _UID is also deprecated since it
can preclude an OSPM from being able to match a processor to a
non-enumerable device, such as those defined in the MADT. From ACPI
Specification 6.3 onward, all processor objects for all architectures
except Itanium must now use Device() objects with an _HID of ACPI0007,
and use only integer _UID values."

Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors

Duplicate descriptions are not allowed, the ACPI processor driver already
parses the UID from both devices and containers. acpi_processor_get_info()
returns an error if the UID exists twice in the DSDT.

The missing probe for CPUs described as packages creates a problem for
moving the cpu_register() calls into the acpi_processor driver, as CPUs
described like this don't get registered, leading to errors from other
subsystems when they try to add new sysfs entries to the CPU node.
(e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)

To fix this, parse the processor container and call acpi_processor_add()
for each processor that is discovered like this. The processor container
handler is added with acpi_scan_add_handler(), so no detach call will
arrive.

Qemu TCG describes CPUs using processor devices in a processor container.
For more information, see build_cpus_aml() in Qemu hw/acpi/cpu.c and
https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-container-device

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Miguel Luis <miguel.luis@oracle.com>
Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
Tested-by: Jianyong Wu <jianyong.wu@arm.com>
---
Outstanding comments:
 https://lore.kernel.org/r/20230914145353.000072e2@Huawei.com
 https://lore.kernel.org/r/50571c2f-aa3c-baeb-3add-cd59e0eddc02@redhat.com
---
 drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)
  

Comments

Jonathan Cameron Dec. 14, 2023, 5:36 p.m. UTC | #1
On Wed, 13 Dec 2023 12:49:21 +0000
Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:

> From: James Morse <james.morse@arm.com>
> 
> ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> 5.2.12:
> 
> "Starting with ACPI Specification 6.3, the use of the Processor() object
> was deprecated. Only legacy systems should continue with this usage. On
> the Itanium architecture only, a _UID is provided for the Processor()
> that is a string object. This usage of _UID is also deprecated since it
> can preclude an OSPM from being able to match a processor to a
> non-enumerable device, such as those defined in the MADT. From ACPI
> Specification 6.3 onward, all processor objects for all architectures
> except Itanium must now use Device() objects with an _HID of ACPI0007,
> and use only integer _UID values."
> 
> Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
> 
> Duplicate descriptions are not allowed, the ACPI processor driver already
> parses the UID from both devices and containers. acpi_processor_get_info()
> returns an error if the UID exists twice in the DSDT.
> 
> The missing probe for CPUs described as packages creates a problem for
> moving the cpu_register() calls into the acpi_processor driver, as CPUs
> described like this don't get registered, leading to errors from other
> subsystems when they try to add new sysfs entries to the CPU node.
> (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> 
> To fix this, parse the processor container and call acpi_processor_add()
> for each processor that is discovered like this. The processor container
> handler is added with acpi_scan_add_handler(), so no detach call will
> arrive.
> 
> Qemu TCG describes CPUs using processor devices in a processor container.
> For more information, see build_cpus_aml() in Qemu hw/acpi/cpu.c and
> https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-container-device
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> ---
> Outstanding comments:
>  https://lore.kernel.org/r/20230914145353.000072e2@Huawei.com
Looks like you resolved those (were all patch description things).

So I'm happy.
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Thanks,

J
>  https://lore.kernel.org/r/50571c2f-aa3c-baeb-3add-cd59e0eddc02@redhat.com
> ---
>  drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 4fe2ef54088c..6a542e0ce396 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -626,9 +626,31 @@ static struct acpi_scan_handler processor_handler = {
>  	},
>  };
>  
> +static acpi_status acpi_processor_container_walk(acpi_handle handle,
> +						 u32 lvl,
> +						 void *context,
> +						 void **rv)
> +{
> +	struct acpi_device *adev;
> +	acpi_status status;
> +
> +	adev = acpi_get_acpi_dev(handle);
> +	if (!adev)
> +		return AE_ERROR;
> +
> +	status = acpi_processor_add(adev, &processor_device_ids[0]);
> +	acpi_put_acpi_dev(adev);
> +
> +	return status;
> +}
> +
>  static int acpi_processor_container_attach(struct acpi_device *dev,
>  					   const struct acpi_device_id *id)
>  {
> +	acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
> +			    ACPI_UINT32_MAX, acpi_processor_container_walk,
> +			    NULL, NULL, NULL);
> +
>  	return 1;
>  }
>
  
Russell King (Oracle) Dec. 14, 2023, 5:57 p.m. UTC | #2
On Thu, Dec 14, 2023 at 05:36:26PM +0000, Jonathan Cameron wrote:
> On Wed, 13 Dec 2023 12:49:21 +0000
> Russell King (Oracle) <rmk+kernel@armlinux.org.uk> wrote:
> 
> > From: James Morse <james.morse@arm.com>
> > 
> > ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> > 5.2.12:
> > 
> > "Starting with ACPI Specification 6.3, the use of the Processor() object
> > was deprecated. Only legacy systems should continue with this usage. On
> > the Itanium architecture only, a _UID is provided for the Processor()
> > that is a string object. This usage of _UID is also deprecated since it
> > can preclude an OSPM from being able to match a processor to a
> > non-enumerable device, such as those defined in the MADT. From ACPI
> > Specification 6.3 onward, all processor objects for all architectures
> > except Itanium must now use Device() objects with an _HID of ACPI0007,
> > and use only integer _UID values."
> > 
> > Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
> > 
> > Duplicate descriptions are not allowed, the ACPI processor driver already
> > parses the UID from both devices and containers. acpi_processor_get_info()
> > returns an error if the UID exists twice in the DSDT.
> > 
> > The missing probe for CPUs described as packages creates a problem for
> > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > described like this don't get registered, leading to errors from other
> > subsystems when they try to add new sysfs entries to the CPU node.
> > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> > 
> > To fix this, parse the processor container and call acpi_processor_add()
> > for each processor that is discovered like this. The processor container
> > handler is added with acpi_scan_add_handler(), so no detach call will
> > arrive.
> > 
> > Qemu TCG describes CPUs using processor devices in a processor container.
> > For more information, see build_cpus_aml() in Qemu hw/acpi/cpu.c and
> > https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-container-device
> > 
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > ---
> > Outstanding comments:
> >  https://lore.kernel.org/r/20230914145353.000072e2@Huawei.com
> Looks like you resolved those (were all patch description things).
> 
> So I'm happy.
> Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>

Great, I wasn't sure if I had resolved them to your satisfaction, so I
kept the reference to your original review. I've now removed it and
added your r-b. Thanks.
  
Rafael J. Wysocki Dec. 18, 2023, 8:17 p.m. UTC | #3
On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
>
> From: James Morse <james.morse@arm.com>
>
> ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> 5.2.12:
>
> "Starting with ACPI Specification 6.3, the use of the Processor() object
> was deprecated. Only legacy systems should continue with this usage. On
> the Itanium architecture only, a _UID is provided for the Processor()
> that is a string object. This usage of _UID is also deprecated since it
> can preclude an OSPM from being able to match a processor to a
> non-enumerable device, such as those defined in the MADT. From ACPI
> Specification 6.3 onward, all processor objects for all architectures
> except Itanium must now use Device() objects with an _HID of ACPI0007,
> and use only integer _UID values."
>
> Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
>
> Duplicate descriptions are not allowed, the ACPI processor driver already
> parses the UID from both devices and containers. acpi_processor_get_info()
> returns an error if the UID exists twice in the DSDT.

I'm not really sure how the above is related to the actual patch.

> The missing probe for CPUs described as packages

It is unclear what exactly is meant by "CPUs described as packages".

From the patch, it looks like those would be Processor() objects
defined under a processor container device.

> creates a problem for
> moving the cpu_register() calls into the acpi_processor driver, as CPUs
> described like this don't get registered, leading to errors from other
> subsystems when they try to add new sysfs entries to the CPU node.
> (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
>
> To fix this, parse the processor container and call acpi_processor_add()
> for each processor that is discovered like this.

Discovered like what?

> The processor container
> handler is added with acpi_scan_add_handler(), so no detach call will
> arrive.

The above requires clarification too.

> Qemu TCG describes CPUs using processor devices in a processor container.
> For more information, see build_cpus_aml() in Qemu hw/acpi/cpu.c and
> https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-container-device
>
> Signed-off-by: James Morse <james.morse@arm.com>
> Tested-by: Miguel Luis <miguel.luis@oracle.com>
> Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> ---
> Outstanding comments:
>  https://lore.kernel.org/r/20230914145353.000072e2@Huawei.com
>  https://lore.kernel.org/r/50571c2f-aa3c-baeb-3add-cd59e0eddc02@redhat.com
> ---
>  drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> index 4fe2ef54088c..6a542e0ce396 100644
> --- a/drivers/acpi/acpi_processor.c
> +++ b/drivers/acpi/acpi_processor.c
> @@ -626,9 +626,31 @@ static struct acpi_scan_handler processor_handler = {
>         },
>  };
>
> +static acpi_status acpi_processor_container_walk(acpi_handle handle,
> +                                                u32 lvl,
> +                                                void *context,
> +                                                void **rv)
> +{
> +       struct acpi_device *adev;
> +       acpi_status status;
> +
> +       adev = acpi_get_acpi_dev(handle);
> +       if (!adev)
> +               return AE_ERROR;

Why is the reference counting needed here?

Wouldn't acpi_fetch_acpi_dev() suffice?

Also, should the walk really be terminated on the first error?

> +
> +       status = acpi_processor_add(adev, &processor_device_ids[0]);
> +       acpi_put_acpi_dev(adev);
> +
> +       return status;
> +}
> +
>  static int acpi_processor_container_attach(struct acpi_device *dev,
>                                            const struct acpi_device_id *id)
>  {
> +       acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
> +                           ACPI_UINT32_MAX, acpi_processor_container_walk,
> +                           NULL, NULL, NULL);

This covers processor objects only, so why is this not needed for
processor devices defined under a processor container object?

It is not obvious, so it would be nice to add a comment explaining the
difference.

> +
>         return 1;
>  }
>
> --
  
Russell King (Oracle) Jan. 9, 2024, 3:49 p.m. UTC | #4
On Mon, Dec 18, 2023 at 09:17:34PM +0100, Rafael J. Wysocki wrote:
> On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> >
> > From: James Morse <james.morse@arm.com>
> >
> > ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> > 5.2.12:
> >
> > "Starting with ACPI Specification 6.3, the use of the Processor() object
> > was deprecated. Only legacy systems should continue with this usage. On
> > the Itanium architecture only, a _UID is provided for the Processor()
> > that is a string object. This usage of _UID is also deprecated since it
> > can preclude an OSPM from being able to match a processor to a
> > non-enumerable device, such as those defined in the MADT. From ACPI
> > Specification 6.3 onward, all processor objects for all architectures
> > except Itanium must now use Device() objects with an _HID of ACPI0007,
> > and use only integer _UID values."
> >
> > Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
> >
> > Duplicate descriptions are not allowed, the ACPI processor driver already
> > parses the UID from both devices and containers. acpi_processor_get_info()
> > returns an error if the UID exists twice in the DSDT.
> 
> I'm not really sure how the above is related to the actual patch.
> 
> > The missing probe for CPUs described as packages
> 
> It is unclear what exactly is meant by "CPUs described as packages".
> 
> From the patch, it looks like those would be Processor() objects
> defined under a processor container device.
> 
> > creates a problem for
> > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > described like this don't get registered, leading to errors from other
> > subsystems when they try to add new sysfs entries to the CPU node.
> > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> >
> > To fix this, parse the processor container and call acpi_processor_add()
> > for each processor that is discovered like this.
> 
> Discovered like what?
> 
> > The processor container
> > handler is added with acpi_scan_add_handler(), so no detach call will
> > arrive.
> 
> The above requires clarification too.

The above comments... yea. As I didn't write the commit description, but
James did, and James has basically vanished, I don't think these can be
answered, short of rewriting the entire commit message, with me spending
a lot of time with the ACPI specification trying to get the terminology
right - because at lot of the above on the face of it seems to be things
to do with wrong terminology being used.

I wasn't expecting this level of issues with this patch set, and I now
feel completely out of my depth with this series. I'm wondering whether
I should even continue with it, since I don't have the ACPI knowledge
to address a lot of these comments.
  
Rafael J. Wysocki Jan. 9, 2024, 4:05 p.m. UTC | #5
On Tue, Jan 9, 2024 at 4:49 PM Russell King (Oracle)
<linux@armlinux.org.uk> wrote:
>
> On Mon, Dec 18, 2023 at 09:17:34PM +0100, Rafael J. Wysocki wrote:
> > On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > >
> > > From: James Morse <james.morse@arm.com>
> > >
> > > ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> > > 5.2.12:
> > >
> > > "Starting with ACPI Specification 6.3, the use of the Processor() object
> > > was deprecated. Only legacy systems should continue with this usage. On
> > > the Itanium architecture only, a _UID is provided for the Processor()
> > > that is a string object. This usage of _UID is also deprecated since it
> > > can preclude an OSPM from being able to match a processor to a
> > > non-enumerable device, such as those defined in the MADT. From ACPI
> > > Specification 6.3 onward, all processor objects for all architectures
> > > except Itanium must now use Device() objects with an _HID of ACPI0007,
> > > and use only integer _UID values."
> > >
> > > Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
> > >
> > > Duplicate descriptions are not allowed, the ACPI processor driver already
> > > parses the UID from both devices and containers. acpi_processor_get_info()
> > > returns an error if the UID exists twice in the DSDT.
> >
> > I'm not really sure how the above is related to the actual patch.
> >
> > > The missing probe for CPUs described as packages
> >
> > It is unclear what exactly is meant by "CPUs described as packages".
> >
> > From the patch, it looks like those would be Processor() objects
> > defined under a processor container device.
> >
> > > creates a problem for
> > > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > > described like this don't get registered, leading to errors from other
> > > subsystems when they try to add new sysfs entries to the CPU node.
> > > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> > >
> > > To fix this, parse the processor container and call acpi_processor_add()
> > > for each processor that is discovered like this.
> >
> > Discovered like what?
> >
> > > The processor container
> > > handler is added with acpi_scan_add_handler(), so no detach call will
> > > arrive.
> >
> > The above requires clarification too.
>
> The above comments... yea. As I didn't write the commit description, but
> James did, and James has basically vanished, I don't think these can be
> answered, short of rewriting the entire commit message, with me spending
> a lot of time with the ACPI specification trying to get the terminology
> right - because at lot of the above on the face of it seems to be things
> to do with wrong terminology being used.
>
> I wasn't expecting this level of issues with this patch set, and I now
> feel completely out of my depth with this series. I'm wondering whether
> I should even continue with it, since I don't have the ACPI knowledge
> to address a lot of these comments.

Well, sorry about this.

I met James at the LPC last year, so he seems to be still around, in
some way at least..
  
Russell King (Oracle) Jan. 9, 2024, 4:13 p.m. UTC | #6
On Tue, Jan 09, 2024 at 05:05:15PM +0100, Rafael J. Wysocki wrote:
> On Tue, Jan 9, 2024 at 4:49 PM Russell King (Oracle)
> <linux@armlinux.org.uk> wrote:
> >
> > On Mon, Dec 18, 2023 at 09:17:34PM +0100, Rafael J. Wysocki wrote:
> > > On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > > >
> > > > From: James Morse <james.morse@arm.com>
> > > >
> > > > ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> > > > 5.2.12:
> > > >
> > > > "Starting with ACPI Specification 6.3, the use of the Processor() object
> > > > was deprecated. Only legacy systems should continue with this usage. On
> > > > the Itanium architecture only, a _UID is provided for the Processor()
> > > > that is a string object. This usage of _UID is also deprecated since it
> > > > can preclude an OSPM from being able to match a processor to a
> > > > non-enumerable device, such as those defined in the MADT. From ACPI
> > > > Specification 6.3 onward, all processor objects for all architectures
> > > > except Itanium must now use Device() objects with an _HID of ACPI0007,
> > > > and use only integer _UID values."
> > > >
> > > > Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
> > > >
> > > > Duplicate descriptions are not allowed, the ACPI processor driver already
> > > > parses the UID from both devices and containers. acpi_processor_get_info()
> > > > returns an error if the UID exists twice in the DSDT.
> > >
> > > I'm not really sure how the above is related to the actual patch.
> > >
> > > > The missing probe for CPUs described as packages
> > >
> > > It is unclear what exactly is meant by "CPUs described as packages".
> > >
> > > From the patch, it looks like those would be Processor() objects
> > > defined under a processor container device.
> > >
> > > > creates a problem for
> > > > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > > > described like this don't get registered, leading to errors from other
> > > > subsystems when they try to add new sysfs entries to the CPU node.
> > > > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> > > >
> > > > To fix this, parse the processor container and call acpi_processor_add()
> > > > for each processor that is discovered like this.
> > >
> > > Discovered like what?
> > >
> > > > The processor container
> > > > handler is added with acpi_scan_add_handler(), so no detach call will
> > > > arrive.
> > >
> > > The above requires clarification too.
> >
> > The above comments... yea. As I didn't write the commit description, but
> > James did, and James has basically vanished, I don't think these can be
> > answered, short of rewriting the entire commit message, with me spending
> > a lot of time with the ACPI specification trying to get the terminology
> > right - because at lot of the above on the face of it seems to be things
> > to do with wrong terminology being used.
> >
> > I wasn't expecting this level of issues with this patch set, and I now
> > feel completely out of my depth with this series. I'm wondering whether
> > I should even continue with it, since I don't have the ACPI knowledge
> > to address a lot of these comments.
> 
> Well, sorry about this.
> 
> I met James at the LPC last year, so he seems to be still around, in
> some way at least..

On the previous posting, I wanted James to comment on some of the
feedback from Jonathan, and despite explicitly asking, there has been
nothing but radio silence ever since James' last post of this series.

So, I now deem this work to be completely dead in the water, and not
going to happen - not unless others can input on your comments.
  
Jonathan Cameron Jan. 11, 2024, 4:17 p.m. UTC | #7
On Tue, 9 Jan 2024 16:13:21 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Tue, Jan 09, 2024 at 05:05:15PM +0100, Rafael J. Wysocki wrote:
> > On Tue, Jan 9, 2024 at 4:49 PM Russell King (Oracle)
> > <linux@armlinux.org.uk> wrote:  
> > >
> > > On Mon, Dec 18, 2023 at 09:17:34PM +0100, Rafael J. Wysocki wrote:  
> > > > On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:  
> > > > >
> > > > > From: James Morse <james.morse@arm.com>
> > > > >
> > > > > ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> > > > > 5.2.12:
> > > > >
> > > > > "Starting with ACPI Specification 6.3, the use of the Processor() object
> > > > > was deprecated. Only legacy systems should continue with this usage. On
> > > > > the Itanium architecture only, a _UID is provided for the Processor()
> > > > > that is a string object. This usage of _UID is also deprecated since it
> > > > > can preclude an OSPM from being able to match a processor to a
> > > > > non-enumerable device, such as those defined in the MADT. From ACPI
> > > > > Specification 6.3 onward, all processor objects for all architectures
> > > > > except Itanium must now use Device() objects with an _HID of ACPI0007,
> > > > > and use only integer _UID values."
> > > > >
> > > > > Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
> > > > >
> > > > > Duplicate descriptions are not allowed, the ACPI processor driver already
> > > > > parses the UID from both devices and containers. acpi_processor_get_info()
> > > > > returns an error if the UID exists twice in the DSDT.  
> > > >
> > > > I'm not really sure how the above is related to the actual patch.
> > > >  
> > > > > The missing probe for CPUs described as packages  
> > > >
> > > > It is unclear what exactly is meant by "CPUs described as packages".
> > > >
> > > > From the patch, it looks like those would be Processor() objects
> > > > defined under a processor container device.
> > > >  
> > > > > creates a problem for
> > > > > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > > > > described like this don't get registered, leading to errors from other
> > > > > subsystems when they try to add new sysfs entries to the CPU node.
> > > > > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> > > > >
> > > > > To fix this, parse the processor container and call acpi_processor_add()
> > > > > for each processor that is discovered like this.  
> > > >
> > > > Discovered like what?
> > > >  
> > > > > The processor container
> > > > > handler is added with acpi_scan_add_handler(), so no detach call will
> > > > > arrive.  
> > > >
> > > > The above requires clarification too.  
> > >
> > > The above comments... yea. As I didn't write the commit description, but
> > > James did, and James has basically vanished, I don't think these can be
> > > answered, short of rewriting the entire commit message, with me spending
> > > a lot of time with the ACPI specification trying to get the terminology
> > > right - because at lot of the above on the face of it seems to be things
> > > to do with wrong terminology being used.
> > >
> > > I wasn't expecting this level of issues with this patch set, and I now
> > > feel completely out of my depth with this series. I'm wondering whether
> > > I should even continue with it, since I don't have the ACPI knowledge
> > > to address a lot of these comments.  
> > 
> > Well, sorry about this.
> > 
> > I met James at the LPC last year, so he seems to be still around, in
> > some way at least..  
> 
> On the previous posting, I wanted James to comment on some of the
> feedback from Jonathan, and despite explicitly asking, there has been
> nothing but radio silence ever since James' last post of this series.
> 
> So, I now deem this work to be completely dead in the water, and not
> going to happen - not unless others can input on your comments.
> 
I'll take another pass at this and see which comments I can resolve.
Will need a few additional test setups so may take a few days.

So far I've established that QEMU uses Processor for x86 and
ACPI0007 for arm64.  Goody, at least that simplifies testing
the various options.

Jonathan
  
Jonathan Cameron Jan. 11, 2024, 5:59 p.m. UTC | #8
On Mon, 18 Dec 2023 21:17:34 +0100
"Rafael J. Wysocki" <rafael@kernel.org> wrote:

> On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinuxorg.uk> wrote:
> >
> > From: James Morse <james.morse@arm.com>

Done some digging + machine faking.  This is mid stage results at best.

Summary: I don't think this patch is necessary.  If anyone happens to be in
the mood for testing on various platforms, can you drop this patch and
see if everything still works.

With this patch in place, and a processor container containing
Processor() objects acpi_process_add is called twice - once via
the path added here and once via acpi_bus_attach etc.

Maybe it's a left over from earlier approaches to some of this?


> >
> > ACPI has two ways of describing processors in the DSDT. From ACPI v6.5,
> > 5.2.12:
> >
> > "Starting with ACPI Specification 6.3, the use of the Processor() object
> > was deprecated. Only legacy systems should continue with this usage. On
> > the Itanium architecture only, a _UID is provided for the Processor()
> > that is a string object. This usage of _UID is also deprecated since it
> > can preclude an OSPM from being able to match a processor to a
> > non-enumerable device, such as those defined in the MADT. From ACPI
> > Specification 6.3 onward, all processor objects for all architectures
> > except Itanium must now use Device() objects with an _HID of ACPI0007,
> > and use only integer _UID values."

Well, we definitely don't care about Itanium any more so most of this is irrelevant
and can be scrubbed going forwards!

Otherwise I think we only care about Device() and Processor() being two things
that might be seen to describe CPUs and they may or may not be in a
Processor container.

> >
> > Also see https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#declaring-processors
> >
> > Duplicate descriptions are not allowed, the ACPI processor driver already
> > parses the UID from both devices and containers. acpi_processor_get_info()
> > returns an error if the UID exists twice in the DSDT.  
> 
> I'm not really sure how the above is related to the actual patch.

This is nasty.  They key is that with this patch in place, we are actually
adding them twice if they are are instantiated via Processor() in a processor
container.  So this reference is explaining why we don't get two lots registered.

This patch should call out explicitly why we want to do it twice
(I'm assuming on a temporary baseis).

> 
> > The missing probe for CPUs described as packages  
> 
> It is unclear what exactly is meant by "CPUs described as packages".
> 
> From the patch, it looks like those would be Processor() objects
> defined under a processor container device.
Agreed.

> 
> > creates a problem for
> > moving the cpu_register() calls into the acpi_processor driver, as CPUs
> > described like this don't get registered, leading to errors from other
> > subsystems when they try to add new sysfs entries to the CPU node.
> > (e.g. topology_sysfs_init()'s use of topology_add_dev() via cpuhp)
> >
> > To fix this, parse the processor container and call acpi_processor_add()
> > for each processor that is discovered like this.  
> 
> Discovered like what?
Doesn't add any info.

"To fix this, parse the processor container and call acpi_processor_add() for
each processor found."

> 
> > The processor container
> > handler is added with acpi_scan_add_handler(), so no detach call will
> > arrive.  
> 
> The above requires clarification too.
> 
> > Qemu TCG describes CPUs using processor devices in a processor container.

Hmm. This isn't so clear cut.

For ARM it does it nicely with ACPI0007 etc. For x86 it is still
Processor() under some circumstances... (why exactly doesn't matter here
- it's all legacy mess).

To poke this I hacked the arm virt qemu platform to use Processor() in a
container so I could like for like comparisons.

The logic that injects a HID into Processor() objects means the existing
handlers get fired without this patch.  I'm going to assume that might
not be the case later in this patch set, but I've not found where it
is broken yet :(


> > For more information, see build_cpus_aml() in Qemu hw/acpi/cpu.c and
> > https://uefi.org/specs/ACPI/6.5/08_Processor_Configuration_and_Control.html#processor-container-device
> >
> > Signed-off-by: James Morse <james.morse@arm.com>
> > Tested-by: Miguel Luis <miguel.luis@oracle.com>
> > Tested-by: Vishnu Pajjuri <vishnu@os.amperecomputing.com>
> > Tested-by: Jianyong Wu <jianyong.wu@arm.com>
> > ---
> > Outstanding comments:
> >  https://lore.kernel.org/r/20230914145353.000072e2@Huawei.com
> >  https://lore.kernel.org/r/50571c2f-aa3c-baeb-3add-cd59e0eddc02@redhat.com
> > ---
> >  drivers/acpi/acpi_processor.c | 22 ++++++++++++++++++++++
> >  1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
> > index 4fe2ef54088c..6a542e0ce396 100644
> > --- a/drivers/acpi/acpi_processor.c
> > +++ b/drivers/acpi/acpi_processor.c
> > @@ -626,9 +626,31 @@ static struct acpi_scan_handler processor_handler = {
> >         },
> >  };
> >
> > +static acpi_status acpi_processor_container_walk(acpi_handle handle,
> > +                                                u32 lvl,
> > +                                                void *context,
> > +                                                void **rv)
> > +{
> > +       struct acpi_device *adev;
> > +       acpi_status status;
> > +
> > +       adev = acpi_get_acpi_dev(handle);
> > +       if (!adev)
> > +               return AE_ERROR;  
> 
> Why is the reference counting needed here?
> 
> Wouldn't acpi_fetch_acpi_dev() suffice?
You are the expert here :)  I can't see why the reference is needed
so would be fine with dropping it.

> 
> Also, should the walk really be terminated on the first error?

If this patch makes sense things will probably blow up later but no
worse than before so sure, keep going.

> 
> > +
> > +       status = acpi_processor_add(adev, &processor_device_ids[0]);
> > +       acpi_put_acpi_dev(adev);
> > +
> > +       return status;
> > +}
> > +
> >  static int acpi_processor_container_attach(struct acpi_device *dev,
> >                                            const struct acpi_device_id *id)
> >  {
> > +       acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
> > +                           ACPI_UINT32_MAX, acpi_processor_container_walk,
> > +                           NULL, NULL, NULL);  
> 
> This covers processor objects only, so why is this not needed for
> processor devices defined under a processor container object?

Both cases are covered by the existing handling without this.

I'm far from clear on why we need this patch.  Presumably
it's the reference in the description on it breaking for
Processor Package containing Processor() objects that matters
after a move... I'm struggling to find that move though!



> 
> It is not obvious, so it would be nice to add a comment explaining the
> difference.
> 
> > +
> >         return 1;
> >  }
> >
> > --  
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
  
Russell King (Oracle) Jan. 11, 2024, 6:46 p.m. UTC | #9
On Thu, Jan 11, 2024 at 05:59:08PM +0000, Jonathan Cameron wrote:
> On Mon, 18 Dec 2023 21:17:34 +0100
> "Rafael J. Wysocki" <rafael@kernel.org> wrote:
> 
> > On Wed, Dec 13, 2023 at 1:49 PM Russell King <rmk+kernel@armlinux.org.uk> wrote:
> > >
> > > From: James Morse <james.morse@arm.com>
> 
> Done some digging + machine faking.  This is mid stage results at best.
> 
> Summary: I don't think this patch is necessary.  If anyone happens to be in
> the mood for testing on various platforms, can you drop this patch and
> see if everything still works.
> 
> With this patch in place, and a processor container containing
> Processor() objects acpi_process_add is called twice - once via
> the path added here and once via acpi_bus_attach etc.
> 
> Maybe it's a left over from earlier approaches to some of this?

From what you're saying, it seems that way. It would be really good to
get a reply from James to see whether he agrees - or at least get the
reason why this patch is in the series... but I suspect that will never
come.

> Both cases are covered by the existing handling without this.
> 
> I'm far from clear on why we need this patch.  Presumably
> it's the reference in the description on it breaking for
> Processor Package containing Processor() objects that matters
> after a move... I'm struggling to find that move though!

I do know that James did a lot of testing, so maybe he found some
corner case somewhere which made this necessary - but without input
from James, we can't know that.

So, maybe the right way forward on this is to re-test the series
with this patch dropped, and see whether there's any ill effects.
It should be possible to resurect the patch if it does turn out to
be necessary.

Does that sound like a good way forward?

Thanks.
  

Patch

diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c
index 4fe2ef54088c..6a542e0ce396 100644
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -626,9 +626,31 @@  static struct acpi_scan_handler processor_handler = {
 	},
 };
 
+static acpi_status acpi_processor_container_walk(acpi_handle handle,
+						 u32 lvl,
+						 void *context,
+						 void **rv)
+{
+	struct acpi_device *adev;
+	acpi_status status;
+
+	adev = acpi_get_acpi_dev(handle);
+	if (!adev)
+		return AE_ERROR;
+
+	status = acpi_processor_add(adev, &processor_device_ids[0]);
+	acpi_put_acpi_dev(adev);
+
+	return status;
+}
+
 static int acpi_processor_container_attach(struct acpi_device *dev,
 					   const struct acpi_device_id *id)
 {
+	acpi_walk_namespace(ACPI_TYPE_PROCESSOR, dev->handle,
+			    ACPI_UINT32_MAX, acpi_processor_container_walk,
+			    NULL, NULL, NULL);
+
 	return 1;
 }