[1/2] amd_nb: Add PCI ID for family 19h model 78h

Message ID 20230427053338.16653-2-mario.limonciello@amd.com
State New
Headers
Series Regression in s2idle for family 19h model 78h |

Commit Message

Mario Limonciello April 27, 2023, 5:33 a.m. UTC
  s2idle previously worked on this system, but it regressed in kernel
6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
index 0 for driver probe").

The reason for the regression is that before this commit the SMN
communication was hardcoded, but after amd_smn_read() is used which
relies upon the misc PCI ID used by DF function 3 being included in
a table.  The ID was missing for model 78h, so this meant that the
amd_smn_read() wouldn't work.

Add the missing ID into amd_nb, restoring s2idle on this system.

Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 arch/x86/kernel/amd_nb.c | 2 ++
 include/linux/pci_ids.h  | 1 +
 2 files changed, 3 insertions(+)
  

Comments

Bjorn Helgaas April 27, 2023, 4:46 p.m. UTC | #1
On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
> s2idle previously worked on this system, but it regressed in kernel
> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
> index 0 for driver probe").
> 
> The reason for the regression is that before this commit the SMN
> communication was hardcoded, but after amd_smn_read() is used which
> relies upon the misc PCI ID used by DF function 3 being included in
> a table.  The ID was missing for model 78h, so this meant that the
> amd_smn_read() wouldn't work.
> 
> Add the missing ID into amd_nb, restoring s2idle on this system.

Is there a long-term solution for this that will not require adding
new IDs every time new hardware comes out?

drivers/platform/x86/amd/pmc.c already matches ACPI IDs; maybe there's
some way for the platform to provide the information you need via
ACPI or something?

Bjorn
  
Bjorn Helgaas April 27, 2023, 4:48 p.m. UTC | #2
On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
> s2idle previously worked on this system, but it regressed in kernel
> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
> index 0 for driver probe").
> 
> The reason for the regression is that before this commit the SMN
> communication was hardcoded, but after amd_smn_read() is used which
> relies upon the misc PCI ID used by DF function 3 being included in
> a table.  The ID was missing for model 78h, so this meant that the
> amd_smn_read() wouldn't work.
> 
> Add the missing ID into amd_nb, restoring s2idle on this system.
> 
> Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

Grudgingly, because this really isn't a maintainable strategy:

Acked-by: Bjorn Helgaas <bhelgaas@google.com>  # pci_ids.h

> ---
>  arch/x86/kernel/amd_nb.c | 2 ++
>  include/linux/pci_ids.h  | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
> index 4266b64631a4..7e331e8f3692 100644
> --- a/arch/x86/kernel/amd_nb.c
> +++ b/arch/x86/kernel/amd_nb.c
> @@ -36,6 +36,7 @@
>  #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
>  #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
>  #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
>  
>  /* Protect the PCI config register pairs used for SMN. */
>  static DEFINE_MUTEX(smn_mutex);
> @@ -79,6 +80,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
>  	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
>  	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
>  	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
> +	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
>  	{}
>  };
>  
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 45c3d62e616d..95f33dadb2be 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -567,6 +567,7 @@
>  #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
>  #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
>  #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
>  #define PCI_DEVICE_ID_AMD_CNB17H_F3	0x1703
>  #define PCI_DEVICE_ID_AMD_LANCE		0x2000
>  #define PCI_DEVICE_ID_AMD_LANCE_HOME	0x2001
> -- 
> 2.34.1
>
  
Mario Limonciello April 27, 2023, 4:50 p.m. UTC | #3
[Public]

> On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
> > s2idle previously worked on this system, but it regressed in kernel
> > 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
> > index 0 for driver probe").
> >
> > The reason for the regression is that before this commit the SMN
> > communication was hardcoded, but after amd_smn_read() is used which
> > relies upon the misc PCI ID used by DF function 3 being included in
> > a table.  The ID was missing for model 78h, so this meant that the
> > amd_smn_read() wouldn't work.
> >
> > Add the missing ID into amd_nb, restoring s2idle on this system.
> 
> Is there a long-term solution for this that will not require adding
> new IDs every time new hardware comes out?
> 
> drivers/platform/x86/amd/pmc.c already matches ACPI IDs; maybe there's
> some way for the platform to provide the information you need via
> ACPI or something?
> 
> Bjorn

I had the same question when I came up with this and found out the 
ACPI tables don't encode the information needed right now.

But, yes there are discussions ongoing to make this more scalable in
a future generation.
  
Guenter Roeck May 6, 2023, 2:05 p.m. UTC | #4
On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
> s2idle previously worked on this system, but it regressed in kernel
> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
> index 0 for driver probe").
> 
> The reason for the regression is that before this commit the SMN
> communication was hardcoded, but after amd_smn_read() is used which
> relies upon the misc PCI ID used by DF function 3 being included in
> a table.  The ID was missing for model 78h, so this meant that the
> amd_smn_read() wouldn't work.
> 
> Add the missing ID into amd_nb, restoring s2idle on this system.
> 
> Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

FWIW:

Acked-by: Guenter Roeck <linux@roeck-us.net>

Note that this patch is not upstream, meaning the second patch
in the series can not be applied either. I am not sure if that is
because of "regressed in kernel 6.4" - after all, that kernel does not
exist yet. The offending patch _is_ in the upstream kernel, though.
It might make sense to inform the regression bot if the problem is
not fixed when v6.4-rc1 is made available.

Guenter

> ---
>  arch/x86/kernel/amd_nb.c | 2 ++
>  include/linux/pci_ids.h  | 1 +
>  2 files changed, 3 insertions(+)
> 
> diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
> index 4266b64631a4..7e331e8f3692 100644
> --- a/arch/x86/kernel/amd_nb.c
> +++ b/arch/x86/kernel/amd_nb.c
> @@ -36,6 +36,7 @@
>  #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
>  #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
>  #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
>  
>  /* Protect the PCI config register pairs used for SMN. */
>  static DEFINE_MUTEX(smn_mutex);
> @@ -79,6 +80,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
>  	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
>  	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
>  	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
> +	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
>  	{}
>  };
>  
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 45c3d62e616d..95f33dadb2be 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -567,6 +567,7 @@
>  #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
>  #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
>  #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
>  #define PCI_DEVICE_ID_AMD_CNB17H_F3	0x1703
>  #define PCI_DEVICE_ID_AMD_LANCE		0x2000
>  #define PCI_DEVICE_ID_AMD_LANCE_HOME	0x2001
> -- 
> 2.34.1
>
  
Mario Limonciello May 7, 2023, 12:51 p.m. UTC | #5
On 5/6/23 09:05, Guenter Roeck wrote:
> On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
>> s2idle previously worked on this system, but it regressed in kernel
>> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
>> index 0 for driver probe").
>>
>> The reason for the regression is that before this commit the SMN
>> communication was hardcoded, but after amd_smn_read() is used which
>> relies upon the misc PCI ID used by DF function 3 being included in
>> a table.  The ID was missing for model 78h, so this meant that the
>> amd_smn_read() wouldn't work.
>>
>> Add the missing ID into amd_nb, restoring s2idle on this system.
>>
>> Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for driver probe")
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> FWIW:
>
> Acked-by: Guenter Roeck <linux@roeck-us.net>
>
> Note that this patch is not upstream, meaning the second patch
> in the series can not be applied either. I am not sure if that is
> because of "regressed in kernel 6.4" - after all, that kernel does not
> exist yet. The offending patch _is_ in the upstream kernel, though.
> It might make sense to inform the regression bot if the problem is
> not fixed when v6.4-rc1 is made available.

You're right; the commit message should:

s,but it regressed in kernel 6.4 due,but it regressed in,


Boris told me that he's waiting for 6.4-rc1 to pick this series up.

#regzbot ^introduced 310e782a99c7

>
> Guenter
>
>> ---
>>   arch/x86/kernel/amd_nb.c | 2 ++
>>   include/linux/pci_ids.h  | 1 +
>>   2 files changed, 3 insertions(+)
>>
>> diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
>> index 4266b64631a4..7e331e8f3692 100644
>> --- a/arch/x86/kernel/amd_nb.c
>> +++ b/arch/x86/kernel/amd_nb.c
>> @@ -36,6 +36,7 @@
>>   #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
>>   #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
>>   #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
>> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
>>   
>>   /* Protect the PCI config register pairs used for SMN. */
>>   static DEFINE_MUTEX(smn_mutex);
>> @@ -79,6 +80,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
>>   	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
>>   	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
>>   	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
>> +	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
>>   	{}
>>   };
>>   
>> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
>> index 45c3d62e616d..95f33dadb2be 100644
>> --- a/include/linux/pci_ids.h
>> +++ b/include/linux/pci_ids.h
>> @@ -567,6 +567,7 @@
>>   #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
>>   #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
>>   #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
>> +#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
>>   #define PCI_DEVICE_ID_AMD_CNB17H_F3	0x1703
>>   #define PCI_DEVICE_ID_AMD_LANCE		0x2000
>>   #define PCI_DEVICE_ID_AMD_LANCE_HOME	0x2001
>> -- 
>> 2.34.1
>>
  
Borislav Petkov May 8, 2023, 8:52 a.m. UTC | #6
On Thu, Apr 27, 2023 at 11:48:16AM -0500, Bjorn Helgaas wrote:
> Grudgingly, because this really isn't a maintainable strategy:
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>  # pci_ids.h

I was wondering whether that define should even go into pci_ids.h  but
there's a patch 2 which uses it and which is *not* in my mbox.

Mario, please CC everybody on all patches in the future.

I'll take the k10temp one too so that there's no cross-tree deps.

Thx.
  
Thorsten Leemhuis May 8, 2023, 11:13 a.m. UTC | #7
On 07.05.23 14:51, Mario Limonciello wrote:
> On 5/6/23 09:05, Guenter Roeck wrote:
>> On Thu, Apr 27, 2023 at 12:33:36AM -0500, Mario Limonciello wrote:
>>> s2idle previously worked on this system, but it regressed in kernel
>>> 6.4 due to commit 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN
>>> index 0 for driver probe").
>>>
>>> The reason for the regression is that before this commit the SMN
>>> communication was hardcoded, but after amd_smn_read() is used which
>>> relies upon the misc PCI ID used by DF function 3 being included in
>>> a table.  The ID was missing for model 78h, so this meant that the
>>> amd_smn_read() wouldn't work.
>>>
>>> Add the missing ID into amd_nb, restoring s2idle on this system.
>>>
>>> Fixes: 310e782a99c7 ("platform/x86/amd: pmc: Utilize SMN index 0 for
>>> driver probe")
>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> FWIW:
>>
>> Acked-by: Guenter Roeck <linux@roeck-us.net>
>>
>> Note that this patch is not upstream, meaning the second patch
>> in the series can not be applied either. I am not sure if that is
>> because of "regressed in kernel 6.4" - after all, that kernel does not
>> exist yet. The offending patch _is_ in the upstream kernel, though.
>> It might make sense to inform the regression bot if the problem is
>> not fixed when v6.4-rc1 is made available.
> 
> You're right; the commit message should:
> 
> s,but it regressed in kernel 6.4 due,but it regressed in,
> 
> Boris told me that he's waiting for 6.4-rc1 to pick this series up.

Which afaics means that users of -rc1 are now affected by this and might
waste time bisecting a known issue that could easily have been fixed
already. :-/ That doesn't feel right. Or am I missing something?

/me wonders I he should start tracking regressions more closely during
the merge window to catch and prevent situations like this...

> #regzbot ^introduced 310e782a99c7

Thx for adding it.

#regzbot fix: 7d8accfaa0ab65e42

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.
  
Borislav Petkov May 8, 2023, 11:25 a.m. UTC | #8
On Mon, May 08, 2023 at 01:13:14PM +0200, Thorsten Leemhuis wrote:
> Which afaics means that users of -rc1 are now affected by this and might
> waste time bisecting a known issue that could easily have been fixed
> already. :-/ That doesn't feel right. Or am I missing something?

-rc1 is pretty much the most broken tree there is. And it is not an
officially released tree but a, well, the first release candidate. So
fixes are trickling in, there's lag between what gets found, when the
maintainers pick it up and when it ends up upstream and so on and so on.

Some fixes need longer testing because there have been cases where a fix
breaks something else.

And yes, maintainers can always expedite a fix or Linus will pick it up
directly if it breaks a lot of boxes in a nasty way.

So, long story short, I don't think you should track -rcs. You are
tracking the reports already and you're tracking where their fixes land
so I guess that's good enough.

> /me wonders I he should start tracking regressions more closely during
> the merge window to catch and prevent situations like this...

I don't see a "situation" here. rcs can be broken for some use cases and
that is fine as long as that breakage doesn't get released.

Thx.
  
Guenter Roeck May 8, 2023, 1:28 p.m. UTC | #9
On Mon, May 08, 2023 at 01:25:43PM +0200, Borislav Petkov wrote:
> On Mon, May 08, 2023 at 01:13:14PM +0200, Thorsten Leemhuis wrote:
> > Which afaics means that users of -rc1 are now affected by this and might
> > waste time bisecting a known issue that could easily have been fixed
> > already. :-/ That doesn't feel right. Or am I missing something?
> 
> -rc1 is pretty much the most broken tree there is. And it is not an
> officially released tree but a, well, the first release candidate. So
> fixes are trickling in, there's lag between what gets found, when the
> maintainers pick it up and when it ends up upstream and so on and so on.
> 
> Some fixes need longer testing because there have been cases where a fix
> breaks something else.
> 
> And yes, maintainers can always expedite a fix or Linus will pick it up
> directly if it breaks a lot of boxes in a nasty way.
> 
> So, long story short, I don't think you should track -rcs. You are
> tracking the reports already and you're tracking where their fixes land
> so I guess that's good enough.
> 

I absolutely disagree. Without Thorsten's tracking, Linus would have no
idea what the status of the kernel is.

> > /me wonders I he should start tracking regressions more closely during
> > the merge window to catch and prevent situations like this...
> 
> I don't see a "situation" here. rcs can be broken for some use cases and
> that is fine as long as that breakage doesn't get released.
> 

Again, I disagree. The whole point of testing release candidates is to get
problems fixed. If issues are not fixed timely, they just pile up on top
of each other and make it difficult to identify new issues (and, in many
cases, to test the kernel in the first place).

I find it quite annoying that problems are identfied, often even in -next,
the patch intoducing them is applied to mainline anyway, and then
it sometimes takes until -rc5 or even later to get the fix applied (even if
the fix has been known for weeks or even months). It sometimes even takes
Linus' intervention to get fixes applied to the upstream kernel. That
really should not be necessary. Telling those who track regressions
to stay away from release candidates is absolutely the wrong thing to do
and would only serve to make release candidates quite pointless.

v6.4-rc1 is a good example. Fixes for all build breakages were published
before the commit window opened, yet at least one of them did not make
it into -rc1. Together with this patch there are now at least two
regressions if -rc1 whch could have been avoided and may impact testability
on affected systems.

Guenter
  
Borislav Petkov May 8, 2023, 1:44 p.m. UTC | #10
On Mon, May 08, 2023 at 06:28:03AM -0700, Guenter Roeck wrote:
> Together with this patch there are now at least two regressions if
> -rc1 whch could have been avoided and may impact testability on
> affected systems.

Are you saying that this patch which fixes s2idle on some random box
should've gone to Linus *immediately*?

And read my mail again:

"Some fixes need longer testing because there have been cases where
a fix breaks something else."

So yes, I disagree with rushing fixes immediately. If they're obvious
- whatever that means - then sure but not all of them are such.
  
Mario Limonciello May 11, 2023, 7:51 p.m. UTC | #11
[AMD Official Use Only - General]

+stable, Sasha

> > Together with this patch there are now at least two regressions if
> > -rc1 whch could have been avoided and may impact testability on
> > affected systems.
>
> Are you saying that this patch which fixes s2idle on some random box
> should've gone to Linus *immediately*?
>
> And read my mail again:
>
> "Some fixes need longer testing because there have been cases where
> a fix breaks something else."
>
> So yes, I disagree with rushing fixes immediately. If they're obvious
> - whatever that means - then sure but not all of them are such.
>
> --

Unfortunately, it looks like the broken commit got backported into 6.1.28,
but the fix still isn't in Linus' tree.

Sasha,

Can you please pick up
https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
for 6.1.29 to fix the regression?

Thanks,
  
Thorsten Leemhuis May 11, 2023, 8:09 p.m. UTC | #12
On 11.05.23 21:51, Limonciello, Mario wrote:
> [AMD Official Use Only - General]
> 
> +stable, Sasha
> 
>>> Together with this patch there are now at least two regressions if
>>> -rc1 whch could have been avoided and may impact testability on
>>> affected systems.
>>
>> Are you saying that this patch which fixes s2idle on some random box
>> should've gone to Linus *immediately*?
>>
>> And read my mail again:
>>
>> "Some fixes need longer testing because there have been cases where
>> a fix breaks something else."
>>
>> So yes, I disagree with rushing fixes immediately. If they're obvious
>> - whatever that means - then sure but not all of them are such.
>>
>> --
> 
> Unfortunately, it looks like the broken commit got backported into 6.1.28,
> but the fix still isn't in Linus' tree.
> 
> Sasha,
> 
> Can you please pick up
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
> for 6.1.29 to fix the regression?

FWIW, the stable team afaics usually does not fix anything in stable
trees before it's fixed in mainline. IOW: that fix now quickly needs to
go to Linus to get it quickly fixed in 6.1.y.

Side note: I'll soon post a rewritten section of 'Prioritize work on
fixing regressions' which is part of
Documentation/process/handling-regressions.rst. It among others will
cover this case:

```
* Whenever you want to swiftly resolve a regression that recently also
made it into a proper mainline, stable, or longterm release, fix it
quickly in mainline; when appropriate thus involve Linus to fast-track
the fix. That's because the stable team normally does neither revert nor
fix any changes that cause a regression in mainline, too.
```

Ciao, Thorsten
  
Sasha Levin May 11, 2023, 9:50 p.m. UTC | #13
On Thu, May 11, 2023 at 07:51:42PM +0000, Limonciello, Mario wrote:
>[AMD Official Use Only - General]
>
>+stable, Sasha
>
>> > Together with this patch there are now at least two regressions if
>> > -rc1 whch could have been avoided and may impact testability on
>> > affected systems.
>>
>> Are you saying that this patch which fixes s2idle on some random box
>> should've gone to Linus *immediately*?
>>
>> And read my mail again:
>>
>> "Some fixes need longer testing because there have been cases where
>> a fix breaks something else."
>>
>> So yes, I disagree with rushing fixes immediately. If they're obvious
>> - whatever that means - then sure but not all of them are such.
>>
>> --
>
>Unfortunately, it looks like the broken commit got backported into 6.1.28,
>but the fix still isn't in Linus' tree.
>
>Sasha,
>
>Can you please pick up
>https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
>for 6.1.29 to fix the regression?

Happily, once it lands upstream :)
  
Greg KH May 15, 2023, 12:26 p.m. UTC | #14
On Thu, May 11, 2023 at 07:51:42PM +0000, Limonciello, Mario wrote:
> [AMD Official Use Only - General]
> 
> +stable, Sasha
> 
> > > Together with this patch there are now at least two regressions if
> > > -rc1 whch could have been avoided and may impact testability on
> > > affected systems.
> >
> > Are you saying that this patch which fixes s2idle on some random box
> > should've gone to Linus *immediately*?
> >
> > And read my mail again:
> >
> > "Some fixes need longer testing because there have been cases where
> > a fix breaks something else."
> >
> > So yes, I disagree with rushing fixes immediately. If they're obvious
> > - whatever that means - then sure but not all of them are such.
> >
> > --
> 
> Unfortunately, it looks like the broken commit got backported into 6.1.28,
> but the fix still isn't in Linus' tree.
> 
> Sasha,
> 
> Can you please pick up
> https://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git/commit/?h=x86/urgent&id=23a5b8bb022c1e071ca91b1a9c10f0ad6a0966e9
> for 6.1.29 to fix the regression?

Now that this is in Linus's tree, it's queued up, thanks.

greg k-h
  

Patch

diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 4266b64631a4..7e331e8f3692 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -36,6 +36,7 @@ 
 #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F4 0x166e
 #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
 #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
+#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
 
 /* Protect the PCI config register pairs used for SMN. */
 static DEFINE_MUTEX(smn_mutex);
@@ -79,6 +80,7 @@  static const struct pci_device_id amd_nb_misc_ids[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F3) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
+	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
 	{}
 };
 
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 45c3d62e616d..95f33dadb2be 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -567,6 +567,7 @@ 
 #define PCI_DEVICE_ID_AMD_19H_M50H_DF_F3 0x166d
 #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
 #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
+#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
 #define PCI_DEVICE_ID_AMD_CNB17H_F3	0x1703
 #define PCI_DEVICE_ID_AMD_LANCE		0x2000
 #define PCI_DEVICE_ID_AMD_LANCE_HOME	0x2001