[1/2] thunderbolt: Read DROM directly from NVM before trying bit banging

Message ID 20230214154647.874-2-mario.limonciello@amd.com
State New
Series Fix problems fetching TBT3 DROM from AMD USB4 routers

Commit Message

Mario Limonciello Feb. 14, 2023, 3:46 p.m. UTC
Some TBT3 devices have a hard time reliably responding to bit banging
requests correctly when connected to AMD USB4 hosts running Linux.

These problems are not reported in any other CM, and comparing the
implementations the Linux CM is the only one that utilizes bit banging
to access the DROM. Other CM implementations access the DROM directly
from the NVM instead of bit banging.

Adjust the flow to try this on TBT3 devices before resorting to bit
banging.

Cc: stable@vger.kernel.org
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
 drivers/thunderbolt/eeprom.c | 4 ++++
 1 file changed, 4 insertions(+)
  

Comments

Mika Westerberg Feb. 15, 2023, 5:58 a.m. UTC | #1
Hi,

On Tue, Feb 14, 2023 at 09:46:45AM -0600, Mario Limonciello wrote:
> Some TBT3 devices have a hard time reliably responding to bit banging
> requests correctly when connected to AMD USB4 hosts running Linux.
> 
> These problems are not reported in any other CM, and comparing the
> implementations the Linux CM is the only one that utilizes bit banging
> to access the DROM. Other CM implementations access the DROM directly
> from the NVM instead of bit banging.

I'm sure Apple CM uses bitbanging because it is what Andreas reverse
engineered when he added the initial Linux Thunderbolt support ;-) I
guess this is then only Windows CM? The problem with reading NVM directly
is that we may lose things like UUID, so I'm wondering if there is
something else going on.

Can you give some details, like what is the device in question?

> Adjust the flow to try this on TBT3 devices before resorting to bit
> banging.
> 
> Cc: stable@vger.kernel.org
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
>  drivers/thunderbolt/eeprom.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/thunderbolt/eeprom.c b/drivers/thunderbolt/eeprom.c
> index c90d22f56d4e1..d9d9567bb938b 100644
> --- a/drivers/thunderbolt/eeprom.c
> +++ b/drivers/thunderbolt/eeprom.c
> @@ -640,6 +640,10 @@ int tb_drom_read(struct tb_switch *sw)
>  		return 0;
>  	}
>  
> +	/* TBT3 devices have the DROM as part of NVM */
> +	if (tb_drom_copy_nvm(sw, &size) == 0)
> +		goto parse;
> +
>  	res = tb_drom_read_n(sw, 14, (u8 *) &size, 2);
>  	if (res)
>  		return res;
> -- 
> 2.25.1
  
Mario Limonciello Feb. 15, 2023, 6:10 a.m. UTC | #2
On 2/14/23 23:58, Mika Westerberg wrote:
> Hi,
>
> On Tue, Feb 14, 2023 at 09:46:45AM -0600, Mario Limonciello wrote:
>> Some TBT3 devices have a hard time reliably responding to bit banging
>> requests correctly when connected to AMD USB4 hosts running Linux.
>>
>> These problems are not reported in any other CM, and comparing the
>> implementations the Linux CM is the only one that utilizes bit banging
>> to access the DROM. Other CM implementations access the DROM directly
>> from the NVM instead of bit banging.
> I'm sure Apple CM uses bitbanging because it is what Andreas reverse
> engineered when he added the initial Linux Thunderbolt support ;-) I
> guess this is then only Windows CM? The problem with reading NVM directly
> is that we may lose things like UUID, so I'm wondering if there is
> something else going on.

When I say other CMs, maybe I should have specified which ones were
checked :)

The following CMs get the DROM without bit-banging:

Win11 CM (MS inbox)

Win10 CM (AMD)

Pre-OS CM (AMD)

> Can you give some details, like what is the device in question?

It happens with both AR and TR based TBT3 devices connected to AMD USB4
router. It's not any one specific vendor or model, we've seen it across
multiple vendors with a failure rate of about 30%.

With an analyzer connected in between we can see that the connected TBT3
device does respond to the bit banging correctly, but the response is not
making it over to the USB4 router.

It happens with multiple retimer vendors, but it hasn't been checked on a
retimer-less system yet.

>> Adjust the flow to try this on TBT3 devices before resorting to bit
>> banging.
>>
>> Cc: stable@vger.kernel.org
>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>> ---
>>   drivers/thunderbolt/eeprom.c | 4 ++++
>>   1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/thunderbolt/eeprom.c b/drivers/thunderbolt/eeprom.c
>> index c90d22f56d4e1..d9d9567bb938b 100644
>> --- a/drivers/thunderbolt/eeprom.c
>> +++ b/drivers/thunderbolt/eeprom.c
>> @@ -640,6 +640,10 @@ int tb_drom_read(struct tb_switch *sw)
>>   		return 0;
>>   	}
>>   
>> +	/* TBT3 devices have the DROM as part of NVM */
>> +	if (tb_drom_copy_nvm(sw, &size) == 0)
>> +		goto parse;
>> +
>>   	res = tb_drom_read_n(sw, 14, (u8 *) &size, 2);
>>   	if (res)
>>   		return res;
>> -- 
>> 2.25.1

I guess something else that might be less detrimental than the loss of UUID
by reading DROM this way would be to only read DROM this way if any CRC
failed.
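
To illustrate the CRC-gated ordering suggested above, here is a minimal
userspace sketch, not driver code. It assumes a simplified DROM layout in
which byte 0 holds a CRC-8 over the 8 UID bytes that follow, and it assumes
the CRC parameters (polynomial 0x07, initial value 0xff). The function
drom_read_with_fallback(), the reader callback type and the demo_* helpers
are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DROM_UID_LEN 8	/* assumed: 8 UID bytes follow the CRC byte */

/* CRC-8 with polynomial 0x07 and initial value 0xff (assumed parameters). */
static uint8_t crc8(const uint8_t *data, size_t len)
{
	uint8_t val = 0xff;

	for (size_t i = 0; i < len; i++) {
		val ^= data[i];
		for (int j = 0; j < 8; j++)
			val = (uint8_t)((val << 1) ^ ((val & 0x80) ? 0x07 : 0));
	}
	return val;
}

/* Returns 1 if byte 0 matches the CRC-8 of the UID bytes after it. */
static int drom_uid_crc_ok(const uint8_t *drom)
{
	return crc8(drom + 1, DROM_UID_LEN) == drom[0];
}

/* Hypothetical reader callback: fills buf and returns 0 on success. */
typedef int (*drom_reader_fn)(uint8_t *buf, size_t len);

enum drom_source { DROM_FROM_BITBANG, DROM_FROM_NVM, DROM_FAILED };

/*
 * Try bit banging first; use the NVM copy only if the bit-banged
 * read failed or did not pass the CRC check.
 */
static enum drom_source drom_read_with_fallback(drom_reader_fn bitbang,
						drom_reader_fn nvm,
						uint8_t *buf, size_t len)
{
	assert(buf && len >= 1 + DROM_UID_LEN);

	if (bitbang(buf, len) == 0 && drom_uid_crc_ok(buf))
		return DROM_FROM_BITBANG;
	if (nvm(buf, len) == 0 && drom_uid_crc_ok(buf))
		return DROM_FROM_NVM;
	return DROM_FAILED;
}

/* Demo reader that returns a DROM with a valid UID CRC. */
static int demo_read_good(uint8_t *buf, size_t len)
{
	memset(buf, 0, len);
	for (size_t i = 0; i < DROM_UID_LEN; i++)
		buf[1 + i] = (uint8_t)(i + 1);
	buf[0] = crc8(buf + 1, DROM_UID_LEN);
	return 0;
}

/* Demo reader that "succeeds" but delivers one flipped bit. */
static int demo_read_corrupt(uint8_t *buf, size_t len)
{
	demo_read_good(buf, len);
	buf[1] ^= 0x10;
	return 0;
}
```

With this ordering the NVM path is only taken when the bit-banged data is
unusable, so devices that bit bang reliably would keep their current
behavior.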
  
Mika Westerberg Feb. 15, 2023, 6:24 a.m. UTC | #3
Hi,

On Wed, Feb 15, 2023 at 12:10:54AM -0600, Mario Limonciello wrote:
> 
> On 2/14/23 23:58, Mika Westerberg wrote:
> > Hi,
> > 
> > On Tue, Feb 14, 2023 at 09:46:45AM -0600, Mario Limonciello wrote:
> > > Some TBT3 devices have a hard time reliably responding to bit banging
> > > requests correctly when connected to AMD USB4 hosts running Linux.
> > > 
> > > These problems are not reported in any other CM, and comparing the
> > > implementations the Linux CM is the only one that utilizes bit banging
> > > to access the DROM. Other CM implementations access the DROM directly
> > > from the NVM instead of bit banging.
> > I'm sure Apple CM uses bitbanging because it is what Andreas reverse
> > engineered when he added the initial Linux Thunderbolt support ;-) I
> > guess this is then only Windows CM? The problem with reading NVM directly
> > is that we may lose things like UUID, so I'm wondering if there is
> > something else going on.
> 
> When I say other CMs, maybe I should have specified which ones were checked
> :)
> 
> The following CMs get the DROM without bit-banging:
> 
> Win11 CM (MS inbox)
> 
> Win10 CM (AMD)
> 
> Pre-OS CM (AMD)

Okay that's good to know :) I think you may want to mention this in the
commit log too.

> > Can you give some details, like what is the device in question?
> 
> It happens with both AR and TR based TBT3 devices connected to AMD USB4
> router. It's not any one specific vendor or model, we've seen it across
> multiple vendors with a failure rate of about 30%.
> 
> With an analyzer connected in between we can see that the connected TBT3
> device does respond to the bit banging correctly, but the response is not
> making it over to the USB4 router.

I see.

> It happens with multiple retimer vendors, but it hasn't been checked on a
> retimer-less system yet.
> 
> > > Adjust the flow to try this on TBT3 devices before resorting to bit
> > > banging.
> > > 
> > > Cc: stable@vger.kernel.org
> > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> > > ---
> > >   drivers/thunderbolt/eeprom.c | 4 ++++
> > >   1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/drivers/thunderbolt/eeprom.c b/drivers/thunderbolt/eeprom.c
> > > index c90d22f56d4e1..d9d9567bb938b 100644
> > > --- a/drivers/thunderbolt/eeprom.c
> > > +++ b/drivers/thunderbolt/eeprom.c
> > > @@ -640,6 +640,10 @@ int tb_drom_read(struct tb_switch *sw)
> > >   		return 0;
> > >   	}
> > > +	/* TBT3 devices have the DROM as part of NVM */
> > > +	if (tb_drom_copy_nvm(sw, &size) == 0)
> > > +		goto parse;
> > > +
> > >   	res = tb_drom_read_n(sw, 14, (u8 *) &size, 2);
> > >   	if (res)
> > >   		return res;
> > > -- 
> > > 2.25.1
> 
> I guess something else that might be less detrimental than the loss of UUID
> by reading DROM this way would be to only read DROM this way if any CRC
> failed.

Actually we do read UUID for TBT3 devices from link controller registers
(see tb_lc_read_uuid()) instead so I think perhaps we can limit the
bitbanging just for older TBT devices with no LC or something like that?
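
A rough userspace sketch of that split, purely illustrative and not the
driver's code. It assumes a device can be described by a single "has a link
controller" flag, and that devices without an LC take their UUID from the
DROM. The struct demo_switch, struct read_plan and plan_reads() are
hypothetical names invented for the example.

```c
#include <assert.h>
#include <stdbool.h>

enum drom_method {
	DROM_BITBANG_ONLY,	/* older devices: keep the current flow */
	DROM_NVM_THEN_BITBANG,	/* devices with LC: try the NVM copy first */
};

enum uuid_source {
	UUID_FROM_DROM,		/* assumed source when there is no LC */
	UUID_FROM_LC,		/* link controller registers */
};

/* Hypothetical, simplified view of a connected device. */
struct demo_switch {
	bool has_lc;
};

struct read_plan {
	enum drom_method drom;
	enum uuid_source uuid;
};

/*
 * Pick how to fetch the DROM and where the UUID comes from.
 * The UUID source depends only on whether an LC is present, never on
 * which DROM read path ends up succeeding, so the UUID stays stable.
 */
static struct read_plan plan_reads(const struct demo_switch *sw)
{
	struct read_plan plan;

	assert(sw);
	if (sw->has_lc) {
		plan.drom = DROM_NVM_THEN_BITBANG;
		plan.uuid = UUID_FROM_LC;
	} else {
		plan.drom = DROM_BITBANG_ONLY;
		plan.uuid = UUID_FROM_DROM;
	}
	return plan;
}
```

The point of keeping the two decisions in one place is that changing the
DROM read path for LC-capable devices does not touch where their UUID is
taken from.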
  
Mika Westerberg Feb. 15, 2023, 7:26 a.m. UTC | #4
On Wed, Feb 15, 2023 at 08:24:39AM +0200, Mika Westerberg wrote:
> > I guess something else that might be less detrimental than the loss of UUID
> > by reading DROM this way would be to only read DROM this way if any CRC
> > failed.
> 
> Actually we do read UUID for TBT3 devices from link controller registers
> (see tb_lc_read_uuid()) instead so I think perhaps we can limit the
> bitbanging just for older TBT devices with no LC or something like that?

Just make sure UUID stays the same so that users don't need to
re-authorize their devices.
  

Patch

diff --git a/drivers/thunderbolt/eeprom.c b/drivers/thunderbolt/eeprom.c
index c90d22f56d4e1..d9d9567bb938b 100644
--- a/drivers/thunderbolt/eeprom.c
+++ b/drivers/thunderbolt/eeprom.c
@@ -640,6 +640,10 @@  int tb_drom_read(struct tb_switch *sw)
 		return 0;
 	}
 
+	/* TBT3 devices have the DROM as part of NVM */
+	if (tb_drom_copy_nvm(sw, &size) == 0)
+		goto parse;
+
 	res = tb_drom_read_n(sw, 14, (u8 *) &size, 2);
 	if (res)
 		return res;