[net-next,RFC,0/2] net: phy: aquantia: fix system interface provision

Message ID 20240213182415.17223-1-ansuelsmth@gmail.com

Message

Christian Marangi Feb. 13, 2024, 6:24 p.m. UTC
Posting this as an RFC, as I think it requires some discussion.

There is currently a problem: OEMs often provision the Aquantia FW with
random and wrong data that may be valid for one board but not for
another. At the same time, OEMs use the same broken FW on multiple
boards and apply fixups at runtime.

This is the common case for AQR112, where downstream hack patches (in
U-Boot, the OEM SDK, and OpenWrt, to get the port working correctly) are
used to fix up the broken system interface provision from the FW.

The downstream patches do one simple thing: they set up the SERDES
startup rate (which the FW may wrongly leave uninitialized) and overwrite
the global system config for each rate with default values for the
requested PHY interface.
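
A minimal sketch of what such a fixup amounts to, written in Python
against a simulated register map rather than real MDIO access. The
register addresses, field masks and mode codes are placeholders invented
for illustration, not values from the datasheet or from the patch, and
using one SERDES mode for every line rate is a simplification:

```python
# Sketch only. Register addresses, the mode field layout and the mode codes
# are placeholders for illustration, not datasheet or driver values.

STARTUP_RATE_REG = 0x0100                  # hypothetical
STARTUP_RATE_MASK = 0x000F                 # hypothetical
RATE_CFG_REGS = {                          # hypothetical, one per line rate
    "10M": 0x0110, "100M": 0x0111, "1G": 0x0112,
    "2.5G": 0x0113, "5G": 0x0114, "10G": 0x0115,
}
SERDES_MODE_MASK = 0x0007                  # hypothetical

# Hypothetical defaults per requested PHY interface:
# (startup rate code, SERDES mode code used for every line rate)
DEFAULTS = {
    "usxgmii": (0x3, 0x3),
    "10gbase-r": (0x3, 0x0),
}


class FakePhy:
    """Stand-in for MDIO access to one MMD; unset registers read as 0."""

    def __init__(self, regs=None):
        self.regs = dict(regs or {})

    def read(self, reg):
        return self.regs.get(reg, 0)

    def modify(self, reg, mask, value):
        """Read-modify-write: replace only the bits selected by mask."""
        self.regs[reg] = (self.read(reg) & ~mask & 0xFFFF) | (value & mask)


def force_system_interface(phy, interface):
    """Overwrite whatever the FW provisioned with the defaults above."""
    startup_rate, serdes_mode = DEFAULTS[interface]
    phy.modify(STARTUP_RATE_REG, STARTUP_RATE_MASK, startup_rate)
    for reg in RATE_CFG_REGS.values():
        phy.modify(reg, SERDES_MODE_MASK, serdes_mode)
```

The read-modify-write keeps every bit outside the mode field as the FW
left it, so only the system interface selection is overridden.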

Now setting the SERDES startup value is SAFE, and this can be implemented
right away.

Overwriting the SERDES modes for each rate, though, raises the question
of whether this is correct or wrong.

The reality is that probably every user of an Aquantia PHY makes use of
the SDK in one way or another and has this patch in use, so any
provision in the FW is ignored (since the default values are always
applied at runtime). That makes introducing this change safe, and it
restores correct functionality of AQR112 when a broken FW is loaded.

As said in the commit description, one thing this handles is the problem
where the FW is provisioned with 10GBASE-R while the MAC supports and
expects USXGMII.

The AQR PHY can correctly switch from one mode to another, and I think
this is the most common case where a FW is broken.
This might be the safest change, but again it would not give us 100%
confidence that the values provisioned by the FW are correct.

Another idea might be adding a property like
"aquantia,broken-system-interface-provision" and, with that enabled, we
would overwrite the values with the default ones.

Christian Marangi (2):
  net: phy: aquantia: setup interface protocols for AQR112
  net: phy: aquantia: add AQR112C and AQR112R PHY ID

 drivers/net/phy/aquantia/aquantia.h      |  17 +++
 drivers/net/phy/aquantia/aquantia_main.c | 152 +++++++++++++++++++++++
 2 files changed, 169 insertions(+)
  

Comments

Andrew Lunn Feb. 13, 2024, 6:46 p.m. UTC | #1
On Tue, Feb 13, 2024 at 07:24:10PM +0100, Christian Marangi wrote:
> Posting this as RFC as I think this require some discussion on the topic.
> 
> There is currently a problem. OEM multiple time provision Aquantia FW
> with random and wrong data that may apply for one board but doesn't for
> another. And at the same time OEM use the same broken FW for multiple
> board and apply fixup at runtime.
> 
> This is the common case for AQR112 where downstream (uboot, OEM sdk,
> openwrt to have the port correctly working) hack patch are used to fixup
> broken system interface provision from the FW.
> 
> The downstream patch do one simple thing, they setup the SERDES startup
> rate (that the FW may wrongly not init) and overwrite the
> global system config for each rate to default values for the requested PHY
> interface.
> 
> Now setting the SERDES startup value is SAFE, and this can be implemented
> right away.
> 
> Overwriting the SERDES modes for each rate tho might pose some question
> on how this is correct or wrong.
> 
> Reality is that probably every user an Aquantia PHY in one way or another
> makes use of the SDK and have this patch in use making any kind of
> provision on the FW ignored, (since the default values are always applied
> at runtime) making the introduction of this change safe and restoring
> correct functionality of AQR112 in the case of a broken FW loaded.

This is part of the discussion I had with Aquantia about
provisioning. Basically, you cannot trust any register to contain a
known value, e.g. the value the data sheet indicates the reset value
should be, or that the 802.3 standard says it should be.

So in effect, the driver needs to write every single register it
depends on.

> This might be the safest change but again would not give us 100% idea that
> the thing provision by the FW are correct.

I would say, we have to assume provision is 100% wrong. Write every
single register with the needed value.

Is the provisioning information available? Can it be read from the
flash? Can it be dumped from firmware we have on disk? Dumping it for
a number of devices could give a list of register values which are
highly suspect, ones that OEMs typically mess with. We could start by
always setting those registers.
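
A rough sketch of that tallying step in Python, assuming each firmware's
provisioning has already been dumped into a list of (mmd, reg, value)
entries. The function name, the input layout and the threshold are
assumptions made for illustration:

```python
from collections import Counter


def suspect_registers(dumps, min_devices=2):
    """Return {(mmd, reg): device_count} for registers that the
    provisioning of at least `min_devices` devices touches.

    `dumps` maps a device name to a list of (mmd, reg, value) entries.
    """
    seen = Counter()
    for entries in dumps.values():
        # Count each register once per device, even if listed repeatedly.
        seen.update({(mmd, reg) for mmd, reg, _value in entries})
    return {key: n for key, n in seen.items() if n >= min_devices}
```

Registers that show up across many devices would be the first candidates
for the driver to always write.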

       Andrew
  
Christian Marangi Feb. 13, 2024, 6:53 p.m. UTC | #2
On Tue, Feb 13, 2024 at 07:46:45PM +0100, Andrew Lunn wrote:
> On Tue, Feb 13, 2024 at 07:24:10PM +0100, Christian Marangi wrote:
> > Posting this as RFC as I think this require some discussion on the topic.
> > 
> > There is currently a problem. OEM multiple time provision Aquantia FW
> > with random and wrong data that may apply for one board but doesn't for
> > another. And at the same time OEM use the same broken FW for multiple
> > board and apply fixup at runtime.
> > 
> > This is the common case for AQR112 where downstream (uboot, OEM sdk,
> > openwrt to have the port correctly working) hack patch are used to fixup
> > broken system interface provision from the FW.
> > 
> > The downstream patch do one simple thing, they setup the SERDES startup
> > rate (that the FW may wrongly not init) and overwrite the
> > global system config for each rate to default values for the requested PHY
> > interface.
> > 
> > Now setting the SERDES startup value is SAFE, and this can be implemented
> > right away.
> > 
> > Overwriting the SERDES modes for each rate tho might pose some question
> > on how this is correct or wrong.
> > 
> > Reality is that probably every user an Aquantia PHY in one way or another
> > makes use of the SDK and have this patch in use making any kind of
> > provision on the FW ignored, (since the default values are always applied
> > at runtime) making the introduction of this change safe and restoring
> > correct functionality of AQR112 in the case of a broken FW loaded.
> 
> This is part of the discussion i had with Aquantia about
> provisioning. Basically, you cannot trust any register to contain a
> known value, e.g the value the data sheet indicates the reset value
> should be, or that the 802.3 standard says it should be.
> 
> So in effect, the driver needs to write every single register it
> depends on.
>

Well, if that's the case then this RFC patch is a must. With a
misconfigured System Interface configuration, the PHY can't communicate
with the MAC.

> > This might be the safest change but again would not give us 100% idea that
> > the thing provision by the FW are correct.
> 
> I would say, we have to assume provision is 100% wrong. Write every
> single register with the needed value.
> 
> Is the provisioning information available? Can it be read from the
> flash? Can it be dumped from firmware we have on disk? Dumping it for
> a number of devices could give a list of register values which are
> highly suspect, ones that OEMs typically mess with. We could start by
> always setting those registers.
>

We know where the provision values are stored in the FW, but how they
are stored (the format, how they are organized...) is not documented.
I can spend some time trying to reverse engineer it and produce a
tool to parse them if needed.

I would also love some comments from Russell about this. There was a
patch adding support for WoL where another user was touching these regs,
and he was of the opinion that we should be careful about overwriting
the provision values.
  
Andrew Lunn Feb. 13, 2024, 8:58 p.m. UTC | #3
> > So in effect, the driver needs to write every single register it
> > depends on.
> >
> 
> Well if that's the case then this RFC patch is a must. With a
> misconfigured System Interface configuration, the PHY can't communicate
> with the MAC.
> 
> > > This might be the safest change but again would not give us 100% idea that
> > > the thing provision by the FW are correct.
> > 
> > I would say, we have to assume provision is 100% wrong. Write every
> > single register with the needed value.
> > 
> > Is the provisioning information available? Can it be read from the
> > flash? Can it be dumped from firmware we have on disk? Dumping it for
> > a number of devices could give a list of register values which are
> > highly suspect, ones that OEMs typically mess with. We could start by
> > always setting those registers.
> >
> 
> We know where they are stored in the FW but it's not documented how the
> provision values are stored in the FW. (the format, how they are
> organized...) I can waste some time trying to reverse it and produce a
> tool to parse them if needed.

It might be worth it. How complex could it be? The obvious format is a
C45 mmd.reg pair and a value.

> Would love also some comments by Russell about this, there was a patch
> adding support for WoL where another user was messing with these regs
> and he was with the idea of being careful with overwriting the provision
> values.

I expect the SERDES eye configuration is in there somewhere, and we
should not touch that. That was one of the arguments Aquantia made at
the time, that needs to be stored somewhere, and is board specific.

But knowing what standard 802.3 registers are commonly changed would
be useful, and could help track down silly problems like the
transmitter being disabled by default by provisioning.

	Andrew
  
Christian Marangi Feb. 13, 2024, 9:03 p.m. UTC | #4
On Tue, Feb 13, 2024 at 09:58:59PM +0100, Andrew Lunn wrote:
> > > So in effect, the driver needs to write every single register it
> > > depends on.
> > >
> > 
> > Well if that's the case then this RFC patch is a must. With a
> > misconfigured System Interface configuration, the PHY can't communicate
> > with the MAC.
> > 
> > > > This might be the safest change but again would not give us 100% idea that
> > > > the thing provision by the FW are correct.
> > > 
> > > I would say, we have to assume provision is 100% wrong. Write every
> > > single register with the needed value.
> > > 
> > > Is the provisioning information available? Can it be read from the
> > > flash? Can it be dumped from firmware we have on disk? Dumping it for
> > > a number of devices could give a list of register values which are
> > > highly suspect, ones that OEMs typically mess with. We could start by
> > > always setting those registers.
> > >
> > 
> > We know where they are stored in the FW but it's not documented how the
> > provision values are stored in the FW. (the format, how they are
> > organized...) I can waste some time trying to reverse it and produce a
> > tool to parse them if needed.
> 
> It might be worth it. How complex could it be? The obvious format is a
> C45 mmd.reg pair and a value.
>

Working on it. I already confirmed that the FW actually has a provision
part and that it is not empty.

The format looks to be u16 reg, u16 value, but I need to understand it
better: not everything about provision is in MMD 1e, so there must be
some magic values to signal where each section has to be applied.
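
As a rough illustration of that first guess only, here is a Python
sketch that decodes a flat run of u16 reg / u16 value pairs. Little-endian
byte order, the absence of any section header, and the fixed MMD are all
assumptions; the real table evidently needs more than this:

```python
import struct


def parse_pairs(blob, mmd=0x1E):
    """Decode consecutive (u16 reg, u16 value) pairs from `blob`.

    Assumes little-endian fields and that every pair belongs to `mmd`;
    both are guesses, not the documented format.
    """
    if len(blob) % 4:
        raise ValueError("table length is not a multiple of 4 bytes")
    return [(mmd, reg, value)
            for reg, value in struct.iter_unpack("<HH", blob)]
```

Each decoded entry has the (mmd, reg, value) shape, so it can be printed
or compared across firmware images directly.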

> > Would love also some comments by Russell about this, there was a patch
> > adding support for WoL where another user was messing with these regs
> > and he was with the idea of being careful with overwriting the provision
> > values.
> 
> I expect the SERDES eye configuration is in there somewhere, and we
> should not touch that. That was one of the arguments Aquantia made at
> the time, that needs to be stored somewhere, and is board specific.
> 
> But knowing what standard 802.3 registers are commonly changed would
> be useful, and could help track down silly problems like the
> transmitter being disabled by default by provisioning.
>

Yes, having a tool to parse them would probably be useful, and eventually
we could even apply fixups during firmware loading (if we really want).
  
Christian Marangi Feb. 16, 2024, 11:26 p.m. UTC | #5
On Tue, Feb 13, 2024 at 10:03:05PM +0100, Christian Marangi wrote:
> On Tue, Feb 13, 2024 at 09:58:59PM +0100, Andrew Lunn wrote:
> > > > So in effect, the driver needs to write every single register it
> > > > depends on.
> > > >
> > > 
> > > Well if that's the case then this RFC patch is a must. With a
> > > misconfigured System Interface configuration, the PHY can't communicate
> > > with the MAC.
> > > 
> > > > > This might be the safest change but again would not give us 100% idea that
> > > > > the thing provision by the FW are correct.
> > > > 
> > > > I would say, we have to assume provision is 100% wrong. Write every
> > > > single register with the needed value.
> > > > 
> > > > Is the provisioning information available? Can it be read from the
> > > > flash? Can it be dumped from firmware we have on disk? Dumping it for
> > > > a number of devices could give a list of register values which are
> > > > highly suspect, ones that OEMs typically mess with. We could start by
> > > > always setting those registers.
> > > >
> > > 
> > > We know where they are stored in the FW but it's not documented how the
> > > provision values are stored in the FW. (the format, how they are
> > > organized...) I can waste some time trying to reverse it and produce a
> > > tool to parse them if needed.
> > 
> > It might be worth it. How complex could it be? The obvious format is a
> > C45 mmd.reg pair and a value.
> >
> 
> Working on it. I already confirmed the FW have actually a provision part
> and is not empty.
> 
> The format looks to be u16 reg 16 value but I need to understand it
> better as not everything about provision is in mmd 1e so there must be
> some magic values to signal where the section has to be applied.
> 
> > > Would love also some comments by Russell about this, there was a patch
> > > adding support for WoL where another user was messing with these regs
> > > and he was with the idea of being careful with overwriting the provision
> > > values.
> > 
> > I expect the SERDES eye configuration is in there somewhere, and we
> > should not touch that. That was one of the arguments Aquantia made at
> > the time, that needs to be stored somewhere, and is board specific.
> > 
> > But knowing what standard 802.3 registers are commonly changed would
> > be useful, and could help track down silly problems like the
> > transmitter being disabled by default by provisioning.
> >
> 
> Yes having a tool to parse them would probably be useful and eventually
> even apply fixup in the firmware loading (if we really want)
>

As promised, I reverse engineered the format and created a script. It's
still WIP in the sense that I still have to find a better way to show
the values. Here is the script [1].

Feel free to suggest improvements to it. Various discoveries were made
while reversing this, especially the thing with the BUG.

[1] https://github.com/Ansuel/aqr_prov_table_parser