[net-next,v12,09/18] net: ethernet: mtk_eth_soc: Fix link status for none-SGMII modes

Message ID 1590fb0e69f6243ac6a961b16bf7ae7534f46949.1678201958.git.daniel@makrotopia.org
State New
Headers
Series net: ethernet: mtk_eth_soc: various enhancements |

Commit Message

Daniel Golle March 7, 2023, 3:54 p.m. UTC
  Link partner advertised link modes are not reported by the SerDes
hardware if not operating in SGMII mode. Hence we cannot use
phylink_mii_c22_pcs_decode_state() in this case.
Implement reporting link and an_complete only and use speed according to
the interface mode.

Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
 drivers/net/ethernet/mediatek/mtk_sgmii.c | 26 ++++++++++++++++++-----
 1 file changed, 21 insertions(+), 5 deletions(-)
  

Comments

Frank Wunderlich March 7, 2023, 6:50 p.m. UTC | #1
> Gesendet: Dienstag, 07. März 2023 um 16:54 Uhr
> Von: "Daniel Golle" <daniel@makrotopia.org>
> Link partner advertised link modes are not reported by the SerDes
> hardware if not operating in SGMII mode. Hence we cannot use
> phylink_mii_c22_pcs_decode_state() in this case.
> Implement reporting link and an_complete only and use speed according to
> the interface mode.

Hi,

have tested Parts 1-12 this on bananapi-r3 (mt7986) with 1G Fiber SFP (no 2g5 available yet) on gmac1 and lan4 (mt7531 p5)

Tested-by: Frank Wunderlich <frank-w@public-files.de>

Thx Daniel for working on SFP support :)

regards Frank
  
Russell King (Oracle) March 8, 2023, 11:38 a.m. UTC | #2
On Tue, Mar 07, 2023 at 03:54:11PM +0000, Daniel Golle wrote:
> Link partner advertised link modes are not reported by the SerDes
> hardware if not operating in SGMII mode. Hence we cannot use
> phylink_mii_c22_pcs_decode_state() in this case.
> Implement reporting link and an_complete only and use speed according to
> the interface mode.
> 
> Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
> Signed-off-by: Daniel Golle <daniel@makrotopia.org>

This has been proven to work by Frank Wunderlich last October, so by
making this change, you will be regressing his setup.

What are you testing against? Have you proven independently that the
link partner is indeed sending a valid advertisement for the LPA
register to be filled in?
  
Frank Wunderlich March 8, 2023, 11:44 a.m. UTC | #3
Am 8. März 2023 12:38:31 MEZ schrieb "Russell King (Oracle)" <linux@armlinux.org.uk>:
>On Tue, Mar 07, 2023 at 03:54:11PM +0000, Daniel Golle wrote:
>> Link partner advertised link modes are not reported by the SerDes
>> hardware if not operating in SGMII mode. Hence we cannot use
>> phylink_mii_c22_pcs_decode_state() in this case.
>> Implement reporting link and an_complete only and use speed according to
>> the interface mode.
>> 
>> Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
>> Signed-off-by: Daniel Golle <daniel@makrotopia.org>
>
>This has been proven to work by Frank Wunderlich last October, so by
>making this change, you will be regressing his setup.

Hi

My tests were done with 1 kind of 1g fibre sfp as i only have these atm...have ordered some 2g5 rj54 ones,but don't have them yet. I'm not sure if they are working with/without sgmii (1000base-X) and if they have builtin phy.

Daniel have a lot more of different SFPs and some (especially 2g5) were not working after our pcs change.

>What are you testing against? Have you proven independently that the
>link partner is indeed sending a valid advertisement for the LPA
>register to be filled in?
>


regards Frank
  
Daniel Golle March 8, 2023, 12:05 p.m. UTC | #4
On Wed, Mar 08, 2023 at 12:44:57PM +0100, Frank Wunderlich wrote:
> Am 8. März 2023 12:38:31 MEZ schrieb "Russell King (Oracle)" <linux@armlinux.org.uk>:
> >On Tue, Mar 07, 2023 at 03:54:11PM +0000, Daniel Golle wrote:
> >> Link partner advertised link modes are not reported by the SerDes
> >> hardware if not operating in SGMII mode. Hence we cannot use
> >> phylink_mii_c22_pcs_decode_state() in this case.
> >> Implement reporting link and an_complete only and use speed according to
> >> the interface mode.
> >> 
> >> Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
> >> Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> >
> >This has been proven to work by Frank Wunderlich last October, so by
> >making this change, you will be regressing his setup.
> 
> Hi
> 
> My tests were done with 1 kind of 1g fibre sfp as i only have these atm...have ordered some 2g5 rj54 ones,but don't have them yet. I'm not sure if they are working with/without sgmii (1000base-X) and if they have builtin phy.
> 
> Daniel have a lot more of different SFPs and some (especially 2g5) were not working after our pcs change.

Exactly. 1 GBit/s SFPs with built-in PHY and using SGMII are working
fine before and after conversion to phylink_pcs. 1000Base-X and
2500Base-X PHYs and SFPs were broken after this.

The patch about (and the other one you already NACK'ed) fixes those
codepaths which were simply not used in Frank's setup.

> 
> >What are you testing against? Have you proven independently that the
> >link partner is indeed sending a valid advertisement for the LPA
> >register to be filled in?
> >

I have a ballpark of different SFPs and MediaTek boards with different
PHYs here and tried all of them.

I have no way to tell if the SFPs and PHYs which stopped working after
the phylink_pcs conversion are sending valid advertisement. The only
other boards with SFP slots I got here are RealTek-based switches, and
all I can say is that on an RTL8380 based 1G switch both, the SFP
modules containing a PHY and operating in SGMII mode as well as the
ones without a PHY exposed via i2c-mdio and operating in 1000Base-X
mode are working fine with that switch, with both, stock firmware and
OpenWrt running on it.

However, even should they not send valid advertisement, they are very
common parts and they were working before and not after the change to
phylink_pcs, for the reasons mentioned in the description of this
patch.
  
Russell King (Oracle) March 8, 2023, 12:35 p.m. UTC | #5
On Wed, Mar 08, 2023 at 12:44:57PM +0100, Frank Wunderlich wrote:
> Am 8. März 2023 12:38:31 MEZ schrieb "Russell King (Oracle)" <linux@armlinux.org.uk>:
> >On Tue, Mar 07, 2023 at 03:54:11PM +0000, Daniel Golle wrote:
> >> Link partner advertised link modes are not reported by the SerDes
> >> hardware if not operating in SGMII mode. Hence we cannot use
> >> phylink_mii_c22_pcs_decode_state() in this case.
> >> Implement reporting link and an_complete only and use speed according to
> >> the interface mode.
> >> 
> >> Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
> >> Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> >
> >This has been proven to work by Frank Wunderlich last October, so by
> >making this change, you will be regressing his setup.
> 
> Hi
> 
> My tests were done with 1 kind of 1g fibre sfp as i only have these atm...have ordered some 2g5 rj54 ones,but don't have them yet. I'm not sure if they are working with/without sgmii (1000base-X) and if they have builtin phy.

Fiber SFPs do not have a built in PHY. They merely convert the serdes
electrical waveform into light and back again, possibly with some
retiming of the received waveform, and a bit of monitoring. They're
pretty simple devices.

They certainly do not change the protocol in any way.
  
Russell King (Oracle) March 8, 2023, 4:24 p.m. UTC | #6
On Wed, Mar 08, 2023 at 12:05:19PM +0000, Daniel Golle wrote:
> On Wed, Mar 08, 2023 at 12:44:57PM +0100, Frank Wunderlich wrote:
> > Am 8. März 2023 12:38:31 MEZ schrieb "Russell King (Oracle)" <linux@armlinux.org.uk>:
> > >On Tue, Mar 07, 2023 at 03:54:11PM +0000, Daniel Golle wrote:
> > >> Link partner advertised link modes are not reported by the SerDes
> > >> hardware if not operating in SGMII mode. Hence we cannot use
> > >> phylink_mii_c22_pcs_decode_state() in this case.
> > >> Implement reporting link and an_complete only and use speed according to
> > >> the interface mode.
> > >> 
> > >> Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
> > >> Signed-off-by: Daniel Golle <daniel@makrotopia.org>
> > >
> > >This has been proven to work by Frank Wunderlich last October, so by
> > >making this change, you will be regressing his setup.
> > 
> > Hi
> > 
> > My tests were done with 1 kind of 1g fibre sfp as i only have these atm...have ordered some 2g5 rj54 ones,but don't have them yet. I'm not sure if they are working with/without sgmii (1000base-X) and if they have builtin phy.
> > 
> > Daniel have a lot more of different SFPs and some (especially 2g5) were not working after our pcs change.
> 
> Exactly. 1 GBit/s SFPs with built-in PHY and using SGMII are working
> fine before and after conversion to phylink_pcs. 1000Base-X and
> 2500Base-X PHYs and SFPs were broken after this.
> 
> The patch about (and the other one you already NACK'ed) fixes those
> codepaths which were simply not used in Frank's setup.

You're replying to "Frank" but I think the "you" you are using there
is addressed to me.

I beg to differ with your assessment over whether the code paths were
used or not. Here's the proof from Frank's testing:

https://lore.kernel.org/all/trinity-4470b00b-771b-466e-9f3a-a3df72758208-1666435920485@3c-app-gmx-bs49/

The values at offset 8 (0x40e041a0) _are_ 1000base-X AN values, not
SGMII. There are many messages in that thread showing valid 1000base-X
AN values in register 8 (that being the local advertisement in the
lower 16 bits, and the partner's advertisement in the upper 16 bits.

The link partner in this case sent 0x40e0, which is:

#define LPA_1000XFULL           0x0020  /* Can do 1000BASE-X full-duplex */
#define LPA_1000XHALF           0x0040  /* Can do 1000BASE-X half-duplex */
#define LPA_1000XPAUSE          0x0080  /* Can do 1000BASE-X pause     */
#define LPA_LPACK               0x4000  /* Link partner acked us       */

and our advertisement was 0x41a0:

#define ADVERTISE_1000XFULL     0x0020  /* Try for 1000BASE-X full-duplex */
#define ADVERTISE_1000XPAUSE    0x0080  /* Try for 1000BASE-X pause    */
#define ADVERTISE_1000XPSE_ASYM 0x0100  /* Try for 1000BASE-X asym pause */
#define ADVERTISE_LPACK         0x4000  /* Ack link partners response  */

(bit 14 is managed by the PCS.)

> I have a ballpark of different SFPs and MediaTek boards with different
> PHYs here and tried all of them.
> 
> I have no way to tell if the SFPs and PHYs which stopped working after
> the phylink_pcs conversion are sending valid advertisement. The only
> other boards with SFP slots I got here are RealTek-based switches, and
> all I can say is that on an RTL8380 based 1G switch both, the SFP
> modules containing a PHY and operating in SGMII mode as well as the
> ones without a PHY exposed via i2c-mdio and operating in 1000Base-X
> mode are working fine with that switch, with both, stock firmware and
> OpenWrt running on it.
> 
> However, even should they not send valid advertisement, they are very
> common parts and they were working before and not after the change to
> phylink_pcs, for the reasons mentioned in the description of this
> patch.

Let me re-iterate in a different way to make this crystal clear: the
fibre SFP is completely transparent with respect to the PCS-to-PCS
link.

The clause 37 autonegotiation is between the PCS at one end and the PCS
at the other end. The fibre SFP is not involved in it. All that fibre
SFPs do is convert the serdes waveform into a varying optical intensity
and back again to a serdes waveform. They do not attempt to interpret
the waveform, but they may shape it via retimer circuitry and adjust
the gain to provide a compliant electrical signal.

So, if the PCS at the far end of the link is sending a zero
advertisement or not sending an advertisement at all, then the LPA
register will contain a zero irrespective of the fibre SFP module
being used.

Another case where it may give a zero value but with a copper SFP
is if the copper SFP is using 1000base-X or SGMII but without
sending an advertisement. As I've already stated, BCM84881 is
known to do this, which is used on some copper SFP modules. It
may be that some copper SFP modules have the PHY setup to use
1000base-X but without any clause 37 advertisement - that would
not surprise me in the least.


I would suggest that if you replaced your mediatek board in your
setup with something else, e.g. an Armada 388 based Clearfog board
which is known to work correctly, used the same fibre SFPs, you'll
find that negotiation would not complete and the link wouldn't
come up. Then if you use the same fibre SFPs that I do at both
ends, you'll find the same problem (because the fibre SFPs make
no difference as far as the in-band AN is concerned.)

I hope this helps.
  

Patch

diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
index 58d8cb3aa7f4..98d80007d3bd 100644
--- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
+++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
@@ -23,14 +23,30 @@  static void mtk_pcs_get_state(struct phylink_pcs *pcs,
 			      struct phylink_link_state *state)
 {
 	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
-	unsigned int bm, adv;
+	unsigned int bm, bmsr, adv;
 
-	/* Read the BMSR and LPA */
+	/* Read the BMSR */
 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
-	regmap_read(mpcs->regmap, SGMSYS_PCS_ADVERTISE, &adv);
+	bmsr = FIELD_GET(SGMII_BMSR, bm);
 
-	phylink_mii_c22_pcs_decode_state(state, FIELD_GET(SGMII_BMSR, bm),
-					 FIELD_GET(SGMII_LPA, adv));
+	/* link partner advertised link modes are not reported by the
+	 * hardware when not operating in SGMII mode. Hence we cannot
+	 * use phylink_mii_c22_pcs_decode_state() in this case.
+	 */
+	if (state->interface != PHY_INTERFACE_MODE_SGMII) {
+		state->link = !!(bmsr & BMSR_LSTATUS);
+		state->an_complete = !!(bmsr & BMSR_ANEGCOMPLETE);
+		state->speed = (state->interface ==
+				PHY_INTERFACE_MODE_2500BASEX) ?
+			       SPEED_2500 : SPEED_1000;
+		state->duplex = DUPLEX_FULL;
+
+		return;
+	}
+
+	/* Read LPA and use standard decode function for SGMII mode */
+	regmap_read(mpcs->regmap, SGMSYS_PCS_ADVERTISE, &adv);
+	phylink_mii_c22_pcs_decode_state(state, bmsr, FIELD_GET(SGMII_LPA, adv));
 }
 
 static int mtk_pcs_config(struct phylink_pcs *pcs, unsigned int mode,