[v2] net: mtk_sgmii: implement mtk_pcs_ops

Message ID 20221020144431.126124-1-linux@fw-web.de
State New
Headers
Series [v2] net: mtk_sgmii: implement mtk_pcs_ops |

Commit Message

Frank Wunderlich Oct. 20, 2022, 2:44 p.m. UTC
  From: Alexander Couzens <lynxis@fe80.eu>

Implement mtk_pcs_ops for the SGMII pcs to read the current state
of the hardware.

Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>

Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
---
v2:
- reorder members in phylink_pcs_ops
- drop pause setting

without this Patch i get this trace in 6.1-rc1 when enabling the left
sfp (eth1) on bpi-r3 (crash happens in phylink core because a NULL-pointer,
so fixes-Tag is a guess):

[  108.302810] Unable to handle kernel execute from non-executable memory at virtual address 0000000000000000
[  108.312462] Mem abort info:
[  108.315263]   ESR = 0x0000000086000005
[  108.319003]   EC = 0x21: IABT (current EL), IL = 32 bits
[  108.324335]   SET = 0, FnV = 0
[  108.327378]   EA = 0, S1PTW = 0
[  108.330505]   FSC = 0x05: level 1 translation fault
[  108.335375] user pgtable: 4k pages, 39-bit VAs, pgdp=00000000419be000
[  108.341798] [0000000000000000] pgd=0000000000000000, p4d=0000000000000000, pu
d=0000000000000000
[  108.350489] Internal error: Oops: 0000000086000005 [#1] SMP
[  108.356047] Modules linked in: cdc_mbim cdc_ncm cdc_wdm cdc_ether qcserial us
a_wwan usbnet usbser
a Mmeisis
ge from syslogd@bpi-r3 at Aug  7 15:26:54 ...
 kernel:[  108.350489] Internal error: Oops: 0000000086000005 [#1] SMP
[  108.376743] CPU: 3 PID: 8 Comm: kworker/u8:0 Not tainted 6.0.0-bpi-r3 #1
[  108.383461] Hardware name: Bananapi BPI-R3 (sdmmc) (DT)
[  108.388671] Workqueue: events_power_efficient phylink_resolve
[  108.394411] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[  108.401355] pc : 0x0
[  108.403531] lr : phylink_mac_pcs_get_state+0x7c/0x100
[  108.408572] sp : ffffffc00963bd00
[  108.411873] x29: ffffffc00963bd00 x28: 0000000000000000 x27: 0000000000000000
[  108.418994] x26: ffffff80000ed074 x25: ffffff800001c105 x24: 0000000000000000
[  108.426116] x23: ffffff80022f8cd8 x22: ffffff80022f8d30 x21: ffffff80022f8c00
[  108.433236] x20: 0000000000000000 x19: ffffff8000126040 x18: 0000000000000000
[  108.440357] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
[  108.447478] x14: ffffffffffffffff x13: 0000000000000030 x12: 0101010101010101
[  108.454598] x11: 7f7f7f7f7f7f7f7f x10: fefefefefefefeff x9 : ffffff80000ed07c
[  108.461720] x8 : fefefefefefefeff x7 : 0000000000000017 x6 : ffffff80000ed074
[  108.468840] x5 : 0000000000000000 x4 : 0000020000006440 x3 : 00000000fffffffa
[  108.475960] x2 : 0000000000000000 x1 : ffffffc00963bd80 x0 : ffffff80014c3ab0
[  108.483082] Call trace:
[  108.485516]  0x0
[  108.487344]  phylink_resolve+0x248/0x520
[  108.491256]  process_one_work+0x204/0x478
[  108.495256]  worker_thread+0x148/0x4c0
[  108.498993]  kthread+0xdc/0xe8
[  108.502037]  ret_from_fork+0x10/0x20
[  108.505608] Code: bad PC value
[  108.508652] ---[ end trace 0000000000000000 ]---
[  108.513255] Kernel panic - not syncing: Oops: Fatal exception
[  108.518984] SMP: stopping secondary CPUs
[  108.522894] Kernel Offset: disabled
[  108.526369] CPU features: 0x00000,00800084,0000420b
[  108.531232] Memory Limit: none
[  108.534274] Rebooting in 1 seconds..
---
 drivers/net/ethernet/mediatek/mtk_sgmii.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)
  

Comments

Russell King (Oracle) Oct. 20, 2022, 4:17 p.m. UTC | #1
On Thu, Oct 20, 2022 at 04:44:31PM +0200, Frank Wunderlich wrote:
> diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
> index 736839c84130..8b7465057e57 100644
> --- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
> +++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
> @@ -122,7 +122,21 @@ static void mtk_pcs_link_up(struct phylink_pcs *pcs, unsigned int mode,
>  	regmap_write(mpcs->regmap, SGMSYS_SGMII_MODE, val);
>  }
>  
> +static void mtk_pcs_get_state(struct phylink_pcs *pcs, struct phylink_link_state *state)
> +{
> +	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
> +	unsigned int val;
> +
> +	regmap_read(mpcs->regmap, mpcs->ana_rgc3, &val);
> +	state->speed = val & RG_PHY_SPEED_3_125G ? SPEED_2500 : SPEED_1000;
> +

Sorry, looking back at my initial comment on the first revision of this
patch, I see my second point was confused. It should have read:

"2) There should also be a setting for state->duplex."

IOW, "duplex" instead of "pause".

Also, is there no way to read the link partner advertisement?
  
Jakub Kicinski Oct. 20, 2022, 7:10 p.m. UTC | #2
On Thu, 20 Oct 2022 16:44:31 +0200 Frank Wunderlich wrote:
> Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
> Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
> 
> Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")

nit, the correct formatting of tags would be:

From: Alexander Couzens <lynxis@fe80.eu>

Implement mtk_pcs_ops for the SGMII pcs to read the current state
of the hardware.

Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
Signed-off-by: Alexander Couzens <lynxis@fe80.eu>
Signed-off-by: Frank Wunderlich <frank-w@public-files.de>
---
  
Frank Wunderlich Oct. 21, 2022, 6:04 a.m. UTC | #3
Am 20. Oktober 2022 18:17:41 MESZ schrieb "Russell King (Oracle)" <linux@armlinux.org.uk>:
>On Thu, Oct 20, 2022 at 04:44:31PM +0200, Frank Wunderlich wrote:
>> diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
>> index 736839c84130..8b7465057e57 100644
>> --- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
>> +++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
>> @@ -122,7 +122,21 @@ static void mtk_pcs_link_up(struct phylink_pcs *pcs, unsigned int mode,
>>  	regmap_write(mpcs->regmap, SGMSYS_SGMII_MODE, val);
>>  }
>>  
>> +static void mtk_pcs_get_state(struct phylink_pcs *pcs, struct phylink_link_state *state)
>> +{
>> +	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
>> +	unsigned int val;
>> +
>> +	regmap_read(mpcs->regmap, mpcs->ana_rgc3, &val);
>> +	state->speed = val & RG_PHY_SPEED_3_125G ? SPEED_2500 : SPEED_1000;
>> +
>
>Sorry, looking back at my initial comment on the first revision of this
>patch, I see my second point was confused. It should have read:
>
>"2) There should also be a setting for state->duplex."
>
>IOW, "duplex" instead of "pause".
>
>Also, is there no way to read the link partner advertisement?

Hi,

On my board (bpi-r3) we have no autoneg on the gmacs. We have a switch (mt7531) with fixed-link on the first and a sfp-cage on the other. Second mac gets speed-setting (1000base-X or 2500base-X) from sfp eeprom, but no advertisement from the "other end". Imho it is always full duplex.

I have no register documentation to check if there is any way to read out pause/duplex setting. Maybe MTK can answer this or extend function later.

regards Frank
  
Russell King (Oracle) Oct. 21, 2022, 7:24 a.m. UTC | #4
On Fri, Oct 21, 2022 at 08:04:51AM +0200, Frank Wunderlich wrote:
> On my board (bpi-r3) we have no autoneg on the gmacs. We have a switch (mt7531) with fixed-link on the first and a sfp-cage on the other. Second mac gets speed-setting (1000base-X or 2500base-X) from sfp eeprom, but no advertisement from the "other end". Imho it is always full duplex.

If it's a fixed link, then this function you're adding won't be called.
It's only called for in-band mode which is exclusive with fixed-link
mode.
  
Russell King (Oracle) Oct. 21, 2022, 9 a.m. UTC | #5
On Fri, Oct 21, 2022 at 10:41:22AM +0200, Frank Wunderlich wrote:
> Am 21. Oktober 2022 09:24:02 MESZ schrieb "Russell King (Oracle)" <linux@armlinux.org.uk>:
> >On Fri, Oct 21, 2022 at 08:04:51AM +0200, Frank Wunderlich wrote:
> >> On my board (bpi-r3) we have no autoneg on the gmacs. We have a switch (mt7531) with fixed-link on the first and a sfp-cage on the other. Second mac gets speed-setting (1000base-X or 2500base-X) from sfp eeprom, but no advertisement from the "other end". Imho it is always full duplex.
> >
> >If it's a fixed link, then this function you're adding won't be called.
> >It's only called for in-band mode which is exclusive with fixed-link
> >mode.
> >
> >-- 
> >RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> >FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
> 
> Right, i get this trace for the second mac which is without fixed-link because of in-band-managed for sfp (read speed settings from sfp eeprom).

So, you need to set state->duplex to DUPLEX_FULL if this is what the
hardware is actually doing.
  
Russell King (Oracle) Oct. 21, 2022, 9:06 a.m. UTC | #6
On Fri, Oct 21, 2022 at 08:04:51AM +0200, Frank Wunderlich wrote:
> I have no register documentation to check if there is any way to read out pause/duplex setting. Maybe MTK can answer this or extend function later.

I suspect we can probably guess.

Looking at SGMSYS_PCS_CONTROL_1, this is actually the standard BMCR in
the low 16 bits, and BMSR in the upper 16 bits, so:

At address 4, I'd expect the PHYSID.
At address 8, I'd expect the advertisement register in the low 16 bits
and the link partner advertisement in the upper 16 bits.

Can you try an experiment, and in mtk_sgmii_init() try accessing the
regmap at address 0, 4, and 8 and print their contents please?
  
Frank Wunderlich Oct. 21, 2022, 5:47 p.m. UTC | #7
> Gesendet: Freitag, 21. Oktober 2022 um 11:06 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> An: "Frank Wunderlich" <frank-w@public-files.de>
> On Fri, Oct 21, 2022 at 08:04:51AM +0200, Frank Wunderlich wrote:
> > I have no register documentation to check if there is any way to read out pause/duplex setting. Maybe MTK can answer this or extend function later.
>
> I suspect we can probably guess.
>
> Looking at SGMSYS_PCS_CONTROL_1, this is actually the standard BMCR in
> the low 16 bits, and BMSR in the upper 16 bits, so:
>
> At address 4, I'd expect the PHYSID.
> At address 8, I'd expect the advertisement register in the low 16 bits
> and the link partner advertisement in the upper 16 bits.
>
> Can you try an experiment, and in mtk_sgmii_init() try accessing the
> regmap at address 0, 4, and 8 and print their contents please?

is this what you want to see?

 int mtk_sgmii_init(struct mtk_sgmii *ss, struct device_node *r, u32 ana_rgc3)
 {
        struct device_node *np;
+       unsigned int val;
        int i;

        for (i = 0; i < MTK_MAX_DEVS; i++) {
@@ -158,6 +159,12 @@ int mtk_sgmii_init(struct mtk_sgmii *ss, struct device_node *r, u32 ana_rgc3)
                if (IS_ERR(ss->pcs[i].regmap))
                        return PTR_ERR(ss->pcs[i].regmap);

+               regmap_read(ss->pcs[i].regmap, SGMSYS_PCS_CONTROL_1, &val);
+               printk(KERN_ALERT "dev: %d offset:0 0x%x",i,val);
+               regmap_read(ss->pcs[i].regmap, SGMSYS_PCS_CONTROL_1+4, &val);
+               printk(KERN_ALERT "dev: %d offset:4 0x%x",i,val);
+               regmap_read(ss->pcs[i].regmap, SGMSYS_PCS_CONTROL_1+8, &val);
+               printk(KERN_ALERT "dev: %d offset:8 0x%x",i,val);
                ss->pcs[i].pcs.ops = &mtk_pcs_ops;
        }


[    1.083359] dev: 0 offset:0 0x840140
[    1.083376] dev: 0 offset:4 0x4d544950
[    1.086955] dev: 0 offset:8 0x1
[    1.090697] dev: 1 offset:0 0x81140
[    1.093866] dev: 1 offset:4 0x4d544950
[    1.097342] dev: 1 offset:8 0x1

> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
>

regards Frank
  
Russell King (Oracle) Oct. 21, 2022, 6:31 p.m. UTC | #8
On Fri, Oct 21, 2022 at 07:47:47PM +0200, Frank Wunderlich wrote:
> > Gesendet: Freitag, 21. Oktober 2022 um 11:06 Uhr
> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > An: "Frank Wunderlich" <frank-w@public-files.de>
> > On Fri, Oct 21, 2022 at 08:04:51AM +0200, Frank Wunderlich wrote:
> > > I have no register documentation to check if there is any way to read out pause/duplex setting. Maybe MTK can answer this or extend function later.
> >
> > I suspect we can probably guess.
> >
> > Looking at SGMSYS_PCS_CONTROL_1, this is actually the standard BMCR in
> > the low 16 bits, and BMSR in the upper 16 bits, so:
> >
> > At address 4, I'd expect the PHYSID.
> > At address 8, I'd expect the advertisement register in the low 16 bits
> > and the link partner advertisement in the upper 16 bits.
> >
> > Can you try an experiment, and in mtk_sgmii_init() try accessing the
> > regmap at address 0, 4, and 8 and print their contents please?
> 
> is this what you want to see?
> 
>  int mtk_sgmii_init(struct mtk_sgmii *ss, struct device_node *r, u32 ana_rgc3)
>  {
>         struct device_node *np;
> +       unsigned int val;
>         int i;
> 
>         for (i = 0; i < MTK_MAX_DEVS; i++) {
> @@ -158,6 +159,12 @@ int mtk_sgmii_init(struct mtk_sgmii *ss, struct device_node *r, u32 ana_rgc3)
>                 if (IS_ERR(ss->pcs[i].regmap))
>                         return PTR_ERR(ss->pcs[i].regmap);
> 
> +               regmap_read(ss->pcs[i].regmap, SGMSYS_PCS_CONTROL_1, &val);
> +               printk(KERN_ALERT "dev: %d offset:0 0x%x",i,val);
> +               regmap_read(ss->pcs[i].regmap, SGMSYS_PCS_CONTROL_1+4, &val);
> +               printk(KERN_ALERT "dev: %d offset:4 0x%x",i,val);
> +               regmap_read(ss->pcs[i].regmap, SGMSYS_PCS_CONTROL_1+8, &val);
> +               printk(KERN_ALERT "dev: %d offset:8 0x%x",i,val);
>                 ss->pcs[i].pcs.ops = &mtk_pcs_ops;
>         }
> 
> 
> [    1.083359] dev: 0 offset:0 0x840140
> [    1.083376] dev: 0 offset:4 0x4d544950
> [    1.086955] dev: 0 offset:8 0x1
> [    1.090697] dev: 1 offset:0 0x81140
> [    1.093866] dev: 1 offset:4 0x4d544950
> [    1.097342] dev: 1 offset:8 0x1

Thanks. Decoding these...

dev 0:
 BMCR: fixed, full duplex, 1000Mbps, !autoneg
 BMSR: link up
 Phy ID: 0x4d54 0x4950
 Advertise: 0x0001 (which would correspond with the MAC side of SGMII)
 Link partner: 0x0000 (no advertisement received, but we're not using
    negotiation.)

dev 1:
 BMCR: autoneg (full duplex, 1000Mbps - both would be ignored)
 BMSR: able to do autoneg, no link
 Phy ID: 0x4d54 0x4950
 Advertise: 0x0001 (same as above)
 Link partner: 0x0000 (no advertisement received due to no link)

Okay, what would now be interesting is to see how dev 1 behaves when
it has link with a 1000base-X link partner that is advertising
properly. If this changes to 0x01e0 or similar (in the high 16-bits
of offset 8) then we definitely know that this is an IEEE PHY register
set laid out in memory, and we can program the advertisement and read
the link partner's abilities.

At that point, we should try programming the low 16-bits of offset 8
to see whether that advertisement makes it to the far end of the link.

It would be well worth trying to work this out, because it will then
vastly improve the knowledge of this hardware, and improve the
driver no end.

Is this something you could investigate please?
  
Frank Wunderlich Oct. 21, 2022, 7:52 p.m. UTC | #9
> Gesendet: Freitag, 21. Oktober 2022 um 20:31 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>

> On Fri, Oct 21, 2022 at 07:47:47PM +0200, Frank Wunderlich wrote:
> > > Gesendet: Freitag, 21. Oktober 2022 um 11:06 Uhr
> > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>

> > > Looking at SGMSYS_PCS_CONTROL_1, this is actually the standard BMCR in
> > > the low 16 bits, and BMSR in the upper 16 bits, so:
> > >
> > > At address 4, I'd expect the PHYSID.
> > > At address 8, I'd expect the advertisement register in the low 16 bits
> > > and the link partner advertisement in the upper 16 bits.
> > >
> > > Can you try an experiment, and in mtk_sgmii_init() try accessing the
> > > regmap at address 0, 4, and 8 and print their contents please?
> >
> > is this what you want to see?

> > [    1.083359] dev: 0 offset:0 0x840140
> > [    1.083376] dev: 0 offset:4 0x4d544950
> > [    1.086955] dev: 0 offset:8 0x1
> > [    1.090697] dev: 1 offset:0 0x81140
> > [    1.093866] dev: 1 offset:4 0x4d544950
> > [    1.097342] dev: 1 offset:8 0x1
>
> Thanks. Decoding these...
>
> dev 0:
>  BMCR: fixed, full duplex, 1000Mbps, !autoneg
>  BMSR: link up
>  Phy ID: 0x4d54 0x4950
>  Advertise: 0x0001 (which would correspond with the MAC side of SGMII)
>  Link partner: 0x0000 (no advertisement received, but we're not using
>     negotiation.)
>
> dev 1:
>  BMCR: autoneg (full duplex, 1000Mbps - both would be ignored)
>  BMSR: able to do autoneg, no link
>  Phy ID: 0x4d54 0x4950
>  Advertise: 0x0001 (same as above)
>  Link partner: 0x0000 (no advertisement received due to no link)
>
> Okay, what would now be interesting is to see how dev 1 behaves when
> it has link with a 1000base-X link partner that is advertising
> properly. If this changes to 0x01e0 or similar (in the high 16-bits
> of offset 8) then we definitely know that this is an IEEE PHY register
> set laid out in memory, and we can program the advertisement and read
> the link partner's abilities.

added register-read on the the new get_state function too

on bootup it is now a bit different

[    1.086283] dev: 0 offset:0 0x40140 #was previously 0x840140
[    1.086301] dev: 0 offset:4 0x4d544950
[    1.089795] dev: 0 offset:8 0x1
[    1.093584] dev: 1 offset:0 0x81140
[    1.096716] dev: 1 offset:4 0x4d544950
[    1.100191] dev: 1 offset:8 0x1

root@bpi-r3:~# ip link set eth1 up
[  172.037519] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000base-x link mode
root@bpi-r3:~#
[  172.102949] offset:0 0x40140 #still same value
[  172.102964] offset:4 0x4d544950
[  172.105842] offset:8 0x1
[  172.108992] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
[  172.120058] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

so no change on link up

maybe ethtool-output is interesting

root@bpi-r3:~# ethtool -m eth1
        Identifier                                : 0x03 (SFP)
        Extended identifier                       : 0x04 (GBIC/SFP defined by 2-wire interface ID)
        Connector                                 : 0x07 (LC)
        Transceiver codes                         : 0x00 0x00 0x00 0x01 0x00 0x00 0x00 0x00 0x00
        Transceiver type                          : Ethernet: 1000BASE-SX
        Encoding                                  : 0x01 (8B/10B)
        BR, Nominal                               : 1300MBd
        Rate identifier                           : 0x00 (unspecified)
        Length (SMF,km)                           : 0km
        Length (SMF)                              : 0m
        Length (50um)                             : 550m
        Length (62.5um)                           : 270m
        Length (Copper)                           : 0m
        Length (OM3)                              : 0m
        Laser wavelength                          : 850nm
        Vendor name                               : OEM
        Vendor OUI                                : 00:90:65
        Vendor PN                                 : GLC-SX-MMD     _
        Vendor rev                                : _e
        Option values                             : 0x00 0x1a
        Option                                    : RX_LOS implemented
        Option                                    : TX_FAULT implemented
        Option                                    : TX_DISABLE implemented
        BR margin, max                            : 0%
        BR margin, min                            : 0%
        Vendor SN                                 : CSCGE1M14134
        Date code                                 : 220120
        Optical diagnostics support               : Yes
        Laser bias current                        : 3.634 mA
        Laser output power                        : 0.3181 mW / -4.97 dBm
        Receiver signal average optical power     : 0.3444 mW / -4.63 dBm
        Module temperature                        : 35.32 degrees C / 95.58 degrees F
        Module voltage                            : 3.3101 V
        Alarm/warning flags implemented           : Yes
        Laser bias current high alarm             : Off
        Laser bias current low alarm              : Off
        Laser bias current high warning           : Off
        Laser bias current low warning            : Off
        Laser output power high alarm             : Off
        Laser output power low alarm              : Off
        Laser output power high warning           : Off
        Laser output power low warning            : Off
        Module temperature high alarm             : Off
        Module temperature low alarm              : Off
        Module temperature high warning           : Off
        Module temperature low warning            : Off
        Module voltage high alarm                 : Off
        Module voltage low alarm                  : Off
        Module voltage high warning               : Off
        Module voltage low warning                : Off
        Laser rx power high alarm                 : Off
        Laser rx power low alarm                  : Off
        Laser rx power high warning               : Off
        Laser rx power low warning                : Off
        Laser bias current high alarm threshold   : 100.000 mA
        Laser bias current low alarm threshold    : 0.000 mA
        Laser bias current high warning threshold : 90.000 mA
        Laser bias current low warning threshold  : 0.100 mA
        Laser output power high alarm threshold   : 1.2589 mW / 1.00 dBm
        Laser output power low alarm threshold    : 0.0501 mW / -13.00 dBm
        Laser output power high warning threshold : 0.7943 mW / -1.00 dBm
        Laser output power low warning threshold  : 0.0631 mW / -12.00 dBm
        Module temperature high alarm threshold   : 90.00 degrees C / 194.00 degrees F
        Module temperature low alarm threshold    : -45.00 degrees C / -49.00 degrees F
        Module temperature high warning threshold : 85.00 degrees C / 185.00 degrees F
        Module temperature low warning threshold  : -40.00 degrees C / -40.00 degrees F
        Module voltage high alarm threshold       : 3.8000 V
        Module voltage low alarm threshold        : 2.7000 V
        Module voltage high warning threshold     : 3.7000 V
        Module voltage low warning threshold      : 2.8000 V
        Laser rx power high alarm threshold       : 0.7943 mW / -1.00 dBm
        Laser rx power low alarm threshold        : 0.0079 mW / -21.02 dBm
        Laser rx power high warning threshold     : 0.5012 mW / -3.00 dBm
        Laser rx power low warning threshold      : 0.0126 mW / -19.00 dBm
root@bpi-r3:~# ethtool eth1
[  295.755349] offset:0 0x40140
[  295.755368] offset:4 0x4d544950
Settings for eth[  295.758247] offset:8 0x1
1:
        Supported p[  295.761551] offset:0 0x40140
[  295.765446] offset:4 0x4d544950

        Supported link modes:   1000baseX/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseX/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Unknown! (255)
        Auto-negotiation: on
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes


> At that point, we should try programming the low 16-bits of offset 8
> to see whether that advertisement makes it to the far end of the link.

have only my switch on the other side where i imho cannot read out these advertising. and programming to which values at which point (in the get-state callback to be done on link-up)?

> It would be well worth trying to work this out, because it will then
> vastly improve the knowledge of this hardware, and improve the
> driver no end.
>
> Is this something you could investigate please?

regards Frank
  
Russell King (Oracle) Oct. 21, 2022, 9:28 p.m. UTC | #10
On Fri, Oct 21, 2022 at 09:52:36PM +0200, Frank Wunderlich wrote:
> > Gesendet: Freitag, 21. Oktober 2022 um 20:31 Uhr
> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> 
> > On Fri, Oct 21, 2022 at 07:47:47PM +0200, Frank Wunderlich wrote:
> > > > Gesendet: Freitag, 21. Oktober 2022 um 11:06 Uhr
> > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> 
> > > > Looking at SGMSYS_PCS_CONTROL_1, this is actually the standard BMCR in
> > > > the low 16 bits, and BMSR in the upper 16 bits, so:
> > > >
> > > > At address 4, I'd expect the PHYSID.
> > > > At address 8, I'd expect the advertisement register in the low 16 bits
> > > > and the link partner advertisement in the upper 16 bits.
> > > >
> > > > Can you try an experiment, and in mtk_sgmii_init() try accessing the
> > > > regmap at address 0, 4, and 8 and print their contents please?
> > >
> > > is this what you want to see?
> 
> > > [    1.083359] dev: 0 offset:0 0x840140
> > > [    1.083376] dev: 0 offset:4 0x4d544950
> > > [    1.086955] dev: 0 offset:8 0x1
> > > [    1.090697] dev: 1 offset:0 0x81140
> > > [    1.093866] dev: 1 offset:4 0x4d544950
> > > [    1.097342] dev: 1 offset:8 0x1
> >
> > Thanks. Decoding these...
> >
> > dev 0:
> >  BMCR: fixed, full duplex, 1000Mbps, !autoneg
> >  BMSR: link up
> >  Phy ID: 0x4d54 0x4950
> >  Advertise: 0x0001 (which would correspond with the MAC side of SGMII)
> >  Link partner: 0x0000 (no advertisement received, but we're not using
> >     negotiation.)
> >
> > dev 1:
> >  BMCR: autoneg (full duplex, 1000Mbps - both would be ignored)
> >  BMSR: able to do autoneg, no link
> >  Phy ID: 0x4d54 0x4950
> >  Advertise: 0x0001 (same as above)
> >  Link partner: 0x0000 (no advertisement received due to no link)
> >
> > Okay, what would now be interesting is to see how dev 1 behaves when
> > it has link with a 1000base-X link partner that is advertising
> > properly. If this changes to 0x01e0 or similar (in the high 16-bits
> > of offset 8) then we definitely know that this is an IEEE PHY register
> > set laid out in memory, and we can program the advertisement and read
> > the link partner's abilities.
> 
> added register-read on the the new get_state function too
> 
> on bootup it is now a bit different
> 
> [    1.086283] dev: 0 offset:0 0x40140 #was previously 0x840140
> [    1.086301] dev: 0 offset:4 0x4d544950
> [    1.089795] dev: 0 offset:8 0x1
> [    1.093584] dev: 1 offset:0 0x81140
> [    1.096716] dev: 1 offset:4 0x4d544950
> [    1.100191] dev: 1 offset:8 0x1
> 
> root@bpi-r3:~# ip link set eth1 up
> [  172.037519] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000base-x link mode
> root@bpi-r3:~#
> [  172.102949] offset:0 0x40140 #still same value

If this is "dev: 1" the value has changed - the ANENABLE bit has been
turned off, which means it's not going to bother receiving or sending
the 16-bit control word. Bit 12 needs to stay set for it to perform
the exchange.
  
Frank Wunderlich Oct. 22, 2022, 6:25 a.m. UTC | #11
Am 21. Oktober 2022 23:28:09 MESZ schrieb "Russell King (Oracle)" <linux@armlinux.org.uk>:
>On Fri, Oct 21, 2022 at 09:52:36PM +0200, Frank Wunderlich wrote:
>> > Gesendet: Freitag, 21. Oktober 2022 um 20:31 Uhr
>> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
>> 
>> > On Fri, Oct 21, 2022 at 07:47:47PM +0200, Frank Wunderlich wrote:
>> > > > Gesendet: Freitag, 21. Oktober 2022 um 11:06 Uhr
>> > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
>> 
>> > > > Looking at SGMSYS_PCS_CONTROL_1, this is actually the standard BMCR in
>> > > > the low 16 bits, and BMSR in the upper 16 bits, so:
>> > > >
>> > > > At address 4, I'd expect the PHYSID.
>> > > > At address 8, I'd expect the advertisement register in the low 16 bits
>> > > > and the link partner advertisement in the upper 16 bits.
>> > > >
>> > > > Can you try an experiment, and in mtk_sgmii_init() try accessing the
>> > > > regmap at address 0, 4, and 8 and print their contents please?
>> > >
>> > > is this what you want to see?
>> 
>> > > [    1.083359] dev: 0 offset:0 0x840140
>> > > [    1.083376] dev: 0 offset:4 0x4d544950
>> > > [    1.086955] dev: 0 offset:8 0x1
>> > > [    1.090697] dev: 1 offset:0 0x81140
>> > > [    1.093866] dev: 1 offset:4 0x4d544950
>> > > [    1.097342] dev: 1 offset:8 0x1
>> >
>> > Thanks. Decoding these...
>> >
>> > dev 0:
>> >  BMCR: fixed, full duplex, 1000Mbps, !autoneg
>> >  BMSR: link up
>> >  Phy ID: 0x4d54 0x4950
>> >  Advertise: 0x0001 (which would correspond with the MAC side of SGMII)
>> >  Link partner: 0x0000 (no advertisement received, but we're not using
>> >     negotiation.)
>> >
>> > dev 1:
>> >  BMCR: autoneg (full duplex, 1000Mbps - both would be ignored)
>> >  BMSR: able to do autoneg, no link
>> >  Phy ID: 0x4d54 0x4950
>> >  Advertise: 0x0001 (same as above)
>> >  Link partner: 0x0000 (no advertisement received due to no link)
>> >
>> > Okay, what would now be interesting is to see how dev 1 behaves when
>> > it has link with a 1000base-X link partner that is advertising
>> > properly. If this changes to 0x01e0 or similar (in the high 16-bits
>> > of offset 8) then we definitely know that this is an IEEE PHY register
>> > set laid out in memory, and we can program the advertisement and read
>> > the link partner's abilities.
>> 
>> added register-read on the the new get_state function too
>> 
>> on bootup it is now a bit different
>> 
>> [    1.086283] dev: 0 offset:0 0x40140 #was previously 0x840140
>> [    1.086301] dev: 0 offset:4 0x4d544950
>> [    1.089795] dev: 0 offset:8 0x1
>> [    1.093584] dev: 1 offset:0 0x81140
>> [    1.096716] dev: 1 offset:4 0x4d544950
>> [    1.100191] dev: 1 offset:8 0x1
>> 
>> root@bpi-r3:~# ip link set eth1 up
>> [  172.037519] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000base-x link mode
>> root@bpi-r3:~#
>> [  172.102949] offset:0 0x40140 #still same value
>
>If this is "dev: 1" the value has changed - the ANENABLE bit has been
>turned off, which means it's not going to bother receiving or sending
>the 16-bit control word. Bit 12 needs to stay set for it to perform
>the exchange.

Your right,was confused that dev 0 (fixed link to switch chip) had different value.

offset:0 0x81140 => 0x40140

So i should change offset 8 (currently 0x1) to at least 0x1 | BIT(12)? I can try to set this in the get_state callback,but i'm unsure i can read out it on my switch (basic mode changes yes,but not the value directly)...if mode is not autoneg i will see no change there.

regards Frank
  
Russell King (Oracle) Oct. 22, 2022, 9:11 a.m. UTC | #12
On Sat, Oct 22, 2022 at 08:25:26AM +0200, Frank Wunderlich wrote:
> Am 21. Oktober 2022 23:28:09 MESZ schrieb "Russell King (Oracle)" <linux@armlinux.org.uk>:
> >On Fri, Oct 21, 2022 at 09:52:36PM +0200, Frank Wunderlich wrote:
> >> > Gesendet: Freitag, 21. Oktober 2022 um 20:31 Uhr
> >> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> >> 
> >> > On Fri, Oct 21, 2022 at 07:47:47PM +0200, Frank Wunderlich wrote:
> >> > > > Gesendet: Freitag, 21. Oktober 2022 um 11:06 Uhr
> >> > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> >> 
> >> > > > Looking at SGMSYS_PCS_CONTROL_1, this is actually the standard BMCR in
> >> > > > the low 16 bits, and BMSR in the upper 16 bits, so:
> >> > > >
> >> > > > At address 4, I'd expect the PHYSID.
> >> > > > At address 8, I'd expect the advertisement register in the low 16 bits
> >> > > > and the link partner advertisement in the upper 16 bits.
> >> > > >
> >> > > > Can you try an experiment, and in mtk_sgmii_init() try accessing the
> >> > > > regmap at address 0, 4, and 8 and print their contents please?
> >> > >
> >> > > is this what you want to see?
> >> 
> >> > > [    1.083359] dev: 0 offset:0 0x840140
> >> > > [    1.083376] dev: 0 offset:4 0x4d544950
> >> > > [    1.086955] dev: 0 offset:8 0x1
> >> > > [    1.090697] dev: 1 offset:0 0x81140
> >> > > [    1.093866] dev: 1 offset:4 0x4d544950
> >> > > [    1.097342] dev: 1 offset:8 0x1
> >> >
> >> > Thanks. Decoding these...
> >> >
> >> > dev 0:
> >> >  BMCR: fixed, full duplex, 1000Mbps, !autoneg
> >> >  BMSR: link up
> >> >  Phy ID: 0x4d54 0x4950
> >> >  Advertise: 0x0001 (which would correspond with the MAC side of SGMII)
> >> >  Link partner: 0x0000 (no advertisement received, but we're not using
> >> >     negotiation.)
> >> >
> >> > dev 1:
> >> >  BMCR: autoneg (full duplex, 1000Mbps - both would be ignored)
> >> >  BMSR: able to do autoneg, no link
> >> >  Phy ID: 0x4d54 0x4950
> >> >  Advertise: 0x0001 (same as above)
> >> >  Link partner: 0x0000 (no advertisement received due to no link)
> >> >
> >> > Okay, what would now be interesting is to see how dev 1 behaves when
> >> > it has link with a 1000base-X link partner that is advertising
> >> > properly. If this changes to 0x01e0 or similar (in the high 16-bits
> >> > of offset 8) then we definitely know that this is an IEEE PHY register
> >> > set laid out in memory, and we can program the advertisement and read
> >> > the link partner's abilities.
> >> 
> >> added register-read on the the new get_state function too
> >> 
> >> on bootup it is now a bit different
> >> 
> >> [    1.086283] dev: 0 offset:0 0x40140 #was previously 0x840140
> >> [    1.086301] dev: 0 offset:4 0x4d544950
> >> [    1.089795] dev: 0 offset:8 0x1
> >> [    1.093584] dev: 1 offset:0 0x81140
> >> [    1.096716] dev: 1 offset:4 0x4d544950
> >> [    1.100191] dev: 1 offset:8 0x1
> >> 
> >> root@bpi-r3:~# ip link set eth1 up
> >> [  172.037519] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000base-x link mode
> >> root@bpi-r3:~#
> >> [  172.102949] offset:0 0x40140 #still same value
> >
> >If this is "dev: 1" the value has changed - the ANENABLE bit has been
> >turned off, which means it's not going to bother receiving or sending
> >the 16-bit control word. Bit 12 needs to stay set for it to perform
> >the exchange.
> 
> Your right,was confused that dev 0 (fixed link to switch chip) had different value.
> 
> offset:0 0x81140 => 0x40140
> 
> So i should change offset 8 (currently 0x1) to at least 0x1 | BIT(12)? I can try to set this in the get_state callback,but i'm unsure i can read out it on my switch (basic mode changes yes,but not the value directly)...if mode is not autoneg i will see no change there.

Hi Frank,

Please try this untested patch, which should setup the PCS to perform
autonegotiation when using in-band mode for 1000base-X, write the
correct to offset 8, and set the link timer correctly.

Thanks.

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index b52f3b0177ef..1a3eb3ecf7e3 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -479,7 +479,7 @@
 
 /* Register to programmable link timer, the unit in 2 * 8ns */
 #define SGMSYS_PCS_LINK_TIMER	0x18
-#define SGMII_LINK_TIMER_DEFAULT	(0x186a0 & GENMASK(19, 0))
+#define SGMII_LINK_TIMER_MASK		GENMASK(19, 0)
 
 /* Register to control remote fault */
 #define SGMSYS_SGMII_MODE		0x20
diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
index 736839c84130..973275c8e29e 100644
--- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
+++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
@@ -20,19 +20,35 @@ static struct mtk_pcs *pcs_to_mtk_pcs(struct phylink_pcs *pcs)
 }
 
 /* For SGMII interface mode */
-static int mtk_pcs_setup_mode_an(struct mtk_pcs *mpcs)
+static int mtk_pcs_setup_mode_an(struct mtk_pcs *mpcs,
+				 phy_interface_t interface,
+				 const unsigned long *advertising)
 {
 	unsigned int val;
+	int advertise;
+
+	advertise = phylink_mii_c22_pcs_encode_advertisement(interface,
+							     advertising);
+	if (advertise < 0)
+		advertise = 0;
+
+	if (interface == PHY_INTERFACE_MODE_SGMII)
+		val = 16000000 / 2 / 8;
+	else
+		val = 10000000 / 2 / 8;
 
 	/* Setup the link timer and QPHY power up inside SGMIISYS */
-	regmap_write(mpcs->regmap, SGMSYS_PCS_LINK_TIMER,
-		     SGMII_LINK_TIMER_DEFAULT);
+	regmap_write(mpcs->regmap, SGMSYS_PCS_LINK_TIMER, val);
 
 	regmap_read(mpcs->regmap, SGMSYS_SGMII_MODE, &val);
 	val |= SGMII_REMOTE_FAULT_DIS;
 	regmap_write(mpcs->regmap, SGMSYS_SGMII_MODE, val);
 
+	regmap_update_bits(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, 0xffff,
+			   advertise);
+
 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &val);
+	val |= SGMII_AN_ENABLE;
 	val |= SGMII_AN_RESTART;
 	regmap_write(mpcs->regmap, SGMSYS_PCS_CONTROL_1, val);
 
@@ -52,12 +68,6 @@ static int mtk_pcs_setup_mode_force(struct mtk_pcs *mpcs,
 {
 	unsigned int val;
 
-	regmap_read(mpcs->regmap, mpcs->ana_rgc3, &val);
-	val &= ~RG_PHY_SPEED_MASK;
-	if (interface == PHY_INTERFACE_MODE_2500BASEX)
-		val |= RG_PHY_SPEED_3_125G;
-	regmap_write(mpcs->regmap, mpcs->ana_rgc3, val);
-
 	/* Disable SGMII AN */
 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &val);
 	val &= ~SGMII_AN_ENABLE;
@@ -83,13 +93,20 @@ static int mtk_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
 			  bool permit_pause_to_mac)
 {
 	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
+	unsigned int val;
 	int err = 0;
 
+	regmap_read(mpcs->regmap, mpcs->ana_rgc3, &val);
+	val &= ~RG_PHY_SPEED_MASK;
+	if (interface == PHY_INTERFACE_MODE_2500BASEX)
+		val |= RG_PHY_SPEED_3_125G;
+	regmap_write(mpcs->regmap, mpcs->ana_rgc3, val);
+
 	/* Setup SGMIISYS with the determined property */
-	if (interface != PHY_INTERFACE_MODE_SGMII)
+	if (phylink_autoneg_inband(mode))
+		err = mtk_pcs_setup_mode_an(mpcs, interface, advertising);
+	else if (interface != PHY_INTERFACE_MODE_SGMII)
 		err = mtk_pcs_setup_mode_force(mpcs, interface);
-	else if (phylink_autoneg_inband(mode))
-		err = mtk_pcs_setup_mode_an(mpcs);
 
 	return err;
 }
  
Frank Wunderlich Oct. 22, 2022, 10:52 a.m. UTC | #13
> Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>

> Please try this untested patch, which should setup the PCS to perform
> autonegotiation when using in-band mode for 1000base-X, write the
> correct to offset 8, and set the link timer correctly.

hi,

this patch breaks connectivity at least on the sfp-port (eth1).

root@bpi-r3:~# ip link set eth1 up
[   65.457521] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000base-x link mode
root@bpi-r3:~# [   65.522936] offset:0 0x2c1140
[   65.522950] offset:4 0x4d544950
[   65.525914] offset:8 0x40e041a0
[   65.529064] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
[   65.540733] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

root@bpi-r3:~# ip a a 192.168.0.19/24 dev eth1
root@bpi-r3:~# ip r a default via 192.168.0.10
root@bpi-r3:~# iperf3 -c 192.168.0.21 #ping does not work too
iperf3: error - unable to send control message: Bad file descriptor
root@bpi-r3:~# ethtool eth1
[  177.346183] offset:0 0x2c1140
[  177.346202] offset:4 0x4d544950
Settings for eth[  177.349168] offset:8 0x40e041a0
1:
        Supported p[  177.352477] offset:0 0x2c1140
[  177.356952] offset:4 0x4d544950

        Supported link modes:   1000baseX/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseX/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Unknown! (255)
        Auto-negotiation: on
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes
root@bpi-r3:~#

from sgmii_init
[    1.091796] dev: 1 offset:0 0x81140
[    1.094977] dev: 1 offset:4 0x4d544950
[    1.098456] dev: 1 offset:8 0x1
...
pcs_get_state
[   65.522936] offset:0 0x2c1140
[   65.522950] offset:4 0x4d544950
[   65.525914] offset:8 0x40e041a0
[  177.346183] offset:0 0x2c1140
[  177.346202] offset:4 0x4d544950
[  177.349168] offset:8 0x40e041a0
[  177.352477] offset:0 0x2c1140
[  177.356952] offset:4 0x4d544950

regards Frank
  
Russell King (Oracle) Oct. 22, 2022, 5:05 p.m. UTC | #14
On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> 
> > Please try this untested patch, which should setup the PCS to perform
> > autonegotiation when using in-band mode for 1000base-X, write the
> > correct to offset 8, and set the link timer correctly.
> 
> hi,
> 
> this patch breaks connectivity at least on the sfp-port (eth1).
> 
> root@bpi-r3:~# ip link set eth1 up
> [   65.457521] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000base-x link mode
> root@bpi-r3:~# [   65.522936] offset:0 0x2c1140
> [   65.522950] offset:4 0x4d544950
> [   65.525914] offset:8 0x40e041a0
> [   65.529064] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
> [   65.540733] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> 
> root@bpi-r3:~# ip a a 192.168.0.19/24 dev eth1
> root@bpi-r3:~# ip r a default via 192.168.0.10
> root@bpi-r3:~# iperf3 -c 192.168.0.21 #ping does not work too
> iperf3: error - unable to send control message: Bad file descriptor
> root@bpi-r3:~# ethtool eth1
> [  177.346183] offset:0 0x2c1140
> [  177.346202] offset:4 0x4d544950
> Settings for eth[  177.349168] offset:8 0x40e041a0
> 1:
>         Supported p[  177.352477] offset:0 0x2c1140
> [  177.356952] offset:4 0x4d544950
> 
>         Supported link modes:   1000baseX/Full
>         Supported pause frame use: Symmetric Receive-only
>         Supports auto-negotiation: Yes
>         Supported FEC modes: Not reported
>         Advertised link modes:  1000baseX/Full
>         Advertised pause frame use: Symmetric Receive-only
>         Advertised auto-negotiation: Yes
>         Advertised FEC modes: Not reported
>         Speed: 1000Mb/s
>         Duplex: Unknown! (255)
>         Auto-negotiation: on
>         Port: FIBRE
>         PHYAD: 0
>         Transceiver: internal
>         Current message level: 0x000000ff (255)
>                                drv probe link timer ifdown ifup rx_err tx_err
>         Link detected: yes
> root@bpi-r3:~#
> 
> from sgmii_init
> [    1.091796] dev: 1 offset:0 0x81140
> [    1.094977] dev: 1 offset:4 0x4d544950
> [    1.098456] dev: 1 offset:8 0x1
> ...
> pcs_get_state
> [   65.522936] offset:0 0x2c1140
> [   65.522950] offset:4 0x4d544950
> [   65.525914] offset:8 0x40e041a0
> [  177.346183] offset:0 0x2c1140
> [  177.346202] offset:4 0x4d544950
> [  177.349168] offset:8 0x40e041a0
> [  177.352477] offset:0 0x2c1140
> [  177.356952] offset:4 0x4d544950

Hi,

Thanks. Well, the results suggest that the register at offset 8 is
indeed the advertisement and link-partner advertisement register. So
we have a bit of progress and a little more understanding of this
hardware.

Do you know if your link partner also thinks the link is up?

What I notice is:

mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off

The duplex is "unknown" which means you're not filling in the
state->duplex field in your pcs_get_state() function. Given the
link parter adverisement is 0x00e0, this means the link partner
supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
is therefore full duplex, so can we hack that in to your
pcs_get_state() so we're getting that right for this testing please?

Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
in the SGMSYS_SGMII_MODE register. Does one of these bits set the
format for the 16-bit control word that's used to convey the
advertisements. I think the next step would be to play around with
these and see what effect setting or clearing these bits has -
please can you give that a go?

Thanks.
  
Frank Wunderlich Oct. 22, 2022, 5:53 p.m. UTC | #15
> Gesendet: Samstag, 22. Oktober 2022 um 19:05 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>

> > this patch breaks connectivity at least on the sfp-port (eth1).

> > pcs_get_state
> > [   65.522936] offset:0 0x2c1140
> > [   65.522950] offset:4 0x4d544950
> > [   65.525914] offset:8 0x40e041a0
> > [  177.346183] offset:0 0x2c1140
> > [  177.346202] offset:4 0x4d544950
> > [  177.349168] offset:8 0x40e041a0
> > [  177.352477] offset:0 0x2c1140
> > [  177.356952] offset:4 0x4d544950
>
> Hi,
>
> Thanks. Well, the results suggest that the register at offset 8 is
> indeed the advertisement and link-partner advertisement register. So
> we have a bit of progress and a little more understanding of this
> hardware.
>
> Do you know if your link partner also thinks the link is up?

yes link is up on my switch, cannot enable autoneg for fibre-port, so port is fixed to 1000M/full flowcontrol enabled.

> What I notice is:
>
> mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
>
> The duplex is "unknown" which means you're not filling in the
> state->duplex field in your pcs_get_state() function. Given the
> link parter adverisement is 0x00e0, this means the link partner
> supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
> is therefore full duplex, so can we hack that in to your
> pcs_get_state() so we're getting that right for this testing please?

0xe0 is bits 5-7 are set (in lower byte from upper word)..which one is for duplex?

so i should set state->duplex/pause based on this value (maybe compare with own caps)?

found a documentation where 5=full,6=half, and bits 7+8 are for pause (symetric/asymetric)

regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1+8, &val);
partner_advertising = (val & 0x00ff0000) >> 16;

if (partner_advertising & BIT(5)) state->duplex = DUPLEX_FULL;
else if (partner_advertising & BIT(6)) state->duplex = DUPLEX_HALF;

if (partner_advertising & BIT(7)) state->pause = MAC_SYM_PAUSE;
else if (partner_advertising & BIT(8)) state->pause = MAC_ASYM_PAUSE;

> Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
> in the SGMSYS_SGMII_MODE register. Does one of these bits set the
> format for the 16-bit control word that's used to convey the
> advertisements. I think the next step would be to play around with
> these and see what effect setting or clearing these bits has -
> please can you give that a go?

these is not clear to me...should i blindly set these and how to verify what they do?

is network broken because of wrong duplex/pause setting? do not fully understand your Patch.
But the timer-change can also break sgmii...

regards Frank
  
Russell King (Oracle) Oct. 22, 2022, 7:18 p.m. UTC | #16
Hi,

On Sat, Oct 22, 2022 at 07:53:16PM +0200, Frank Wunderlich wrote:
> > Gesendet: Samstag, 22. Oktober 2022 um 19:05 Uhr
> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > > > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> 
> > > this patch breaks connectivity at least on the sfp-port (eth1).
> 
> > > pcs_get_state
> > > [   65.522936] offset:0 0x2c1140
> > > [   65.522950] offset:4 0x4d544950
> > > [   65.525914] offset:8 0x40e041a0
> > > [  177.346183] offset:0 0x2c1140
> > > [  177.346202] offset:4 0x4d544950
> > > [  177.349168] offset:8 0x40e041a0
> > > [  177.352477] offset:0 0x2c1140
> > > [  177.356952] offset:4 0x4d544950
> >
> > Hi,
> >
> > Thanks. Well, the results suggest that the register at offset 8 is
> > indeed the advertisement and link-partner advertisement register. So
> > we have a bit of progress and a little more understanding of this
> > hardware.
> >
> > Do you know if your link partner also thinks the link is up?
> 
> yes link is up on my switch, cannot enable autoneg for fibre-port, so port is fixed to 1000M/full flowcontrol enabled.
> 
> > What I notice is:
> >
> > mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
> >
> > The duplex is "unknown" which means you're not filling in the
> > state->duplex field in your pcs_get_state() function. Given the
> > link parter adverisement is 0x00e0, this means the link partner
> > supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
> > is therefore full duplex, so can we hack that in to your
> > pcs_get_state() so we're getting that right for this testing please?
> 
> 0xe0 is bits 5-7 are set (in lower byte from upper word)..which one is for duplex?
> 
> so i should set state->duplex/pause based on this value (maybe compare with own caps)?
> 
> found a documentation where 5=full,6=half, and bits 7+8 are for pause (symetric/asymetric)
> 
> regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1+8, &val);
> partner_advertising = (val & 0x00ff0000) >> 16;

Not quite :) When we have the link partner's advertisement and the BMSR,
we have a helper function in phylink to do all the gritty work:

	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);

	phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);

will do all the work for you without having to care about whether
you're operating at 2500base-X, 1000base-X or SGMII mode.

> > Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
> > in the SGMSYS_SGMII_MODE register. Does one of these bits set the
> > format for the 16-bit control word that's used to convey the
> > advertisements. I think the next step would be to play around with
> > these and see what effect setting or clearing these bits has -
> > please can you give that a go?
> 
> these is not clear to me...should i blindly set these and how to
> verify what they do?

Yes please - I don't think anyone knows what they do.

> is network broken because of wrong duplex/pause setting? do not
> fully understand your Patch.

I suspect not having the duplex correct _could_ break stuff, but I
also wonder whether the PCS is trying to decode the advertisements
itself and coming out with the wrong settings.

If it's interpreting a link partner advertisement of 0x00e0 using
SGMII rules, then it will be looking at bits 11 and 10 for the
speed, both of which are zero, which means 10Mbps - and 1000base-X
doesn't operate at 10Mbps!

So my hunch is that one of those two IF_MODE_BIT{0,5} _might_ change
the way the PCS interprets the control word, but as we don't have
any documentation to go on, only experimentation will answer this
question.

If these registers are MMIO, you could ensure that you have /dev/mem
access enabled, and use devmem2 to poke at this register which would
probably be quicker than doing a build-boot-test cycle with the
kernel - this is how I do a lot of this kind of discovery when
documentation is lacking.

> But the timer-change can also break sgmii...

SGMII mode should be writing the same value to the link timer, but
looking at it now, I see I ended up with one too many zeros on the
16000000! It should be 1.6ms in nanoseconds, so 1600000. Please
correct for future testing.

Many thanks for your patience.
  
Frank Wunderlich Oct. 23, 2022, 7:26 a.m. UTC | #17
> Gesendet: Samstag, 22. Oktober 2022 um 21:18 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> An: "Frank Wunderlich" <frank-w@public-files.de>
> Cc: "Frank Wunderlich" <linux@fw-web.de>, linux-mediatek@lists.infradead.org, "Alexander Couzens" <lynxis@fe80.eu>, "Felix Fietkau" <nbd@nbd.name>, "John Crispin" <john@phrozen.org>, "Sean Wang" <sean.wang@mediatek.com>, "Mark Lee" <Mark-MC.Lee@mediatek.com>, "David S. Miller" <davem@davemloft.net>, "Eric Dumazet" <edumazet@google.com>, "Jakub Kicinski" <kuba@kernel.org>, "Paolo Abeni" <pabeni@redhat.com>, "Matthias Brugger" <matthias.bgg@gmail.com>, netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
> Betreff: Re: Re: Re: Re: Re: [PATCH v2] net: mtk_sgmii: implement mtk_pcs_ops
>
> Hi,
>
> On Sat, Oct 22, 2022 at 07:53:16PM +0200, Frank Wunderlich wrote:
> > > Gesendet: Samstag, 22. Oktober 2022 um 19:05 Uhr
> > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > > > > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> >
> > > > this patch breaks connectivity at least on the sfp-port (eth1).
> >
> > > > pcs_get_state
> > > > [   65.522936] offset:0 0x2c1140
> > > > [   65.522950] offset:4 0x4d544950
> > > > [   65.525914] offset:8 0x40e041a0
> > > > [  177.346183] offset:0 0x2c1140
> > > > [  177.346202] offset:4 0x4d544950
> > > > [  177.349168] offset:8 0x40e041a0
> > > > [  177.352477] offset:0 0x2c1140
> > > > [  177.356952] offset:4 0x4d544950
> > >
> > > Hi,
> > >
> > > Thanks. Well, the results suggest that the register at offset 8 is
> > > indeed the advertisement and link-partner advertisement register. So
> > > we have a bit of progress and a little more understanding of this
> > > hardware.
> > >
> > > Do you know if your link partner also thinks the link is up?
> >
> > yes link is up on my switch, cannot enable autoneg for fibre-port, so port is fixed to 1000M/full flowcontrol enabled.
> >
> > > What I notice is:
> > >
> > > mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
> > >
> > > The duplex is "unknown" which means you're not filling in the
> > > state->duplex field in your pcs_get_state() function. Given the
> > > link parter adverisement is 0x00e0, this means the link partner
> > > supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
> > > is therefore full duplex, so can we hack that in to your
> > > pcs_get_state() so we're getting that right for this testing please?
> >
> > 0xe0 is bits 5-7 are set (in lower byte from upper word)..which one is for duplex?
> >
> > so i should set state->duplex/pause based on this value (maybe compare with own caps)?
> >
> > found a documentation where 5=full,6=half, and bits 7+8 are for pause (symetric/asymetric)
> >
> > regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1+8, &val);
> > partner_advertising = (val & 0x00ff0000) >> 16;
>
> Not quite :) When we have the link partner's advertisement and the BMSR,
> we have a helper function in phylink to do all the gritty work:
>
> 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);
>
> 	phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);
>
> will do all the work for you without having to care about whether
> you're operating at 2500base-X, 1000base-X or SGMII mode.
>
> > > Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
> > > in the SGMSYS_SGMII_MODE register. Does one of these bits set the
> > > format for the 16-bit control word that's used to convey the
> > > advertisements. I think the next step would be to play around with
> > > these and see what effect setting or clearing these bits has -
> > > please can you give that a go?
> >
> > these is not clear to me...should i blindly set these and how to
> > verify what they do?
>
> Yes please - I don't think anyone knows what they do.

i guess BIT0 is the SGMII_EN flag like in other sgmii implementations.
Bit5 is "reserved" in all docs i've found....maybe it is related to HSGMII
or for 1G vs. 2G5.

but how to check what has changed...i guess only the register itself changed
and i have to readout another to check whats changed.

do we really need these 2 bits? reading/setting duplex/pause from/to the register
makes sense, but digging into undocumented bits is much work and we still only guess.

so i would first want to get sgmii working again and then strip the pause/duplex from it.

> > is network broken because of wrong duplex/pause setting? do not
> > fully understand your Patch.
>
> I suspect not having the duplex correct _could_ break stuff, but I
> also wonder whether the PCS is trying to decode the advertisements
> itself and coming out with the wrong settings.
>
> If it's interpreting a link partner advertisement of 0x00e0 using
> SGMII rules, then it will be looking at bits 11 and 10 for the
> speed, both of which are zero, which means 10Mbps - and 1000base-X
> doesn't operate at 10Mbps!

so maybe this breaks sgmii? have you changed this behaviour with your Patch?

> So my hunch is that one of those two IF_MODE_BIT{0,5} _might_ change
> the way the PCS interprets the control word, but as we don't have
> any documentation to go on, only experimentation will answer this
> question.
>
> If these registers are MMIO, you could ensure that you have /dev/mem
> access enabled, and use devmem2 to poke at this register which would
> probably be quicker than doing a build-boot-test cycle with the
> kernel - this is how I do a lot of this kind of discovery when
> documentation is lacking.
>
> > But the timer-change can also break sgmii...
>
> SGMII mode should be writing the same value to the link timer, but
> looking at it now, I see I ended up with one too many zeros on the
> 16000000! It should be 1.6ms in nanoseconds, so 1600000. Please
> correct for future testing.

tried removing 1  zero from the 16000000, but same result.
tried also setting duplex with ethtool, but  after read it is still unknown.
and i get no traffic working...i wonder because duplex was not set before your
patch too, but interface was working.

> Many thanks for your patience.

i do what i can, but i'm limited in time.

Frank
  
Russell King (Oracle) Oct. 23, 2022, 9:43 a.m. UTC | #18
On Sun, Oct 23, 2022 at 09:26:39AM +0200, Frank Wunderlich wrote:
> > Gesendet: Samstag, 22. Oktober 2022 um 21:18 Uhr
> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > An: "Frank Wunderlich" <frank-w@public-files.de>
> > Cc: "Frank Wunderlich" <linux@fw-web.de>, linux-mediatek@lists.infradead.org, "Alexander Couzens" <lynxis@fe80.eu>, "Felix Fietkau" <nbd@nbd.name>, "John Crispin" <john@phrozen.org>, "Sean Wang" <sean.wang@mediatek.com>, "Mark Lee" <Mark-MC.Lee@mediatek.com>, "David S. Miller" <davem@davemloft.net>, "Eric Dumazet" <edumazet@google.com>, "Jakub Kicinski" <kuba@kernel.org>, "Paolo Abeni" <pabeni@redhat.com>, "Matthias Brugger" <matthias.bgg@gmail.com>, netdev@vger.kernel.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
> > Betreff: Re: Re: Re: Re: Re: [PATCH v2] net: mtk_sgmii: implement mtk_pcs_ops
> >
> > Hi,
> >
> > On Sat, Oct 22, 2022 at 07:53:16PM +0200, Frank Wunderlich wrote:
> > > > Gesendet: Samstag, 22. Oktober 2022 um 19:05 Uhr
> > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > > On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > > > > > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > >
> > > > > this patch breaks connectivity at least on the sfp-port (eth1).
> > >
> > > > > pcs_get_state
> > > > > [   65.522936] offset:0 0x2c1140
> > > > > [   65.522950] offset:4 0x4d544950
> > > > > [   65.525914] offset:8 0x40e041a0
> > > > > [  177.346183] offset:0 0x2c1140
> > > > > [  177.346202] offset:4 0x4d544950
> > > > > [  177.349168] offset:8 0x40e041a0
> > > > > [  177.352477] offset:0 0x2c1140
> > > > > [  177.356952] offset:4 0x4d544950
> > > >
> > > > Hi,
> > > >
> > > > Thanks. Well, the results suggest that the register at offset 8 is
> > > > indeed the advertisement and link-partner advertisement register. So
> > > > we have a bit of progress and a little more understanding of this
> > > > hardware.
> > > >
> > > > Do you know if your link partner also thinks the link is up?
> > >
> > > yes link is up on my switch, cannot enable autoneg for fibre-port, so port is fixed to 1000M/full flowcontrol enabled.
> > >
> > > > What I notice is:
> > > >
> > > > mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
> > > >
> > > > The duplex is "unknown" which means you're not filling in the
> > > > state->duplex field in your pcs_get_state() function. Given the
> > > > link parter adverisement is 0x00e0, this means the link partner
> > > > supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
> > > > is therefore full duplex, so can we hack that in to your
> > > > pcs_get_state() so we're getting that right for this testing please?
> > >
> > > 0xe0 is bits 5-7 are set (in lower byte from upper word)..which one is for duplex?
> > >
> > > so i should set state->duplex/pause based on this value (maybe compare with own caps)?
> > >
> > > found a documentation where 5=full,6=half, and bits 7+8 are for pause (symetric/asymetric)
> > >
> > > regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1+8, &val);
> > > partner_advertising = (val & 0x00ff0000) >> 16;
> >
> > Not quite :) When we have the link partner's advertisement and the BMSR,
> > we have a helper function in phylink to do all the gritty work:
> >
> > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);
> >
> > 	phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);
> >
> > will do all the work for you without having to care about whether
> > you're operating at 2500base-X, 1000base-X or SGMII mode.
> >
> > > > Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
> > > > in the SGMSYS_SGMII_MODE register. Does one of these bits set the
> > > > format for the 16-bit control word that's used to convey the
> > > > advertisements. I think the next step would be to play around with
> > > > these and see what effect setting or clearing these bits has -
> > > > please can you give that a go?
> > >
> > > these is not clear to me...should i blindly set these and how to
> > > verify what they do?
> >
> > Yes please - I don't think anyone knows what they do.
> 
> i guess BIT0 is the SGMII_EN flag like in other sgmii implementations.
> Bit5 is "reserved" in all docs i've found....maybe it is related to HSGMII
> or for 1G vs. 2G5.

"other sgmii implementations" ?

If this is the SGMII_EN flag, maybe SGMII_IF_MODE_BIT0 should be
renamed to SGMII_IF_SGMII_EN ? Maybe it needs to be set for SGMII
and clear for base-X ?

> but how to check what has changed...i guess only the register itself changed
> and i have to readout another to check whats changed.
> 
> do we really need these 2 bits? reading/setting duplex/pause from/to the register
> makes sense, but digging into undocumented bits is much work and we still only guess.

I don't know - I've no idea about this hardware, or what the PCS is,
and other people over the years I've talked to have said "we're not
using it, we can't help". The mediatek driver has been somewhat of a
pain for phylink as a result.

> so i would first want to get sgmii working again and then strip the pause/duplex from it.

I think I'd need more information on your setup - is this dev 0? Are
you using in-band mode or fixed-link mode?

I don't think you've updated me with register values for this since
the patch. With the link timer adjusted back to 1.6ms, that should
result in it working again, but if not, I think there's some
possibilities.

The addition of SGMII_AN_ENABLE for SGMSYS_PCS_CONTROL_1 could have
broken your setup if there is no actual in-band signalling, which
basically means that your firmware description is wrong - and you've
possibly been led astray by the poor driver implementation.

Can you confirm that the device on the other end for dev 0 does in
actual fact use in-band signalling please?

> > If it's interpreting a link partner advertisement of 0x00e0 using
> > SGMII rules, then it will be looking at bits 11 and 10 for the
> > speed, both of which are zero, which means 10Mbps - and 1000base-X
> > doesn't operate at 10Mbps!
> 
> so maybe this breaks sgmii? have you changed this behaviour with your Patch?

Nope, but not setting the duplex properly is yet another buggy and poor
quality of implementation that afficts this driver. I've said about
setting the duplex value when reviewing your patch to add .pcs_get_state
and I'm disappointed that you seemingly haven't yet corrected it in the
code you're testing despite those review comments.

If duplex remains as "unknown", then the MAC will be programmed to
operate in _half_ _duplex_ mode (read mtk_mac_link_up()) which is not
what you likely want. Many MACs don't support half duplex at 1G speed,
so it's likely that without setting state->duplex, the result is that
the MAC hardware is programmed incorrectly.

> > So my hunch is that one of those two IF_MODE_BIT{0,5} _might_ change
> > the way the PCS interprets the control word, but as we don't have
> > any documentation to go on, only experimentation will answer this
> > question.
> >
> > If these registers are MMIO, you could ensure that you have /dev/mem
> > access enabled, and use devmem2 to poke at this register which would
> > probably be quicker than doing a build-boot-test cycle with the
> > kernel - this is how I do a lot of this kind of discovery when
> > documentation is lacking.
> >
> > > But the timer-change can also break sgmii...
> >
> > SGMII mode should be writing the same value to the link timer, but
> > looking at it now, I see I ended up with one too many zeros on the
> > 16000000! It should be 1.6ms in nanoseconds, so 1600000. Please
> > correct for future testing.
> 
> tried removing 1  zero from the 16000000, but same result.
> tried also setting duplex with ethtool, but  after read it is still unknown.

Honestly this doesn't surprise me given the poor state of the mtk_sgmii
code. There's lots that this implementation gets wrong, but I can't fix
it without either people willing to research and test stuff, or without
the actual hardware in front of me.

This mtk_eth_soc driver has been a right pain for me ever since the
half-hearted switch-over to use phylink.
  
Frank Wunderlich Oct. 23, 2022, 3:05 p.m. UTC | #19
> Gesendet: Sonntag, 23. Oktober 2022 um 11:43 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>

> On Sun, Oct 23, 2022 at 09:26:39AM +0200, Frank Wunderlich wrote:
> > > Gesendet: Samstag, 22. Oktober 2022 um 21:18 Uhr
> > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > Hi,
> > >
> > > On Sat, Oct 22, 2022 at 07:53:16PM +0200, Frank Wunderlich wrote:
> > > > > Gesendet: Samstag, 22. Oktober 2022 um 19:05 Uhr
> > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > > > On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > > > > > > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > > > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > >
> > > > > > this patch breaks connectivity at least on the sfp-port (eth1).
> > > >
> > > > > > pcs_get_state
> > > > > > [   65.522936] offset:0 0x2c1140
> > > > > > [   65.522950] offset:4 0x4d544950
> > > > > > [   65.525914] offset:8 0x40e041a0
> > > > > > [  177.346183] offset:0 0x2c1140
> > > > > > [  177.346202] offset:4 0x4d544950
> > > > > > [  177.349168] offset:8 0x40e041a0
> > > > > > [  177.352477] offset:0 0x2c1140
> > > > > > [  177.356952] offset:4 0x4d544950
> > > > >
> > > > > Hi,
> > > > >
> > > > > Thanks. Well, the results suggest that the register at offset 8 is
> > > > > indeed the advertisement and link-partner advertisement register. So
> > > > > we have a bit of progress and a little more understanding of this
> > > > > hardware.
> > > > >
> > > > > Do you know if your link partner also thinks the link is up?
> > > >
> > > > yes link is up on my switch, cannot enable autoneg for fibre-port, so port is fixed to 1000M/full flowcontrol enabled.
> > > >
> > > > > What I notice is:
> > > > >
> > > > > mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
> > > > >
> > > > > The duplex is "unknown" which means you're not filling in the
> > > > > state->duplex field in your pcs_get_state() function. Given the
> > > > > link parter adverisement is 0x00e0, this means the link partner
> > > > > supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
> > > > > is therefore full duplex, so can we hack that in to your
> > > > > pcs_get_state() so we're getting that right for this testing please?
> > > >
> > > > 0xe0 is bits 5-7 are set (in lower byte from upper word)..which one is for duplex?
> > > >
> > > > so i should set state->duplex/pause based on this value (maybe compare with own caps)?
> > > >
> > > > found a documentation where 5=full,6=half, and bits 7+8 are for pause (symetric/asymetric)
> > > >
> > > > regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1+8, &val);
> > > > partner_advertising = (val & 0x00ff0000) >> 16;
> > >
> > > Not quite :) When we have the link partner's advertisement and the BMSR,
> > > we have a helper function in phylink to do all the gritty work:
> > >
> > > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> > > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);
> > >
> > > 	phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);
> > >
> > > will do all the work for you without having to care about whether
> > > you're operating at 2500base-X, 1000base-X or SGMII mode.
> > >
> > > > > Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
> > > > > in the SGMSYS_SGMII_MODE register. Does one of these bits set the
> > > > > format for the 16-bit control word that's used to convey the
> > > > > advertisements. I think the next step would be to play around with
> > > > > these and see what effect setting or clearing these bits has -
> > > > > please can you give that a go?
> > > >
> > > > these is not clear to me...should i blindly set these and how to
> > > > verify what they do?
> > >
> > > Yes please - I don't think anyone knows what they do.
> >
> > i guess BIT0 is the SGMII_EN flag like in other sgmii implementations.
> > Bit5 is "reserved" in all docs i've found....maybe it is related to HSGMII
> > or for 1G vs. 2G5.
>
> "other sgmii implementations" ?

yes i googled for sgmii and found register definition for different vendor...
i don't know if sgmii is a standard for each vendor, afair trgmii was not.

> If this is the SGMII_EN flag, maybe SGMII_IF_MODE_BIT0 should be
> renamed to SGMII_IF_SGMII_EN ? Maybe it needs to be set for SGMII
> and clear for base-X ?
>
> > but how to check what has changed...i guess only the register itself changed
> > and i have to readout another to check whats changed.
> >
> > do we really need these 2 bits? reading/setting duplex/pause from/to the register
> > makes sense, but digging into undocumented bits is much work and we still only guess.
>
> I don't know - I've no idea about this hardware, or what the PCS is,
> and other people over the years I've talked to have said "we're not
> using it, we can't help". The mediatek driver has been somewhat of a
> pain for phylink as a result.
>
> > so i would first want to get sgmii working again and then strip the pause/duplex from it.
>
> I think I'd need more information on your setup - is this dev 0? Are
> you using in-band mode or fixed-link mode?

i only test with dev1 which is the sfp-port/eth1/gmac1...dev0 is the fixed-link to switch-chip.

> I don't think you've updated me with register values for this since
> the patch. With the link timer adjusted back to 1.6ms, that should
> result in it working again, but if not, I think there's some
> possibilities.

sorry for that, have debugged timing and it was wrong because if-condition had not included 1000baseX and 2500baseX. only sgmii

> The addition of SGMII_AN_ENABLE for SGMSYS_PCS_CONTROL_1 could have
> broken your setup if there is no actual in-band signalling, which
> basically means that your firmware description is wrong - and you've
> possibly been led astray by the poor driver implementation.

disabled it, but makes no change.

but i've noticed that timing is wrong

old value: 0x186a0
new value: 0x98968

so it takes the 10000000 and not the 1600000. so it looks like interface-mode is not (yet) SGMII.

debugged it and got 1000baseX (21),in dts i have
phy-mode = "2500base-x";
but SFP only supports 1G so mode 1000baseX is right

set the old value with your calculation, but still not working, also with disabled AN_ENABLE-flag ;(

root@bpi-r3:~# ip link set eth1 up
[   44.287442] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000be
[   44.295902] interface-mode 21 (sgmii:4)
root@bpi-r3:~# [   44.295907] timer 0x186a0
[   44.352872] offset:0 0x2c1140
[   44.355507] offset:4 0x4d544950
[   44.358462] offset:8 0x40e041a0
[   44.361609] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Full - flf
[   44.373042] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

root@bpi-r3:~# ip a a 192.168.0.19/24 dev eth1
root@bpi-r3:~# ping 192.168.0.10
PING 192.168.0.10 (192.168.0.10) 56(84) bytes of data.
^C

> Can you confirm that the device on the other end for dev 0 does in
> actual fact use in-band signalling please?
>
> > > If it's interpreting a link partner advertisement of 0x00e0 using
> > > SGMII rules, then it will be looking at bits 11 and 10 for the
> > > speed, both of which are zero, which means 10Mbps - and 1000base-X
> > > doesn't operate at 10Mbps!
> >
> > so maybe this breaks sgmii? have you changed this behaviour with your Patch?
>
> Nope, but not setting the duplex properly is yet another buggy and poor
> quality of implementation that afficts this driver. I've said about
> setting the duplex value when reviewing your patch to add .pcs_get_state
> and I'm disappointed that you seemingly haven't yet corrected it in the
> code you're testing despite those review comments.

sorry, i thought we want to read out the values from registers to set it based on them.

currently i test only with the dev 1 (in-band-managed with 1GBit/s SFP)

[    1.088310] dev: 0 offset:0 0x40140
[    1.088331] dev: 0 offset:4 0x4d544950
[    1.091827] dev: 0 offset:8 0x1
[    1.095607] dev: 1 offset:0 0x81140
[    1.098739] dev: 1 offset:4 0x4d544950
[    1.102214] dev: 1 offset:8 0x1

after bring device up (disabled AN and set duplex to full):

[   34.615926] timer 0x98968
[   34.672888] offset:0 0x2c1140
[   34.675518] offset:4 0x4d544950
[   34.678473] offset:8 0x40e041a0

codebase:

https://github.com/frank-w/BPI-R2-4.14/commits/6.1-r3-sgmii

> If duplex remains as "unknown", then the MAC will be programmed to
> operate in _half_ _duplex_ mode (read mtk_mac_link_up()) which is not
> what you likely want. Many MACs don't support half duplex at 1G speed,
> so it's likely that without setting state->duplex, the result is that
> the MAC hardware is programmed incorrectly.

wonder why it was working with only my patch which had duplex also not set.

> > > So my hunch is that one of those two IF_MODE_BIT{0,5} _might_ change
> > > the way the PCS interprets the control word, but as we don't have
> > > any documentation to go on, only experimentation will answer this
> > > question.

the bits were in offset 0/4/8? are they now different than before?
if yes maybe these break it.

as offset 4 is the phy-id and 8 is the advertisement from local and far
interface i guesss IF_MODE_* is in offset 0.

> > > If these registers are MMIO, you could ensure that you have /dev/mem
> > > access enabled, and use devmem2 to poke at this register which would
> > > probably be quicker than doing a build-boot-test cycle with the
> > > kernel - this is how I do a lot of this kind of discovery when
> > > documentation is lacking.
> > >
> > > > But the timer-change can also break sgmii...

currently this is no more the case as i set the timer to old value
so something else is breaking it.

> > > SGMII mode should be writing the same value to the link timer, but
> > > looking at it now, I see I ended up with one too many zeros on the
> > > 16000000! It should be 1.6ms in nanoseconds, so 1600000. Please
> > > correct for future testing.
> >
> > tried removing 1  zero from the 16000000, but same result.
> > tried also setting duplex with ethtool, but  after read it is still unknown.
>
> Honestly this doesn't surprise me given the poor state of the mtk_sgmii
> code. There's lots that this implementation gets wrong, but I can't fix
> it without either people willing to research and test stuff, or without
> the actual hardware in front of me.

i can test, but i do not fully understand the code nor have the exoerience
to work with devmem2. have used it long time ago but with assistance to read
which register and with expected values.

> This mtk_eth_soc driver has been a right pain for me ever since the
> half-hearted switch-over to use phylink.

i know it is huge as it covers many different SoC. but i'm not able to rewrite it :p

regards Frank
  
Russell King (Oracle) Oct. 23, 2022, 3:46 p.m. UTC | #20
On Sun, Oct 23, 2022 at 05:05:30PM +0200, Frank Wunderlich wrote:
> > Gesendet: Sonntag, 23. Oktober 2022 um 11:43 Uhr
> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> 
> > On Sun, Oct 23, 2022 at 09:26:39AM +0200, Frank Wunderlich wrote:
> > > > Gesendet: Samstag, 22. Oktober 2022 um 21:18 Uhr
> > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > > Hi,
> > > >
> > > > On Sat, Oct 22, 2022 at 07:53:16PM +0200, Frank Wunderlich wrote:
> > > > > > Gesendet: Samstag, 22. Oktober 2022 um 19:05 Uhr
> > > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > > > > On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > > > > > > > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > > > > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > > >
> > > > > > > this patch breaks connectivity at least on the sfp-port (eth1).
> > > > >
> > > > > > > pcs_get_state
> > > > > > > [   65.522936] offset:0 0x2c1140
> > > > > > > [   65.522950] offset:4 0x4d544950
> > > > > > > [   65.525914] offset:8 0x40e041a0
> > > > > > > [  177.346183] offset:0 0x2c1140
> > > > > > > [  177.346202] offset:4 0x4d544950
> > > > > > > [  177.349168] offset:8 0x40e041a0
> > > > > > > [  177.352477] offset:0 0x2c1140
> > > > > > > [  177.356952] offset:4 0x4d544950
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > Thanks. Well, the results suggest that the register at offset 8 is
> > > > > > indeed the advertisement and link-partner advertisement register. So
> > > > > > we have a bit of progress and a little more understanding of this
> > > > > > hardware.
> > > > > >
> > > > > > Do you know if your link partner also thinks the link is up?
> > > > >
> > > > > yes link is up on my switch, cannot enable autoneg for fibre-port, so port is fixed to 1000M/full flowcontrol enabled.
> > > > >
> > > > > > What I notice is:
> > > > > >
> > > > > > mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
> > > > > >
> > > > > > The duplex is "unknown" which means you're not filling in the
> > > > > > state->duplex field in your pcs_get_state() function. Given the
> > > > > > link parter adverisement is 0x00e0, this means the link partner
> > > > > > supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
> > > > > > is therefore full duplex, so can we hack that in to your
> > > > > > pcs_get_state() so we're getting that right for this testing please?
> > > > >
> > > > > 0xe0 is bits 5-7 are set (in lower byte from upper word)..which one is for duplex?
> > > > >
> > > > > so i should set state->duplex/pause based on this value (maybe compare with own caps)?
> > > > >
> > > > > found a documentation where 5=full,6=half, and bits 7+8 are for pause (symetric/asymetric)
> > > > >
> > > > > regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1+8, &val);
> > > > > partner_advertising = (val & 0x00ff0000) >> 16;
> > > >
> > > > Not quite :) When we have the link partner's advertisement and the BMSR,
> > > > we have a helper function in phylink to do all the gritty work:
> > > >
> > > > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> > > > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);
> > > >
> > > > 	phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);
> > > >
> > > > will do all the work for you without having to care about whether
> > > > you're operating at 2500base-X, 1000base-X or SGMII mode.
> > > >
> > > > > > Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
> > > > > > in the SGMSYS_SGMII_MODE register. Does one of these bits set the
> > > > > > format for the 16-bit control word that's used to convey the
> > > > > > advertisements. I think the next step would be to play around with
> > > > > > these and see what effect setting or clearing these bits has -
> > > > > > please can you give that a go?
> > > > >
> > > > > these is not clear to me...should i blindly set these and how to
> > > > > verify what they do?
> > > >
> > > > Yes please - I don't think anyone knows what they do.
> > >
> > > i guess BIT0 is the SGMII_EN flag like in other sgmii implementations.
> > > Bit5 is "reserved" in all docs i've found....maybe it is related to HSGMII
> > > or for 1G vs. 2G5.
> >
> > "other sgmii implementations" ?
> 
> yes i googled for sgmii and found register definition for different vendor...
> i don't know if sgmii is a standard for each vendor, afair trgmii was not.
> 
> > If this is the SGMII_EN flag, maybe SGMII_IF_MODE_BIT0 should be
> > renamed to SGMII_IF_SGMII_EN ? Maybe it needs to be set for SGMII
> > and clear for base-X ?
> >
> > > but how to check what has changed...i guess only the register itself changed
> > > and i have to readout another to check whats changed.
> > >
> > > do we really need these 2 bits? reading/setting duplex/pause from/to the register
> > > makes sense, but digging into undocumented bits is much work and we still only guess.
> >
> > I don't know - I've no idea about this hardware, or what the PCS is,
> > and other people over the years I've talked to have said "we're not
> > using it, we can't help". The mediatek driver has been somewhat of a
> > pain for phylink as a result.
> >
> > > so i would first want to get sgmii working again and then strip the pause/duplex from it.
> >
> > I think I'd need more information on your setup - is this dev 0? Are
> > you using in-band mode or fixed-link mode?
> 
> i only test with dev1 which is the sfp-port/eth1/gmac1...dev0 is the fixed-link to switch-chip.

Okay, so when you're using SGMII, how are you testing it? With a copper
SFP plugged in?

> > I don't think you've updated me with register values for this since
> > the patch. With the link timer adjusted back to 1.6ms, that should
> > result in it working again, but if not, I think there's some
> > possibilities.
> 
> sorry for that, have debugged timing and it was wrong because if-
> condition had not included 1000baseX and 2500baseX. only sgmii

SGMII's link timer is specified to be 1.6ms - the SGMII v1.8 spec
doesn't specify the margins for this.

802.3z (1000base-X) is 10ms +10ms -0s.

This is what we should be using, and what I tried to implement.

The hex values programmed into the register should be 0x186A0 for
SGMII and 0x98968 for 1000base-X and 2500base-X - both values
should fit because the link timer is apparently 20 bits wide.

> > The addition of SGMII_AN_ENABLE for SGMSYS_PCS_CONTROL_1 could have
> > broken your setup if there is no actual in-band signalling, which
> > basically means that your firmware description is wrong - and you've
> > possibly been led astray by the poor driver implementation.
> 
> disabled it, but makes no change.
> 
> but i've noticed that timing is wrong
> 
> old value: 0x186a0
> new value: 0x98968
> 
> so it takes the 10000000 and not the 1600000. so it looks like interface-mode is not (yet) SGMII.
> 
> debugged it and got 1000baseX (21),in dts i have
> phy-mode = "2500base-x";
> but SFP only supports 1G so mode 1000baseX is right
> 
> set the old value with your calculation, but still not working, also with disabled AN_ENABLE-flag ;(

I'm getting the impression that there's some confusing terminology going
on here... can we clear this up please?

SGMII is a proprietary modification of the 802.3z 1000base-X standard
which:
- reduces the link timer from 10ms to 1.6ms
- implements data replication by 10x and 100x to achieve 100M and 10M
  speeds over a link operating at a fixed speed of 1.25Gbaud.
- changes the control word format to allow a SGMII PHY to signal to the
  MAC in a timely manner which speed it is operating at.

So, if you're using a fibre SFP to another device that is operating in
1000base-X mode, then you're wanting 1000base-X and not SGMII, and
referring to this as SGMII is technically misleading.

> root@bpi-r3:~# ip link set eth1 up
> [   44.287442] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000be
> [   44.295902] interface-mode 21 (sgmii:4)
> root@bpi-r3:~# [   44.295907] timer 0x186a0
> [   44.352872] offset:0 0x2c1140
> [   44.355507] offset:4 0x4d544950
> [   44.358462] offset:8 0x40e041a0
> [   44.361609] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Full - flf
> [   44.373042] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> 
> root@bpi-r3:~# ip a a 192.168.0.19/24 dev eth1
> root@bpi-r3:~# ping 192.168.0.10
> PING 192.168.0.10 (192.168.0.10) 56(84) bytes of data.
> ^C

This is where I postulated that the PCS is trying to interpret the
advertisements as if they are SGMII formatted control words rather than
1000base-X formatted control words - and by doing so, it is trying to
operate at 10Mbps (100x data replication) with the remote end trying to
operate at 1000Mbps. If that is what it is doing, then you will have
link-up but no communication.

The solution to this is likely trying to find a bit that tells the
PCS whether it should be expecting a 1000base-X (or 802.3z) formatted
control word (aka 1000base-X mode) or a SGMII formatted control word.

You mentioned that bit 0 in SGMSYS_SGMII_MODE is a "SGMII_EN" bit.
Any ideas exactly what this bit does? Does it enable the PCS as a
whole, or could that be the bit which switches between 1000base-X
mode and SGMII mode? (More on this below).

Note that the "old way" used to work because even in 1000base-X
mode, the code would (technically incorrectly) force the PCS to
use a fixed configuration of 1000Mbps and force the duplex bit -
basically no 802.3 specified autonegotiation.

However, 1000base-X with in-band signalling _should_ be using
the autonegotiation - as everything else that uses phylink does.

> > Can you confirm that the device on the other end for dev 0 does in
> > actual fact use in-band signalling please?
> >
> > > > If it's interpreting a link partner advertisement of 0x00e0 using
> > > > SGMII rules, then it will be looking at bits 11 and 10 for the
> > > > speed, both of which are zero, which means 10Mbps - and 1000base-X
> > > > doesn't operate at 10Mbps!
> > >
> > > so maybe this breaks sgmii? have you changed this behaviour with your Patch?
> >
> > Nope, but not setting the duplex properly is yet another buggy and poor
> > quality of implementation that afficts this driver. I've said about
> > setting the duplex value when reviewing your patch to add .pcs_get_state
> > and I'm disappointed that you seemingly haven't yet corrected it in the
> > code you're testing despite those review comments.
> 
> sorry, i thought we want to read out the values from registers to set it based on them.
> 
> currently i test only with the dev 1 (in-band-managed with 1GBit/s SFP)
> 
> [    1.088310] dev: 0 offset:0 0x40140
> [    1.088331] dev: 0 offset:4 0x4d544950
> [    1.091827] dev: 0 offset:8 0x1
> [    1.095607] dev: 1 offset:0 0x81140
> [    1.098739] dev: 1 offset:4 0x4d544950
> [    1.102214] dev: 1 offset:8 0x1
> 
> after bring device up (disabled AN and set duplex to full):
> 
> [   34.615926] timer 0x98968
> [   34.672888] offset:0 0x2c1140
> [   34.675518] offset:4 0x4d544950
> [   34.678473] offset:8 0x40e041a0
> 
> codebase:
> 
> https://github.com/frank-w/BPI-R2-4.14/commits/6.1-r3-sgmii

I think it would also be useful to print the register at offset 32
as well, which is the SGMSYS_SGMII_MODE register, so we can discover
what the initial and current values of these IF_MODE_BITs are. I
may then be able to provide you an updated patch.

> > If duplex remains as "unknown", then the MAC will be programmed to
> > operate in _half_ _duplex_ mode (read mtk_mac_link_up()) which is not
> > what you likely want. Many MACs don't support half duplex at 1G speed,
> > so it's likely that without setting state->duplex, the result is that
> > the MAC hardware is programmed incorrectly.
> 
> wonder why it was working with only my patch which had duplex also not set.

It depends entirely on the MAC implementation and why the manufacturer
decides to state that 1000 half-duplex isn't supported by the hardware!
I don't think we can guess. However, configuring the hardware correctly
eliminates potential issues.

It is in entirely possible for devices configured with dissimilar
duplex settings to communicate, but there will be packet loss - since
the end operating in full duplex will transmit while the receiving
end could also be transmitting, and the receving end could interpret
that as a collision.

> > > > So my hunch is that one of those two IF_MODE_BIT{0,5} _might_ change
> > > > the way the PCS interprets the control word, but as we don't have
> > > > any documentation to go on, only experimentation will answer this
> > > > question.
> 
> the bits were in offset 0/4/8? are they now different than before?
> if yes maybe these break it.
> 
> as offset 4 is the phy-id and 8 is the advertisement from local and far
> interface i guesss IF_MODE_* is in offset 0.

They're in the register at offset 32:

/* Register to control remote fault */
#define SGMSYS_SGMII_MODE               0x20
#define SGMII_IF_MODE_BIT0              BIT(0)
...
#define SGMII_IF_MODE_BIT5              BIT(5)

So, I think the first step should be to print the value of this register
along side the other three you've been providing me and update me with
its value. I'll then provide you a replacement patch to try.

Russell.
  
Frank Wunderlich Oct. 23, 2022, 4:41 p.m. UTC | #21
> Gesendet: Sonntag, 23. Oktober 2022 um 17:46 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> On Sun, Oct 23, 2022 at 05:05:30PM +0200, Frank Wunderlich wrote:
> > > Gesendet: Sonntag, 23. Oktober 2022 um 11:43 Uhr
> > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> >
> > > On Sun, Oct 23, 2022 at 09:26:39AM +0200, Frank Wunderlich wrote:
> > > > > Gesendet: Samstag, 22. Oktober 2022 um 21:18 Uhr
> > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > > > Hi,
> > > > >
> > > > > On Sat, Oct 22, 2022 at 07:53:16PM +0200, Frank Wunderlich wrote:
> > > > > > > Gesendet: Samstag, 22. Oktober 2022 um 19:05 Uhr
> > > > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > > > > > On Sat, Oct 22, 2022 at 12:52:00PM +0200, Frank Wunderlich wrote:
> > > > > > > > > Gesendet: Samstag, 22. Oktober 2022 um 11:11 Uhr
> > > > > > > > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > > > > >
> > > > > > > > this patch breaks connectivity at least on the sfp-port (eth1).
> > > > > >
> > > > > > > > pcs_get_state
> > > > > > > > [   65.522936] offset:0 0x2c1140
> > > > > > > > [   65.522950] offset:4 0x4d544950
> > > > > > > > [   65.525914] offset:8 0x40e041a0
> > > > > > > > [  177.346183] offset:0 0x2c1140
> > > > > > > > [  177.346202] offset:4 0x4d544950
> > > > > > > > [  177.349168] offset:8 0x40e041a0
> > > > > > > > [  177.352477] offset:0 0x2c1140
> > > > > > > > [  177.356952] offset:4 0x4d544950
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > Thanks. Well, the results suggest that the register at offset 8 is
> > > > > > > indeed the advertisement and link-partner advertisement register. So
> > > > > > > we have a bit of progress and a little more understanding of this
> > > > > > > hardware.
> > > > > > >
> > > > > > > Do you know if your link partner also thinks the link is up?
> > > > > >
> > > > > > yes link is up on my switch, cannot enable autoneg for fibre-port, so port is fixed to 1000M/full flowcontrol enabled.
> > > > > >
> > > > > > > What I notice is:
> > > > > > >
> > > > > > > mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Unknown - flow control off
> > > > > > >
> > > > > > > The duplex is "unknown" which means you're not filling in the
> > > > > > > state->duplex field in your pcs_get_state() function. Given the
> > > > > > > link parter adverisement is 0x00e0, this means the link partner
> > > > > > > supports PAUSE, 1000base-X/Half and 1000base-X/Full. The resolution
> > > > > > > is therefore full duplex, so can we hack that in to your
> > > > > > > pcs_get_state() so we're getting that right for this testing please?
> > > > > >
> > > > > > 0xe0 is bits 5-7 are set (in lower byte from upper word)..which one is for duplex?
> > > > > >
> > > > > > so i should set state->duplex/pause based on this value (maybe compare with own caps)?
> > > > > >
> > > > > > found a documentation where 5=full,6=half, and bits 7+8 are for pause (symetric/asymetric)
> > > > > >
> > > > > > regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1+8, &val);
> > > > > > partner_advertising = (val & 0x00ff0000) >> 16;
> > > > >
> > > > > Not quite :) When we have the link partner's advertisement and the BMSR,
> > > > > we have a helper function in phylink to do all the gritty work:
> > > > >
> > > > > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> > > > > 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);
> > > > >
> > > > > 	phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);
> > > > >
> > > > > will do all the work for you without having to care about whether
> > > > > you're operating at 2500base-X, 1000base-X or SGMII mode.
> > > > >
> > > > > > > Now, I'm wondering what SGMII_IF_MODE_BIT0 and SGMII_IF_MODE_BIT5 do
> > > > > > > in the SGMSYS_SGMII_MODE register. Does one of these bits set the
> > > > > > > format for the 16-bit control word that's used to convey the
> > > > > > > advertisements. I think the next step would be to play around with
> > > > > > > these and see what effect setting or clearing these bits has -
> > > > > > > please can you give that a go?
> > > > > >
> > > > > > these is not clear to me...should i blindly set these and how to
> > > > > > verify what they do?
> > > > >
> > > > > Yes please - I don't think anyone knows what they do.
> > > >
> > > > i guess BIT0 is the SGMII_EN flag like in other sgmii implementations.
> > > > Bit5 is "reserved" in all docs i've found....maybe it is related to HSGMII
> > > > or for 1G vs. 2G5.
> > >
> > > "other sgmii implementations" ?
> >
> > yes i googled for sgmii and found register definition for different vendor...
> > i don't know if sgmii is a standard for each vendor, afair trgmii was not.
> >
> > > If this is the SGMII_EN flag, maybe SGMII_IF_MODE_BIT0 should be
> > > renamed to SGMII_IF_SGMII_EN ? Maybe it needs to be set for SGMII
> > > and clear for base-X ?
> > >
> > > > but how to check what has changed...i guess only the register itself changed
> > > > and i have to readout another to check whats changed.
> > > >
> > > > do we really need these 2 bits? reading/setting duplex/pause from/to the register
> > > > makes sense, but digging into undocumented bits is much work and we still only guess.
> > >
> > > I don't know - I've no idea about this hardware, or what the PCS is,
> > > and other people over the years I've talked to have said "we're not
> > > using it, we can't help". The mediatek driver has been somewhat of a
> > > pain for phylink as a result.
> > >
> > > > so i would first want to get sgmii working again and then strip the pause/duplex from it.
> > >
> > > I think I'd need more information on your setup - is this dev 0? Are
> > > you using in-band mode or fixed-link mode?
> >
> > i only test with dev1 which is the sfp-port/eth1/gmac1...dev0 is the fixed-link to switch-chip.
>
> Okay, so when you're using SGMII, how are you testing it? With a copper
> SFP plugged in?
>
> > > I don't think you've updated me with register values for this since
> > > the patch. With the link timer adjusted back to 1.6ms, that should
> > > result in it working again, but if not, I think there's some
> > > possibilities.
> >
> > sorry for that, have debugged timing and it was wrong because if-
> > condition had not included 1000baseX and 2500baseX. only sgmii
>
> SGMII's link timer is specified to be 1.6ms - the SGMII v1.8 spec
> doesn't specify the margins for this.
>
> 802.3z (1000base-X) is 10ms +10ms -0s.
>
> This is what we should be using, and what I tried to implement.
>
> The hex values programmed into the register should be 0x186A0 for
> SGMII and 0x98968 for 1000base-X and 2500base-X - both values
> should fit because the link timer is apparently 20 bits wide.
>
> > > The addition of SGMII_AN_ENABLE for SGMSYS_PCS_CONTROL_1 could have
> > > broken your setup if there is no actual in-band signalling, which
> > > basically means that your firmware description is wrong - and you've
> > > possibly been led astray by the poor driver implementation.
> >
> > disabled it, but makes no change.
> >
> > but i've noticed that timing is wrong
> >
> > old value: 0x186a0
> > new value: 0x98968
> >
> > so it takes the 10000000 and not the 1600000. so it looks like interface-mode is not (yet) SGMII.
> >
> > debugged it and got 1000baseX (21),in dts i have
> > phy-mode = "2500base-x";
> > but SFP only supports 1G so mode 1000baseX is right
> >
> > set the old value with your calculation, but still not working, also with disabled AN_ENABLE-flag ;(
>
> I'm getting the impression that there's some confusing terminology going
> on here... can we clear this up please?
>
> SGMII is a proprietary modification of the 802.3z 1000base-X standard
> which:
> - reduces the link timer from 10ms to 1.6ms
> - implements data replication by 10x and 100x to achieve 100M and 10M
>   speeds over a link operating at a fixed speed of 1.25Gbaud.
> - changes the control word format to allow a SGMII PHY to signal to the
>   MAC in a timely manner which speed it is operating at.
>
> So, if you're using a fibre SFP to another device that is operating in
> 1000base-X mode, then you're wanting 1000base-X and not SGMII, and
> referring to this as SGMII is technically misleading.
>
> > root@bpi-r3:~# ip link set eth1 up
> > [   44.287442] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000be
> > [   44.295902] interface-mode 21 (sgmii:4)
> > root@bpi-r3:~# [   44.295907] timer 0x186a0
> > [   44.352872] offset:0 0x2c1140
> > [   44.355507] offset:4 0x4d544950
> > [   44.358462] offset:8 0x40e041a0
> > [   44.361609] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Full - flf
> > [   44.373042] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> >
> > root@bpi-r3:~# ip a a 192.168.0.19/24 dev eth1
> > root@bpi-r3:~# ping 192.168.0.10
> > PING 192.168.0.10 (192.168.0.10) 56(84) bytes of data.
> > ^C
>
> This is where I postulated that the PCS is trying to interpret the
> advertisements as if they are SGMII formatted control words rather than
> 1000base-X formatted control words - and by doing so, it is trying to
> operate at 10Mbps (100x data replication) with the remote end trying to
> operate at 1000Mbps. If that is what it is doing, then you will have
> link-up but no communication.
>
> The solution to this is likely trying to find a bit that tells the
> PCS whether it should be expecting a 1000base-X (or 802.3z) formatted
> control word (aka 1000base-X mode) or a SGMII formatted control word.
>
> You mentioned that bit 0 in SGMSYS_SGMII_MODE is a "SGMII_EN" bit.
> Any ideas exactly what this bit does? Does it enable the PCS as a
> whole, or could that be the bit which switches between 1000base-X
> mode and SGMII mode? (More on this below).
>
> Note that the "old way" used to work because even in 1000base-X
> mode, the code would (technically incorrectly) force the PCS to
> use a fixed configuration of 1000Mbps and force the duplex bit -
> basically no 802.3 specified autonegotiation.
>
> However, 1000base-X with in-band signalling _should_ be using
> the autonegotiation - as everything else that uses phylink does.
>
> > > Can you confirm that the device on the other end for dev 0 does in
> > > actual fact use in-band signalling please?
> > >
> > > > > If it's interpreting a link partner advertisement of 0x00e0 using
> > > > > SGMII rules, then it will be looking at bits 11 and 10 for the
> > > > > speed, both of which are zero, which means 10Mbps - and 1000base-X
> > > > > doesn't operate at 10Mbps!
> > > >
> > > > so maybe this breaks sgmii? have you changed this behaviour with your Patch?
> > >
> > > Nope, but not setting the duplex properly is yet another buggy and poor
> > > quality of implementation that afficts this driver. I've said about
> > > setting the duplex value when reviewing your patch to add .pcs_get_state
> > > and I'm disappointed that you seemingly haven't yet corrected it in the
> > > code you're testing despite those review comments.
> >
> > sorry, i thought we want to read out the values from registers to set it based on them.
> >
> > currently i test only with the dev 1 (in-band-managed with 1GBit/s SFP)
> >
> > [    1.088310] dev: 0 offset:0 0x40140
> > [    1.088331] dev: 0 offset:4 0x4d544950
> > [    1.091827] dev: 0 offset:8 0x1
> > [    1.095607] dev: 1 offset:0 0x81140
> > [    1.098739] dev: 1 offset:4 0x4d544950
> > [    1.102214] dev: 1 offset:8 0x1
> >
> > after bring device up (disabled AN and set duplex to full):
> >
> > [   34.615926] timer 0x98968
> > [   34.672888] offset:0 0x2c1140
> > [   34.675518] offset:4 0x4d544950
> > [   34.678473] offset:8 0x40e041a0
> >
> > codebase:
> >
> > https://github.com/frank-w/BPI-R2-4.14/commits/6.1-r3-sgmii
>
> I think it would also be useful to print the register at offset 32
> as well, which is the SGMSYS_SGMII_MODE register, so we can discover
> what the initial and current values of these IF_MODE_BITs are. I
> may then be able to provide you an updated patch.
>
> > > If duplex remains as "unknown", then the MAC will be programmed to
> > > operate in _half_ _duplex_ mode (read mtk_mac_link_up()) which is not
> > > what you likely want. Many MACs don't support half duplex at 1G speed,
> > > so it's likely that without setting state->duplex, the result is that
> > > the MAC hardware is programmed incorrectly.
> >
> > wonder why it was working with only my patch which had duplex also not set.
>
> It depends entirely on the MAC implementation and why the manufacturer
> decides to state that 1000 half-duplex isn't supported by the hardware!
> I don't think we can guess. However, configuring the hardware correctly
> eliminates potential issues.
>
> It is in entirely possible for devices configured with dissimilar
> duplex settings to communicate, but there will be packet loss - since
> the end operating in full duplex will transmit while the receiving
> end could also be transmitting, and the receving end could interpret
> that as a collision.
>
> > > > > So my hunch is that one of those two IF_MODE_BIT{0,5} _might_ change
> > > > > the way the PCS interprets the control word, but as we don't have
> > > > > any documentation to go on, only experimentation will answer this
> > > > > question.
> >
> > the bits were in offset 0/4/8? are they now different than before?
> > if yes maybe these break it.
> >
> > as offset 4 is the phy-id and 8 is the advertisement from local and far
> > interface i guesss IF_MODE_* is in offset 0.
>
> They're in the register at offset 32:
>
> /* Register to control remote fault */
> #define SGMSYS_SGMII_MODE               0x20
> #define SGMII_IF_MODE_BIT0              BIT(0)
> ...
> #define SGMII_IF_MODE_BIT5              BIT(5)
>
> So, I think the first step should be to print the value of this register
> along side the other three you've been providing me and update me with
> its value. I'll then provide you a replacement patch to try.

here it is (AN again enabled, changes pushed to tree above):

bootup:

[    1.098876] dev: 1 offset:0 0x81140
[    1.102699] dev: 1 offset:4 0x4d544950
[    1.106180] dev: 1 offset:8 0x1
[    1.109914] dev: 1 offset:32 0x3112001b

after putting eth1 up:

[   32.566099] timer 0x186a0
[   32.623021] offset:0 0x2c1140
[   32.625653] offset:4 0x4d544950
[   32.628614] offset:8 0x40e041a0
[   32.631746] offset:32 0x3112011b

regards Frank
  
Russell King (Oracle) Oct. 23, 2022, 5:52 p.m. UTC | #22
On Sun, Oct 23, 2022 at 06:41:45PM +0200, Frank Wunderlich wrote:
> bootup:
> 
> [    1.098876] dev: 1 offset:0 0x81140
> [    1.102699] dev: 1 offset:4 0x4d544950
> [    1.106180] dev: 1 offset:8 0x1
> [    1.109914] dev: 1 offset:32 0x3112001b
> 
> after putting eth1 up:
> 
> [   32.566099] timer 0x186a0
> [   32.623021] offset:0 0x2c1140
> [   32.625653] offset:4 0x4d544950
> [   32.628614] offset:8 0x40e041a0
> [   32.631746] offset:32 0x3112011b

Hi Frank,

Based on this, could you give the following patch a try - it replaces
my previous patch.

Thanks.

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index b52f3b0177ef..1a3eb3ecf7e3 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -479,7 +479,7 @@
 
 /* Register to programmable link timer, the unit in 2 * 8ns */
 #define SGMSYS_PCS_LINK_TIMER	0x18
-#define SGMII_LINK_TIMER_DEFAULT	(0x186a0 & GENMASK(19, 0))
+#define SGMII_LINK_TIMER_MASK		GENMASK(19, 0)
 
 /* Register to control remote fault */
 #define SGMSYS_SGMII_MODE		0x20
diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
index 736839c84130..63736c52bab2 100644
--- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
+++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
@@ -20,19 +20,40 @@ static struct mtk_pcs *pcs_to_mtk_pcs(struct phylink_pcs *pcs)
 }
 
 /* For SGMII interface mode */
-static int mtk_pcs_setup_mode_an(struct mtk_pcs *mpcs)
+static int mtk_pcs_setup_mode_an(struct mtk_pcs *mpcs,
+				 phy_interface_t interface,
+				 const unsigned long *advertising)
 {
-	unsigned int val;
+	unsigned int val, link_timer;
+	int advertise;
+	bool changed;
+
+	advertise = phylink_mii_c22_pcs_encode_advertisement(interface,
+							     advertising);
+	if (advertise < 0)
+		return advertise;
+
+	if (interface == PHY_INTERFACE_MODE_SGMII)
+		link_timer = 1600000 / 2 / 8;
+	else
+		link_timer = 10000000 / 2 / 8;
 
 	/* Setup the link timer and QPHY power up inside SGMIISYS */
-	regmap_write(mpcs->regmap, SGMSYS_PCS_LINK_TIMER,
-		     SGMII_LINK_TIMER_DEFAULT);
+	regmap_write(mpcs->regmap, SGMSYS_PCS_LINK_TIMER, link_timer);
 
 	regmap_read(mpcs->regmap, SGMSYS_SGMII_MODE, &val);
+	if (interface = == PHY_INTERFACE_MODE_SGMII)
+		val |= SGMII_IF_MODE_BIT0;
+	else
+		val &= ~SGMII_IF_MODE_BIT0;
 	val |= SGMII_REMOTE_FAULT_DIS;
 	regmap_write(mpcs->regmap, SGMSYS_SGMII_MODE, val);
 
+	regmap_update_bits_check(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, 0xffff,
+				 advertise, &changed);
+
 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &val);
+	val |= SGMII_AN_ENABLE;
 	val |= SGMII_AN_RESTART;
 	regmap_write(mpcs->regmap, SGMSYS_PCS_CONTROL_1, val);
 
@@ -40,7 +61,7 @@ static int mtk_pcs_setup_mode_an(struct mtk_pcs *mpcs)
 	val &= ~SGMII_PHYA_PWD;
 	regmap_write(mpcs->regmap, SGMSYS_QPHY_PWR_STATE_CTRL, val);
 
-	return 0;
+	return changed ? 1 : 0;
 
 }
 
@@ -52,12 +73,6 @@ static int mtk_pcs_setup_mode_force(struct mtk_pcs *mpcs,
 {
 	unsigned int val;
 
-	regmap_read(mpcs->regmap, mpcs->ana_rgc3, &val);
-	val &= ~RG_PHY_SPEED_MASK;
-	if (interface == PHY_INTERFACE_MODE_2500BASEX)
-		val |= RG_PHY_SPEED_3_125G;
-	regmap_write(mpcs->regmap, mpcs->ana_rgc3, val);
-
 	/* Disable SGMII AN */
 	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &val);
 	val &= ~SGMII_AN_ENABLE;
@@ -83,13 +98,22 @@ static int mtk_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
 			  bool permit_pause_to_mac)
 {
 	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
+	unsigned int val;
 	int err = 0;
 
+	if (interface == PHY_INTERFACE_MODE_2500BASEX)
+		val = RG_PHY_SPEED_3_125G;
+	else
+		val = 0;
+
+	regmap_update_bits(mpcs->regmap, mpcs->ana_rgc3,
+			   RG_PHY_SPEED_3_125G, val);
+
 	/* Setup SGMIISYS with the determined property */
-	if (interface != PHY_INTERFACE_MODE_SGMII)
+	if (phylink_autoneg_inband(mode))
+		err = mtk_pcs_setup_mode_an(mpcs, interface, advertising);
+	else if (interface != PHY_INTERFACE_MODE_SGMII)
 		err = mtk_pcs_setup_mode_force(mpcs, interface);
-	else if (phylink_autoneg_inband(mode))
-		err = mtk_pcs_setup_mode_an(mpcs);
 
 	return err;
 }
  
Frank Wunderlich Oct. 23, 2022, 7:03 p.m. UTC | #23
> Gesendet: Sonntag, 23. Oktober 2022 um 19:52 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>

> Hi Frank,
>
> Based on this, could you give the following patch a try - it replaces
> my previous patch.

looks better now:

root@bpi-r3:~# ip link set eth1 up
[   59.437700] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/1000base-x link mode
root@bpi-r3:~# [   59.446191] interface-mode: 21 advertise: 0x1a0 link timer:0x98968
[   59.503145] offset:0 0x2c1140
[   59.509329] offset:4 0x4d544950
[   59.512284] offset:8 0x40e041a0
[   59.515446] offset:32 0x3112011a
[   59.518598] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 1Gbps/Full - flow control off
[   59.530096] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

root@bpi-r3:~# ip a a 192.168.0.19/24 dev eth1
root@bpi-r3:~# ping 192.168.0.10
PING 192.168.0.10 (192.168.0.10) 56(84) bytes of data.
64 bytes from 192.168.0.10: icmp_seq=1 ttl=64 time=0.863 ms
64 bytes from 192.168.0.10: icmp_seq=2 ttl=64 time=0.491 ms
^C
--- 192.168.0.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1071ms
rtt min/avg/max/mdev = 0.491/0.677/0.863/0.186 ms
root@bpi-r3:~# ethtool eth1
[  128.027246] offset:0 0x2c1140
[  128.027264] offset:4 0x4d544950
[  128.030230] offset:8 0x40e041a0
Settings for eth[  128.033411] offset:32 0x3112011a
1:
        Supported p[  128.036798] offset:0 0x2c1140
[  128.041287] offset:4 0x4d544950

        Supported link[  128.045636] offset:8 0x40e041a0
 modes:   1000baseX/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseX/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: FIBRE
        PHYAD: 0
        Transceiver: internal
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes
root@bpi-r3:~#
root@bpi-r3:~# iperf3 -c 192.168.0.21
Connecting to host 192.168.0.21, port 5201
[  5] local 192.168.0.19 port 50992 connected to 192.168.0.21 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   114 MBytes   960 Mbits/sec    0    450 KBytes
[  5]   1.00-2.00   sec   112 MBytes   940 Mbits/sec    0    450 KBytes
[  5]   2.00-3.00   sec   113 MBytes   948 Mbits/sec    0    450 KBytes
[  5]   3.00-4.00   sec   112 MBytes   940 Mbits/sec    0    450 KBytes
[  5]   4.00-5.00   sec   112 MBytes   941 Mbits/sec    0    450 KBytes
[  5]   5.00-6.00   sec   112 MBytes   941 Mbits/sec    0    450 KBytes
[  5]   6.00-7.00   sec   113 MBytes   948 Mbits/sec    0    450 KBytes
[  5]   7.00-8.00   sec   112 MBytes   939 Mbits/sec    0    450 KBytes
[  5]   8.00-9.00   sec   112 MBytes   940 Mbits/sec    0    450 KBytes
[  5]   9.00-10.00  sec   113 MBytes   947 Mbits/sec    0    450 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.10 GBytes   944 Mbits/sec    0             sender
[  5]   0.00-10.06  sec  1.10 GBytes   938 Mbits/sec                  receiver

iperf Done.
root@bpi-r3:~# iperf3 -c 192.168.0.21 -R
Connecting to host 192.168.0.21, port 5201
Reverse mode, remote host 192.168.0.21 is sending
[  5] local 192.168.0.19 port 38736 connected to 192.168.0.21 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   112 MBytes   937 Mbits/sec
[  5]   1.00-2.00   sec   112 MBytes   940 Mbits/sec
[  5]   2.00-3.00   sec   112 MBytes   939 Mbits/sec
[  5]   3.00-4.00   sec   112 MBytes   940 Mbits/sec
[  5]   4.00-5.00   sec   112 MBytes   940 Mbits/sec
[  5]   5.00-6.00   sec   112 MBytes   940 Mbits/sec
[  5]   6.00-7.00   sec   112 MBytes   940 Mbits/sec
[  5]   7.00-8.00   sec   112 MBytes   941 Mbits/sec
[  5]   8.00-9.00   sec   112 MBytes   940 Mbits/sec
[  5]   9.00-10.00  sec   112 MBytes   940 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.06  sec  1.10 GBytes   936 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1.09 GBytes   940 Mbits/sec                  receiver

iperf Done.
root@bpi-r3:~#

so now checking the first gmac (mt7531 switch-chip, fixed link with 2500baseX - sgmii, but i can only test 1G on a switchport not the full 2g5):

root@bpi-r3:~# ip link set eth1 down
[  128.050136] offset:32 0x3112011a
[  266.426857] mtk_soc_eth 15100000.ethernet eth1: Link is Down
root@bpi-r3:~# ip link set eth0 up
[  277.257578] mtk_soc_eth 15100000.ethernet eth0: configuring for fixed/2500base-x link mode
[  277.266007] mtk_soc_eth 15100000.ethernet eth0: Link is Up - 2.5Gbps/Full - flow control rx/tx
root@bpi-r3:~#
root@bpi-r3:~# ip a a 192.168.0.19/24 dev wan
root@bpi-r3:~# ip link set wan up
[  373.687223] mt7530 mdio-bus:1f wan: configuring for phy/gmii link mode
root@bpi-r3:~# [  373.698593] mt7530 mdio-bus:1f wan: Link is Up - 1Gbps/Full - flow control rx/tx
[  373.706061] IPv6: ADDRCONF(NETDEV_CHANGE): wan: link becomes ready
root@bpi-r3:~# ip a a 192.168.0.19/24 dev wan
root@bpi-r3:~# ip link set wan up
[  373.687223] mt7530 mdio-bus:1f wan: configuring for phy/gmii link mode
root@bpi-r3:~# [  373.698593] mt7530 mdio-bus:1f wan: Link is Up - 1Gbps/Full - flow control rx/tx
[  373.706061] IPv6: ADDRCONF(NETDEV_CHANGE): wan: link becomes ready

root@bpi-r3:~# ping 192.168.0.10
PING 192.168.0.10 (192.168.0.10) 56(84) bytes of data.
64 bytes from 192.168.0.10: icmp_seq=1 ttl=64 time=0.964 ms
64 bytes from 192.168.0.10: icmp_seq=2 ttl=64 time=0.523 ms
^C
--- 192.168.0.10 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 0.523/0.743/0.964/0.220 ms
root@bpi-r3:~# iperf3 -c 192.168.0.21
Connecting to host 192.168.0.21, port 5201
[  5] local 192.168.0.19 port 48332 connected to 192.168.0.21 port 5201
[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec   115 MBytes   962 Mbits/sec    0    641 KBytes
[  5]   1.00-2.00   sec   112 MBytes   944 Mbits/sec    0    641 KBytes
[  5]   2.00-3.00   sec   112 MBytes   944 Mbits/sec    0    641 KBytes
[  5]   3.00-4.00   sec   112 MBytes   944 Mbits/sec    0    641 KBytes
[  5]   4.00-5.00   sec   112 MBytes   944 Mbits/sec    0    641 KBytes
[  5]   5.00-6.00   sec   112 MBytes   944 Mbits/sec    0    641 KBytes
[  5]   6.00-7.00   sec   112 MBytes   944 Mbits/sec    0    641 KBytes
[  5]   7.00-8.00   sec   112 MBytes   944 Mbits/sec    0    641 KBytes
[  5]   8.00-9.00   sec   112 MBytes   944 Mbits/sec    0    641 KBytes
[  5]   9.00-10.00  sec   111 MBytes   933 Mbits/sec    0    641 KBytes
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.00  sec  1.10 GBytes   944 Mbits/sec    0             sender
[  5]   0.00-10.06  sec  1.10 GBytes   937 Mbits/sec                  receiver

iperf Done.
root@bpi-r3:~# iperf3 -c 192.168.0.21 -R
Connecting to host 192.168.0.21, port 5201
Reverse mode, remote host 192.168.0.21 is sending
[  5] local 192.168.0.19 port 51822 connected to 192.168.0.21 port 5201
[ ID] Interval           Transfer     Bitrate
[  5]   0.00-1.00   sec   112 MBytes   937 Mbits/sec
[  5]   1.00-2.00   sec   112 MBytes   941 Mbits/sec
[  5]   2.00-3.00   sec   112 MBytes   939 Mbits/sec
[  5]   3.00-4.00   sec   112 MBytes   940 Mbits/sec
[  5]   4.00-5.00   sec   112 MBytes   938 Mbits/sec
[  5]   5.00-6.00   sec   112 MBytes   940 Mbits/sec
[  5]   6.00-7.00   sec   112 MBytes   938 Mbits/sec
[  5]   7.00-8.00   sec   112 MBytes   941 Mbits/sec
[  5]   8.00-9.00   sec   112 MBytes   940 Mbits/sec
[  5]   9.00-10.00  sec   112 MBytes   940 Mbits/sec
- - - - - - - - - - - - - - - - - - - - - - - - -
[ ID] Interval           Transfer     Bitrate         Retr
[  5]   0.00-10.06  sec  1.10 GBytes   936 Mbits/sec    0             sender
[  5]   0.00-10.00  sec  1.09 GBytes   939 Mbits/sec                  receiver

iperf Done.
root@bpi-r3:~# ethtool eth0
Settings for eth0:
        Supported ports: [ MII ]
        Supported link modes:   2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  2500baseT/Full
        Link partner advertised pause frame use: Symmetric
        Link partner advertised auto-negotiation: No
        Link partner advertised FEC modes: Not reported
        Speed: 2500Mb/s
        Duplex: Full
        Auto-negotiation: on
        Port: MII
        PHYAD: 0
        Transceiver: internal
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes
root@bpi-r3:~# ethtool wan
Settings for wan:
        Supported ports: [ TP    MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  10baseT/Half 10baseT/Full
                                             100baseT/Half 100baseT/Full
                                             1000baseT/Full
        Link partner advertised pause frame use: Symmetric
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        master-slave cfg: preferred slave
        master-slave status: master
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: external
        MDI-X: Unknown
        Supports Wake-on: d
        Wake-on: d
        Link detected: yes
root@bpi-r3:~#

looks good too :)

if you fix this typo you can send the patch and add my tested-by:

        regmap_read(mpcs->regmap, SGMSYS_SGMII_MODE, &val);
-       if (interface = == PHY_INTERFACE_MODE_SGMII)
+       if (interface == PHY_INTERFACE_MODE_SGMII)
                val |= SGMII_IF_MODE_BIT0;
        else
                val &= ~SGMII_IF_MODE_BIT0;

should i send an update for my patch including this:

state->duplex = DUPLEX_FULL;

or do you want to read the duplex from the advertisement like stated before?

regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);

phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);

regards Frank
  
Frank Wunderlich Oct. 23, 2022, 7:21 p.m. UTC | #24
> Gesendet: Sonntag, 23. Oktober 2022 um 21:03 Uhr
> Von: "Frank Wunderlich" <frank-w@public-files.de>

> if you fix this typo you can send the patch and add my tested-by:
>
>         regmap_read(mpcs->regmap, SGMSYS_SGMII_MODE, &val);
> -       if (interface = == PHY_INTERFACE_MODE_SGMII)
> +       if (interface == PHY_INTERFACE_MODE_SGMII)
>                 val |= SGMII_IF_MODE_BIT0;
>         else
>                 val &= ~SGMII_IF_MODE_BIT0;
>
> should i send an update for my patch including this:
>
> state->duplex = DUPLEX_FULL;
>
> or do you want to read the duplex from the advertisement like stated before?
>
> regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);
>
> phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);

with the phylink-helper it works too

https://github.com/frank-w/BPI-R2-4.14/commits/6.1-r3-sgmii2

regards Frank
  
Russell King (Oracle) Oct. 23, 2022, 8:09 p.m. UTC | #25
On Sun, Oct 23, 2022 at 09:21:37PM +0200, Frank Wunderlich wrote:
> > Gesendet: Sonntag, 23. Oktober 2022 um 21:03 Uhr
> > Von: "Frank Wunderlich" <frank-w@public-files.de>
> 
> > if you fix this typo you can send the patch and add my tested-by:
> >
> >         regmap_read(mpcs->regmap, SGMSYS_SGMII_MODE, &val);
> > -       if (interface = == PHY_INTERFACE_MODE_SGMII)
> > +       if (interface == PHY_INTERFACE_MODE_SGMII)
> >                 val |= SGMII_IF_MODE_BIT0;
> >         else
> >                 val &= ~SGMII_IF_MODE_BIT0;
> >
> > should i send an update for my patch including this:
> >
> > state->duplex = DUPLEX_FULL;
> >
> > or do you want to read the duplex from the advertisement like stated before?
> >
> > regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
> > regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1 + 8, &adv);
> >
> > phylink_mii_c22_pcs_decode_state(state, bm >> 16, adv >> 16);
> 
> with the phylink-helper it works too
> 
> https://github.com/frank-w/BPI-R2-4.14/commits/6.1-r3-sgmii2

This is amazingly great news - we now know how to configure this
hardware! Let me cook up a proper set of patches for tomorrow - if
that's okay.
  
Russell King (Oracle) Oct. 24, 2022, 9:27 a.m. UTC | #26
On Sun, Oct 23, 2022 at 09:09:23PM +0100, Russell King (Oracle) wrote:
> This is amazingly great news - we now know how to configure this
> hardware! Let me cook up a proper set of patches for tomorrow - if
> that's okay.

Here's the combined patch for where I would like mtk_sgmii to get to.

It looks like this PCS is similar to what we know as pcs-lynx.c, but
there do seem to be differences - the duplex bit for example appears
to be inverted.

Please confirm whether this still works for you, thanks.

diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.h b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
index b52f3b0177ef..8ecf97bcfec6 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.h
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.h
@@ -466,8 +466,10 @@
 #define ETHSYS_DMA_AG_MAP_PPE	BIT(2)
 
 /* SGMII subsystem config registers */
-/* Register to auto-negotiation restart */
+/* BMCR (low 16) BMSR (high 16) */
 #define SGMSYS_PCS_CONTROL_1	0x0
+#define SGMII_BMCR		GENMASK(15, 0)
+#define SGMII_BMSR		GENMASK(31, 16)
 #define SGMII_AN_RESTART	BIT(9)
 #define SGMII_ISOLATE		BIT(10)
 #define SGMII_AN_ENABLE		BIT(12)
@@ -477,9 +479,13 @@
 #define SGMII_PCS_FAULT		BIT(23)
 #define SGMII_AN_EXPANSION_CLR	BIT(30)
 
+#define SGMSYS_PCS_ADVERTISE	0x8
+#define SGMII_ADVERTISE		GENMASK(15, 0)
+#define SGMII_LPA		GENMASK(31, 16)
+
 /* Register to programmable link timer, the unit in 2 * 8ns */
 #define SGMSYS_PCS_LINK_TIMER	0x18
-#define SGMII_LINK_TIMER_DEFAULT	(0x186a0 & GENMASK(19, 0))
+#define SGMII_LINK_TIMER_MASK	GENMASK(19, 0)
 
 /* Register to control remote fault */
 #define SGMSYS_SGMII_MODE		0x20
diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
index 736839c84130..e64c02a48449 100644
--- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
+++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
@@ -19,110 +19,136 @@ static struct mtk_pcs *pcs_to_mtk_pcs(struct phylink_pcs *pcs)
 	return container_of(pcs, struct mtk_pcs, pcs);
 }
 
-/* For SGMII interface mode */
-static int mtk_pcs_setup_mode_an(struct mtk_pcs *mpcs)
+static void mtk_pcs_get_state(struct phylink_pcs *pcs,
+			      struct phylink_link_state *state)
 {
-	unsigned int val;
-
-	/* Setup the link timer and QPHY power up inside SGMIISYS */
-	regmap_write(mpcs->regmap, SGMSYS_PCS_LINK_TIMER,
-		     SGMII_LINK_TIMER_DEFAULT);
-
-	regmap_read(mpcs->regmap, SGMSYS_SGMII_MODE, &val);
-	val |= SGMII_REMOTE_FAULT_DIS;
-	regmap_write(mpcs->regmap, SGMSYS_SGMII_MODE, val);
-
-	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &val);
-	val |= SGMII_AN_RESTART;
-	regmap_write(mpcs->regmap, SGMSYS_PCS_CONTROL_1, val);
-
-	regmap_read(mpcs->regmap, SGMSYS_QPHY_PWR_STATE_CTRL, &val);
-	val &= ~SGMII_PHYA_PWD;
-	regmap_write(mpcs->regmap, SGMSYS_QPHY_PWR_STATE_CTRL, val);
+	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
+	unsigned int bm, adv;
 
-	return 0;
+	/* Read the BMSR and LPA */
+	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &bm);
+	regmap_read(mpcs->regmap, SGMSYS_PCS_ADVERTISE, &adv);
 
+	phylink_mii_c22_pcs_decode_state(state, FIELD_GET(SGMII_BMSR, bm),
+					 FIELD_GET(SGMII_LPA, adv));
 }
 
-/* For 1000BASE-X and 2500BASE-X interface modes, which operate at a
- * fixed speed.
- */
-static int mtk_pcs_setup_mode_force(struct mtk_pcs *mpcs,
-				    phy_interface_t interface)
+static int mtk_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
+			  phy_interface_t interface,
+			  const unsigned long *advertising,
+			  bool permit_pause_to_mac)
 {
-	unsigned int val;
+	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
+	unsigned int rgc3, sgm_mode, bmcr;
+	int advertise, link_timer;
+	bool changed, use_an;
 
-	regmap_read(mpcs->regmap, mpcs->ana_rgc3, &val);
-	val &= ~RG_PHY_SPEED_MASK;
 	if (interface == PHY_INTERFACE_MODE_2500BASEX)
-		val |= RG_PHY_SPEED_3_125G;
-	regmap_write(mpcs->regmap, mpcs->ana_rgc3, val);
+		rgc3 = RG_PHY_SPEED_3_125G;
+	else
+		rgc3 = 0;
+
+	advertise = phylink_mii_c22_pcs_encode_advertisement(interface,
+							     advertising);
+	if (advertise < 0)
+		return advertise;
+
+	link_timer = phylink_get_link_timer_ns(interface);
+	if (link_timer < 0)
+		return link_timer;
+
+	/* Clearing IF_MODE_BIT0 switches the PCS to BASE-X mode, and
+	 * we assume that fixes it's speed at bitrate = line rate (in
+	 * other words, 1000Mbps or 2500Mbps).
+	 */
+	if (interface == PHY_INTERFACE_MODE_SGMII) {
+		sgm_mode = SGMII_IF_MODE_BIT0;
+		if (phylink_autoneg_inband(mode)) {
+			sgm_mode |= SGMII_REMOTE_FAULT_DIS |
+				    SGMII_SPEED_DUPLEX_AN;
+			use_an = true;
+		} else {
+			use_an = false;
+		}
+	} else if (phylink_autoneg_inband(mode)) {
+		/* 1000base-X or 2500base-X autoneg */
+		sgm_mode = SGMII_REMOTE_FAULT_DIS;
+		use_an = linkmode_test_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
+					   advertising);
+	} else {
+		/* 1000base-X or 2500base-X without autoneg */
+		sgm_mode = 0;
+		use_an = false;
+	}
 
-	/* Disable SGMII AN */
-	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &val);
-	val &= ~SGMII_AN_ENABLE;
-	regmap_write(mpcs->regmap, SGMSYS_PCS_CONTROL_1, val);
+	if (use_an) {
+		/* FIXME: Do we need to set AN_RESTART here? */
+		bmcr = SGMII_AN_RESTART | SGMII_AN_ENABLE;
+	} else {
+		bmcr = 0;
+	}
 
-	/* Set the speed etc but leave the duplex unchanged */
-	regmap_read(mpcs->regmap, SGMSYS_SGMII_MODE, &val);
-	val &= SGMII_DUPLEX_FULL | ~SGMII_IF_MODE_MASK;
-	val |= SGMII_SPEED_1000;
-	regmap_write(mpcs->regmap, SGMSYS_SGMII_MODE, val);
+	/* Configure the underlying interface speed */
+	regmap_update_bits(mpcs->regmap, mpcs->ana_rgc3,
+			   RG_PHY_SPEED_3_125G, rgc3);
 
-	/* Release PHYA power down state */
-	regmap_read(mpcs->regmap, SGMSYS_QPHY_PWR_STATE_CTRL, &val);
-	val &= ~SGMII_PHYA_PWD;
-	regmap_write(mpcs->regmap, SGMSYS_QPHY_PWR_STATE_CTRL, val);
+	/* Update the advertisement, noting whether it has changed */
+	regmap_update_bits_check(mpcs->regmap, SGMSYS_PCS_ADVERTISE,
+				 SGMII_ADVERTISE, advertise, &changed);
 
-	return 0;
-}
+	/* Setup the link timer and QPHY power up inside SGMIISYS */
+	regmap_write(mpcs->regmap, SGMSYS_PCS_LINK_TIMER, link_timer / 2 / 8);
 
-static int mtk_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
-			  phy_interface_t interface,
-			  const unsigned long *advertising,
-			  bool permit_pause_to_mac)
-{
-	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
-	int err = 0;
+	/* Update the sgmsys mode register */
+	regmap_update_bits(mpcs->regmap, SGMSYS_SGMII_MODE,
+			   SGMII_REMOTE_FAULT_DIS | SGMII_SPEED_DUPLEX_AN |
+			   SGMII_IF_MODE_BIT0, sgm_mode);
+
+	/* Update the BMCR */
+	regmap_update_bits(mpcs->regmap, SGMSYS_PCS_CONTROL_1,
+			   SGMII_AN_RESTART | SGMII_AN_ENABLE, bmcr);
 
-	/* Setup SGMIISYS with the determined property */
-	if (interface != PHY_INTERFACE_MODE_SGMII)
-		err = mtk_pcs_setup_mode_force(mpcs, interface);
-	else if (phylink_autoneg_inband(mode))
-		err = mtk_pcs_setup_mode_an(mpcs);
+	/* Release PHYA power down state */
+	regmap_update_bits(mpcs->regmap, SGMSYS_QPHY_PWR_STATE_CTRL,
+			   SGMII_PHYA_PWD, 0);
 
-	return err;
+	return changed ? 1 : 0;
 }
 
 static void mtk_pcs_restart_an(struct phylink_pcs *pcs)
 {
 	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
-	unsigned int val;
 
-	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &val);
-	val |= SGMII_AN_RESTART;
-	regmap_write(mpcs->regmap, SGMSYS_PCS_CONTROL_1, val);
+	regmap_update_bits(mpcs->regmap, SGMSYS_PCS_CONTROL_1,
+			   SGMII_AN_RESTART, SGMII_AN_RESTART);
 }
 
 static void mtk_pcs_link_up(struct phylink_pcs *pcs, unsigned int mode,
 			    phy_interface_t interface, int speed, int duplex)
 {
 	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
-	unsigned int val;
-
-	if (!phy_interface_mode_is_8023z(interface))
-		return;
-
-	/* SGMII force duplex setting */
-	regmap_read(mpcs->regmap, SGMSYS_SGMII_MODE, &val);
-	val &= ~SGMII_DUPLEX_FULL;
-	if (duplex == DUPLEX_FULL)
-		val |= SGMII_DUPLEX_FULL;
-
-	regmap_write(mpcs->regmap, SGMSYS_SGMII_MODE, val);
+	unsigned int sgm_mode;
+
+	if (!phylink_autoneg_inband(mode)) {
+		/* Force the speed and duplex setting */
+		if (speed == SPEED_10)
+			sgm_mode = SGMII_SPEED_10;
+		else if (speed == SPEED_100)
+			sgm_mode = SGMII_SPEED_100;
+		else
+			sgm_mode = SGMII_SPEED_1000;
+
+		if (duplex == DUPLEX_FULL)
+			sgm_mode |= SGMII_DUPLEX_FULL;
+
+		regmap_update_bits(mpcs->regmap, SGMSYS_SGMII_MODE,
+				   SGMII_DUPLEX_FULL | SGMII_SPEED_MASK,
+				   sgm_mode);
+	}
 }
 
 static const struct phylink_pcs_ops mtk_pcs_ops = {
+	.pcs_get_state = mtk_pcs_get_state,
 	.pcs_config = mtk_pcs_config,
 	.pcs_an_restart = mtk_pcs_restart_an,
 	.pcs_link_up = mtk_pcs_link_up,
diff --git a/include/linux/phylink.h b/include/linux/phylink.h
index 63800bf4a7ac..7a3eb46b38c1 100644
--- a/include/linux/phylink.h
+++ b/include/linux/phylink.h
@@ -616,6 +616,22 @@ int phylink_speed_up(struct phylink *pl);
 
 void phylink_set_port_modes(unsigned long *bits);
 
+static inline int phylink_get_link_timer_ns(phy_interface_t interface)
+{
+	switch (interface) {
+	case PHY_INTERFACE_MODE_SGMII:
+	case PHY_INTERFACE_MODE_QSGMII:
+		return 1600000;
+
+	case PHY_INTERFACE_MODE_1000BASEX:
+	case PHY_INTERFACE_MODE_2500BASEX:
+		return 10000000;
+
+	default:
+		return -EINVAL;
+	}
+}
+
 void phylink_mii_c22_pcs_decode_state(struct phylink_link_state *state,
 				      u16 bmsr, u16 lpa);
 void phylink_mii_c22_pcs_get_state(struct mdio_device *pcs,
  
Frank Wunderlich Oct. 24, 2022, 2:45 p.m. UTC | #27
Hi
> Gesendet: Montag, 24. Oktober 2022 um 11:27 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>

> Here's the combined patch for where I would like mtk_sgmii to get to.
>
> It looks like this PCS is similar to what we know as pcs-lynx.c, but
> there do seem to be differences - the duplex bit for example appears
> to be inverted.
>
> Please confirm whether this still works for you, thanks.

basicly Patch works, but i get some (1-50) retransmitts on iperf3 on first interval in tx-mode (on r3 without -R), other 9 are clean. reverse mode is mostly clean.
run iperf3 multiple times, every first interval has retransmitts. same for gmac0 (fixed-link 2500baseX)

i notice that you have changed the timer again to 10000000 for 1000/2500baseX...maybe use here the default value too like the older code does?

regards Frank
  
Russell King (Oracle) Oct. 24, 2022, 2:56 p.m. UTC | #28
On Mon, Oct 24, 2022 at 04:45:40PM +0200, Frank Wunderlich wrote:
> Hi
> > Gesendet: Montag, 24. Oktober 2022 um 11:27 Uhr
> > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> 
> > Here's the combined patch for where I would like mtk_sgmii to get to.
> >
> > It looks like this PCS is similar to what we know as pcs-lynx.c, but
> > there do seem to be differences - the duplex bit for example appears
> > to be inverted.
> >
> > Please confirm whether this still works for you, thanks.
> 
> basicly Patch works, but i get some (1-50) retransmitts on iperf3 on first interval in tx-mode (on r3 without -R), other 9 are clean. reverse mode is mostly clean.
> run iperf3 multiple times, every first interval has retransmitts. same for gmac0 (fixed-link 2500baseX)
> 
> i notice that you have changed the timer again to 10000000 for 1000/2500baseX...maybe use here the default value too like the older code does?

You obviously missed my explanation. I will instead quote the 802.3
standard which covers 1000base-X:

37.3.1.4 Timers

 link_timer
          Timer used to ensure Auto-Negotiation protocol stability and
	  register read/write by the management interface.

	  Duration: 10 ms, tolerance +10 ms, –0 s.

For SGMII, the situation is different. Here is what the SGMII
specification says:

  The link_timer inside the Auto-Negotiation has been changed from 10
  msec to 1.6 msec to ensure a prompt update of the link status.

So, 10ms is correct for 1000base-X, and 1.6ms correct for SGMII.

However, feel free to check whether changing it solves that issue, but
also check whether it could be some ARP related issue - remember, if
two endpoints haven't communicated, they need to ARP to get the other
end's ethernet addresses which adds extra latency, and may result in
some packet loss in high packet queuing rate situations.
  
Frank Wunderlich Oct. 25, 2022, 8:03 a.m. UTC | #29
> Gesendet: Montag, 24. Oktober 2022 um 16:56 Uhr
> Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> On Mon, Oct 24, 2022 at 04:45:40PM +0200, Frank Wunderlich wrote:
> > Hi
> > > Gesendet: Montag, 24. Oktober 2022 um 11:27 Uhr
> > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk>
> > 
> > > Here's the combined patch for where I would like mtk_sgmii to get to.
> > >
> > > It looks like this PCS is similar to what we know as pcs-lynx.c, but
> > > there do seem to be differences - the duplex bit for example appears
> > > to be inverted.
> > >
> > > Please confirm whether this still works for you, thanks.
> > 
> > basicly Patch works, but i get some (1-50) retransmitts on iperf3 on first interval in tx-mode (on r3 without -R), other 9 are clean. reverse mode is mostly clean.
> > run iperf3 multiple times, every first interval has retransmitts. same for gmac0 (fixed-link 2500baseX)
> > 
> > i notice that you have changed the timer again to 10000000 for 1000/2500baseX...maybe use here the default value too like the older code does?
> 
> You obviously missed my explanation. I will instead quote the 802.3
> standard which covers 1000base-X:

sorry, right i remember you've already mentioned it

> 37.3.1.4 Timers
> 
>  link_timer
>           Timer used to ensure Auto-Negotiation protocol stability and
> 	  register read/write by the management interface.
> 
> 	  Duration: 10 ms, tolerance +10 ms, –0 s.
> 
> For SGMII, the situation is different. Here is what the SGMII
> specification says:
> 
>   The link_timer inside the Auto-Negotiation has been changed from 10
>   msec to 1.6 msec to ensure a prompt update of the link status.
> 
> So, 10ms is correct for 1000base-X, and 1.6ms correct for SGMII.
> 
> However, feel free to check whether changing it solves that issue, but
> also check whether it could be some ARP related issue - remember, if
> two endpoints haven't communicated, they need to ARP to get the other
> end's ethernet addresses which adds extra latency, and may result in
> some packet loss in high packet queuing rate situations.

tried with 1.6ms, same result (or even worse on 1000baseX). i guess arp cache should stay for ~5s?
so at least second round followed directly after the first should be clean when looking on ARP.

apart from this little problem it works much better than it actually is so imho more
people should test it on different platforms.

regards Frank
  
Bjørn Mork Jan. 16, 2023, 1:08 p.m. UTC | #30
Frank Wunderlich <frank-w@public-files.de> writes:

> apart from this little problem it works much better than it actually is so imho more
> people should test it on different platforms.

Hello!  I've been banging my head against an MT7986 board with two
Maxlinear GPY211C phys for a while. One of those phys is connected to
port 5 of the MT7531 switch.  This is working perfectly.

The other GPY211C is connected to the second MT7986 mac.  This one is
giving me a headache...

I can only get the port to work at 2500Mb/s.  Changing the speed to
anything lower looks fine in ethtool etc, but traffic is blocked.

Not knowing the first thing about MACs and PHYs and such, my best guess
is that there is something wrong with the PCS config.

Now I am currently testing this on an older kernel (using OpenWrt -
booting mainline is not straight forward). But I have backported all the
patches which came out of this thread.  The resulting mtk_sgmii.c is
identical to the one currently in next-next, except for some additional
debug printk's.  Since this is a small file, I've attached my current
mtk_sgmii.c copy for reference.

The output of those printks when changing peer speed to to 1000Mb/s is:

[  363.099410] mtk_soc_eth 15100000.ethernet wan: Link is Down
[  365.189945] mtk_sgmii_select_pcs: id=1
[  365.193709] mtk_pcs_config: interface=4
[  365.197530] offset:0 0x140
[  365.197533] offset:4 0x4d544950
[  365.200237] offset:8 0x20
[  365.203365] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x1, bmcr=0x0
[  365.215601] mtk_pcs_link_up: interface=4
[  365.219511] offset:0 0x140
[  365.219513] offset:4 0x4d544950
[  365.222204] offset:8 0x1
[  365.225328] mtk_pcs_link_up: sgm_mode=0x18
[  365.231940] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx


and when changing peer back to autoneg (i.e. 2500Mb/s):

[  878.939417] mtk_soc_eth 15100000.ethernet wan: Link is Down
[  883.099857] mtk_sgmii_select_pcs: id=1
[  883.103620] mtk_pcs_config: interface=22
[  883.107527] offset:0 0x140
[  883.107529] offset:4 0x4d544950
[  883.110234] offset:8 0x1
[  883.113363] mtk_pcs_config: rgc3=0x4, advertise=0x20 (changed), link_timer=10000000,  sgm_mode=0x0, bmcr=0x0
[  883.125686] mtk_pcs_link_up: interface=22
[  883.129683] offset:0 0x40140
[  883.129685] offset:4 0x4d544950
[  883.132550] offset:8 0x20
[  883.135687] mtk_soc_eth 15100000.ethernet wan: Link is Up - 2.5Gbps/Full - flow control rx/tx


ethtool output looks as expected in both cases:

# ethtool wan
Settings for wan:
        Supported ports: [  ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  1000baseT/Full
        Link partner advertised pause frame use: Symmetric Receive-only
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        master-slave cfg: preferred slave
        master-slave status: slave
        Port: Twisted Pair
        PHYAD: 6
        Transceiver: external
        MDI-X: on (auto)
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes


# ethtool wan
Settings for wan:
        Supported ports: [  ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  100baseT/Full
                                             1000baseT/Full
                                             2500baseT/Full
        Link partner advertised pause frame use: Symmetric Receive-only
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 2500Mb/s
        Duplex: Full
        Auto-negotiation: on
        master-slave cfg: preferred slave
        master-slave status: master
        Port: Twisted Pair
        PHYAD: 6
        Transceiver: external
        MDI-X: on (auto)
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes



The behaviour looks similar to the GPY211C attached to switch port 5.
Except that the latter works regardless of speed..

I did however notice one difference, which may or may not be
significant, in the VSPEC1_SGMII_STAT register of the phys after
changing to 1000Mb/s.  The switch attached phy reports:

root@OpenWrt:/# mdio mdio-bus 5:30 raw 9
0x002e

while the soc mac attached phy reports

root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x000e

According to
https://assets.maxlinear.com/web/documents/617810_gpy211b1vc_gpy211c0vc_ds_rev1.4.pdf
bit 5 is "Auto-Negotiation Completed". So it does look like the switch
mac is doing AN on the SGMII link, and the soc mac is not.  Is that
correct?

Any hints on where I should look next?

The ethernet part of my device tree looks like this:

&eth {
	status = "okay";

	gmac0: mac@0 {
		compatible = "mediatek,eth-mac";
		reg = <0>;
		phy-mode = "2500base-x";

		fixed-link {
			speed = <2500>;
			full-duplex;
			pause;
		};
	};

	mac@1 {
		compatible = "mediatek,eth-mac";
		reg = <1>;
		label = "wan";
		phy-mode = "2500base-x";
		phy-handle = <&phy6>;
	};

	mdio: mdio-bus {
		#address-cells = <1>;
		#size-cells = <0>;
	};
};

&mdio {
	reset-gpios = <&pio 6 GPIO_ACTIVE_LOW>;
	reset-delay-us = <50000>;
	reset-post-delay-us = <20000>;
  
	phy5: phy@5 {
		compatible = "ethernet-phy-ieee802.3-c45";
		reg = <5>;
	};

	phy6: phy@6 {
		compatible = "ethernet-phy-ieee802.3-c45";
		reg = <6>;
	};

	switch: switch@1f {
		compatible = "mediatek,mt7531";
		reg = <31>;
		reset-gpios = <&pio 5 GPIO_ACTIVE_HIGH>;
		interrupt-controller;
		#interrupt-cells = <1>;
		interrupt-parent = <&pio>;
		interrupts = <66 IRQ_TYPE_LEVEL_HIGH>;
	};
};

&switch {
	ports {
		#address-cells = <1>;
		#size-cells = <0>;

		port@0 {
			reg = <0>;
			label = "lan3";
		};

		port@1 {
			reg = <1>;
			label = "lan2";
		};

		port@2 {
			reg = <2>;
			label = "lan1";
		};

		port@5 {
			reg = <5>;
			label = "lan4";
			phy-mode = "2500base-x";
                        phy-handle = <&phy5>;
		};

		port@6 {
			reg = <6>;
			label = "cpu";
			ethernet = <&gmac0>;
			phy-mode = "2500base-x";

			fixed-link {
				speed = <2500>;
				full-duplex;
				pause;
			};
		};
	};
};




Bjørn
  
Russell King (Oracle) Jan. 16, 2023, 1:47 p.m. UTC | #31
On Mon, Jan 16, 2023 at 02:08:30PM +0100, Bjørn Mork wrote:
> Frank Wunderlich <frank-w@public-files.de> writes:
> 
> > apart from this little problem it works much better than it actually is so imho more
> > people should test it on different platforms.
> 
> Hello!  I've been banging my head against an MT7986 board with two
> Maxlinear GPY211C phys for a while. One of those phys is connected to
> port 5 of the MT7531 switch.  This is working perfectly.
> 
> The other GPY211C is connected to the second MT7986 mac.  This one is
> giving me a headache...
> 
> I can only get the port to work at 2500Mb/s.  Changing the speed to
> anything lower looks fine in ethtool etc, but traffic is blocked.

My guess would be that the GPY PHY is using in-band SGMII negotiation
(it sets VSPEC1_SGMII_ANEN_ANRS when entering SGMII mode and clears
it in 2500base-X), but as the link is not using in-band mode on the
PCS side, the PHY never sees its in-band negotiation complete, so the
link between PCS and PHY never comes up.

Both sides need to agree on that detail.
  
Bjørn Mork Jan. 16, 2023, 2:45 p.m. UTC | #32
"Russell King (Oracle)" <linux@armlinux.org.uk> writes:
> On Mon, Jan 16, 2023 at 02:08:30PM +0100, Bjørn Mork wrote:
>> Frank Wunderlich <frank-w@public-files.de> writes:
>> 
>> > apart from this little problem it works much better than it actually is so imho more
>> > people should test it on different platforms.
>> 
>> Hello!  I've been banging my head against an MT7986 board with two
>> Maxlinear GPY211C phys for a while. One of those phys is connected to
>> port 5 of the MT7531 switch.  This is working perfectly.
>> 
>> The other GPY211C is connected to the second MT7986 mac.  This one is
>> giving me a headache...
>> 
>> I can only get the port to work at 2500Mb/s.  Changing the speed to
>> anything lower looks fine in ethtool etc, but traffic is blocked.
>
> My guess would be that the GPY PHY is using in-band SGMII negotiation
> (it sets VSPEC1_SGMII_ANEN_ANRS when entering SGMII mode and clears
> it in 2500base-X), but as the link is not using in-band mode on the
> PCS side, the PHY never sees its in-band negotiation complete, so the
> link between PCS and PHY never comes up.
>
> Both sides need to agree on that detail.

Any hints on how I would go about doing that?  I am a little lost here,
changing arbitrary bits I don't understand the meaning of.


Bjørn
  
Russell King (Oracle) Jan. 16, 2023, 2:59 p.m. UTC | #33
On Mon, Jan 16, 2023 at 03:45:24PM +0100, Bjørn Mork wrote:
> "Russell King (Oracle)" <linux@armlinux.org.uk> writes:
> > On Mon, Jan 16, 2023 at 02:08:30PM +0100, Bjørn Mork wrote:
> >> Frank Wunderlich <frank-w@public-files.de> writes:
> >> 
> >> > apart from this little problem it works much better than it actually is so imho more
> >> > people should test it on different platforms.
> >> 
> >> Hello!  I've been banging my head against an MT7986 board with two
> >> Maxlinear GPY211C phys for a while. One of those phys is connected to
> >> port 5 of the MT7531 switch.  This is working perfectly.
> >> 
> >> The other GPY211C is connected to the second MT7986 mac.  This one is
> >> giving me a headache...
> >> 
> >> I can only get the port to work at 2500Mb/s.  Changing the speed to
> >> anything lower looks fine in ethtool etc, but traffic is blocked.
> >
> > My guess would be that the GPY PHY is using in-band SGMII negotiation
> > (it sets VSPEC1_SGMII_ANEN_ANRS when entering SGMII mode and clears
> > it in 2500base-X), but as the link is not using in-band mode on the
> > PCS side, the PHY never sees its in-band negotiation complete, so the
> > link between PCS and PHY never comes up.
> >
> > Both sides need to agree on that detail.
> 
> Any hints on how I would go about doing that?  I am a little lost here,
> changing arbitrary bits I don't understand the meaning of.

Hi,

To prove the point, in mtk_pcs_config():

        if (interface == PHY_INTERFACE_MODE_SGMII) {
                sgm_mode = SGMII_IF_MODE_SGMII;
                if (phylink_autoneg_inband(mode)) {

Force the second if() to always be true, and see whether that allows
traffic to pass.

Thanks.
  
Bjørn Mork Jan. 16, 2023, 3:21 p.m. UTC | #34
"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> To prove the point, in mtk_pcs_config():
>
>         if (interface == PHY_INTERFACE_MODE_SGMII) {
>                 sgm_mode = SGMII_IF_MODE_SGMII;
>                 if (phylink_autoneg_inband(mode)) {
>
> Force the second if() to always be true, and see whether that allows
> traffic to pass.

Changed it with a printk to make sure I didn't mess up:

		if (1 || phylink_autoneg_inband(mode)) {
			pr_info("forcing AN\n");

But unfortunately without success.  Still same failure.  Output when
changing peer speed:

[   54.539438] mtk_soc_eth 15100000.ethernet wan: Link is Down
[   56.619937] mtk_sgmii_select_pcs: id=1
[   56.623690] mtk_pcs_config: interface=4
[   56.627511] offset:0 0x140
[   56.627513] offset:4 0x4d544950
[   56.630215] offset:8 0x20
[   56.633340] forcing AN
[   56.638292] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1000, use_an=1
[   56.649226] mtk_pcs_link_up: interface=4
[   56.653135] offset:0 0x81140
[   56.653137] offset:4 0x4d544950
[   56.656001] offset:8 0x1
[   56.659137] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx

and the phy still reports this:

root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x000e



Bjørn
  
Russell King (Oracle) Jan. 16, 2023, 3:32 p.m. UTC | #35
On Mon, Jan 16, 2023 at 04:21:30PM +0100, Bjørn Mork wrote:
> [   54.539438] mtk_soc_eth 15100000.ethernet wan: Link is Down
> [   56.619937] mtk_sgmii_select_pcs: id=1
> [   56.623690] mtk_pcs_config: interface=4
> [   56.627511] offset:0 0x140
> [   56.627513] offset:4 0x4d544950
> [   56.630215] offset:8 0x20
> [   56.633340] forcing AN
> [   56.638292] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1000, use_an=1
> [   56.649226] mtk_pcs_link_up: interface=4
> [   56.653135] offset:0 0x81140
> [   56.653137] offset:4 0x4d544950
> [   56.656001] offset:8 0x1
> [   56.659137] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx

Thanks - there seems to be something weird with the bmcr value printed
above in the mtk_pcs_config line.

You have bmcr=0x1000, but the code sets two bits - SGMII_AN_RESTART and
SGMII_AN_ENABLE which are bits 9 and 12, so bmcr should be 0x1200, not
0x1000. Any ideas why?

Can you also hint at what the bits in the PHY register you quote mean
please?

Thanks.
  
Bjørn Mork Jan. 16, 2023, 4:33 p.m. UTC | #36
"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> On Mon, Jan 16, 2023 at 04:21:30PM +0100, Bjørn Mork wrote:
>> [   54.539438] mtk_soc_eth 15100000.ethernet wan: Link is Down
>> [   56.619937] mtk_sgmii_select_pcs: id=1
>> [   56.623690] mtk_pcs_config: interface=4
>> [   56.627511] offset:0 0x140
>> [   56.627513] offset:4 0x4d544950
>> [   56.630215] offset:8 0x20
>> [   56.633340] forcing AN
>> [   56.638292] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1000, use_an=1
>> [   56.649226] mtk_pcs_link_up: interface=4
>> [   56.653135] offset:0 0x81140
>> [   56.653137] offset:4 0x4d544950
>> [   56.656001] offset:8 0x1
>> [   56.659137] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
>
> Thanks - there seems to be something weird with the bmcr value printed
> above in the mtk_pcs_config line.
>
> You have bmcr=0x1000, but the code sets two bits - SGMII_AN_RESTART and
> SGMII_AN_ENABLE which are bits 9 and 12, so bmcr should be 0x1200, not
> 0x1000. Any ideas why?

No, not really

> Can you also hint at what the bits in the PHY register you quote mean
> please?

This could very well be a red herring.  It's the only difference I've
been able to spot, but I have no idea what it means.

This is an attempt at reformatting the pdf tables for email.  Hope it's
readable: 


VSPEC1_SGMII_STAT                                 Reset Value
Chip Level SGMII status register (Register 30.9)  0008 H

15        14..8   7     6     5     4     3     2    1 .. 0
MACSEC_*  Res     RES   Res   ANOK  RF    ANAB  LS   DR
ro                ro          ro    rolh  ro    roll ro


Field      Bits Type Description

MACSEC_CAP 15   RO   MACSEC Capability in the product
                     0 B DISABLED Product is not MACSEC capable
                     1 B ENABLED Product is MACSEC capable
RES        7    RO   Reserved
                     Ignore when read.
ANOK       5    RO   Auto-Negotiation Completed
                     Indicates whether the auto-negotiation process is completed or not.
                     0 B RUNNING Auto-negotiation process is in progress or not started
                     1 B COMPLETED Auto-negotiation process is completed
RF         4    ROLH Remote Fault
                     Indicates the detection of a remote fault event.
                     0 B INACTIVE No remote fault condition detected
                     1 B ACTIVE Remote fault condition detected
ANAB       3    RO   Auto-Negotiation Ability
                     Specifies the auto-negotiation ability.
                     0 B DISABLED PHY is not able to perform auto-negotiation
                     1 B ENABLED PHY is able to perform auto-negotiation
LS         2    ROLL Link Status
                     Indicates the link status of the SGMII
                     0 B INACTIVE The link is down. No communication with link partner possible.
                     1 B ACTIVE The link is up. Data communication with link partner is possible.
DR         1:0  RO   SGMII Data Rate
                     This field indicates the operating data rate of SGMII when link is up
                     00 B DR_10 SGMII link rate is 10 Mbit/s
                     01 B DR_100 SGMII link rate is 100 Mbit/s
                     10 B DR_1G SGMII link rate is 1000 Mbit/s
                     11 B DR_2G5 SGMII link rate is 2500 Mbit/s



Bjørn
  
Russell King (Oracle) Jan. 16, 2023, 4:43 p.m. UTC | #37
On Mon, Jan 16, 2023 at 05:33:53PM +0100, Bjørn Mork wrote:
> "Russell King (Oracle)" <linux@armlinux.org.uk> writes:
> 
> > On Mon, Jan 16, 2023 at 04:21:30PM +0100, Bjørn Mork wrote:
> >> [   54.539438] mtk_soc_eth 15100000.ethernet wan: Link is Down
> >> [   56.619937] mtk_sgmii_select_pcs: id=1
> >> [   56.623690] mtk_pcs_config: interface=4
> >> [   56.627511] offset:0 0x140
> >> [   56.627513] offset:4 0x4d544950
> >> [   56.630215] offset:8 0x20
> >> [   56.633340] forcing AN
> >> [   56.638292] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1000, use_an=1
> >> [   56.649226] mtk_pcs_link_up: interface=4
> >> [   56.653135] offset:0 0x81140
> >> [   56.653137] offset:4 0x4d544950
> >> [   56.656001] offset:8 0x1
> >> [   56.659137] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
> >
> > Thanks - there seems to be something weird with the bmcr value printed
> > above in the mtk_pcs_config line.
> >
> > You have bmcr=0x1000, but the code sets two bits - SGMII_AN_RESTART and
> > SGMII_AN_ENABLE which are bits 9 and 12, so bmcr should be 0x1200, not
> > 0x1000. Any ideas why?
> 
> No, not really
> 
> > Can you also hint at what the bits in the PHY register you quote mean
> > please?
> 
> This could very well be a red herring.  It's the only difference I've
> been able to spot, but I have no idea what it means.
> 
> This is an attempt at reformatting the pdf tables for email.  Hope it's
> readable: 

I found the document for the PHY at:

https://assets.maxlinear.com/web/documents/617792_gpy212b1vc_gpy212c0vc_ds_rev1.3.pdf

It seems as I suspected, the PHY has not completed SGMII AN. Please
can you read register 8 when operating at 1G speeds as well
(VSPEC1_SGMII_CTRL)? Thanks.
  
Bjørn Mork Jan. 16, 2023, 4:45 p.m. UTC | #38
Bjørn Mork <bjorn@mork.no> writes:

>> You have bmcr=0x1000, but the code sets two bits - SGMII_AN_RESTART and
>> SGMII_AN_ENABLE which are bits 9 and 12, so bmcr should be 0x1200, not
>> 0x1000. Any ideas why?
>
> No, not really

Doh! Looked over it again, and this was my fault of course.  Had an

  "bmcr = SGMII_AN_ENABLE;"
  
line overwriting the original value from a previous attempt without
changing the if condition.. Thanks for spotting that.

But this still doesn't work any better:

[   43.019395] mtk_soc_eth 15100000.ethernet wan: Link is Down
[   45.099898] mtk_sgmii_select_pcs: id=1
[   45.103653] mtk_pcs_config: interface=4
[   45.107473] offset:0 0x140
[   45.107476] offset:4 0x4d544950
[   45.110181] offset:8 0x20
[   45.113305] forcing AN
[   45.118256] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
[   45.129191] mtk_pcs_link_up: interface=4
[   45.133100] offset:0 0x81140
[   45.133102] offset:4 0x4d544950
[   45.135967] offset:8 0x1
[   45.139104] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx



Bjørn
  
Bjørn Mork Jan. 16, 2023, 4:48 p.m. UTC | #39
"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> I found the document for the PHY at:
>
> https://assets.maxlinear.com/web/documents/617792_gpy212b1vc_gpy212c0vc_ds_rev1.3.pdf
>
> It seems as I suspected, the PHY has not completed SGMII AN. Please
> can you read register 8 when operating at 1G speeds as well
> (VSPEC1_SGMII_CTRL)? Thanks.

Both phys at 1G:

root@OpenWrt:/# mdio mdio-bus 5:30 raw 8
0x34da

root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
0x34da


Bjørn
  
Russell King (Oracle) Jan. 16, 2023, 5:47 p.m. UTC | #40
On Mon, Jan 16, 2023 at 05:45:12PM +0100, Bjørn Mork wrote:
> Bjørn Mork <bjorn@mork.no> writes:
> 
> >> You have bmcr=0x1000, but the code sets two bits - SGMII_AN_RESTART and
> >> SGMII_AN_ENABLE which are bits 9 and 12, so bmcr should be 0x1200, not
> >> 0x1000. Any ideas why?
> >
> > No, not really
> 
> Doh! Looked over it again, and this was my fault of course.  Had an
> 
>   "bmcr = SGMII_AN_ENABLE;"
>   
> line overwriting the original value from a previous attempt without
> changing the if condition.. Thanks for spotting that.
> 
> But this still doesn't work any better:
> 
> [   43.019395] mtk_soc_eth 15100000.ethernet wan: Link is Down
> [   45.099898] mtk_sgmii_select_pcs: id=1
> [   45.103653] mtk_pcs_config: interface=4
> [   45.107473] offset:0 0x140
> [   45.107476] offset:4 0x4d544950
> [   45.110181] offset:8 0x20
> [   45.113305] forcing AN
> [   45.118256] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
> [   45.129191] mtk_pcs_link_up: interface=4
> [   45.133100] offset:0 0x81140
> [   45.133102] offset:4 0x4d544950
> [   45.135967] offset:8 0x1
> [   45.139104] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx

In your _dump_pcs_ctrl() function, please can you dump the
SGMSYS_SGMII_MODE register as well (offset 0x20), in case this gives
more clue as to what's going on.

Thanks.
  
Bjørn Mork Jan. 16, 2023, 5:59 p.m. UTC | #41
"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> On Mon, Jan 16, 2023 at 05:45:12PM +0100, Bjørn Mork wrote:
>> Bjørn Mork <bjorn@mork.no> writes:
>> 
>> >> You have bmcr=0x1000, but the code sets two bits - SGMII_AN_RESTART and
>> >> SGMII_AN_ENABLE which are bits 9 and 12, so bmcr should be 0x1200, not
>> >> 0x1000. Any ideas why?
>> >
>> > No, not really
>> 
>> Doh! Looked over it again, and this was my fault of course.  Had an
>> 
>>   "bmcr = SGMII_AN_ENABLE;"
>>   
>> line overwriting the original value from a previous attempt without
>> changing the if condition.. Thanks for spotting that.
>> 
>> But this still doesn't work any better:
>> 
>> [   43.019395] mtk_soc_eth 15100000.ethernet wan: Link is Down
>> [   45.099898] mtk_sgmii_select_pcs: id=1
>> [   45.103653] mtk_pcs_config: interface=4
>> [   45.107473] offset:0 0x140
>> [   45.107476] offset:4 0x4d544950
>> [   45.110181] offset:8 0x20
>> [   45.113305] forcing AN
>> [   45.118256] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
>> [   45.129191] mtk_pcs_link_up: interface=4
>> [   45.133100] offset:0 0x81140
>> [   45.133102] offset:4 0x4d544950
>> [   45.135967] offset:8 0x1
>> [   45.139104] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
>
> In your _dump_pcs_ctrl() function, please can you dump the
> SGMSYS_SGMII_MODE register as well (offset 0x20), in case this gives
> more clue as to what's going on.


[   49.339410] mtk_soc_eth 15100000.ethernet wan: Link is Down
[   52.459913] mtk_sgmii_select_pcs: id=1
[   52.463673] mtk_pcs_config: interface=4
[   52.467494] offset:0 0x140
[   52.467496] offset:4 0x4d544950
[   52.470199] offset:8 0x20
[   52.473325] offset:20 0x10000
[   52.475929] forcing AN
[   52.481232] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
[   52.492166] mtk_pcs_link_up: interface=4
[   52.496072] offset:0 0x81140
[   52.496074] offset:4 0x4d544950
[   52.498938] offset:8 0x1
[   52.502067] offset:20 0x10000
[   52.504599] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
[   65.979410] mtk_soc_eth 15100000.ethernet wan: Link is Down
[   70.139856] mtk_sgmii_select_pcs: id=1
[   70.143616] mtk_pcs_config: interface=22
[   70.147523] offset:0 0x81140
[   70.147525] offset:4 0x4d544950
[   70.150402] offset:8 0x1
[   70.153526] offset:20 0x10000
[   70.156049] mtk_pcs_config: rgc3=0x4, advertise=0x20 (changed), link_timer=10000000,  sgm_mode=0x0, bmcr=0x0, use_an=0
[   70.169672] mtk_pcs_link_up: interface=22
[   70.173664] offset:0 0x40140
[   70.173666] offset:4 0x4d544950
[   70.176530] offset:8 0x20
[   70.179659] offset:20 0x10000
[   70.182279] mtk_soc_eth 15100000.ethernet wan: Link is Up - 2.5Gbps/Full - flow control rx/tx


Bjørn
  
Bjørn Mork Jan. 16, 2023, 6:04 p.m. UTC | #42
Bjørn Mork <bjorn@mork.no> writes:

> [   52.473325] offset:20 0x10000

Should have warned about my inability to write the simplest code without
adding more bugs than characters.  20 != 0x20

I hope this makes more sense:


[   44.139420] mtk_soc_eth 15100000.ethernet wan: Link is Down
[   47.259922] mtk_sgmii_select_pcs: id=1
[   47.263683] mtk_pcs_config: interface=4
[   47.267503] offset:0 0x140
[   47.267505] offset:4 0x4d544950
[   47.270210] offset:8 0x20
[   47.273335] offset:0x20 0x31120018
[   47.275939] forcing AN
[   47.281676] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
[   47.292610] mtk_pcs_link_up: interface=4
[   47.296516] offset:0 0x81140
[   47.296518] offset:4 0x4d544950
[   47.299387] offset:8 0x1
[   47.302512] offset:0x20 0x3112011b
[   47.305043] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
[   56.619420] mtk_soc_eth 15100000.ethernet wan: Link is Down
[   60.779865] mtk_sgmii_select_pcs: id=1
[   60.783623] mtk_pcs_config: interface=22
[   60.787531] offset:0 0x81140
[   60.787533] offset:4 0x4d544950
[   60.790409] offset:8 0x1
[   60.793535] offset:0x20 0x3112011b
[   60.796057] mtk_pcs_config: rgc3=0x4, advertise=0x20 (changed), link_timer=10000000,  sgm_mode=0x0, bmcr=0x0, use_an=0
[   60.810117] mtk_pcs_link_up: interface=22
[   60.814110] offset:0 0x40140
[   60.814112] offset:4 0x4d544950
[   60.816976] offset:8 0x20
[   60.820105] offset:0x20 0x31120018
[   60.822723] mtk_soc_eth 15100000.ethernet wan: Link is Up - 2.5Gbps/Full - flow control rx/tx


Bjørn
  
Russell King (Oracle) Jan. 16, 2023, 6:06 p.m. UTC | #43
On Mon, Jan 16, 2023 at 06:59:19PM +0100, Bjørn Mork wrote:
> "Russell King (Oracle)" <linux@armlinux.org.uk> writes:
> 
> > On Mon, Jan 16, 2023 at 05:45:12PM +0100, Bjørn Mork wrote:
> >> Bjørn Mork <bjorn@mork.no> writes:
> >> 
> >> >> You have bmcr=0x1000, but the code sets two bits - SGMII_AN_RESTART and
> >> >> SGMII_AN_ENABLE which are bits 9 and 12, so bmcr should be 0x1200, not
> >> >> 0x1000. Any ideas why?
> >> >
> >> > No, not really
> >> 
> >> Doh! Looked over it again, and this was my fault of course.  Had an
> >> 
> >>   "bmcr = SGMII_AN_ENABLE;"
> >>   
> >> line overwriting the original value from a previous attempt without
> >> changing the if condition.. Thanks for spotting that.
> >> 
> >> But this still doesn't work any better:
> >> 
> >> [   43.019395] mtk_soc_eth 15100000.ethernet wan: Link is Down
> >> [   45.099898] mtk_sgmii_select_pcs: id=1
> >> [   45.103653] mtk_pcs_config: interface=4
> >> [   45.107473] offset:0 0x140
> >> [   45.107476] offset:4 0x4d544950
> >> [   45.110181] offset:8 0x20
> >> [   45.113305] forcing AN
> >> [   45.118256] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
> >> [   45.129191] mtk_pcs_link_up: interface=4
> >> [   45.133100] offset:0 0x81140
> >> [   45.133102] offset:4 0x4d544950
> >> [   45.135967] offset:8 0x1
> >> [   45.139104] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
> >
> > In your _dump_pcs_ctrl() function, please can you dump the
> > SGMSYS_SGMII_MODE register as well (offset 0x20), in case this gives
> > more clue as to what's going on.
> 
> 
> [   49.339410] mtk_soc_eth 15100000.ethernet wan: Link is Down
> [   52.459913] mtk_sgmii_select_pcs: id=1
> [   52.463673] mtk_pcs_config: interface=4
> [   52.467494] offset:0 0x140
> [   52.467496] offset:4 0x4d544950
> [   52.470199] offset:8 0x20
> [   52.473325] offset:20 0x10000
> [   52.475929] forcing AN
> [   52.481232] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
> [   52.492166] mtk_pcs_link_up: interface=4
> [   52.496072] offset:0 0x81140
> [   52.496074] offset:4 0x4d544950
> [   52.498938] offset:8 0x1
> [   52.502067] offset:20 0x10000
> [   52.504599] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
> [   65.979410] mtk_soc_eth 15100000.ethernet wan: Link is Down
> [   70.139856] mtk_sgmii_select_pcs: id=1
> [   70.143616] mtk_pcs_config: interface=22
> [   70.147523] offset:0 0x81140
> [   70.147525] offset:4 0x4d544950
> [   70.150402] offset:8 0x1
> [   70.153526] offset:20 0x10000
> [   70.156049] mtk_pcs_config: rgc3=0x4, advertise=0x20 (changed), link_timer=10000000,  sgm_mode=0x0, bmcr=0x0, use_an=0
> [   70.169672] mtk_pcs_link_up: interface=22
> [   70.173664] offset:0 0x40140
> [   70.173666] offset:4 0x4d544950
> [   70.176530] offset:8 0x20
> [   70.179659] offset:20 0x10000
> [   70.182279] mtk_soc_eth 15100000.ethernet wan: Link is Up - 2.5Gbps/Full - flow control rx/tx

Umm. What's at offset:20 seems to be unprogrammable - it always reads
back with only bit 16 set! This probably explains why it's not working,
as it looks like it can't be programmed to operate in SGMII mode!
  
Russell King (Oracle) Jan. 16, 2023, 6:14 p.m. UTC | #44
On Mon, Jan 16, 2023 at 07:04:53PM +0100, Bjørn Mork wrote:
> Bjørn Mork <bjorn@mork.no> writes:
> 
> > [   52.473325] offset:20 0x10000
> 
> Should have warned about my inability to write the simplest code without
> adding more bugs than characters.  20 != 0x20

Ah, that kind of explains the lack of change in the values at offset 20!

> [   44.139420] mtk_soc_eth 15100000.ethernet wan: Link is Down
> [   47.259922] mtk_sgmii_select_pcs: id=1
> [   47.263683] mtk_pcs_config: interface=4
> [   47.267503] offset:0 0x140
> [   47.267505] offset:4 0x4d544950
> [   47.270210] offset:8 0x20
> [   47.273335] offset:0x20 0x31120018
> [   47.275939] forcing AN
> [   47.281676] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
> [   47.292610] mtk_pcs_link_up: interface=4
> [   47.296516] offset:0 0x81140
> [   47.296518] offset:4 0x4d544950
> [   47.299387] offset:8 0x1
> [   47.302512] offset:0x20 0x3112011b
> [   47.305043] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
> [   56.619420] mtk_soc_eth 15100000.ethernet wan: Link is Down
> [   60.779865] mtk_sgmii_select_pcs: id=1
> [   60.783623] mtk_pcs_config: interface=22
> [   60.787531] offset:0 0x81140
> [   60.787533] offset:4 0x4d544950
> [   60.790409] offset:8 0x1
> [   60.793535] offset:0x20 0x3112011b
> [   60.796057] mtk_pcs_config: rgc3=0x4, advertise=0x20 (changed), link_timer=10000000,  sgm_mode=0x0, bmcr=0x0, use_an=0
> [   60.810117] mtk_pcs_link_up: interface=22
> [   60.814110] offset:0 0x40140
> [   60.814112] offset:4 0x4d544950
> [   60.816976] offset:8 0x20
> [   60.820105] offset:0x20 0x31120018
> [   60.822723] mtk_soc_eth 15100000.ethernet wan: Link is Up - 2.5Gbps/Full - flow control rx/tx

That all looks fine. However, I'm running out of ideas. What we
seem to have is:

PHY:
VSPEC1_SGMII_CTRL = 0x34da
VSPEC1_SGMII_STAT = 0x000e

The PHY is programmed to exchange SGMII with the host PCS, and it
says that it hasn't completed that exchange (bit 5 of STAT).

The Mediatek PCS says:
BMCR = 0x1140		AN enabled
BMSR = 0x0008		AN capable
ADVERTISE = 0x0001	SGMII response (bit 14 is clear, hardware is
			supposed to manage that bit)
LPA = 0x0000		SGMII received control word (nothing)
SGMII_MODE = 0x011b	SGMII mode, duplex AN, 1000M, Full duplex,
			Remote fault disable

which all looks like it should work - but it isn't.

One last thing I can think of trying at the moment would be writing
the VSPEC1_SGMII_CTRL with 0x36da, setting bit 9 which allegedly
restarts the SGMII exchange. There's some comments in the PHY driver
that this may be needed - maybe it's necessary once the MAC's PCS
has been switched to SGMII mode.
  
Bjørn Mork Jan. 16, 2023, 6:30 p.m. UTC | #45
"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> That all looks fine. However, I'm running out of ideas.

Thanks a lot for the effort in any case.  It's comforting that even the
top experts can't figure out this one :-)


> What we seem to have is:
>
> PHY:
> VSPEC1_SGMII_CTRL = 0x34da
> VSPEC1_SGMII_STAT = 0x000e
>
> The PHY is programmed to exchange SGMII with the host PCS, and it
> says that it hasn't completed that exchange (bit 5 of STAT).
>
> The Mediatek PCS says:
> BMCR = 0x1140		AN enabled
> BMSR = 0x0008		AN capable
> ADVERTISE = 0x0001	SGMII response (bit 14 is clear, hardware is
> 			supposed to manage that bit)
> LPA = 0x0000		SGMII received control word (nothing)
> SGMII_MODE = 0x011b	SGMII mode, duplex AN, 1000M, Full duplex,
> 			Remote fault disable
>
> which all looks like it should work - but it isn't.
>
> One last thing I can think of trying at the moment would be writing
> the VSPEC1_SGMII_CTRL with 0x36da, setting bit 9 which allegedly
> restarts the SGMII exchange. There's some comments in the PHY driver
> that this may be needed - maybe it's necessary once the MAC's PCS
> has been switched to SGMII mode.


Tried that now.  Didn't change anything.  And still no packets.

root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
0x34da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x000e
root@OpenWrt:/# mdio mdio-bus 6:30 raw 8 0x36da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
0x34da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x000e


Bjørn
  
Bjørn Mork Jan. 16, 2023, 6:50 p.m. UTC | #46
Bjørn Mork <bjorn@mork.no> writes:
> "Russell King (Oracle)" <linux@armlinux.org.uk> writes:
>
>> That all looks fine. However, I'm running out of ideas.
>
> Thanks a lot for the effort in any case.  It's comforting that even the
> top experts can't figure out this one :-)
>
>
>> What we seem to have is:
>>
>> PHY:
>> VSPEC1_SGMII_CTRL = 0x34da
>> VSPEC1_SGMII_STAT = 0x000e
>>
>> The PHY is programmed to exchange SGMII with the host PCS, and it
>> says that it hasn't completed that exchange (bit 5 of STAT).
>>
>> The Mediatek PCS says:
>> BMCR = 0x1140		AN enabled
>> BMSR = 0x0008		AN capable
>> ADVERTISE = 0x0001	SGMII response (bit 14 is clear, hardware is
>> 			supposed to manage that bit)
>> LPA = 0x0000		SGMII received control word (nothing)
>> SGMII_MODE = 0x011b	SGMII mode, duplex AN, 1000M, Full duplex,
>> 			Remote fault disable
>>
>> which all looks like it should work - but it isn't.
>>
>> One last thing I can think of trying at the moment would be writing
>> the VSPEC1_SGMII_CTRL with 0x36da, setting bit 9 which allegedly
>> restarts the SGMII exchange. There's some comments in the PHY driver
>> that this may be needed - maybe it's necessary once the MAC's PCS
>> has been switched to SGMII mode.
>
>
> Tried that now.  Didn't change anything.  And still no packets.
>
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
> 0x34da
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
> 0x000e
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8 0x36da
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
> 0x34da
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
> 0x000e

And just as we were about to give up I got another datapoint.

Changed phy-mode and managed mode as follows in the device tree as a
last desperate attempt:

	mac@1 {
		compatible = "mediatek,eth-mac";
		reg = <1>;
		label = "wan";
//		phy-mode = "2500base-x";
		phy-mode = "sgmii";
		managed = "in-band-status";
		phy-handle = <&phy6>;
	};

Made things fail with 2.5G, as expected I guess. But this actually works
with 1G!

Except for an unexpected packet drop.  But at least there are packets
coming through at 1G now.  This is the remote end of the link:

ns-enp3s0# ethtool -s enp3s0 autoneg off speed 1000 duplex full
ns-enp3s0# ping  192.168.0.1
PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.
64 bytes from 192.168.0.1: icmp_seq=1 ttl=64 time=0.544 ms
64 bytes from 192.168.0.1: icmp_seq=3 ttl=64 time=0.283 ms
64 bytes from 192.168.0.1: icmp_seq=4 ttl=64 time=0.261 ms
64 bytes from 192.168.0.1: icmp_seq=5 ttl=64 time=0.295 ms
64 bytes from 192.168.0.1: icmp_seq=6 ttl=64 time=0.273 ms
64 bytes from 192.168.0.1: icmp_seq=7 ttl=64 time=0.290 ms
64 bytes from 192.168.0.1: icmp_seq=8 ttl=64 time=0.266 ms
64 bytes from 192.168.0.1: icmp_seq=9 ttl=64 time=0.269 ms
64 bytes from 192.168.0.1: icmp_seq=10 ttl=64 time=0.270 ms
64 bytes from 192.168.0.1: icmp_seq=11 ttl=64 time=0.261 ms
64 bytes from 192.168.0.1: icmp_seq=12 ttl=64 time=0.261 ms
64 bytes from 192.168.0.1: icmp_seq=13 ttl=64 time=0.266 ms
^C
--- 192.168.0.1 ping statistics ---
13 packets transmitted, 12 received, 7.69231% packet loss, time 12282ms
rtt min/avg/max/mdev = 0.261/0.294/0.544/0.075 ms
ns-enp3s0# ethtool enp3s0
Settings for enp3s0:
        Supported ports: [ TP ]
        Supported link modes:   100baseT/Full
                                1000baseT/Full
                                10000baseT/Full
                                2500baseT/Full
                                5000baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  1000baseT/Full
        Advertised pause frame use: Symmetric
        Advertised auto-negotiation: No
        Advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: off
        Port: Twisted Pair
        PHYAD: 0
        Transceiver: internal
        MDI-X: Unknown
        Supports Wake-on: pg
        Wake-on: g
        Current message level: 0x00000005 (5)
                               drv link
        Link detected: yes
.

The MT7986 end looks like this:

root@OpenWrt:/# [   55.659413] mtk_pcs_get_state: bm=0x81140, adv=0x1a0
[   55.664380] mtk_pcs_get_state: bm=0x81140, adv=0x1a0
[   58.779924] mtk_pcs_get_state: bm=0x81140, adv=0x1a0
[   58.784884] mtk_pcs_get_state: bm=0x81140, adv=0x1a0
[   58.789841] mtk_sgmii_select_pcs: id=1
[   58.793581] mtk_pcs_config: interface=4
[   58.797399] offset:0 0x81140
[   58.797401] offset:4 0x4d544950
[   58.800273] offset:8 0x1a0
[   58.803397] offset:0x20 0x31120118
[   58.806089] forcing AN
[   58.811826] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
[   58.822759] mtk_pcs_restart_an
[   58.825800] mtk_pcs_get_state: bm=0x81140, adv=0xda014001
[   58.831184] mtk_pcs_get_state: bm=0x2c1140, adv=0xda014001
[   58.836649] mtk_pcs_link_up: interface=4
[   58.840559] offset:0 0xac1140
[   58.840561] offset:4 0x4d544950
[   58.843512] offset:8 0xda014001
[   58.846636] offset:0x20 0x3112011b
[   58.849780] mtk_soc_eth 15100000.ethernet wan: Link is Up - 1Gbps/Full - flow control rx/tx
[   58.861521] IPv6: ADDRCONF(NETDEV_CHANGE): wan: link becomes ready
root@OpenWrt:/# ethtool wan
Settings for wan:
        Supported ports: [  ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Supported pause frame use: Symmetric Receive-only
        Supports auto-negotiation: Yes
        Supported FEC modes: Not reported
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
                                1000baseT/Full
                                2500baseT/Full
        Advertised pause frame use: Symmetric Receive-only
        Advertised auto-negotiation: Yes
        Advertised FEC modes: Not reported
        Link partner advertised link modes:  1000baseT/Full
        Link partner advertised pause frame use: Symmetric Receive-only
        Link partner advertised auto-negotiation: Yes
        Link partner advertised FEC modes: Not reported
        Speed: 1000Mb/s
        Duplex: Full
        Auto-negotiation: on
        master-slave cfg: preferred slave
        master-slave status: master
        Port: Twisted Pair
        PHYAD: 6
        Transceiver: external
        MDI-X: on (auto)
        Current message level: 0x000000ff (255)
                               drv probe link timer ifdown ifup rx_err tx_err
        Link detected: yes
root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
0x34da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x002e

And naturally 100M works too:

root@OpenWrt:/# [  528.859412] mtk_pcs_get_state: bm=0x2c1140, adv=0x5a014001
[  528.864908] mtk_soc_eth 15100000.ethernet wan: Link is Down
[  528.870513] mtk_pcs_get_state: bm=0x2c1140, adv=0x5a014001
[  528.875983] mtk_pcs_get_state: bm=0x2c1140, adv=0x5a014001
[  530.939756] mtk_pcs_get_state: bm=0x2c1140, adv=0xd6014001
[  530.945238] mtk_pcs_link_up: interface=4
[  530.949143] offset:0 0x2c1140
[  530.949145] offset:4 0x4d544950
[  530.952107] offset:8 0xd6014001
[  530.955232] offset:0x20 0x3112011b
[  530.958368] mtk_soc_eth 15100000.ethernet wan: Link is Up - 100Mbps/Full - flow control rx/tx
root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
0x34da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x002d


Now, if we only could figure out what the difference is between this and
what we configure when the mode is changed from 2500base-x to sgmii.



Bjørn
  
Russell King (Oracle) Jan. 16, 2023, 6:54 p.m. UTC | #47
On Mon, Jan 16, 2023 at 07:30:27PM +0100, Bjørn Mork wrote:
> "Russell King (Oracle)" <linux@armlinux.org.uk> writes:
> 
> > That all looks fine. However, I'm running out of ideas.
> 
> Thanks a lot for the effort in any case.  It's comforting that even the
> top experts can't figure out this one :-)
> 
> 
> > What we seem to have is:
> >
> > PHY:
> > VSPEC1_SGMII_CTRL = 0x34da
> > VSPEC1_SGMII_STAT = 0x000e
> >
> > The PHY is programmed to exchange SGMII with the host PCS, and it
> > says that it hasn't completed that exchange (bit 5 of STAT).
> >
> > The Mediatek PCS says:
> > BMCR = 0x1140		AN enabled
> > BMSR = 0x0008		AN capable
> > ADVERTISE = 0x0001	SGMII response (bit 14 is clear, hardware is
> > 			supposed to manage that bit)
> > LPA = 0x0000		SGMII received control word (nothing)
> > SGMII_MODE = 0x011b	SGMII mode, duplex AN, 1000M, Full duplex,
> > 			Remote fault disable
> >
> > which all looks like it should work - but it isn't.
> >
> > One last thing I can think of trying at the moment would be writing
> > the VSPEC1_SGMII_CTRL with 0x36da, setting bit 9 which allegedly
> > restarts the SGMII exchange. There's some comments in the PHY driver
> > that this may be needed - maybe it's necessary once the MAC's PCS
> > has been switched to SGMII mode.
> 
> 
> Tried that now.  Didn't change anything.  And still no packets.
> 
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
> 0x34da
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
> 0x000e
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8 0x36da
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
> 0x34da
> root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
> 0x000e

If bit 9 is indeed the restart-an bit, it will be self-clearing, so
I wouldn't expect a read back of it to change to 0x36da.

I guess next thing to try is clearing and setting the AN enable bit,
bit 12, so please try this:

mdio mdio-bus 6:30 raw 8 0x24da
mdio mdio-bus 6:30 raw 8 0x36da
mdio mdio-bus 6:30 raw 9

If that doesn't work, then let's try something a bit harder:

mdio mdio-bus 6:30 raw 8 0xb4da
mdio mdio-bus 6:30 raw 9

Please let me know the results from those.

Thanks.
  
Bjørn Mork Jan. 16, 2023, 6:59 p.m. UTC | #48
"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> On Mon, Jan 16, 2023 at 07:30:27PM +0100, Bjørn Mork wrote:
>> "Russell King (Oracle)" <linux@armlinux.org.uk> writes:
>> 
>> > That all looks fine. However, I'm running out of ideas.
>> 
>> Thanks a lot for the effort in any case.  It's comforting that even the
>> top experts can't figure out this one :-)
>> 
>> 
>> > What we seem to have is:
>> >
>> > PHY:
>> > VSPEC1_SGMII_CTRL = 0x34da
>> > VSPEC1_SGMII_STAT = 0x000e
>> >
>> > The PHY is programmed to exchange SGMII with the host PCS, and it
>> > says that it hasn't completed that exchange (bit 5 of STAT).
>> >
>> > The Mediatek PCS says:
>> > BMCR = 0x1140		AN enabled
>> > BMSR = 0x0008		AN capable
>> > ADVERTISE = 0x0001	SGMII response (bit 14 is clear, hardware is
>> > 			supposed to manage that bit)
>> > LPA = 0x0000		SGMII received control word (nothing)
>> > SGMII_MODE = 0x011b	SGMII mode, duplex AN, 1000M, Full duplex,
>> > 			Remote fault disable
>> >
>> > which all looks like it should work - but it isn't.
>> >
>> > One last thing I can think of trying at the moment would be writing
>> > the VSPEC1_SGMII_CTRL with 0x36da, setting bit 9 which allegedly
>> > restarts the SGMII exchange. There's some comments in the PHY driver
>> > that this may be needed - maybe it's necessary once the MAC's PCS
>> > has been switched to SGMII mode.
>> 
>> 
>> Tried that now.  Didn't change anything.  And still no packets.
>> 
>> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
>> 0x34da
>> root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
>> 0x000e
>> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8 0x36da
>> root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
>> 0x34da
>> root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
>> 0x000e
>
> If bit 9 is indeed the restart-an bit, it will be self-clearing, so
> I wouldn't expect a read back of it to change to 0x36da.
>
> I guess next thing to try is clearing and setting the AN enable bit,
> bit 12, so please try this:
>
> mdio mdio-bus 6:30 raw 8 0x24da
> mdio mdio-bus 6:30 raw 8 0x36da
> mdio mdio-bus 6:30 raw 9
>
> If that doesn't work, then let's try something a bit harder:
>
> mdio mdio-bus 6:30 raw 8 0xb4da
> mdio mdio-bus 6:30 raw 9
>
> Please let me know the results from those.

OK, back to the original dts with phy-mode = "2500base-x", with peer set
to 1G.  Still no success:

root@OpenWrt:/# mdio mdio-bus 6:30 raw 8
0x34da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x000e
root@OpenWrt:/#  mdio mdio-bus 6:30 raw 8 0x24da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 8 0x36da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x000e
root@OpenWrt:/# mdio mdio-bus 6:30 raw 8 0xb4da
root@OpenWrt:/# mdio mdio-bus 6:30 raw 9
0x000e



Bjørn
  
Russell King (Oracle) Jan. 16, 2023, 7:15 p.m. UTC | #49
On Mon, Jan 16, 2023 at 07:50:52PM +0100, Bjørn Mork wrote:
> Made things fail with 2.5G, as expected I guess. But this actually works
> with 1G!
> 
> Except for an unexpected packet drop.  But at least there are packets
> coming through at 1G now.  This is the remote end of the link:
> 
> ns-enp3s0# ethtool -s enp3s0 autoneg off speed 1000 duplex full
> ns-enp3s0# ping  192.168.0.1
> PING 192.168.0.1 (192.168.0.1) 56(84) bytes of data.
> 64 bytes from 192.168.0.1: icmp_seq=1 ttl=64 time=0.544 ms
> 64 bytes from 192.168.0.1: icmp_seq=3 ttl=64 time=0.283 ms
> 64 bytes from 192.168.0.1: icmp_seq=4 ttl=64 time=0.261 ms
> 64 bytes from 192.168.0.1: icmp_seq=5 ttl=64 time=0.295 ms
> 64 bytes from 192.168.0.1: icmp_seq=6 ttl=64 time=0.273 ms
> 64 bytes from 192.168.0.1: icmp_seq=7 ttl=64 time=0.290 ms
> 64 bytes from 192.168.0.1: icmp_seq=8 ttl=64 time=0.266 ms
> 64 bytes from 192.168.0.1: icmp_seq=9 ttl=64 time=0.269 ms
> 64 bytes from 192.168.0.1: icmp_seq=10 ttl=64 time=0.270 ms
> 64 bytes from 192.168.0.1: icmp_seq=11 ttl=64 time=0.261 ms
> 64 bytes from 192.168.0.1: icmp_seq=12 ttl=64 time=0.261 ms
> 64 bytes from 192.168.0.1: icmp_seq=13 ttl=64 time=0.266 ms
> ^C
> --- 192.168.0.1 ping statistics ---
> 13 packets transmitted, 12 received, 7.69231% packet loss, time 12282ms
> rtt min/avg/max/mdev = 0.261/0.294/0.544/0.075 ms
> ns-enp3s0# ethtool enp3s0
> Settings for enp3s0:
>         Supported ports: [ TP ]
>         Supported link modes:   100baseT/Full
>                                 1000baseT/Full
>                                 10000baseT/Full
>                                 2500baseT/Full
>                                 5000baseT/Full
>         Supported pause frame use: Symmetric Receive-only
>         Supports auto-negotiation: Yes
>         Supported FEC modes: Not reported
>         Advertised link modes:  1000baseT/Full
>         Advertised pause frame use: Symmetric
>         Advertised auto-negotiation: No
>         Advertised FEC modes: Not reported
>         Speed: 1000Mb/s
>         Duplex: Full
>         Auto-negotiation: off
>         Port: Twisted Pair
>         PHYAD: 0
>         Transceiver: internal
>         MDI-X: Unknown
>         Supports Wake-on: pg
>         Wake-on: g
>         Current message level: 0x00000005 (5)
>                                drv link
>         Link detected: yes
> .
> 
> The MT7986 end looks like this:
> 
> root@OpenWrt:/# [   55.659413] mtk_pcs_get_state: bm=0x81140, adv=0x1a0
> [   55.664380] mtk_pcs_get_state: bm=0x81140, adv=0x1a0
> [   58.779924] mtk_pcs_get_state: bm=0x81140, adv=0x1a0
> [   58.784884] mtk_pcs_get_state: bm=0x81140, adv=0x1a0
> [   58.789841] mtk_sgmii_select_pcs: id=1
> [   58.793581] mtk_pcs_config: interface=4
> [   58.797399] offset:0 0x81140
> [   58.797401] offset:4 0x4d544950
> [   58.800273] offset:8 0x1a0
> [   58.803397] offset:0x20 0x31120118

This looks like it's configured for 1000base-X at this point.

> [   58.806089] forcing AN
> [   58.811826] mtk_pcs_config: rgc3=0x0, advertise=0x1 (changed), link_timer=1600000,  sgm_mode=0x103, bmcr=0x1200, use_an=1
> [   58.822759] mtk_pcs_restart_an
> [   58.825800] mtk_pcs_get_state: bm=0x81140, adv=0xda014001
> [   58.831184] mtk_pcs_get_state: bm=0x2c1140, adv=0xda014001
> [   58.836649] mtk_pcs_link_up: interface=4
> [   58.840559] offset:0 0xac1140
> [   58.840561] offset:4 0x4d544950
> [   58.843512] offset:8 0xda014001
> [   58.846636] offset:0x20 0x3112011b

and here we've reconfigured it for SGMII mode - and we can see the
Mediatek PCS has set the ACK bit (bit 14) in its advertisement
register as one would expect.

> Now, if we only could figure out what the difference is between this and
> what we configure when the mode is changed from 2500base-x to sgmii.

Maybe there is something missing which we need to do on the Mediatek
side to properly switch between 2500base-X and SGMII.

It has a feel of a problem changing the underlying link speed between
3.125 and 1.25Gbps, which is done by the ANA_RGC3 register.

Hmm, maybe the PCS needs to be powered down to change the speed, in
other words, SGMII_PHYA_PWD needs to be set before the write to the
ANA_RGC3 register?
  

Patch

diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
index 736839c84130..8b7465057e57 100644
--- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
+++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
@@ -122,7 +122,21 @@  static void mtk_pcs_link_up(struct phylink_pcs *pcs, unsigned int mode,
 	regmap_write(mpcs->regmap, SGMSYS_SGMII_MODE, val);
 }
 
+static void mtk_pcs_get_state(struct phylink_pcs *pcs, struct phylink_link_state *state)
+{
+	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
+	unsigned int val;
+
+	regmap_read(mpcs->regmap, mpcs->ana_rgc3, &val);
+	state->speed = val & RG_PHY_SPEED_3_125G ? SPEED_2500 : SPEED_1000;
+
+	regmap_read(mpcs->regmap, SGMSYS_PCS_CONTROL_1, &val);
+	state->an_complete = !!(val & SGMII_AN_COMPLETE);
+	state->link = !!(val & SGMII_LINK_STATYS);
+}
+
 static const struct phylink_pcs_ops mtk_pcs_ops = {
+	.pcs_get_state = mtk_pcs_get_state,
 	.pcs_config = mtk_pcs_config,
 	.pcs_an_restart = mtk_pcs_restart_an,
 	.pcs_link_up = mtk_pcs_link_up,