[net-next,v12,08/18] net: ethernet: mtk_eth_soc: fix 1000Base-X and 2500Base-X modes
Message ID | fd5c7ea79a7f84caac7d0b64b39fe5c4043edfa8.1678201958.git.daniel@makrotopia.org |
---|---|
State | New |
Headers |
Date: Tue, 7 Mar 2023 15:53:58 +0000
From: Daniel Golle <daniel@makrotopia.org>
To: netdev@vger.kernel.org, linux-mediatek@lists.infradead.org, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Russell King <linux@armlinux.org.uk>, Heiner Kallweit <hkallweit1@gmail.com>, Lorenzo Bianconi <lorenzo@kernel.org>, Mark Lee <Mark-MC.Lee@mediatek.com>, John Crispin <john@phrozen.org>, Felix Fietkau <nbd@nbd.name>, AngeloGioacchino Del Regno <angelogioacchino.delregno@collabora.com>, Matthias Brugger <matthias.bgg@gmail.com>, DENG Qingfang <dqfext@gmail.com>, Landen Chao <Landen.Chao@mediatek.com>, Sean Wang <sean.wang@mediatek.com>, Paolo Abeni <pabeni@redhat.com>, Jakub Kicinski <kuba@kernel.org>, Eric Dumazet <edumazet@google.com>, "David S. Miller" <davem@davemloft.net>, Vladimir Oltean <olteanv@gmail.com>, Florian Fainelli <f.fainelli@gmail.com>, Andrew Lunn <andrew@lunn.ch>, Vladimir Oltean <vladimir.oltean@nxp.com>
Cc: Jianhui Zhao <zhaojh329@gmail.com>, Bjørn Mork <bjorn@mork.no>, Frank Wunderlich <frank-w@public-files.de>, Alexander Couzens <lynxis@fe80.eu>
Subject: [PATCH net-next v12 08/18] net: ethernet: mtk_eth_soc: fix 1000Base-X and 2500Base-X modes
Message-ID: <fd5c7ea79a7f84caac7d0b64b39fe5c4043edfa8.1678201958.git.daniel@makrotopia.org>
References: <cover.1678201958.git.daniel@makrotopia.org>
In-Reply-To: <cover.1678201958.git.daniel@makrotopia.org> |
Series |
net: ethernet: mtk_eth_soc: various enhancements
|
|
Commit Message
Daniel Golle
March 7, 2023, 3:53 p.m. UTC
After conversion to phylink_pcs the 1000Base-X and 2500Base-X modes
would work only after `ethtool -s eth1 autoneg off`.
As ethtool autoneg and the ETHTOOL_LINK_MODE_Autoneg_BIT is supposed
to control auto-negotiation on the external interface it doesn't make
much sense to use it to control on-board SGMII auto-negotiation between
MAC and PHY.
Set correct values to really only enable SGMII auto-negotiation when
actually operating in SGMII mode. For 1000Base-X and 2500Base-X mode,
enable remote-fault detection only if in-band-status is enabled.
This fixes using 1000Base-X and 2500Base-X SFPs on the BananaPi R3
board and also makes it possible to use interface-mode-switching PHYs
operating in either SGMII mode for 10M/100M/1000M or in 2500Base-X for
2500M mode on other boards.
Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
---
drivers/net/ethernet/mediatek/mtk_sgmii.c | 24 +++++------------------
1 file changed, 5 insertions(+), 19 deletions(-)
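The behaviour described in the commit message can be pictured with a small, self-contained sketch. It is not the driver code: `enum iface_mode`, `struct pcs_setup` and `pcs_pick_setup()` are hypothetical names invented for illustration, and gating SGMII auto-negotiation on in-band status is an assumption, since the message only says that SGMII AN is enabled when actually operating in SGMII mode. Note that the ethtool autoneg bit is deliberately not an input here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's phy_interface_t values. */
enum iface_mode { IFACE_SGMII, IFACE_1000BASEX, IFACE_2500BASEX };

/* Hypothetical summary of what the PCS would be programmed to do. */
struct pcs_setup {
	bool sgmii_an;     /* exchange the SGMII 16-bit configuration word */
	bool remote_fault; /* 802.3z remote-fault detection */
};

/*
 * Pick the PCS setup from the interface mode and the in-band-status
 * setting only, as the commit message describes.
 */
struct pcs_setup pcs_pick_setup(enum iface_mode iface, bool inband_status)
{
	struct pcs_setup s = { false, false };

	assert(iface == IFACE_SGMII || iface == IFACE_1000BASEX ||
	       iface == IFACE_2500BASEX);

	if (iface == IFACE_SGMII)
		s.sgmii_an = inband_status;     /* assumption: gated on in-band */
	else
		s.remote_fault = inband_status; /* 1000Base-X / 2500Base-X */

	return s;
}
```

With this shape, a 2500Base-X link never has SGMII auto-negotiation enabled, whatever the user selected with `ethtool -s eth1 autoneg on|off`.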
Comments
> Sent: Tuesday, 7 March 2023 at 16:53
> From: "Daniel Golle" <daniel@makrotopia.org>
>
> After conversion to phylink_pcs the 1000Base-X and 2500Base-X modes
> would work only after `ethtool -s eth1 autoneg off`.
> [...]
> This fixes using 1000Base-X and 2500Base-X SFPs on the BananaPi R3
> board and also makes it possible to use interface-mode-switching PHYs
> operating in either SGMII mode for 10M/100M/1000M or in 2500Base-X for
> 2500M mode on other boards.

Hi,

I have tested parts 1-12 of this series on bananapi-r3 (mt7986) with a
1G fiber SFP (no 2.5G module available yet) on gmac1 and lan4 (mt7531 p5).

Tested-by: Frank Wunderlich <frank-w@public-files.de>

Thanks, Daniel, for working on SFP support :)

regards Frank
On Tue, Mar 07, 2023 at 03:53:58PM +0000, Daniel Golle wrote:
> [...]
> Set correct values to really only enable SGMII auto-negotiation when
> actually operating in SGMII mode. For 1000Base-X and 2500Base-X mode,
> enable remote-fault detection only if in-band-status is enabled.
> [...]
>
> Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs")
> Signed-off-by: Daniel Golle <daniel@makrotopia.org>

NAK.

There are PHYs out there which operate in SGMII mode but do not
exchange the SGMII 16-bit configuration word. The code implemented
here by me was explicitly to allow such a configuration to work,
which is defined as:

	SGMII *without* mode == inband

An example of this is the Broadcom 84881 PHY which can be found on
SFP modules.
On Wed, Mar 08, 2023 at 11:35:40AM +0000, Russell King (Oracle) wrote:
> NAK.
>
> There are PHYs out there which operate in SGMII mode but do not
> exchange the SGMII 16-bit configuration word. The code implemented
> here by me was explicitly to allow such a configuration to work,
> which is defined as:
>
> 	SGMII *without* mode == inband
>
> An example of this is the Broadcom 84881 PHY which can be found on
> SFP modules.

I also have multiple such 1000Base-T SFP modules here (Finisar, AJYA),
and this change doesn't touch the code paths relevant for those. They
are operating in SGMII mode, and they have always been working fine.

What I'm trying to fix here is 1000Base-X and 2500Base-X mode, which
has been broken by introducing ETHTOOL_LINK_MODE_Autoneg_BIT as the
deciding factor for in-band AN here.

Can you explain why ETHTOOL_LINK_MODE_Autoneg_BIT was used there in
the first place? Is my understanding of this bit controlling autoneg on
the *external* interface rather than on the *system-side* interface wrong?
On Wed, Mar 08, 2023 at 12:11:48PM +0000, Daniel Golle wrote:
> What I'm trying to fix here is 1000Base-X and 2500Base-X mode which
> has been broken by introducing ETHTOOL_LINK_MODE_Autoneg_BIT as the
> deciding factor for in-band AN here.

... which is correct.

> Can you explain why ETHTOOL_LINK_MODE_Autoneg_BIT was used there in
> first place? Is my understanding of this bit controlling autoneg on the
> *external* interface rather than on the *system-side* interface wrong?

Think about what 1000BASE-X is for. It's not really for internal links,
it's intended by IEEE 802.3 to be the 1G *media* side protocol for
1000BASE-SX, 1000BASE-LX, 1000BASE-CX etc links.

Therefore, when being used in that case, one may wish to disable
autoneg over the fibre link. Hence, turning off autoneg via ethtool
*should* turn off autoneg over the fibre link. So, using
ETHTOOL_LINK_MODE_Autoneg_BIT to gate 802.3z autonegotiation is the
correct thing to do.

If we have a PHY using 1000BASE-X, then it is at odds with the
primary purpose of this protocol, especially when it comes to AN.
This is why phylink used to refuse to accept PHYs when using 802.3z
mode, but Marek wanted this to work, so relaxed the checks
preventing such a setup working.
On Wed, Mar 08, 2023 at 12:41:43PM +0000, Russell King (Oracle) wrote:
> Think about what 1000BASE-X is for. It's not really for internal links,
> it's intended by IEEE 802.3 to be the 1G *media* side protocol for
> 1000BASE-SX, 1000BASE-LX, 1000BASE-CX etc links.
>
> [...]
>
> If we have a PHY using 1000BASE-X, then it is at odds with the
> primary purpose of this protocol, especially with it comes to AN.
> This is why phylink used to refuse to accept PHYs when using 802.3z
> mode, but Marek wanted this to work, so relaxed the checks
> preventing such a setup working.

Sadly, 2500Base-X is very commonly used to connect 2500Base-T-capable
PHYs or SFP modules. I also have an ATS-branded 1000M/100M/10M copper
SFP module which uses 1000Base-X as its system-side interface,
independently of the speed of the link partner on the TP interface.
None of them work with in-band AN enabled, and a link only comes up
after `ethtool -s eth1 autoneg off` in the current implementation,
while previously they were working just fine.

I understand that there isn't really a good solution for 1000Base-X,
as thanks to you I now understand that SerDes autoneg just
transparently ends up being autoneg on a fiber link.

2500Base-X, however, is hardly used for fiber links, but rather mostly
for 2500Base-T PHYs and SFP modules as well as xPON SFPs. Maybe we could
at least have in-band AN disabled by default for those to get them
working without requiring the user to carry out ethtool settings?
On Wed, Mar 08, 2023 at 12:53:13PM +0000, Daniel Golle wrote:
> 2500Base-X, however, is hardly used for fiber links, but rather mostly
> for 2500Base-T PHYs and SFP module as well as xPON SFPs. Maybe we could
> at least have in-band AN disabled by default for those to get them
> working without requiring the user to carry out ethtool settings?

We could _possibly_ make 2500base-X ignore the autoneg bit, but in
order to do that I would want to make other changes, because it
is getting absolutely stupid to have these decisions being made in
each and every PCS - and have each PCS author implementing different
decision making in their PCS driver.

The problem that gives is it makes phylink maintenance hard,
because it becomes impossible to predict what the effect of any
change is.

It also means that plugging the same SFP module into different
hardware will give different results (maybe it works, maybe it
doesn't.)

So, what I would want to do is to move the decision about whether
the PCS should enable in-band into the phylink core code instead
of these random decisions being made in each PCS.

At that point, we can then make the decision about whether the
ethtool autoneg bit should affect whether the PCS uses inband
depending on whether the PCS is effectively the media facing
entity, or whether there is a PHY attached - and if there is a PHY
attached, ask the PHY whether it will be using in-band or not.

This also would give a way to ensure that all PCS adopt the same
behaviour.

Does that sound a reasonable approach?

Strangely, I already have some patches along those lines in my
net-queue branch. See:

net: phylink: add helpers for decoding mode
net: use phylink_mode_*() helpers
net: phylink: split PCS in-band from inband mode

It's nowhere near finished though, it was just an idea back in
2021 when the problem of getting rid of differing PCS behaviours
was on my mind.
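The idea of making the in-band decision once, in common code, rather than in every PCS driver can be sketched as below. This is only an illustration of the reasoning in the mail above, not the net-queue patches: `struct link_ctx`, `enum phy_inband` and `core_pcs_use_inband()` are hypothetical names, and both the treatment of SGMII without a PHY and the fallback used when the PHY cannot say what it does are assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's phy_interface_t values. */
enum iface_mode { IFACE_SGMII, IFACE_1000BASEX, IFACE_2500BASEX };

/* Hypothetical answer from an attached PHY about in-band signalling. */
enum phy_inband { PHY_INBAND_UNKNOWN, PHY_INBAND_OFF, PHY_INBAND_ON };

struct link_ctx {
	enum iface_mode iface;
	bool mode_inband;     /* link is configured for in-band status */
	bool ethtool_autoneg; /* user's ETHTOOL_LINK_MODE_Autoneg_BIT */
	bool phy_attached;    /* false: the PCS itself faces the media */
	enum phy_inband phy_caps;
};

/* One central decision that every PCS driver would simply obey. */
bool core_pcs_use_inband(const struct link_ctx *c)
{
	assert(c);

	if (!c->mode_inband)
		return false;

	if (!c->phy_attached) {
		/*
		 * PCS is the media-facing entity: for 802.3z modes the
		 * ethtool autoneg bit gates in-band AN. Leaving SGMII
		 * enabled here is an assumption.
		 */
		if (c->iface == IFACE_SGMII)
			return true;
		return c->ethtool_autoneg;
	}

	/* A PHY is attached: the autoneg bit belongs to its media side. */
	switch (c->phy_caps) {
	case PHY_INBAND_ON:
		return true;
	case PHY_INBAND_OFF:
		return false;
	case PHY_INBAND_UNKNOWN:
		/* Assumed fallback: only SGMII is expected to use in-band. */
		return c->iface == IFACE_SGMII;
	}
	return false;
}
```

Because the function lives in one place, the same SFP module would get the same answer on every host, which is the maintenance concern raised above.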
On Wed, Mar 08, 2023 at 01:12:10PM +0000, Russell King (Oracle) wrote:
> So, what I would want to do is to move the decision about whether
> the PCS should enable in-band into the phylink core code instead
> of these random decisions being made in each PCS.
>
> At that point, we can then make the decision about whether the
> ethtool autoneg bit should affect whether the PCS uses inband
> depending on whether the PCS is effectively the media facing
> entity, or whether there is a PHY attached - and if there is a PHY
> attached, ask the PHY whether it will be using in-band or not.
>
> This also would give a way to ensure that all PCS adopt the same
> behaviour.
>
> Does that sound a reasonable approach?
>
> Strangely, I already have some patches along those lines in my
> net-queue branch. See:
>
> net: phylink: add helpers for decoding mode
> net: use phylink_mode_*() helpers
> net: phylink: split PCS in-band from inband mode
>
> It's nowhere near finished though, it was just an idea back in
> 2021 when the problem of getting rid of differing PCS behaviours
> was on my mind.

Having looked at those patches
(http://git.armlinux.org.uk/cgit/linux-arm.git/commit/?h=net-queue&id=a632167d226cf95d92cd887b2f1678e1539833b1)
and seen the way in which they are incomplete, could you sketch here how
you see an actual pcs_validate() implementation making use of the new
"mode" argument?

I'd expect there to be some logic which changes the "mode", if the PCS
validation with the existing one fails... or?
On Wed, Mar 08, 2023 at 03:46:42PM +0200, Vladimir Oltean wrote:
> Having looked at those patches
> (http://git.armlinux.org.uk/cgit/linux-arm.git/commit/?h=net-queue&id=a632167d226cf95d92cd887b2f1678e1539833b1)
> and seen the way in which they are incomplete, could you sketch here how
> do you see an actual pcs_validate() implementation making use of the new
> "mode" argument?
>
> I'd expect there to be some logic which changes the "mode", if the PCS
> validation with the existing one fails... or?

You may notice that I explicitly didn't include the patches adding the
mode argument to the validation path, precisely because that's an
unanswered question.

I haven't done much work in this area because I gave up with trying to
convert mv88e6xxx to phylink_pcs because it just became impossible due
to the history of the driver - and that destroyed my motivation for
further work, because that would mean I'd just accumulate more and more
patches.

You may have noticed that I'm hardly doing any development work on
phylink over the last few months, instead concentrating on problem
non-DSA network drivers.
On Wed, Mar 08, 2023 at 01:12:10PM +0000, Russell King (Oracle) wrote: > On Wed, Mar 08, 2023 at 12:53:13PM +0000, Daniel Golle wrote: > > On Wed, Mar 08, 2023 at 12:41:43PM +0000, Russell King (Oracle) wrote: > > > On Wed, Mar 08, 2023 at 12:11:48PM +0000, Daniel Golle wrote: > > > > On Wed, Mar 08, 2023 at 11:35:40AM +0000, Russell King (Oracle) wrote: > > > > > On Tue, Mar 07, 2023 at 03:53:58PM +0000, Daniel Golle wrote: > > > > > > After conversion to phylink_pcs the 1000Base-X and 2500Base-X modes > > > > > > would work only after `ethtool -s eth1 autoneg off`. > > > > > > As ethtool autoneg and the ETHTOOL_LINK_MODE_Autoneg_BIT is supposed > > > > > > to control auto-negotiation on the external interface it doesn't make > > > > > > much sense to use it to control on-board SGMII auto-negotiation between > > > > > > MAC and PHY. > > > > > > Set correct values to really only enable SGMII auto-negotiation when > > > > > > actually operating in SGMII mode. For 1000Base-X and 2500Base-X mode, > > > > > > enable remote-fault detection only if in-band-status is enabled. > > > > > > This fixes using 1000Base-X and 2500Base-X SFPs on the BananaPi R3 > > > > > > board and also makes it possible to use interface-mode-switching PHYs > > > > > > operating in either SGMII mode for 10M/100M/1000M or in 2500Base-X for > > > > > > 2500M mode on other boards. > > > > > > > > > > > > Fixes: 14a44ab0330d ("net: mtk_eth_soc: partially convert to phylink_pcs") > > > > > > Signed-off-by: Daniel Golle <daniel@makrotopia.org> > > > > > > > > > > NAK. > > > > > > > > > > There are PHYs out there which operate in SGMII mode but do not > > > > > exchange the SGMII 16-bit configuration word. The code implemented > > > > > here by me was explicitly to allow such a configuration to work, > > > > > which is defined as: > > > > > > > > > > SGMII *without* mode == inband > > > > > > > > > > An example of this is the Broadcom 84881 PHY which can be found on > > > > > SFP modules. 
> > > > > > > > I also have multiple such 1000Base-T SFP modules here (finisar, AJYA), > > > > and this change doesn't touch the codepaths relevant for those. They > > > > are operating in SGMII mode, they have always been working fine. > > > > > > > > What I'm trying to fix here is 1000Base-X and 2500Base-X mode which > > > > has been broken by introducing ETHTOOL_LINK_MODE_Autoneg_BIT as the > > > > deciding factor for in-band AN here. > > > > > > ... which is correct. > > > > > > > Can you explain why ETHTOOL_LINK_MODE_Autoneg_BIT was used there in > > > > first place? Is my understanding of this bit controlling autoneg on the > > > > *external* interface rather than on the *system-side* interface wrong? > > > > > > Think about what 1000BASE-X is for. It's not really for internal links, > > > it's intended by IEEE 802.3 to be the 1G *media* side protocol for > > > 1000BASE-SX, 1000BASE-LX, 1000BASE-CX etc links. > > > > > > Therefore, when being used in that case, one may wish to disable > > > autoneg over the fibre link. Hence, turning off autoneg via ethtool > > > *should* turn off autoneg over the fibre link. So, using > > > ETHTOOL_LINK_MODE_Autoneg_BIT to gate 802.3z autonegotiation the > > > correct thing to do. > > > > > > If we have a PHY using 1000BASE-X, then it is at odds with the > > > primary purpose of this protocol, especially with it comes to AN. > > > This is why phylink used to refuse to accept PHYs when using 802.3z > > > mode, but Marek wanted this to work, so relaxed the checks > > > preventing such a setup working. > > > > Sadly 2500Base-X is very commonly used to connect 2500Base-T-capable > > PHYs or SFP modules. I also got an ATS branded 1000M/100M/10M copper > > SFP module which uses 1000Base-X as system-side interface, independently > > of the speed of the link partner on the TP interface. 
> > All of them do not work with inband-AN enabled and a link only comes > > up after `ethtool -s eth1 autoneg off` in the current implementation, > > while previously they were working just fine. > > > > I understand that there isn't really a good solution for 1000Base-X as > > thanks to you I now understand that SerDes autoneg just transparently > > ends up being autoneg on a fiber link. > > > > 2500Base-X, however, is hardly used for fiber links, but rather mostly > > for 2500Base-T PHYs and SFP module as well as xPON SFPs. Maybe we could > > at least have in-band AN disabled by default for those to get them > > working without requiring the user to carry out ethtool settings? > > We could _possibly_ make 2500base-X ignore the autoneg bit, but in > order to do that I would want to make other changes, because this > is getting absolutely stupid to have these decisions being made in > each and every PCS - and have each PCS author implementing different > decision making in their PCS driver. > > The problem that gives is it makes phylink maintenance in hard, > because it becomes impossible to predict what the effect of any > change is. > > It also means that plugging the same SFP module into different > hardware will give different results (maybe it works, maybe it > doesn't.) > > So, what I would want to do is to move the decision about whether > the PCS should enable in-band into the phylink core code instead > of these random decisions being made in each PCS. > > At that point, we can then make the decision about whether the > ethtool autoneg bit should affect whether the PCS uses inband > depending on whether the PCS is effectively the media facing > entity, or whether there is a PHY attached - and if there is a PHY > attached, ask the PHY whether it will be using in-band or not. > > This also would give a way to ensure that all PCS adopt the same > behaviour. > > Does that sound a reasonable approach? In general it sound reasonable. 
We may need more SFP quirk bits to
indicate the presence of a PHY on SFP modules which do not expose that
PHY via i2c-mdio or otherwise let the host know about its presence.

For my TP-LINK TL-SM410U 2500Base-T SFP this unfortunately seems to
be the case, and I assume it's actually like that for most
2500Base-T as well as xPON SFPs... (xPON SFPs are usually managed
via high-level protocols, even Web-UI is common there. They don't
tell you much about them via I2C, I suppose to get them to work in
as many SFP host devices as possible without any software changes).

FYI:
TP-LINK TL-SM410U 2500Base-T module:

sfp EE: 00000000: 03 04 07 00 00 00 00 00 00 40 00 01 1f 00 00 00  .........@......
sfp EE: 00000010: 00 00 00 00 54 50 2d 4c 49 4e 4b 20 20 20 20 20  ....TP-LINK
sfp EE: 00000020: 20 20 20 20 00 30 b5 c2 54 4c 2d 53 4d 34 31 30      .0..TL-SM410
sfp EE: 00000030: 55 20 20 20 20 20 20 20 32 2e 30 20 00 00 00 1b  U       2.0 ....
sfp EE: 00000040: 00 08 01 00 80 ff ff ff 40 3d f0 0d c0 ff ff ff  ........@=......
sfp EE: 00000050: c8 39 7a 08 c0 ff ff ff 50 3d f0 0d c0 ff ff ff  .9z.....P=......
sfp sfp2: module TP-LINK TL-SM410U rev 2.0 sn 12260M4001782 dc 220622

And this is the ATS SFP-GE-T 10/100/1000M copper module doing
rate-adaptation to 1000Base-X:

sfp sfp1: EEPROM extended structure checksum failure: 0xb0 != 0xaf
sfp EE: 00000000: 03 04 07 00 00 00 02 12 00 01 01 01 0c 00 03 00  ................
sfp EE: 00000010: 00 00 00 00 4f 45 4d 20 20 20 20 20 20 20 20 20  ....OEM
sfp EE: 00000020: 20 20 20 20 00 00 90 65 53 46 50 2d 47 45 2d 54      ...eSFP-GE-T
sfp EE: 00000030: 20 20 20 20 00 00 00 00 43 20 20 20 00 00 00 f0      ....C   ....
sfp EE: 00000040: 00 12 00 00 32 31 30 37 31 30 41 30 30 31 32 37  ....210710A00127
sfp EE: 00000050: 33 39 00 00 32 31 30 37 31 30 20 20 60 00 01 af  39..210710  `...
sfp sfp1: module OEM SFP-GE-T rev C sn dc

That one already needs quirks to even work at all as TX-FAULT is not
reported properly by the module, see

https://github.com/dangowrt/linux/commit/2c694bd494583f08858fabca97cfdc79de8ba089

>
> Strangely, I already have some patches along those lines in my
> net-queue branch. See:
>
> net: phylink: add helpers for decoding mode
> net: use phylink_mode_*() helpers
> net: phylink: split PCS in-band from inband mode
>
> It's nowhere near finished though, it was just an idea back in
> 2021 when the problem of getting rid of differing PCS behaviours
> was on my mind.

I'll take a look and let you know.

For this series, I will post v13 without the two patches fixing
????Base-X modes tomorrow, waiting for other comments on the current
post of the series up to then.
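The checksum failure that the kernel reports for the ATS module can be reproduced from the dump itself. The following is a minimal sketch, not the kernel's code: it assumes the SFF-8472 layout in which CC_BASE (byte 63) is the low 8 bits of the sum of bytes 0-62 and CC_EXT (byte 95) is the low 8 bits of the sum of bytes 64-94. The function name `sfp_checksums` is made up for illustration.

```python
# First 96 bytes of the ATS SFP-GE-T EEPROM, copied from the "sfp EE:" dump
# posted in this thread.
ATS_SFP_GE_T = bytes.fromhex(
    "03 04 07 00 00 00 02 12 00 01 01 01 0c 00 03 00 "
    "00 00 00 00 4f 45 4d 20 20 20 20 20 20 20 20 20 "
    "20 20 20 20 00 00 90 65 53 46 50 2d 47 45 2d 54 "
    "20 20 20 20 00 00 00 00 43 20 20 20 00 00 00 f0 "
    "00 12 00 00 32 31 30 37 31 30 41 30 30 31 32 37 "
    "33 39 00 00 32 31 30 37 31 30 20 20 60 00 01 af"
)


def sfp_checksums(eeprom):
    """Return {name: (computed, stored)} for the base and extended ID blocks.

    Assumed layout (SFF-8472, address A0h):
      CC_BASE at byte 63 covers bytes 0..62
      CC_EXT  at byte 95 covers bytes 64..94
    """
    if len(eeprom) < 96:
        raise ValueError("need at least the first 96 EEPROM bytes")
    return {
        "cc_base": (sum(eeprom[0:63]) & 0xFF, eeprom[63]),
        "cc_ext": (sum(eeprom[64:95]) & 0xFF, eeprom[95]),
    }


if __name__ == "__main__":
    for name, (computed, stored) in sfp_checksums(ATS_SFP_GE_T).items():
        status = "ok" if computed == stored else "MISMATCH"
        print(f"{name}: computed 0x{computed:02x}, stored 0x{stored:02x} ({status})")
```

For this dump the base checksum matches, while the extended block sums to 0xb0 against a stored 0xaf, the same pair of values as in the kernel message.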
On Wed, Mar 08, 2023 at 02:32:40PM +0000, Daniel Golle wrote:
> In general it sounds reasonable. We may need more SFP quirk bits to
> indicate presence of a PHY on SFP modules which do not expose that
> PHY via i2c-mdio or otherwise let the host know about its presence.

That's a whole load of fun - some modules where the PHY is inaccessible
will be using 1000base-X, others will be using SGMII. So yes, it's
likely that we may need quirks for these. We don't have quirks yet
because you're the first to suggest there's a problem.

> For my TP-LINK TL-SM410U 2500Base-T SFP this unfortunately seems to
> be the case, and I assume it's actually like that for most
> 2500Base-T as well as xPON SFPs... (xPON SFPs are usually managed
> via high-level protocols, even Web-UI is common there. They don't
> tell you much about them via I2C, I suppose to get them to work in
> as many SFP host devices as possible without any software changes).

xPON SFPs are a whole different ball game. For some, they auto-detect
while booting and try 2500base-X or 1000base-X to see which will sync
and if not they try the other. Other xPON SFPs run their host interface
at a speed determined by the configuration set by the remote end. Other
xPON SFPs may do something entirely different.

In many cases, their EEPROM is full of errors - such as advertising
that they're 1.2 or 1.3 Gbd while operating in 2500base-X mode.

They do weird stuff with their status pins as well, for example, some
use the RX_LOS pin as a uart - which is a problem if it's e.g. tied
from the cage to a switch that uses the pin to gate the link-up
indication without any software control of that!

With xPON SFPs, it's just a total minefield, with lots of SFF MSA
violations all over the place. They're essentially a law to themselves
(this is exactly why we have the quirks infrastructure.)

> FYI:
> TP-LINK TL-SM410U 2500Base-T module:
>
> sfp EE: 00000000: 03 04 07 00 00 00 00 00 00 40 00 01 1f 00 00 00  .........@......
> sfp EE: 00000010: 00 00 00 00 54 50 2d 4c 49 4e 4b 20 20 20 20 20  ....TP-LINK
> sfp EE: 00000020: 20 20 20 20 00 30 b5 c2 54 4c 2d 53 4d 34 31 30      .0..TL-SM410
> sfp EE: 00000030: 55 20 20 20 20 20 20 20 32 2e 30 20 00 00 00 1b  U       2.0 ....
> sfp EE: 00000040: 00 08 01 00 80 ff ff ff 40 3d f0 0d c0 ff ff ff  ........@=......
> sfp EE: 00000050: c8 39 7a 08 c0 ff ff ff 50 3d f0 0d c0 ff ff ff  .9z.....P=......
> sfp sfp2: module TP-LINK TL-SM410U rev 2.0 sn 12260M4001782 dc 220622

I'm guessing this is a module with a checksum problem...

> And this is the ATS SFP-GE-T 10/100/1000M copper module doing
> rate-adaptation to 1000Base-X:
>
> sfp sfp1: EEPROM extended structure checksum failure: 0xb0 != 0xaf

Given how close that is, it looks like they used the wrong algorithm.

> sfp EE: 00000000: 03 04 07 00 00 00 02 12 00 01 01 01 0c 00 03 00  ................
> sfp EE: 00000010: 00 00 00 00 4f 45 4d 20 20 20 20 20 20 20 20 20  ....OEM
> sfp EE: 00000020: 20 20 20 20 00 00 90 65 53 46 50 2d 47 45 2d 54      ...eSFP-GE-T
> sfp EE: 00000030: 20 20 20 20 00 00 00 00 43 20 20 20 00 00 00 f0      ....C   ....
> sfp EE: 00000040: 00 12 00 00 32 31 30 37 31 30 41 30 30 31 32 37  ....210710A00127
> sfp EE: 00000050: 33 39 00 00 32 31 30 37 31 30 20 20 60 00 01 af  39..210710  `...
> sfp sfp1: module OEM SFP-GE-T rev C sn dc

Welcome to the wonderful world of horribly broken SFPs.

Do we know what form of rate adaptation this module needs on the
transmit path? Does it require the host to pace itself to the media
speed (which I suspect will be unreadable if the PHY isn't accessible)
or will it send pause frames?

It would be nice to add these to my database - please send me the
output of ethtool -m $iface raw on > foo.bin for each module.

> That one already needs quirks to even work at all as TX-FAULT is not
> reported properly by the module, see
>
> https://github.com/dangowrt/linux/commit/2c694bd494583f08858fabca97cfdc79de8ba089

I'm guessing that's not on a kernel version that has:

73472c830eae net: sfp: add support for HALNy GPON SFP
5029be761161 net: sfp: move Huawei MA5671A fixup
275416754e9a net: sfp: move Alcatel Lucent 3FE46541AA fixup
23571c7b9643 net: sfp: move quirk handling into sfp.c
8475c4b70b04 net: sfp: re-implement soft state polling setup

which reworks how we deal with the soft/hard state signals.

I think the problem space is growing, and I fear that if we try to
address all these issues in one go, we're going to end up with way
too much to deal with at once (which means poor reviews etc.)

Can we try to concentrate on fixing one problem at a time, rather
than throwing a whole load of problems into the mix?

Thanks.
On Wed, Mar 08, 2023 at 03:01:07PM +0000, Russell King (Oracle) wrote:
> On Wed, Mar 08, 2023 at 02:32:40PM +0000, Daniel Golle wrote:
> > In general it sounds reasonable. We may need more SFP quirk bits to
> > indicate presence of a PHY on SFP modules which do not expose that
> > PHY via i2c-mdio or otherwise let the host know about its presence.
>
> That's a whole load of fun - some modules where the PHY is inaccessible
> will be using 1000base-X, others will be using SGMII. So yes, it's
> likely that we may need quirks for these. We don't have quirks yet
> because you're the first to suggest there's a problem.
>
> > For my TP-LINK TL-SM410U 2500Base-T SFP this unfortunately seems to
> > be the case, and I assume it's actually like that for most
> > 2500Base-T as well as xPON SFPs... (xPON SFPs are usually managed
> > via high-level protocols, even Web-UI is common there. They don't
> > tell you much about them via I2C, I suppose to get them to work in
> > as many SFP host devices as possible without any software changes).
>
> xPON SFPs are a whole different ball game. For some, they auto-detect
> while booting and try 2500base-X or 1000base-X to see which will sync
> and if not they try the other. Other xPON SFPs run their host interface
> at a speed determined by the configuration set by the remote end. Other
> xPON SFPs may do something entirely different.
>
> In many cases, their EEPROM is full of errors - such as advertising
> that they're 1.2 or 1.3 Gbd while operating in 2500base-X mode.
>
> They do weird stuff with their status pins as well, for example, some
> use the RX_LOS pin as a uart - which is a problem if it's e.g. tied
> from the cage to a switch that uses the pin to gate the link-up
> indication without any software control of that!
>
> With xPON SFPs, it's just a total minefield, with lots of SFF MSA
> violations all over the place. They're essentially a law to themselves
> (this is exactly why we have the quirks infrastructure.)
>
> > FYI:
> > TP-LINK TL-SM410U 2500Base-T module:
> >
> > sfp EE: 00000000: 03 04 07 00 00 00 00 00 00 40 00 01 1f 00 00 00  .........@......
> > sfp EE: 00000010: 00 00 00 00 54 50 2d 4c 49 4e 4b 20 20 20 20 20  ....TP-LINK
> > sfp EE: 00000020: 20 20 20 20 00 30 b5 c2 54 4c 2d 53 4d 34 31 30      .0..TL-SM410
> > sfp EE: 00000030: 55 20 20 20 20 20 20 20 32 2e 30 20 00 00 00 1b  U       2.0 ....
> > sfp EE: 00000040: 00 08 01 00 80 ff ff ff 40 3d f0 0d c0 ff ff ff  ........@=......
> > sfp EE: 00000050: c8 39 7a 08 c0 ff ff ff 50 3d f0 0d c0 ff ff ff  .9z.....P=......
> > sfp sfp2: module TP-LINK TL-SM410U rev 2.0 sn 12260M4001782 dc 220622
>
> I'm guessing this is a module with a checksum problem...

No, the checksum of the TL-SM410U is correct. I have patched the kernel
to always dump the EEPROM, so I can share it with you.

>
> > And this is the ATS SFP-GE-T 10/100/1000M copper module doing
> > rate-adaptation to 1000Base-X:
> >
> > sfp sfp1: EEPROM extended structure checksum failure: 0xb0 != 0xaf
>
> Given how close that is, it looks like they used the wrong algorithm.
>
> > sfp EE: 00000000: 03 04 07 00 00 00 02 12 00 01 01 01 0c 00 03 00  ................
> > sfp EE: 00000010: 00 00 00 00 4f 45 4d 20 20 20 20 20 20 20 20 20  ....OEM
> > sfp EE: 00000020: 20 20 20 20 00 00 90 65 53 46 50 2d 47 45 2d 54      ...eSFP-GE-T
> > sfp EE: 00000030: 20 20 20 20 00 00 00 00 43 20 20 20 00 00 00 f0      ....C   ....
> > sfp EE: 00000040: 00 12 00 00 32 31 30 37 31 30 41 30 30 31 32 37  ....210710A00127
> > sfp EE: 00000050: 33 39 00 00 32 31 30 37 31 30 20 20 60 00 01 af  39..210710  `...
> > sfp sfp1: module OEM SFP-GE-T rev C sn dc
>
> Welcome to the wonderful world of horribly broken SFPs.
>
> Do we know what form of rate adaptation this module needs on the
> transmit path? Does it require the host to pace itself to the media
> speed (which I suspect will be unreadable if the PHY isn't accessible)
> or will it send pause frames?
>
> It would be nice to add these to my database - please send me the
> output of ethtool -m $iface raw on > foo.bin for each module.
>
> > That one already needs quirks to even work at all as TX-FAULT is not
> > reported properly by the module, see
> >
> > https://github.com/dangowrt/linux/commit/2c694bd494583f08858fabca97cfdc79de8ba089
>
> I'm guessing that's not on a kernel version that has:
>
> 73472c830eae net: sfp: add support for HALNy GPON SFP
> 5029be761161 net: sfp: move Huawei MA5671A fixup
> 275416754e9a net: sfp: move Alcatel Lucent 3FE46541AA fixup
> 23571c7b9643 net: sfp: move quirk handling into sfp.c
> 8475c4b70b04 net: sfp: re-implement soft state polling setup
>
> which reworks how we deal with the soft/hard state signals.
>
> I think the problem space is growing, and I fear that if we try to
> address all these issues in one go, we're going to end up with way
> too much to deal with at once (which means poor reviews etc.)
>
> Can we try to concentrate on fixing one problem at a time, rather
> than throwing a whole load of problems into the mix?

Ok. I'll just repost tomorrow without the ????Base-X AN related patches.

>
> Thanks.
>
> --
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!
On Wed, Mar 08, 2023 at 03:08:55PM +0000, Daniel Golle wrote:
> On Wed, Mar 08, 2023 at 03:01:07PM +0000, Russell King (Oracle) wrote:
> > > FYI:
> > > TP-LINK TL-SM410U 2500Base-T module:
> > >
> > > sfp EE: 00000000: 03 04 07 00 00 00 00 00 00 40 00 01 1f 00 00 00  .........@......
> > > sfp EE: 00000010: 00 00 00 00 54 50 2d 4c 49 4e 4b 20 20 20 20 20  ....TP-LINK
> > > sfp EE: 00000020: 20 20 20 20 00 30 b5 c2 54 4c 2d 53 4d 34 31 30      .0..TL-SM410
> > > sfp EE: 00000030: 55 20 20 20 20 20 20 20 32 2e 30 20 00 00 00 1b  U       2.0 ....
> > > sfp EE: 00000040: 00 08 01 00 80 ff ff ff 40 3d f0 0d c0 ff ff ff  ........@=......
> > > sfp EE: 00000050: c8 39 7a 08 c0 ff ff ff 50 3d f0 0d c0 ff ff ff  .9z.....P=......
> > > sfp sfp2: module TP-LINK TL-SM410U rev 2.0 sn 12260M4001782 dc 220622
> >
> > I'm guessing this is a module with a checksum problem...
>
> No, the checksum of the TL-SM410U is correct. I have patched the kernel
> to always dump the EEPROM, so I can share it with you.

Bear in mind that I haven't spent time with the spec to manually decode
the above, when I have tools to do that for me when given the EEPROM in
the correct form, which is:

> > It would be nice to add these to my database - please send me the
> > output of ethtool -m $iface raw on > foo.bin for each module.

so if you can do that for me, then I can see whether it's likely that
the patches that are already in mainline will do anything to solve
the workaround you've had to add for the hw signals.
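Where only the kernel log is at hand, the "sfp EE:" rows can be turned back into a binary image for offline decoding. This is a rough sketch under stated assumptions: each row is taken to contain an 8-digit hex offset, a colon, and up to 16 space-separated hex bytes followed by the ASCII column, as in the dumps quoted in this thread. The function name `kernel_dump_to_bytes` is hypothetical. The raw output of `ethtool -m $iface raw on` remains the better source, since the log rows shown here stop after 96 bytes.

```python
import re

# An 8-digit hex offset, a colon, then between 1 and 16 " xx" byte groups.
# Capping at 16 groups keeps the ASCII column out of the match.
ROW = re.compile(r"\b([0-9a-f]{8}):((?: [0-9a-f]{2}){1,16})", re.IGNORECASE)


def kernel_dump_to_bytes(log_text):
    """Rebuild the EEPROM image from 'sfp EE:' rows found in log_text."""
    image = bytearray()
    for line in log_text.splitlines():
        match = ROW.search(line)
        if match is None:
            continue  # not a hex dump row (e.g. the "module ... rev ..." line)
        offset = int(match.group(1), 16)
        if offset != len(image):
            raise ValueError(f"gap or reordering at offset 0x{offset:08x}")
        image += bytes.fromhex(match.group(2).strip())
    return bytes(image)


if __name__ == "__main__":
    sample = (
        "sfp EE: 00000000: 03 04 07 00 00 00 00 00 00 40 00 01 1f 00 00 00 .........@......\n"
        "sfp EE: 00000010: 00 00 00 00 54 50 2d 4c 49 4e 4b 20 20 20 20 20 ....TP-LINK\n"
    )
    image = kernel_dump_to_bytes(sample)
    print(len(image), image[20:27])
```

The result can be written out with `open("foo.bin", "wb").write(image)` and fed to whatever decoder is used for the raw ethtool output.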
Hi

> Sent: Wednesday, 08 March 2023 at 16:24
> From: "Russell King (Oracle)" <linux@armlinux.org.uk>

> > > It would be nice to add these to my database - please send me the
> > > output of ethtool -m $iface raw on > foo.bin for each module.
>
> so if you can do that for me, then I can see whether it's likely that
> the patches that are already in mainline will do anything to solve
> the workaround you've had to add for the hw signals.

I got the 2.5G copper SFPs and tried them... they work well with v12 (including this patch), but not with v13...

I dumped the EEPROM as you described, for your database:

$ hexdump -C 2g5_sfp.bin
00000000 03 04 07 00 01 00 00 00 00 02 00 05 19 00 00 00 |................|
00000010 1e 14 00 00 4f 45 4d 20 20 20 20 20 20 20 20 20 |....OEM         |
00000020 20 20 20 20 00 00 00 00 53 46 50 2d 32 2e 35 47 |    ....SFP-2.5G|
00000030 2d 54 20 20 20 20 20 20 31 2e 30 20 03 52 00 19 |-T      1.0 .R..|
00000040 00 1a 00 00 53 4b 32 33 30 31 31 31 30 30 30 38 |....SK2301110008|
00000050 20 20 20 20 32 33 30 31 31 30 20 20 68 f0 01 e8 |    230110  h...|
00000060 00 00 11 37 4f 7a dc ff 3d c0 6e 74 9b 7c 06 ca |...7Oz..=.nt.|..|
00000070 e1 d0 f9 00 00 00 00 00 00 00 00 00 08 f0 fc 64 |...............d|
00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00000100 5f 00 ce 00 5a 00 d3 00 8c a0 75 30 88 b8 79 18 |_...Z.....u0..y.|
00000110 1d 4c 01 f4 19 64 03 e8 4d f0 06 30 3d e8 06 f2 |.L...d..M..0=...|
00000120 2b d4 00 c7 27 10 00 df 00 00 00 00 00 00 00 00 |+...'...........|
00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000140 00 00 00 00 3f 80 00 00 00 00 00 00 01 00 00 00 |....?...........|
00000150 01 00 00 00 01 00 00 00 01 00 00 00 00 00 00 f1 |................|
00000160 29 1a 82 41 0b b8 13 88 0f a0 ff ff ff ff 80 ff |)..A............|
00000170 00 00 ff ff 00 00 ff ff 04 ff ff ff ff ff ff 00 |................|
00000180 43 4e 53 38 54 55 54 41 41 43 33 30 2d 31 34 31 |CNS8TUTAAC30-141|
00000190 30 2d 30 34 56 30 34 20 49 fb 46 00 00 00 00 29 |0-04V04 I.F....)|
000001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 aa aa |................|
000001c0 47 4c 43 2d 54 20 20 20 20 20 20 20 20 20 20 20 |GLC-T           |
000001d0 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 97 |               .|
000001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
000001f0 00 00 00 00 00 00 00 00 00 40 00 40 00 00 00 00 |.........@.@....|

How can we add a quirk to support this?

Some more information:

root@bpi-r3:~# ethtool eth1
Settings for eth1:
	Supported ports: [ FIBRE ]
	Supported link modes: 2500baseX/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: Yes
	Supported FEC modes: Not reported
	Advertised link modes: 2500baseX/Full
	Advertised pause frame use: Symmetric Receive-only
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Speed: 2500Mb/s
	Duplex: Full
	Auto-negotiation: on
	Port: FIBRE
	PHYAD: 0
	Transceiver: internal
	Current message level: 0x000000ff (255)
			       drv probe link timer ifdown ifup rx_err tx_err
	Link detected: yes
root@bpi-r3:~# ethtool -m eth1
	Identifier : 0x03 (SFP)
	Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID)
	Connector : 0x07 (LC)
	Transceiver codes : 0x00 0x01 0x00 0x00 0x00 0x00 0x02 0x00 0x00
	Transceiver type : SONET: OC-48, short reach
	Encoding : 0x05 (SONET Scrambled)
	BR, Nominal : 2500MBd
	Rate identifier : 0x00 (unspecified)
	Length (SMF,km) : 0km
	Length (SMF) : 0m
	Length (50um) : 300m
	Length (62.5um) : 200m
	Length (Copper) : 0m
	Length (OM3) : 0m
	Laser wavelength : 850nm
	Vendor name : OEM
	Vendor OUI : 00:00:00
	Vendor PN : SFP-2.5G-T
	Vendor rev : 1.0
	Option values : 0x00 0x1a
	Option : RX_LOS implemented
	Option : TX_FAULT implemented
	Option : TX_DISABLE implemented
	BR margin, max : 0%
	BR margin, min : 0%
	Vendor SN : SK2301110008
	Date code : 230110
	Optical diagnostics support : Yes
	Laser bias current : 6.000 mA
	Laser output power : 0.5000 mW / -3.01 dBm
	Receiver signal average optical power : 0.4000 mW / -3.98 dBm
	Module temperature : 28.21 degrees C / 82.78 degrees F
	Module voltage : 3.3067 V
	Alarm/warning flags implemented : Yes
	Laser bias current high alarm : Off
	Laser bias current low alarm : Off
	Laser bias current high warning : Off
	Laser bias current low warning : Off
	Laser output power high alarm : Off
	Laser output power low alarm : Off
	Laser output power high warning : Off
	Laser output power low warning : Off
	Module temperature high alarm : Off
	Module temperature low alarm : Off
	Module temperature high warning : Off
	Module temperature low warning : Off
	Module voltage high alarm : Off
	Module voltage low alarm : Off
	Module voltage high warning : Off
	Module voltage low warning : Off
	Laser rx power high alarm : Off
	Laser rx power low alarm : Off
	Laser rx power high warning : Off
	Laser rx power low warning : Off
	Laser bias current high alarm threshold : 15.000 mA
	Laser bias current low alarm threshold : 1.000 mA
	Laser bias current high warning threshold : 13.000 mA
	Laser bias current low warning threshold : 2.000 mA
	Laser output power high alarm threshold : 1.9952 mW / 3.00 dBm
	Laser output power low alarm threshold : 0.1584 mW / -8.00 dBm
	Laser output power high warning threshold : 1.5848 mW / 2.00 dBm
	Laser output power low warning threshold : 0.1778 mW / -7.50 dBm
	Module temperature high alarm threshold : 95.00 degrees C / 203.00 degrees F
	Module temperature low alarm threshold : -50.00 degrees C / -58.00 degrees F
	Module temperature high warning threshold : 90.00 degrees C / 194.00 degrees F
	Module temperature low warning threshold : -45.00 degrees C / -49.00 degrees F
	Module voltage high alarm threshold : 3.6000 V
	Module voltage low alarm threshold : 3.0000 V
	Module voltage high warning threshold : 3.5000 V
	Module voltage low warning threshold : 3.1000 V
	Laser rx power high alarm threshold : 1.1220 mW / 0.50 dBm
	Laser rx power low alarm threshold : 0.0199 mW / -17.01 dBm
	Laser rx power high warning threshold : 1.0000 mW / 0.00 dBm
	Laser rx power low warning threshold : 0.0223 mW / -16.52 dBm

I guess this SFP has a PHY, as it can operate in 100/1000/2500 mode as described on the module, but I have only tested it in 2500base-T mode.

regards Frank
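The identification fields that ethtool prints for this module come from fixed offsets in the first EEPROM page. The following is a minimal sketch, assuming the SFF-8472 A0h layout noted in the comments and using the first 96 bytes of the hexdump above. The function name `decode_base_id` and the dictionary keys are hypothetical.

```python
# First 96 bytes of the 2.5G copper module's EEPROM, copied from the
# hexdump posted above.
SFP_2G5_T = bytes.fromhex(
    "03 04 07 00 01 00 00 00 00 02 00 05 19 00 00 00 "
    "1e 14 00 00 4f 45 4d 20 20 20 20 20 20 20 20 20 "
    "20 20 20 20 00 00 00 00 53 46 50 2d 32 2e 35 47 "
    "2d 54 20 20 20 20 20 20 31 2e 30 20 03 52 00 19 "
    "00 1a 00 00 53 4b 32 33 30 31 31 31 30 30 30 38 "
    "20 20 20 20 32 33 30 31 31 30 20 20 68 f0 01 e8"
)


def _text(field):
    """Decode a space-padded ASCII field."""
    return field.decode("ascii", errors="replace").strip()


def decode_base_id(eeprom):
    """Decode a few identification fields (offsets assumed per SFF-8472)."""
    if len(eeprom) < 96:
        raise ValueError("need at least the first 96 EEPROM bytes")
    return {
        "identifier": eeprom[0],                 # 0x03 = SFP
        "connector": eeprom[2],                  # 0x07 = LC
        "br_nominal_mbd": eeprom[12] * 100,      # units of 100 MBd
        "vendor_name": _text(eeprom[20:36]),
        "vendor_pn": _text(eeprom[40:56]),
        "vendor_rev": _text(eeprom[56:60]),
        "wavelength_nm": int.from_bytes(eeprom[60:62], "big"),
        "vendor_sn": _text(eeprom[68:84]),
        "date_code": _text(eeprom[84:92]),
    }


if __name__ == "__main__":
    for key, value in decode_base_id(SFP_2G5_T).items():
        print(f"{key:16} {value}")
```

The decoded values agree with the ethtool output: the module reports an LC connector and an 850 nm laser wavelength even though its part number says 2.5G-T, so these fields alone do not tell the host that a copper PHY sits behind the 2500base-X interface.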
On Sat, Mar 11, 2023 at 01:05:37PM +0100, Frank Wunderlich wrote: > Hi > > > Gesendet: Mittwoch, 08. März 2023 um 16:24 Uhr > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk> > > > > It would be nice to add these to my database - please send me the > > > > output of ethtool -m $iface raw on > foo.bin for each module. > > > > so if you can do that for me, then I can see whether it's likely that > > the patches that are already in mainline will do anything to solve > > the workaround you've had to add for the hw signals. > > i got the 2.5G copper sfps, and tried them...they work well with the v12 (including this patch), but not in v13... > > i dumped the eeprom like you mention for your database: > > $ hexdump -C 2g5_sfp.bin > 00000000 03 04 07 00 01 00 00 00 00 02 00 05 19 00 00 00 |................| > 00000010 1e 14 00 00 4f 45 4d 20 20 20 20 20 20 20 20 20 |....OEM | > 00000020 20 20 20 20 00 00 00 00 53 46 50 2d 32 2e 35 47 | ....SFP-2.5G| > 00000030 2d 54 20 20 20 20 20 20 31 2e 30 20 03 52 00 19 |-T 1.0 .R..| > 00000040 00 1a 00 00 53 4b 32 33 30 31 31 31 30 30 30 38 |....SK2301110008| > 00000050 20 20 20 20 32 33 30 31 31 30 20 20 68 f0 01 e8 | 230110 h...| > 00000060 00 00 11 37 4f 7a dc ff 3d c0 6e 74 9b 7c 06 ca |...7Oz..=.nt.|..| > 00000070 e1 d0 f9 00 00 00 00 00 00 00 00 00 08 f0 fc 64 |...............d| > 00000080 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| > * > 00000100 5f 00 ce 00 5a 00 d3 00 8c a0 75 30 88 b8 79 18 |_...Z.....u0..y.| > 00000110 1d 4c 01 f4 19 64 03 e8 4d f0 06 30 3d e8 06 f2 |.L...d..M..0=...| > 00000120 2b d4 00 c7 27 10 00 df 00 00 00 00 00 00 00 00 |+...'...........| > 00000130 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| > 00000140 00 00 00 00 3f 80 00 00 00 00 00 00 01 00 00 00 |....?...........| > 00000150 01 00 00 00 01 00 00 00 01 00 00 00 00 00 00 f1 |................| > 00000160 29 1a 82 41 0b b8 13 88 0f a0 ff ff ff ff 80 ff |)..A............| > 00000170 00 00 ff ff 00 00 
ff ff 04 ff ff ff ff ff ff 00 |................| > 00000180 43 4e 53 38 54 55 54 41 41 43 33 30 2d 31 34 31 |CNS8TUTAAC30-141| > 00000190 30 2d 30 34 56 30 34 20 49 fb 46 00 00 00 00 29 |0-04V04 I.F....)| > 000001a0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| > 000001b0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 aa aa |................| > 000001c0 47 4c 43 2d 54 20 20 20 20 20 20 20 20 20 20 20 |GLC-T | > 000001d0 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 97 | .| > 000001e0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................| > 000001f0 00 00 00 00 00 00 00 00 00 40 00 40 00 00 00 00 |.........@.@....| > > how can we add a quirk to support this? Why does it need a quirk? > > some more information: > > root@bpi-r3:~# ethtool eth1 > Settings for eth1: > Supported ports: [ FIBRE ] > Supported link modes: 2500baseX/Full > Supported pause frame use: Symmetric Receive-only > Supports auto-negotiation: Yes > Supported FEC modes: Not reported > Advertised link modes: 2500baseX/Full > Advertised pause frame use: Symmetric Receive-only > Advertised auto-negotiation: Yes > Advertised FEC modes: Not reported > Speed: 2500Mb/s > Duplex: Full > Auto-negotiation: on > Port: FIBRE > PHYAD: 0 > Transceiver: internal > Current message level: 0x000000ff (255) > drv probe link timer ifdown ifup rx_err tx_err > Link detected: yes > root@bpi-r3:~# ethtool -m eth1 > Identifier : 0x03 (SFP) > Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID) > Connector : 0x07 (LC) > Transceiver codes : 0x00 0x01 0x00 0x00 0x00 0x00 0x02 0x00 0x00 > Transceiver type : SONET: OC-48, short reach > Encoding : 0x05 (SONET Scrambled) > BR, Nominal : 2500MBd > Rate identifier : 0x00 (unspecified) > Length (SMF,km) : 0km > Length (SMF) : 0m > Length (50um) : 300m > Length (62.5um) : 200m > Length (Copper) : 0m > Length (OM3) : 0m > Laser wavelength : 850nm > Vendor name : OEM > Vendor OUI : 00:00:00 > Vendor PN : SFP-2.5G-T > Vendor rev : 1.0 > Option 
values : 0x00 0x1a > Option : RX_LOS implemented > Option : TX_FAULT implemented > Option : TX_DISABLE implemented > BR margin, max : 0% > BR margin, min : 0% > Vendor SN : SK2301110008 > Date code : 230110 > Optical diagnostics support : Yes > Laser bias current : 6.000 mA > Laser output power : 0.5000 mW / -3.01 dBm > Receiver signal average optical power : 0.4000 mW / -3.98 dBm > Module temperature : 28.21 degrees C / 82.78 degrees F > Module voltage : 3.3067 V > Alarm/warning flags implemented : Yes > Laser bias current high alarm : Off > Laser bias current low alarm : Off > Laser bias current high warning : Off > Laser bias current low warning : Off > Laser output power high alarm : Off > Laser output power low alarm : Off > Laser output power high warning : Off > Laser output power low warning : Off > Module temperature high alarm : Off > Module temperature low alarm : Off > Module temperature high warning : Off > Module temperature low warning : Off > Module voltage high alarm : Off > Module voltage low alarm : Off > Module voltage high warning : Off > Module voltage low warning : Off > Laser rx power high alarm : Off > Laser rx power low alarm : Off > Laser rx power high warning : Off > Laser rx power low warning : Off > Laser bias current high alarm threshold : 15.000 mA > Laser bias current low alarm threshold : 1.000 mA > Laser bias current high warning threshold : 13.000 mA > Laser bias current low warning threshold : 2.000 mA > Laser output power high alarm threshold : 1.9952 mW / 3.00 dBm > Laser output power low alarm threshold : 0.1584 mW / -8.00 dBm > Laser output power high warning threshold : 1.5848 mW / 2.00 dBm > Laser output power low warning threshold : 0.1778 mW / -7.50 dBm > Module temperature high alarm threshold : 95.00 degrees C / 203.00 degrees F > Module temperature low alarm threshold : -50.00 degrees C / -58.00 degrees F > Module temperature high warning threshold : 90.00 degrees C / 194.00 degrees F > Module temperature low warning 
threshold : -45.00 degrees C / -49.00 degrees F > Module voltage high alarm threshold : 3.6000 V > Module voltage low alarm threshold : 3.0000 V > Module voltage high warning threshold : 3.5000 V > Module voltage low warning threshold : 3.1000 V > Laser rx power high alarm threshold : 1.1220 mW / 0.50 dBm > Laser rx power low alarm threshold : 0.0199 mW / -17.01 dBm > Laser rx power high warning threshold : 1.0000 mW / 0.00 dBm > Laser rx power low warning threshold : 0.0223 mW / -16.52 dBm > > i guess this sfp have a phy as it can operate in 100/1000/2500 mode like described on the module. It would help to know the kernel messages. Thanks.
On 11 March 2023 21:00:20 CET, "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
>On Sat, Mar 11, 2023 at 01:05:37PM +0100, Frank Wunderlich wrote:

>> I got the 2.5G copper SFPs and tried them... they work well with v12 (including this patch), but not with v13...

>> How can we add a quirk to support this?
>
>Why does it need a quirk?

To disable in-band mode for this 2.5G copper
SFP. But I have not found a way to set a flag which I
can grab in phylink.

The interface, IMHO, can only hold one value (the speed mode, which is correctly set to 2500baseX), and the modes hold the ethtool options, which do not seem to be accessible from phylink.c.

>>
>> Some more information:
>>
>> root@bpi-r3:~# ethtool eth1
>> Settings for eth1:
>> Supported ports: [ FIBRE ]
>> Supported link modes: 2500baseX/Full
>> Supported pause frame use: Symmetric Receive-only
>> Supports auto-negotiation: Yes
>> Supported FEC modes: Not reported
>> Advertised link modes: 2500baseX/Full
>> Advertised pause frame use: Symmetric Receive-only
>> Advertised auto-negotiation: Yes
>> Advertised FEC modes: Not reported
>> Speed: 2500Mb/s
>> Duplex: Full
>> Auto-negotiation: on
>> Port: FIBRE
>> PHYAD: 0
>> Transceiver: internal
>> Current message level: 0x000000ff (255)
>> drv probe link timer ifdown ifup rx_err tx_err
>> Link detected: yes
>> root@bpi-r3:~# ethtool -m eth1
>> Identifier : 0x03 (SFP)
>> Extended identifier : 0x04 (GBIC/SFP defined by 2-wire interface ID)
>> Connector : 0x07 (LC)
>> Transceiver codes : 0x00 0x01 0x00 0x00 0x00 0x00 0x02 0x00 0x00
>> Transceiver type : SONET: OC-48, short reach
>> Encoding : 0x05 (SONET Scrambled)
>> BR, Nominal : 2500MBd
>> Vendor name : OEM
>> Vendor OUI : 00:00:00
>> Vendor PN : SFP-2.5G-T
>> Vendor rev : 1.0
>> Option values : 0x00 0x1a
...
>>
>> I guess this SFP has a PHY, as it can operate in 100/1000/2500 mode as described on the module.
>
>It would help to know the kernel messages.

I only had the SFP message with the vendor/part number and the interface-up message from the MAC (2500baseX/Full), but no link up.

Which message(s) do you want to see?

regards Frank
On Sat, Mar 11, 2023 at 09:21:47PM +0100, Frank Wunderlich wrote:
> On 11 March 2023 21:00:20 CET, "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> >On Sat, Mar 11, 2023 at 01:05:37PM +0100, Frank Wunderlich wrote:
>
> >> I got the 2.5G copper SFPs and tried them... they work well with v12 (including this patch), but not with v13...
>
> >> How can we add a quirk to support this?
> >
> >Why does it need a quirk?
>
> To disable in-band mode for this 2.5G copper
> SFP. But I have not found a way to set a flag which I
> can grab in phylink.

We could make sfp_parse_support() set Autoneg, Pause, and Asym_Pause
in "modes" at the top of that function, and then use the SFP modes
quirk to clear the Autoneg bit for this SFP. Would that work for you?
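The suggestion above hinges on ordering: a quirk can only clear a bit that has already been set when the quirk runs. The toy model below illustrates that point only. It is not kernel code, the names `parse_support` and `quirk_disable_autoneg` are made up, and it assumes (as the phrase "at the top of that function" implies) that the default bits are currently applied after the quirk hook.

```python
# Default link mode bits, named after the ethtool bits discussed in the thread.
DEFAULT_BITS = {"Autoneg", "Pause", "Asym_Pause"}


def quirk_disable_autoneg(modes):
    """Stand-in for an SFP 'modes' quirk that clears the Autoneg bit."""
    modes.discard("Autoneg")


def parse_support(eeprom_modes, quirk=None, defaults_first=False):
    """Toy model of building the supported-modes set for a module.

    defaults_first=False: defaults are ORed in after the quirk ran (assumed
                          current behaviour), so the quirk's change is undone.
    defaults_first=True:  defaults are set before the quirk ran (the proposed
                          ordering), so the quirk has the final say.
    """
    modes = set(eeprom_modes)
    if defaults_first:
        modes |= DEFAULT_BITS
    if quirk is not None:
        quirk(modes)
    if not defaults_first:
        modes |= DEFAULT_BITS
    return modes


if __name__ == "__main__":
    base = {"2500baseX_Full"}
    print(sorted(parse_support(base, quirk_disable_autoneg)))
    print(sorted(parse_support(base, quirk_disable_autoneg, defaults_first=True)))
```

With the assumed current ordering the quirk has no visible effect on Autoneg; with the defaults applied first, the same quirk removes it.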
> Sent: Saturday, 11 March 2023 at 21:30
> From: "Russell King (Oracle)" <linux@armlinux.org.uk>
> On Sat, Mar 11, 2023 at 09:21:47PM +0100, Frank Wunderlich wrote:
> > On 11 March 2023 21:00:20 CET, "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > >On Sat, Mar 11, 2023 at 01:05:37PM +0100, Frank Wunderlich wrote:
> >
> > >> I got the 2.5G copper SFPs and tried them... they work well with v12 (including this patch), but not with v13...
> >
> > >> How can we add a quirk to support this?
> > >
> > >Why does it need a quirk?
> >
> > To disable in-band mode for this 2.5G copper
> > SFP. But I have not found a way to set a flag which I
> > can grab in phylink.
>
> We could make sfp_parse_support() set Autoneg, Pause, and Asym_Pause
> in "modes" at the top of that function, and then use the SFP modes
> quirk to clear the Autoneg bit for this SFP. Would that work for you?

I already tried this (without moving the autoneg/pause bits to sfp_parse_support):

static void sfp_quirk_disable_autoneg(const struct sfp_eeprom_id *id,
				      unsigned long *modes,
				      unsigned long *interfaces)
{
	linkmode_clear_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, modes);
}

The quirk was executed, but nothing changed (no link on the 2.5G SFP).

I guess you mean moving the code that handles the DT property for in-band mode in phylink_parse_mode (phylink.c) to the SFP function (drivers/net/phy/sfp-bus.c).

As a first test I tried to disable autoneg for the in-band setting:

@@ -840,11 +842,18 @@ static int phylink_parse_mode(struct phylink *pl, struct fwnode_handle *fwnode)

 		linkmode_zero(pl->supported);
 		phylink_set(pl->supported, MII);
-		phylink_set(pl->supported, Autoneg);
+		//phylink_set(pl->supported, Autoneg);
 		phylink_set(pl->supported, Asym_Pause);
 		phylink_set(pl->supported, Pause);
 		pl->link_config.an_enabled = true;
-		pl->cfg_link_an_mode = MLO_AN_INBAND;
+		//pl->cfg_link_an_mode = MLO_AN_INBAND;
+
+		//how to access sfp->inband_disable?
+		printk(KERN_ALERT "DEBUG: Passed %s:%d %d==%d (inband)??",__FUNCTION__,__LINE__, pl->cfg_link_an_mode, MLO_AN_INBAND);
+		//pl->cfg_link_an_mode = MLO_AN_PHY;
+		pl->link_config.an_enabled = false;
+		//phylink_clear(pl->supported, Autoneg);
+		printk(KERN_ALERT "DEBUG: Passed %s:%d %d==%d (inband)?? (forced phy-mode)",__FUNCTION__,__LINE__, pl->cfg_link_an_mode, MLO_AN_INBAND);

 		switch (pl->link_config.interface) {
 		case PHY_INTERFACE_MODE_SGMII:
@@ -947,7 +956,7 @@ static int phylink_parse_mode(struct phylink *pl, struct fwnode_handle *fwnode)
 		}

 		/* Check if MAC/PCS also supports Autoneg. */
-		pl->link_config.an_enabled = phylink_test(pl->supported, Autoneg);
+		//pl->link_config.an_enabled = phylink_test(pl->supported, Autoneg);
 	}

 	return 0;

But it seems this is not enough (or I am missing something), and I get an error when I try to bring the MAC up:

root@bpi-r3:~# ip link set eth1 up
[ 30.044144] mtk_soc_eth 15100000.ethernet eth1: mtk_open: could not attach PHY: -19
RTNETLINK answers: No such device

regards Frank
And I can confirm that disabling autoneg from userspace brings up the link on the 2.5G RJ45 SFP:

root@bpi-r3:~# ip link set eth1 up
[   73.433869] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/2500base-x link mode
root@bpi-r3:~# ip link show eth1
3: eth1: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
    link/ether 92:2f:d3:16:6f:94 brd ff:ff:ff:ff:ff:ff
root@bpi-r3:~# ethtool -s eth1 autoneg off
root@bpi-r3:~# [  147.190136] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 2.5Gbps/Full - flow control off
[  147.198600] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
root@bpi-r3:~#

I wonder which flags this changes compared to the original state. Those are the ones I have to set in code when the SFP is recognized. Setting these flags in phylink_parse_mode() (or skipping the AN flags there) seems to break it completely.

regards Frank
Just to make it clear: the issue with the copper SFPs is not a regression from this series; it existed before.
I just had none of them to test with until this weekend. My 1G fibre SFPs were working fine with the inband flag.

This patch tried to fix it in the mtk driver, which was rejected (as far as I understand, it should be handled in the phylink core instead of the PCS driver),
and it is no longer in v13, so we are trying to fix it another way.

Whatever I do in phylink_parse_mode(), the link is always inband. I tried to add a new state so that the configuration is none of FIXED, PHY or INBAND:

drivers/net/phy/phylink.c
@@ -151,6 +151,7 @@ static const char *phylink_an_mode_str(unsigned int mode)
 		[MLO_AN_PHY] = "phy",
 		[MLO_AN_FIXED] = "fixed",
 		[MLO_AN_INBAND] = "inband",
+		[MLO_AN_INBAND_DISABLED] = "inband disabled",
 	};

include/linux/phylink.h
@@ -20,6 +20,7 @@ enum {
 	MLO_AN_PHY = 0,	/* Conventional PHY */
 	MLO_AN_FIXED,	/* Fixed-link mode */
 	MLO_AN_INBAND,	/* In-band protocol */
+	MLO_AN_INBAND_DISABLED

Is my start the right way?

I noticed that the SFP handling always calls phylink_sfp_config_optical(), and this calls phylink_sfp_set_config() with a fixed MLO_AN_INBAND.

https://elixir.bootlin.com/linux/v6.3-rc1/source/drivers/net/phy/phylink.c#L3038

This looks wrong to me. First, not all SFPs are optical (ethtool shows FIBRE for copper SFPs too), and not all of them use inband mode.

After changing this to my new state I see it in the configuration message:

@@ -3044,7 +3049,7 @@ static int phylink_sfp_config_optical(struct phylink *pl)
 	pl->link_port = pl->sfp_port;
 
-	phylink_sfp_set_config(pl, MLO_AN_INBAND, pl->sfp_support, &config);
+	phylink_sfp_set_config(pl, MLO_AN_INBAND_DISABLED, pl->sfp_support, &config);
 
 	return 0;
 }

root@bpi-r3:~# ip link set eth1 up
[   30.186811] mtk_soc_eth 15100000.ethernet eth1: configuring for inband disabled/2500base-x link mode

Changing the mode itself is not enough to get the link up...
I have not found any special handling of MLO_AN_INBAND except in phylink_mac_initial_config(), where I added my mode as a fallthrough to the MLO_AN_INBAND case (no autoneg is set/done there). I also disabled link_config.an_enabled wherever I found it, but this does not bring up the link on the copper SFPs. It looks like there must be a state change, but here I'm stuck with my limited knowledge ;(

regards Frank
On Sun, Mar 12, 2023 at 01:40:36PM +0100, Frank Wunderlich wrote:
> > Sent: Saturday, 11 March 2023 at 21:30
> > From: "Russell King (Oracle)" <linux@armlinux.org.uk>
>
> > On Sat, Mar 11, 2023 at 09:21:47PM +0100, Frank Wunderlich wrote:
> > > On 11 March 2023 21:00:20 CET, "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > >On Sat, Mar 11, 2023 at 01:05:37PM +0100, Frank Wunderlich wrote:
> > >
> > > >> i got the 2.5G copper sfps, and tried them...they work well with the v12 (including this patch), but not in v13...
> > >
> > > >> how can we add a quirk to support this?
> > > >
> > > >Why does it need a quirk?
> > >
> > > To disable the inband-mode for this 2.5g copper
> > > sfp. But have not found a way to set a flag which i
> > > can grab in phylink.
> >
> > We could make sfp_parse_support() set Autoneg, Pause, and Asym_Pause
> > in "modes" at the top of that function, and then use the SFP modes
> > quirk to clear the Autoneg bit for this SFP. Would that work for you?
>
> I already tried this (without moving the Autoneg/Pause setup to the top of sfp_parse_support()):
>
> static void sfp_quirk_disable_autoneg(const struct sfp_eeprom_id *id,
> 				      unsigned long *modes,
> 				      unsigned long *interfaces)
> {
> 	linkmode_clear_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, modes);
> }
>
> The quirk was executed, but nothing changed (no link on the 2.5G SFP).

It won't have any effect on its own - because sfp_parse_support() does
this:

	if (bus->sfp_quirk && bus->sfp_quirk->modes)
		bus->sfp_quirk->modes(id, modes, interfaces);

	linkmode_or(support, support, modes);

	phylink_set(support, Autoneg);
	phylink_set(support, Pause);
	phylink_set(support, Asym_Pause);

Which means clearing Autoneg in "modes" via the modes SFP quirk will
have *absolutely* *no* *effect* whatsoever.

The fact that you replied having *not* followed my suggestion and then
intimating that it doesn't work is very frustrating.

> I guess you mean moving the code that handles the DT property for in-band mode from phylink_parse_mode() (phylink.c) to the SFP function (drivers/net/phy/sfp-bus.c).

No.

[rest of email cut because I can't be bothered to read it after this]

Please try what I suggested. You might find that it works.
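
Editorial sketch of the ordering problem described above. This is a userspace model only, not kernel code: the function names (`parse_support_defaults_last`, `parse_support_defaults_first`, `quirk_disable_autoneg`) are hypothetical stand-ins, and a plain 64-bit integer replaces the kernel link mode bitmap. The bit numbers follow the ethtool link mode enum. It only illustrates why a quirk that clears Autoneg in "modes" is undone when the defaults are set on "support" afterwards, and why setting the defaults in "modes" first lets the quirk take effect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Link mode bit numbers as defined by the ethtool link mode enum. */
enum {
	MODE_AUTONEG = 6,
	MODE_PAUSE = 13,
	MODE_ASYM_PAUSE = 14,
	MODE_2500BASEX_FULL = 15,
};

#define MODE_BIT(n) (UINT64_C(1) << (n))
#define MODE_DEFAULTS (MODE_BIT(MODE_AUTONEG) | MODE_BIT(MODE_PAUSE) | \
		       MODE_BIT(MODE_ASYM_PAUSE))

/* Hypothetical model of an SFP "modes" quirk callback. */
typedef void (*modes_quirk_fn)(uint64_t *modes);

/* Stand-in for the quirk from the thread: clear Autoneg in "modes". */
void quirk_disable_autoneg(uint64_t *modes)
{
	*modes &= ~MODE_BIT(MODE_AUTONEG);
}

/* Model of the existing order: quirk first, defaults set on "support" last. */
uint64_t parse_support_defaults_last(modes_quirk_fn quirk)
{
	uint64_t modes = MODE_BIT(MODE_2500BASEX_FULL);
	uint64_t support = 0;

	if (quirk)
		quirk(&modes);
	support |= modes;
	support |= MODE_DEFAULTS;	/* sets Autoneg again, undoing the quirk */
	return support;
}

/* Model of the suggested order: defaults go into "modes" before the quirk. */
uint64_t parse_support_defaults_first(modes_quirk_fn quirk)
{
	uint64_t modes = MODE_DEFAULTS | MODE_BIT(MODE_2500BASEX_FULL);
	uint64_t support = 0;

	if (quirk)
		quirk(&modes);
	support |= modes;
	return support;
}

bool has_autoneg(uint64_t support)
{
	return (support & MODE_BIT(MODE_AUTONEG)) != 0;
}

/* Self-check covering both orderings. */
void check_ordering(void)
{
	assert(has_autoneg(parse_support_defaults_last(quirk_disable_autoneg)));
	assert(!has_autoneg(parse_support_defaults_first(quirk_disable_autoneg)));
	assert(has_autoneg(parse_support_defaults_first(NULL)));
}
```

Calling `check_ordering()` from a small `main()` exercises both variants. Only the second ordering leaves Autoneg cleared for a module that carries the quirk, while modules without a quirk keep the default.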
On Sun, Mar 12, 2023 at 05:50:48PM +0100, Frank Wunderlich wrote:
> Just to make it clear: the issue with the copper SFPs is not a regression from this series; it existed before.
> I just had none of them to test with until this weekend. My 1G fibre SFPs were working fine with the inband flag.
>
> This patch tried to fix it in the mtk driver, which was rejected (as far as I understand, it should be handled in the phylink core instead of the PCS driver),
> and it is no longer in v13, so we are trying to fix it another way.
>
> Whatever I do in phylink_parse_mode(), the link is always inband. I tried to add a new state so that the configuration is none of FIXED, PHY or INBAND:
>
> drivers/net/phy/phylink.c
> @@ -151,6 +151,7 @@ static const char *phylink_an_mode_str(unsigned int mode)
>  		[MLO_AN_PHY] = "phy",
>  		[MLO_AN_FIXED] = "fixed",
>  		[MLO_AN_INBAND] = "inband",
> +		[MLO_AN_INBAND_DISABLED] = "inband disabled",
>  	};
>
> include/linux/phylink.h
> @@ -20,6 +20,7 @@ enum {
>  	MLO_AN_PHY = 0,	/* Conventional PHY */
>  	MLO_AN_FIXED,	/* Fixed-link mode */
>  	MLO_AN_INBAND,	/* In-band protocol */
> +	MLO_AN_INBAND_DISABLED
>
> Is my start the right way?

Oh ffs.
On Sun, Mar 12, 2023 at 08:05:41PM +0000, Russell King (Oracle) wrote:
> On Sun, Mar 12, 2023 at 01:40:36PM +0100, Frank Wunderlich wrote:
> > > Sent: Saturday, 11 March 2023 at 21:30
> > > From: "Russell King (Oracle)" <linux@armlinux.org.uk>
> >
> > > On Sat, Mar 11, 2023 at 09:21:47PM +0100, Frank Wunderlich wrote:
> > > > On 11 March 2023 21:00:20 CET, "Russell King (Oracle)" <linux@armlinux.org.uk> wrote:
> > > > >On Sat, Mar 11, 2023 at 01:05:37PM +0100, Frank Wunderlich wrote:
> > > >
> > > > >> i got the 2.5G copper sfps, and tried them...they work well with the v12 (including this patch), but not in v13...
> > > >
> > > > >> how can we add a quirk to support this?
> > > > >
> > > > >Why does it need a quirk?
> > > >
> > > > To disable the inband-mode for this 2.5g copper
> > > > sfp. But have not found a way to set a flag which i
> > > > can grab in phylink.
> > >
> > > We could make sfp_parse_support() set Autoneg, Pause, and Asym_Pause
> > > in "modes" at the top of that function, and then use the SFP modes
> > > quirk to clear the Autoneg bit for this SFP. Would that work for you?
> >
> > I already tried this (without moving the Autoneg/Pause setup to the top of sfp_parse_support()):
> >
> > static void sfp_quirk_disable_autoneg(const struct sfp_eeprom_id *id,
> > 				      unsigned long *modes,
> > 				      unsigned long *interfaces)
> > {
> > 	linkmode_clear_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, modes);
> > }
> >
> > The quirk was executed, but nothing changed (no link on the 2.5G SFP).
>
> It won't have any effect on its own - because sfp_parse_support() does
> this:
>
> 	if (bus->sfp_quirk && bus->sfp_quirk->modes)
> 		bus->sfp_quirk->modes(id, modes, interfaces);
>
> 	linkmode_or(support, support, modes);
>
> 	phylink_set(support, Autoneg);
> 	phylink_set(support, Pause);
> 	phylink_set(support, Asym_Pause);
>
> Which means clearing Autoneg in "modes" via the modes SFP quirk will
> have *absolutely* *no* *effect* whatsoever.
>
> The fact that you replied having *not* followed my suggestion and then
> intimating that it doesn't work is very frustrating.
>
> > I guess you mean moving the code that handles the DT property for in-band mode from phylink_parse_mode() (phylink.c) to the SFP function (drivers/net/phy/sfp-bus.c).
>
> No.
>
> [rest of email cut because I can't be bothered to read it after this]
>
> Please try what I suggested. You might find that it works.

Since describing what I wanted you to test didn't work, here's a patch
instead, based upon the quirk that you provided (which is what I'd have
written anyway). Add a "#define DEBUG" to the top of
drivers/net/phy/phylink.c in addition to applying this patch, and please
test the resulting kernel, sending me the resulting kernel messages, and
also reporting whether this works or not.

Thanks.

diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index daac293e8ede..1dd50f2ca05d 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -151,6 +151,10 @@ void sfp_parse_support(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 	unsigned int br_min, br_nom, br_max;
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(modes) = { 0, };
 
+	phylink_set(modes, Autoneg);
+	phylink_set(modes, Pause);
+	phylink_set(modes, Asym_Pause);
+
 	/* Decode the bitrate information to MBd */
 	br_min = br_nom = br_max = 0;
 	if (id->base.br_nominal) {
@@ -329,10 +333,6 @@ void sfp_parse_support(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 		bus->sfp_quirk->modes(id, modes, interfaces);
 
 	linkmode_or(support, support, modes);
-
-	phylink_set(support, Autoneg);
-	phylink_set(support, Pause);
-	phylink_set(support, Asym_Pause);
 }
 EXPORT_SYMBOL_GPL(sfp_parse_support);
 
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 39e3095796d0..b054360bf636 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -360,6 +360,13 @@ static void sfp_quirk_2500basex(const struct sfp_eeprom_id *id,
 	__set_bit(PHY_INTERFACE_MODE_2500BASEX, interfaces);
 }
 
+static void sfp_quirk_disable_autoneg(const struct sfp_eeprom_id *id,
+				      unsigned long *modes,
+				      unsigned long *interfaces)
+{
+	linkmode_clear_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, modes);
+}
+
 static void sfp_quirk_ubnt_uf_instant(const struct sfp_eeprom_id *id,
 				      unsigned long *modes,
 				      unsigned long *interfaces)
@@ -401,6 +408,7 @@ static const struct sfp_quirk sfp_quirks[] = {
 	SFP_QUIRK_M("UBNT", "UF-INSTANT", sfp_quirk_ubnt_uf_instant),
 
 	SFP_QUIRK_F("OEM", "SFP-10G-T", sfp_fixup_rollball_cc),
+	SFP_QUIRK_M("OEM", "SFP-2.5G-T", sfp_quirk_disable_autoneg),
 	SFP_QUIRK_F("OEM", "RTSFP-10", sfp_fixup_rollball_cc),
 	SFP_QUIRK_F("OEM", "RTSFP-10G", sfp_fixup_rollball_cc),
 	SFP_QUIRK_F("Turris", "RTSFP-10", sfp_fixup_rollball),
> Sent: Monday, 13 March 2023 at 11:59
> From: "Russell King (Oracle)" <linux@armlinux.org.uk>

> Since describing what I wanted you to test didn't work, here's a patch
> instead, based upon the quirk that you provided (which is what I'd have
> written anyway). Add a "#define DEBUG" to the top of
> drivers/net/phy/phylink.c in addition to applying this patch, and please
> test the resulting kernel, sending me the resulting kernel messages, and
> also reporting whether this works or not.

Hi

Thanks for the patch, and sorry for the misunderstanding. I thought the SFP quirk only sets a flag and that I needed to change
something in phylink.c to do the same as is done from userspace, so I tried to simulate the userspace call there, only for testing.

Here are the relevant parts of the debug output:

[    1.990637] sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn SK2301110008 dc 230110
[    2.000147] mtk_soc_eth 15100000.ethernet eth1: optical SFP: interfaces=[mac=2-4,21-22, sfp=]
[   56.321102] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/2500base-x link mode
[   56.329543] mtk_soc_eth 15100000.ethernet eth1: major config 2500base-x
[   56.336144] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00000000,0000e240 pause=04 link=0 an=1

full log here:
https://pastebin.com/vaXtXFY8

Unfortunately this does not bring the link up:

root@bpi-r3:~# ethtool eth1
Settings for eth1:
	Supported ports: [ MII ]
	Supported link modes:   2500baseX/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: Yes
	Supported FEC modes: Not reported
	Advertised link modes:  2500baseX/Full
	Advertised pause frame use: Symmetric Receive-only
	Advertised auto-negotiation: Yes
	Advertised FEC modes: Not reported
	Speed: Unknown!
	Duplex: Unknown! (255)
	Auto-negotiation: on
	Port: MII
	PHYAD: 0
	Transceiver: internal
	Current message level: 0x000000ff (255)
			       drv probe link timer ifdown ifup rx_err tx_err
	Link detected: no

Also, after calling this the link does not come up:

root@bpi-r3:~# ethtool -s eth1 autoneg off
[  542.690293] mtk_soc_eth 15100000.ethernet eth1: phylink_change_inband_advert: mode=inband/2500base-x adv=00,00000000,00000000,0000e200 pause=04

So it looks like it needs to be configured in inband mode first, and then autoneg needs to be disabled.

regards Frank
On Mon, Mar 13, 2023 at 07:39:00PM +0100, Frank Wunderlich wrote:
> > Sent: Monday, 13 March 2023 at 11:59
> > From: "Russell King (Oracle)" <linux@armlinux.org.uk>
>
> > Since describing what I wanted you to test didn't work, here's a patch
> > instead, based upon the quirk that you provided (which is what I'd have
> > written anyway). Add a "#define DEBUG" to the top of
> > drivers/net/phy/phylink.c in addition to applying this patch, and please
> > test the resulting kernel, sending me the resulting kernel messages, and
> > also reporting whether this works or not.
>
> Hi
>
> Thanks for the patch, and sorry for the misunderstanding. I thought the SFP quirk only sets a flag and that I needed to change
> something in phylink.c to do the same as is done from userspace, so I tried to simulate the userspace call there, only for testing.
>
> Here are the relevant parts of the debug output:
>
> [    1.990637] sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn SK2301110008 dc 230110
> [    2.000147] mtk_soc_eth 15100000.ethernet eth1: optical SFP: interfaces=[mac=2-4,21-22, sfp=]

First thing... why are the SFP interfaces here empty? They should be
listing at least 22 for this SFP. Looking at the full log, you have
omitted:

[    2.008678] mtk_soc_eth 15100000.ethernet eth1: unsupported SFP module: no common interface modes

which seems to suggest that we need more than what I provided - and
is a big pointer to why it isn't working... and I guess has been there
all along.

This means that the interface configuration never gets updated, so
it's pointless trying to add quirks etc. Error messages are rather
a key point.

So everything after this is just not relevant. Let's fix that. Here's
an updated patch which sets an interface mode for this SFP and sets a
link mode for it (although we use 2500baseX rather than baseT here
just to test this). I'm guessing it also does rate adaption, which we
will have to work out later.

Thanks.

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 1a2f074685fa..08eeffa96ae4 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -3001,7 +3001,8 @@ static int phylink_sfp_config_optical(struct phylink *pl)
 	config.speed = SPEED_UNKNOWN;
 	config.duplex = DUPLEX_UNKNOWN;
 	config.pause = MLO_PAUSE_AN;
-	config.an_enabled = true;
+	config.an_enabled = linkmode_test_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
+					      support);
 
 	/* For all the interfaces that are supported, reduce the sfp_support
 	 * mask to only those link modes that can be supported.
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index daac293e8ede..1dd50f2ca05d 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -151,6 +151,10 @@ void sfp_parse_support(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 	unsigned int br_min, br_nom, br_max;
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(modes) = { 0, };
 
+	phylink_set(modes, Autoneg);
+	phylink_set(modes, Pause);
+	phylink_set(modes, Asym_Pause);
+
 	/* Decode the bitrate information to MBd */
 	br_min = br_nom = br_max = 0;
 	if (id->base.br_nominal) {
@@ -329,10 +333,6 @@ void sfp_parse_support(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 		bus->sfp_quirk->modes(id, modes, interfaces);
 
 	linkmode_or(support, support, modes);
-
-	phylink_set(support, Autoneg);
-	phylink_set(support, Pause);
-	phylink_set(support, Asym_Pause);
 }
 EXPORT_SYMBOL_GPL(sfp_parse_support);
 
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 39e3095796d0..5910c0e936a0 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -360,6 +360,21 @@ static void sfp_quirk_2500basex(const struct sfp_eeprom_id *id,
 	__set_bit(PHY_INTERFACE_MODE_2500BASEX, interfaces);
 }
 
+static void sfp_quirk_disable_autoneg(const struct sfp_eeprom_id *id,
+				      unsigned long *modes,
+				      unsigned long *interfaces)
+{
+	linkmode_clear_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, modes);
+}
+
+static void sfp_quirk_oem_2_5g(const struct sfp_eeprom_id *id,
+			       unsigned long *modes,
+			       unsigned long *interfaces)
+{
+	sfp_quirk_2500basex(id, modes, interfaces);
+	sfp_quirk_disable_autoneg(id, modes, interfaces);
+}
+
 static void sfp_quirk_ubnt_uf_instant(const struct sfp_eeprom_id *id,
 				      unsigned long *modes,
 				      unsigned long *interfaces)
@@ -401,6 +416,7 @@ static const struct sfp_quirk sfp_quirks[] = {
 	SFP_QUIRK_M("UBNT", "UF-INSTANT", sfp_quirk_ubnt_uf_instant),
 
 	SFP_QUIRK_F("OEM", "SFP-10G-T", sfp_fixup_rollball_cc),
+	SFP_QUIRK_M("OEM", "SFP-2.5G-T", sfp_quirk_oem_2_5g),
 	SFP_QUIRK_F("OEM", "RTSFP-10", sfp_fixup_rollball_cc),
 	SFP_QUIRK_F("OEM", "RTSFP-10G", sfp_fixup_rollball_cc),
 	SFP_QUIRK_F("Turris", "RTSFP-10", sfp_fixup_rollball),
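
Editorial sketch of the "no common interface modes" check discussed above. It is a simplified userspace model, assuming the check amounts to intersecting the MAC's supported interface bitmap with the one derived for the SFP and rejecting the module when the result is empty. The helper names (`iface_range`, `sfp_has_common_interface`) are hypothetical, and the value 22 for 2500base-x is taken from the numbers the debug log prints, not from any header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Interface number that the debug log prints for 2500base-x in this thread. */
#define IFACE_2500BASEX 22

/* Hypothetical helper: build a bitmap from an inclusive range such as "2-4". */
uint64_t iface_range(unsigned int first, unsigned int last)
{
	uint64_t mask = 0;

	assert(first <= last && last < 64);
	for (unsigned int i = first; i <= last; i++)
		mask |= UINT64_C(1) << i;
	return mask;
}

/*
 * Simplified model of the check: the module is usable only if the MAC and
 * SFP bitmaps share at least one interface mode.
 */
bool sfp_has_common_interface(uint64_t mac, uint64_t sfp)
{
	return (mac & sfp) != 0;
}
```

With the log's `mac=2-4,21-22`, an empty `sfp=` set has no overlap, which corresponds to the "unsupported SFP module" message. Once a quirk adds interface 22 to the SFP set, the intersection is no longer empty.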
Hi,

first, I removed the maintainers/reviewers and the mtk people from this thread so as to (hopefully) no longer disturb the review process of Daniel's patch series.

> Sent: Monday, 13 March 2023 at 23:57
> From: "Russell King (Oracle)" <linux@armlinux.org.uk>
>
> On Mon, Mar 13, 2023 at 07:39:00PM +0100, Frank Wunderlich wrote:
> > > Sent: Monday, 13 March 2023 at 11:59
> > > From: "Russell King (Oracle)" <linux@armlinux.org.uk>
> >
> > > Since describing what I wanted you to test didn't work, here's a patch
> > > instead, based upon the quirk that you provided (which is what I'd have
> > > written anyway). Add a "#define DEBUG" to the top of
> > > drivers/net/phy/phylink.c in addition to applying this patch, and please
> > > test the resulting kernel, sending me the resulting kernel messages, and
> > > also reporting whether this works or not.
> >
> > Hi
> >
> > Thanks for the patch, and sorry for the misunderstanding. I thought the SFP quirk only sets a flag and that I needed to change
> > something in phylink.c to do the same as is done from userspace, so I tried to simulate the userspace call there, only for testing.
> >
> > Here are the relevant parts of the debug output:
> >
> > [    1.990637] sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn SK2301110008 dc 230110
> > [    2.000147] mtk_soc_eth 15100000.ethernet eth1: optical SFP: interfaces=[mac=2-4,21-22, sfp=]
>
> First thing... why are the SFP interfaces here empty? They should be
> listing at least 22 for this SFP. Looking at the full log, you have
> omitted:
>
> [    2.008678] mtk_soc_eth 15100000.ethernet eth1: unsupported SFP module: no common interface modes

Not sure why I did not include this, as it should have matched the second grep pattern for eth1 ("sfp" did not match because of the missing -i flag).
It was not my intention to omit any relevant data; I just wanted to pre-filter the full log.

> which seems to suggest that we need more than what I provided - and
> is a big pointer to why it isn't working... and I guess has been there
> all along.
>
> This means that the interface configuration never gets updated, so
> it's pointless trying to add quirks etc. Error messages are rather
> a key point.

That's clear, but I somehow missed this one, sorry.

> So everything after this is just not relevant. Let's fix that. Here's
> an updated patch which sets an interface mode for this SFP and sets a
> link mode for it (although we use 2500baseX rather than baseT here
> just to test this). I'm guessing it also does rate adaption, which we
> will have to work out later.

Many thanks for guiding me through this ;)

At least the error message is gone, and the interface comes up when I call ethtool to switch off autoneg.

root@bpi-r3:~# dmesg | grep -i 'sfp\|eth1'
[    0.000000] Linux version 6.3.0-rc1-bpi-r3-sfp13 (frank@frank-G5) (aarch64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #1 SMP Tue Mar 143
[    1.653862] sfp sfp-1: Host maximum power 1.0W
[    1.658862] sfp sfp-2: Host maximum power 1.0W
[    1.812551] mtk_soc_eth 15100000.ethernet eth1: mediatek frame engine at 0xffffffc00b080000, irq 123
[    1.991838] sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn SK2301110008 dc 230110
[    2.001352] mtk_soc_eth 15100000.ethernet eth1: optical SFP: interfaces=[mac=2-4,21-22, sfp=22]
[    2.010059] mtk_soc_eth 15100000.ethernet eth1: optical SFP: chosen 2500base-x interface
[    2.018145] mtk_soc_eth 15100000.ethernet eth1: requesting link mode inband/2500base-x with support 00,00000000,00000000,0000e400
[   34.385814] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/2500base-x link mode
[   34.394259] mtk_soc_eth 15100000.ethernet eth1: major config 2500base-x
[   34.400860] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00000000,0000e400 pause=04 link=0 an=1
root@bpi-r3:~#
root@bpi-r3:~# ethtool -s eth1 autoneg off
root@bpi-r3:~# [  131.031902] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 2.5Gbps/Full - flow control off
[  131.040366] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

full log here:
https://pastebin.com/yDC7PuM2

I see that an is still 1. Maybe because of the fixed value here?

https://elixir.bootlin.com/linux/v6.3-rc1/source/drivers/net/phy/phylink.c#L3038

ethtool output after the autoneg workaround:

Settings for eth1:
	Supported ports: [ FIBRE ]
	Supported link modes:   2500baseX/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes:  2500baseX/Full
	Advertised pause frame use: Symmetric Receive-only
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: 2500Mb/s
	Duplex: Full
	Auto-negotiation: off
	Port: FIBRE
	PHYAD: 0
	Transceiver: internal
	Current message level: 0x000000ff (255)
			       drv probe link timer ifdown ifup rx_err tx_err
	Link detected: yes

And yes, the module seems to do rate adaption (it is labeled 100M/1G/2.5G). I tried it on a 1G port and the link came up (with the workaround patch from Daniel);
traffic "works", but in the TX direction with massive retransmits (I guess because pause frames are ignored; pause was 00).

regards Frank
On Tue, Mar 14, 2023 at 09:51:12AM +0100, Frank Wunderlich wrote:
> Hi,
>
> At least the error message is gone, and the interface comes up when I call ethtool to switch off autoneg.
>
> root@bpi-r3:~# dmesg | grep -i 'sfp\|eth1'
> [    1.991838] sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn SK2301110008 dc 230110
> [    2.001352] mtk_soc_eth 15100000.ethernet eth1: optical SFP: interfaces=[mac=2-4,21-22, sfp=22]
> [    2.010059] mtk_soc_eth 15100000.ethernet eth1: optical SFP: chosen 2500base-x interface
> [    2.018145] mtk_soc_eth 15100000.ethernet eth1: requesting link mode inband/2500base-x with support 00,00000000,00000000,0000e400
> [   34.385814] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/2500base-x link mode
> [   34.394259] mtk_soc_eth 15100000.ethernet eth1: major config 2500base-x
> [   34.400860] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00000000,0000e400 pause=04 link=0 an=1

Looking good - apart from that pesky an=1 (which isn't used by the PCS
driver, and I've been thinking of killing it off anyway.) Until such
time as that happens, we really ought to set that correctly, which means
an extra bit is needed in phylink_sfp_set_config(). However, this
should not affect anything.

> root@bpi-r3:~#
> root@bpi-r3:~# ethtool -s eth1 autoneg off
> root@bpi-r3:~# [  131.031902] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 2.5Gbps/Full - flow control off
> [  131.040366] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
>
> full log here:
> https://pastebin.com/yDC7PuM2
>
> I see that an is still 1. Maybe because of the fixed value here?
>
> https://elixir.bootlin.com/linux/v6.3-rc1/source/drivers/net/phy/phylink.c#L3038

Not sure what that line has to do with it - this is what the above
points to:

	phylink_sfp_set_config(pl, MLO_AN_INBAND, pl->sfp_support, &config);

Anyway, the important thing is the Autoneg bit in the advertising mask
is now zero, which is what we want. That should set the PCS to disable
negotiation when in 2500baseX mode, the same as ethtool -s eth1 autoneg
off.

So I think the question becomes - what was the state that ethtool was
reporting before asking ethtool to set autoneg off, and why does that
make a difference.

> And yes, the module seems to do rate adaption (it is labeled 100M/1G/2.5G). I tried it on a 1G port and the link came up (with the workaround patch from Daniel);
> traffic "works", but in the TX direction with massive retransmits (I guess because pause frames are ignored; pause was 00).

We'll see about addressing that later once we've got the module working
at 2.5G. However, thanks for the information.

The patch below should result in ethtool reporting 2500baseT rather than
2500baseX, and that an=1 should now be an=0. Please try it, and dump the
ethtool eth1 before asking for autoneg to be manually disabled, and also
report the kernel messages.

Thanks.
Hi

> Sent: Tuesday, 14 March 2023 at 10:12
> From: "Russell King (Oracle)" <linux@armlinux.org.uk>
> On Tue, Mar 14, 2023 at 09:51:12AM +0100, Frank Wunderlich wrote:
> > Hi,
> >
> > At least the error message is gone, and the interface comes up when I call ethtool to switch off autoneg.
...
> > [   34.400860] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00000000,0000e400 pause=04 link=0 an=1
>
> Looking good - apart from that pesky an=1 (which isn't used by the PCS
> driver, and I've been thinking of killing it off anyway.) Until such
> time as that happens, we really ought to set that correctly, which means
> an extra bit is needed in phylink_sfp_set_config(). However, this
> should not affect anything.
>
> > root@bpi-r3:~#
> > root@bpi-r3:~# ethtool -s eth1 autoneg off
> > root@bpi-r3:~# [  131.031902] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 2.5Gbps/Full - flow control off
> > [  131.040366] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
> >
> > full log here:
> > https://pastebin.com/yDC7PuM2
> >
> > I see that an is still 1. Maybe because of the fixed value here?
> >
> > https://elixir.bootlin.com/linux/v6.3-rc1/source/drivers/net/phy/phylink.c#L3038
>
> Not sure what that line has to do with it - this is what the above
> points to:
>
> 	phylink_sfp_set_config(pl, MLO_AN_INBAND, pl->sfp_support, &config);

MLO_AN_INBAND => may cause the an=1 and mode=inband if previously (?) disabled :)

> Anyway, the important thing is the Autoneg bit in the advertising mask
> is now zero, which is what we want. That should set the PCS to disable
> negotiation when in 2500baseX mode, the same as ethtool -s eth1 autoneg
> off.
>
> So I think the question becomes - what was the state that ethtool was
> reporting before asking ethtool to set autoneg off, and why does that
> make a difference.

OK, I will capture the ethtool output before and after in the next test.

> > And yes, the module seems to do rate adaption (it is labeled 100M/1G/2.5G). I tried it on a 1G port and the link came up (with the workaround patch from Daniel);
> > traffic "works", but in the TX direction with massive retransmits (I guess because pause frames are ignored; pause was 00).
>
> We'll see about addressing that later once we've got the module working
> at 2.5G. However, thanks for the information.

Of course, step by step; I just wanted to report this behaviour.

> The patch below should result in ethtool reporting 2500baseT rather than
> 2500baseX, and that an=1 should now be an=0. Please try it, and dump the
> ethtool eth1 before asking for autoneg to be manually disabled, and also
> report the kernel messages.

I see no patch below ;)

regards Frank
On Tue, Mar 14, 2023 at 11:01:42AM +0100, Frank Wunderlich wrote: > Hi > > > Gesendet: Dienstag, 14. März 2023 um 10:12 Uhr > > Von: "Russell King (Oracle)" <linux@armlinux.org.uk> > > On Tue, Mar 14, 2023 at 09:51:12AM +0100, Frank Wunderlich wrote: > > > Hi, > > > > > > at least the error-message is gone, and interface gets up when i call ethtoo to switch off autoneg. > ... > > > [ 34.400860] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00000000,0000e400 pause=04 link=0 an=1 > > > > Looking good - apart from that pesky an=1 (which isn't used by the PCS > > driver, and I've been thinking of killing it off anyway.) Until such > > time that happens, we really ought to set that correctly, which means > > an extra bit is needed in phylink_sfp_set_config(). However, this > > should not affect anything. > > > > > root@bpi-r3:~# > > > root@bpi-r3:~# ethtool -s eth1 autoneg off > > > root@bpi-r3:~# [ 131.031902] mtk_soc_eth 15100000.ethernet eth1: Link is Up - 2.5Gbps/Full - flow control off > > > [ 131.040366] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready > > > > > > full log here: > > > https://pastebin.com/yDC7PuM2 > > > > > > i see that an is still 1..maybe because of the fixed value here? > > > > > > https://elixir.bootlin.com/linux/v6.3-rc1/source/drivers/net/phy/phylink.c#L3038 > > > > Not sure what that line has to do with it - this is what the above > > points to: > > > > phylink_sfp_set_config(pl, MLO_AN_INBAND, pl->sfp_support, &config); > > MLO_AN_INBAND => may cause the an=1 and mode=inband if previously (?) disabled :) For 802.3z modes, MLO_AN_INBAND with Autoneg clear in the advertising mode disables in-band negotiation. This is exactly how "ethtool -s ethX autoneg off" works. > > The patch below should result in ethtool reporting 2500baseT rather than > > 2500baseX, and that an=1 should now be an=0. 
Please try it, and dump the
> > ethtool eth1 before asking for autoneg to be manually disabled, and also
> > report the kernel messages.
>
> i see no Patch below ;)

diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 1a2f074685fa..b8844dfcbf51 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -2898,6 +2898,10 @@ static void phylink_sfp_set_config(struct phylink *pl, u8 mode,
 		changed = true;
 	}
 
+	if (pl->link_config.an_enabled != state->an_enabled)
+		changed = true;
+	pl->link_config.an_enabled = state->an_enabled;
+
 	if (pl->cur_link_an_mode != mode ||
 	    pl->link_config.interface != state->interface) {
 		pl->cur_link_an_mode = mode;
@@ -3001,7 +3005,8 @@ static int phylink_sfp_config_optical(struct phylink *pl)
 	config.speed = SPEED_UNKNOWN;
 	config.duplex = DUPLEX_UNKNOWN;
 	config.pause = MLO_PAUSE_AN;
-	config.an_enabled = true;
+	config.an_enabled = linkmode_test_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
+					      support);
 
 	/* For all the interfaces that are supported, reduce the sfp_support
 	 * mask to only those link modes that can be supported.
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index daac293e8ede..1dd50f2ca05d 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -151,6 +151,10 @@ void sfp_parse_support(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 	unsigned int br_min, br_nom, br_max;
 	__ETHTOOL_DECLARE_LINK_MODE_MASK(modes) = { 0, };
 
+	phylink_set(modes, Autoneg);
+	phylink_set(modes, Pause);
+	phylink_set(modes, Asym_Pause);
+
 	/* Decode the bitrate information to MBd */
 	br_min = br_nom = br_max = 0;
 	if (id->base.br_nominal) {
@@ -329,10 +333,6 @@ void sfp_parse_support(struct sfp_bus *bus, const struct sfp_eeprom_id *id,
 		bus->sfp_quirk->modes(id, modes, interfaces);
 
 	linkmode_or(support, support, modes);
-
-	phylink_set(support, Autoneg);
-	phylink_set(support, Pause);
-	phylink_set(support, Asym_Pause);
 }
 EXPORT_SYMBOL_GPL(sfp_parse_support);
 
diff --git a/drivers/net/phy/sfp.c b/drivers/net/phy/sfp.c
index 39e3095796d0..9c1fa0b1737f 100644
--- a/drivers/net/phy/sfp.c
+++ b/drivers/net/phy/sfp.c
@@ -360,6 +360,23 @@ static void sfp_quirk_2500basex(const struct sfp_eeprom_id *id,
 	__set_bit(PHY_INTERFACE_MODE_2500BASEX, interfaces);
 }
 
+static void sfp_quirk_disable_autoneg(const struct sfp_eeprom_id *id,
+				      unsigned long *modes,
+				      unsigned long *interfaces)
+{
+	linkmode_clear_bit(ETHTOOL_LINK_MODE_Autoneg_BIT, modes);
+}
+
+static void sfp_quirk_oem_2_5g(const struct sfp_eeprom_id *id,
+			       unsigned long *modes,
+			       unsigned long *interfaces)
+{
+	/* Copper 2.5G SFP */
+	linkmode_set_bit(ETHTOOL_LINK_MODE_2500baseT_Full_BIT, modes);
+	__set_bit(PHY_INTERFACE_MODE_2500BASEX, interfaces);
+	sfp_quirk_disable_autoneg(id, modes, interfaces);
+}
+
 static void sfp_quirk_ubnt_uf_instant(const struct sfp_eeprom_id *id,
 				      unsigned long *modes,
 				      unsigned long *interfaces)
@@ -401,6 +418,7 @@ static const struct sfp_quirk sfp_quirks[] = {
 	SFP_QUIRK_M("UBNT", "UF-INSTANT", sfp_quirk_ubnt_uf_instant),
 
 	SFP_QUIRK_F("OEM", "SFP-10G-T", sfp_fixup_rollball_cc),
+	SFP_QUIRK_M("OEM", "SFP-2.5G-T", sfp_quirk_oem_2_5g),
 	SFP_QUIRK_F("OEM", "RTSFP-10", sfp_fixup_rollball_cc),
 	SFP_QUIRK_F("OEM", "RTSFP-10G", sfp_fixup_rollball_cc),
 	SFP_QUIRK_F("Turris", "RTSFP-10", sfp_fixup_rollball),
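Editorial illustration, not part of the posted patch: the key point of the sfp-bus.c change is ordering. The default Autoneg/Pause/Asym_Pause bits are now set in the modes mask before the module quirk runs, so a quirk can clear Autoneg, and phylink then derives an_enabled from the resulting support mask. The user-space sketch below models only that ordering. All names and bit values in it (MODE_*, quirk_fn, parse_support, an_enabled) are hypothetical stand-ins, not the kernel API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for ETHTOOL_LINK_MODE_*_BIT; values are arbitrary. */
enum {
	MODE_AUTONEG        = 1u << 0,
	MODE_PAUSE          = 1u << 1,
	MODE_ASYM_PAUSE     = 1u << 2,
	MODE_2500BASET_FULL = 1u << 3,
};

/* A module quirk may add or remove link modes. */
typedef void (*quirk_fn)(uint32_t *modes);

/* Models sfp_quirk_oem_2_5g(): copper 2.5G module, autoneg cleared. */
static void quirk_oem_2_5g(uint32_t *modes)
{
	assert(modes != NULL);
	*modes |= MODE_2500BASET_FULL;
	*modes &= ~(uint32_t)MODE_AUTONEG;
}

/* Models the patched ordering: defaults first, quirk last, so the
 * quirk's decision is not overwritten afterwards.
 */
static uint32_t parse_support(quirk_fn quirk)
{
	uint32_t modes = MODE_AUTONEG | MODE_PAUSE | MODE_ASYM_PAUSE;

	if (quirk)
		quirk(&modes);
	return modes;
}

/* Models the patched optical config: an_enabled follows the support mask
 * instead of being hard-coded to true.
 */
static bool an_enabled(uint32_t support)
{
	return (support & MODE_AUTONEG) != 0;
}
```

With no quirk the model keeps autoneg enabled. With the OEM 2.5G quirk it reports autoneg disabled while the pause bits stay set, which is consistent with the "Supports auto-negotiation: No" and "Symmetric Receive-only" lines in the ethtool output reported later in the thread.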
Hi

very good...do not need the manual autoneg with the last Patch :)

> Sent: Tuesday, 14 March 2023 at 11:10
> From: "Russell King (Oracle)" <linux@armlinux.org.uk>

> For 802.3z modes, MLO_AN_INBAND with Autoneg clear in the advertising mode
> disables in-band negotiation. This is exactly how "ethtool -s ethX
> autoneg off" works.

ok, this seems now correctly set.

> > > The patch below should result in ethtool reporting 2500baseT rather than
> > > 2500baseX, and that an=1 should now be an=0. Please try it, and dump the
> > > ethtool eth1 before asking for autoneg to be manually disabled, and also
> > > report the kernel messages.

root@bpi-r3:~# ip link set eth1 up
[ 91.624075] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/2500base-x link mode
[ 91.632485] mtk_soc_eth 15100000.ethernet eth1: major config 2500base-x
[ 91.639094] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00008000,00006400 pause=00 link=0 an=0
root@bpi-r3:~# [ 95.808983] mtk_soc_eth 15100000.ethernet eth1: Link is Up - Unknown/Unknown - flow control off
[ 95.817706] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

root@bpi-r3:~# ethtool eth1
Settings for eth1:
	Supported ports: [ FIBRE ]
	Supported link modes: 2500baseT/Full
	Supported pause frame use: Symmetric Receive-only
	Supports auto-negotiation: No
	Supported FEC modes: Not reported
	Advertised link modes: 2500baseT/Full
	Advertised pause frame use: Symmetric Receive-only
	Advertised auto-negotiation: No
	Advertised FEC modes: Not reported
	Speed: Unknown!
	Duplex: Unknown! (255)
	Auto-negotiation: off
	Port: FIBRE
	PHYAD: 0
	Transceiver: internal
	Current message level: 0x000000ff (255)
			       drv probe link timer ifdown ifup rx_err tx_err
	Link detected: yes

root@bpi-r3:~# dmesg | grep -i 'sfp\|eth1'
[ 0.000000] Linux version 6.3.0-rc1-bpi-r3-sfp13 (frank@frank-G5) (aarch64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #2 SMP Tue Mar 143
[ 1.658048] sfp sfp-1: Host maximum power 1.0W
[ 1.663128] sfp sfp-2: Host maximum power 1.0W
[ 1.812401] mtk_soc_eth 15100000.ethernet eth1: mediatek frame engine at 0xffffffc00af80000, irq 123
[ 2.001796] sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn SK2301110008 dc 230110
[ 2.011307] mtk_soc_eth 15100000.ethernet eth1: optical SFP: interfaces=[mac=2-4,21-22, sfp=22]
[ 2.020000] mtk_soc_eth 15100000.ethernet eth1: optical SFP: chosen 2500base-x interface
[ 2.028080] mtk_soc_eth 15100000.ethernet eth1: requesting link mode inband/2500base-x with support 00,00000000,00008000,00006400
[ 91.624075] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/2500base-x link mode
[ 91.632485] mtk_soc_eth 15100000.ethernet eth1: major config 2500base-x
[ 91.639094] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00008000,00006400 pause=00 link=0 an=0
[ 95.808983] mtk_soc_eth 15100000.ethernet eth1: Link is Up - Unknown/Unknown - flow control off
[ 95.817706] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready

so you can see the link-up comes directly after the interface up

does the ethtool-output look like expected? i see speed/duplex is set as supported/advertised but not active

Supported link modes: 2500baseT/Full
Advertised link modes: 2500baseT/Full
vs.
Speed: Unknown!
Duplex: Unknown! (255)

maybe because of the

@@ -3003,7 +3007,8 @@ static int phylink_sfp_config_optical(struct phylink *pl)
 	config.speed = SPEED_UNKNOWN;
 	config.duplex = DUPLEX_UNKNOWN;
 	config.pause = MLO_PAUSE_AN;

imho ETHTOOL_LINK_MODE_2500baseT_Full_BIT sets only the supported which intersected with the advertised from the other side maximum should be taken as actual mode...so this part seems not correctly working at the moment.

the "Supported ports: [ FIBRE ]" is also misleading for copper sfp, but imho all SFP are shown like this.

full log if needed:
https://pastebin.com/6yWe4Kyi

next step:
is it possible to have pause for rate adaption (handling rx pause frames correctly)?

regards Frank
On Tue, Mar 14, 2023 at 02:59:11PM +0100, Frank Wunderlich wrote:
> Hi
>
> very good...do not need the manual autoneg with the last Patch :)

Great news! Thanks for your patience.

> > Sent: Tuesday, 14 March 2023 at 11:10
> > From: "Russell King (Oracle)" <linux@armlinux.org.uk>
>
> > For 802.3z modes, MLO_AN_INBAND with Autoneg clear in the advertising mode
> > disables in-band negotiation. This is exactly how "ethtool -s ethX
> > autoneg off" works.
>
> ok, this seems now correctly set.
>
> > > > The patch below should result in ethtool reporting 2500baseT rather than
> > > > 2500baseX, and that an=1 should now be an=0. Please try it, and dump the
> > > > ethtool eth1 before asking for autoneg to be manually disabled, and also
> > > > report the kernel messages.
>
> root@bpi-r3:~# ip link set eth1 up
> [ 91.624075] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/2500base-x link mode
> [ 91.632485] mtk_soc_eth 15100000.ethernet eth1: major config 2500base-x
> [ 91.639094] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00008000,00006400 pause=00 link=0 an=0
> root@bpi-r3:~# [ 95.808983] mtk_soc_eth 15100000.ethernet eth1: Link is Up - Unknown/Unknown - flow control off
> [ 95.817706] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
>
> root@bpi-r3:~# ethtool eth1
> Settings for eth1:
> 	Supported ports: [ FIBRE ]
> 	Supported link modes: 2500baseT/Full
> 	Supported pause frame use: Symmetric Receive-only
> 	Supports auto-negotiation: No
> 	Supported FEC modes: Not reported
> 	Advertised link modes: 2500baseT/Full
> 	Advertised pause frame use: Symmetric Receive-only
> 	Advertised auto-negotiation: No
> 	Advertised FEC modes: Not reported
> 	Speed: Unknown!
> 	Duplex: Unknown! (255)
> 	Auto-negotiation: off
> 	Port: FIBRE
> 	PHYAD: 0
> 	Transceiver: internal
> 	Current message level: 0x000000ff (255)
> 			       drv probe link timer ifdown ifup rx_err tx_err
> 	Link detected: yes
>
> root@bpi-r3:~# dmesg | grep -i 'sfp\|eth1'
> [ 0.000000] Linux version 6.3.0-rc1-bpi-r3-sfp13 (frank@frank-G5) (aarch64-linux-gnu-gcc (Ubuntu 11.3.0-1ubuntu1~22.04) 11.3.0, GNU ld (GNU Binutils for Ubuntu) 2.38) #2 SMP Tue Mar 143
> [ 1.658048] sfp sfp-1: Host maximum power 1.0W
> [ 1.663128] sfp sfp-2: Host maximum power 1.0W
> [ 1.812401] mtk_soc_eth 15100000.ethernet eth1: mediatek frame engine at 0xffffffc00af80000, irq 123
> [ 2.001796] sfp sfp-1: module OEM SFP-2.5G-T rev 1.0 sn SK2301110008 dc 230110
> [ 2.011307] mtk_soc_eth 15100000.ethernet eth1: optical SFP: interfaces=[mac=2-4,21-22, sfp=22]
> [ 2.020000] mtk_soc_eth 15100000.ethernet eth1: optical SFP: chosen 2500base-x interface
> [ 2.028080] mtk_soc_eth 15100000.ethernet eth1: requesting link mode inband/2500base-x with support 00,00000000,00008000,00006400
> [ 91.624075] mtk_soc_eth 15100000.ethernet eth1: configuring for inband/2500base-x link mode
> [ 91.632485] mtk_soc_eth 15100000.ethernet eth1: major config 2500base-x
> [ 91.639094] mtk_soc_eth 15100000.ethernet eth1: phylink_mac_config: mode=inband/2500base-x/Unknown/Unknown/none adv=00,00000000,00008000,00006400 pause=00 link=0 an=0
> [ 95.808983] mtk_soc_eth 15100000.ethernet eth1: Link is Up - Unknown/Unknown - flow control off
> [ 95.817706] IPv6: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready
>
> so you can see the link-up comes directly after the interface up
>
> does the ethtool-output look like expected? i see speed/duplex is set as supported/advertised but not active
>
> Supported link modes: 2500baseT/Full
> Advertised link modes: 2500baseT/Full
> vs.
> Speed: Unknown!
> Duplex: Unknown! (255)

Yes, and I think that's reasonable given that the PHY is inaccessible,
and therefore we have no way to know what the PHY is actually doing.

> imho ETHTOOL_LINK_MODE_2500baseT_Full_BIT sets only the supported which intersected with the advertised from the other side maximum should be taken as actual mode...so this part seems not correctly working at the moment.

... except we don't know what "the other side" is doing because we need
to read that from the PHY in the SFP.

> the "Supported ports: [ FIBRE ]" is also misleading for copper sfp, but imho all SFP are shown like this.

... unless they have a PHY we can access.

> full log if needed:
> https://pastebin.com/6yWe4Kyi
>
> next step:
> is it possible to have pause for rate adaption (handling rx pause frames correctly)?

That's certainly the next issue to sort out. I'll send a patch when I've
sorted that out.

Thanks!
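Editorial illustration, not part of the thread: the rule quoted above ("For 802.3z modes, MLO_AN_INBAND with Autoneg clear in the advertising mode disables in-band negotiation") can be written as a small decision function. This is a user-space sketch under stated assumptions. The enums and function names are hypothetical stand-ins for the kernel's MLO_AN_* and PHY_INTERFACE_MODE_* values, and the SGMII branch (in-band signalling always used in in-band mode) is an assumption based on how the mtk_sgmii.c code in this series treats SGMII.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for MLO_AN_PHY / MLO_AN_FIXED / MLO_AN_INBAND. */
enum an_mode { AN_PHY, AN_FIXED, AN_INBAND };

/* Hypothetical stand-ins for the PHY interface modes discussed here. */
enum iface { IFACE_SGMII, IFACE_1000BASEX, IFACE_2500BASEX };

/* 1000base-X and 2500base-X are the 802.3z-style modes in this thread. */
static bool iface_is_8023z(enum iface iface)
{
	return iface == IFACE_1000BASEX || iface == IFACE_2500BASEX;
}

/* Returns true when in-band negotiation should actually be performed. */
static bool inband_an_used(enum an_mode mode, enum iface iface,
			   bool autoneg_advertised)
{
	assert(mode == AN_PHY || mode == AN_FIXED || mode == AN_INBAND);

	if (mode != AN_INBAND)
		return false;

	/* 802.3z: the Autoneg bit in the advertising mask decides. */
	if (iface_is_8023z(iface))
		return autoneg_advertised;

	/* Assumption: SGMII in in-band mode always uses in-band signalling. */
	return true;
}
```

In the log above the link is configured as inband/2500base-x with an=0, which corresponds to inband_an_used(AN_INBAND, IFACE_2500BASEX, false) returning false in this model.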
diff --git a/drivers/net/ethernet/mediatek/mtk_sgmii.c b/drivers/net/ethernet/mediatek/mtk_sgmii.c
index 61bd9986466a..58d8cb3aa7f4 100644
--- a/drivers/net/ethernet/mediatek/mtk_sgmii.c
+++ b/drivers/net/ethernet/mediatek/mtk_sgmii.c
@@ -38,9 +38,9 @@ static int mtk_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
 			  const unsigned long *advertising,
 			  bool permit_pause_to_mac)
 {
-	bool mode_changed = false, changed, use_an;
 	struct mtk_pcs *mpcs = pcs_to_mtk_pcs(pcs);
-	unsigned int rgc3, sgm_mode, bmcr;
+	unsigned int rgc3, sgm_mode = 0, bmcr = 0;
+	bool mode_changed = false, changed;
 	int advertise, link_timer;
 
 	advertise = phylink_mii_c22_pcs_encode_advertisement(interface,
@@ -55,27 +55,13 @@ static int mtk_pcs_config(struct phylink_pcs *pcs, unsigned int mode,
 	if (interface == PHY_INTERFACE_MODE_SGMII) {
 		sgm_mode = SGMII_IF_MODE_SGMII;
 		if (phylink_autoneg_inband(mode)) {
+			bmcr = SGMII_AN_ENABLE;
 			sgm_mode |= SGMII_REMOTE_FAULT_DIS |
 				    SGMII_SPEED_DUPLEX_AN;
-			use_an = true;
-		} else {
-			use_an = false;
 		}
-	} else if (phylink_autoneg_inband(mode)) {
-		/* 1000base-X or 2500base-X autoneg */
-		sgm_mode = SGMII_REMOTE_FAULT_DIS;
-		use_an = linkmode_test_bit(ETHTOOL_LINK_MODE_Autoneg_BIT,
-					   advertising);
-	} else {
+	} else if (!phylink_autoneg_inband(mode)) {
 		/* 1000base-X or 2500base-X without autoneg */
-		sgm_mode = 0;
-		use_an = false;
-	}
-
-	if (use_an) {
-		bmcr = SGMII_AN_ENABLE;
-	} else {
-		bmcr = 0;
+		sgm_mode = SGMII_REMOTE_FAULT_DIS;
 	}
 
 	if (mpcs->interface != interface) {