[net] net: phy: Avoid WARN_ON for PHY_NOLINK during resuming

Message ID 20221021074154.25906-1-hayashi.kunihiko@socionext.com
State New
Headers
Series [net] net: phy: Avoid WARN_ON for PHY_NOLINK during resuming |

Commit Message

Kunihiko Hayashi Oct. 21, 2022, 7:41 a.m. UTC
  When resuming from sleep, if there is a time lag from link-down to link-up
due to auto-negotiation, the phy status has been still PHY_NOLINK, so
WARN_ON dump occurs in mdio_bus_phy_resume(). For example, UniPhier AVE
ethernet takes about a few seconds to link up after resuming.

To avoid this issue, should remove PHY_NOLINK the WARN_ON conditions.

Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
---
 drivers/net/phy/phy_device.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)
  

Comments

Heiner Kallweit Oct. 21, 2022, 8:38 a.m. UTC | #1
On 21.10.2022 09:41, Kunihiko Hayashi wrote:
> When resuming from sleep, if there is a time lag from link-down to link-up
> due to auto-negotiation, the phy status has been still PHY_NOLINK, so
> WARN_ON dump occurs in mdio_bus_phy_resume(). For example, UniPhier AVE
> ethernet takes about a few seconds to link up after resuming.
> 
That autoneg takes some time is normal. If this would actually the root
cause then basically every driver should be affected. But it's not.

> To avoid this issue, should remove PHY_NOLINK the WARN_ON conditions.
> 
> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
> ---
>  drivers/net/phy/phy_device.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 57849ac0384e..c647d027bb5d 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -318,12 +318,12 @@ static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
>  	phydev->suspended_by_mdio_bus = 0;
>  
>  	/* If we managed to get here with the PHY state machine in a state
> -	 * neither PHY_HALTED, PHY_READY nor PHY_UP, this is an indication
> -	 * that something went wrong and we should most likely be using
> -	 * MAC managed PM, but we are not.
> +	 * neither PHY_HALTED, PHY_READY, PHY_UP nor PHY_NOLINK, this is an
> +	 * indication that something went wrong and we should most likely
> +	 * be using MAC managed PM, but we are not.
>  	 */

Did you read the comment you're changing? ave_resume() calls phy_resume(),
so you should follow the advice in the comment.

>  	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
> -		phydev->state != PHY_UP);
> +		phydev->state != PHY_UP && phydev->state != PHY_NOLINK);
>  
>  	ret = phy_init_hw(phydev);
>  	if (ret < 0)
  
Kunihiko Hayashi Oct. 21, 2022, 9:35 a.m. UTC | #2
Hi Heiner,

Thank you for your comment.

On 2022/10/21 17:38, Heiner Kallweit wrote:
> On 21.10.2022 09:41, Kunihiko Hayashi wrote:
>> When resuming from sleep, if there is a time lag from link-down to link-up
>> due to auto-negotiation, the phy status has been still PHY_NOLINK, so
>> WARN_ON dump occurs in mdio_bus_phy_resume(). For example, UniPhier AVE
>> ethernet takes about a few seconds to link up after resuming.
>>
> That autoneg takes some time is normal. If this would actually the root
> cause then basically every driver should be affected. But it's not.

Although the auto-neg should happen normally, I'm not sure about other
platforms.

>> To avoid this issue, should remove PHY_NOLINK the WARN_ON conditions.
>>
>> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
>> ---
>>   drivers/net/phy/phy_device.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>> index 57849ac0384e..c647d027bb5d 100644
>> --- a/drivers/net/phy/phy_device.c
>> +++ b/drivers/net/phy/phy_device.c
>> @@ -318,12 +318,12 @@ static __maybe_unused int mdio_bus_phy_resume(struct
>> device *dev)
>>   	phydev->suspended_by_mdio_bus = 0;
>>
>>   	/* If we managed to get here with the PHY state machine in a state
>> -	 * neither PHY_HALTED, PHY_READY nor PHY_UP, this is an indication
>> -	 * that something went wrong and we should most likely be using
>> -	 * MAC managed PM, but we are not.
>> +	 * neither PHY_HALTED, PHY_READY, PHY_UP nor PHY_NOLINK, this is an
>> +	 * indication that something went wrong and we should most likely
>> +	 * be using MAC managed PM, but we are not.
>>   	 */
> 
> Did you read the comment you're changing? ave_resume() calls phy_resume(),
> so you should follow the advice in the comment.

I understand something is wrong with "PHY_NOLINK" here, and need to investigate
the root cause of the phy state issue.

Thank you,

---
Best Regards
Kunihiko Hayashi
  
Heiner Kallweit Oct. 21, 2022, 11:12 a.m. UTC | #3
On 21.10.2022 11:35, Kunihiko Hayashi wrote:
> Hi Heiner,
> 
> Thank you for your comment.
> 
> On 2022/10/21 17:38, Heiner Kallweit wrote:
>> On 21.10.2022 09:41, Kunihiko Hayashi wrote:
>>> When resuming from sleep, if there is a time lag from link-down to link-up
>>> due to auto-negotiation, the phy status has been still PHY_NOLINK, so
>>> WARN_ON dump occurs in mdio_bus_phy_resume(). For example, UniPhier AVE
>>> ethernet takes about a few seconds to link up after resuming.
>>>
>> That autoneg takes some time is normal. If this would actually the root
>> cause then basically every driver should be affected. But it's not.
> 
> Although the auto-neg should happen normally, I'm not sure about other
> platforms.
> 
>>> To avoid this issue, should remove PHY_NOLINK the WARN_ON conditions.
>>>
>>> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
>>> ---
>>>   drivers/net/phy/phy_device.c | 8 ++++----
>>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>> index 57849ac0384e..c647d027bb5d 100644
>>> --- a/drivers/net/phy/phy_device.c
>>> +++ b/drivers/net/phy/phy_device.c
>>> @@ -318,12 +318,12 @@ static __maybe_unused int mdio_bus_phy_resume(struct
>>> device *dev)
>>>       phydev->suspended_by_mdio_bus = 0;
>>>
>>>       /* If we managed to get here with the PHY state machine in a state
>>> -     * neither PHY_HALTED, PHY_READY nor PHY_UP, this is an indication
>>> -     * that something went wrong and we should most likely be using
>>> -     * MAC managed PM, but we are not.
>>> +     * neither PHY_HALTED, PHY_READY, PHY_UP nor PHY_NOLINK, this is an
>>> +     * indication that something went wrong and we should most likely
>>> +     * be using MAC managed PM, but we are not.
>>>        */
>>
>> Did you read the comment you're changing? ave_resume() calls phy_resume(),
>> so you should follow the advice in the comment.
> 
> I understand something is wrong with "PHY_NOLINK" here, and need to investigate
> the root cause of the phy state issue.
> 
Best look at how phydev->mac_managed_pm is used in phylib and by MAC drivers.

> Thank you,
> 
> ---
> Best Regards
> Kunihiko Hayashi
  
Kunihiko Hayashi Oct. 21, 2022, 11:27 a.m. UTC | #4
On 2022/10/21 20:12, Heiner Kallweit wrote:
> On 21.10.2022 11:35, Kunihiko Hayashi wrote:
>> Hi Heiner,
>>
>> Thank you for your comment.
>>
>> On 2022/10/21 17:38, Heiner Kallweit wrote:
>>> On 21.10.2022 09:41, Kunihiko Hayashi wrote:
>>>> When resuming from sleep, if there is a time lag from link-down to
>>>> link-up
>>>> due to auto-negotiation, the phy status has been still PHY_NOLINK, so
>>>> WARN_ON dump occurs in mdio_bus_phy_resume(). For example, UniPhier AVE
>>>> ethernet takes about a few seconds to link up after resuming.
>>>>
>>> That autoneg takes some time is normal. If this would actually the root
>>> cause then basically every driver should be affected. But it's not.
>>
>> Although the auto-neg should happen normally, I'm not sure about other
>> platforms.
>>
>>>> To avoid this issue, should remove PHY_NOLINK the WARN_ON conditions.
>>>>
>>>> Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
>>>> ---
>>>>    drivers/net/phy/phy_device.c | 8 ++++----
>>>>    1 file changed, 4 insertions(+), 4 deletions(-)
>>>>
>>>> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
>>>> index 57849ac0384e..c647d027bb5d 100644
>>>> --- a/drivers/net/phy/phy_device.c
>>>> +++ b/drivers/net/phy/phy_device.c
>>>> @@ -318,12 +318,12 @@ static __maybe_unused int
>>>> mdio_bus_phy_resume(struct
>>>> device *dev)
>>>>        phydev->suspended_by_mdio_bus = 0;
>>>>
>>>>        /* If we managed to get here with the PHY state machine in a state
>>>> -     * neither PHY_HALTED, PHY_READY nor PHY_UP, this is an indication
>>>> -     * that something went wrong and we should most likely be using
>>>> -     * MAC managed PM, but we are not.
>>>> +     * neither PHY_HALTED, PHY_READY, PHY_UP nor PHY_NOLINK, this is an
>>>> +     * indication that something went wrong and we should most likely
>>>> +     * be using MAC managed PM, but we are not.
>>>>         */
>>>
>>> Did you read the comment you're changing? ave_resume() calls
>>> phy_resume(),
>>> so you should follow the advice in the comment.
>>
>> I understand something is wrong with "PHY_NOLINK" here, and need to
>> investigate
>> the root cause of the phy state issue.
>>
> Best look at how phydev->mac_managed_pm is used in phylib and by MAC
> drivers.

Thank you for the clue!
I'll try the flag and check the behavior of MAC/PHY.

Thank you,

---
Best Regards
Kunihiko Hayashi
  

Patch

diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 57849ac0384e..c647d027bb5d 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -318,12 +318,12 @@  static __maybe_unused int mdio_bus_phy_resume(struct device *dev)
 	phydev->suspended_by_mdio_bus = 0;
 
 	/* If we managed to get here with the PHY state machine in a state
-	 * neither PHY_HALTED, PHY_READY nor PHY_UP, this is an indication
-	 * that something went wrong and we should most likely be using
-	 * MAC managed PM, but we are not.
+	 * neither PHY_HALTED, PHY_READY, PHY_UP nor PHY_NOLINK, this is an
+	 * indication that something went wrong and we should most likely
+	 * be using MAC managed PM, but we are not.
 	 */
 	WARN_ON(phydev->state != PHY_HALTED && phydev->state != PHY_READY &&
-		phydev->state != PHY_UP);
+		phydev->state != PHY_UP && phydev->state != PHY_NOLINK);
 
 	ret = phy_init_hw(phydev);
 	if (ret < 0)