[RFC,net-next,v8,04/13] net: Change the API of PHY default timestamp to MAC

Message ID 20240216-feature_ptp_netnext-v8-4-510f42f444fb@bootlin.com
State New
Series net: Make timestamping selectable

Commit Message

Köry Maincent Feb. 16, 2024, 3:52 p.m. UTC
  Change the API so that MAC time stamping is selected by default instead
of PHY time stamping. The PHY is closer to the wire, so in theory it adds
less delay than MAC time stamping, but in practice this often does not hold.
Because of a lower time stamping clock frequency, latency on the MDIO bus,
and the lack of PHC hardware synchronization between different PHYs, PHY
PTP is often less precise than MAC PTP. The exception is PHYs designed
specifically for PTP, but these devices are not very widespread. To avoid
breaking compatibility, a default_timestamp flag is introduced in
phy_device; PHY drivers set it to indicate that they keep the old API
behavior.

Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
---

Changes in v5:
- Extract the API change in this patch.
- Rename whitelist to allowlist.
- Set NETDEV_TIMESTAMPING in register_netdevice function.
- Add software timestamping case description in ts_info.

Change in v6:
- Replace the allowlist phy with a default_timestamp flag to know which
  phy is using old API behavior.
- Fix dereference of a possible NULL pointer.
- Follow timestamping layer naming update.
- Update timestamp default set between MAC and software.
- Update ts_info returned in case of software timestamping.

Change in v8:
- Rework the implementation to use a simple phy_is_default_hwtstamp helper
  instead of saving the hwtstamp in the net_device struct.
---
 drivers/net/phy/bcm-phy-ptp.c     |  3 +++
 drivers/net/phy/dp83640.c         |  3 +++
 drivers/net/phy/micrel.c          |  6 ++++++
 drivers/net/phy/mscc/mscc_ptp.c   |  3 +++
 drivers/net/phy/nxp-c45-tja11xx.c |  3 +++
 include/linux/phy.h               | 17 +++++++++++++++++
 net/core/dev_ioctl.c              |  8 +++-----
 net/core/timestamping.c           | 10 ++++++++--
 net/ethtool/common.c              |  2 +-
 9 files changed, 47 insertions(+), 8 deletions(-)
  

Comments

Rahul Rameshbabu Feb. 16, 2024, 6:09 p.m. UTC | #1
On Fri, 16 Feb, 2024 16:52:22 +0100 Kory Maincent <kory.maincent@bootlin.com> wrote:
> Change the API to select MAC default time stamping instead of the PHY.
> Indeed the PHY is closer to the wire therefore theoretically it has less
> delay than the MAC timestamping but the reality is different. Due to lower
> time stamping clock frequency, latency in the MDIO bus and no PHC hardware
> synchronization between different PHY, the PHY PTP is often less precise
> than the MAC. The exception is for PHY designed specially for PTP case but
> these devices are not very widespread. For not breaking the compatibility
> default_timestamp flag has been introduced in phy_device that is set by
> the phy driver to know we are using the old API behavior.
>
> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
> ---

Overall, I agree with the motivation and reasoning behind the patch. It
takes dedicated effort to build a good phy timestamping mechanism, so
this approach is good. I do have a question though. In this patch if we
set the phy as the default timestamp mechanism, does that mean for even
non-PTP applications, the phy will be used for timestamping when
hardware timestamping is enabled? If so, I think this might need some
thought because there are timing applications in general when a
timestamp closest to the MAC layer would be best.

>
> Changes in v5:
> - Extract the API change in this patch.
> - Rename whitelist to allowlist.
> - Set NETDEV_TIMESTAMPING in register_netdevice function.
> - Add software timestamping case description in ts_info.
>
> Change in v6:
> - Replace the allowlist phy with a default_timestamp flag to know which
>   phy is using old API behavior.
> - Fix dereferenced of a possible null pointer.
> - Follow timestamping layer naming update.
> - Update timestamp default set between MAC and software.
> - Update ts_info returned in case of software timestamping.
>
> Change in v8:
> - Reform the implementation to use a simple phy_is_default_hwtstamp helper
>   instead of saving the hwtstamp in the net_device struct.
> ---

One general concern

>  drivers/net/phy/bcm-phy-ptp.c     |  3 +++
>  drivers/net/phy/dp83640.c         |  3 +++
>  drivers/net/phy/micrel.c          |  6 ++++++
>  drivers/net/phy/mscc/mscc_ptp.c   |  3 +++
>  drivers/net/phy/nxp-c45-tja11xx.c |  3 +++
>  include/linux/phy.h               | 17 +++++++++++++++++
>  net/core/dev_ioctl.c              |  8 +++-----
>  net/core/timestamping.c           | 10 ++++++++--
>  net/ethtool/common.c              |  2 +-
>  9 files changed, 47 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/net/phy/bcm-phy-ptp.c b/drivers/net/phy/bcm-phy-ptp.c
> index 617d384d4551..d3e825c951ee 100644
> --- a/drivers/net/phy/bcm-phy-ptp.c
> +++ b/drivers/net/phy/bcm-phy-ptp.c
> @@ -931,6 +931,9 @@ struct bcm_ptp_private *bcm_ptp_probe(struct phy_device *phydev)
>  		return ERR_CAST(clock);
>  	priv->ptp_clock = clock;
>  
> +	/* Timestamp selected by default to keep legacy API */
> +	phydev->default_timestamp = true;
> +
>  	priv->phydev = phydev;
>  	bcm_ptp_init(priv);
>  
> diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
> index 5c42c47dc564..64fd1a109c0f 100644
> --- a/drivers/net/phy/dp83640.c
> +++ b/drivers/net/phy/dp83640.c
> @@ -1450,6 +1450,9 @@ static int dp83640_probe(struct phy_device *phydev)
>  	phydev->mii_ts = &dp83640->mii_ts;
>  	phydev->priv = dp83640;
>  
> +	/* Timestamp selected by default to keep legacy API */
> +	phydev->default_timestamp = true;
> +
>  	spin_lock_init(&dp83640->rx_lock);
>  	skb_queue_head_init(&dp83640->rx_queue);
>  	skb_queue_head_init(&dp83640->tx_queue);
> diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
> index 9b6973581989..1c9eba331b01 100644
> --- a/drivers/net/phy/micrel.c
> +++ b/drivers/net/phy/micrel.c
> @@ -3177,6 +3177,9 @@ static void lan8814_ptp_init(struct phy_device *phydev)
>  	ptp_priv->mii_ts.ts_info  = lan8814_ts_info;
>  
>  	phydev->mii_ts = &ptp_priv->mii_ts;
> +
> +	/* Timestamp selected by default to keep legacy API */
> +	phydev->default_timestamp = true;
>  }
>  
>  static int lan8814_ptp_probe_once(struct phy_device *phydev)
> @@ -4613,6 +4616,9 @@ static int lan8841_probe(struct phy_device *phydev)
>  
>  	phydev->mii_ts = &ptp_priv->mii_ts;
>  
> +	/* Timestamp selected by default to keep legacy API */
> +	phydev->default_timestamp = true;
> +
>  	return 0;
>  }
>  
> diff --git a/drivers/net/phy/mscc/mscc_ptp.c b/drivers/net/phy/mscc/mscc_ptp.c
> index eb0b032cb613..e66d20eff7c4 100644
> --- a/drivers/net/phy/mscc/mscc_ptp.c
> +++ b/drivers/net/phy/mscc/mscc_ptp.c
> @@ -1570,6 +1570,9 @@ int vsc8584_ptp_probe(struct phy_device *phydev)
>  		return PTR_ERR(vsc8531->load_save);
>  	}
>  
> +	/* Timestamp selected by default to keep legacy API */
> +	phydev->default_timestamp = true;
> +
>  	vsc8531->ptp->phydev = phydev;
>  
>  	return 0;
> diff --git a/drivers/net/phy/nxp-c45-tja11xx.c b/drivers/net/phy/nxp-c45-tja11xx.c
> index 3cf614b4cd52..d18c133e6013 100644
> --- a/drivers/net/phy/nxp-c45-tja11xx.c
> +++ b/drivers/net/phy/nxp-c45-tja11xx.c
> @@ -1660,6 +1660,9 @@ static int nxp_c45_probe(struct phy_device *phydev)
>  		priv->mii_ts.ts_info = nxp_c45_ts_info;
>  		phydev->mii_ts = &priv->mii_ts;
>  		ret = nxp_c45_init_ptp_clock(priv);
> +
> +		/* Timestamp selected by default to keep legacy API */
> +		phydev->default_timestamp = true;
>  	} else {
>  		phydev_dbg(phydev, "PTP support not enabled even if the phy supports it");
>  	}
> diff --git a/include/linux/phy.h b/include/linux/phy.h
> index c2dda21b39e1..9a31243e9f7e 100644
> --- a/include/linux/phy.h
> +++ b/include/linux/phy.h
> @@ -607,6 +607,8 @@ struct macsec_ops;
>   *                 handling shall be postponed until PHY has resumed
>   * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
>   *             requiring a rerun of the interrupt handler after resume
> + * @default_timestamp: Flag indicating whether we are using the phy
> + *		       timestamp as the default one
>   * @interface: enum phy_interface_t value
>   * @possible_interfaces: bitmap if interface modes that the attached PHY
>   *			 will switch between depending on media speed.
> @@ -672,6 +674,8 @@ struct phy_device {
>  	unsigned irq_suspended:1;
>  	unsigned irq_rerun:1;
>  
> +	unsigned default_timestamp:1;
> +
>  	int rate_matching;
>  
>  	enum phy_state state;
> @@ -1613,6 +1617,19 @@ static inline void phy_txtstamp(struct phy_device *phydev, struct sk_buff *skb,
>  	phydev->mii_ts->txtstamp(phydev->mii_ts, skb, type);
>  }
>  
> +/**
> + * phy_is_default_hwtstamp - return true if phy is the default hw timestamp
> + * @phydev: Pointer to phy_device
> + *
> + * This is used to get default timestamping device taking into account
> + * the new API choice, which is selecting the timestamping from MAC by
> + * default if the phydev does not have default_timestamp flag enabled.
> + */
> +static inline bool phy_is_default_hwtstamp(struct phy_device *phydev)
> +{
> +	return phy_has_hwtstamp(phydev) && phydev->default_timestamp;
> +}
> +
>  /**
>   * phy_is_internal - Convenience function for testing if a PHY is internal
>   * @phydev: the phy_device struct
> diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
> index 847254fd7f13..3342834597cd 100644
> --- a/net/core/dev_ioctl.c
> +++ b/net/core/dev_ioctl.c
> @@ -260,9 +260,7 @@ static int dev_eth_ioctl(struct net_device *dev,
>   * @dev: Network device
>   * @cfg: Timestamping configuration structure
>   *
> - * Helper for enforcing a common policy that phylib timestamping, if available,
> - * should take precedence in front of hardware timestamping provided by the
> - * netdev.
> + * Helper for calling the default hardware provider timestamping.
>   *
>   * Note: phy_mii_ioctl() only handles SIOCSHWTSTAMP (not SIOCGHWTSTAMP), and
>   * there only exists a phydev->mii_ts->hwtstamp() method. So this will return
> @@ -272,7 +270,7 @@ static int dev_eth_ioctl(struct net_device *dev,
>  int dev_get_hwtstamp_phylib(struct net_device *dev,
>  			    struct kernel_hwtstamp_config *cfg)
>  {
> -	if (phy_has_hwtstamp(dev->phydev))
> +	if (phy_is_default_hwtstamp(dev->phydev))
>  		return phy_hwtstamp_get(dev->phydev, cfg);
>  
>  	return dev->netdev_ops->ndo_hwtstamp_get(dev, cfg);
> @@ -329,7 +327,7 @@ int dev_set_hwtstamp_phylib(struct net_device *dev,
>  			    struct netlink_ext_ack *extack)
>  {
>  	const struct net_device_ops *ops = dev->netdev_ops;
> -	bool phy_ts = phy_has_hwtstamp(dev->phydev);
> +	bool phy_ts = phy_is_default_hwtstamp(dev->phydev);
>  	struct kernel_hwtstamp_config old_cfg = {};
>  	bool changed = false;
>  	int err;
> diff --git a/net/core/timestamping.c b/net/core/timestamping.c
> index 04840697fe79..891bfc2f62fd 100644
> --- a/net/core/timestamping.c
> +++ b/net/core/timestamping.c
> @@ -25,7 +25,10 @@ void skb_clone_tx_timestamp(struct sk_buff *skb)
>  	struct sk_buff *clone;
>  	unsigned int type;
>  
> -	if (!skb->sk)
> +	if (!skb->sk || !skb->dev)
> +		return;
> +
> +	if (!phy_is_default_hwtstamp(skb->dev->phydev))

Really minor but any reason to not just keep the conditional chaining
with a single if statement?

>  		return;
>  
>  	type = classify(skb);
> @@ -47,7 +50,10 @@ bool skb_defer_rx_timestamp(struct sk_buff *skb)
>  	struct mii_timestamper *mii_ts;
>  	unsigned int type;
>  
> -	if (!skb->dev || !skb->dev->phydev || !skb->dev->phydev->mii_ts)
> +	if (!skb->dev)
> +		return false;
> +
> +	if (!phy_is_default_hwtstamp(skb->dev->phydev))

Same here

  if (!skb->dev || !phy_is_default_hwtstamp(skb->dev->phydev))

>  		return false;
>  
>  	if (skb_headroom(skb) < ETH_HLEN)
> diff --git a/net/ethtool/common.c b/net/ethtool/common.c
> index ce486cec346c..e56bde53cd5c 100644
> --- a/net/ethtool/common.c
> +++ b/net/ethtool/common.c
> @@ -637,7 +637,7 @@ int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
>  	memset(info, 0, sizeof(*info));
>  	info->cmd = ETHTOOL_GET_TS_INFO;
>  
> -	if (phy_has_tsinfo(phydev))
> +	if (phy_is_default_hwtstamp(phydev) && phy_has_tsinfo(phydev))
>  		return phy_ts_info(phydev, info);
>  	if (ops->get_ts_info)
>  		return ops->get_ts_info(dev, info);

--
Thanks,

Rahul Rameshbabu
  
Florian Fainelli Feb. 16, 2024, 6:52 p.m. UTC | #2
On 2/16/24 07:52, Kory Maincent wrote:
> Change the API to select MAC default time stamping instead of the PHY.
> Indeed the PHY is closer to the wire therefore theoretically it has less
> delay than the MAC timestamping but the reality is different. Due to lower
> time stamping clock frequency, latency in the MDIO bus and no PHC hardware
> synchronization between different PHY, the PHY PTP is often less precise
> than the MAC. The exception is for PHY designed specially for PTP case but
> these devices are not very widespread. For not breaking the compatibility
> default_timestamp flag has been introduced in phy_device that is set by
> the phy driver to know we are using the old API behavior.
> 
> Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
> ---
> 
> Changes in v5:
> - Extract the API change in this patch.
> - Rename whitelist to allowlist.
> - Set NETDEV_TIMESTAMPING in register_netdevice function.
> - Add software timestamping case description in ts_info.
> 
> Change in v6:
> - Replace the allowlist phy with a default_timestamp flag to know which
>    phy is using old API behavior.
> - Fix dereferenced of a possible null pointer.
> - Follow timestamping layer naming update.
> - Update timestamp default set between MAC and software.
> - Update ts_info returned in case of software timestamping.
> 
> Change in v8:
> - Reform the implementation to use a simple phy_is_default_hwtstamp helper
>    instead of saving the hwtstamp in the net_device struct.
> ---
>   drivers/net/phy/bcm-phy-ptp.c     |  3 +++
>   drivers/net/phy/dp83640.c         |  3 +++
>   drivers/net/phy/micrel.c          |  6 ++++++
>   drivers/net/phy/mscc/mscc_ptp.c   |  3 +++
>   drivers/net/phy/nxp-c45-tja11xx.c |  3 +++
>   include/linux/phy.h               | 17 +++++++++++++++++
>   net/core/dev_ioctl.c              |  8 +++-----
>   net/core/timestamping.c           | 10 ++++++++--
>   net/ethtool/common.c              |  2 +-
>   9 files changed, 47 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/phy/bcm-phy-ptp.c b/drivers/net/phy/bcm-phy-ptp.c
> index 617d384d4551..d3e825c951ee 100644
> --- a/drivers/net/phy/bcm-phy-ptp.c
> +++ b/drivers/net/phy/bcm-phy-ptp.c
> @@ -931,6 +931,9 @@ struct bcm_ptp_private *bcm_ptp_probe(struct phy_device *phydev)
>   		return ERR_CAST(clock);
>   	priv->ptp_clock = clock;
>   
> +	/* Timestamp selected by default to keep legacy API */
> +	phydev->default_timestamp = true;
> +
>   	priv->phydev = phydev;
>   	bcm_ptp_init(priv);
>   
> diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
> index 5c42c47dc564..64fd1a109c0f 100644
> --- a/drivers/net/phy/dp83640.c
> +++ b/drivers/net/phy/dp83640.c
> @@ -1450,6 +1450,9 @@ static int dp83640_probe(struct phy_device *phydev)
>   	phydev->mii_ts = &dp83640->mii_ts;
>   	phydev->priv = dp83640;
>   
> +	/* Timestamp selected by default to keep legacy API */
> +	phydev->default_timestamp = true;
> +

This probably does not matter too much given that the mii_ts is not 
visible until we fully probed the PHY, though for consistency and to be 
on the safe side, it would be more prudent to set default_timestamp 
before finishing the mii_ts assignment, in case we ever become more 
aggressive at exposing objects to user-space/kernel-space. Probably
overthinking this.

More comments below:

[snip]

>   
> -	if (!skb->dev || !skb->dev->phydev || !skb->dev->phydev->mii_ts)
> +	if (!skb->dev)
> +		return false;
> +
> +	if (!phy_is_default_hwtstamp(skb->dev->phydev))

Was not obvious that we could remove the phydev NULL check, but it's 
fine because phy_is_default_hwtstamp() calls phy_has_hwtstamp() first, 
and that function has that check. Seems a bit brittle, but fair enough.
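
The NULL-safety argument above can be sketched in userspace. This is a minimal mock, not kernel code: the structures are reduced to the fields involved, the mii_timestamper callback signature is simplified, and dummy_hwtstamp and phy_default_hwtstamp_selfcheck are hypothetical names used only for illustration. The body of phy_has_hwtstamp is assumed to follow the NULL-tolerant shape described in the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel structures; only the fields
 * relevant to the check are modelled. */
struct mii_timestamper {
	int (*hwtstamp)(void);	/* placeholder for the real callback */
};

struct phy_device {
	struct mii_timestamper *mii_ts;
	unsigned default_timestamp:1;
};

/* Assumed shape: a NULL phydev, a missing mii_ts, or a missing
 * hwtstamp callback all report "no hardware timestamping". */
static inline bool phy_has_hwtstamp(struct phy_device *phydev)
{
	return phydev && phydev->mii_ts && phydev->mii_ts->hwtstamp;
}

/* Same body as the helper added by the patch: the && short-circuits,
 * so phydev is never dereferenced when it is NULL. */
static inline bool phy_is_default_hwtstamp(struct phy_device *phydev)
{
	return phy_has_hwtstamp(phydev) && phydev->default_timestamp;
}

/* Hypothetical callback, present only so the mock PHY has one. */
static int dummy_hwtstamp(void)
{
	return 0;
}

/* Hypothetical self-check covering the three interesting cases. */
void phy_default_hwtstamp_selfcheck(void)
{
	struct mii_timestamper ts = { .hwtstamp = dummy_hwtstamp };
	struct phy_device legacy = { .mii_ts = &ts, .default_timestamp = 1 };
	struct phy_device newer = { .mii_ts = &ts, .default_timestamp = 0 };

	assert(!phy_is_default_hwtstamp(NULL));		/* no PHY attached */
	assert(phy_is_default_hwtstamp(&legacy));	/* flag set by driver */
	assert(!phy_is_default_hwtstamp(&newer));	/* MAC stays default */
}
```

The safety of the callers in timestamping.c rests entirely on the order of the two operands in the helper, which is the brittleness mentioned above.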
  
Andrew Lunn Feb. 17, 2024, 5:07 p.m. UTC | #3
On Fri, Feb 16, 2024 at 10:09:36AM -0800, Rahul Rameshbabu wrote:
> 
> On Fri, 16 Feb, 2024 16:52:22 +0100 Kory Maincent <kory.maincent@bootlin.com> wrote:
> > Change the API to select MAC default time stamping instead of the PHY.
> > Indeed the PHY is closer to the wire therefore theoretically it has less
> > delay than the MAC timestamping but the reality is different. Due to lower
> > time stamping clock frequency, latency in the MDIO bus and no PHC hardware
> > synchronization between different PHY, the PHY PTP is often less precise
> > than the MAC. The exception is for PHY designed specially for PTP case but
> > these devices are not very widespread. For not breaking the compatibility
> > default_timestamp flag has been introduced in phy_device that is set by
> > the phy driver to know we are using the old API behavior.
> >
> > Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
> > ---
> 
> Overall, I agree with the motivation and reasoning behind the patch. It
> takes dedicated effort to build a good phy timestamping mechanism, so
> this approach is good. I do have a question though. In this patch if we
> set the phy as the default timestamp mechanism, does that mean for even
> non-PTP applications, the phy will be used for timestamping when
> hardware timestamping is enabled? If so, I think this might need some
> thought because there are timing applications in general when a
> timestamp closest to the MAC layer would be best.

Could you give some examples? It seems odd to me, the application
wants a less accurate timestamp?

Is it more about overheads? A MAC timestamp might be less costly than
a PHY timestamp?

Or is the application not actually doing PTP, it does not care about
the time of the packet on the wire, but it is more about media access
control? Maybe the applications you are talking about are misusing the
PTP API for something it's not intended for?

    Andrew
  
Rahul Rameshbabu Feb. 17, 2024, 9:05 p.m. UTC | #4
On Sat, 17 Feb, 2024 18:07:31 +0100 Andrew Lunn <andrew@lunn.ch> wrote:
> On Fri, Feb 16, 2024 at 10:09:36AM -0800, Rahul Rameshbabu wrote:
>> 
>> On Fri, 16 Feb, 2024 16:52:22 +0100 Kory Maincent <kory.maincent@bootlin.com> wrote:
>> > Change the API to select MAC default time stamping instead of the PHY.
>> > Indeed the PHY is closer to the wire therefore theoretically it has less
>> > delay than the MAC timestamping but the reality is different. Due to lower
>> > time stamping clock frequency, latency in the MDIO bus and no PHC hardware
>> > synchronization between different PHY, the PHY PTP is often less precise
>> > than the MAC. The exception is for PHY designed specially for PTP case but
>> > these devices are not very widespread. For not breaking the compatibility
>> > default_timestamp flag has been introduced in phy_device that is set by
>> > the phy driver to know we are using the old API behavior.
>> >
>> > Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
>> > ---
>> 
>> Overall, I agree with the motivation and reasoning behind the patch. It
>> takes dedicated effort to build a good phy timestamping mechanism, so
>> this approach is good. I do have a question though. In this patch if we
>> set the phy as the default timestamp mechanism, does that mean for even
>> non-PTP applications, the phy will be used for timestamping when
>> hardware timestamping is enabled? If so, I think this might need some
>> thought because there are timing applications in general when a
>> timestamp closest to the MAC layer would be best.
>
> Could you give some examples? It seems odd to me, the application
> wants a less accurate timestamp?
>
> Is it more about overheads? A MAC timestamp might be less costly than
> a PHY timestamp?

It's a combination of both though I think primarily about line rate.
This point is somewhat carried over from the previous discussions on
this patch series in the last revision. I assume the device in question
here cannot timestamp at the PHY at a high rate.

  https://lore.kernel.org/netdev/20231120093723.4d88fb2a@kernel.org/

>
> Or is the application not actually doing PTP, it does not care about
> the time of the packet on the wire, but it is more about media access
> control? Maybe the applications you are talking about are misusing the
> PTP API for something its not intended?

So hardware timestamping is not a PTP-specific API or application, right?
It's purely a socket option that is not tied to PTP (unless I am missing
something here).

  https://docs.kernel.org/networking/timestamping.html#timestamp-generation

So you could use this information for other applications like congestion
control where you do not want to limit the line rate using the PHY
timestamping mechanism.
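
As a rough illustration of the socket option referred to above, any application (PTP or not) can ask for hardware timestamps roughly as follows. This is a minimal sketch: enable_hw_timestamping and hw_timestamping_selfcheck are hypothetical helper names, while the SOF_TIMESTAMPING_* flags and SO_TIMESTAMPING are the ones described in the kernel timestamping document linked above.

```c
#include <assert.h>
#include <linux/net_tstamp.h>
#include <sys/socket.h>

/* Ask the kernel to report hardware RX/TX timestamps on this socket.
 * Returns 0 on success, -1 on error (errno is set by setsockopt). */
int enable_hw_timestamping(int fd)
{
	int flags = SOF_TIMESTAMPING_RX_HARDWARE |
		    SOF_TIMESTAMPING_TX_HARDWARE |
		    SOF_TIMESTAMPING_RAW_HARDWARE;

	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}

/* Hypothetical self-check: an invalid descriptor must be rejected. */
void hw_timestamping_selfcheck(void)
{
	assert(enable_hw_timestamping(-1) == -1);
}
```

Nothing in this call names PTP, and nothing in it says whether the timestamp is taken at the MAC or at the PHY. The device itself still has to be configured separately (for example through SIOCSHWTSTAMP) before hardware timestamps are actually generated.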

In mlx5, we only steer PTP traffic to our PHY timestamping mechanism
through traffic matching logic.

  https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h?id=a6e0cb150c514efba4aaba4069927de43d80bb59#n71

This is because we do not want to apply PHY/port timestamping to
timing-related applications such as congestion control. I think it makes
sense for specialized timestamping applications to instead use the ethtool
ioctl to reconfigure the device to use PHY timestamps, if the device is
capable of PHY timestamping. (So have the change be in userspace
application tools like linuxptp, where precise but relatively low-rate
timestamp information is ideal.)
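
To make the idea of steering only PTP traffic concrete, here is a minimal sketch of such a classifier. It is not the mlx5 implementation: struct pkt_keys, wants_port_timestamp and the SKETCH_* constants are hypothetical names, and the packet is assumed to be already parsed into host-order fields. The only facts relied on are the PTP-over-Ethernet EtherType (0x88F7) and the UDP port used for PTP event messages (319).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_ETHERTYPE_PTP	0x88F7	/* PTP directly over Ethernet */
#define SKETCH_IPPROTO_UDP	17
#define SKETCH_PTP_EVENT_PORT	319	/* UDP port for PTP event messages */

/* Hypothetical pre-parsed header fields, all in host byte order. */
struct pkt_keys {
	uint16_t ethertype;
	uint8_t ip_proto;
	uint16_t l4_dport;
};

/* Return true only for PTP event traffic; everything else keeps the
 * cheaper, higher-rate timestamping path. */
bool wants_port_timestamp(const struct pkt_keys *k)
{
	if (k->ethertype == SKETCH_ETHERTYPE_PTP)
		return true;

	return k->ip_proto == SKETCH_IPPROTO_UDP &&
	       k->l4_dport == SKETCH_PTP_EVENT_PORT;
}

/* Hypothetical self-check with one packet of each kind. */
void port_timestamp_selfcheck(void)
{
	struct pkt_keys l2_ptp = { .ethertype = 0x88F7 };
	struct pkt_keys udp_ptp = { .ethertype = 0x0800, .ip_proto = 17,
				    .l4_dport = 319 };
	struct pkt_keys tcp_web = { .ethertype = 0x0800, .ip_proto = 6,
				    .l4_dport = 443 };

	assert(wants_port_timestamp(&l2_ptp));
	assert(wants_port_timestamp(&udp_ptp));
	assert(!wants_port_timestamp(&tcp_web));
}
```

With a filter of this kind, bulk traffic never touches the slower timestamping path, so line rate is preserved for everything that is not PTP.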

--
Thanks,

Rahul Rameshbabu
  
Andrew Lunn Feb. 17, 2024, 10:10 p.m. UTC | #5
> > Could you give some examples? It seems odd to me, the application
> > wants a less accurate timestamp?
> >
> > Is it more about overheads? A MAC timestamp might be less costly than
> > a PHY timestamp?
> 
> It's a combination of both though I think primarily about line rate.
> This point is somewhat carried over from the previous discussions on
> this patch series in the last revision.

Sorry, i've not been keeping up with the discussion. That could also
mean whatever i say below is total nonsense!

> I assume the device in question
> here cannot timestamp at the PHY at a high rate.
> 
>   https://lore.kernel.org/netdev/20231120093723.4d88fb2a@kernel.org/
> 
> >
> > Or is the application not actually doing PTP, it does not care about
> > the time of the packet on the wire, but it is more about media access
> > control? Maybe the applications you are talking about are misusing the
> > PTP API for something its not intended?
> 
> So hardware timestamping is not a PTP specific API or application right?

Well, we have drivers/ptp. The IOCTL numbers are all PTP_XXXX. It
seems like the subsystem started life in order to support PTP. It is
not unusual for a subsystem to gain extra capabilities, and maybe PTP
timestamps can be used in a more general way than the PTP
protocol.

> It's purely a socket option that is not tied to PTP (unless I am missing
> something here).
> 
>   https://docs.kernel.org/networking/timestamping.html#timestamp-generation
> 
> So you could use this information for other applications like congestion
> control where you do not want to limit the line rate using the PHY
> timestamping mechanism.

I think the key API point here is, you need to separate PTP stamping
from other sorts of stamping. PTP stamping generally works better at
the lowest point. So PTP stamping could be PHY stamping. If the PHY
does not support PTP, or its implementation is poor, PTP stamping can
be performed at the MAC. There are plenty of MACs which support that.
So we need an API to configure where PTP stamping is performed.

I expect the socket option is more generic. It is more about, give me
a time stamp at a specific point in the stack. It is probably not
being used by PTP, it could be used for flow control, etc. We probably
need an API to configure what SOF_TIMESTAMPING_RX_HARDWARE actually
means. It could be the PHY time stamp, maybe the MAC timestamp. Same
for SOF_TIMESTAMPING_TX_HARDWARE, it could be the MAC, could be the
PHY. But whatever they mean, i expect they are separate from PTP.

> In mlx5, we only steering PTP traffic to our PHY timestamping mechanism
> through a traffic matching logic.
> 
>   https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h?id=a6e0cb150c514efba4aaba4069927de43d80bb59#n71
> 
> This is because we do not want to PHY/port timestamp timing related
> applications such as congestion control. I think it makes sense for
> specialized timestamping applications to instead use the ethtool ioctl
> to reconfigure using the PHY timestamps if the device is capable of PHY
> timestamping. (So have the change be in userspace application tools like
> linuxptp where precise but low <relative> rate timestamp information is
> ideal).

I would expect linuxptp is only interested in the PTP timestamp. It
might be interested where the stamp is coming from, PHY or MAC, but it
probably does not care too much, it just assumes the time stamp is
good for PTP. But i would expect linuxptp has no interest in what the
generic socket options are doing.

    Andrew
  
Köry Maincent Feb. 19, 2024, 1:29 p.m. UTC | #6
On Fri, 16 Feb 2024 10:09:36 -0800
Rahul Rameshbabu <rrameshbabu@nvidia.com> wrote:

> On Fri, 16 Feb, 2024 16:52:22 +0100 Kory Maincent <kory.maincent@bootlin.com>
> wrote:
> > Change the API to select MAC default time stamping instead of the PHY.
> > Indeed the PHY is closer to the wire therefore theoretically it has less
> > delay than the MAC timestamping but the reality is different. Due to lower
> > time stamping clock frequency, latency in the MDIO bus and no PHC hardware
> > synchronization between different PHY, the PHY PTP is often less precise
> > than the MAC. The exception is for PHY designed specially for PTP case but
> > these devices are not very widespread. For not breaking the compatibility
> > default_timestamp flag has been introduced in phy_device that is set by
> > the phy driver to know we are using the old API behavior.
> >
> > Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
> > ---  
> 
> Overall, I agree with the motivation and reasoning behind the patch. It
> takes dedicated effort to build a good phy timestamping mechanism, so
> this approach is good. I do have a question though. In this patch if we
> set the phy as the default timestamp mechanism, does that mean for even
> non-PTP applications, the phy will be used for timestamping when
> hardware timestamping is enabled? If so, I think this might need some
> thought because there are timing applications in general when a
> timestamp closest to the MAC layer would be best.

This patch comes from a request from Russell due to incompatibility between MAC
and PHY timestamping when both were supported.
https://lore.kernel.org/netdev/Y%2F4DZIDm1d74MuFJ@shell.armlinux.org.uk/

His point was adding PTP support to a PHY driver would select timestamp from it
by default even if we had a better timestamp with the MAC which is often the
case. This is an unwanted behavior.
https://lore.kernel.org/netdev/Y%2F6Cxf6EAAg22GOL@shell.armlinux.org.uk/

In fact, with the new support of NDOs hwtstamp and the
dev_get/set_hwtstamp_phylib functions, alongside this series which makes
timestamping selectable, changing the default timestamp may not be
necessary anymore.

Russell, any thoughts on this?

Regards,
  
Russell King (Oracle) Feb. 19, 2024, 4:11 p.m. UTC | #7
On Mon, Feb 19, 2024 at 02:29:36PM +0100, Köry Maincent wrote:
> On Fri, 16 Feb 2024 10:09:36 -0800
> Rahul Rameshbabu <rrameshbabu@nvidia.com> wrote:
> 
> > On Fri, 16 Feb, 2024 16:52:22 +0100 Kory Maincent <kory.maincent@bootlin.com>
> > wrote:
> > > Change the API to select MAC default time stamping instead of the PHY.
> > > Indeed the PHY is closer to the wire therefore theoretically it has less
> > > delay than the MAC timestamping but the reality is different. Due to lower
> > > time stamping clock frequency, latency in the MDIO bus and no PHC hardware
> > > synchronization between different PHY, the PHY PTP is often less precise
> > > than the MAC. The exception is for PHY designed specially for PTP case but
> > > these devices are not very widespread. For not breaking the compatibility
> > > default_timestamp flag has been introduced in phy_device that is set by
> > > the phy driver to know we are using the old API behavior.
> > >
> > > Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
> > > ---  
> > 
> > Overall, I agree with the motivation and reasoning behind the patch. It
> > takes dedicated effort to build a good phy timestamping mechanism, so
> > this approach is good. I do have a question though. In this patch if we
> > set the phy as the default timestamp mechanism, does that mean for even
> > non-PTP applications, the phy will be used for timestamping when
> > hardware timestamping is enabled? If so, I think this might need some
> > thought because there are timing applications in general when a
> > timestamp closest to the MAC layer would be best.
> 
> This patch comes from a request from Russell due to incompatibility between MAC
> and PHY timestamping when both were supported.
> https://lore.kernel.org/netdev/Y%2F4DZIDm1d74MuFJ@shell.armlinux.org.uk/
> 
> His point was adding PTP support to a PHY driver would select timestamp from it
> by default even if we had a better timestamp with the MAC which is often the
> case. This is an unwanted behavior.
> https://lore.kernel.org/netdev/Y%2F6Cxf6EAAg22GOL@shell.armlinux.org.uk/
> 
> In fact, with the new support of NDOs hwtstamp and the
> dev_get/set_hwtstamp_phylib functions, alongside this series which make
> timestamp selectable, changing the default timestamp may be not necessary
> anymore.
> 
> Russell any thought about it? 

My position remains: in the case of Marvell PP2 network driver with a
Marvell PHY, when we add PTP support for the Marvell PHYs (I have had
patches for it for years) then we must _not_ regress the existing
setup where the PP2 timestamps are the default.
  
Köry Maincent Feb. 20, 2024, 4:20 p.m. UTC | #8
On Mon, 19 Feb 2024 16:11:16 +0000
"Russell King (Oracle)" <linux@armlinux.org.uk> wrote:

> On Mon, Feb 19, 2024 at 02:29:36PM +0100, Köry Maincent wrote:
> > On Fri, 16 Feb 2024 10:09:36 -0800
> > Rahul Rameshbabu <rrameshbabu@nvidia.com> wrote:
> >   
> > > On Fri, 16 Feb, 2024 16:52:22 +0100 Kory Maincent
> > > <kory.maincent@bootlin.com> wrote:  
> > > > Change the API to select MAC default time stamping instead of the PHY.
> > > > Indeed the PHY is closer to the wire therefore theoretically it has less
> > > > delay than the MAC timestamping but the reality is different. Due to
> > > > lower time stamping clock frequency, latency in the MDIO bus and no PHC
> > > > hardware synchronization between different PHY, the PHY PTP is often
> > > > less precise than the MAC. The exception is for PHY designed specially
> > > > for PTP case but these devices are not very widespread. For not
> > > > breaking the compatibility default_timestamp flag has been introduced
> > > > in phy_device that is set by the phy driver to know we are using the
> > > > old API behavior.
> > > >
> > > > Signed-off-by: Kory Maincent <kory.maincent@bootlin.com>
> > > > ---    
> > > 
> > > Overall, I agree with the motivation and reasoning behind the patch. It
> > > takes dedicated effort to build a good phy timestamping mechanism, so
> > > this approach is good. I do have a question though. In this patch if we
> > > set the phy as the default timestamp mechanism, does that mean for even
> > > non-PTP applications, the phy will be used for timestamping when
> > > hardware timestamping is enabled? If so, I think this might need some
> > > thought because there are timing applications in general when a
> > > timestamp closest to the MAC layer would be best.  
> > 
> > This patch comes from a request from Russell due to incompatibility between
> > MAC and PHY timestamping when both were supported.
> > https://lore.kernel.org/netdev/Y%2F4DZIDm1d74MuFJ@shell.armlinux.org.uk/
> > 
> > His point was adding PTP support to a PHY driver would select timestamp
> > from it by default even if we had a better timestamp with the MAC which is
> > often the case. This is an unwanted behavior.
> > https://lore.kernel.org/netdev/Y%2F6Cxf6EAAg22GOL@shell.armlinux.org.uk/
> > 
> > In fact, with the new support of NDOs hwtstamp and the
> > dev_get/set_hwtstamp_phylib functions, alongside this series which makes
> > timestamping selectable, changing the default timestamp may no longer be
> > necessary.
> > 
> > Russell any thought about it?   
> 
> My position remains: in the case of Marvell PP2 network driver with a
> Marvell PHY, when we add PTP support for the Marvell PHYs (I have
> patches for it for years) then we must _not_ regress the existing
> setup where the PP2 timestamps are the default.

Yes, that's what I thought.
Regarding PTP support for the Marvell PHYs, I have a few fixes for it, but we
can discuss them once this series gets merged.

Regards,
  
Rahul Rameshbabu Feb. 20, 2024, 8:17 p.m. UTC | #9
On Sat, 17 Feb, 2024 23:10:07 +0100 Andrew Lunn <andrew@lunn.ch> wrote:
>> > Could you give some examples? It seems odd to me, the application
>> > wants a less accurate timestamp?
>> >
>> > Is it more about overheads? A MAC timestamp might be less costly than
>> > a PHY timestamp?
>> 
>> It's a combination of both though I think primarily about line rate.
>> This point is somewhat carried over from the previous discussions on
>> this patch series in the last revision.
>
> Sorry, i've not been keeping up with the discussion. That could also
> mean whatever i say below is total nonsense!
>

No worries. I could also be off here. I am mostly using mlx5 for my
perspective here and I think Kory and Russell sent some feedback that
likely confirms that this patch makes sense. Will reply to that in a
bit.

>> I assume the device in question
>> here cannot timestamp at the PHY at a high rate.
>> 
>>   https://lore.kernel.org/netdev/20231120093723.4d88fb2a@kernel.org/
>> 
>> >
>> > Or is the application not actually doing PTP, it does not care about
>> > the time of the packet on the wire, but it is more about media access
>> > control? Maybe the applications you are talking about are misusing the
>> > PTP API for something it's not intended for?
>> 
>> So hardware timestamping is not a PTP specific API or application right?
>
> Well, we have drivers/ptp. The IOCTL numbers are all PTP_XXXX. It
> seems like the subsystem started life in order to support PTP. It is
> not unusual for a subsystem to gain extra capabilities, and maybe PTP
> timestamps can be used in a more general way than the PTP
> protocol.
>

This is a great point to bring up. I think the PTP related ioctls can be
confusing. Rather than calling them PTP ioctls, I think it would be best
to call them PHC ioctls, where PHC stands for PTP hardware clock. These
ioctls are more about controlling the local PTP clock devices rather
than handling timestamps sent/received via the PTP protocol.

https://docs.kernel.org/driver-api/ptp.html
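
To illustrate the distinction, here is a minimal sketch of a PHC ioctl in
use. It only queries the clock device and has nothing to do with how packets
get timestamped. The helper name phc_get_caps and the device path passed to
it are assumptions made for this example; PTP_CLOCK_GETCAPS and struct
ptp_clock_caps come from <linux/ptp_clock.h>.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/ptp_clock.h>

/*
 * Query the capabilities of a PHC character device (for example /dev/ptp0).
 * Returns 0 on success, -1 if the device cannot be opened or queried.
 * phc_get_caps() is a hypothetical helper, not an existing library call.
 */
int phc_get_caps(const char *path, struct ptp_clock_caps *caps)
{
	int fd, ret;

	assert(path != NULL && caps != NULL);
	memset(caps, 0, sizeof(*caps));

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	/* This talks to the clock itself, not to any timestamping path. */
	ret = ioctl(fd, PTP_CLOCK_GETCAPS, caps);
	close(fd);

	return ret < 0 ? -1 : 0;
}
```

On success, fields such as caps->max_adj and caps->n_ext_ts describe the
clock; none of them says where in the stack a packet timestamp is taken.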

We can look at the ptp4l source code and see that PTP does indeed depend
on the more generic hardware timestamping socket options.

  https://github.com/richardcochran/linuxptp/blob/f271257b799d390d9ec09d5c7dafb7f10a3bd99b/sk.c#L559

Again, I do know the ioctls can be confusing. The ioctls tend to be more
about adjusting the PHCs rather than controlling the timestamping flow
if that makes sense.
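
As a rough sketch of the socket-level side (not code taken from linuxptp),
requesting hardware timestamps on a socket could look like the following.
The function names are assumptions for this example; SO_TIMESTAMPING and the
SOF_TIMESTAMPING_* flags are the generic options described in
Documentation/networking/timestamping.rst. Note that this only asks for
timestamps to be reported on the socket. The device itself still has to be
configured separately, for instance through SIOCSHWTSTAMP.

```c
#include <assert.h>
#include <sys/socket.h>
#include <linux/net_tstamp.h>

/* Flags requesting raw hardware timestamps for both TX and RX packets. */
int hw_timestamping_flags(void)
{
	return SOF_TIMESTAMPING_TX_HARDWARE |
	       SOF_TIMESTAMPING_RX_HARDWARE |
	       SOF_TIMESTAMPING_RAW_HARDWARE;
}

/*
 * Ask the kernel to report hardware timestamps on an already-open socket.
 * Returns the setsockopt() result: 0 on success, -1 with errno set on error.
 * Nothing here is PTP specific; any application can issue the same request.
 */
int enable_hw_timestamping(int fd)
{
	int flags = hw_timestamping_flags();

	assert(fd >= 0);
	return setsockopt(fd, SOL_SOCKET, SO_TIMESTAMPING,
			  &flags, sizeof(flags));
}
```

Which layer (MAC or PHY) ends up producing those hardware timestamps is not
visible in this request, which is the gap the series is trying to address.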

>> It's purely a socket option that is not tied to PTP (unless I am missing
>> something here).
>> 
>>   https://docs.kernel.org/networking/timestamping.html#timestamp-generation
>> 
>> So you could use this information for other applications like congestion
>> control where you do not want to limit the line rate using the PHY
>> timestamping mechanism.
>
> I think the key API point here is, you need to separate PTP stamping
> from other sorts of stamping. PTP stamping generally works better at
> the lowest point. So PTP stamping could be PHY stamping. If the PHY
> does not support PTP, or its implementation is poor, PTP stamping can
> be performed at the MAC. There are plenty of MACs which support that.
> So we need an API to configure where PTP stamping is performed.
>

I actually agree with this to a degree. However, I do think it is
sensible for applications that understand their properties to explicitly
select the timestamping layer in the application initialization as well.
I think because of the impact on line rate, the MAC/DMA layer makes
sense as the default.
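
For reference, the default selection this patch proposes can be modelled in
user space roughly as follows. This is a simplified sketch, not the kernel
code: struct phy_model, enum ts_provider and default_ts_provider are
hypothetical names, and the two booleans stand in for phy_has_hwtstamp() and
the new phydev->default_timestamp flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two phy_device properties that matter. */
struct phy_model {
	bool has_hwtstamp;      /* models phy_has_hwtstamp(phydev) */
	bool default_timestamp; /* models phydev->default_timestamp */
};

enum ts_provider { TS_PROVIDER_MAC, TS_PROVIDER_PHY };

/*
 * The PHY is the default provider only when it can timestamp AND its
 * driver opted in to the legacy behaviour; otherwise the MAC is used.
 * A NULL pointer models a netdev with no PHY attached.
 */
enum ts_provider default_ts_provider(const struct phy_model *phy)
{
	if (phy && phy->has_hwtstamp && phy->default_timestamp)
		return TS_PROVIDER_PHY;
	return TS_PROVIDER_MAC;
}

/* Small usage example covering the cases discussed in this thread. */
void default_ts_provider_examples(void)
{
	struct phy_model legacy_phy = { true, true };  /* e.g. an opted-in PHY */
	struct phy_model new_phy    = { true, false }; /* PTP added later */

	assert(default_ts_provider(&legacy_phy) == TS_PROVIDER_PHY);
	assert(default_ts_provider(&new_phy) == TS_PROVIDER_MAC);
	assert(default_ts_provider(NULL) == TS_PROVIDER_MAC);
}
```

The second case is the one Russell describes: adding PTP support to a PHY
driver later on does not take the default away from the MAC.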

> I expect the socket option is more generic. It is more about, give me
> a time stamp at a specific point in the stack. It is probably not
> being used by PTP, it could be used for flow control, etc. We probably
> need an API to configure what SOF_TIMESTAMPING_RX_HARDWARE actually
> means. It could be the PHY time stamp, maybe the MAC timestamp. Same
> for SOF_TIMESTAMPING_TX_HARDWARE, it could be the MAC, could be the
> PHY. But whatever they mean, i expect they are separate from PTP.
>

As I linked above, the socket options are being utilized by the linuxptp
userspace stack.

>> In mlx5, we only steer PTP traffic to our PHY timestamping mechanism
>> through traffic matching logic.
>> 
>>   https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/tree/drivers/net/ethernet/mellanox/mlx5/core/en/ptp.h?id=a6e0cb150c514efba4aaba4069927de43d80bb59#n71
>> 
>> This is because we do not want to PHY/port timestamp timing related
>> applications such as congestion control. I think it makes sense for
>> specialized timestamping applications to instead use the ethtool ioctl
>> to reconfigure using the PHY timestamps if the device is capable of PHY
>> timestamping. (So have the change be in userspace application tools like
>> linuxptp where precise but low <relative> rate timestamp information is
>> ideal).
>
> I would expect linuxptp is only interested in the PTP timestamp. It
> might be interested where the stamp is coming from, PHY or MAC, but it
> probably does not care too much, it just assumes the time stamp is
> good for PTP. But i would expect linuxptp has no interest in what the
> generic socket options are doing.

From the userspace perspective, there is no special interface just for PTP
when it comes to generating and receiving timestamping events. That said,
maybe a dedicated interface would make sense, as opposed to having the
userspace stack just make use of the generic timestamping options. I am
slightly against doing something special for the PTP implementation, since I
do think it is a typical timestamping application that simply uses different
configuration parameters (like which timestamping layer to select).

--
Thanks,

Rahul Rameshbabu
  

Patch

diff --git a/drivers/net/phy/bcm-phy-ptp.c b/drivers/net/phy/bcm-phy-ptp.c
index 617d384d4551..d3e825c951ee 100644
--- a/drivers/net/phy/bcm-phy-ptp.c
+++ b/drivers/net/phy/bcm-phy-ptp.c
@@ -931,6 +931,9 @@  struct bcm_ptp_private *bcm_ptp_probe(struct phy_device *phydev)
 		return ERR_CAST(clock);
 	priv->ptp_clock = clock;
 
+	/* Timestamp selected by default to keep legacy API */
+	phydev->default_timestamp = true;
+
 	priv->phydev = phydev;
 	bcm_ptp_init(priv);
 
diff --git a/drivers/net/phy/dp83640.c b/drivers/net/phy/dp83640.c
index 5c42c47dc564..64fd1a109c0f 100644
--- a/drivers/net/phy/dp83640.c
+++ b/drivers/net/phy/dp83640.c
@@ -1450,6 +1450,9 @@  static int dp83640_probe(struct phy_device *phydev)
 	phydev->mii_ts = &dp83640->mii_ts;
 	phydev->priv = dp83640;
 
+	/* Timestamp selected by default to keep legacy API */
+	phydev->default_timestamp = true;
+
 	spin_lock_init(&dp83640->rx_lock);
 	skb_queue_head_init(&dp83640->rx_queue);
 	skb_queue_head_init(&dp83640->tx_queue);
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index 9b6973581989..1c9eba331b01 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -3177,6 +3177,9 @@  static void lan8814_ptp_init(struct phy_device *phydev)
 	ptp_priv->mii_ts.ts_info  = lan8814_ts_info;
 
 	phydev->mii_ts = &ptp_priv->mii_ts;
+
+	/* Timestamp selected by default to keep legacy API */
+	phydev->default_timestamp = true;
 }
 
 static int lan8814_ptp_probe_once(struct phy_device *phydev)
@@ -4613,6 +4616,9 @@  static int lan8841_probe(struct phy_device *phydev)
 
 	phydev->mii_ts = &ptp_priv->mii_ts;
 
+	/* Timestamp selected by default to keep legacy API */
+	phydev->default_timestamp = true;
+
 	return 0;
 }
 
diff --git a/drivers/net/phy/mscc/mscc_ptp.c b/drivers/net/phy/mscc/mscc_ptp.c
index eb0b032cb613..e66d20eff7c4 100644
--- a/drivers/net/phy/mscc/mscc_ptp.c
+++ b/drivers/net/phy/mscc/mscc_ptp.c
@@ -1570,6 +1570,9 @@  int vsc8584_ptp_probe(struct phy_device *phydev)
 		return PTR_ERR(vsc8531->load_save);
 	}
 
+	/* Timestamp selected by default to keep legacy API */
+	phydev->default_timestamp = true;
+
 	vsc8531->ptp->phydev = phydev;
 
 	return 0;
diff --git a/drivers/net/phy/nxp-c45-tja11xx.c b/drivers/net/phy/nxp-c45-tja11xx.c
index 3cf614b4cd52..d18c133e6013 100644
--- a/drivers/net/phy/nxp-c45-tja11xx.c
+++ b/drivers/net/phy/nxp-c45-tja11xx.c
@@ -1660,6 +1660,9 @@  static int nxp_c45_probe(struct phy_device *phydev)
 		priv->mii_ts.ts_info = nxp_c45_ts_info;
 		phydev->mii_ts = &priv->mii_ts;
 		ret = nxp_c45_init_ptp_clock(priv);
+
+		/* Timestamp selected by default to keep legacy API */
+		phydev->default_timestamp = true;
 	} else {
 		phydev_dbg(phydev, "PTP support not enabled even if the phy supports it");
 	}
diff --git a/include/linux/phy.h b/include/linux/phy.h
index c2dda21b39e1..9a31243e9f7e 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -607,6 +607,8 @@  struct macsec_ops;
  *                 handling shall be postponed until PHY has resumed
  * @irq_rerun: Flag indicating interrupts occurred while PHY was suspended,
  *             requiring a rerun of the interrupt handler after resume
+ * @default_timestamp: Flag indicating whether we are using the phy
+ *		       timestamp as the default one
  * @interface: enum phy_interface_t value
  * @possible_interfaces: bitmap if interface modes that the attached PHY
  *			 will switch between depending on media speed.
@@ -672,6 +674,8 @@  struct phy_device {
 	unsigned irq_suspended:1;
 	unsigned irq_rerun:1;
 
+	unsigned default_timestamp:1;
+
 	int rate_matching;
 
 	enum phy_state state;
@@ -1613,6 +1617,19 @@  static inline void phy_txtstamp(struct phy_device *phydev, struct sk_buff *skb,
 	phydev->mii_ts->txtstamp(phydev->mii_ts, skb, type);
 }
 
+/**
+ * phy_is_default_hwtstamp - return true if phy is the default hw timestamp
+ * @phydev: Pointer to phy_device
+ *
+ * This is used to get default timestamping device taking into account
+ * the new API choice, which is selecting the timestamping from MAC by
+ * default if the phydev does not have default_timestamp flag enabled.
+ */
+static inline bool phy_is_default_hwtstamp(struct phy_device *phydev)
+{
+	return phy_has_hwtstamp(phydev) && phydev->default_timestamp;
+}
+
 /**
  * phy_is_internal - Convenience function for testing if a PHY is internal
  * @phydev: the phy_device struct
diff --git a/net/core/dev_ioctl.c b/net/core/dev_ioctl.c
index 847254fd7f13..3342834597cd 100644
--- a/net/core/dev_ioctl.c
+++ b/net/core/dev_ioctl.c
@@ -260,9 +260,7 @@  static int dev_eth_ioctl(struct net_device *dev,
  * @dev: Network device
  * @cfg: Timestamping configuration structure
  *
- * Helper for enforcing a common policy that phylib timestamping, if available,
- * should take precedence in front of hardware timestamping provided by the
- * netdev.
+ * Helper for calling the default hardware provider timestamping.
  *
  * Note: phy_mii_ioctl() only handles SIOCSHWTSTAMP (not SIOCGHWTSTAMP), and
  * there only exists a phydev->mii_ts->hwtstamp() method. So this will return
@@ -272,7 +270,7 @@  static int dev_eth_ioctl(struct net_device *dev,
 int dev_get_hwtstamp_phylib(struct net_device *dev,
 			    struct kernel_hwtstamp_config *cfg)
 {
-	if (phy_has_hwtstamp(dev->phydev))
+	if (phy_is_default_hwtstamp(dev->phydev))
 		return phy_hwtstamp_get(dev->phydev, cfg);
 
 	return dev->netdev_ops->ndo_hwtstamp_get(dev, cfg);
@@ -329,7 +327,7 @@  int dev_set_hwtstamp_phylib(struct net_device *dev,
 			    struct netlink_ext_ack *extack)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
-	bool phy_ts = phy_has_hwtstamp(dev->phydev);
+	bool phy_ts = phy_is_default_hwtstamp(dev->phydev);
 	struct kernel_hwtstamp_config old_cfg = {};
 	bool changed = false;
 	int err;
diff --git a/net/core/timestamping.c b/net/core/timestamping.c
index 04840697fe79..891bfc2f62fd 100644
--- a/net/core/timestamping.c
+++ b/net/core/timestamping.c
@@ -25,7 +25,10 @@  void skb_clone_tx_timestamp(struct sk_buff *skb)
 	struct sk_buff *clone;
 	unsigned int type;
 
-	if (!skb->sk)
+	if (!skb->sk || !skb->dev)
+		return;
+
+	if (!phy_is_default_hwtstamp(skb->dev->phydev))
 		return;
 
 	type = classify(skb);
@@ -47,7 +50,10 @@  bool skb_defer_rx_timestamp(struct sk_buff *skb)
 	struct mii_timestamper *mii_ts;
 	unsigned int type;
 
-	if (!skb->dev || !skb->dev->phydev || !skb->dev->phydev->mii_ts)
+	if (!skb->dev)
+		return false;
+
+	if (!phy_is_default_hwtstamp(skb->dev->phydev))
 		return false;
 
 	if (skb_headroom(skb) < ETH_HLEN)
diff --git a/net/ethtool/common.c b/net/ethtool/common.c
index ce486cec346c..e56bde53cd5c 100644
--- a/net/ethtool/common.c
+++ b/net/ethtool/common.c
@@ -637,7 +637,7 @@  int __ethtool_get_ts_info(struct net_device *dev, struct ethtool_ts_info *info)
 	memset(info, 0, sizeof(*info));
 	info->cmd = ETHTOOL_GET_TS_INFO;
 
-	if (phy_has_tsinfo(phydev))
+	if (phy_is_default_hwtstamp(phydev) && phy_has_tsinfo(phydev))
 		return phy_ts_info(phydev, info);
 	if (ops->get_ts_info)
 		return ops->get_ts_info(dev, info);