[net-next,v2,5/5] net: bcmgenet: Interrogate PHY for WAKE_FILTER programming

Message ID 20231026224509.112353-6-florian.fainelli@broadcom.com
State New
Headers
Series WAKE_FILTER for Broadcom PHY (v2) |

Commit Message

Florian Fainelli Oct. 26, 2023, 10:45 p.m. UTC
  Determine whether the PHY can support waking up from the user programmed
network filter, and if it can utilize it.

Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>>
---
 drivers/net/ethernet/broadcom/genet/bcmgenet.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)
  

Comments

Jacob Keller Oct. 26, 2023, 11:23 p.m. UTC | #1
On 10/26/2023 3:45 PM, Florian Fainelli wrote:
> Determine whether the PHY can support waking up from the user programmed
> network filter, and if it can utilize it.
> 

Here, you're passing through to phy_ethtool_set_rxnfc, basically
allowing the lower device to program the wakeup filter if its supported. Ok.

This almost feels like it would belong generally in the higher level
ethtool code rather than in the driver?

Thanks,
Jake
  
Florian Fainelli Oct. 26, 2023, 11:52 p.m. UTC | #2
On 10/26/23 16:23, Jacob Keller wrote:
> 
> 
> On 10/26/2023 3:45 PM, Florian Fainelli wrote:
>> Determine whether the PHY can support waking up from the user programmed
>> network filter, and if it can utilize it.
>>
> 
> Here, you're passing through to phy_ethtool_set_rxnfc, basically
> allowing the lower device to program the wakeup filter if its supported. Ok.
> 
> This almost feels like it would belong generally in the higher level
> ethtool code rather than in the driver?

Agreed, as Doug just pointed out to me, there is still an open question 
about reconciling the PHY and the MAC RXNFC spaces into a single 
ethtool_rxnfc structure.

An ideal goal is to have zero modifications to neither the MAC or the 
PHY drivers such that they can both work in their own spaces as if they 
were alone, or combined.

I suppose that if we get the number of supported rules from the MAC 
first, and then get the supported number of rules from the PHY next, we 
could do something like this:

rule index
| 0|
| .| -> MAC rules
|15|
|16| -> PHY rule

and each of the MAC or the PHY {get,set}_rxnfc() operate within a base 
rule number which is relative to their own space. So the MAC driver 
would continue to care about its (max..first) - base (0) range, and the 
PHY would care about (max..first) - base (16).

Though then the issue is discoverability, how do you know which rule 
location is backed by which hardware block. We could create an 
intermediate and inert rule at index 16 for instance that acts as a 
delimiter?

Or we could create yet another RX_CLS_LOC_* value that is "special" and 
can denote whether of the MAC or the PHY we should be targeting 
whichever is supported, but that does not usually lend itself to being 
logically ORed with the existing RX_CLS_LOC_* values. WDYT?

pw-bot: cr
  
Jacob Keller Oct. 27, 2023, 4:55 p.m. UTC | #3
On 10/26/2023 4:52 PM, Florian Fainelli wrote:
> On 10/26/23 16:23, Jacob Keller wrote:
>>
>>
>> On 10/26/2023 3:45 PM, Florian Fainelli wrote:
>>> Determine whether the PHY can support waking up from the user programmed
>>> network filter, and if it can utilize it.
>>>
>>
>> Here, you're passing through to phy_ethtool_set_rxnfc, basically
>> allowing the lower device to program the wakeup filter if its supported. Ok.
>>
>> This almost feels like it would belong generally in the higher level
>> ethtool code rather than in the driver?
> 
> Agreed, as Doug just pointed out to me, there is still an open question 
> about reconciling the PHY and the MAC RXNFC spaces into a single 
> ethtool_rxnfc structure.
> 
> An ideal goal is to have zero modifications to neither the MAC or the 
> PHY drivers such that they can both work in their own spaces as if they 
> were alone, or combined.
> 
> I suppose that if we get the number of supported rules from the MAC 
> first, and then get the supported number of rules from the PHY next, we 
> could do something like this:
> 
> rule index
> | 0|
> | .| -> MAC rules
> |15|
> |16| -> PHY rule
> 
> and each of the MAC or the PHY {get,set}_rxnfc() operate within a base 
> rule number which is relative to their own space. So the MAC driver 
> would continue to care about its (max..first) - base (0) range, and the 
> PHY would care about (max..first) - base (16).
> 
> Though then the issue is discoverability, how do you know which rule 
> location is backed by which hardware block. We could create an 
> intermediate and inert rule at index 16 for instance that acts as a 
> delimiter?
> 
> Or we could create yet another RX_CLS_LOC_* value that is "special" and 
> can denote whether of the MAC or the PHY we should be targeting 
> whichever is supported, but that does not usually lend itself to being 
> logically ORed with the existing RX_CLS_LOC_* values. WDYT?
> 
> pw-bot: cr

Ah, yea there is a lot of complexity to consider here.

I'm not entirely sure what we should do here. What about extending with
another attribute entirely instead of another bit in RX_CLS_LOC?
  
Florian Fainelli Oct. 27, 2023, 5:15 p.m. UTC | #4
On 10/27/23 09:55, Jacob Keller wrote:
> 
> 
> On 10/26/2023 4:52 PM, Florian Fainelli wrote:
>> On 10/26/23 16:23, Jacob Keller wrote:
>>>
>>>
>>> On 10/26/2023 3:45 PM, Florian Fainelli wrote:
>>>> Determine whether the PHY can support waking up from the user programmed
>>>> network filter, and if it can utilize it.
>>>>
>>>
>>> Here, you're passing through to phy_ethtool_set_rxnfc, basically
>>> allowing the lower device to program the wakeup filter if its supported. Ok.
>>>
>>> This almost feels like it would belong generally in the higher level
>>> ethtool code rather than in the driver?
>>
>> Agreed, as Doug just pointed out to me, there is still an open question
>> about reconciling the PHY and the MAC RXNFC spaces into a single
>> ethtool_rxnfc structure.
>>
>> An ideal goal is to have zero modifications to neither the MAC or the
>> PHY drivers such that they can both work in their own spaces as if they
>> were alone, or combined.
>>
>> I suppose that if we get the number of supported rules from the MAC
>> first, and then get the supported number of rules from the PHY next, we
>> could do something like this:
>>
>> rule index
>> | 0|
>> | .| -> MAC rules
>> |15|
>> |16| -> PHY rule
>>
>> and each of the MAC or the PHY {get,set}_rxnfc() operate within a base
>> rule number which is relative to their own space. So the MAC driver
>> would continue to care about its (max..first) - base (0) range, and the
>> PHY would care about (max..first) - base (16).
>>
>> Though then the issue is discoverability, how do you know which rule
>> location is backed by which hardware block. We could create an
>> intermediate and inert rule at index 16 for instance that acts as a
>> delimiter?
>>
>> Or we could create yet another RX_CLS_LOC_* value that is "special" and
>> can denote whether of the MAC or the PHY we should be targeting
>> whichever is supported, but that does not usually lend itself to being
>> logically ORed with the existing RX_CLS_LOC_* values. WDYT?
>>
>> pw-bot: cr
> 
> Ah, yea there is a lot of complexity to consider here.

Yes this is only the tip of iceberg! Here is hopefully a better 
description of our particular system where this is being requested (the 
fact there is a single one also makes me question the entire effort, but 
anyway). We have 2 distinct system sleep modes:

- akin to ACPI S2 where the Ethernet PHY and MAC remain enabled and both 
can be used for Wake-on-LAN filtering, with the MAC being more capable 
than the PHY. System power consumption is just around 500mW at the wall. 
In that case it would make sense to leverage the MAC's capability 
because it is better and would lead to fewer false wake-ups

- akin to ACPI S3 where the Ethernet PHY only remains enabled, the MAC 
is powered off (as is most of the SoC), but we have limited Wake-on-LAN 
capability in the form of network filter as we can only match on a 
custom MAC DA + mask. System power consumption is closer to 350mW at the 
wall.

My users are not really willing to use the broad WAKE_MCAST because they 
want to match specifically on mDNS over IPv4 (or IPv6), so they prefer 
to program an exact match to limit the amount of false wake-ups. 
Arguably there will already be quite a lot in home network due to 
phones, IoT devices, and whatnot.

 From an user perspective they would know which system standby state is 
being entered so one could imagine that ahead of entry, we could 
configure either the MAC, or the PHY when targeting S2, or just the PHY 
when targeting S3. This implies that we can selectively target one 
entity, or the other.

For the current time being, and knowing the use case of my users, 
directing all of the Wake-on-LAN configuration towards the PHY would be 
enough IMHO, even if that means we stop leveraging the MAC capabilities, 
hence this patch series.

> 
> I'm not entirely sure what we should do here. What about extending with
> another attribute entirely instead of another bit in RX_CLS_LOC?

Yes possibly, or we just target different objects, right now we have 
visibility into the MACs via the net_device, it seems like we ought to 
be able to target some ethtool APIs towards PHY objects, which currently 
have no netlink representation. There is on-going work to bridge that gap:

https://lore.kernel.org/netdev/ffc6ff4a-d1af-4643-a538-fd13e6be9e06@lunn.ch/T/

but I am not sure we will reach an agreement any time soon. Maybe I can 
convince my masters to wait for that to land and use WAKE_MCAST in the 
meantime.

I would not necessary want to invent a new set of ethtool commands and 
kernel APIs such that we could do the below examples, though maybe this 
is not incompatible with the work being done by Maxime:

# Target the Ethernet MAC
ethtool -N eth0 flow-type ether dst 01:00:5e:00:00:fb loc 0 action -2 # 
Assumes MAC by default
ethtool -N eth0 flow-type ether dst 01:00:5e:00:00:fb loc 0 action -2 
target mac

# Target the Ethernet PHY, if capable
ethtool -N eth0 flow-type ether dst 01:00:5e:00:00:fb loc 0 action -2 
target phy

# Enable WAKE_FILTER at the MAC level
ethtool -s eth0 wol f # assumes MAC by default
ethtool -s eth0 wol f target mac

# Enable WAKE_FILTER at the PHY level, if capable
ethtool -s eth0 wol f target phy

though maybe this is the much needed addition to ethtool so we can be 
more selective.

After a bunch of candies on Tuesday I might reach a state of trance and 
figure which way to proceed :D
  
Jacob Keller Oct. 27, 2023, 5:36 p.m. UTC | #5
On 10/27/2023 10:15 AM, Florian Fainelli wrote:
> On 10/27/23 09:55, Jacob Keller wrote:
>>
>>
>> On 10/26/2023 4:52 PM, Florian Fainelli wrote:
>>> On 10/26/23 16:23, Jacob Keller wrote:
>>>>
>>>>
>>>> On 10/26/2023 3:45 PM, Florian Fainelli wrote:
>>>>> Determine whether the PHY can support waking up from the user programmed
>>>>> network filter, and if it can utilize it.
>>>>>
>>>>
>>>> Here, you're passing through to phy_ethtool_set_rxnfc, basically
>>>> allowing the lower device to program the wakeup filter if its supported. Ok.
>>>>
>>>> This almost feels like it would belong generally in the higher level
>>>> ethtool code rather than in the driver?
>>>
>>> Agreed, as Doug just pointed out to me, there is still an open question
>>> about reconciling the PHY and the MAC RXNFC spaces into a single
>>> ethtool_rxnfc structure.
>>>
>>> An ideal goal is to have zero modifications to neither the MAC or the
>>> PHY drivers such that they can both work in their own spaces as if they
>>> were alone, or combined.
>>>
>>> I suppose that if we get the number of supported rules from the MAC
>>> first, and then get the supported number of rules from the PHY next, we
>>> could do something like this:
>>>
>>> rule index
>>> | 0|
>>> | .| -> MAC rules
>>> |15|
>>> |16| -> PHY rule
>>>
>>> and each of the MAC or the PHY {get,set}_rxnfc() operate within a base
>>> rule number which is relative to their own space. So the MAC driver
>>> would continue to care about its (max..first) - base (0) range, and the
>>> PHY would care about (max..first) - base (16).
>>>
>>> Though then the issue is discoverability, how do you know which rule
>>> location is backed by which hardware block. We could create an
>>> intermediate and inert rule at index 16 for instance that acts as a
>>> delimiter?
>>>
>>> Or we could create yet another RX_CLS_LOC_* value that is "special" and
>>> can denote whether of the MAC or the PHY we should be targeting
>>> whichever is supported, but that does not usually lend itself to being
>>> logically ORed with the existing RX_CLS_LOC_* values. WDYT?
>>>
>>> pw-bot: cr
>>
>> Ah, yea there is a lot of complexity to consider here.
> 
> Yes this is only the tip of iceberg! Here is hopefully a better 
> description of our particular system where this is being requested (the 
> fact there is a single one also makes me question the entire effort, but 
> anyway). We have 2 distinct system sleep modes:
> 
> - akin to ACPI S2 where the Ethernet PHY and MAC remain enabled and both 
> can be used for Wake-on-LAN filtering, with the MAC being more capable 
> than the PHY. System power consumption is just around 500mW at the wall. 
> In that case it would make sense to leverage the MAC's capability 
> because it is better and would lead to fewer false wake-ups
> 
> - akin to ACPI S3 where the Ethernet PHY only remains enabled, the MAC 
> is powered off (as is most of the SoC), but we have limited Wake-on-LAN 
> capability in the form of network filter as we can only match on a 
> custom MAC DA + mask. System power consumption is closer to 350mW at the 
> wall.
> 
> My users are not really willing to use the broad WAKE_MCAST because they 
> want to match specifically on mDNS over IPv4 (or IPv6), so they prefer 
> to program an exact match to limit the amount of false wake-ups. 
> Arguably there will already be quite a lot in home network due to 
> phones, IoT devices, and whatnot.
> 
>  From an user perspective they would know which system standby state is 
> being entered so one could imagine that ahead of entry, we could 
> configure either the MAC, or the PHY when targeting S2, or just the PHY 
> when targeting S3. This implies that we can selectively target one 
> entity, or the other.
> 
> For the current time being, and knowing the use case of my users, 
> directing all of the Wake-on-LAN configuration towards the PHY would be 
> enough IMHO, even if that means we stop leveraging the MAC capabilities, 
> hence this patch series.
> 

Right.

>>
>> I'm not entirely sure what we should do here. What about extending with
>> another attribute entirely instead of another bit in RX_CLS_LOC?
> 
> Yes possibly, or we just target different objects, right now we have 
> visibility into the MACs via the net_device, it seems like we ought to 
> be able to target some ethtool APIs towards PHY objects, which currently 
> have no netlink representation. There is on-going work to bridge that gap:
> 
> https://lore.kernel.org/netdev/ffc6ff4a-d1af-4643-a538-fd13e6be9e06@lunn.ch/T/
> 
> but I am not sure we will reach an agreement any time soon. Maybe I can 
> convince my masters to wait for that to land and use WAKE_MCAST in the 
> meantime.
> 

Sure, but this obviously costs a potentially significant amount of extra
power, and it would be better to avoid that.

> I would not necessary want to invent a new set of ethtool commands and 
> kernel APIs such that we could do the below examples, though maybe this 
> is not incompatible with the work being done by Maxime:
> 
> # Target the Ethernet MAC
> ethtool -N eth0 flow-type ether dst 01:00:5e:00:00:fb loc 0 action -2 # 
> Assumes MAC by default
> ethtool -N eth0 flow-type ether dst 01:00:5e:00:00:fb loc 0 action -2 
> target mac
> 
> # Target the Ethernet PHY, if capable
> ethtool -N eth0 flow-type ether dst 01:00:5e:00:00:fb loc 0 action -2 
> target phy
> 
> # Enable WAKE_FILTER at the MAC level
> ethtool -s eth0 wol f # assumes MAC by default
> ethtool -s eth0 wol f target mac
> 
> # Enable WAKE_FILTER at the PHY level, if capable
> ethtool -s eth0 wol f target phy
> 
> though maybe this is the much needed addition to ethtool so we can be 
> more selective.
> 
> After a bunch of candies on Tuesday I might reach a state of trance and 
> figure which way to proceed :D

It does seem like an acceptable compromise here, and perhaps being
driver specific is ok, since this does depend a lot on the individual
device support, thus broadly applying this across all drivers could be
problematic.

I like the idea of being able to more precisely target the rules so that
its clear to userspace what is being done... but I also understand the
challenge of wanting to deliver what feels like a small win and being
asked to do something much larger.
  
Andrew Lunn Oct. 27, 2023, 6:31 p.m. UTC | #6
> It does seem like an acceptable compromise here, and perhaps being
> driver specific is ok, since this does depend a lot on the individual
> device support, thus broadly applying this across all drivers could be
> problematic.

The Marvell PHYs have a similar capability. Its actually more feature
rich. It allows upto 8 matches, each being of up to 128 bytes, and you
can enable/disable each byte within the 128 bytes. This would in fact
be better for Florian's use case, since it could match deeper into the
frame and reduce the false positive. But its a Marvell device...

My real point is, other hardware does have similar capabilities. Its
unclear if anybody else will ever actually need it, but we should try
to avoid a one device solution.

   Andrew
  

Patch

diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
index 9282403d1bf6..9d01c13552eb 100644
--- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c
+++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c
@@ -1524,6 +1524,14 @@  static int bcmgenet_set_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd)
 	struct bcmgenet_priv *priv = netdev_priv(dev);
 	int err = 0;
 
+	if (dev->phydev) {
+		err = phy_ethtool_set_rxnfc(dev->phydev, cmd);
+		if (err != -EOPNOTSUPP)
+			return err;
+
+		err = 0;
+	}
+
 	switch (cmd->cmd) {
 	case ETHTOOL_SRXCLSRLINS:
 		err = bcmgenet_insert_flow(dev, cmd);
@@ -1579,6 +1587,14 @@  static int bcmgenet_get_rxnfc(struct net_device *dev, struct ethtool_rxnfc *cmd,
 	int err = 0;
 	int i = 0;
 
+	if (dev->phydev) {
+		err = phy_ethtool_get_rxnfc(dev->phydev, cmd, rule_locs);
+		if (err != -EOPNOTSUPP)
+			return err;
+
+		err = 0;
+	}
+
 	switch (cmd->cmd) {
 	case ETHTOOL_GRXRINGS:
 		cmd->data = priv->hw_params->rx_queues ?: 1;