[net-next,0/3] Rework GENET MDIO controller clocking

Message ID 20240216184237.259954-1-florian.fainelli@broadcom.com
Headers
Series Rework GENET MDIO controller clocking |

Message

Florian Fainelli Feb. 16, 2024, 6:42 p.m. UTC
  This patch series reworks the way that we manage the GENET MDIO
controller clocks around I/O accesses. During testing with a fully
modular build where bcmgenet, mdio-bcm-unimac, and the Broadcom PHY
driver (broadcom) are all loaded as modules, with no particular care
being taken to order them to mimize deferred probing the following bus
error was obtained:

[    4.344831] printk: console [ttyS0] enabled
[    4.351102] 840d000.serial: ttyS1 at MMIO 0x840d000 (irq = 29, base_baud = 5062500) is a Broadcom BCM7271 UART
[    4.363110] 840e000.serial: ttyS2 at MMIO 0x840e000 (irq = 30, base_baud = 5062500) is a Broadcom BCM7271 UART
[    4.387392] iproc-rng200 8402000.rng: hwrng registered
[    4.398012] Consider using thermal netlink events interface
[    4.403717] brcmstb_thermal a581500.thermal: registered AVS TMON of-sensor driver
[    4.440085] bcmgenet 8f00000.ethernet: GENET 5.0 EPHY: 0x0000
[    4.482526] unimac-mdio unimac-mdio.0: Broadcom UniMAC MDIO bus
[    4.514019] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[    4.551304] SError Interrupt on CPU2, code 0x00000000bf000002 -- SError
[    4.551324] CPU: 2 PID: 8 Comm: kworker/u8:0 Not tainted 6.1.53-0.1pre-g5a26d98e908c #2
[    4.551330] Hardware name: BCM972180HB_V20 (DT)
[    4.551336] Workqueue: events_unbound deferred_probe_work_func
[    4.551363] pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[    4.551368] pc : el1_abort+0x2c/0x58
[    4.551376] lr : el1_abort+0x20/0x58
[    4.551379] sp : ffffffc00a383960
[    4.551380] x29: ffffffc00a383960 x28: ffffff80029fd780 x27: 0000000000000000
[    4.551385] x26: 0000000000000000 x25: ffffff8002839005 x24: ffffffc00a1f9bd0
[    4.551390] x23: 0000000040000005 x22: ffffffc000a48084 x21: ffffffc00a3dde14
[    4.551394] x20: 0000000096000210 x19: ffffffc00a3839a0 x18: 0000000000000579
[    4.551399] x17: 0000000000000000 x16: 0000000100000000 x15: ffffffc00a3838c0
[    4.551403] x14: 000000000000000a x13: 6e69622f7273752f x12: 3a6e6962732f7273
[    4.551408] x11: 752f3a6e69622f3a x10: 6e6962732f3d4854 x9 : ffffffc0086466a8
[    4.551412] x8 : ffffff80049ee100 x7 : ffffff8003231938 x6 : 0000000000000000
[    4.551416] x5 : 0000002200000000 x4 : ffffffc00a3839a0 x3 : 0000002000000000
[    4.551420] x2 : 0000000000000025 x1 : 0000000096000210 x0 : 0000000000000000
[    4.551429] Kernel panic - not syncing: Asynchronous SError Interrupt
[    4.551432] CPU: 2 PID: 8 Comm: kworker/u8:0 Not tainted 6.1.53-0.1pre-g5a26d98e908c #2
[    4.551435] Hardware name: BCM972180HB_V20 (DT)
[    4.551437] Workqueue: events_unbound deferred_probe_work_func
[    4.551443] Call trace:
[    4.551445]  dump_backtrace+0xe4/0x124
[    4.551452]  show_stack+0x1c/0x28
[    4.551455]  dump_stack_lvl+0x60/0x78
[    4.551462]  dump_stack+0x14/0x2c
[    4.551467]  panic+0x134/0x304
[    4.551472]  nmi_panic+0x50/0x70
[    4.551480]  arm64_serror_panic+0x70/0x7c
[    4.551484]  do_serror+0x2c/0x5c
[    4.551487]  el1h_64_error_handler+0x2c/0x40
[    4.551491]  el1h_64_error+0x64/0x68
[    4.551496]  el1_abort+0x2c/0x58
[    4.551499]  el1h_64_sync_handler+0x8c/0xb4
[    4.551502]  el1h_64_sync+0x64/0x68
[    4.551505]  unimac_mdio_readl.isra.0+0x4/0xc [mdio_bcm_unimac]
[    4.551519]  __mdiobus_read+0x2c/0x88
[    4.551526]  mdiobus_read+0x40/0x60
[    4.551530]  phy_read+0x18/0x20
[    4.551534]  bcm_phy_config_intr+0x20/0x84
[    4.551537]  phy_disable_interrupts+0x2c/0x3c
[    4.551543]  phy_probe+0x80/0x1b0
[    4.551545]  really_probe+0x1b8/0x390
[    4.551550]  __driver_probe_device+0x134/0x14c
[    4.551554]  driver_probe_device+0x40/0xf8
[    4.551559]  __device_attach_driver+0x108/0x11c
[    4.551563]  bus_for_each_drv+0xa4/0xcc
[    4.551567]  __device_attach+0xdc/0x190
[    4.551571]  device_initial_probe+0x18/0x20
[    4.551575]  bus_probe_device+0x34/0x94
[    4.551579]  deferred_probe_work_func+0xd4/0xe8
[    4.551583]  process_one_work+0x1ac/0x25c
[    4.551590]  worker_thread+0x1f4/0x260
[    4.551595]  kthread+0xc0/0xd0
[    4.551600]  ret_from_fork+0x10/0x20
[    4.551608] SMP: stopping secondary CPUs
[    4.551617] Kernel Offset: disabled
[    4.551619] CPU features: 0x00000,00c00080,0000420b
[    4.551622] Memory Limit: none
[    4.833838] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---

The issue here is that we managed to probe the GENET controller, the
mdio-bcm-unimac MDIO controller, but the PHY was still being held in a
probe deferral state because it depended upon a GPIO controller provider
not loaded yet. As soon as that provider is loaded however, the PHY
continues to probe, tries to disable the interrupts, and this causes a
MDIO transaction. That MDIO transaction requires I/O register accesses
within the GENET's larger block, and since its clocks are turned off,
the CPU gets a bus error signaled as a System Error.

The patch series takes the simplest approach of keeping the clocks
enabled just for the duration of the I/O accesses. This is also
beneficial to other drivers like bcmasp2 which make use of the same MDIO
controller driver.

Florian Fainelli (3):
  net: mdio: mdio-bcm-unimac: Manage clock around I/O accesses
  net: bcmgenet: Pass "main" clock down to the MDIO driver
  Revert "net: bcmgenet: Ensure MDIO unregistration has clocks enabled"

 drivers/net/ethernet/broadcom/genet/bcmmii.c  |  6 +-
 drivers/net/mdio/mdio-bcm-unimac.c            | 91 ++++++++++---------
 include/linux/platform_data/mdio-bcm-unimac.h |  3 +
 3 files changed, 55 insertions(+), 45 deletions(-)
  

Comments

Jacob Keller Feb. 16, 2024, 10:37 p.m. UTC | #1
On 2/16/2024 10:42 AM, Florian Fainelli wrote:
> Up until now we have managed not to have the mdio-bcm-unimac manage its
> clock except during probe and suspend/resume. This works most of the
> time, except where it does not.
> 

This made me chuckle :)

> With a fully modular build, we can get into a situation whereby the
> GENET driver is fully registered, and so is the mdio-bcm-unimac driver,
> however the Ethernet PHY driver is not yet, because it depends on a
> resource that is not yet available (e.g.: GPIO provider). In that state,
> the network device is not usable yet, and so to conserve power, the
> GENET driver will have turned off its "main" clock which feeds its MDIO
> controller.
> 
> When the PHY driver finally probes however, we make an access to the PHY
> registers to e.g.: disable interrupts, and this causes a bus error
> within the MDIO controller space because the MDIO controller clock(s)
> are turned off.
> 
> To remedy that, we manage the clock around all of the I/O accesses to
> the hardware which are done exclusively during read, write and clock
> divider configuration.
> 
> This ensures that the register space is accessible, and this also
> ensures that there are not unnecessarily elevated reference counts
> keeping the clocks active when the network device is administratively
> turned off. It would be the case with the previous way of managing the
> clock.
> 

Good description of the issues.

> Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
> ---

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
  
Jacob Keller Feb. 16, 2024, 10:41 p.m. UTC | #2
On 2/16/2024 10:42 AM, Florian Fainelli wrote:
> GENET has historically had to create a MDIO platform device for its
> controller and pass some auxiliary data to it, like a MDIO completion
> callback. Now we also pass the "main" clock to allow for the MDIO bus
> controller to manage that clock adequately around I/O accesses.
> 
> Signed-off-by: Florian Fainelli <florian.fainelli@broadcom.com>
> ---
>  drivers/net/ethernet/broadcom/genet/bcmmii.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/net/ethernet/broadcom/genet/bcmmii.c b/drivers/net/ethernet/broadcom/genet/bcmmii.c
> index cbbe004621bc..7a21950da77c 100644
> --- a/drivers/net/ethernet/broadcom/genet/bcmmii.c
> +++ b/drivers/net/ethernet/broadcom/genet/bcmmii.c
> @@ -476,6 +476,10 @@ static int bcmgenet_mii_register(struct bcmgenet_priv *priv)
>  	ppd.wait_func = bcmgenet_mii_wait;
>  	ppd.wait_func_data = priv;
>  	ppd.bus_name = "bcmgenet MII bus";
> +	/* Pass a reference to our "main" clock which is used for MDIO
> +	 * transfers
> +	 */
> +	ppd.clk = priv->clk;
>  
>  	/* Unimac MDIO bus controller starts at UniMAC offset + MDIO_CMD
>  	 * and is 2 * 32-bits word long, 8 bytes total.

Is this missing a modification of the header file to add the clk field
to struct unimac_mdio_pdata? I don't see that field in the
include/linux/platform_data/mdio-bcm-unimac.h header currently...

Oh. you included that in the first patch of the series. I see.

It feels like the series would be more natural of this was 1/3 instead
of 2/3, since the current 1/3 patch depends on this clk value being set, no?

The result of the series makes sense tho:

Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>