[v2] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)

Message ID ce4b1442135fe03d0de41859b04b268c88c854a3.1707498577.git.robin.murphy@arm.com
State New
Series [v2] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)

Commit Message

Robin Murphy Feb. 9, 2024, 5:11 p.m. UTC
  From: Ilkka Koskinen <ilkka@os.amperecomputing.com>

The AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
report an incorrect child count. The affected crosspoints report 8 children
when they have only two.

When the driver tries to access the nonexistent child nodes, it believes it
has reached an invalid node type and probing fails. The workaround is to
ignore those incorrect child nodes and continue normally.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
[ rm: rewrote simpler generalised version ]
Tested-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Signed-off-by: Robin Murphy <robin.murphy@arm.com>
---
 drivers/perf/arm-cmn.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
  

Comments

Will Deacon Feb. 9, 2024, 6:31 p.m. UTC | #1
On Fri, 9 Feb 2024 17:11:09 +0000, Robin Murphy wrote:
> From: Ilkka Koskinen <ilkka@os.amperecomputing.com>
> 
> The AmpereOneX mesh implementation has a bug in HN-P nodes that makes them
> report an incorrect child count. The affected crosspoints report 8 children
> when they have only two.
> 
> When the driver tries to access the nonexistent child nodes, it believes it
> has reached an invalid node type and probing fails. The workaround is to
> ignore those incorrect child nodes and continue normally.
> 
> [...]

Applied to arm64 (for-next/fixes), thanks!

[1/1] perf/arm-cmn: Workaround AmpereOneX errata AC04_MESH_1 (incorrect child count)
      https://git.kernel.org/arm64/c/50572064ec71

Cheers,
  

Patch

diff --git a/drivers/perf/arm-cmn.c b/drivers/perf/arm-cmn.c
index c584165b13ba..7e3aa7e2345f 100644
--- a/drivers/perf/arm-cmn.c
+++ b/drivers/perf/arm-cmn.c
@@ -2305,6 +2305,17 @@  static int arm_cmn_discover(struct arm_cmn *cmn, unsigned int rgn_offset)
 				dev_dbg(cmn->dev, "ignoring external node %llx\n", reg);
 				continue;
 			}
+			/*
+			 * AmpereOneX erratum AC04_MESH_1 makes some XPs report a bogus
+			 * child count larger than the number of valid child pointers.
+			 * A child offset of 0 can only occur on CMN-600; otherwise it
+			 * would imply the root node being its own grandchild, which
+			 * we can safely dismiss in general.
+			 */
+			if (reg == 0 && cmn->part != PART_CMN600) {
+				dev_dbg(cmn->dev, "bogus child pointer?\n");
+				continue;
+			}
 
 			arm_cmn_init_node_info(cmn, reg & CMN_CHILD_NODE_ADDR, dn);
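
The check added in the hunk above can be modelled outside the kernel as a small
standalone sketch. It is not the driver code: the `sketch_` names and the
two-value part enum are hypothetical, and it assumes that the surplus child
pointer registers on an affected XP read back as zero, which is what the
`reg == 0` test in the patch suggests.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's part identifiers (illustrative only). */
enum sketch_part { SKETCH_PART_CMN600, SKETCH_PART_OTHER };

/*
 * A zero child pointer is treated as bogus on everything except CMN-600,
 * mirroring the condition `reg == 0 && cmn->part != PART_CMN600`.
 */
static bool sketch_is_bogus_child(uint64_t reg, enum sketch_part part)
{
	return reg == 0 && part != SKETCH_PART_CMN600;
}

/*
 * Walk the child pointers an XP claims to have and count the ones that
 * would still be probed after the bogus entries are skipped.
 */
static size_t sketch_count_valid_children(const uint64_t *child_ptrs,
					  size_t reported_count,
					  enum sketch_part part)
{
	size_t valid = 0;

	assert(child_ptrs != NULL || reported_count == 0);

	for (size_t i = 0; i < reported_count; i++) {
		if (sketch_is_bogus_child(child_ptrs[i], part))
			continue;	/* same role as the `continue` in the patch */
		valid++;
	}
	return valid;
}
```

Under those assumptions, an XP that reports 8 children but holds two non-zero
pointers followed by six zeros yields a count of 2 for a non-CMN-600 part,
matching the "8 reported, two real" case from the commit message. With
`SKETCH_PART_CMN600` nothing is filtered, since a zero offset is legitimate there.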