drivers/perf: hisi: Schedule perf session according to locality

Message ID 20230808125147.2080-1-yangyicong@huawei.com
State New
Headers
Series drivers/perf: hisi: Schedule perf session according to locality |

Commit Message

Yicong Yang Aug. 8, 2023, 12:51 p.m. UTC
  From: Yicong Yang <yangyicong@hisilicon.com>

The PCIe PMUs locate on different NUMA node but currently we don't
consider it and likely stack all the sessions on the same CPU:

[root@localhost tmp]# cat /sys/devices/hisi_pcie*/cpumask
0
0
0
0
0
0

This can be optimize a bit to use a local CPU for the PMU.

Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
---
 drivers/perf/hisilicon/hisi_pcie_pmu.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)
  

Comments

Will Deacon Aug. 11, 2023, 11:14 a.m. UTC | #1
On Tue, Aug 08, 2023 at 08:51:47PM +0800, Yicong Yang wrote:
> From: Yicong Yang <yangyicong@hisilicon.com>
> 
> The PCIe PMUs locate on different NUMA node but currently we don't
> consider it and likely stack all the sessions on the same CPU:
> 
> [root@localhost tmp]# cat /sys/devices/hisi_pcie*/cpumask
> 0
> 0
> 0
> 0
> 0
> 0
> 
> This can be optimize a bit to use a local CPU for the PMU.
> 
> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
> ---
>  drivers/perf/hisilicon/hisi_pcie_pmu.c | 15 ++++++++++++---
>  1 file changed, 12 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> index e10fc7cb9493..60ecf469782b 100644
> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
> @@ -665,7 +665,7 @@ static int hisi_pcie_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
>  	struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
>  
>  	if (pcie_pmu->on_cpu == -1) {
> -		pcie_pmu->on_cpu = cpu;
> +		pcie_pmu->on_cpu = cpumask_local_spread(0, dev_to_node(&pcie_pmu->pdev->dev));
>  		WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(cpu)));

Hmm, this is a bit weird now, because the interrupt is affine to a different
CPU from the one you've chosen. Are you sure that's ok? When the offline
notifier picks a new target, it moves the irq as well.

Will
  
Yicong Yang Aug. 15, 2023, 11:02 a.m. UTC | #2
On 2023/8/11 19:14, Will Deacon wrote:
> On Tue, Aug 08, 2023 at 08:51:47PM +0800, Yicong Yang wrote:
>> From: Yicong Yang <yangyicong@hisilicon.com>
>>
>> The PCIe PMUs locate on different NUMA node but currently we don't
>> consider it and likely stack all the sessions on the same CPU:
>>
>> [root@localhost tmp]# cat /sys/devices/hisi_pcie*/cpumask
>> 0
>> 0
>> 0
>> 0
>> 0
>> 0
>>
>> This can be optimize a bit to use a local CPU for the PMU.
>>
>> Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
>> ---
>>  drivers/perf/hisilicon/hisi_pcie_pmu.c | 15 ++++++++++++---
>>  1 file changed, 12 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> index e10fc7cb9493..60ecf469782b 100644
>> --- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> +++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
>> @@ -665,7 +665,7 @@ static int hisi_pcie_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
>>  	struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
>>  
>>  	if (pcie_pmu->on_cpu == -1) {
>> -		pcie_pmu->on_cpu = cpu;
>> +		pcie_pmu->on_cpu = cpumask_local_spread(0, dev_to_node(&pcie_pmu->pdev->dev));
>>  		WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(cpu)));
> 
> Hmm, this is a bit weird now, because the interrupt is affine to a different
> CPU from the one you've chosen. Are you sure that's ok? When the offline
> notifier picks a new target, it moves the irq as well.
> 

Thanks for pointing it out. This is indeed a problem. Will fix this.

Thanks.
  

Patch

diff --git a/drivers/perf/hisilicon/hisi_pcie_pmu.c b/drivers/perf/hisilicon/hisi_pcie_pmu.c
index e10fc7cb9493..60ecf469782b 100644
--- a/drivers/perf/hisilicon/hisi_pcie_pmu.c
+++ b/drivers/perf/hisilicon/hisi_pcie_pmu.c
@@ -665,7 +665,7 @@  static int hisi_pcie_pmu_online_cpu(unsigned int cpu, struct hlist_node *node)
 	struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
 
 	if (pcie_pmu->on_cpu == -1) {
-		pcie_pmu->on_cpu = cpu;
+		pcie_pmu->on_cpu = cpumask_local_spread(0, dev_to_node(&pcie_pmu->pdev->dev));
 		WARN_ON(irq_set_affinity(pcie_pmu->irq, cpumask_of(cpu)));
 	}
 
@@ -676,14 +676,23 @@  static int hisi_pcie_pmu_offline_cpu(unsigned int cpu, struct hlist_node *node)
 {
 	struct hisi_pcie_pmu *pcie_pmu = hlist_entry_safe(node, struct hisi_pcie_pmu, node);
 	unsigned int target;
+	cpumask_t mask;
+	int numa_node;
 
 	/* Nothing to do if this CPU doesn't own the PMU */
 	if (pcie_pmu->on_cpu != cpu)
 		return 0;
 
 	pcie_pmu->on_cpu = -1;
-	/* Choose a new CPU from all online cpus. */
-	target = cpumask_any_but(cpu_online_mask, cpu);
+
+	/* Choose a local CPU from all online cpus. */
+	numa_node = dev_to_node(&pcie_pmu->pdev->dev);
+	if (cpumask_and(&mask, cpumask_of_node(numa_node), cpu_online_mask) &&
+	    cpumask_andnot(&mask, &mask, cpumask_of(cpu)))
+		target = cpumask_any(&mask);
+	else
+		target = cpumask_any_but(cpu_online_mask, cpu);
+
 	if (target >= nr_cpu_ids) {
 		pci_err(pcie_pmu->pdev, "There is no CPU to set\n");
 		return 0;