diff mbox

[v2,7/7] powerpc/powernv: nest pmu cpumask and cpu hotplug support

Message ID 1433999874-2043-8-git-send-email-maddy@linux.vnet.ibm.com (mailing list archive)
State Superseded
Headers show

Commit Message

maddy June 11, 2015, 5:17 a.m. UTC
Adds cpumask attribute to be used by each nest pmu since nest
units are per-chip. Only one cpu (first online cpu) from each node/chip
is designated to read counters.

On cpu hotplug, dying cpu is checked to see whether it is one of the
designated cpus, if yes, next online cpu from the same node/chip is designated
as new cpu to read counters.

Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Anton Blanchard <anton@samba.org>
Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
---
 arch/powerpc/perf/nest-pmu.c | 153 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 153 insertions(+)

Comments

Preeti U Murthy June 16, 2015, 6:28 a.m. UTC | #1
On 06/11/2015 10:47 AM, Madhavan Srinivasan wrote:
> Adds cpumask attribute to be used by each nest pmu since nest
> units are per-chip. Only one cpu (first online cpu) from each node/chip
> is designated to read counters.
> 
> On cpu hotplug, dying cpu is checked to see whether it is one of the
> designated cpus, if yes, next online cpu from the same node/chip is designated
> as new cpu to read counters.
> 
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Paul Mackerras <paulus@samba.org>
> Cc: Anton Blanchard <anton@samba.org>
> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
> Cc: Stephane Eranian <eranian@google.com>
> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
> Cc: Ingo Molnar <mingo@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
> ---
> +static void nest_change_cpu_context(int old_cpu, int new_cpu)
> +{
> +	int i;
> +
> +	if (new_cpu >= 0) {
> +		for (i = 0; per_nest_pmu_arr[i] != NULL; i++)
> +			perf_pmu_migrate_context(&per_nest_pmu_arr[i]->pmu,
> +							old_cpu, new_cpu);
> +	}
> +}
> +
> +static void nest_exit_cpu(int cpu)
> +{
> +	int i, nid, target = -1;
> +	const struct cpumask *l_cpumask;
> +	int src_chipid;
> +
> +	/*
> +	 * Check in the designated list for this cpu. Dont bother
> +	 * if not one of them.
> +	 */
> +	if (!cpumask_test_and_clear_cpu(cpu, &cpu_mask_nest_pmu))
> +		return;
> +
> +	/*
> +	 * Now that this cpu is one of the designated,
> +	 * find a new cpu a) which is not dying and

This comment is not right. nest_exit_cpu() is called in the hotplug
path, so another cpu cannot be dying in parallel. Hotplug operations are
done serially. The comment ought to be "a) which is online" instead.

> +	 * b) is in same node/chip.

node is not the same as a chip right ? And you are interested in cpus on
the same chip alone. So shouldn't the above comment be b) in the same chip ?

> +	 */
> +	nid = cpu_to_node(cpu);
> +	src_chipid = topology_physical_package_id(cpu);

What is the relation between a node and a chip ? Can't we have a
function which returns the cpumask of a chip straight away, since that
is what you seem to be more interested in ? You can then simply choose
the next cpu in this cpumask rather than going through the below loop.


> +	l_cpumask = cpumask_of_node(nid);
> +	for_each_cpu(i, l_cpumask) {
> +		if (i == cpu)
> +			continue;
> +		if (src_chipid == topology_physical_package_id(i)) {
> +			target = i;
> +			break;
> +		}
> +	}
> +
> +	/*
> +	 * Update the cpumask with the target cpu and
> +	 * migrate the context if needed
> +	 */
> +	if (target >= 0) {

You already check for target >= 0 here. So you don't need to check for
new_cpu >= 0 in nest_change_cpu_context() above ?

> +		cpumask_set_cpu(target, &cpu_mask_nest_pmu);
> +		nest_change_cpu_context (cpu, target);
> +	}
> +}
> +
> +static void nest_init_cpu(int cpu)
> +{
> +	int i, src_chipid;
> +
> +	/*
> +	 * Search for any existing designated thread from
> +	 * the incoming cpu's node/chip. If found, do nothing.
> +	 */
> +	src_chipid = topology_physical_package_id(cpu);
> +	for_each_cpu(i, &cpu_mask_nest_pmu)
> +		if (src_chipid == topology_physical_package_id(i))
> +			return;
> +
> +	/*
> +	 * Make incoming cpu as a designated thread for
> +	 * this node/chip
> +	 */
> +	cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);

Why can't we simply check if cpu is the first one coming online in the
chip and designate it as the cpu_mask_nest_pmu for that chip ? If it is
not the first cpu, it means that another cpu in the same chip already
took over this duty and it needn't bother.

And shouldn't we also call nest_init() on this cpu, just like you do in
cpumask_chip() on all cpu_mask_nest_pmu cpus ?

> +}
> +
> +static int nest_cpu_notifier(struct notifier_block *self,
> +				unsigned long action, void *hcpu)
> +{
> +	long cpu = (long)hcpu;
> +
> +	switch (action & ~CPU_TASKS_FROZEN) {
> +	case CPU_DOWN_FAILED:

Why do we need to handle the DOWN_FAILED case ? In DOWN_PREPARE, you
have ensured that the function moves on to another cpu. So even if the
offline failed, its no issue. The duty is safely taken over.

> +	case CPU_STARTING:

I would suggest initializing nest in the CPU_ONLINE stage. This is
because CPU_STARTING phase can fail. In that case, we will be
unnecessarily initializing nest pre-maturely. CPU_ONLINE phase assures
that the cpu is successfully online and you can then initiate nest.

> +		nest_init_cpu(cpu);
> +		break;
> +	case CPU_DOWN_PREPARE:
> +		nest_exit_cpu(cpu);
> +		break;
> +	default:
> +		break;
> +	}
> +
> +	return NOTIFY_OK;
> +}
> +
> +static struct notifier_block nest_cpu_nb = {
> +	.notifier_call  = nest_cpu_notifier,
> +	.priority       = CPU_PRI_PERF + 1,
> +};
> +
> +void cpumask_chip(void)

This name ^^ is not apt IMO. You are initiating the cpumask necessary
for nest pmu. So why not call it nest_pmu_cpumask_init() ?

> +{
> +	const struct cpumask *l_cpumask;
> +	int cpu, nid;
> +
> +	if (!cpumask_empty(&cpu_mask_nest_pmu))

When can this condition become true ?

> +		return;
> +
> +	cpu_notifier_register_begin();
> +
> +	/*
> +	 * Nest PMUs are per-chip counters. So designate a cpu
> +	 * from each node/chip for counter collection.
> +	 */
> +	for_each_online_node(nid) {
> +		l_cpumask = cpumask_of_node(nid);
> +
> +		/* designate first online cpu in this node */
> +		cpu = cpumask_first(l_cpumask);
> +		cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
> +	}
> +
> +	/* Initialize Nest PMUs in each node using designated cpus */
> +	on_each_cpu_mask(&cpu_mask_nest_pmu, (smp_call_func_t)nest_init, NULL, 1);
> +
> +	__register_cpu_notifier(&nest_cpu_nb);
> +
> +	cpu_notifier_register_done();
> +}

Regards
Preeti U Murthy
maddy June 22, 2015, 9:15 a.m. UTC | #2
On Tuesday 16 June 2015 11:58 AM, Preeti U Murthy wrote:
> On 06/11/2015 10:47 AM, Madhavan Srinivasan wrote:
>> Adds cpumask attribute to be used by each nest pmu since nest
>> units are per-chip. Only one cpu (first online cpu) from each node/chip
>> is designated to read counters.
>>
>> On cpu hotplug, dying cpu is checked to see whether it is one of the
>> designated cpus, if yes, next online cpu from the same node/chip is designated
>> as new cpu to read counters.
>>
>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Paul Mackerras <paulus@samba.org>
>> Cc: Anton Blanchard <anton@samba.org>
>> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
>> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>> Cc: Stephane Eranian <eranian@google.com>
>> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>> Cc: Ingo Molnar <mingo@kernel.org>
>> Cc: Peter Zijlstra <peterz@infradead.org>
>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>> ---
>> +static void nest_change_cpu_context(int old_cpu, int new_cpu)
>> +{
>> +	int i;
>> +
>> +	if (new_cpu >= 0) {
>> +		for (i = 0; per_nest_pmu_arr[i] != NULL; i++)
>> +			perf_pmu_migrate_context(&per_nest_pmu_arr[i]->pmu,
>> +							old_cpu, new_cpu);
>> +	}
>> +}
>> +
>> +static void nest_exit_cpu(int cpu)
>> +{
>> +	int i, nid, target = -1;
>> +	const struct cpumask *l_cpumask;
>> +	int src_chipid;
>> +
>> +	/*
>> +	 * Check in the designated list for this cpu. Dont bother
>> +	 * if not one of them.
>> +	 */
>> +	if (!cpumask_test_and_clear_cpu(cpu, &cpu_mask_nest_pmu))
>> +		return;
>> +
>> +	/*
>> +	 * Now that this cpu is one of the designated,
>> +	 * find a new cpu a) which is not dying and
> This comment is not right. nest_exit_cpu() is called in the hotplug
> path, so another cpu cannot be dying in parallel. Hotplug operations are
> done serially. The comment ought to be "a) which is online" instead.
Ok will change it.

>> +	 * b) is in same node/chip.
> node is not the same as a chip right ? And you are interested in cpus on
> the same chip alone. So shouldn't the above comment be b) in the same chip ?
I was hoping it to be, but i will change  comment to chip.

>> +	 */
>> +	nid = cpu_to_node(cpu);
>> +	src_chipid = topology_physical_package_id(cpu);
> What is the relation between a node and a chip ? Can't we have a
> function which returns the cpumask of a chip straight away, since that
> is what you seem to be more interested in ? You can then simply choose
> the next cpu in this cpumask rather than going through the below loop.
>
Make sense. I can separate it out.

>> +	l_cpumask = cpumask_of_node(nid);
>> +	for_each_cpu(i, l_cpumask) {
>> +		if (i == cpu)
>> +			continue;
>> +		if (src_chipid == topology_physical_package_id(i)) {
>> +			target = i;
>> +			break;
>> +		}
>> +	}
>> +
>> +	/*
>> +	 * Update the cpumask with the target cpu and
>> +	 * migrate the context if needed
>> +	 */
>> +	if (target >= 0) {
> You already check for target >= 0 here. So you don't need to check for
> new_cpu >= 0 in nest_change_cpu_context() above ?
I guess i was way too cautious :) Will change it

>> +		cpumask_set_cpu(target, &cpu_mask_nest_pmu);
>> +		nest_change_cpu_context (cpu, target);
>> +	}
>> +}
>> +
>> +static void nest_init_cpu(int cpu)
>> +{
>> +	int i, src_chipid;
>> +
>> +	/*
>> +	 * Search for any existing designated thread from
>> +	 * the incoming cpu's node/chip. If found, do nothing.
>> +	 */
>> +	src_chipid = topology_physical_package_id(cpu);
>> +	for_each_cpu(i, &cpu_mask_nest_pmu)
>> +		if (src_chipid == topology_physical_package_id(i))
>> +			return;
>> +
>> +	/*
>> +	 * Make incoming cpu as a designated thread for
>> +	 * this node/chip
>> +	 */
>> +	cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
> Why can't we simply check if cpu is the first one coming online in the
> chip and designate it as the cpu_mask_nest_pmu for that chip ? If it is
> not the first cpu, it means that another cpu in the same chip already
> took over this duty and it needn't bother.
Looks to be right. let me try it out.

> And shouldn't we also call nest_init() on this cpu, just like you do in
> cpumask_chip() on all cpu_mask_nest_pmu cpus ?
Yes. I missed that. We should init. Nice catch.

>> +}
>> +
>> +static int nest_cpu_notifier(struct notifier_block *self,
>> +				unsigned long action, void *hcpu)
>> +{
>> +	long cpu = (long)hcpu;
>> +
>> +	switch (action & ~CPU_TASKS_FROZEN) {
>> +	case CPU_DOWN_FAILED:
> Why do we need to handle the DOWN_FAILED case ? In DOWN_PREPARE, you
> have ensured that the function moves on to another cpu. So even if the
> offline failed, its no issue. The duty is safely taken over.
>
>> +	case CPU_STARTING:
> I would suggest initializing nest in the CPU_ONLINE stage. This is
> because CPU_STARTING phase can fail. In that case, we will be
> unnecessarily initializing nest pre-maturely. CPU_ONLINE phase assures
> that the cpu is successfully online and you can then initiate nest.
Ok sure. Will do that.

>> +		nest_init_cpu(cpu);
>> +		break;
>> +	case CPU_DOWN_PREPARE:
>> +		nest_exit_cpu(cpu);
>> +		break;
>> +	default:
>> +		break;
>> +	}
>> +
>> +	return NOTIFY_OK;
>> +}
>> +
>> +static struct notifier_block nest_cpu_nb = {
>> +	.notifier_call  = nest_cpu_notifier,
>> +	.priority       = CPU_PRI_PERF + 1,
>> +};
>> +
>> +void cpumask_chip(void)
> This name ^^ is not apt IMO. You are initiating the cpumask necessary
> for nest pmu. So why not call it nest_pmu_cpumask_init() ?

Ok.
>> +{
>> +	const struct cpumask *l_cpumask;
>> +	int cpu, nid;
>> +
>> +	if (!cpumask_empty(&cpu_mask_nest_pmu))
> When can this condition become true ?
My bad. This code ended up here from the initial RFC patch, but after
reviewing this again, i dont think this is needed. Once again nice catch.

>> +		return;
>> +
>> +	cpu_notifier_register_begin();
>> +
>> +	/*
>> +	 * Nest PMUs are per-chip counters. So designate a cpu
>> +	 * from each node/chip for counter collection.
>> +	 */
>> +	for_each_online_node(nid) {
>> +		l_cpumask = cpumask_of_node(nid);
>> +
>> +		/* designate first online cpu in this node */
>> +		cpu = cpumask_first(l_cpumask);
>> +		cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
>> +	}
>> +
>> +	/* Initialize Nest PMUs in each node using designated cpus */
>> +	on_each_cpu_mask(&cpu_mask_nest_pmu, (smp_call_func_t)nest_init, NULL, 1);
>> +
>> +	__register_cpu_notifier(&nest_cpu_nb);
>> +
>> +	cpu_notifier_register_done();
>> +}
> Regards
> Preeti U Murthy
maddy June 24, 2015, 6:46 a.m. UTC | #3
On Monday 22 June 2015 02:45 PM, Madhavan Srinivasan wrote:
>
> On Tuesday 16 June 2015 11:58 AM, Preeti U Murthy wrote:
>> On 06/11/2015 10:47 AM, Madhavan Srinivasan wrote:
>>> Adds cpumask attribute to be used by each nest pmu since nest
>>> units are per-chip. Only one cpu (first online cpu) from each node/chip
>>> is designated to read counters.
>>>
>>> On cpu hotplug, dying cpu is checked to see whether it is one of the
>>> designated cpus, if yes, next online cpu from the same node/chip is designated
>>> as new cpu to read counters.
>>>
>>> Cc: Michael Ellerman <mpe@ellerman.id.au>
>>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>> Cc: Paul Mackerras <paulus@samba.org>
>>> Cc: Anton Blanchard <anton@samba.org>
>>> Cc: Sukadev Bhattiprolu <sukadev@linux.vnet.ibm.com>
>>> Cc: Anshuman Khandual <khandual@linux.vnet.ibm.com>
>>> Cc: Stephane Eranian <eranian@google.com>
>>> Cc: Preeti U Murthy <preeti@linux.vnet.ibm.com>
>>> Cc: Ingo Molnar <mingo@kernel.org>
>>> Cc: Peter Zijlstra <peterz@infradead.org>
>>> Signed-off-by: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
>>> ---
>>> +static void nest_change_cpu_context(int old_cpu, int new_cpu)
>>> +{
>>> +	int i;
>>> +
>>> +	if (new_cpu >= 0) {
>>> +		for (i = 0; per_nest_pmu_arr[i] != NULL; i++)
>>> +			perf_pmu_migrate_context(&per_nest_pmu_arr[i]->pmu,
>>> +							old_cpu, new_cpu);
>>> +	}
>>> +}
>>> +
>>> +static void nest_exit_cpu(int cpu)
>>> +{
>>> +	int i, nid, target = -1;
>>> +	const struct cpumask *l_cpumask;
>>> +	int src_chipid;
>>> +
>>> +	/*
>>> +	 * Check in the designated list for this cpu. Dont bother
>>> +	 * if not one of them.
>>> +	 */
>>> +	if (!cpumask_test_and_clear_cpu(cpu, &cpu_mask_nest_pmu))
>>> +		return;
>>> +
>>> +	/*
>>> +	 * Now that this cpu is one of the designated,
>>> +	 * find a new cpu a) which is not dying and
>> This comment is not right. nest_exit_cpu() is called in the hotplug
>> path, so another cpu cannot be dying in parallel. Hotplug operations are
>> done serially. The comment ought to be "a) which is online" instead.
> Ok will change it.
>
>>> +	 * b) is in same node/chip.
>> node is not the same as a chip right ? And you are interested in cpus on
>> the same chip alone. So shouldn't the above comment be b) in the same chip ?
> I was hoping it to be, but i will change  comment to chip.
>
>>> +	 */
>>> +	nid = cpu_to_node(cpu);
>>> +	src_chipid = topology_physical_package_id(cpu);
>> What is the relation between a node and a chip ? Can't we have a
>> function which returns the cpumask of a chip straight away, since that
>> is what you seem to be more interested in ? You can then simply choose
>> the next cpu in this cpumask rather than going through the below loop.
>>
> Make sense. I can separate it out.
>
>>> +	l_cpumask = cpumask_of_node(nid);
>>> +	for_each_cpu(i, l_cpumask) {
>>> +		if (i == cpu)
>>> +			continue;
>>> +		if (src_chipid == topology_physical_package_id(i)) {
>>> +			target = i;
>>> +			break;
>>> +		}
>>> +	}
>>> +
>>> +	/*
>>> +	 * Update the cpumask with the target cpu and
>>> +	 * migrate the context if needed
>>> +	 */
>>> +	if (target >= 0) {
>> You already check for target >= 0 here. So you don't need to check for
>> new_cpu >= 0 in nest_change_cpu_context() above ?
> I guess i was way too cautious :) Will change it
>
>>> +		cpumask_set_cpu(target, &cpu_mask_nest_pmu);
>>> +		nest_change_cpu_context (cpu, target);
>>> +	}
>>> +}
>>> +
>>> +static void nest_init_cpu(int cpu)
>>> +{
>>> +	int i, src_chipid;
>>> +
>>> +	/*
>>> +	 * Search for any existing designated thread from
>>> +	 * the incoming cpu's node/chip. If found, do nothing.
>>> +	 */
>>> +	src_chipid = topology_physical_package_id(cpu);
>>> +	for_each_cpu(i, &cpu_mask_nest_pmu)
>>> +		if (src_chipid == topology_physical_package_id(i))
>>> +			return;
>>> +
>>> +	/*
>>> +	 * Make incoming cpu as a designated thread for
>>> +	 * this node/chip
>>> +	 */
>>> +	cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
>> Why can't we simply check if cpu is the first one coming online in the
>> chip and designate it as the cpu_mask_nest_pmu for that chip ? If it is
>> not the first cpu, it means that another cpu in the same chip already
>> took over this duty and it needn't bother.
> Looks to be right. let me try it out.
>
>> And shouldn't we also call nest_init() on this cpu, just like you do in
>> cpumask_chip() on all cpu_mask_nest_pmu cpus ?
> Yes. I missed that. We should init. Nice catch.
I guess, we dont need to init again, since we dont stop the pore engine,
we are ok.

>>> +}
>>> +
>>> +static int nest_cpu_notifier(struct notifier_block *self,
>>> +				unsigned long action, void *hcpu)
>>> +{
>>> +	long cpu = (long)hcpu;
>>> +
>>> +	switch (action & ~CPU_TASKS_FROZEN) {
>>> +	case CPU_DOWN_FAILED:
>> Why do we need to handle the DOWN_FAILED case ? In DOWN_PREPARE, you
>> have ensured that the function moves on to another cpu. So even if the
>> offline failed, its no issue. The duty is safely taken over.
>>
>>> +	case CPU_STARTING:
>> I would suggest initializing nest in the CPU_ONLINE stage. This is
>> because CPU_STARTING phase can fail. In that case, we will be
>> unnecessarily initializing nest pre-maturely. CPU_ONLINE phase assures
>> that the cpu is successfully online and you can then initiate nest.
> Ok sure. Will do that.
>
>>> +		nest_init_cpu(cpu);
>>> +		break;
>>> +	case CPU_DOWN_PREPARE:
>>> +		nest_exit_cpu(cpu);
>>> +		break;
>>> +	default:
>>> +		break;
>>> +	}
>>> +
>>> +	return NOTIFY_OK;
>>> +}
>>> +
>>> +static struct notifier_block nest_cpu_nb = {
>>> +	.notifier_call  = nest_cpu_notifier,
>>> +	.priority       = CPU_PRI_PERF + 1,
>>> +};
>>> +
>>> +void cpumask_chip(void)
>> This name ^^ is not apt IMO. You are initiating the cpumask necessary
>> for nest pmu. So why not call it nest_pmu_cpumask_init() ?
> Ok.
>>> +{
>>> +	const struct cpumask *l_cpumask;
>>> +	int cpu, nid;
>>> +
>>> +	if (!cpumask_empty(&cpu_mask_nest_pmu))
>> When can this condition become true ?
> My bad. This code ended up here from the initial RFC patch, but after
> reviewing this again, i dont think this is needed. Once again nice catch.
>
>>> +		return;
>>> +
>>> +	cpu_notifier_register_begin();
>>> +
>>> +	/*
>>> +	 * Nest PMUs are per-chip counters. So designate a cpu
>>> +	 * from each node/chip for counter collection.
>>> +	 */
>>> +	for_each_online_node(nid) {
>>> +		l_cpumask = cpumask_of_node(nid);
>>> +
>>> +		/* designate first online cpu in this node */
>>> +		cpu = cpumask_first(l_cpumask);
>>> +		cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
>>> +	}
>>> +
>>> +	/* Initialize Nest PMUs in each node using designated cpus */
>>> +	on_each_cpu_mask(&cpu_mask_nest_pmu, (smp_call_func_t)nest_init, NULL, 1);
>>> +
>>> +	__register_cpu_notifier(&nest_cpu_nb);
>>> +
>>> +	cpu_notifier_register_done();
>>> +}
>> Regards
>> Preeti U Murthy
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
diff mbox

Patch

diff --git a/arch/powerpc/perf/nest-pmu.c b/arch/powerpc/perf/nest-pmu.c
index a662c14..95f8ecc 100644
--- a/arch/powerpc/perf/nest-pmu.c
+++ b/arch/powerpc/perf/nest-pmu.c
@@ -12,6 +12,155 @@ 
 
 static struct perchip_nest_info p8_perchip_nest_info[P8_MAX_CHIP];
 static struct nest_pmu *per_nest_pmu_arr[P8_MAX_NEST_PMUS];
+static cpumask_t cpu_mask_nest_pmu;
+
+static ssize_t cpumask_nest_pmu_get_attr(struct device *dev,
+				struct device_attribute *attr, char *buf)
+{
+	return cpumap_print_to_pagebuf(true, buf, &cpu_mask_nest_pmu);
+}
+
+static DEVICE_ATTR(cpumask, S_IRUGO, cpumask_nest_pmu_get_attr, NULL);
+
+static struct attribute *cpumask_nest_pmu_attrs[] = {
+	&dev_attr_cpumask.attr,
+	NULL,
+};
+
+static struct attribute_group cpumask_nest_pmu_attr_group = {
+	.attrs = cpumask_nest_pmu_attrs,
+};
+
+static void nest_init(void *dummy)
+{
+	opal_nest_ima_control(P8_NEST_ENGINE_START);
+}
+
+static void nest_change_cpu_context(int old_cpu, int new_cpu)
+{
+	int i;
+
+	if (new_cpu >= 0) {
+		for (i = 0; per_nest_pmu_arr[i] != NULL; i++)
+			perf_pmu_migrate_context(&per_nest_pmu_arr[i]->pmu,
+							old_cpu, new_cpu);
+	}
+}
+
+static void nest_exit_cpu(int cpu)
+{
+	int i, nid, target = -1;
+	const struct cpumask *l_cpumask;
+	int src_chipid;
+
+	/*
+	 * Check in the designated list for this cpu. Dont bother
+	 * if not one of them.
+	 */
+	if (!cpumask_test_and_clear_cpu(cpu, &cpu_mask_nest_pmu))
+		return;
+
+	/*
+	 * Now that this cpu is one of the designated,
+	 * find a new cpu a) which is not dying and
+	 * b) is in same node/chip.
+	 */
+	nid = cpu_to_node(cpu);
+	src_chipid = topology_physical_package_id(cpu);
+	l_cpumask = cpumask_of_node(nid);
+	for_each_cpu(i, l_cpumask) {
+		if (i == cpu)
+			continue;
+		if (src_chipid == topology_physical_package_id(i)) {
+			target = i;
+			break;
+		}
+	}
+
+	/*
+	 * Update the cpumask with the target cpu and
+	 * migrate the context if needed
+	 */
+	if (target >= 0) {
+		cpumask_set_cpu(target, &cpu_mask_nest_pmu);
+		nest_change_cpu_context (cpu, target);
+	}
+}
+
+static void nest_init_cpu(int cpu)
+{
+	int i, src_chipid;
+
+	/*
+	 * Search for any existing designated thread from
+	 * the incoming cpu's node/chip. If found, do nothing.
+	 */
+	src_chipid = topology_physical_package_id(cpu);
+	for_each_cpu(i, &cpu_mask_nest_pmu)
+		if (src_chipid == topology_physical_package_id(i))
+			return;
+
+	/*
+	 * Make incoming cpu as a designated thread for
+	 * this node/chip
+	 */
+	cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
+}
+
+static int nest_cpu_notifier(struct notifier_block *self,
+				unsigned long action, void *hcpu)
+{
+	long cpu = (long)hcpu;
+
+	switch (action & ~CPU_TASKS_FROZEN) {
+	case CPU_DOWN_FAILED:
+	case CPU_STARTING:
+		nest_init_cpu(cpu);
+		break;
+	case CPU_DOWN_PREPARE:
+		nest_exit_cpu(cpu);
+		break;
+	default:
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block nest_cpu_nb = {
+	.notifier_call  = nest_cpu_notifier,
+	.priority       = CPU_PRI_PERF + 1,
+};
+
+void cpumask_chip(void)
+{
+	const struct cpumask *l_cpumask;
+	int cpu, nid;
+
+	if (!cpumask_empty(&cpu_mask_nest_pmu))
+		return;
+
+	cpu_notifier_register_begin();
+
+	/*
+	 * Nest PMUs are per-chip counters. So designate a cpu
+	 * from each node/chip for counter collection.
+	 */
+	for_each_online_node(nid) {
+		l_cpumask = cpumask_of_node(nid);
+
+		/* designate first online cpu in this node */
+		cpu = cpumask_first(l_cpumask);
+		cpumask_set_cpu(cpu, &cpu_mask_nest_pmu);
+	}
+
+	/* Initialize Nest PMUs in each node using designated cpus */
+	on_each_cpu_mask(&cpu_mask_nest_pmu, (smp_call_func_t)nest_init, NULL, 1);
+
+	__register_cpu_notifier(&nest_cpu_nb);
+
+	cpu_notifier_register_done();
+}
 
 PMU_FORMAT_ATTR(event, "config:0-20");
 struct attribute *p8_nest_format_attrs[] = {
@@ -209,6 +358,7 @@  static int nest_pmu_create(struct device_node *dev, int pmu_index)
 			sprintf(buf, "Nest_%s", (char *)pp->value);
 			pmu_ptr->pmu.name = (char *)buf;
 			pmu_ptr->attr_groups[1] = &p8_nest_format_group;
+			pmu_ptr->attr_groups[2] = &cpumask_nest_pmu_attr_group;
 		}
 
 		/* Skip these, we dont need it */
@@ -362,6 +512,9 @@  static int __init nest_pmu_init(void)
 	if (ret)
 		return ret;
 
+	/* Add cpumask and register for hotplug notification */
+	cpumask_chip();
+
 	return 0;
 }
 device_initcall(nest_pmu_init);