diff mbox

Do not update sysfs cpu registration from invalid context

Message ID 51C8543F.6080905@linux.vnet.ibm.com (mailing list archive)
State Rejected
Headers show

Commit Message

Nathan Fontenot June 24, 2013, 2:14 p.m. UTC
The topology update code that updates the cpu node registration in sysfs
should not be called while in stop_machine(). The register/unregister
calls take a lock and may sleep.

This patch moves these calls outside of the call to stop_machine().

Signed-off-by:Nathan Fontenot <nfont@linux.vnet.ibm.com>
---
 arch/powerpc/mm/numa.c |    5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Seth Jennings June 24, 2013, 5:18 p.m. UTC | #1
On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
> The topology update code that updates the cpu node registration in sysfs
> should not be called while in stop_machine(). The register/unregister
> calls take a lock and may sleep.
> 
> This patch moves these calls outside of the call to stop_machine().
> 
> Signed-off-by:Nathan Fontenot <nfont@linux.vnet.ibm.com>

Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com>

> ---
>  arch/powerpc/mm/numa.c |    5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> Index: powerpc/arch/powerpc/mm/numa.c
> ===================================================================
> --- powerpc.orig/arch/powerpc/mm/numa.c	2013-06-24 06:53:31.000000000 -0500
> +++ powerpc/arch/powerpc/mm/numa.c	2013-06-24 06:56:30.000000000 -0500
> @@ -1433,11 +1433,9 @@
>  		if (cpu != update->cpu)
>  			continue;
> 
> -		unregister_cpu_under_node(update->cpu, update->old_nid);
>  		unmap_cpu_from_node(update->cpu);
>  		map_cpu_to_node(update->cpu, update->new_nid);
>  		vdso_getcpu_init();
> -		register_cpu_under_node(update->cpu, update->new_nid);
>  	}
> 
>  	return 0;
> @@ -1485,6 +1483,9 @@
>  	stop_machine(update_cpu_topology, &updates[0], &updated_cpus);
> 
>  	for (ud = &updates[0]; ud; ud = ud->next) {
> +		unregister_cpu_under_node(update->cpu, update->old_nid);
> +		register_cpu_under_node(update->cpu, update->new_nid);
> +
>  		dev = get_cpu_device(ud->cpu);
>  		if (dev)
>  			kobject_uevent(&dev->kobj, KOBJ_CHANGE);
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@lists.ozlabs.org
> https://lists.ozlabs.org/listinfo/linuxppc-dev
>
Seth Jennings June 24, 2013, 7:16 p.m. UTC | #2
On Mon, Jun 24, 2013 at 12:18:04PM -0500, Seth Jennings wrote:
> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
> > The topology update code that updates the cpu node registration in sysfs
> > should not be called while in stop_machine(). The register/unregister
> > calls take a lock and may sleep.
> > 
> > This patch moves these calls outside of the call to stop_machine().
> > 
> > Signed-off-by:Nathan Fontenot <nfont@linux.vnet.ibm.com>
> 
> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com>

Gah! I _knew_ I should have waited for my cross compiler to finish
building.  This thing doesn't build:

  CC      arch/powerpc/mm/numa.o
/home/sjennings/ltc/linux/arch/powerpc/mm/numa.c: In function 'arch_update_cpu_topology':
/home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: 'update' undeclared (first use in this function)
/home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: (Each undeclared identifier is reported only once
/home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: for each function it appears in.)

s/update/ud/ in the *_cpu_under_node() calls.

Seth
Nathan Fontenot June 24, 2013, 7:25 p.m. UTC | #3
On 06/24/2013 02:16 PM, Seth Jennings wrote:
> On Mon, Jun 24, 2013 at 12:18:04PM -0500, Seth Jennings wrote:
>> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
>>> The topology update code that updates the cpu node registration in sysfs
>>> should not be called while in stop_machine(). The register/unregister
>>> calls take a lock and may sleep.
>>>
>>> This patch moves these calls outside of the call to stop_machine().
>>>
>>> Signed-off-by:Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>
>> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
> 
> Gah! I _knew_ I should have waited for my cross compiler to finish
> building.  This thing doesn't build:
> 
>   CC      arch/powerpc/mm/numa.o
> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c: In function 'arch_update_cpu_topology':
> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: 'update' undeclared (first use in this function)
> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: (Each undeclared identifier is reported only once
> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: for each function it appears in.)
> 
> s/update/ud/ in the *_cpu_under_node() calls.

Oops! Time for patch submission re-education training.

New, and correct, patch coming soon.

-Nathan
Michael Ellerman June 25, 2013, 1:50 a.m. UTC | #4
On Mon, Jun 24, 2013 at 02:25:59PM -0500, Nathan Fontenot wrote:
> On 06/24/2013 02:16 PM, Seth Jennings wrote:
> > On Mon, Jun 24, 2013 at 12:18:04PM -0500, Seth Jennings wrote:
> >> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
> >>> The topology update code that updates the cpu node registration in sysfs
> >>> should not be called while in stop_machine(). The register/unregister
> >>> calls take a lock and may sleep.
> >>>
> >>> This patch moves these calls outside of the call to stop_machine().
> >>>
> >>> Signed-off-by:Nathan Fontenot <nfont@linux.vnet.ibm.com>
> >>
> >> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
> > 
> > Gah! I _knew_ I should have waited for my cross compiler to finish
> > building.  This thing doesn't build:
> > 
> >   CC      arch/powerpc/mm/numa.o
> > /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c: In function 'arch_update_cpu_topology':
> > /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: 'update' undeclared (first use in this function)
> > /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: (Each undeclared identifier is reported only once
> > /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: for each function it appears in.)
> > 
> > s/update/ud/ in the *_cpu_under_node() calls.
> 
> Oops! Time for patch submission re-education training.

We've all done it, but yes :)

I try to stick to:

  1. write code.
  2. build code.
  3. test code.
  4. submit code.

I imagine you tested an early version of the patch, or on RHEL or
something, but that can bite you like this. Whenever possible you should
build & test the exact code you submit, though that can be hard when
trees are moving quickly underneath you.

cheers
Michael Ellerman June 25, 2013, 1:50 a.m. UTC | #5
On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
> The topology update code that updates the cpu node registration in sysfs
> should not be called while in stop_machine(). The register/unregister
> calls take a lock and may sleep.
> 
> This patch moves these calls outside of the call to stop_machine().

What happens? Do we lockup or do you just get a warning?

And what commit introduced the breakage?

cheers
Nathan Fontenot June 25, 2013, 2:46 a.m. UTC | #6
On 06/24/2013 08:50 PM, Michael Ellerman wrote:
> On Mon, Jun 24, 2013 at 02:25:59PM -0500, Nathan Fontenot wrote:
>> On 06/24/2013 02:16 PM, Seth Jennings wrote:
>>> On Mon, Jun 24, 2013 at 12:18:04PM -0500, Seth Jennings wrote:
>>>> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
>>>>> The topology update code that updates the cpu node registration in sysfs
>>>>> should not be called while in stop_machine(). The register/unregister
>>>>> calls take a lock and may sleep.
>>>>>
>>>>> This patch moves these calls outside of the call to stop_machine().
>>>>>
>>>>> Signed-off-by:Nathan Fontenot <nfont@linux.vnet.ibm.com>
>>>>
>>>> Reviewed-by: Seth Jennings <sjenning@linux.vnet.ibm.com>
>>>
>>> Gah! I _knew_ I should have waited for my cross compiler to finish
>>> building.  This thing doesn't build:
>>>
>>>   CC      arch/powerpc/mm/numa.o
>>> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c: In function 'arch_update_cpu_topology':
>>> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: 'update' undeclared (first use in this function)
>>> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: (Each undeclared identifier is reported only once
>>> /home/sjennings/ltc/linux/arch/powerpc/mm/numa.c:1486: error: for each function it appears in.)
>>>
>>> s/update/ud/ in the *_cpu_under_node() calls.
>>
>> Oops! Time for patch submission re-education training.
> 
> We've all done it, but yes :)
> 
> I try to stick to:
> 
>   1. write code.

I would suggest
    1a. ensure you have the proper config options set

>   2. build code.
>   3. test code.
>   4. submit code.
> 
> I imagine you tested an early version of the patch, or on RHEL or
> something, but that can bite you like this. Whenever possible you should
> build & test the exact code you submit, though that can be hard when
> trees are moving quickly underneath you.

Yep, bitten by 1a. I didn't verify the config options I was building
with and had SMP disabled in the tree. This ifdef'ed out my code.

-Nathan
Nathan Fontenot June 25, 2013, 2:47 a.m. UTC | #7
On 06/24/2013 08:50 PM, Michael Ellerman wrote:
> On Mon, Jun 24, 2013 at 09:14:23AM -0500, Nathan Fontenot wrote:
>> The topology update code that updates the cpu node registration in sysfs
>> should not be called while in stop_machine(). The register/unregister
>> calls take a lock and may sleep.
>>
>> This patch moves these calls outside of the call to stop_machine().
> 
> What happens? Do we lockup or do you just get a warning?
> 
> And what commit introduced the breakage?

Guilty on on all counts. Hopefully I can still get by with stern warning.

-Nathan
diff mbox

Patch

Index: powerpc/arch/powerpc/mm/numa.c
===================================================================
--- powerpc.orig/arch/powerpc/mm/numa.c	2013-06-24 06:53:31.000000000 -0500
+++ powerpc/arch/powerpc/mm/numa.c	2013-06-24 06:56:30.000000000 -0500
@@ -1433,11 +1433,9 @@ 
 		if (cpu != update->cpu)
 			continue;
 
-		unregister_cpu_under_node(update->cpu, update->old_nid);
 		unmap_cpu_from_node(update->cpu);
 		map_cpu_to_node(update->cpu, update->new_nid);
 		vdso_getcpu_init();
-		register_cpu_under_node(update->cpu, update->new_nid);
 	}
 
 	return 0;
@@ -1485,6 +1483,9 @@ 
 	stop_machine(update_cpu_topology, &updates[0], &updated_cpus);
 
 	for (ud = &updates[0]; ud; ud = ud->next) {
+		unregister_cpu_under_node(update->cpu, update->old_nid);
+		register_cpu_under_node(update->cpu, update->new_nid);
+
 		dev = get_cpu_device(ud->cpu);
 		if (dev)
 			kobject_uevent(&dev->kobj, KOBJ_CHANGE);