Message ID | 1357268318-7993-1-git-send-email-vatsa@codeaurora.org |
---|---|
State | Not Applicable |
Delegated to: | David Miller |
Headers | show |
On Thu, Jan 03, 2013 at 06:58:38PM -0800, Srivatsa Vaddagiri wrote: > I also think that the > wait_for_completion() based wait in ARM's __cpu_die() can be replaced with a > busy-loop based one, as the wait there in general should be terminated within > few cycles. Why open-code this stuff when we have infrastructure already in the kernel for waiting for stuff to happen? I chose to use the standard infrastructure because its better tested, and avoids having to think about whether we need CPU barriers and such like to ensure that updates are seen in a timely manner. My stance on a lot of this idle/cpu dying code is that much of it can probably be cleaned up and merged into a single common implementation - in which case the use of standard infrastructure for things like waiting for other CPUs do stuff is even more justified. -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
* Russell King - ARM Linux <linux@arm.linux.org.uk> [2013-01-05 10:36:27]: > On Thu, Jan 03, 2013 at 06:58:38PM -0800, Srivatsa Vaddagiri wrote: > > I also think that the > > wait_for_completion() based wait in ARM's __cpu_die() can be replaced with a > > busy-loop based one, as the wait there in general should be terminated within > > few cycles. > > Why open-code this stuff when we have infrastructure already in the kernel > for waiting for stuff to happen? I chose to use the standard infrastructure > because its better tested, and avoids having to think about whether we need > CPU barriers and such like to ensure that updates are seen in a timely > manner. I was primarily thinking of calling as few generic functions as possible on a dead cpu. I recall several "am I running on a dead cpu?" checks (cpu_is_offline(this_cpu) that were put in generic routines during early versions of cpu hotplug [1] to educate code running on dead cpu, the need for which went away though with introduction of atomic/stop-machine variant. The need to add a RCU_NONIDLE() wrapper around ARM's cpu_die() [2] is perhaps a more recent example of educating code running on dead cpu. As quickly we die as possible after idle thread of dying cpu gains control, the better! 1. http://lwn.net/Articles/69040/ 2. http://lists.infradead.org/pipermail/linux-arm-kernel/2012-July/107971.html - vatsa
On 01/05/2013 04:06 PM, Russell King - ARM Linux wrote: > On Thu, Jan 03, 2013 at 06:58:38PM -0800, Srivatsa Vaddagiri wrote: >> I also think that the >> wait_for_completion() based wait in ARM's __cpu_die() can be replaced with a >> busy-loop based one, as the wait there in general should be terminated within >> few cycles. > > Why open-code this stuff when we have infrastructure already in the kernel > for waiting for stuff to happen? I chose to use the standard infrastructure > because its better tested, and avoids having to think about whether we need > CPU barriers and such like to ensure that updates are seen in a timely > manner. > > My stance on a lot of this idle/cpu dying code is that much of it can > probably be cleaned up and merged into a single common implementation - > in which case the use of standard infrastructure for things like waiting > for other CPUs do stuff is even more justified. On similar lines, Nikunj (in CC) and I had posted a patchset sometime ago to consolidate some of the CPU hotplug related code in the various architectures into a common standard implementation [1]. However, we ended up hitting a problem with Xen, because its existing code was unlike the other arch/ pieces [2]. At that time, we decided that we will first make the CPU online and offline paths symmetric in the generic code and then provide a common implementation of the duplicated bits in arch/, for the new CPU hotplug model [3]. I guess we should probably revisit it sometime, consolidating the code in incremental steps if not all at a time... -- [1]. http://lwn.net/Articles/500185/ [2]. http://thread.gmane.org/gmane.linux.kernel.cross-arch/14342/focus=14430 [3]. http://thread.gmane.org/gmane.linux.kernel.cross-arch/14342/focus=15567 Regards, Srivatsa S. Bhat -- To unsubscribe from this list: send the line "unsubscribe sparclinux" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c index c6dec5f..254099b 100644 --- a/arch/arm/kernel/process.c +++ b/arch/arm/kernel/process.c @@ -191,11 +191,6 @@ void cpu_idle(void) rcu_idle_enter(); ledtrig_cpu(CPU_LED_IDLE_START); while (!need_resched()) { -#ifdef CONFIG_HOTPLUG_CPU - if (cpu_is_offline(smp_processor_id())) - cpu_die(); -#endif - /* * We need to disable interrupts here * to ensure we don't miss a wakeup call. @@ -224,6 +219,10 @@ void cpu_idle(void) rcu_idle_exit(); tick_nohz_idle_exit(); schedule_preempt_disabled(); +#ifdef CONFIG_HOTPLUG_CPU + if (cpu_is_offline(smp_processor_id())) + cpu_die(); +#endif } } diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c index 84f4cbf..a8e3b8a 100644 --- a/arch/arm/kernel/smp.c +++ b/arch/arm/kernel/smp.c @@ -251,7 +251,7 @@ void __ref cpu_die(void) mb(); /* Tell __cpu_die() that this CPU is now safe to dispose of */ - RCU_NONIDLE(complete(&cpu_died)); + complete(&cpu_died); /* * actual CPU shutdown procedure is at least platform (if not diff --git a/arch/blackfin/kernel/process.c b/arch/blackfin/kernel/process.c index 3e16ad9..2bee1af 100644 --- a/arch/blackfin/kernel/process.c +++ b/arch/blackfin/kernel/process.c @@ -83,10 +83,6 @@ void cpu_idle(void) while (1) { void (*idle)(void) = pm_idle; -#ifdef CONFIG_HOTPLUG_CPU - if (cpu_is_offline(smp_processor_id())) - cpu_die(); -#endif if (!idle) idle = default_idle; tick_nohz_idle_enter(); @@ -98,6 +94,10 @@ void cpu_idle(void) preempt_enable_no_resched(); schedule(); preempt_disable(); +#ifdef CONFIG_HOTPLUG_CPU + if (cpu_is_offline(smp_processor_id())) + cpu_die(); +#endif } } diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c index a11c6f9..41102a0 100644 --- a/arch/mips/kernel/process.c +++ b/arch/mips/kernel/process.c @@ -71,13 +71,13 @@ void __noreturn cpu_idle(void) start_critical_timings(); } } + rcu_idle_exit(); + tick_nohz_idle_exit(); + schedule_preempt_disabled(); #ifdef CONFIG_HOTPLUG_CPU if (!cpu_online(cpu) && !cpu_isset(cpu, cpu_callin_map)) play_dead(); #endif - rcu_idle_exit(); - tick_nohz_idle_exit(); - schedule_preempt_disabled(); } } diff --git a/arch/powerpc/kernel/idle.c b/arch/powerpc/kernel/idle.c index ea78761..39ad029 100644 --- a/arch/powerpc/kernel/idle.c +++ b/arch/powerpc/kernel/idle.c @@ -102,11 +102,11 @@ void cpu_idle(void) ppc64_runlatch_on(); rcu_idle_exit(); tick_nohz_idle_exit(); + schedule_preempt_disabled(); if (cpu_should_die()) { sched_preempt_enable_no_resched(); cpu_die(); } - schedule_preempt_disabled(); } } diff --git a/arch/s390/kernel/process.c b/arch/s390/kernel/process.c index 536d645..5290556 100644 --- a/arch/s390/kernel/process.c +++ b/arch/s390/kernel/process.c @@ -66,8 +66,6 @@ unsigned long thread_saved_pc(struct task_struct *tsk) */ static void default_idle(void) { - if (cpu_is_offline(smp_processor_id())) - cpu_die(); local_irq_disable(); if (need_resched()) { local_irq_enable(); @@ -95,6 +93,8 @@ void cpu_idle(void) if (test_thread_flag(TIF_MCCK_PENDING)) s390_handle_mcck(); schedule_preempt_disabled(); + if (cpu_is_offline(smp_processor_id())) + cpu_die(); } } diff --git a/arch/sh/kernel/idle.c b/arch/sh/kernel/idle.c index 0c91016..f8bc2f0 100644 --- a/arch/sh/kernel/idle.c +++ b/arch/sh/kernel/idle.c @@ -96,9 +96,6 @@ void cpu_idle(void) check_pgt_cache(); rmb(); - if (cpu_is_offline(cpu)) - play_dead(); - local_irq_disable(); /* Don't trace irqs off for idle */ stop_critical_timings(); @@ -115,6 +112,8 @@ void cpu_idle(void) rcu_idle_exit(); tick_nohz_idle_exit(); schedule_preempt_disabled(); + if (cpu_is_offline(cpu)) + play_dead(); } } diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c index cdb80b2..01589e7 100644 --- a/arch/sparc/kernel/process_64.c +++ b/arch/sparc/kernel/process_64.c @@ -105,13 +105,14 @@ void cpu_idle(void) rcu_idle_exit(); tick_nohz_idle_exit(); + schedule_preempt_disabled(); + #ifdef CONFIG_HOTPLUG_CPU if (cpu_is_offline(cpu)) { sched_preempt_enable_no_resched(); cpu_play_dead(); } #endif - schedule_preempt_disabled(); } } diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 2ed787f..3d5f142 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -331,9 +331,6 @@ void cpu_idle(void) while (!need_resched()) { rmb(); - if (cpu_is_offline(smp_processor_id())) - play_dead(); - /* * Idle routines should keep interrupts disabled * from here on, until they go to idle. @@ -366,6 +363,8 @@ void cpu_idle(void) preempt_enable_no_resched(); schedule(); preempt_disable(); + if (cpu_is_offline(smp_processor_id())) + play_dead(); } }
Modify idle loop of arm, mips, s390, sh and x86 architectures to exit from nohz state before dying upon hot-remove. This change is needed to avoid userspace tools like top command from seeing a rollback in total idle time over some sampling periods. Additionaly, modify idle loop on all architectures supporting cpu hotplug to have idle thread of a dying cpu die immediately after scheduler returns control to it. There is no point in wasting time via calls to *_enter()/*_exit() before noticing the need to die and dying. Additional ARM specific change: Revert commit ff081e05 ("ARM: 7457/1: smp: Fix suspicious RCU originating from cpu_die()"), which added a RCU_NONIDLE() wrapper around call to complete(). That wrapper is no longer needed as cpu_die() is now called outside of a rcu_idle_enter()/exit() section. I also think that the wait_for_completion() based wait in ARM's __cpu_die() can be replaced with a busy-loop based one, as the wait there in general should be terminated within few cycles. Cc: Russell King <linux@arm.linux.org.uk> Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com> Cc: Stephen Boyd <sboyd@codeaurora.org> Cc: linux-arm-kernel@lists.infradead.org Cc: Mike Frysinger <vapier@gentoo.org> Cc: uclinux-dist-devel@blackfin.uclinux.org Cc: Ralf Baechle <ralf@linux-mips.org> Cc: linux-mips@linux-mips.org Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org> Cc: linuxppc-dev@lists.ozlabs.org Cc: Martin Schwidefsky <schwidefsky@de.ibm.com> Cc: linux-s390@vger.kernel.org Cc: Paul Mundt <lethal@linux-sh.org> Cc: linux-sh@vger.kernel.org Cc: "David S. Miller" <davem@davemloft.net> Cc: sparclinux@vger.kernel.org Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Ingo Molnar <mingo@redhat.com> Cc: "H. Peter Anvin" <hpa@zytor.com> Cc: x86@kernel.org Cc: mhocko@suse.cz Cc: srivatsa.bhat@linux.vnet.ibm.com Signed-off-by: Srivatsa Vaddagiri <vatsa@codeaurora.org> --- arch/arm/kernel/process.c | 9 ++++----- arch/arm/kernel/smp.c | 2 +- arch/blackfin/kernel/process.c | 8 ++++---- arch/mips/kernel/process.c | 6 +++--- arch/powerpc/kernel/idle.c | 2 +- arch/s390/kernel/process.c | 4 ++-- arch/sh/kernel/idle.c | 5 ++--- arch/sparc/kernel/process_64.c | 3 ++- arch/x86/kernel/process.c | 5 ++--- 9 files changed, 21 insertions(+), 23 deletions(-)