From patchwork Tue Sep 22 11:29:36 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Arun Bharadwaj
X-Patchwork-Id: 34065
Return-Path:
X-Original-To: patchwork-incoming@ozlabs.org
Delivered-To: patchwork-incoming@ozlabs.org
Received: from bilbo.ozlabs.org (localhost [127.0.0.1])
	by ozlabs.org (Postfix) with ESMTP id 21DA8B80F3
	for ; Tue, 22 Sep 2009 21:29:52 +1000 (EST)
Received: from e28smtp07.in.ibm.com (e28smtp07.in.ibm.com [59.145.155.7])
	(using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
	(Client CN "e28smtp07.in.ibm.com", Issuer "Equifax" (verified OK))
	by ozlabs.org (Postfix) with ESMTPS id C2527B70B3
	for ; Tue, 22 Sep 2009 21:29:43 +1000 (EST)
Received: from d28relay05.in.ibm.com (d28relay05.in.ibm.com [9.184.220.62])
	by e28smtp07.in.ibm.com (8.14.3/8.13.1) with ESMTP id n8MBTcsA024302
	for ; Tue, 22 Sep 2009 16:59:38 +0530
Received: from d28av05.in.ibm.com (d28av05.in.ibm.com [9.184.220.67])
	by d28relay05.in.ibm.com (8.13.8/8.13.8/NCO v10.0) with ESMTP id n8MBTcxe2281610
	for ; Tue, 22 Sep 2009 16:59:38 +0530
Received: from d28av05.in.ibm.com (loopback [127.0.0.1])
	by d28av05.in.ibm.com (8.14.3/8.13.1/NCO v10.0 AVout) with ESMTP id n8MBTbc8024901
	for ; Tue, 22 Sep 2009 21:29:38 +1000
Received: from linux.vnet.ibm.com (Crystal-Planet.in.ibm.com [9.124.35.26])
	by d28av05.in.ibm.com (8.14.3/8.13.1/NCO v10.0 AVin) with ESMTP id n8MBTaus024886
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NO);
	Tue, 22 Sep 2009 21:29:37 +1000
Date: Tue, 22 Sep 2009 16:59:36 +0530
From: Arun R Bharadwaj
To: Peter Zijlstra, Joel Schopp, Benjamin Herrenschmidt, Paul Mackerras,
	Ingo Molnar, Vaidyanathan Srinivasan, Dipankar Sarma, Balbir Singh,
	Gautham R Shenoy, Shaohua Li, Venkatesh Pallipadi, Arun Bharadwaj
Subject: [v6 PATCH 3/7]: x86: refactor x86 idle power management code and
	remove all instances of pm_idle.
Message-ID: <20090922112936.GD7788@linux.vnet.ibm.com>
References: <20090922112526.GA7788@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <20090922112526.GA7788@linux.vnet.ibm.com>
User-Agent: Mutt/1.5.18 (2008-05-17)
Cc: linuxppc-dev@lists.ozlabs.org, linux-kernel@vger.kernel.org
X-BeenThere: linuxppc-dev@lists.ozlabs.org
X-Mailman-Version: 2.1.12
Precedence: list
Reply-To: arun@linux.vnet.ibm.com
List-Id: Linux on PowerPC Developers Mail List
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org
Errors-To: linuxppc-dev-bounces+patchwork-incoming=ozlabs.org@lists.ozlabs.org

* Arun R Bharadwaj [2009-09-22 16:55:27]:

This patch removes all instances of pm_idle from x86. pm_idle, which
was previously called from the cpu_idle() idle loop, is replaced by
cpuidle_idle_call(). x86 also registers with cpuidle when the idle
routine is selected, by populating the cpuidle_device data structure
for each CPU.

The same is done for the apm module and for Xen, which also used
pm_idle.

Signed-off-by: Arun R Bharadwaj
---
 arch/x86/kernel/apm_32.c     |   37 +++++++++++++++++++++--
 arch/x86/kernel/process.c    |   69 ++++++++++++++++++++++++++++++++++---------
 arch/x86/kernel/process_32.c |    3 +
 arch/x86/kernel/process_64.c |    3 +
 arch/x86/xen/setup.c         |   22 +++++++++++++
 5 files changed, 114 insertions(+), 20 deletions(-)

Index: linux.trees.git/arch/x86/kernel/process.c
===================================================================
--- linux.trees.git.orig/arch/x86/kernel/process.c
+++ linux.trees.git/arch/x86/kernel/process.c
@@ -9,6 +9,8 @@
 #include
 #include
 #include
+#include
+
 #include
 #include
 #include
@@ -247,12 +249,6 @@ int sys_vfork(struct pt_regs *regs)
 unsigned long boot_option_idle_override = 0;
 EXPORT_SYMBOL(boot_option_idle_override);
 
-/*
- * Powermanagement idle function, if any..
- */
-void (*pm_idle)(void);
-EXPORT_SYMBOL(pm_idle);
-
 #ifdef CONFIG_X86_32
 /*
  * This halt magic was a workaround for ancient floppy DMA
@@ -531,15 +527,58 @@ static void c1e_idle(void)
 		default_idle();
 }
 
+static void (*local_idle)(void);
+DEFINE_PER_CPU(struct cpuidle_device, idle_devices);
+
+struct cpuidle_driver cpuidle_default_driver = {
+	.name = "cpuidle_default",
+};
+
+static int local_idle_loop(struct cpuidle_device *dev, struct cpuidle_state *st)
+{
+	ktime_t t1, t2;
+	s64 diff;
+	int ret;
+
+	t1 = ktime_get();
+	local_idle();
+	t2 = ktime_get();
+
+	diff = ktime_to_us(ktime_sub(t2, t1));
+	if (diff > INT_MAX)
+		diff = INT_MAX;
+	ret = (int) diff;
+
+	return ret;
+}
+static int __cpuinit setup_cpuidle_simple(void)
+{
+	struct cpuidle_device *dev;
+	int cpu;
+
+	if (!cpuidle_curr_driver)
+		cpuidle_register_driver(&cpuidle_default_driver);
+
+	for_each_online_cpu(cpu) {
+		dev = &per_cpu(idle_devices, cpu);
+		dev->cpu = cpu;
+		dev->states[0].enter = local_idle_loop;
+		dev->state_count = 1;
+		cpuidle_register_device(dev);
+	}
+	return 0;
+}
+late_initcall(setup_cpuidle_simple);
+
 void __cpuinit select_idle_routine(const struct cpuinfo_x86 *c)
 {
 #ifdef CONFIG_SMP
-	if (pm_idle == poll_idle && smp_num_siblings > 1) {
+	if (local_idle == poll_idle && smp_num_siblings > 1) {
 		printk(KERN_WARNING "WARNING: polling idle and HT enabled,"
 			" performance may degrade.\n");
 	}
 #endif
-	if (pm_idle)
+	if (local_idle)
 		return;
 
 	if (cpu_has(c, X86_FEATURE_MWAIT) && mwait_usable(c)) {
@@ -547,18 +586,20 @@ void __cpuinit select_idle_routine(const
 		 * One CPU supports mwait => All CPUs supports mwait
 		 */
 		printk(KERN_INFO "using mwait in idle threads.\n");
-		pm_idle = mwait_idle;
+		local_idle = mwait_idle;
 	} else if (check_c1e_idle(c)) {
 		printk(KERN_INFO "using C1E aware idle routine\n");
-		pm_idle = c1e_idle;
+		local_idle = c1e_idle;
 	} else
-		pm_idle = default_idle;
+		local_idle = default_idle;
+
+	return;
 }
 
 void __init init_c1e_mask(void)
 {
 	/* If we're using c1e_idle, we need to allocate c1e_mask. */
-	if (pm_idle == c1e_idle) {
+	if (local_idle == c1e_idle) {
 		alloc_cpumask_var(&c1e_mask, GFP_KERNEL);
 		cpumask_clear(c1e_mask);
 	}
@@ -571,7 +612,7 @@ static int __init idle_setup(char *str)
 
 	if (!strcmp(str, "poll")) {
 		printk("using polling idle threads.\n");
-		pm_idle = poll_idle;
+		local_idle = poll_idle;
 	} else if (!strcmp(str, "mwait"))
 		force_mwait = 1;
 	else if (!strcmp(str, "halt")) {
@@ -582,7 +623,7 @@ static int __init idle_setup(char *str)
 		 * To continue to load the CPU idle driver, don't touch
 		 * the boot_option_idle_override.
 		 */
-		pm_idle = default_idle;
+		local_idle = default_idle;
 		idle_halt = 1;
 		return 0;
 	} else if (!strcmp(str, "nomwait")) {
Index: linux.trees.git/arch/x86/kernel/process_32.c
===================================================================
--- linux.trees.git.orig/arch/x86/kernel/process_32.c
+++ linux.trees.git/arch/x86/kernel/process_32.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -113,7 +114,7 @@ void cpu_idle(void)
 			local_irq_disable();
 			/* Don't trace irqs off for idle */
 			stop_critical_timings();
-			pm_idle();
+			cpuidle_idle_call();
 			start_critical_timings();
 		}
 		tick_nohz_restart_sched_tick();
Index: linux.trees.git/arch/x86/kernel/process_64.c
===================================================================
--- linux.trees.git.orig/arch/x86/kernel/process_64.c
+++ linux.trees.git/arch/x86/kernel/process_64.c
@@ -39,6 +39,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -142,7 +143,7 @@ void cpu_idle(void)
 			enter_idle();
 			/* Don't trace irqs off for idle */
 			stop_critical_timings();
-			pm_idle();
+			cpuidle_idle_call();
 			start_critical_timings();
 			/* In many cases the interrupt that ended idle
 			   has already called exit_idle. But some idle
Index: linux.trees.git/arch/x86/kernel/apm_32.c
===================================================================
--- linux.trees.git.orig/arch/x86/kernel/apm_32.c
+++ linux.trees.git/arch/x86/kernel/apm_32.c
@@ -2257,6 +2257,38 @@ static struct dmi_system_id __initdata a
 	{ }
 };
 
+DEFINE_PER_CPU(struct cpuidle_device, apm_idle_devices);
+
+struct cpuidle_driver cpuidle_apm_driver = {
+	.name = "cpuidle_apm",
+};
+
+void __cpuinit setup_cpuidle_apm(void)
+{
+	struct cpuidle_device *dev;
+
+	if (!cpuidle_curr_driver)
+		cpuidle_register_driver(&cpuidle_apm_driver);
+
+	dev = &per_cpu(apm_idle_devices, smp_processor_id());
+	dev->cpu = smp_processor_id();
+	dev->states[0].enter = apm_cpu_idle;
+	dev->state_count = 1;
+	cpuidle_register_device(dev);
+}
+
+void exit_cpuidle_apm(void)
+{
+	struct cpuidle_device *dev;
+	int cpu;
+
+	for_each_online_cpu(cpu) {
+		dev = &per_cpu(apm_idle_devices, cpu);
+		cpuidle_unregister_device(dev);
+	}
+}
+
+
 /*
  * Just start the APM thread. We do NOT want to do APM BIOS
  * calls from anything but the APM thread, if for no other reason
@@ -2394,8 +2426,7 @@ static int __init apm_init(void)
 	if (HZ != 100)
 		idle_period = (idle_period * HZ) / 100;
 	if (idle_threshold < 100) {
-		original_pm_idle = pm_idle;
-		pm_idle = apm_cpu_idle;
+		setup_cpuidle_apm();
 		set_pm_idle = 1;
 	}
 
@@ -2407,7 +2438,7 @@ static void __exit apm_exit(void)
 	int error;
 
 	if (set_pm_idle) {
-		pm_idle = original_pm_idle;
+		exit_cpuidle_apm();
 		/*
 		 * We are about to unload the current idle thread pm callback
 		 * (pm_idle), Wait for all processors to update cached/local
Index: linux.trees.git/arch/x86/xen/setup.c
===================================================================
--- linux.trees.git.orig/arch/x86/xen/setup.c
+++ linux.trees.git/arch/x86/xen/setup.c
@@ -8,6 +8,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -151,6 +152,25 @@ void __cpuinit xen_enable_syscall(void)
 #endif /* CONFIG_X86_64 */
 }
 
+DEFINE_PER_CPU(struct cpuidle_device, idle_devices);
+struct cpuidle_driver cpuidle_xen_driver = {
+	.name = "cpuidle_xen",
+};
+
+void __cpuinit setup_cpuidle_xen(void)
+{
+	struct cpuidle_device *dev;
+
+	if (!cpuidle_curr_driver)
+		cpuidle_register_driver(&cpuidle_xen_driver);
+
+	dev = &per_cpu(idle_devices, smp_processor_id());
+	dev->cpu = smp_processor_id();
+	dev->states[0].enter = xen_idle;
+	dev->state_count = 1;
+	cpuidle_register_device(dev);
+}
+
 void __init xen_arch_setup(void)
 {
 	struct physdev_set_iopl set_iopl;
@@ -186,7 +206,7 @@ void __init xen_arch_setup(void)
 	       MAX_GUEST_CMDLINE > COMMAND_LINE_SIZE ?
 	       COMMAND_LINE_SIZE : MAX_GUEST_CMDLINE);
 
-	pm_idle = xen_idle;
+	setup_cpuidle_xen();
 
 	paravirt_disable_iospace();