Message ID | 20160615113249.GH30909@twins.programming.kicks-ass.net (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Wed, Jun 15, 2016 at 01:32:49PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 15, 2016 at 03:49:36PM +0530, Gautham R Shenoy wrote:
>
> > Also, with the first patch in the series (which ensures that
> > restore_unbound_workers_cpumask() is called *after* the new workers
> > for the newly onlined CPUs are created) and without this one, you can
> > reproduce this WARN_ON on both x86 and PPC by offlining all the CPUs
> > of a node and bringing just one of them online.
>
> Ah good.
>
> > I am not sure about that. The workqueue creates unbound workers for a
> > node via wq_update_unbound_numa() whenever the first CPU of every node
> > comes online. So that seems legitimate. It then tries to affine these
> > workers to the cpumask of that node. Again this seems right. As an
> > optimization, it does this only when the first CPU of the node comes
> > online. Since this online CPU is not yet active, and since
> > nr_cpus_allowed > 1, we will hit the WARN_ON().
>
> So I had another look, and isn't the below a much simpler solution?
>
> It seems to work on my x86 with:
>
>   for i in /sys/devices/system/cpu/cpu*/online ; do echo 0 > $i ; done
>   for i in /sys/devices/system/cpu/cpu*/online ; do echo 1 > $i ; done
>
> without complaint.

Yup. This will work on PPC as well.

We will no longer have the optimization in
restore_unbound_workers_cpumask() but I suppose we don't lose much by
resetting the affinity every time a CPU in the pool->attrs->cpumask
comes online.

--
Thanks and Regards
gautham.
On Wed, Jun 15, 2016 at 06:20:33PM +0530, Gautham R Shenoy wrote:
> We will no longer have the optimization in
> restore_unbound_workers_cpumask() but I suppose we don't lose much by
> resetting the affinity every time a CPU in the pool->attrs->cpumask
> comes online.

Right; optimizing hotplug really isn't worth it. The code needs to be
simple and robust (ha! funny).

In any case, Tejun, does this work for you?
Hello,

On Wed, Jun 15, 2016 at 03:14:15PM +0200, Peter Zijlstra wrote:
> On Wed, Jun 15, 2016 at 06:20:33PM +0530, Gautham R Shenoy wrote:
> > We will no longer have the optimization in
> > restore_unbound_workers_cpumask() but I suppose we don't lose much by
> > resetting the affinity every time a CPU in the pool->attrs->cpumask
> > comes online.
>
> Right; optimizing hotplug really isn't worth it. The code needs to be
> simple and robust (ha! funny).

The only case it might matter is CPU hotplug being used aggressively
for power saving. No idea how popular that is now, though.
set_cpus_allowed isn't that expensive, and phones don't tend to have a
massive number of kworkers, so hopefully it won't show up.

> In any case, Tejun, does this work for you?

I'm not sure about the reordering part, but for setting affinity on
each onlining, no objection. If it ever shows up as a performance /
power regression, we can revisit it later.

Thanks.
On Wed, 2016-06-15 at 12:01 -0400, Tejun Heo wrote:
> On Wed, Jun 15, 2016 at 03:14:15PM +0200, Peter Zijlstra wrote:
> > On Wed, Jun 15, 2016 at 06:20:33PM +0530, Gautham R Shenoy wrote:
> > > We will no longer have the optimization in
> > > restore_unbound_workers_cpumask() but I suppose we don't lose much by
> > > resetting the affinity every time a CPU in the pool->attrs->cpumask
> > > comes online.
> >
> > Right; optimizing hotplug really isn't worth it. The code needs to be
> > simple and robust (ha! funny).
>
> The only case it might matter is CPU hotplug being used aggressively
> for power saving. No idea how popular that is now, though.
> set_cpus_allowed isn't that expensive, and phones don't tend to have a
> massive number of kworkers, so hopefully it won't show up.
>
> > In any case, Tejun, does this work for you?
>
> I'm not sure about the reordering part, but for setting affinity on
> each onlining, no objection. If it ever shows up as a performance /
> power regression, we can revisit it later.

Peterz, do you want to send a SOB'ed patch, or can we take what you
posted and add your SOB?

And Tejun, are you happy to merge this for rc4?

cheers
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index e1c0e99..09c9160 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4600,15 +4600,11 @@ static void restore_unbound_workers_cpumask(struct worker_pool *pool, int cpu)
 	if (!cpumask_test_cpu(cpu, pool->attrs->cpumask))
 		return;
 
-	/* is @cpu the only online CPU? */
 	cpumask_and(&cpumask, pool->attrs->cpumask, cpu_online_mask);
-	if (cpumask_weight(&cpumask) != 1)
-		return;
 
 	/* as we're called from CPU_ONLINE, the following shouldn't fail */
 	for_each_pool_worker(worker, pool)
-		WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task,
-						  pool->attrs->cpumask) < 0);
+		WARN_ON_ONCE(set_cpus_allowed_ptr(worker->task, &cpumask) < 0);
 }
 
 /*
@@ -4638,6 +4634,10 @@ static int workqueue_cpu_up_callback(struct notifier_block *nfb,
 	case CPU_ONLINE:
 		mutex_lock(&wq_pool_mutex);
 
+		/* update NUMA affinity of unbound workqueues */
+		list_for_each_entry(wq, &workqueues, list)
+			wq_update_unbound_numa(wq, cpu, true);
+
 		for_each_pool(pool, pi) {
 			mutex_lock(&pool->attach_mutex);
 
@@ -4649,10 +4649,6 @@ static int workqueue_cpu_up_callback(struct notifier_block *nfb,
 			mutex_unlock(&pool->attach_mutex);
 		}
 
-		/* update NUMA affinity of unbound workqueues */
-		list_for_each_entry(wq, &workqueues, list)
-			wq_update_unbound_numa(wq, cpu, true);
-
 		mutex_unlock(&wq_pool_mutex);
 		break;
 	}