Message ID | 529C0614.6070708@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Not Applicable |
On 02.12.2013, at 05:01, Preeti U Murthy <preeti@linux.vnet.ibm.com> wrote:

> Hi,
>
> On 11/30/2013 11:15 PM, Alexander Graf wrote:
>> Hi Ben,
>>
>> With current linus master (3.13-rc2+) I'm facing an interesting issue with
>> SMT disabling on p7. When I trigger the cpu offlining it works as expected,
>> but after a few seconds the machine goes into an oops as you can see below.
>>
>> It looks like a null pointer dereference.
>
> tip/sched/urgent has the below fix. Can you please apply the following it and
> check if the issue gets resolved? A similar issue was reported earlier as

I've disabled NO_HZ now on that machine which also "fixed" it for me.
Unfortunately I can't reboot that box for at least the next week now to test
whether the patch does fix the issue.

Alex
Hi,

On 12/02/2013 03:27 PM, Alexander Graf wrote:
>
> On 02.12.2013, at 05:01, Preeti U Murthy <preeti@linux.vnet.ibm.com> wrote:
>
>> Hi,
>>
>> On 11/30/2013 11:15 PM, Alexander Graf wrote:
>>> Hi Ben,
>>>
>>> With current linus master (3.13-rc2+) I'm facing an interesting issue with
>>> SMT disabling on p7. When I trigger the cpu offlining it works as expected,
>>> but after a few seconds the machine goes into an oops as you can see below.
>>>
>>> It looks like a null pointer dereference.
>>
>> tip/sched/urgent has the below fix. Can you please apply the following it and
>> check if the issue gets resolved? A similar issue was reported earlier as
>
> I've disabled NO_HZ now on that machine which also "fixed" it for me.
> Unfortunately I can't reboot that box for at least the next week now to test
> whether the patch does fix the issue.

The commit 37dc6b50cee9 that has caused this regression is around NO_HZ. It
decides when to kick nohz idle balancing.

Regards
Preeti U Murthy
>
>
> Alex
>
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c1808606ee5f..a1591ca7eb5a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4910,8 +4910,9 @@ static void update_top_cache_domain(int cpu)
 	if (sd) {
 		id = cpumask_first(sched_domain_span(sd));
 		size = cpumask_weight(sched_domain_span(sd));
-		rcu_assign_pointer(per_cpu(sd_busy, cpu), sd->parent);
+		sd = sd->parent; /* sd_busy */
 	}
+	rcu_assign_pointer(per_cpu(sd_busy, cpu), sd);
 
 	rcu_assign_pointer(per_cpu(sd_llc, cpu), sd);
 	per_cpu(sd_llc_size, cpu) = size;
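
For readers following the thread, here is a minimal user-space sketch of what the hunk above appears to change, going by the diff alone: before the patch, the per-CPU sd_busy pointer was only written when a domain was found, so it could keep an old value; after the patch it is always written, and becomes NULL when no domain is found. Everything in the sketch is a hypothetical stand-in (struct domain, update_cached_old, update_cached_new); it does not model RCU, per-CPU data, or the real struct sched_domain, and it is not the kernel code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct sched_domain; only the parent link matters. */
struct domain {
	struct domain *parent;
};

/*
 * Models the pre-patch logic: the cached pointer is written only when
 * a domain was found, so a previous value survives when sd is NULL.
 */
static void update_cached_old(struct domain **cached, struct domain *sd)
{
	assert(cached != NULL);
	if (sd)
		*cached = sd->parent;
	/* sd == NULL: *cached is left untouched (possibly stale). */
}

/*
 * Models the post-patch logic: the cached pointer is always written,
 * and ends up NULL when no domain was found.
 */
static void update_cached_new(struct domain **cached, struct domain *sd)
{
	assert(cached != NULL);
	if (sd)
		sd = sd->parent;
	*cached = sd;
}
```

In this model, calling update_cached_old() with a NULL domain after a successful update leaves the earlier parent pointer in place, while update_cached_new() resets it to NULL, so a reader of the cached pointer can tell that no domain is attached any more.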