From patchwork Wed Jun 3 23:14:33 2009
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kjetil Oftedal
X-Patchwork-Id: 28079
X-Patchwork-Delegate: davem@davemloft.net
Date: Thu, 4 Jun 2009 01:14:33 +0200 (CEST)
From: oftedal
To: Andrew Morton
cc: bugzilla-daemon@bugzilla.kernel.org, bugme-daemon@bugzilla.kernel.org,
 sparclinux@vger.kernel.org
Subject: Re: [Bugme-new] [Bug 13444] New: SparcServer 1000E SMP can cause
 kernel-nullpointer with some hw configurations
In-Reply-To: <20090603144241.99669d0e.akpm@linux-foundation.org>
References: <20090603144241.99669d0e.akpm@linux-foundation.org>
Sender: sparclinux-owner@vger.kernel.org
X-Mailing-List: sparclinux@vger.kernel.org

On Wed, 3 Jun 2009, Andrew Morton wrote:

>
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> On Wed, 3 Jun 2009 16:45:33 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
>
>> http://bugzilla.kernel.org/show_bug.cgi?id=13444
>>
>>            Summary: SparcServer 1000E SMP can cause kernel-nullpointer
>>                     with some hw configurations
>>            Product: Platform Specific/Hardware
>>            Version: 2.5
>>     Kernel Version: 2.4.37
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: SPARC32
>>         AssignedTo: zaitcev@yahoo.com
>>         ReportedBy: oftedal@gmail.com
>>         Regression: No
>>
>>
>> Created an attachment (id=21732)
>>  --> (http://bugzilla.kernel.org/attachment.cgi?id=21732)
>> SUN4D-SMP mode patch
>>
>> SparcServer 1000E/SUN4D machines will cause a kernel-nullpointer when running
>> in SMP-mode with certain hw configurations.
>> According to Sun documentation Slot A on the systemboards should be filled with
>> cpu modules first. This will fill physical cpu-slots 0,2,4,8 with cpus first.
>> The current smp code will then fill the init_tasks-array using the physical
>> cpu-mapping.
>> The scheduler on the other hand uses the logical cpu-mapping.
>> So in a two cpu system with two systemboards cpu-slots 0 and 2 will be occupied
>> and init_tasks-array position 0 and 2 will contain a idle_task. But the
>> scheduler will access init_tasks position 0 and 1 which will cause an error.
>> Any sparcserver 1000(E) system with 2 systemboards and 2 cpus, with 1 cpu per
>> board will cause this error.
>>
>
> Thanks, but handling patches via bugzilla is far from preferred.
>
> Please send the patch via email, as a reply-to-all to this email.  The
> patch should include a full description and a Signed-off-by:, as per
> Documentation/SubmittingPatches.
>
>

As mentioned in the bug report (#13444), the sun4d SMP code in 2.4-series
kernels (2.4.37) uses the physical cpu mapping when inserting tasks into
the init_tasks array, while the scheduler uses the logical cpu mapping.
This causes a kernel NULL pointer dereference whenever there are gaps in
the physical cpu mapping.

The attached patch indexes init_tasks with the cpucount variable instead.
Since cpucount is only incremented when a cpu is successfully started, it
always gives the next logical slot in the init_tasks array.

(I hope this is more compliant with kernel bug-fixing standards. I am new
to this.)

Signed-off-by: Kjetil Oftedal
---

diff --git a/arch/sparc/kernel/sun4d_smp.c b/arch/sparc/kernel/sun4d_smp.c
index 9bb7f78..5b9b34e 100644
--- a/arch/sparc/kernel/sun4d_smp.c
+++ b/arch/sparc/kernel/sun4d_smp.c
@@ -221,7 +221,9 @@ void __init smp4d_boot_cpus(void)
 			cpucount++;

 			p = init_task.prev_task;
-			init_tasks[i] = p;
+
+			/* The scheduler uses the logical cpu mapping when accessing this array */
+			init_tasks[cpucount] = p;

 			p->processor = i;
 			p->cpus_runnable = 1 << i; /* we schedule the first task manually */
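
For anyone following the thread without a sun4d at hand, the mismatch can be reproduced in user space. The following is only a minimal sketch, not kernel code: `struct task`, `fill_idle_tasks()`, `count_missing()` and the `NR_CPUS` value are hypothetical stand-ins invented for illustration, and physical slot 0 is assumed to be the boot cpu.

```c
#include <stddef.h>

#define NR_CPUS 8	/* hypothetical table size for this sketch */

struct task { int processor; };	/* stand-in for struct task_struct */

/*
 * Fill the idle-task table for the cpus marked in present[].
 * Slot 0 is assumed to be the boot cpu. With by_logical == 0 the table
 * is indexed by physical slot (old behaviour); otherwise it is indexed
 * by the running count of started cpus (patched behaviour).
 * Returns the number of secondary cpus started.
 */
static int fill_idle_tasks(const int present[NR_CPUS],
			   struct task pool[NR_CPUS],
			   struct task *tasks[NR_CPUS], int by_logical)
{
	int cpucount = 0;

	for (int i = 0; i < NR_CPUS; i++)
		tasks[i] = NULL;

	pool[0].processor = 0;
	tasks[0] = &pool[0];		/* boot cpu */

	for (int i = 1; i < NR_CPUS; i++) {
		if (!present[i])
			continue;
		cpucount++;
		pool[i].processor = i;	/* the physical id is still recorded */
		tasks[by_logical ? cpucount : i] = &pool[i];
	}
	return cpucount;
}

/* Scheduler-style walk: logical cpus 0..ncpus-1 should all have an idle task. */
static int count_missing(struct task *tasks[NR_CPUS], int ncpus)
{
	int missing = 0;

	for (int cpu = 0; cpu < ncpus; cpu++)
		if (tasks[cpu] == NULL)
			missing++;
	return missing;
}
```

With cpus in physical slots 0 and 2, calling `fill_idle_tasks()` with `by_logical == 0` leaves logical slot 1 empty, which is the entry the scheduler would dereference. With `by_logical == 1` both logical slots are populated, and the entry in slot 1 still carries physical id 2.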