[powerpc] Next tree Nov 2 : kernel BUG at mm/mmap.c:2135!

Message ID 20091113021048.GA4865@yookeroo.seuss (mailing list archive)
State Not Applicable

Commit Message

David Gibson Nov. 13, 2009, 2:10 a.m. UTC
On Fri, Nov 13, 2009 at 12:37:29PM +1100, David Gibson wrote:
> On Thu, Nov 12, 2009 at 04:46:40PM +0530, Sachin Sant wrote:
> > David Gibson wrote:
> > >On Wed, Nov 04, 2009 at 06:08:44PM +0530, Sachin Sant wrote:
> > >>Sachin Sant wrote:
> > >>>Today's next tree failed to boot on a POWER 6 box with :
> > >>>
> > >>>------------[ cut here ]------------
> > >>>kernel BUG at mm/mmap.c:2135!
> > >>>Oops: Exception in kernel mode, sig: 5 [#2]
> > >>>SMP NR_CPUS=1024 NUMA pSeries
> > >>Problem exists with today's next as well.
> > >>
> > >>Likely cause for this problem seems to be the following commit.
> > >>If I revert this patch the machine boots fine.
> > >>
> > >>commit a0668cdc154e54bf0c85182e0535eea237d53146
> > >>powerpc/mm: Cleanup management of kmem_caches for pagetables
> > >
> > >Ugh.  Ok, it's not at all obvious how my patch could cause this bug.
> > >Can you send your .config?
> > >
> > Still present in today's next.
> 
> Sorry, I've been sidetracked by other issues and have only managed to
> look into this today.  My initial attempts to reproduce the bug with
> your config on both POWER6 and POWER5+ have failed though.
> 
> Is it possible to get the complete boot log from this system - not
> just the [cut here] section around the BUG()?  This should help to
> determine exactly when during boot the bug is being triggered.

Also, could you try booting the kernel with the patch below, which
should give a bit more information about the problem.

Comments

Sachin P. Sant Nov. 13, 2009, 9:35 a.m. UTC | #1
David Gibson wrote:
> Also, could you try booting the kernel with the patch below, which
> should give a bit more information about the problem.
>
> Index: working-2.6/mm/mmap.c
> ===================================================================
> --- working-2.6.orig/mm/mmap.c	2009-11-13 13:08:29.000000000 +1100
> +++ working-2.6/mm/mmap.c	2009-11-13 13:09:26.000000000 +1100
> @@ -2136,6 +2136,8 @@ void exit_mmap(struct mm_struct *mm)
>  	while (vma)
>  		vma = remove_vma(vma);
>
> +	if (mm->nr_ptes != 0)
> +		printk("exit_mmap(): mm %p nr_ptes %lu\n", mm, mm->nr_ptes);
>  	BUG_ON(mm->nr_ptes > (FIRST_USER_ADDRESS+PMD_SIZE-1)>>PMD_SHIFT);
>  }
>   
Here is the information collected with today's next.
(2.6.32-rc7-20091113)

------------[ cut here ]------------
kernel BUG at mm/mmap.c:2139!
cpu 0x3: Vector: 700 (Program Check) at [c0000000fae1b7e0]
    pc: c000000000150e88: .exit_mmap+0x1ac/0x1d4
    lr: c000000000150e78: .exit_mmap+0x19c/0x1d4
    sp: c0000000fae1ba60
   msr: 8000000000029032
  current = 0xc0000000fada8be0
  paca    = 0xc000000000bb2c00
    pid   = 84, comm = cat
kernel BUG at mm/mmap.c:2139!
enter ? for help
[c0000000fae1bb10] c000000000093d24 .mmput+0x54/0x164
[c0000000fae1bba0] c000000000098f30 .exit_mm+0x17c/0x1a0
[c0000000fae1bc50] c00000000009b310 .do_exit+0x248/0x784
[c0000000fae1bd30] c00000000009b900 .do_group_exit+0xb4/0xe8
[c0000000fae1bdc0] c00000000009b948 .SyS_exit_group+0x14/0x28
[c0000000fae1be30] c0000000000085b4 syscall_exit+0x0/0x40
--- Exception: c01 (System Call) at 00000fff89a8ff40
SP (fffdf8a2460) is in userspace

Have attached the complete boot log.

At the time of the crash, the values of mm and mm->nr_ptes were:

<7>exit_mmap(): mm c0000000fa9f9580 nr_ptes 1

Thanks
-Sachin
Patch

Index: working-2.6/mm/mmap.c
===================================================================
--- working-2.6.orig/mm/mmap.c	2009-11-13 13:08:29.000000000 +1100
+++ working-2.6/mm/mmap.c	2009-11-13 13:09:26.000000000 +1100
@@ -2136,6 +2136,8 @@  void exit_mmap(struct mm_struct *mm)
 	while (vma)
 		vma = remove_vma(vma);
 
+	if (mm->nr_ptes != 0)
+		printk("exit_mmap(): mm %p nr_ptes %lu\n", mm, mm->nr_ptes);
 	BUG_ON(mm->nr_ptes > (FIRST_USER_ADDRESS+PMD_SIZE-1)>>PMD_SHIFT);
 }
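
For reference, the BUG_ON() condition in exit_mmap() can be evaluated
outside the kernel to see why the reported nr_ptes of 1 trips it.  The
sketch below is a userspace approximation only.  It assumes
FIRST_USER_ADDRESS is 0 and uses an illustrative PMD_SHIFT of 21;
neither value is stated in this thread, and the helper names are
hypothetical.  With FIRST_USER_ADDRESS at 0 the limit works out to 0 for
any PMD_SHIFT, so a single leftover page table page is enough to hit
the BUG.

```c
/* Assumed values for illustration; not taken from the thread. */
#define FIRST_USER_ADDRESS	0UL
#define PMD_SHIFT		21
#define PMD_SIZE		(1UL << PMD_SHIFT)

/*
 * Right-hand side of the BUG_ON() comparison: the number of page
 * table pages that may legitimately remain when the mm is torn down.
 */
static unsigned long exit_mmap_pte_limit(void)
{
	return (FIRST_USER_ADDRESS + PMD_SIZE - 1) >> PMD_SHIFT;
}

/* Same comparison as the BUG_ON() in exit_mmap(); returns 1 if it fires. */
static int exit_mmap_would_bug(unsigned long nr_ptes)
{
	return nr_ptes > exit_mmap_pte_limit();
}
```

Calling exit_mmap_would_bug(1) under these assumptions returns 1, which
matches the "nr_ptes 1" value printed just before the BUG.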