Patchwork 2.6.31-git5 kernel boot hangs on powerpc

Submitter Tejun Heo
Date Sept. 25, 2009, 7:39 a.m.
Message ID <4ABC73C7.20403@kernel.org>
Permalink /patch/34261/
State Not Applicable

Comments

Tejun Heo - Sept. 25, 2009, 7:39 a.m.
Hello,

Sachin Sant wrote:
> <4>PERCPU: chunk 1 relocating -1 -> 18 c0000000db70fb00
> <c0000000db70fb00:c0000000db70fb00>
> <4>PERCPU: relocated <c000000001120320:c000000001120320>
> <4>PERCPU: chunk 1 relocating 18 -> 16 c0000000db70fb00
> <c000000001120320:c000000001120320>
> <4>PERCPU: relocated <c000000001120300:c000000001120300>
> <4>PERCPU: chunk 1, alloc pages [0,1)
> <4>PERCPU: chunk 1, map pages [0,1)
> <4>PERCPU: map 0xd00007fffff00000, 1 pages 53544
> <4>PERCPU: map 0xd00007fffff80000, 1 pages 53545
> <4>PERCPU: chunk 1, will clear 4096b/unit d00007fffff00000 d00007fffff80000
> <3>INFO: RCU detected CPU 0 stall (t=1000 jiffies)

This supports my hypothesis.  This is the first area being allocated
from a dynamic chunk and cleared.  PFN 53544 and 53545 have been
allocated and successfully mapped to 0xd00007fffff00000 and
0xd00007fffff80000 using map_kernel_range_noflush(), but when those
addresses are actually accessed, we end up with infinite faults.  The
fault handler probably thinks that the fault has been handled
correctly, but when control is returned, the processor faults
again.  Benjamin, I'm way out of my depth here; can you please help?

Oh, one more simple experiment.  Sachin, does the following patch make
any difference?
Benjamin Herrenschmidt - Sept. 25, 2009, 8:31 a.m.
On Fri, 2009-09-25 at 16:39 +0900, Tejun Heo wrote:
> Hello,
> 
> Sachin Sant wrote:
> > <4>PERCPU: chunk 1 relocating -1 -> 18 c0000000db70fb00
> > <c0000000db70fb00:c0000000db70fb00>
> > <4>PERCPU: relocated <c000000001120320:c000000001120320>
> > <4>PERCPU: chunk 1 relocating 18 -> 16 c0000000db70fb00
> > <c000000001120320:c000000001120320>
> > <4>PERCPU: relocated <c000000001120300:c000000001120300>
> > <4>PERCPU: chunk 1, alloc pages [0,1)
> > <4>PERCPU: chunk 1, map pages [0,1)
> > <4>PERCPU: map 0xd00007fffff00000, 1 pages 53544
> > <4>PERCPU: map 0xd00007fffff80000, 1 pages 53545
> > <4>PERCPU: chunk 1, will clear 4096b/unit d00007fffff00000 d00007fffff80000
> > <3>INFO: RCU detected CPU 0 stall (t=1000 jiffies)
> 
> This supports my hypothesis.  This is the first area being allocated
> from a dynamic chunk and cleared.  PFN 53544 and 53545 have been
> allocated and successfully mapped to 0xd00007fffff00000 and
> 0xd00007fffff80000 using map_kernel_range_noflush(), but when those
> addresses are actually accessed, we end up with infinite faults.  The
> fault handler probably thinks that the fault has been handled
> correctly, but when control is returned, the processor faults
> again.  Benjamin, I'm way out of my depth here; can you please help?

Definitely looks like a powerpc mm problem. I'll have a look on Monday.

Cheers,
Ben.

> Oh, one more simple experiment.  Sachin, does the following patch make
> any difference?
> 
> diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> index 69511e6..93d29eb 100644
> --- a/mm/vmalloc.c
> +++ b/mm/vmalloc.c
> @@ -2102,7 +2102,8 @@ struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
>  				     size_t align, gfp_t gfp_mask)
>  {
>  	const unsigned long vmalloc_start = ALIGN(VMALLOC_START, align);
> -	const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
> +	//const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
> +	const unsigned long vmalloc_end = vmalloc_start + (512 << 20);
>  	struct vmap_area **vas, *prev, *next;
>  	struct vm_struct **vms;
>  	int area, area2, last_area, term_area;
> 
>

Patch

diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 69511e6..93d29eb 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -2102,7 +2102,8 @@  struct vm_struct **pcpu_get_vm_areas(const unsigned long *offsets,
 				     size_t align, gfp_t gfp_mask)
 {
 	const unsigned long vmalloc_start = ALIGN(VMALLOC_START, align);
-	const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
+	//const unsigned long vmalloc_end = VMALLOC_END & ~(align - 1);
+	const unsigned long vmalloc_end = vmalloc_start + (512 << 20);
 	struct vmap_area **vas, *prev, *next;
 	struct vm_struct **vms;
 	int area, area2, last_area, term_area;
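
The arithmetic behind the experimental patch above can be sketched in
user space.  The original line rounds VMALLOC_END down to a multiple of
align, while the patched line caps the search window at 512 MiB
(512 << 20 bytes) above the aligned start, presumably to check whether
the very high addresses seen in the log are part of the problem.  The
helper names below (align_up, align_down, window_end_original,
window_end_patched) are hypothetical and not kernel functions; any
start, end, or alignment values passed to them are illustrative
assumptions, not real powerpc layout constants.

```c
#include <assert.h>
#include <stdint.h>

/* Round x up to the next multiple of a; a must be a power of two.
 * Mirrors what the kernel's ALIGN() macro does for such alignments. */
static uint64_t align_up(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return (x + (a - 1)) & ~(a - 1);
}

/* Round x down to a multiple of a, as in "VMALLOC_END & ~(align - 1)". */
static uint64_t align_down(uint64_t x, uint64_t a)
{
	assert(a != 0 && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

/* End of the search window as computed by the original line. */
static uint64_t window_end_original(uint64_t vmalloc_end, uint64_t align)
{
	return align_down(vmalloc_end, align);
}

/* End of the search window as computed by the patched line:
 * aligned start plus 512 MiB. */
static uint64_t window_end_patched(uint64_t vmalloc_start_raw, uint64_t align)
{
	return align_up(vmalloc_start_raw, align) + ((uint64_t)512 << 20);
}
```

With a made-up start of 0xd000000000001000 and a 1 MiB alignment, the
patched window ends 512 MiB above the first 1 MiB boundary past that
start, regardless of how large the real vmalloc range is.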