Message ID | 20180109101810.2471D6C6CF@localhost.localdomain (mailing list archive) |
---|---|
State | Superseded |
Series | [v2] powerpc/mm: Fix growth direction for hugepages mmaps with slice |
Christophe Leroy <christophe.leroy@c-s.fr> writes:

> An application running with libhugetlbfs fails to allocate
> additional pages to HEAP due to the hugemap being done
> unconditionally as topdown mapping:
>
> mmap(0x10080000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x73e80000
> [...]
> mmap(0x74000000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d80000
> munmap(0x73d80000, 1048576) = 0
> [...]
> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
> munmap(0x73d00000, 1572864) = 0
> [...]
> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
> munmap(0x73d00000, 1572864) = 0
> [...]
>

Can you explain the failure details above? I am not sure I understand
what to read from the above output.

> As one can see from the above strace log, mmap() allocates further
> pages below the initial one.
>
> This patch fixes it by taking into account MAP_GROWSDOWN flag.

The rest of the kernel doesn't depend on that flag to decide whether to
do a topdown search. So what is special about hugetlb? We select a
bottom-up search only when legacy mmap is selected. Hugetlb on ppc64 has
always done a topdown search.

>
> Fixes: d0f13e3c20b6f ("[POWERPC] Introduce address space "slices" ")
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
> ---
> v2: Added missing include
>
>  arch/powerpc/mm/hugetlbpage.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
> index 79e1378ee303..0eadf9f199de 100644
> --- a/arch/powerpc/mm/hugetlbpage.c
> +++ b/arch/powerpc/mm/hugetlbpage.c
> @@ -19,6 +19,7 @@
>  #include <linux/moduleparam.h>
>  #include <linux/swap.h>
>  #include <linux/swapops.h>
> +#include <linux/mman.h>
>  #include <asm/pgtable.h>
>  #include <asm/pgalloc.h>
>  #include <asm/tlb.h>
> @@ -558,7 +559,8 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>  		return radix__hugetlb_get_unmapped_area(file, addr, len,
>  						       pgoff, flags);
>  #endif
> -	return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1);
> +	return slice_get_unmapped_area(addr, len, flags, mmu_psize,
> +				       flags & MAP_GROWSDOWN);
>  }
>  #endif
>
> --
> 2.13.3
On 16/01/2018 at 17:03, Aneesh Kumar K.V wrote:
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>
>> An application running with libhugetlbfs fails to allocate
>> additional pages to HEAP due to the hugemap being done
>> unconditionally as topdown mapping:
>>
>> mmap(0x10080000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x73e80000
>> [...]
>> mmap(0x74000000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d80000
>> munmap(0x73d80000, 1048576) = 0
>> [...]
>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
>> munmap(0x73d00000, 1572864) = 0
>> [...]
>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
>> munmap(0x73d00000, 1572864) = 0
>> [...]
>>
>
> Can you explain the failure details above? I am not sure I understand
> what to read from the above output.

libhugetlbfs first requests an area of 1.5 Mbytes at address 0x10080000.
mmap() returns an area at address 0x73e80000.

Then libhugetlbfs requests an additional area on top of that, i.e. at
address 0x74000000, to expand the heap.
But mmap() returns an area at address 0x73d80000, i.e. under the previous
area.

This is not the behaviour when using the generic (i.e. without mm_slices)
hugepages code, and this is not what libhugetlbfs expects for expanding
the heap.

>
>> As one can see from the above strace log, mmap() allocates further
>> pages below the initial one.
>>
>> This patch fixes it by taking into account MAP_GROWSDOWN flag.
>
> The rest of the kernel doesn't depend on that flag to decide whether to
> do a topdown search. So what is special about hugetlb? We select a
> bottom-up search only when legacy mmap is selected. Hugetlb on ppc64 has
> always done a topdown search.

The generic hugepage code does a bottom-up search.
The first page is allocated at address 0x30000000 and the following pages
are allocated at the requested addresses, so libhugetlbfs has no issue
expanding the heap when required.

>
>>
>> Fixes: d0f13e3c20b6f ("[POWERPC] Introduce address space "slices" ")
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>> ---
>> v2: Added missing include
>>
>>  arch/powerpc/mm/hugetlbpage.c | 4 +++-
>>  1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
>> index 79e1378ee303..0eadf9f199de 100644
>> --- a/arch/powerpc/mm/hugetlbpage.c
>> +++ b/arch/powerpc/mm/hugetlbpage.c
>> @@ -19,6 +19,7 @@
>>  #include <linux/moduleparam.h>
>>  #include <linux/swap.h>
>>  #include <linux/swapops.h>
>> +#include <linux/mman.h>
>>  #include <asm/pgtable.h>
>>  #include <asm/pgalloc.h>
>>  #include <asm/tlb.h>
>> @@ -558,7 +559,8 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>>  		return radix__hugetlb_get_unmapped_area(file, addr, len,
>>  						       pgoff, flags);
>>  #endif
>> -	return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1);
>> +	return slice_get_unmapped_area(addr, len, flags, mmu_psize,
>> +				       flags & MAP_GROWSDOWN);
>>  }
>>  #endif
>>
>> --
>> 2.13.3
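
The addresses in the strace log can be cross-checked with a little arithmetic. What follows is only a minimal sketch for the reader, not code from the patch or from libhugetlbfs: the helper names `heap_top()` and `is_directly_below()` are made up for illustration, and the constants are copied from the log quoted in this thread.

```c
#include <stdbool.h>

/* Values copied from the strace log in this thread. */
#define HEAP_BASE 0x73e80000UL	/* address returned by the first mmap() */
#define HEAP_LEN  1572864UL	/* 1.5 Mbytes, i.e. three 512k huge pages */

/*
 * Address at which the next heap segment has to start for the heap
 * to remain one contiguous area (hypothetical helper).
 */
static unsigned long heap_top(unsigned long base, unsigned long len)
{
	return base + len;
}

/*
 * True when [addr, addr + len) ends exactly where the area starting
 * at `base` begins, i.e. the new area sits just under the old one
 * (hypothetical helper).
 */
static bool is_directly_below(unsigned long addr, unsigned long len,
			      unsigned long base)
{
	return addr + len == base;
}
```

With these values, `heap_top(HEAP_BASE, HEAP_LEN)` evaluates to 0x74000000, which is the hint passed to the later mmap() calls, while `is_directly_below()` holds for both returned areas (0x73d80000 with 1048576 bytes, and 0x73d00000 with 1572864 bytes): each one ends exactly where the first heap area starts, so the heap cannot be extended upwards.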
On 01/16/2018 10:18 PM, Christophe LEROY wrote:
>
>
> On 16/01/2018 at 17:03, Aneesh Kumar K.V wrote:
>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>
>>> An application running with libhugetlbfs fails to allocate
>>> additional pages to HEAP due to the hugemap being done
>>> unconditionally as topdown mapping:
>>>
>>> mmap(0x10080000, 1572864, PROT_READ|PROT_WRITE,
>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x73e80000
>>> [...]
>>> mmap(0x74000000, 1048576, PROT_READ|PROT_WRITE,
>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d80000
>>> munmap(0x73d80000, 1048576) = 0
>>> [...]
>>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE,
>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
>>> munmap(0x73d00000, 1572864) = 0
>>> [...]
>>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE,
>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
>>> munmap(0x73d00000, 1572864) = 0
>>> [...]
>>>
>>
>> Can you explain the failure details above? I am not sure I understand
>> what to read from the above output.
>
> libhugetlbfs first requests an area of 1.5 Mbytes at address
> 0x10080000.
> mmap() returns an area at address 0x73e80000.
>
> Then libhugetlbfs requests an additional area on top of that, i.e. at
> address 0x74000000, to expand the heap.
> But mmap() returns an area at address 0x73d80000, i.e. under the previous
> area.
>

Can you share the test details? Why does it not fail on book3s64? We use
topdown search with book3s64.

> This is not the behaviour when using the generic (i.e. without mm_slices)
> hugepages code, and this is not what libhugetlbfs expects for expanding
> the heap.
>
>

-aneesh
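
To make the topdown versus bottom-up distinction discussed here concrete, below is a toy first-fit search over a list of occupied ranges. It is a hedged sketch only: it does not model the kernel's slice code or `slice_get_unmapped_area()`, it ignores the address hint, and `struct range`, `find_free()` and the search window used in the example are assumptions chosen for illustration.

```c
#include <stddef.h>

/* One occupied address range, [start, end). */
struct range {
	unsigned long start;
	unsigned long end;
};

/*
 * Toy first-fit search for a free area of `len` bytes inside [lo, hi).
 * `used` must be sorted by address and must not overlap.
 * Returns the start of the area found, or 0 when nothing fits.
 */
static unsigned long find_free(const struct range *used, size_t n,
			       unsigned long lo, unsigned long hi,
			       unsigned long len, int topdown)
{
	size_t i;

	if (topdown) {
		unsigned long gap_end = hi;	/* top of the gap being examined */

		/* Walk the occupied ranges from the highest to the lowest. */
		for (i = n; i-- > 0; ) {
			if (gap_end - used[i].end >= len)
				return gap_end - len;
			gap_end = used[i].start;
		}
		return (gap_end - lo >= len) ? gap_end - len : 0;
	} else {
		unsigned long gap_start = lo;	/* bottom of the gap being examined */

		/* Walk the occupied ranges from the lowest to the highest. */
		for (i = 0; i < n; i++) {
			if (used[i].start - gap_start >= len)
				return gap_start;
			gap_start = used[i].end;
		}
		return (hi - gap_start >= len) ? gap_start : 0;
	}
}
```

As an example, take an illustrative window of [0x30000000, 0x74000000) with the first heap area [0x73e80000, 0x74000000) already occupied. A topdown search for 0x100000 bytes then lands at 0x73d80000, just under the existing area, whereas a bottom-up search returns 0x30000000. The window bounds were picked so that the toy reproduces the addresses seen in the thread; the real limits used by the kernel are not stated here.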
On 17/01/2018 at 04:19, Aneesh Kumar K.V wrote:
>
>
> On 01/16/2018 10:18 PM, Christophe LEROY wrote:
>>
>>
>> On 16/01/2018 at 17:03, Aneesh Kumar K.V wrote:
>>> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>>
>>>> An application running with libhugetlbfs fails to allocate
>>>> additional pages to HEAP due to the hugemap being done
>>>> unconditionally as topdown mapping:
>>>>
>>>> mmap(0x10080000, 1572864, PROT_READ|PROT_WRITE,
>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x73e80000
>>>> [...]
>>>> mmap(0x74000000, 1048576, PROT_READ|PROT_WRITE,
>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d80000
>>>> munmap(0x73d80000, 1048576) = 0
>>>> [...]
>>>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE,
>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
>>>> munmap(0x73d00000, 1572864) = 0
>>>> [...]
>>>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE,
>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
>>>> munmap(0x73d00000, 1572864) = 0
>>>> [...]
>>>>
>>>
>>> Can you explain the failure details above? I am not sure I understand
>>> what to read from the above output.
>>
>> libhugetlbfs first requests an area of 1.5 Mbytes at address
>> 0x10080000.
>> mmap() returns an area at address 0x73e80000.
>>
>> Then libhugetlbfs requests an additional area on top of that, i.e. at
>> address 0x74000000, to expand the heap.
>> But mmap() returns an area at address 0x73d80000, i.e. under the
>> previous area.
>>
>
>
> Can you share the test details? Why does it not fail on book3s64? We
> use topdown search with book3s64.

I don't know about book3s64, I only have 8xx.

Here is my test app:

#include <sys/mman.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

/* Dump the content of /proc/<pid>/maps to stderr */
static void dump_maps(const char *filename)
{
	char buf[16384];
	int fd, r;

	fd = open(filename, O_RDONLY);
	if (fd < 0)
		return;
	/* Leave room for the terminating NUL */
	r = read(fd, buf, sizeof(buf) - 1);
	close(fd);
	if (r < 0)
		r = 0;
	buf[r] = 0;
	fputs(buf, stderr);
	fputc('\n', stderr);
}

int main()
{
	char *p;
	char filename[32];
	int i;

	sprintf(filename, "/proc/%d/maps", getpid());
	dump_maps(filename);

	/* Three allocations of 1 Mbyte each */
	for (i = 0; i < 3; i++) {
		p = malloc(1024*1024);
		fprintf(stderr, "\nAllocated 1Mbytes at %p\n\n", p);
	}

	dump_maps(filename);

	exit(0);
}

It is linked with -lhugetlbfs (version 2.20).

My 8xx board is configured with 64 huge pages, default size 512k:

root@vgoip:~# cat /proc/meminfo
MemTotal:         123664 kB
MemFree:           58464 kB
MemAvailable:      67904 kB
Buffers:               0 kB
Cached:            14480 kB
SwapCached:            0 kB
Active:            11616 kB
Inactive:           7872 kB
Active(anon):       7584 kB
Inactive(anon):      240 kB
Active(file):       4032 kB
Inactive(file):     7632 kB
Unevictable:        2560 kB
Mlocked:            2560 kB
SwapTotal:             0 kB
SwapFree:              0 kB
Dirty:                 0 kB
Writeback:             0 kB
AnonPages:          7568 kB
Mapped:             7456 kB
Shmem:               736 kB
Slab:               7456 kB
SReclaimable:       3120 kB
SUnreclaim:         4336 kB
KernelStack:         568 kB
PageTables:         1024 kB
NFS_Unstable:          0 kB
Bounce:                0 kB
WritebackTmp:          0 kB
CommitLimit:       45440 kB
Committed_AS:      38880 kB
VmallocTotal:     866304 kB
VmallocUsed:           0 kB
VmallocChunk:          0 kB
HugePages_Total:      64
HugePages_Free:       64
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:        512 kB

Without the patch, my test app gives the following results. As you can
see, the second and third malloc() return addresses that are not in a
hugepage. strace shows that libhugetlbfs falls back on regular mmap
because the hugepage has not been allocated at the requested address.

root@vgoip:~# HUGETLB_MORECORE=yes ./huge_malloc_test
00100000-00108000 r-xp 00000000 00:00 0          [vdso]
0fde4000-0fde8000 r-xp 00000000 00:0f 168        /lib/libdl-2.23.so
0fde8000-0fe00000 ---p 00004000 00:0f 168        /lib/libdl-2.23.so
0fe00000-0fe04000 r--p 0000c000 00:0f 168        /lib/libdl-2.23.so
0fe04000-0fe08000 rwxp 00010000 00:0f 168        /lib/libdl-2.23.so
0fe18000-0ff88000 r-xp 00000000 00:0f 191        /lib/libc-2.23.so
0ff88000-0ffa4000 ---p 00170000 00:0f 191        /lib/libc-2.23.so
0ffa4000-0ffa8000 r--p 0017c000 00:0f 191        /lib/libc-2.23.so
0ffa8000-0ffac000 rwxp 00180000 00:0f 191        /lib/libc-2.23.so
0ffac000-0ffb0000 rwxp 00000000 00:00 0
0ffc0000-0ffd4000 r-xp 00000000 00:0f 90         /lib/libhugetlbfs.so
0ffd4000-0ffe0000 ---p 00014000 00:0f 90         /lib/libhugetlbfs.so
0ffe0000-0ffe4000 rwxp 00010000 00:0f 90         /lib/libhugetlbfs.so
0ffe4000-0fff0000 rwxp 00000000 00:00 0
10000000-10004000 r-xp 00000000 00:0f 3076       /root/huge_malloc_test
10010000-10014000 rwxp 00000000 00:0f 3076       /root/huge_malloc_test
77940000-77964000 r-xp 00000000 00:0f 171        /lib/ld-2.23.so
7797c000-77980000 r--p 0002c000 00:0f 171        /lib/ld-2.23.so
77980000-77984000 rwxp 00030000 00:0f 171        /lib/ld-2.23.so
7fa58000-7fa7c000 rw-p 00000000 00:00 0          [stack]

libhugetlbfs: WARNING: Heap originates at 0x73e80000 instead of 0x10080000

Allocated 1Mbytes at 0x73e80008

libhugetlbfs: WARNING: New heap segment mapped at 0x73d80000 instead of 0x74000000

Allocated 1Mbytes at 0x777fc008

libhugetlbfs: WARNING: New heap segment mapped at 0x73d00000 instead of 0x74000000

Allocated 1Mbytes at 0x776b8008

00100000-00108000 r-xp 00000000 00:00 0          [vdso]
0fde4000-0fde8000 r-xp 00000000 00:0f 168        /lib/libdl-2.23.so
0fde8000-0fe00000 ---p 00004000 00:0f 168        /lib/libdl-2.23.so
0fe00000-0fe04000 r--p 0000c000 00:0f 168        /lib/libdl-2.23.so
0fe04000-0fe08000 rwxp 00010000 00:0f 168        /lib/libdl-2.23.so
0fe18000-0ff88000 r-xp 00000000 00:0f 191        /lib/libc-2.23.so
0ff88000-0ffa4000 ---p 00170000 00:0f 191        /lib/libc-2.23.so
0ffa4000-0ffa8000 r--p 0017c000 00:0f 191        /lib/libc-2.23.so
0ffa8000-0ffac000 rwxp 00180000 00:0f 191        /lib/libc-2.23.so
0ffac000-0ffb0000 rwxp 00000000 00:00 0
0ffc0000-0ffd4000 r-xp 00000000 00:0f 90         /lib/libhugetlbfs.so
0ffd4000-0ffe0000 ---p 00014000 00:0f 90         /lib/libhugetlbfs.so
0ffe0000-0ffe4000 rwxp 00010000 00:0f 90         /lib/libhugetlbfs.so
0ffe4000-0fff0000 rwxp 00000000 00:00 0
10000000-10004000 r-xp 00000000 00:0f 3076       /root/huge_malloc_test
10010000-10014000 rwxp 00000000 00:0f 3076       /root/huge_malloc_test
73e80000-74000000 rw-p 00000000 00:0b 98386      /anon_hugepage (deleted)
776b8000-77940000 rw-p 00000000 00:00 0
77940000-77964000 r-xp 00000000 00:0f 171        /lib/ld-2.23.so
7797c000-77980000 r--p 0002c000 00:0f 171        /lib/ld-2.23.so
77980000-77984000 rwxp 00030000 00:0f 171        /lib/ld-2.23.so
7fa58000-7fa7c000 rw-p 00000000 00:00 0          [stack]

With the patch applied, it works properly: each malloc() gets an address
within the hugepage space.

root@vgoip:~# HUGETLB_MORECORE=yes ./huge_malloc_test
00100000-00108000 r-xp 00000000 00:00 0          [vdso]
0fde4000-0fde8000 r-xp 00000000 00:0f 168        /lib/libdl-2.23.so
0fde8000-0fe00000 ---p 00004000 00:0f 168        /lib/libdl-2.23.so
0fe00000-0fe04000 r--p 0000c000 00:0f 168        /lib/libdl-2.23.so
0fe04000-0fe08000 rwxp 00010000 00:0f 168        /lib/libdl-2.23.so
0fe18000-0ff88000 r-xp 00000000 00:0f 191        /lib/libc-2.23.so
0ff88000-0ffa4000 ---p 00170000 00:0f 191        /lib/libc-2.23.so
0ffa4000-0ffa8000 r--p 0017c000 00:0f 191        /lib/libc-2.23.so
0ffa8000-0ffac000 rwxp 00180000 00:0f 191        /lib/libc-2.23.so
0ffac000-0ffb0000 rwxp 00000000 00:00 0
0ffc0000-0ffd4000 r-xp 00000000 00:0f 90         /lib/libhugetlbfs.so
0ffd4000-0ffe0000 ---p 00014000 00:0f 90         /lib/libhugetlbfs.so
0ffe0000-0ffe4000 rwxp 00010000 00:0f 90         /lib/libhugetlbfs.so
0ffe4000-0fff0000 rwxp 00000000 00:00 0
10000000-10004000 r-xp 00000000 00:0f 3076       /root/huge_malloc_test
10010000-10014000 rwxp 00000000 00:0f 3076       /root/huge_malloc_test
77884000-778a8000 r-xp 00000000 00:0f 171        /lib/ld-2.23.so
778c0000-778c4000 r--p 0002c000 00:0f 171        /lib/ld-2.23.so
778c4000-778c8000 rwxp 00030000 00:0f 171        /lib/ld-2.23.so
7ff98000-7ffbc000 rw-p 00000000 00:00 0          [stack]

libhugetlbfs: WARNING: Heap originates at 0x30000000 instead of 0x10080000

Allocated 1Mbytes at 0x30000008

Allocated 1Mbytes at 0x30100010

Allocated 1Mbytes at 0x30200018

00100000-00108000 r-xp 00000000 00:00 0          [vdso]
0fde4000-0fde8000 r-xp 00000000 00:0f 168        /lib/libdl-2.23.so
0fde8000-0fe00000 ---p 00004000 00:0f 168        /lib/libdl-2.23.so
0fe00000-0fe04000 r--p 0000c000 00:0f 168        /lib/libdl-2.23.so
0fe04000-0fe08000 rwxp 00010000 00:0f 168        /lib/libdl-2.23.so
0fe18000-0ff88000 r-xp 00000000 00:0f 191        /lib/libc-2.23.so
0ff88000-0ffa4000 ---p 00170000 00:0f 191        /lib/libc-2.23.so
0ffa4000-0ffa8000 r--p 0017c000 00:0f 191        /lib/libc-2.23.so
0ffa8000-0ffac000 rwxp 00180000 00:0f 191        /lib/libc-2.23.so
0ffac000-0ffb0000 rwxp 00000000 00:00 0
0ffc0000-0ffd4000 r-xp 00000000 00:0f 90         /lib/libhugetlbfs.so
0ffd4000-0ffe0000 ---p 00014000 00:0f 90         /lib/libhugetlbfs.so
0ffe0000-0ffe4000 rwxp 00010000 00:0f 90         /lib/libhugetlbfs.so
0ffe4000-0fff0000 rwxp 00000000 00:00 0
10000000-10004000 r-xp 00000000 00:0f 3076       /root/huge_malloc_test
10010000-10014000 rwxp 00000000 00:0f 3076       /root/huge_malloc_test
30000000-30180000 rw-p 00000000 00:0b 7321       /anon_hugepage (deleted)
30180000-30280000 rw-p 00180000 00:0b 7322       /anon_hugepage (deleted)
30280000-30380000 rw-p 00280000 00:0b 7323       /anon_hugepage (deleted)
77884000-778a8000 r-xp 00000000 00:0f 171        /lib/ld-2.23.so
778c0000-778c4000 r--p 0002c000 00:0f 171        /lib/ld-2.23.so
778c4000-778c8000 rwxp 00030000 00:0f 171        /lib/ld-2.23.so
7ff98000-7ffbc000 rw-p 00000000 00:00 0          [stack]

Christophe
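
Whether a given malloc() result landed in a hugepage can be read off the maps dumps by hand, but it can also be checked mechanically. The helper below is a minimal sketch under stated assumptions: `addr_in_hugepage_line()` is a hypothetical name, and it assumes that a hugepage mapping is recognisable by the `anon_hugepage` name shown in the dumps of this thread.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Return true when `addr` lies inside the mapping described by one
 * line of /proc/<pid>/maps and that line names an anon_hugepage file.
 * Assumption: the "start-end" prefix is hexadecimal, as in the dumps
 * shown in this thread.
 */
static bool addr_in_hugepage_line(const char *line, unsigned long addr)
{
	unsigned long start, end;

	if (sscanf(line, "%lx-%lx", &start, &end) != 2)
		return false;	/* not a maps line */
	if (addr < start || addr >= end)
		return false;	/* address is outside this mapping */
	return strstr(line, "anon_hugepage") != NULL;
}
```

Applied to the run without the patch, 0x73e80008 falls inside the `73e80000-74000000 ... /anon_hugepage (deleted)` line, while 0x777fc008 and 0x776b8008 only match the anonymous `776b8000-77940000` mapping, which carries no hugepage name. In the patched run, all three addresses fall inside `anon_hugepage` lines.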
Did a reply instead of reply-all. Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> writes: > Christophe LEROY <christophe.leroy@c-s.fr> writes: > >> Le 17/01/2018 à 04:19, Aneesh Kumar K.V a écrit : >>> >>> >>> On 01/16/2018 10:18 PM, Christophe LEROY wrote: >>>> >>>> >>>> Le 16/01/2018 à 17:03, Aneesh Kumar K.V a écrit : >>>>> Christophe Leroy <christophe.leroy@c-s.fr> writes: >>>>> >>>>>> An application running with libhugetlbfs fails to allocate >>>>>> additional pages to HEAP due to the hugemap being done >>>>>> inconditionally as topdown mapping: >>>>>> >>>>>> mmap(0x10080000, 1572864, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x73e80000 >>>>>> [...] >>>>>> mmap(0x74000000, 1048576, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d80000 >>>>>> munmap(0x73d80000, 1048576) = 0 >>>>>> [...] >>>>>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000 >>>>>> munmap(0x73d00000, 1572864) = 0 >>>>>> [...] >>>>>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, >>>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000 >>>>>> munmap(0x73d00000, 1572864) = 0 >>>>>> [...] >>>>>> >>>>> >>>>> Can you explain the failure details above. I am not sure I understand >>>>> what to read from the above output. >>>> >>>> libhugetlbfs first requests an area of size 1.5Mbytes, at address >>>> 0x10080000 >>>> mmap() returns an area at address 0x73e80000 >>>> >>>> Then libhugetlbfs requests an additional area on top of that, ie at >>>> address 0x74000000, to expand the heap. >>>> But mmap() returns an area at address 0x73d80000, ie under the >>>> previous area. >>>> >>> >>> >>> Can you share the test details?. Why does it not fail on book3s64? We >>> use topdown search with book3s64. >> >> I don't know about book3s64, I only have 8xx. >> >> Here is my test app: >> > > The test ran fine on ppc64. 
> > kvaneesh@ltctulc6a-p1:[~]$ HUGETLB_MORECORE=yes ./a.out > 10000000-10010000 r-xp 00000000 fc:00 9044312 /home/kvaneesh/a.out > 10010000-10020000 r--p 00000000 fc:00 9044312 /home/kvaneesh/a.out > 10020000-10030000 rw-p 00010000 fc:00 9044312 /home/kvaneesh/a.out > 7ffff7d60000-7ffff7f10000 r-xp 00000000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so > 7ffff7f10000-7ffff7f20000 r--p 001a0000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so > 7ffff7f20000-7ffff7f30000 rw-p 001b0000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so > 7ffff7f40000-7ffff7f60000 r-xp 00000000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 > 7ffff7f60000-7ffff7f70000 r--p 00010000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 > 7ffff7f70000-7ffff7f80000 rw-p 00020000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 > 7ffff7f80000-7ffff7fa0000 r-xp 00000000 00:00 0 [vdso] > 7ffff7fa0000-7ffff7fe0000 r-xp 00000000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so > 7ffff7fe0000-7ffff7ff0000 r--p 00030000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so > 7ffff7ff0000-7ffff8000000 rw-p 00040000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so > 7ffffffd0000-800000000000 rw-p 00000000 00:00 0 [stack] > > > Allocated 1Mbytes at 0x10000000010 > > > Allocated 1Mbytes at 0x10002000020 > > > Allocated 1Mbytes at 0x10004000030 > > 10000000-10010000 r-xp 00000000 fc:00 9044312 /home/kvaneesh/a.out > 10010000-10020000 r--p 00000000 fc:00 9044312 /home/kvaneesh/a.out > 10020000-10030000 rw-p 00010000 fc:00 9044312 /home/kvaneesh/a.out > 10000000000-10003000000 rw-p 00000000 00:0d 1041435 /anon_hugepage (deleted) > 10003000000-10005000000 rw-p 03000000 00:0d 1041436 /anon_hugepage (deleted) > 10005000000-10007000000 rw-p 05000000 00:0d 1041437 /anon_hugepage (deleted) > 7ffff7d60000-7ffff7f10000 r-xp 00000000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so > 7ffff7f10000-7ffff7f20000 r--p 001a0000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so > 7ffff7f20000-7ffff7f30000 rw-p 
001b0000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so > 7ffff7f40000-7ffff7f60000 r-xp 00000000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 > 7ffff7f60000-7ffff7f70000 r--p 00010000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 > 7ffff7f70000-7ffff7f80000 rw-p 00020000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 > 7ffff7f80000-7ffff7fa0000 r-xp 00000000 00:00 0 [vdso] > 7ffff7fa0000-7ffff7fe0000 r-xp 00000000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so > 7ffff7fe0000-7ffff7ff0000 r--p 00030000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so > 7ffff7ff0000-7ffff8000000 rw-p 00040000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so > 7ffffffd0000-800000000000 rw-p 00000000 00:00 0 [stack] > > > > So i am definitely missing something. I understand that generic hugetlb > get unmapped area always search bottom up and 8xx used to depend on that > callback. But on ppc64 slice based get unmapped area always did topdown > and I am not sure whether we should change that. More over I don't think > MAP_GROWSDOWN is the right flag for selecting topdown/bottom up search. > > > Is it that libhugetlbfs does something specific for 32 bit? Other option > is to add huget_get_unmapped_area for 8xx that does bottom up search? > > If you are on ppc64 irc on freenode we can discuss this there. > -aneesh
Le 19/01/2018 à 11:05, Aneesh Kumar K.V a écrit : > > Did a reply instead of reply-all. > > Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com> writes: > >> Christophe LEROY <christophe.leroy@c-s.fr> writes: >> >>> Le 17/01/2018 à 04:19, Aneesh Kumar K.V a écrit : >>>> >>>> >>>> On 01/16/2018 10:18 PM, Christophe LEROY wrote: >>>>> >>>>> >>>>> Le 16/01/2018 à 17:03, Aneesh Kumar K.V a écrit : >>>>>> Christophe Leroy <christophe.leroy@c-s.fr> writes: >>>>>> >>>>>>> An application running with libhugetlbfs fails to allocate >>>>>>> additional pages to HEAP due to the hugemap being done >>>>>>> inconditionally as topdown mapping: >>>>>>> >>>>>>> mmap(0x10080000, 1572864, PROT_READ|PROT_WRITE, >>>>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x73e80000 >>>>>>> [...] >>>>>>> mmap(0x74000000, 1048576, PROT_READ|PROT_WRITE, >>>>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d80000 >>>>>>> munmap(0x73d80000, 1048576) = 0 >>>>>>> [...] >>>>>>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, >>>>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000 >>>>>>> munmap(0x73d00000, 1572864) = 0 >>>>>>> [...] >>>>>>> mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, >>>>>>> MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000 >>>>>>> munmap(0x73d00000, 1572864) = 0 >>>>>>> [...] >>>>>>> >>>>>> >>>>>> Can you explain the failure details above. I am not sure I understand >>>>>> what to read from the above output. >>>>> >>>>> libhugetlbfs first requests an area of size 1.5Mbytes, at address >>>>> 0x10080000 >>>>> mmap() returns an area at address 0x73e80000 >>>>> >>>>> Then libhugetlbfs requests an additional area on top of that, ie at >>>>> address 0x74000000, to expand the heap. >>>>> But mmap() returns an area at address 0x73d80000, ie under the >>>>> previous area. >>>>> >>>> >>>> >>>> Can you share the test details?. Why does it not fail on book3s64? We >>>> use topdown search with book3s64. >>> >>> I don't know about book3s64, I only have 8xx. 
>>> >>> Here is my test app: >>> >> >> The test ran fine on ppc64. >> >> kvaneesh@ltctulc6a-p1:[~]$ HUGETLB_MORECORE=yes ./a.out >> 10000000-10010000 r-xp 00000000 fc:00 9044312 /home/kvaneesh/a.out >> 10010000-10020000 r--p 00000000 fc:00 9044312 /home/kvaneesh/a.out >> 10020000-10030000 rw-p 00010000 fc:00 9044312 /home/kvaneesh/a.out >> 7ffff7d60000-7ffff7f10000 r-xp 00000000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so >> 7ffff7f10000-7ffff7f20000 r--p 001a0000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so >> 7ffff7f20000-7ffff7f30000 rw-p 001b0000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so >> 7ffff7f40000-7ffff7f60000 r-xp 00000000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 >> 7ffff7f60000-7ffff7f70000 r--p 00010000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 >> 7ffff7f70000-7ffff7f80000 rw-p 00020000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 >> 7ffff7f80000-7ffff7fa0000 r-xp 00000000 00:00 0 [vdso] >> 7ffff7fa0000-7ffff7fe0000 r-xp 00000000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so >> 7ffff7fe0000-7ffff7ff0000 r--p 00030000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so >> 7ffff7ff0000-7ffff8000000 rw-p 00040000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so >> 7ffffffd0000-800000000000 rw-p 00000000 00:00 0 [stack] >> >> >> Allocated 1Mbytes at 0x10000000010 >> >> >> Allocated 1Mbytes at 0x10002000020 >> >> >> Allocated 1Mbytes at 0x10004000030 >> >> 10000000-10010000 r-xp 00000000 fc:00 9044312 /home/kvaneesh/a.out >> 10010000-10020000 r--p 00000000 fc:00 9044312 /home/kvaneesh/a.out >> 10020000-10030000 rw-p 00010000 fc:00 9044312 /home/kvaneesh/a.out >> 10000000000-10003000000 rw-p 00000000 00:0d 1041435 /anon_hugepage (deleted) >> 10003000000-10005000000 rw-p 03000000 00:0d 1041436 /anon_hugepage (deleted) >> 10005000000-10007000000 rw-p 05000000 00:0d 1041437 /anon_hugepage (deleted) >> 7ffff7d60000-7ffff7f10000 r-xp 00000000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so >> 7ffff7f10000-7ffff7f20000 
r--p 001a0000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so >> 7ffff7f20000-7ffff7f30000 rw-p 001b0000 fc:00 9250090 /lib/powerpc64le-linux-gnu/libc-2.23.so >> 7ffff7f40000-7ffff7f60000 r-xp 00000000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 >> 7ffff7f60000-7ffff7f70000 r--p 00010000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 >> 7ffff7f70000-7ffff7f80000 rw-p 00020000 fc:00 10754812 /usr/lib/libhugetlbfs.so.0 >> 7ffff7f80000-7ffff7fa0000 r-xp 00000000 00:00 0 [vdso] >> 7ffff7fa0000-7ffff7fe0000 r-xp 00000000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so >> 7ffff7fe0000-7ffff7ff0000 r--p 00030000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so >> 7ffff7ff0000-7ffff8000000 rw-p 00040000 fc:00 9250107 /lib/powerpc64le-linux-gnu/ld-2.23.so >> 7ffffffd0000-800000000000 rw-p 00000000 00:00 0 [stack] >> >> >> >> So i am definitely missing something. I understand that generic hugetlb >> get unmapped area always search bottom up and 8xx used to depend on that >> callback. But on ppc64 slice based get unmapped area always did topdown >> and I am not sure whether we should change that. More over I don't think >> MAP_GROWSDOWN is the right flag for selecting topdown/bottom up search. >> >> >> Is it that libhugetlbfs does something specific for 32 bit? Other option >> is to add huget_get_unmapped_area for 8xx that does bottom up search? I think I identified the difference. In my run you have the following warning: libhugetlbfs: WARNING: Heap originates at 0x73e80000 instead of 0x10080000 In your run, there is no such warning. 
I tried running the test with HUGETLB_MORECORE_HEAPBASE=0x30000000, and it works without the patch: root@vgoip:~# HUGETLB_MORECORE=yes HUGETLB_MORECORE_HEAPBASE=0x30000000 ./huge_m alloc_test 00100000-00108000 r-xp 00000000 00:00 0 [vdso] 0fde4000-0fde8000 r-xp 00000000 00:0f 168 /lib/libdl-2.23.so 0fde8000-0fe00000 ---p 00004000 00:0f 168 /lib/libdl-2.23.so 0fe00000-0fe04000 r--p 0000c000 00:0f 168 /lib/libdl-2.23.so 0fe04000-0fe08000 rwxp 00010000 00:0f 168 /lib/libdl-2.23.so 0fe18000-0ff88000 r-xp 00000000 00:0f 191 /lib/libc-2.23.so 0ff88000-0ffa4000 ---p 00170000 00:0f 191 /lib/libc-2.23.so 0ffa4000-0ffa8000 r--p 0017c000 00:0f 191 /lib/libc-2.23.so 0ffa8000-0ffac000 rwxp 00180000 00:0f 191 /lib/libc-2.23.so 0ffac000-0ffb0000 rwxp 00000000 00:00 0 0ffc0000-0ffd4000 r-xp 00000000 00:0f 90 /lib/libhugetlbfs.so 0ffd4000-0ffe0000 ---p 00014000 00:0f 90 /lib/libhugetlbfs.so 0ffe0000-0ffe4000 rwxp 00010000 00:0f 90 /lib/libhugetlbfs.so 0ffe4000-0fff0000 rwxp 00000000 00:00 0 10000000-10004000 r-xp 00000000 00:0f 3076 /root/huge_malloc_test 10010000-10014000 rwxp 00000000 00:0f 3076 /root/huge_malloc_test 77ee0000-77f04000 r-xp 00000000 00:0f 171 /lib/ld-2.23.so 77f1c000-77f20000 r--p 0002c000 00:0f 171 /lib/ld-2.23.so 77f20000-77f24000 rwxp 00030000 00:0f 171 /lib/ld-2.23.so 7f830000-7f854000 rw-p 00000000 00:00 0 [stack] Allocated 1Mbytes at 0x30000008 Allocated 1Mbytes at 0x30100010 Allocated 1Mbytes at 0x30200018 00100000-00108000 r-xp 00000000 00:00 0 [vdso] 0fde4000-0fde8000 r-xp 00000000 00:0f 168 /lib/libdl-2.23.so 0fde8000-0fe00000 ---p 00004000 00:0f 168 /lib/libdl-2.23.so 0fe00000-0fe04000 r--p 0000c000 00:0f 168 /lib/libdl-2.23.so 0fe04000-0fe08000 rwxp 00010000 00:0f 168 /lib/libdl-2.23.so 0fe18000-0ff88000 r-xp 00000000 00:0f 191 /lib/libc-2.23.so 0ff88000-0ffa4000 ---p 00170000 00:0f 191 /lib/libc-2.23.so 0ffa4000-0ffa8000 r--p 0017c000 00:0f 191 /lib/libc-2.23.so 0ffa8000-0ffac000 rwxp 00180000 00:0f 191 /lib/libc-2.23.so 0ffac000-0ffb0000 rwxp 
00000000 00:00 0 0ffc0000-0ffd4000 r-xp 00000000 00:0f 90 /lib/libhugetlbfs.so 0ffd4000-0ffe0000 ---p 00014000 00:0f 90 /lib/libhugetlbfs.so 0ffe0000-0ffe4000 rwxp 00010000 00:0f 90 /lib/libhugetlbfs.so 0ffe4000-0fff0000 rwxp 00000000 00:00 0 10000000-10004000 r-xp 00000000 00:0f 3076 /root/huge_malloc_test 10010000-10014000 rwxp 00000000 00:0f 3076 /root/huge_malloc_test 30000000-30180000 rw-p 00000000 00:0b 7682 /anon_hugepage (deleted) 30180000-30280000 rw-p 00180000 00:0b 7683 /anon_hugepage (deleted) 30280000-30380000 rw-p 00280000 00:0b 7684 /anon_hugepage (deleted) 77ee0000-77f04000 r-xp 00000000 00:0f 171 /lib/ld-2.23.so 77f1c000-77f20000 r--p 0002c000 00:0f 171 /lib/ld-2.23.so 77f20000-77f24000 rwxp 00030000 00:0f 171 /lib/ld-2.23.so 7f830000-7f854000 rw-p 00000000 00:00 0 [stack] On your side, could you try and see with HUGETLB_MORECORE_HEAPBASE=0x11000000 ? Christophe >> >> If you are on ppc64 irc on freenode we can discuss this there. >> -aneesh
diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 79e1378ee303..0eadf9f199de 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -19,6 +19,7 @@
 #include <linux/moduleparam.h>
 #include <linux/swap.h>
 #include <linux/swapops.h>
+#include <linux/mman.h>
 #include <asm/pgtable.h>
 #include <asm/pgalloc.h>
 #include <asm/tlb.h>
@@ -558,7 +559,8 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 		return radix__hugetlb_get_unmapped_area(file, addr, len,
 						       pgoff, flags);
 #endif
-	return slice_get_unmapped_area(addr, len, flags, mmu_psize, 1);
+	return slice_get_unmapped_area(addr, len, flags, mmu_psize,
+				       flags & MAP_GROWSDOWN);
 }
 #endif
An application running with libhugetlbfs fails to allocate
additional pages to HEAP due to the hugemap being done
unconditionally as topdown mapping:

mmap(0x10080000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0) = 0x73e80000
[...]
mmap(0x74000000, 1048576, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d80000
munmap(0x73d80000, 1048576) = 0
[...]
mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
munmap(0x73d00000, 1572864) = 0
[...]
mmap(0x74000000, 1572864, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|0x40000, -1, 0x180000) = 0x73d00000
munmap(0x73d00000, 1572864) = 0
[...]

As one can see from the above strace log, mmap() allocates further
pages below the initial one.

This patch fixes it by taking into account the MAP_GROWSDOWN flag.

Fixes: d0f13e3c20b6f ("[POWERPC] Introduce address space "slices" ")
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
v2: Added missing include

 arch/powerpc/mm/hugetlbpage.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)