Message ID | 1492067567-24688-1-git-send-email-ricklind@linux.vnet.ibm.com (mailing list archive)
---|---
State | Rejected
ricklind@linux.vnet.ibm.com writes:

> From: Rick Lindsley <ricklind@linux.vnet.ibm.com>
>
> With the 512TB virtual addressing capability, a new field was added to
> the paca and mm_context (addr_limit) to track the process's desire to
> use the larger addressing space. Functions in the radix-enabled path
> (mmap.c) were modified to inspect this value when deciding whether to
> grant or deny requests in that range.
>
> However, the non-radix path falls through to the old, hashed slice code
> (slice_get_unmapped_area, etc.) and these code paths still inspect
> task_size. The same attention to addr_limit made in (for example)
> radix__arch_get_unmapped_area() should also be applied to (correspondingly)
> slice_get_unmapped_area().

I would suggest we don't do this change now, but rather audit the usage
of TASK_SIZE() and mm->task_size and move them correctly to mm->task_size
and mm->context.addr_limit. context.addr_limit was added as an
optimization for the slice_mask copy, and we need to audit closely to
make sure we can use it as a boundary condition for error checking in
the mmap case. IMHO we should do this while consolidating
TASK_SIZE/mm->task_size/mm->context.addr_limit. A previous attempt can
be found at

https://lkml.kernel.org/r/20161230155634.8692-1-dsafonov@virtuozzo.com

We should start working in that direction.
Some archs even have thread_info.addr_limit.

> Signed-off-by: Rick Lindsley <ricklind@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/slice.c | 12 ++++++------
>  1 files changed, 6 insertions(+), 6 deletions(-)
>
> diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
> index 251b6ba..c023bff 100644
> --- a/arch/powerpc/mm/slice.c
> +++ b/arch/powerpc/mm/slice.c
> @@ -96,7 +96,7 @@ static int slice_area_is_free(struct mm_struct *mm, unsigned long addr,
>  {
>  	struct vm_area_struct *vma;
>
> -	if ((mm->task_size - len) < addr)
> +	if ((mm->addr_limit - len) < addr)
>  		return 0;
>  	vma = find_vma(mm, addr);
>  	return (!vma || (addr + len) <= vma->vm_start);
> @@ -133,7 +133,7 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret)
>  		if (!slice_low_has_vma(mm, i))
>  			ret->low_slices |= 1u << i;
>
> -	if (mm->task_size <= SLICE_LOW_TOP)
> +	if (mm->addr_limit <= SLICE_LOW_TOP)
>  		return;
>
>  	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.addr_limit); i++)
> @@ -444,20 +444,20 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
>  	bitmap_zero(compat_mask.high_slices, SLICE_NUM_HIGH);
>
>  	/* Sanity checks */
> -	BUG_ON(mm->task_size == 0);
> +	BUG_ON(mm->addr_limit == 0);
>  	VM_BUG_ON(radix_enabled());
>
>  	slice_dbg("slice_get_unmapped_area(mm=%p, psize=%d...\n", mm, psize);
>  	slice_dbg(" addr=%lx, len=%lx, flags=%lx, topdown=%d\n",
>  		  addr, len, flags, topdown);
>
> -	if (len > mm->task_size)
> +	if (len > mm->addr_limit)
>  		return -ENOMEM;
>  	if (len & ((1ul << pshift) - 1))
>  		return -EINVAL;
>  	if (fixed && (addr & ((1ul << pshift) - 1)))
>  		return -EINVAL;
> -	if (fixed && addr > (mm->task_size - len))
> +	if (fixed && addr > (mm->addr_limit - len))
>  		return -ENOMEM;
>
>  	/* If hint, make sure it matches our alignment restrictions */
> @@ -465,7 +465,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
>  		addr = _ALIGN_UP(addr, 1ul << pshift);
>  		slice_dbg(" aligned addr=%lx\n", addr);
>  		/* Ignore hint if it's too large or overlaps a VMA */
> -		if (addr > mm->task_size - len ||
> +		if (addr > mm->addr_limit - len ||
>  		    !slice_area_is_free(mm, addr, len))
>  			addr = 0;
>  	}
> --
> 1.7.1

-aneesh
On 04/13/2017 03:58 AM, Aneesh Kumar K.V wrote:

> I would suggest we don't do this change now. But rather we audit the
> usage of TASK_SIZE(), mm->task_size and move them correctly to
> mm->task_size and mm->context.addr_limit. The context.addr_limit is
> added as an optimization for slice_mask copy and we need to closely
> audit to make sure we can use that as a boundary condition for error
> checking in case of mmap.

I do agree that extending the address space could affect other code
because of the use of TASK_SIZE, and we need to examine "the big
picture" to resolve the usage of addr_limit, task_size, and TASK_SIZE.
But without this small change now, I don't see how the extended address
space is made available at all in the non-radix case. Are we
acknowledging that the 512TB address space is only available when radix
is enabled?

Rick
On Thursday 13 April 2017 05:13 PM, Rick Lindsley wrote:

> On 04/13/2017 03:58 AM, Aneesh Kumar K.V wrote:
>
>> I would suggest we don't do this change now. But rather we audit the
>> usage of TASK_SIZE(), mm->task_size and move them correctly to
>> mm->task_size and mm->context.addr_limit. The context.addr_limit is
>> added as an optimization for slice_mask copy and we need to closely
>> audit to make sure we can use that as a boundary condition for error
>> checking in case of mmap.
>
> I do agree that extending the address space could affect other
> code because of the use of TASK_SIZE. and we need to examine
> "the big picture" to resolve the usage of addr_limit, task_size,
> and TASK_SIZE. But without this small change now, I don't see how
> the extended address space is made available at all in the non-radix
> case. Are we acknowledging that 512TB address space is only available
> when radix is enabled?

The code paths you modified don't control the address allocation. The
relevant bits are:

	/*
	 * This mmap request can allocate up to 512TB
	 */
	if (addr > DEFAULT_MAP_WINDOW)
		high_limit = mm->context.addr_limit;
	else
		high_limit = DEFAULT_MAP_WINDOW;

and for the topdown search:

	addr = mm->mmap_base;
	/*
	 * If we are trying to allocate above DEFAULT_MAP_WINDOW
	 * Add the difference to the mmap_base.
	 * Only for that request for which high_limit is above
	 * DEFAULT_MAP_WINDOW we should apply this.
	 */
	if (high_limit > DEFAULT_MAP_WINDOW)
		addr += mm->context.addr_limit - DEFAULT_MAP_WINDOW;

and for the bottom-up search:

	addr = TASK_UNMAPPED_BASE;
	/*
	 * Check till the allowed max value for this mmap request
	 */
	while (addr < high_limit) {

-aneesh
Hi Rick,

[auto build test ERROR on powerpc/next]
[also build test ERROR on next-20170413]
[cannot apply to v4.11-rc6]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/ricklind-linux-vnet-ibm-com/arch-powerpc-mm-slice-Cleanup-leftover-use-of-task_size/20170413-182643
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-defconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 6.1.1-9) 6.1.1 20160705
reproduce:
        wget https://raw.githubusercontent.com/01org/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        make.cross ARCH=powerpc

All errors (new ones prefixed by >>):

   arch/powerpc/mm/slice.c: In function 'slice_area_is_free':
>> arch/powerpc/mm/slice.c:99:9: error: 'struct mm_struct' has no member named 'addr_limit'
     if ((mm->addr_limit - len) < addr)
            ^~
   arch/powerpc/mm/slice.c: In function 'slice_mask_for_free':
   arch/powerpc/mm/slice.c:136:8: error: 'struct mm_struct' has no member named 'addr_limit'
     if (mm->addr_limit <= SLICE_LOW_TOP)
           ^~
   In file included from include/linux/bug.h:4:0,
                    from include/linux/mmdebug.h:4,
                    from include/linux/mm.h:8,
                    from arch/powerpc/mm/slice.c:28:
   arch/powerpc/mm/slice.c: In function 'slice_get_unmapped_area':
   arch/powerpc/mm/slice.c:447:11: error: 'struct mm_struct' has no member named 'addr_limit'
     BUG_ON(mm->addr_limit == 0);
              ^
   arch/powerpc/include/asm/bug.h:75:27: note: in definition of macro 'BUG_ON'
     if (__builtin_constant_p(x)) { \
                              ^
   arch/powerpc/mm/slice.c:447:11: error: 'struct mm_struct' has no member named 'addr_limit'
     BUG_ON(mm->addr_limit == 0);
              ^
   arch/powerpc/include/asm/bug.h:76:7: note: in definition of macro 'BUG_ON'
      if (x) \
          ^
   arch/powerpc/mm/slice.c:447:11: error: 'struct mm_struct' has no member named 'addr_limit'
     BUG_ON(mm->addr_limit == 0);
              ^
   arch/powerpc/include/asm/bug.h:84:25: note: in definition of macro 'BUG_ON'
          "r" ((__force long)(x))); \
                             ^
   arch/powerpc/mm/slice.c:454:14: error: 'struct mm_struct' has no member named 'addr_limit'
     if (len > mm->addr_limit)
                 ^~
   arch/powerpc/mm/slice.c:460:25: error: 'struct mm_struct' has no member named 'addr_limit'
     if (fixed && addr > (mm->addr_limit - len))
                            ^~
   arch/powerpc/mm/slice.c:468:16: error: 'struct mm_struct' has no member named 'addr_limit'
     if (addr > mm->addr_limit - len ||
                  ^~

vim +99 arch/powerpc/mm/slice.c

    93
    94	static int slice_area_is_free(struct mm_struct *mm, unsigned long addr,
    95				      unsigned long len)
    96	{
    97		struct vm_area_struct *vma;
    98
  > 99		if ((mm->addr_limit - len) < addr)
   100			return 0;
   101		vma = find_vma(mm, addr);
   102		return (!vma || (addr + len) <= vma->vm_start);

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation
ricklind@linux.vnet.ibm.com writes:

> From: Rick Lindsley <ricklind@linux.vnet.ibm.com>
>
> However, the non-radix path falls through to the old, hashed slice code
> (slice_get_unmapped_area, etc.) and these code paths still inspect
> task_size. The same attention to addr_limit made in (for example)
> radix__arch_get_unmapped_area() should also be applied to (correspondingly)
> slice_get_unmapped_area().

I missed this part earlier. I guess that should be fixed in the radix
code. This came in via fbfef9027c2a7ad9277755509fdb849dbccfe8c1
("powerpc/mm: Switch some TASK_SIZE checks to use mm_context addr_limit").
That patch needs an update. When we switched from mm->task_size to
mm->context.addr_limit in the latest version of the patch, we missed
updating the above correctly. I have now sent a version which should
update this correctly:

https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-April/156781.html

With this we use addr_limit only for the slice mask optimization and the
addr search limit. All the boundary checks are now based on
mm->task_size. We will later consolidate
TASK_SIZE/mm->task_size/mm->context.addr_limit.

-aneesh
On 04/13/2017 12:33 PM, Aneesh Kumar K.V wrote:

> I missed this part earlier. I guess that should be fixed in radix code.
> This came in via fbfef9027c2a7ad9277755509fdb849dbccfe8c1 (powerpc/mm:
> Switch some TASK_SIZE checks to use mm_context addr_limit). That patch
> needs update. When we switched from mm->task_size to
> mm->context.addr_limit in latest version of the patch, we missed
> updating the above correctly. I have now send a version which should
> update this correctly.

Ok - so the intent then is that you may extend your address space, but
you still may not allocate anything larger than task_size (which will
never be larger than 128TB)? The section we are talking about is
checking the length of the request against task_size, so that means we
may not allocate a single vm area larger than 128TB even though it would
be okay to (say) allocate 3 of those within 512TB of address space?

Rick
On 04/13/2017 08:35 AM, kbuild test robot wrote:

> Hi Rick,
>
> [auto build test ERROR on powerpc/next]
> [also build test ERROR on next-20170413]
> [cannot apply to v4.11-rc6]
> [if your patch is applied to the wrong git tree, please drop us a note
> to help improve the system]

Bah. I encountered this during a backport and typoed the structure name
in preparing the patch here. All the references to addr_limit should
have been context.addr_limit.

However, Aneesh's post

https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-April/156781.html

makes this one unnecessary. With that patch, the two branches, radix and
hash, are congruent in their treatment of addr_limit.

Rick
diff --git a/arch/powerpc/mm/slice.c b/arch/powerpc/mm/slice.c
index 251b6ba..c023bff 100644
--- a/arch/powerpc/mm/slice.c
+++ b/arch/powerpc/mm/slice.c
@@ -96,7 +96,7 @@ static int slice_area_is_free(struct mm_struct *mm, unsigned long addr,
 {
 	struct vm_area_struct *vma;
 
-	if ((mm->task_size - len) < addr)
+	if ((mm->addr_limit - len) < addr)
 		return 0;
 	vma = find_vma(mm, addr);
 	return (!vma || (addr + len) <= vma->vm_start);
@@ -133,7 +133,7 @@ static void slice_mask_for_free(struct mm_struct *mm, struct slice_mask *ret)
 		if (!slice_low_has_vma(mm, i))
 			ret->low_slices |= 1u << i;
 
-	if (mm->task_size <= SLICE_LOW_TOP)
+	if (mm->addr_limit <= SLICE_LOW_TOP)
 		return;
 
 	for (i = 0; i < GET_HIGH_SLICE_INDEX(mm->context.addr_limit); i++)
@@ -444,20 +444,20 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 	bitmap_zero(compat_mask.high_slices, SLICE_NUM_HIGH);
 
 	/* Sanity checks */
-	BUG_ON(mm->task_size == 0);
+	BUG_ON(mm->addr_limit == 0);
 	VM_BUG_ON(radix_enabled());
 
 	slice_dbg("slice_get_unmapped_area(mm=%p, psize=%d...\n", mm, psize);
 	slice_dbg(" addr=%lx, len=%lx, flags=%lx, topdown=%d\n",
 		  addr, len, flags, topdown);
 
-	if (len > mm->task_size)
+	if (len > mm->addr_limit)
 		return -ENOMEM;
 	if (len & ((1ul << pshift) - 1))
 		return -EINVAL;
 	if (fixed && (addr & ((1ul << pshift) - 1)))
 		return -EINVAL;
-	if (fixed && addr > (mm->task_size - len))
+	if (fixed && addr > (mm->addr_limit - len))
 		return -ENOMEM;
 
 	/* If hint, make sure it matches our alignment restrictions */
@@ -465,7 +465,7 @@ unsigned long slice_get_unmapped_area(unsigned long addr, unsigned long len,
 		addr = _ALIGN_UP(addr, 1ul << pshift);
 		slice_dbg(" aligned addr=%lx\n", addr);
 		/* Ignore hint if it's too large or overlaps a VMA */
-		if (addr > mm->task_size - len ||
+		if (addr > mm->addr_limit - len ||
 		    !slice_area_is_free(mm, addr, len))
 			addr = 0;
 	}