Message ID | 20190618044609.19997-1-vaibhav@linux.ibm.com (mailing list archive)
---|---
State | Not Applicable
Series | powerpc/mm: Ensure Huge-page memory is free before allocation
Context | Check | Description |
---|---|---|
snowpatch_ozlabs/apply_patch | success | Successfully applied on branch next (e610a466d16a086e321f0bd421e2fc75cff28605) |
snowpatch_ozlabs/build-ppc64le | success | Build succeeded |
snowpatch_ozlabs/build-ppc64be | success | Build succeeded |
snowpatch_ozlabs/build-ppc64e | success | Build succeeded |
snowpatch_ozlabs/build-pmac32 | success | Build succeeded |
snowpatch_ozlabs/checkpatch | warning | total: 0 errors, 1 warnings, 0 checks, 11 lines checked |
Vaibhav Jain <vaibhav@linux.ibm.com> writes:
> We recently discovered a bug where physical memory meant for
> allocation of Huge-pages was inadvertently allocated by another component
> during early boot.

Can you give me some more detail on what that was? You're seemingly the
only person who's ever hit this :)

> The behavior of memblock_reserve(), where it won't
> indicate whether an existing reserved block overlaps with the
> requested reservation, only makes such bugs hard to investigate.
>
> Hence this patch proposes adding a memblock reservation check in
> htab_dt_scan_hugepage_blocks() just before the call to memblock_reserve()
> to ensure that the physical memory that is being reserved is not
> already reserved by someone else. In case this happens we panic
> the kernel to ensure that the user of this huge-page doesn't accidentally
> stomp on memory allocated to someone else.

Do we really need to panic? Can't we just leave the block alone and not
register it as huge page memory? With a big warning obviously.

cheers

> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
> index 28ced26f2a00..a05be3adb8c9 100644
> --- a/arch/powerpc/mm/book3s64/hash_utils.c
> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
> @@ -516,6 +516,11 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
>  	printk(KERN_INFO "Huge page(16GB) memory: "
>  			 "addr = 0x%lX size = 0x%lX pages = %d\n",
>  			 phys_addr, block_size, expected_pages);
> +
> +	/* Ensure no one else has reserved memory for huge pages before */
> +	BUG_ON(memblock_is_region_reserved(phys_addr,
> +					   block_size * expected_pages));
> +
>  	if (phys_addr + block_size * expected_pages <= memblock_end_of_DRAM()) {
>  		memblock_reserve(phys_addr, block_size * expected_pages);
>  		pseries_add_gpage(phys_addr, block_size, expected_pages);
> --
> 2.21.0
On 6/18/19 5:37 PM, Michael Ellerman wrote:
> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>> We recently discovered a bug where physical memory meant for
>> allocation of Huge-pages was inadvertently allocated by another component
>> during early boot.
>
> Can you give me some more detail on what that was? You're seemingly the
> only person who's ever hit this :)
>
>> The behavior of memblock_reserve(), where it won't
>> indicate whether an existing reserved block overlaps with the
>> requested reservation, only makes such bugs hard to investigate.
>>
>> Hence this patch proposes adding a memblock reservation check in
>> htab_dt_scan_hugepage_blocks() just before the call to memblock_reserve()
>> to ensure that the physical memory that is being reserved is not
>> already reserved by someone else. In case this happens we panic
>> the kernel to ensure that the user of this huge-page doesn't accidentally
>> stomp on memory allocated to someone else.
>
> Do we really need to panic? Can't we just leave the block alone and not
> register it as huge page memory? With a big warning obviously.
>

I need to check this again with Paul. But IIRC we have issues w.r.t. the
hash page table size if we use 16G pages as 64K mapped pages. That is,
the hypervisor creates the hash page table sized on the assumption that
these pfns will only be mapped as 16G pages. If we try to insert 64K
hash PTE entries for them we will not have enough space in the hash page
table. That is one of the reasons we never allowed freeing the hugetlb
reserved pool with 16G pages back to buddy.

Note: We should switch that BUG to panic. I guess using BUG that early
doesn't work.

-aneesh
Michael Ellerman <mpe@ellerman.id.au> writes:
> Vaibhav Jain <vaibhav@linux.ibm.com> writes:
>> We recently discovered a bug where physical memory meant for
>> allocation of Huge-pages was inadvertently allocated by another component
>> during early boot.
>
> Can you give me some more detail on what that was? You're seemingly the
> only person who's ever hit this :)

The specific bug I investigated was in fadump, which was trying to
reserve a large chunk of physically contiguous memory from memblock and
inadvertently stomped into a memory region that was reserved for
allocation of 16G hugepages. This happened because the fadump
reservation happens much earlier, in prom-init, compared to the hugepage
reservation.

The bug manifested as a panic seen when running 'cat /proc/pagetypeinfo',
which dumps the buddy stats for each zone/migrate-type. Incidentally,
fadump, after reserving this memory, would carve out a CMA region that
was then entered into the buddy allocator. This would cause
pagetypeinfo_showfree_print() to fail when it tries to iterate over the
free list of this CMA migrate type, as the corresponding memmap for
these pages was never initialized.

>
>> The behavior of memblock_reserve(), where it won't
>> indicate whether an existing reserved block overlaps with the
>> requested reservation, only makes such bugs hard to investigate.
>>
>> Hence this patch proposes adding a memblock reservation check in
>> htab_dt_scan_hugepage_blocks() just before the call to memblock_reserve()
>> to ensure that the physical memory that is being reserved is not
>> already reserved by someone else. In case this happens we panic
>> the kernel to ensure that the user of this huge-page doesn't accidentally
>> stomp on memory allocated to someone else.
>
> Do we really need to panic? Can't we just leave the block alone and not
> register it as huge page memory? With a big warning obviously.
Possibly yes, but Aneesh pointed out that this memory is supposed to be
backed only by a 16G page mapping due to a limitation in phyp on the
hash page table size.

>
> cheers
>
>> diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
>> index 28ced26f2a00..a05be3adb8c9 100644
>> --- a/arch/powerpc/mm/book3s64/hash_utils.c
>> +++ b/arch/powerpc/mm/book3s64/hash_utils.c
>> @@ -516,6 +516,11 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
>>  	printk(KERN_INFO "Huge page(16GB) memory: "
>>  			 "addr = 0x%lX size = 0x%lX pages = %d\n",
>>  			 phys_addr, block_size, expected_pages);
>> +
>> +	/* Ensure no one else has reserved memory for huge pages before */
>> +	BUG_ON(memblock_is_region_reserved(phys_addr,
>> +					   block_size * expected_pages));
>> +
>>  	if (phys_addr + block_size * expected_pages <= memblock_end_of_DRAM()) {
>>  		memblock_reserve(phys_addr, block_size * expected_pages);
>>  		pseries_add_gpage(phys_addr, block_size, expected_pages);
>> --
>> 2.21.0
>
diff --git a/arch/powerpc/mm/book3s64/hash_utils.c b/arch/powerpc/mm/book3s64/hash_utils.c
index 28ced26f2a00..a05be3adb8c9 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -516,6 +516,11 @@ static int __init htab_dt_scan_hugepage_blocks(unsigned long node,
 	printk(KERN_INFO "Huge page(16GB) memory: "
 			 "addr = 0x%lX size = 0x%lX pages = %d\n",
 			 phys_addr, block_size, expected_pages);
+
+	/* Ensure no one else has reserved memory for huge pages before */
+	BUG_ON(memblock_is_region_reserved(phys_addr,
+					   block_size * expected_pages));
+
 	if (phys_addr + block_size * expected_pages <= memblock_end_of_DRAM()) {
 		memblock_reserve(phys_addr, block_size * expected_pages);
 		pseries_add_gpage(phys_addr, block_size, expected_pages);
We recently discovered a bug where physical memory meant for allocation
of Huge-pages was inadvertently allocated by another component during
early boot. The behavior of memblock_reserve(), where it won't indicate
whether an existing reserved block overlaps with the requested
reservation, only makes such bugs hard to investigate.

Hence this patch proposes adding a memblock reservation check in
htab_dt_scan_hugepage_blocks() just before the call to
memblock_reserve() to ensure that the physical memory that is being
reserved is not already reserved by someone else. In case this happens
we panic the kernel to ensure that the user of this huge-page doesn't
accidentally stomp on memory allocated to someone else.

Signed-off-by: Vaibhav Jain <vaibhav@linux.ibm.com>
---
 arch/powerpc/mm/book3s64/hash_utils.c | 5 +++++
 1 file changed, 5 insertions(+)