Message ID | 20171117014601.31606-1-pasha.tatashin@oracle.com (mailing list archive) |
---|---|
State | Not Applicable |
Series | [v1] mm: relax deferred struct page requirements |
On Thu, Nov 16, 2017 at 08:46:01PM -0500, Pavel Tatashin wrote:
> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
> as all the page initialization code is in common code.
>
> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
> does not really use hotplug memory functionality. So, we can remove this
> requirement as well.
>
> This patch allows to use deferred struct page initialization on all
> platforms with memblock allocator.
>
> Tested on x86, arm64, and sparc. Also, verified that code compiles on
> PPC with CONFIG_MEMORY_HOTPLUG disabled.
>
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  arch/powerpc/Kconfig | 1 -
>  arch/s390/Kconfig    | 1 -
>  arch/x86/Kconfig     | 1 -
>  mm/Kconfig           | 7 +------
>  4 files changed, 1 insertion(+), 9 deletions(-)

For the s390 bit:

Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>

On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
> as all the page initialization code is in common code.
>
> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
> does not really use hotplug memory functionality. So, we can remove this
> requirement as well.
>
> This patch allows to use deferred struct page initialization on all
> platforms with memblock allocator.
>
> Tested on x86, arm64, and sparc. Also, verified that code compiles on
> PPC with CONFIG_MEMORY_HOTPLUG disabled.

There is slight risk that we will encounter corner cases on some
architectures with weird memory layout/topology but we should better
explicitly disable this code rather than make it opt-in so this looks
like an improvement to me.

> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>

Acked-by: Michal Hocko <mhocko@suse.com>

> ---
>  arch/powerpc/Kconfig | 1 -
>  arch/s390/Kconfig    | 1 -
>  arch/x86/Kconfig     | 1 -
>  mm/Kconfig           | 7 +------
>  4 files changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index cb782ac1c35d..1540348691c9 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -148,7 +148,6 @@ config PPC
>  	select ARCH_MIGHT_HAVE_PC_PARPORT
>  	select ARCH_MIGHT_HAVE_PC_SERIO
>  	select ARCH_SUPPORTS_ATOMIC_RMW
> -	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
>  	select ARCH_USE_BUILTIN_BSWAP
>  	select ARCH_USE_CMPXCHG_LOCKREF if PPC64
>  	select ARCH_WANT_IPC_PARSE_VERSION
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 863a62a6de3c..525c2e3df6f5 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -108,7 +108,6 @@ config S390
>  	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
>  	select ARCH_SAVE_PAGE_KEYS if HIBERNATION
>  	select ARCH_SUPPORTS_ATOMIC_RMW
> -	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
>  	select ARCH_SUPPORTS_NUMA_BALANCING
>  	select ARCH_USE_BUILTIN_BSWAP
>  	select ARCH_USE_CMPXCHG_LOCKREF
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index df3276d6bfe3..00a5446de394 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -69,7 +69,6 @@ config X86
>  	select ARCH_MIGHT_HAVE_PC_PARPORT
>  	select ARCH_MIGHT_HAVE_PC_SERIO
>  	select ARCH_SUPPORTS_ATOMIC_RMW
> -	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
>  	select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
>  	select ARCH_USE_BUILTIN_BSWAP
>  	select ARCH_USE_QUEUED_RWLOCKS
> diff --git a/mm/Kconfig b/mm/Kconfig
> index 9c4bdddd80c2..c6bd0309ce7a 100644
> --- a/mm/Kconfig
> +++ b/mm/Kconfig
> @@ -639,15 +639,10 @@ config MAX_STACK_SIZE_MB
>
>  	  A sane initial value is 80 MB.
>
> -# For architectures that support deferred memory initialisation
> -config ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
> -	bool
> -
>  config DEFERRED_STRUCT_PAGE_INIT
>  	bool "Defer initialisation of struct pages to kthreads"
>  	default n
> -	depends on ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
> -	depends on NO_BOOTMEM && MEMORY_HOTPLUG
> +	depends on NO_BOOTMEM
>  	depends on !FLATMEM
>  	help
>  	  Ordinarily all struct pages are initialised during early boot in a
> --
> 2.15.0

On Thu, 2017-11-16 at 20:46 -0500, Pavel Tatashin wrote:
> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
> as all the page initialization code is in common code.
>
> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
> does not really use hotplug memory functionality. So, we can remove this
> requirement as well.
>
> This patch allows to use deferred struct page initialization on all
> platforms with memblock allocator.
>
> Tested on x86, arm64, and sparc. Also, verified that code compiles on
> PPC with CONFIG_MEMORY_HOTPLUG disabled.
>
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  arch/powerpc/Kconfig | 1 -
>  arch/s390/Kconfig    | 1 -
>  arch/x86/Kconfig     | 1 -
>  mm/Kconfig           | 7 +------
>  4 files changed, 1 insertion(+), 9 deletions(-)
>

Looks reasonable to me.

Reviewed-by: Khalid Aziz <khalid.aziz@oracle.com>

Pavel Tatashin <pasha.tatashin@oracle.com> writes:
> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
> as all the page initialization code is in common code.
>
> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
> does not really use hotplug memory functionality. So, we can remove this
> requirement as well.
>
> This patch allows to use deferred struct page initialization on all
> platforms with memblock allocator.
>
> Tested on x86, arm64, and sparc. Also, verified that code compiles on
> PPC with CONFIG_MEMORY_HOTPLUG disabled.
>
> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  arch/powerpc/Kconfig | 1 -
>  arch/s390/Kconfig    | 1 -
>  arch/x86/Kconfig     | 1 -
>  mm/Kconfig           | 7 +------
>  4 files changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index cb782ac1c35d..1540348691c9 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -148,7 +148,6 @@ config PPC
>  	select ARCH_MIGHT_HAVE_PC_PARPORT
>  	select ARCH_MIGHT_HAVE_PC_SERIO
>  	select ARCH_SUPPORTS_ATOMIC_RMW
> -	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT

Acked-by: Michael Ellerman <mpe@ellerman.id.au>

cheers

On 11/21/2017, 08:24 AM, Michal Hocko wrote:
> On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
>> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
>> as all the page initialization code is in common code.
>>
>> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
>> does not really use hotplug memory functionality. So, we can remove this
>> requirement as well.
>>
>> This patch allows to use deferred struct page initialization on all
>> platforms with memblock allocator.
>>
>> Tested on x86, arm64, and sparc. Also, verified that code compiles on
>> PPC with CONFIG_MEMORY_HOTPLUG disabled.
>
> There is slight risk that we will encounter corner cases on some
> architectures with weird memory layout/topology

Which x86_32-pae seems to be. Many bad page state errors are emitted
during boot when this patch is applied:

BUG: Bad page state in process swapper pfn:3c01c
page:f566c3f0 count:0 mapcount:1 mapping:00000000 index:0x0
flags: 0x0()
raw: 00000000 00000000 00000000 00000000 00000000 00000100 00000200 00000000
raw: 00000000
page dumped because: nonzero mapcount
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Tainted: G B 4.17.1-4.gdf028bb-pae #1 openSUSE Tumbleweed (unreleased)
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
Call Trace:
 dump_stack+0x7d/0xbd
 bad_page.cold.111+0x90/0xc7
 free_pages_check_bad+0x52/0x70
 free_pcppages_bulk+0x37d/0x570
 free_unref_page_commit+0x9a/0xc0
 free_unref_page+0x6a/0xa0
 __free_pages+0x17/0x30
 free_highmem_page+0x1e/0x50
 add_highpages_with_active_regions+0xd6/0x113
 set_highmem_pages_init+0x67/0x7d
 mem_init+0x23/0x1d9
 start_kernel+0x1c2/0x437
 i386_start_kernel+0x98/0x9c
 startup_32_smp+0x164/0x168

free_pages_check_bad expects mapcount == -1, but it is 1 with this patch.

Reverting the patch makes the BUGs go away -- the config diff is then:

@@ -617,7 +617,7 @@
 # CONFIG_PGTABLE_MAPPING is not set
 # CONFIG_ZSMALLOC_STAT is not set
 CONFIG_GENERIC_EARLY_IOREMAP=y
-CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
+CONFIG_ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT=y
 # CONFIG_IDLE_PAGE_TRACKING is not set
 CONFIG_FRAME_VECTOR=y
 # CONFIG_PERCPU_STATS is not set

>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -148,7 +148,6 @@ config PPC
>>  	select ARCH_MIGHT_HAVE_PC_PARPORT
>>  	select ARCH_MIGHT_HAVE_PC_SERIO
>>  	select ARCH_SUPPORTS_ATOMIC_RMW
>> -	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
>>  	select ARCH_USE_BUILTIN_BSWAP
>>  	select ARCH_USE_CMPXCHG_LOCKREF if PPC64
>>  	select ARCH_WANT_IPC_PARSE_VERSION
>> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
>> index 863a62a6de3c..525c2e3df6f5 100644
>> --- a/arch/s390/Kconfig
>> +++ b/arch/s390/Kconfig
>> @@ -108,7 +108,6 @@ config S390
>>  	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
>>  	select ARCH_SAVE_PAGE_KEYS if HIBERNATION
>>  	select ARCH_SUPPORTS_ATOMIC_RMW
>> -	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
>>  	select ARCH_SUPPORTS_NUMA_BALANCING
>>  	select ARCH_USE_BUILTIN_BSWAP
>>  	select ARCH_USE_CMPXCHG_LOCKREF
>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>> index df3276d6bfe3..00a5446de394 100644
>> --- a/arch/x86/Kconfig
>> +++ b/arch/x86/Kconfig
>> @@ -69,7 +69,6 @@ config X86
>>  	select ARCH_MIGHT_HAVE_PC_PARPORT
>>  	select ARCH_MIGHT_HAVE_PC_SERIO
>>  	select ARCH_SUPPORTS_ATOMIC_RMW
>> -	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
>>  	select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
>>  	select ARCH_USE_BUILTIN_BSWAP
>>  	select ARCH_USE_QUEUED_RWLOCKS
>> diff --git a/mm/Kconfig b/mm/Kconfig
>> index 9c4bdddd80c2..c6bd0309ce7a 100644
>> --- a/mm/Kconfig
>> +++ b/mm/Kconfig
>> @@ -639,15 +639,10 @@ config MAX_STACK_SIZE_MB
>>
>>  	  A sane initial value is 80 MB.
>>
>> -# For architectures that support deferred memory initialisation
>> -config ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
>> -	bool
>> -
>>  config DEFERRED_STRUCT_PAGE_INIT
>>  	bool "Defer initialisation of struct pages to kthreads"
>>  	default n
>> -	depends on ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
>> -	depends on NO_BOOTMEM && MEMORY_HOTPLUG
>> +	depends on NO_BOOTMEM
>>  	depends on !FLATMEM
>>  	help
>>  	  Ordinarily all struct pages are initialised during early boot in a

thanks,

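Editorial note on the "nonzero mapcount" report in the message above. The sketch below is a simplified userspace model, not kernel code, and every name in it is hypothetical. It assumes the kernel convention that the `_mapcount` field of a struct page stores the number of mappings minus one, so a free page holds -1 and the `mapcount:` value in the dump is the stored value plus one. Under that assumption the dumped `mapcount:1` corresponds to a stored value of 0, which would be consistent with a struct page that is still all zeroes when it reaches the free path. The thread itself does not state this conclusion.

```c
#include <stdbool.h>

/* Hypothetical, simplified stand-in for the kernel's struct page. */
struct page_model {
    int mapcount_raw;   /* models _mapcount: number of mappings minus one */
};

/* Value shown as "mapcount:" in the dump, assuming it is raw + 1. */
static int reported_mapcount(const struct page_model *p)
{
    return p->mapcount_raw + 1;
}

/* Models the free-path sanity check: a free page must hold raw == -1. */
static bool passes_free_check(const struct page_model *p)
{
    return p->mapcount_raw == -1;
}
```

With a zero-filled `struct page_model`, `reported_mapcount()` yields 1 and `passes_free_check()` fails, mirroring the values in the report.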
On Sat, Jun 16, 2018 at 4:04 AM Jiri Slaby <jslaby@suse.cz> wrote:
>
> On 11/21/2017, 08:24 AM, Michal Hocko wrote:
> > On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
> >> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
> >> as all the page initialization code is in common code.
> >>
> >> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
> >> does not really use hotplug memory functionality. So, we can remove this
> >> requirement as well.
> >>
> >> This patch allows to use deferred struct page initialization on all
> >> platforms with memblock allocator.
> >>
> >> Tested on x86, arm64, and sparc. Also, verified that code compiles on
> >> PPC with CONFIG_MEMORY_HOTPLUG disabled.
> >
> > There is slight risk that we will encounter corner cases on some
> > architectures with weird memory layout/topology
>
> Which x86_32-pae seems to be. Many bad page state errors are emitted
> during boot when this patch is applied:

Hi Jiri,

Thank you for reporting this bug.

Because 32-bit systems are limited in the maximum amount of physical
memory, they don't need deferred struct pages. So, we can add depends
on 64BIT to DEFERRED_STRUCT_PAGE_INIT in mm/Kconfig.

However, before we do this, I want to try reproducing this problem and
root cause it, as it might expose a general problem that is not 32-bit
specific.

Thank you,
Pavel

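Editorial note: the physical-memory limit mentioned above can be put in numbers using `max_arch_pfn = 0x1000000`, which the PAE kernel prints in the boot log posted later in this thread. The sketch assumes 4 KiB pages and treats `max_arch_pfn` as an exclusive upper bound on page frame numbers; the function name is hypothetical.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u  /* assumption: 4 KiB pages */

/* Largest addressable physical memory, in GiB, for a given max_arch_pfn. */
static uint64_t max_phys_gib(uint64_t max_arch_pfn)
{
    return (max_arch_pfn << PAGE_SHIFT) >> 30;
}
```

For `0x1000000` this works out to 64 GiB, which is small next to the multi-terabyte machines that deferred initialisation is usually aimed at.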
On Tue, Jun 19, 2018 at 9:50 AM Pavel Tatashin <pasha.tatashin@oracle.com> wrote:
>
> On Sat, Jun 16, 2018 at 4:04 AM Jiri Slaby <jslaby@suse.cz> wrote:
> >
> > On 11/21/2017, 08:24 AM, Michal Hocko wrote:
> > > On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
> > >> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
> > >> as all the page initialization code is in common code.
> > >>
> > >> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
> > >> does not really use hotplug memory functionality. So, we can remove this
> > >> requirement as well.
> > >>
> > >> This patch allows to use deferred struct page initialization on all
> > >> platforms with memblock allocator.
> > >>
> > >> Tested on x86, arm64, and sparc. Also, verified that code compiles on
> > >> PPC with CONFIG_MEMORY_HOTPLUG disabled.
> > >
> > > There is slight risk that we will encounter corner cases on some
> > > architectures with weird memory layout/topology
> >
> > Which x86_32-pae seems to be. Many bad page state errors are emitted
> > during boot when this patch is applied:
>
> Hi Jiri,
>
> Thank you for reporting this bug.
>
> Because 32-bit systems are limited in the maximum amount of physical
> memory, they don't need deferred struct pages. So, we can add depends
> on 64BIT to DEFERRED_STRUCT_PAGE_INIT in mm/Kconfig.
>
> However, before we do this, I want to try reproducing this problem and
> root cause it, as it might expose a general problem that is not 32-bit
> specific.

Hi Jiri,

Could you please attach your config and full qemu arguments that you
used to reproduce this bug.

Thank you,
Pavel

>
> Thank you,
> Pavel

On 06/19/2018, 09:56 PM, Pavel Tatashin wrote:
> On Tue, Jun 19, 2018 at 9:50 AM Pavel Tatashin
> <pasha.tatashin@oracle.com> wrote:
>>
>> On Sat, Jun 16, 2018 at 4:04 AM Jiri Slaby <jslaby@suse.cz> wrote:
>>>
>>> On 11/21/2017, 08:24 AM, Michal Hocko wrote:
>>>> On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
>>>>> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
>>>>> as all the page initialization code is in common code.
>>>>>
>>>>> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
>>>>> does not really use hotplug memory functionality. So, we can remove this
>>>>> requirement as well.
>>>>>
>>>>> This patch allows to use deferred struct page initialization on all
>>>>> platforms with memblock allocator.
>>>>>
>>>>> Tested on x86, arm64, and sparc. Also, verified that code compiles on
>>>>> PPC with CONFIG_MEMORY_HOTPLUG disabled.
>>>>
>>>> There is slight risk that we will encounter corner cases on some
>>>> architectures with weird memory layout/topology
>>>
>>> Which x86_32-pae seems to be. Many bad page state errors are emitted
>>> during boot when this patch is applied:
>>
>> Hi Jiri,
>>
>> Thank you for reporting this bug.
>>
>> Because 32-bit systems are limited in the maximum amount of physical
>> memory, they don't need deferred struct pages. So, we can add depends
>> on 64BIT to DEFERRED_STRUCT_PAGE_INIT in mm/Kconfig.
>>
>> However, before we do this, I want to try reproducing this problem and
>> root cause it, as it might expose a general problem that is not 32-bit
>> specific.
>
> Hi Jiri,
>
> Could you please attach your config and full qemu arguments that you
> used to reproduce this bug.

Hi,

It seems I never replied. Attaching .config and the qemu cmdline:
$ qemu-kvm -m 2000 -hda /dev/null -kernel bzImage

"-m 2000" is important to reproduce.

If I disable CONFIG_DEFERRED_STRUCT_PAGE_INIT (which the patch allowed
to enable), the error goes away, of course.

thanks,

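Editorial note on why the guest memory size may matter. The 4.19.0-rc1 boot log posted later in this thread reports low RAM ending at 0x367fe000 and "Initializing HighMem for node 0 (000367fe:0007cfe0)". The sketch below, assuming 4 KiB pages and using hypothetical helper names, reproduces the "871MB LOWMEM" and "1127MB HIGHMEM" figures from those page frame numbers. Both crash traces in the thread pass through free_highmem_page(), so a guest that fits entirely in low memory would presumably not reach that path; that last point is an inference, not something stated by the participants.

```c
#include <stdint.h>

#define PAGE_SHIFT 12u  /* assumption: 4 KiB pages, as on x86-32 */

/* Number of pages in the half-open PFN range [start_pfn, end_pfn). */
static uint64_t pfn_span(uint64_t start_pfn, uint64_t end_pfn)
{
    return end_pfn - start_pfn;
}

/* Convert a page count to whole MiB, rounding down. */
static uint64_t pages_to_mib(uint64_t pages)
{
    return (pages << PAGE_SHIFT) >> 20;
}
```

Applied to the logged boundaries, `pages_to_mib(pfn_span(0, 0x367fe))` gives the low-memory size and `pages_to_mib(pfn_span(0x367fe, 0x7cfe0))` the high-memory size.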
pasha.tatashin@oracle.com -> pavel.tatashin@microsoft.com

due to
 550 5.1.1 Unknown recipient address.


On 08/24/2018, 09:32 AM, Jiri Slaby wrote:
> On 06/19/2018, 09:56 PM, Pavel Tatashin wrote:
>> On Tue, Jun 19, 2018 at 9:50 AM Pavel Tatashin
>> <pasha.tatashin@oracle.com> wrote:
>>>
>>> On Sat, Jun 16, 2018 at 4:04 AM Jiri Slaby <jslaby@suse.cz> wrote:
>>>>
>>>> On 11/21/2017, 08:24 AM, Michal Hocko wrote:
>>>>> On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
>>>>>> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
>>>>>> as all the page initialization code is in common code.
>>>>>>
>>>>>> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
>>>>>> does not really use hotplug memory functionality. So, we can remove this
>>>>>> requirement as well.
>>>>>>
>>>>>> This patch allows to use deferred struct page initialization on all
>>>>>> platforms with memblock allocator.
>>>>>>
>>>>>> Tested on x86, arm64, and sparc. Also, verified that code compiles on
>>>>>> PPC with CONFIG_MEMORY_HOTPLUG disabled.
>>>>>
>>>>> There is slight risk that we will encounter corner cases on some
>>>>> architectures with weird memory layout/topology
>>>>
>>>> Which x86_32-pae seems to be. Many bad page state errors are emitted
>>>> during boot when this patch is applied:
>>>
>>> Hi Jiri,
>>>
>>> Thank you for reporting this bug.
>>>
>>> Because 32-bit systems are limited in the maximum amount of physical
>>> memory, they don't need deferred struct pages. So, we can add depends
>>> on 64BIT to DEFERRED_STRUCT_PAGE_INIT in mm/Kconfig.
>>>
>>> However, before we do this, I want to try reproducing this problem and
>>> root cause it, as it might expose a general problem that is not 32-bit
>>> specific.
>>
>> Hi Jiri,
>>
>> Could you please attach your config and full qemu arguments that you
>> used to reproduce this bug.
>
> Hi,
>
> I seem I never replied. Attaching .config and the qemu cmdline:
> $ qemu-kvm -m 2000 -hda /dev/null -kernel bzImage
>
> "-m 2000" is important to reproduce.
>
> If I disable CONFIG_DEFERRED_STRUCT_PAGE_INIT (which the patch allowed
> to enable), the error goes away, of course.
>
> thanks,
>

Thank you Jiri, I am studying it.

Pavel

On 8/24/18 3:44 AM, Jiri Slaby wrote:
> pasha.tatashin@oracle.com -> pavel.tatashin@microsoft.com
>
> due to
>  550 5.1.1 Unknown recipient address.
>
>
> On 08/24/2018, 09:32 AM, Jiri Slaby wrote:
>> On 06/19/2018, 09:56 PM, Pavel Tatashin wrote:
>>> On Tue, Jun 19, 2018 at 9:50 AM Pavel Tatashin
>>> <pasha.tatashin@oracle.com> wrote:
>>>>
>>>> On Sat, Jun 16, 2018 at 4:04 AM Jiri Slaby <jslaby@suse.cz> wrote:
>>>>>
>>>>> On 11/21/2017, 08:24 AM, Michal Hocko wrote:
>>>>>> On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
>>>>>>> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
>>>>>>> as all the page initialization code is in common code.
>>>>>>>
>>>>>>> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
>>>>>>> does not really use hotplug memory functionality. So, we can remove this
>>>>>>> requirement as well.
>>>>>>>
>>>>>>> This patch allows to use deferred struct page initialization on all
>>>>>>> platforms with memblock allocator.
>>>>>>>
>>>>>>> Tested on x86, arm64, and sparc. Also, verified that code compiles on
>>>>>>> PPC with CONFIG_MEMORY_HOTPLUG disabled.
>>>>>>
>>>>>> There is slight risk that we will encounter corner cases on some
>>>>>> architectures with weird memory layout/topology
>>>>>
>>>>> Which x86_32-pae seems to be. Many bad page state errors are emitted
>>>>> during boot when this patch is applied:
>>>>
>>>> Hi Jiri,
>>>>
>>>> Thank you for reporting this bug.
>>>>
>>>> Because 32-bit systems are limited in the maximum amount of physical
>>>> memory, they don't need deferred struct pages. So, we can add depends
>>>> on 64BIT to DEFERRED_STRUCT_PAGE_INIT in mm/Kconfig.
>>>>
>>>> However, before we do this, I want to try reproducing this problem and
>>>> root cause it, as it might expose a general problem that is not 32-bit
>>>> specific.
>>>
>>> Hi Jiri,
>>>
>>> Could you please attach your config and full qemu arguments that you
>>> used to reproduce this bug.
>>
>> Hi,
>>
>> I seem I never replied. Attaching .config and the qemu cmdline:
>> $ qemu-kvm -m 2000 -hda /dev/null -kernel bzImage
>>
>> "-m 2000" is important to reproduce.
>>
>> If I disable CONFIG_DEFERRED_STRUCT_PAGE_INIT (which the patch allowed
>> to enable), the error goes away, of course.
>>
>> thanks,
>>
>
>

Hi Jiri,

I believe this bug is fixed with this change:

d39f8fb4b7776dcb09ec3bf7a321547083078ee3
mm: make DEFERRED_STRUCT_PAGE_INIT explicitly depend on SPARSEMEM

I am not able to reproduce this problem on x86-32.

Pavel

On 8/30/18 10:35 AM, Pavel Tatashin wrote:
> Thank you Jiri, I am studying it.
>
> Pavel
>
> On 8/24/18 3:44 AM, Jiri Slaby wrote:
>> pasha.tatashin@oracle.com -> pavel.tatashin@microsoft.com
>>
>> due to
>>  550 5.1.1 Unknown recipient address.
>>
>>
>> On 08/24/2018, 09:32 AM, Jiri Slaby wrote:
>>> On 06/19/2018, 09:56 PM, Pavel Tatashin wrote:
>>>> On Tue, Jun 19, 2018 at 9:50 AM Pavel Tatashin
>>>> <pasha.tatashin@oracle.com> wrote:
>>>>>
>>>>> On Sat, Jun 16, 2018 at 4:04 AM Jiri Slaby <jslaby@suse.cz> wrote:
>>>>>>
>>>>>> On 11/21/2017, 08:24 AM, Michal Hocko wrote:
>>>>>>> On Thu 16-11-17 20:46:01, Pavel Tatashin wrote:
>>>>>>>> There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
>>>>>>>> as all the page initialization code is in common code.
>>>>>>>>
>>>>>>>> Also, there is no need to depend on MEMORY_HOTPLUG, as initialization code
>>>>>>>> does not really use hotplug memory functionality. So, we can remove this
>>>>>>>> requirement as well.
>>>>>>>>
>>>>>>>> This patch allows to use deferred struct page initialization on all
>>>>>>>> platforms with memblock allocator.
>>>>>>>>
>>>>>>>> Tested on x86, arm64, and sparc. Also, verified that code compiles on
>>>>>>>> PPC with CONFIG_MEMORY_HOTPLUG disabled.
>>>>>>>
>>>>>>> There is slight risk that we will encounter corner cases on some
>>>>>>> architectures with weird memory layout/topology
>>>>>>
>>>>>> Which x86_32-pae seems to be. Many bad page state errors are emitted
>>>>>> during boot when this patch is applied:
>>>>>
>>>>> Hi Jiri,
>>>>>
>>>>> Thank you for reporting this bug.
>>>>>
>>>>> Because 32-bit systems are limited in the maximum amount of physical
>>>>> memory, they don't need deferred struct pages. So, we can add depends
>>>>> on 64BIT to DEFERRED_STRUCT_PAGE_INIT in mm/Kconfig.
>>>>>
>>>>> However, before we do this, I want to try reproducing this problem and
>>>>> root cause it, as it might expose a general problem that is not 32-bit
>>>>> specific.
>>>>
>>>> Hi Jiri,
>>>>
>>>> Could you please attach your config and full qemu arguments that you
>>>> used to reproduce this bug.
>>>
>>> Hi,
>>>
>>> I seem I never replied. Attaching .config and the qemu cmdline:
>>> $ qemu-kvm -m 2000 -hda /dev/null -kernel bzImage
>>>
>>> "-m 2000" is important to reproduce.
>>>
>>> If I disable CONFIG_DEFERRED_STRUCT_PAGE_INIT (which the patch allowed
>>> to enable), the error goes away, of course.
>>>
>>> thanks,
>>>
>>
>>

On 08/30/2018, 05:45 PM, Pasha Tatashin wrote:
> Hi Jiri,
>
> I believe this bug is fixed with this change:
>
> d39f8fb4b7776dcb09ec3bf7a321547083078ee3
> mm: make DEFERRED_STRUCT_PAGE_INIT explicitly depend on SPARSEMEM

Hi,

it only shifted. Enabling only SPARSEMEM works fine, enabling also
DEFERRED_STRUCT_PAGE_INIT doesn't even boot – immediately reboots
(config attached).

thanks,

On 08/31/2018, 01:26 PM, Jiri Slaby wrote:
> On 08/30/2018, 05:45 PM, Pasha Tatashin wrote:
>> Hi Jiri,
>>
>> I believe this bug is fixed with this change:
>>
>> d39f8fb4b7776dcb09ec3bf7a321547083078ee3
>> mm: make DEFERRED_STRUCT_PAGE_INIT explicitly depend on SPARSEMEM
>
> Hi,
>
> it only shifted. Enabling only SPARSEMEM works fine, enabling also
> DEFERRED_STRUCT_PAGE_INIT doesn't even boot – immediately reboots
> (config attached).

Wow, earlyprintk is up at the moment of crash already:
[ 0.000000] Linux version 4.19.0-rc1-pae (jslaby@kunlun) (gcc version 4.8.5 (SUSE Linux)) #4 SMP PREEMPT Fri Aug 31 13:18:33 CEST 2018
[ 0.000000] x86/fpu: x87 FPU will use FXSAVE
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007cfdffff] usable
[ 0.000000] BIOS-e820: [mem 0x000000007cfe0000-0x000000007cffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[ 0.000000] bootconsole [earlyser0] enabled
[ 0.000000] NX (Execute Disable) protection: active
[ 0.000000] SMBIOS 2.8 present.
[ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[ 0.000000] Hypervisor detected: KVM
[ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[ 0.000002] kvm-clock: cpu 0, msr 1d12c001, primary cpu clock
[ 0.000002] kvm-clock: using sched offset of 1597117996 cycles
[ 0.001395] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[ 0.006245] tsc: Detected 2808.000 MHz processor
[ 0.010055] last_pfn = 0x7cfe0 max_arch_pfn = 0x1000000
[ 0.011483] x86/PAT: PAT not supported by CPU.
[ 0.012580] x86/PAT: Configuration [0-7]: WB WT UC- UC WB WT UC- UC
[ 0.020644] found SMP MP-table at [mem 0x000f5d20-0x000f5d2f] mapped at [(ptrval)]
[ 0.023528] Scanning 1 areas for low memory corruption
[ 0.025047] ACPI: Early table checksum verification disabled
[ 0.026581] ACPI: RSDP 0x00000000000F5B40 000014 (v00 BOCHS )
[ 0.028031] ACPI: RSDT 0x000000007CFE157C 000030 (v01 BOCHS BXPCRSDT 00000001 BXPC 00000001)
[ 0.029996] ACPI: FACP 0x000000007CFE1458 000074 (v01 BOCHS BXPCFACP 00000001 BXPC 00000001)
[ 0.032234] ACPI: DSDT 0x000000007CFE0040 001418 (v01 BOCHS BXPCDSDT 00000001 BXPC 00000001)
[ 0.034662] ACPI: FACS 0x000000007CFE0000 000040
[ 0.036126] ACPI: APIC 0x000000007CFE14CC 000078 (v01 BOCHS BXPCAPIC 00000001 BXPC 00000001)
[ 0.038235] ACPI: HPET 0x000000007CFE1544 000038 (v01 BOCHS BXPCHPET 00000001 BXPC 00000001)
[ 0.040373] No NUMA configuration found
[ 0.041407] Faking a node at [mem 0x0000000000000000-0x000000007cfdffff]
[ 0.043306] NODE_DATA(0) allocated [mem 0x367fc000-0x367fcfff]
[ 0.044958] 1127MB HIGHMEM available.
[ 0.045940] 871MB LOWMEM available.
[ 0.046978] mapped low ram: 0 - 367fe000
[ 0.048200] low ram: 0 - 367fe000
[ 0.050830] Zone ranges:
[ 0.051625] DMA [mem 0x0000000000001000-0x0000000000ffffff]
[ 0.053295] Normal [mem 0x0000000001000000-0x00000000367fdfff]
[ 0.054921] HighMem [mem 0x00000000367fe000-0x000000007cfdffff]
[ 0.056408] Movable zone start for each node
[ 0.057452] Early memory node ranges
[ 0.058377] node 0: [mem 0x0000000000001000-0x000000000009efff]
[ 0.059946] node 0: [mem 0x0000000000100000-0x000000007cfdffff]
[ 0.061825] Reserved but unavailable: 12418 pages
[ 0.061828] Initmem setup node 0 [mem 0x0000000000001000-0x000000007cfdffff]
[ 0.074252] Using APIC driver default
[ 0.075615] ACPI: PM-Timer IO Port: 0x608
[ 0.076574] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[ 0.077995] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[ 0.079610] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[ 0.081111] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[ 0.082786] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[ 0.084297] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[ 0.085933] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[ 0.087729] Using ACPI (MADT) for SMP configuration information
[ 0.089119] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[ 0.090351] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[ 0.091561] PM: Registered nosave memory: [mem 0x00000000-0x00000fff]
[ 0.093361] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff]
[ 0.096382] PM: Registered nosave memory: [mem 0x000a0000-0x000effff]
[ 0.098130] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff]
[ 0.099729] [mem 0x7d000000-0xfeffbfff] available for PCI devices
[ 0.101034] Booting paravirtualized kernel on KVM
[ 0.102034] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
[ 0.104207] random: get_random_bytes called from start_kernel+0x77/0x47c with crng_init=0
[ 0.105913] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:1 nr_node_ids:1
[ 0.107548] percpu: Embedded 31 pages/cpu @(ptrval) s94604 r0 d32372 u126976
[ 0.109019] KVM setup async PF for cpu 0
[ 0.109825] kvm-stealtime: cpu 0, msr 367e5300
[ 0.110755] Built 1 zonelists, mobility grouping on. Total pages: 509908
[ 0.112113] Policy zone: HighMem
[ 0.112755] Kernel command line: earlyprintk=serial
[ 0.113773] Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
[ 0.115788] Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
[ 0.117465] Initializing CPU#0
[ 0.118522] Initializing HighMem for node 0 (000367fe:0007cfe0)
[ 0.161140] BUG: unable to handle kernel NULL pointer dereference at 00000028
[ 0.162671] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[ 0.163857] Oops: 0000 [#1] PREEMPT SMP PTI
[ 0.164862] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc1-pae #4 openSUSE Tumbleweed (unreleased)
[ 0.167041] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
[ 0.169389] EIP: free_unref_page_prepare.part.75+0x26/0x50
[ 0.170337] Code: 00 00 00 00 e8 e7 a4 e9 ff 89 d1 c1 ea 11 55 8b 14 d5 84 d2 1c dd c1 e9 07 89 e5 56 81 e1 fc 03 00 00 53 89 cb c1 eb 05 89 ce <8b> 14 9a 83 e6 1f b9 1d 00 00 00 29 f1 d3 ea 83 e2 07 89 50 10 b8
[ 0.174205] EAX: f4cfa000 EBX: 0000000a ECX: 00000150 EDX: 00000000
[ 0.175422] ESI: 00000150 EDI: 00d80000 EBP: dcf2be50 ESP: dcf2be48
[ 0.176724] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210007
[ 0.178075] CR0: 80050033 CR2: 00000028 CR3: 1d118000 CR4: 000006b0
[ 0.179354] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.180629] DR6: fffe0ff0 DR7: 00000400
[ 0.181400] Call Trace:
[ 0.181907] free_unref_page+0x3a/0x90
[ 0.182642] __free_pages+0x25/0x30
[ 0.183748] free_highmem_page+0x1e/0x50
[ 0.184594] add_highpages_with_active_regions+0x123/0x125
[ 0.185813] set_highmem_pages_init+0x83/0x8d
[ 0.186847] mem_init+0x26/0x240
[ 0.187590] ? vprintk_func+0x38/0xd0
[ 0.188427] ? idt_setup_from_table.constprop.1+0x45/0x70
[ 0.189666] ? set_intr_gate+0x39/0x40
[ 0.190551] ? general_protection+0xc/0xc
[ 0.191818] ? update_intr_gate+0x1e/0x20
[ 0.192817] ? kvm_apf_trap_init+0x17/0x19
[ 0.193800] ? trap_init+0x77/0x7d
[ 0.194644] start_kernel+0x203/0x47c
[ 0.195491] ? set_init_arg+0x57/0x57
[ 0.196385] i386_start_kernel+0x143/0x146
[ 0.197351] startup_32_smp+0x164/0x168
[ 0.198232] Modules linked in:
[ 0.199072] CR2: 0000000000000028
[ 0.199983] ---[ end trace 69f4a864c8bd9bcd ]---
[ 0.201198] EIP: free_unref_page_prepare.part.75+0x26/0x50
[ 0.202610] Code: 00 00 00 00 e8 e7 a4 e9 ff 89 d1 c1 ea 11 55 8b 14 d5 84 d2 1c dd c1 e9 07 89 e5 56 81 e1 fc 03 00 00 53 89 cb c1 eb 05 89 ce <8b> 14 9a 83 e6 1f b9 1d 00 00 00 29 f1 d3 ea 83 e2 07 89 50 10 b8
[ 0.206942] EAX: f4cfa000 EBX: 0000000a ECX: 00000150 EDX: 00000000
[ 0.208177] ESI: 00000150 EDI: 00d80000 EBP: dcf2be50 ESP: dd11fefc
[ 0.209438] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210007
[ 0.210826] CR0: 80050033 CR2: 00000028 CR3: 1d118000 CR4: 000006b0
[ 0.212155] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.213752] DR6: fffe0ff0 DR7: 00000400

>
> thanks,
>

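Editorial note on the oops above: the register dump is enough to reconstruct the faulting address. Reading the marked bytes `<8b> 14 9a` as `mov (%edx,%ebx,4),%edx` is this editor's decoding of the opcode, not something stated in the thread. Under that reading the effective address is EDX + EBX * 4, and with EDX = 0 and EBX = 0xa it equals 0x28, which matches CR2. That would point at a NULL base pointer being indexed as an array of 32-bit words; which kernel pointer it is cannot be determined from the dump alone. The helper name below is hypothetical.

```c
#include <stdint.h>

/* Effective address for base + index * scale addressing (32-bit wraparound). */
static uint32_t effective_address(uint32_t base, uint32_t index, uint32_t scale)
{
    return base + index * scale;
}
```

Calling `effective_address(0x00000000, 0x0000000a, 4)` with the EDX and EBX values from the dump reproduces the address reported in CR2.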
Thanks Jiri, I am now able to reproduce it with your new config.

Yesterday I tried enabling sparsemem and deferred_struct_init on
x86_32, and that kernel booted fine, so there must be something else in
your config that helps to trigger this problem. I am studying it now.

[ 0.051245] Initializing CPU#0
[ 0.051682] Initializing HighMem for node 0 (000367fe:0007ffe0)
[ 0.067499] BUG: unable to handle kernel NULL pointer dereference at 00000028
[ 0.068452] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
[ 0.069105] Oops: 0000 [#1] PREEMPT SMP PTI
[ 0.069595] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc1-pae_pt_jiri #1
[ 0.070382] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
[ 0.071545] EIP: free_unref_page_prepare.part.70+0x2c/0x50
[ 0.072178] Code: 19 e9 ff 89 d1 55 c1 ea 11 c1 e9 07 8b 14 d5 44 52 fd d6 81 e1 fc 03 00 00 89 e5 56 53 89 cb be 1d 00 00 00 c1 eb 05 83 e1 1f <8b> 14 9a 29 ce 89 f1 d3 ea 83 e2 07 89 50 10 b8 01 00 00 00 5b 5e
[ 0.074296] EAX: f4cfa000 EBX: 0000000a ECX: 00000010 EDX: 00000000
[ 0.075005] ESI: 0000001d EDI: 0007ffe0 EBP: d6d41ed0 ESP: d6d41ec8
[ 0.075714] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210002
[ 0.076508] CR0: 80050033 CR2: 00000028 CR3: 16f20000 CR4: 000406b0
[ 0.077242] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.077934] DR6: fffe0ff0 DR7: 00000400
[ 0.078380] Call Trace:
[ 0.078670] free_unref_page+0x3a/0x90
[ 0.079136] __free_pages+0x25/0x30
[ 0.079533] free_highmem_page+0x1e/0x50
[ 0.079978] add_highpages_with_active_regions+0xd1/0x11f
[ 0.080592] set_highmem_pages_init+0x67/0x7d
[ 0.081076] mem_init+0x30/0x1fc
[ 0.081434] start_kernel+0x1cc/0x44c
[ 0.081874] i386_start_kernel+0x98/0x9c
[ 0.082401] startup_32_smp+0x164/0x168
[ 0.082873] Modules linked in:
[ 0.083228] CR2: 0000000000000028
[ 0.083606] ---[ end trace a5990d9ace2ec990 ]---
[ 0.084128] EIP: free_unref_page_prepare.part.70+0x2c/0x50
[ 0.084747] Code: 19 e9 ff 89 d1 55 c1 ea 11 c1 e9 07 8b 14 d5 44 52 fd d6 81 e1 fc 03 00 00 89 e5 56 53 89 cb be 1d 00 00 00 c1 eb 05 83 e1 1f <8b> 14 9a 29 ce 89 f1 d3 ea 83 e2 07 89 50 10 b8 01 00 00 00 5b 5e
[ 0.086874] EAX: f4cfa000 EBX: 0000000a ECX: 00000010 EDX: 00000000
[ 0.087581] ESI: 0000001d EDI: 0007ffe0 EBP: d6d41ed0 ESP: d6f27efc
[ 0.088287] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210002
[ 0.089139] CR0: 80050033 CR2: 00000028 CR3: 16f20000 CR4: 000406b0
[ 0.089850] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[ 0.090557] DR6: fffe0ff0 DR7: 00000400
[ 0.090992] Kernel panic - not syncing: Attempted to kill the idle task!

Pavel

On 8/31/18 7:29 AM, Jiri Slaby wrote:
> On 08/31/2018, 01:26 PM, Jiri Slaby wrote:
>> On 08/30/2018, 05:45 PM, Pasha Tatashin wrote:
>>> Hi Jiri,
>>>
>>> I believe this bug is fixed with this change:
>>>
>>> d39f8fb4b7776dcb09ec3bf7a321547083078ee3
>>> mm: make DEFERRED_STRUCT_PAGE_INIT explicitly depend on SPARSEMEM
>>
>> Hi,
>>
>> it only shifted. Enabling only SPARSEMEM works fine, enabling also
>> DEFERRED_STRUCT_PAGE_INIT doesn't even boot – immediately reboots
>> (config attached).
>
> Wow, earlyprintk is up at the moment of crash already:
> [ 0.000000] Linux version 4.19.0-rc1-pae (jslaby@kunlun) (gcc version 4.8.5 (SUSE Linux)) #4 SMP PREEMPT Fri Aug 31 13:18:33 CEST 2018
> [ 0.000000] x86/fpu: x87 FPU will use FXSAVE
> [ 0.000000] BIOS-provided physical RAM map:
> [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007cfdffff] usable
> [ 0.000000] BIOS-e820: [mem 0x000000007cfe0000-0x000000007cffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
> [ 0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
> [ 0.000000] bootconsole [earlyser0] enabled
> [ 0.000000] NX (Execute Disable) protection: active
> [ 0.000000] SMBIOS 2.8 present.
> [ 0.000000] DMI: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014
> [ 0.000000] Hypervisor detected: KVM
> [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
> [ 0.000002] kvm-clock: cpu 0, msr 1d12c001, primary cpu clock
> [ 0.000002] kvm-clock: using sched offset of 1597117996 cycles
> [ 0.001395] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
> [ 0.006245] tsc: Detected 2808.000 MHz processor
> [ 0.010055] last_pfn = 0x7cfe0 max_arch_pfn = 0x1000000
> [ 0.011483] x86/PAT: PAT not supported by CPU.
> [ 0.012580] x86/PAT: Configuration [0-7]: WB WT UC- UC WB WT UC- > UC > [ 0.020644] found SMP MP-table at [mem 0x000f5d20-0x000f5d2f] mapped > at [(ptrval)] > [ 0.023528] Scanning 1 areas for low memory corruption > [ 0.025047] ACPI: Early table checksum verification disabled > [ 0.026581] ACPI: RSDP 0x00000000000F5B40 000014 (v00 BOCHS ) > [ 0.028031] ACPI: RSDT 0x000000007CFE157C 000030 (v01 BOCHS BXPCRSDT > 00000001 BXPC 00000001) > [ 0.029996] ACPI: FACP 0x000000007CFE1458 000074 (v01 BOCHS BXPCFACP > 00000001 BXPC 00000001) > [ 0.032234] ACPI: DSDT 0x000000007CFE0040 001418 (v01 BOCHS BXPCDSDT > 00000001 BXPC 00000001) > [ 0.034662] ACPI: FACS 0x000000007CFE0000 000040 > [ 0.036126] ACPI: APIC 0x000000007CFE14CC 000078 (v01 BOCHS BXPCAPIC > 00000001 BXPC 00000001) > [ 0.038235] ACPI: HPET 0x000000007CFE1544 000038 (v01 BOCHS BXPCHPET > 00000001 BXPC 00000001) > [ 0.040373] No NUMA configuration found > [ 0.041407] Faking a node at [mem 0x0000000000000000-0x000000007cfdffff] > [ 0.043306] NODE_DATA(0) allocated [mem 0x367fc000-0x367fcfff] > [ 0.044958] 1127MB HIGHMEM available. > [ 0.045940] 871MB LOWMEM available. 
> [ 0.046978] mapped low ram: 0 - 367fe000 > [ 0.048200] low ram: 0 - 367fe000 > [ 0.050830] Zone ranges: > [ 0.051625] DMA [mem 0x0000000000001000-0x0000000000ffffff] > [ 0.053295] Normal [mem 0x0000000001000000-0x00000000367fdfff] > [ 0.054921] HighMem [mem 0x00000000367fe000-0x000000007cfdffff] > [ 0.056408] Movable zone start for each node > [ 0.057452] Early memory node ranges > [ 0.058377] node 0: [mem 0x0000000000001000-0x000000000009efff] > [ 0.059946] node 0: [mem 0x0000000000100000-0x000000007cfdffff] > [ 0.061825] Reserved but unavailable: 12418 pages > [ 0.061828] Initmem setup node 0 [mem > 0x0000000000001000-0x000000007cfdffff] > [ 0.074252] Using APIC driver default > [ 0.075615] ACPI: PM-Timer IO Port: 0x608 > [ 0.076574] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1]) > [ 0.077995] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI > 0-23 > [ 0.079610] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) > [ 0.081111] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level) > [ 0.082786] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level) > [ 0.084297] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level) > [ 0.085933] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level) > [ 0.087729] Using ACPI (MADT) for SMP configuration information > [ 0.089119] ACPI: HPET id: 0x8086a201 base: 0xfed00000 > [ 0.090351] smpboot: Allowing 1 CPUs, 0 hotplug CPUs > [ 0.091561] PM: Registered nosave memory: [mem 0x00000000-0x00000fff] > [ 0.093361] PM: Registered nosave memory: [mem 0x0009f000-0x0009ffff] > [ 0.096382] PM: Registered nosave memory: [mem 0x000a0000-0x000effff] > [ 0.098130] PM: Registered nosave memory: [mem 0x000f0000-0x000fffff] > [ 0.099729] [mem 0x7d000000-0xfeffbfff] available for PCI devices > [ 0.101034] Booting paravirtualized kernel on KVM > [ 0.102034] clocksource: refined-jiffies: mask: 0xffffffff > max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns > [ 0.104207] random: get_random_bytes called 
from > start_kernel+0x77/0x47c with crng_init=0 > [ 0.105913] setup_percpu: NR_CPUS:64 nr_cpumask_bits:64 nr_cpu_ids:1 > nr_node_ids:1 > [ 0.107548] percpu: Embedded 31 pages/cpu @(ptrval) s94604 r0 d32372 > u126976 > [ 0.109019] KVM setup async PF for cpu 0 > [ 0.109825] kvm-stealtime: cpu 0, msr 367e5300 > [ 0.110755] Built 1 zonelists, mobility grouping on. Total pages: 509908 > [ 0.112113] Policy zone: HighMem > [ 0.112755] Kernel command line: earlyprintk=serial > [ 0.113773] Dentry cache hash table entries: 131072 (order: 7, 524288 > bytes) > [ 0.115788] Inode-cache hash table entries: 65536 (order: 6, 262144 > bytes) > [ 0.117465] Initializing CPU#0 > [ 0.118522] Initializing HighMem for node 0 (000367fe:0007cfe0) > [ 0.161140] BUG: unable to handle kernel NULL pointer dereference at > 00000028 > [ 0.162671] *pdpt = 0000000000000000 *pde = f000ff53f000ff53 > [ 0.163857] Oops: 0000 [#1] PREEMPT SMP PTI > [ 0.164862] CPU: 0 PID: 0 Comm: swapper Not tainted 4.19.0-rc1-pae #4 > openSUSE Tumbleweed (unreleased) > [ 0.167041] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), > BIOS 1.0.0-prebuilt.qemu-project.org 04/01/2014 > [ 0.169389] EIP: free_unref_page_prepare.part.75+0x26/0x50 > [ 0.170337] Code: 00 00 00 00 e8 e7 a4 e9 ff 89 d1 c1 ea 11 55 8b 14 > d5 84 d2 1c dd c1 e9 07 89 e5 56 81 e1 fc 03 00 00 53 89 cb c1 eb 05 89 > ce <8b> 14 9a 83 e6 1f b9 1d 00 00 00 29 f1 d3 ea 83 e2 07 89 50 10 b8 > [ 0.174205] EAX: f4cfa000 EBX: 0000000a ECX: 00000150 EDX: 00000000 > [ 0.175422] ESI: 00000150 EDI: 00d80000 EBP: dcf2be50 ESP: dcf2be48 > [ 0.176724] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210007 > [ 0.178075] CR0: 80050033 CR2: 00000028 CR3: 1d118000 CR4: 000006b0 > [ 0.179354] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 > [ 0.180629] DR6: fffe0ff0 DR7: 00000400 > [ 0.181400] Call Trace: > [ 0.181907] free_unref_page+0x3a/0x90 > [ 0.182642] __free_pages+0x25/0x30 > [ 0.183748] free_highmem_page+0x1e/0x50 > [ 0.184594] 
add_highpages_with_active_regions+0x123/0x125 > [ 0.185813] set_highmem_pages_init+0x83/0x8d > [ 0.186847] mem_init+0x26/0x240 > [ 0.187590] ? vprintk_func+0x38/0xd0 > [ 0.188427] ? idt_setup_from_table.constprop.1+0x45/0x70 > [ 0.189666] ? set_intr_gate+0x39/0x40 > [ 0.190551] ? general_protection+0xc/0xc > [ 0.191818] ? update_intr_gate+0x1e/0x20 > [ 0.192817] ? kvm_apf_trap_init+0x17/0x19 > [ 0.193800] ? trap_init+0x77/0x7d > [ 0.194644] start_kernel+0x203/0x47c > [ 0.195491] ? set_init_arg+0x57/0x57 > [ 0.196385] i386_start_kernel+0x143/0x146 > [ 0.197351] startup_32_smp+0x164/0x168 > [ 0.198232] Modules linked in: > [ 0.199072] CR2: 0000000000000028 > [ 0.199983] ---[ end trace 69f4a864c8bd9bcd ]--- > [ 0.201198] EIP: free_unref_page_prepare.part.75+0x26/0x50 > [ 0.202610] Code: 00 00 00 00 e8 e7 a4 e9 ff 89 d1 c1 ea 11 55 8b 14 > d5 84 d2 1c dd c1 e9 07 89 e5 56 81 e1 fc 03 00 00 53 89 cb c1 eb 05 89 > ce <8b> 14 9a 83 e6 1f b9 1d 00 00 00 29 f1 d3 ea 83 e2 07 89 50 10 b8 > [ 0.206942] EAX: f4cfa000 EBX: 0000000a ECX: 00000150 EDX: 00000000 > [ 0.208177] ESI: 00000150 EDI: 00d80000 EBP: dcf2be50 ESP: dd11fefc > [ 0.209438] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210007 > [ 0.210826] CR0: 80050033 CR2: 00000028 CR3: 1d118000 CR4: 000006b0 > [ 0.212155] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 > [ 0.213752] DR6: fffe0ff0 DR7: 00000400 > > >> >> thanks, >> > >
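As a cross-check on the oops above, the faulting address can be recomputed from the register dump. This is only a minimal sketch, assuming that the instruction at the `<8b> 14 9a` marker decodes to a 32-bit load from EDX + EBX*4; that is my reading of the ModRM/SIB bytes and is not stated anywhere in the thread. The helper and macro names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Effective address of an x86 "base + index * scale" memory operand.
 * uint32_t arithmetic wraps at 32 bits, as it does on a 32-bit CPU.
 */
static uint32_t sib_effective_address(uint32_t base, uint32_t index,
				      uint32_t scale)
{
	return base + index * scale;
}

/* Register values copied from the register dumps in the oops above. */
#define OOPS_EDX 0x00000000u	/* base register (assumed to hold the bitmap pointer) */
#define OOPS_EBX 0x0000000au	/* index register (assumed to be a word index) */
#define OOPS_CR2 0x00000028u	/* faulting address reported by the CPU */

/* Returns 1 if the recomputed address equals the reported CR2 value. */
static int oops_address_matches(void)
{
	return sib_effective_address(OOPS_EDX, OOPS_EBX, 4) == OOPS_CR2;
}
```

With EDX = 0 and EBX = 0xa, as in both register dumps, the recomputed address is 0x28, which equals CR2. Under the stated assumption, this is what a NULL base pointer indexed at word 10 would look like.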
On 08/31/2018, 02:10 PM, Pasha Tatashin wrote:
> Thanks Jiri, I am now able to reproduce it with your new config.
>
> I tried yesterday to enable sparsemem and deferred_struct_init on
> x86_32, and that kernel booted fine, so there must be something else in
> your config that helps to trigger this problem. I am studying it now.
>
> [ 0.051245] Initializing CPU#0
> [ 0.051682] Initializing HighMem for node 0 (000367fe:0007ffe0)
> [ 0.067499] BUG: unable to handle kernel NULL pointer dereference at
> 00000028
> [ 0.068452] *pdpt = 0000000000000000 *pde = f000ff53f000ff53
> [ 0.069105] Oops: 0000 [#1] PREEMPT SMP PTI
> [ 0.069595] CPU: 0 PID: 0 Comm: swapper Not tainted
> 4.19.0-rc1-pae_pt_jiri #1
> [ 0.070382] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
> BIOS 1.11.0-20171110_100015-anatol 04/01/2014
> [ 0.071545] EIP: free_unref_page_prepare.part.70+0x2c/0x50
> [ 0.072178] Code: 19 e9 ff 89 d1 55 c1 ea 11 c1 e9 07 8b 14 d5 44 52
> fd d6 81 e1 fc 03 00 00 89 e5 56 53 89 cb be 1d 00 00 00 c1 eb 05 83 e1
> 1f <8b> 14 9a 29 ce 89 f1 d3 ea 83 e2 07 89 50 10 b8 01 00 00 00 5b 5e
> [ 0.074296] EAX: f4cfa000 EBX: 0000000a ECX: 00000010 EDX: 00000000
> [ 0.075005] ESI: 0000001d EDI: 0007ffe0 EBP: d6d41ed0 ESP: d6d41ec8
> [ 0.075714] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068 EFLAGS: 00210002
> [ 0.076508] CR0: 80050033 CR2: 00000028 CR3: 16f20000 CR4: 000406b0
> [ 0.077242] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
> [ 0.077934] DR6: fffe0ff0 DR7: 00000400
> [ 0.078380] Call Trace:
> [ 0.078670] free_unref_page+0x3a/0x90
> [ 0.079136] __free_pages+0x25/0x30
> [ 0.079533] free_highmem_page+0x1e/0x50
> [ 0.079978] add_highpages_with_active_regions+0xd1/0x11f
> [ 0.080592] set_highmem_pages_init+0x67/0x7d
> [ 0.081076] mem_init+0x30/0x1fc

page_to_pfn(pfn_to_page(pfn)) != pfn with my .config on pfns >= 0x60000:

[ 0.157667] add_highpages_with_active_regions: pfn=5fffb pg=f55f9f4c pfn(pg(pfn)=5fffb sec=2
[ 0.159231] add_highpages_with_active_regions: pfn=5fffc pg=f55f9f70 pfn(pg(pfn)=5fffc sec=2
[ 0.161020] add_highpages_with_active_regions: pfn=5fffd pg=f55f9f94 pfn(pg(pfn)=5fffd sec=2
[ 0.163149] add_highpages_with_active_regions: pfn=5fffe pg=f55f9fb8 pfn(pg(pfn)=5fffe sec=2
[ 0.165204] add_highpages_with_active_regions: pfn=5ffff pg=f55f9fdc pfn(pg(pfn)=5ffff sec=2
[ 0.167216] add_highpages_with_active_regions: pfn=60000 pg=f4cfa000 pfn(pg(pfn)=c716a800 sec=3

So add_highpages_with_active_regions passes the page down to
free_highmem_page, and later free_unref_page does page_to_pfn(page).
__get_pfnblock_flags_mask then operates on this modified pfn, which leads
to the crash: __pfn_to_section(pfn)->pageblock_flags is NULL!

Note that __pfn_to_section(pfn)->pageblock_flags on the original pfn
returns a valid bitmap.

thanks,
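The round-trip mismatch reported above can be illustrated with a toy user-space model. This is only a sketch, under the assumption that page_to_pfn() takes the section number from a value stored in the page itself, as the classic (non-vmemmap) SPARSEMEM layout does. All toy_* names and the tiny geometry are made up for illustration, and the model does not claim to identify the actual root cause of this report.

```c
#include <assert.h>
#include <stddef.h>

/* Toy geometry: 4 sections of 16 pages each (hypothetical values). */
#define TOY_SECTION_SHIFT	4UL
#define TOY_PAGES_PER_SECTION	(1UL << TOY_SECTION_SHIFT)
#define TOY_NR_SECTIONS		4UL

struct toy_page {
	unsigned long section;	/* stands in for the section bits of page->flags */
};

/* One backing array; each section's map lives at its own start index. */
static struct toy_page toy_backing[TOY_NR_SECTIONS * TOY_PAGES_PER_SECTION];
static const unsigned long toy_map_start[TOY_NR_SECTIONS] = { 48, 32, 16, 0 };

static unsigned long toy_section_nr(unsigned long pfn, unsigned long shift)
{
	return pfn >> shift;
}

/* pfn -> page: the section is derived from the pfn itself. */
static struct toy_page *toy_pfn_to_page(unsigned long pfn)
{
	unsigned long sec = toy_section_nr(pfn, TOY_SECTION_SHIFT);
	unsigned long off = pfn & (TOY_PAGES_PER_SECTION - 1);

	assert(sec < TOY_NR_SECTIONS);
	return &toy_backing[toy_map_start[sec] + off];
}

/* page -> pfn: the section is read back from the page, not re-derived. */
static unsigned long toy_page_to_pfn(const struct toy_page *page)
{
	unsigned long sec = page->section;
	long idx, off;

	assert(sec < TOY_NR_SECTIONS);
	idx = (long)(page - toy_backing);
	off = idx - (long)toy_map_start[sec];
	/* A stale section makes off negative; the cast wraps to a huge pfn. */
	return (sec << TOY_SECTION_SHIFT) + (unsigned long)off;
}

/* Record the section in the page, as early page initialisation would. */
static void toy_init_page(unsigned long pfn)
{
	toy_pfn_to_page(pfn)->section = toy_section_nr(pfn, TOY_SECTION_SHIFT);
}

static int toy_round_trips(unsigned long pfn)
{
	return toy_page_to_pfn(toy_pfn_to_page(pfn)) == pfn;
}
```

If toy_init_page() is called for pfns 0 to 47 only, toy_round_trips(47) holds but toy_round_trips(48) does not, because the page for pfn 48 still carries section 0. In the log above, sec changes from 2 to 3 between pfn 0x5ffff and pfn 0x60000, which is consistent with a section shift of 17 (2^17 pages per section) on this config.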
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index cb782ac1c35d..1540348691c9 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -148,7 +148,6 @@ config PPC
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ARCH_SUPPORTS_ATOMIC_RMW
-	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF if PPC64
 	select ARCH_WANT_IPC_PARSE_VERSION
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 863a62a6de3c..525c2e3df6f5 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -108,7 +108,6 @@ config S390
 	select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
 	select ARCH_SAVE_PAGE_KEYS if HIBERNATION
 	select ARCH_SUPPORTS_ATOMIC_RMW
-	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index df3276d6bfe3..00a5446de394 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -69,7 +69,6 @@ config X86
 	select ARCH_MIGHT_HAVE_PC_PARPORT
 	select ARCH_MIGHT_HAVE_PC_SERIO
 	select ARCH_SUPPORTS_ATOMIC_RMW
-	select ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
 	select ARCH_SUPPORTS_NUMA_BALANCING if X86_64
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_QUEUED_RWLOCKS
diff --git a/mm/Kconfig b/mm/Kconfig
index 9c4bdddd80c2..c6bd0309ce7a 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -639,15 +639,10 @@ config MAX_STACK_SIZE_MB
 
 	  A sane initial value is 80 MB.
 
-# For architectures that support deferred memory initialisation
-config ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
-	bool
-
 config DEFERRED_STRUCT_PAGE_INIT
 	bool "Defer initialisation of struct pages to kthreads"
 	default n
-	depends on ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT
-	depends on NO_BOOTMEM && MEMORY_HOTPLUG
+	depends on NO_BOOTMEM
 	depends on !FLATMEM
 	help
 	  Ordinarily all struct pages are initialised during early boot in a
There is no need to have ARCH_SUPPORTS_DEFERRED_STRUCT_PAGE_INIT,
as all the page initialization code is in common code.

Also, there is no need to depend on MEMORY_HOTPLUG, as the initialization
code does not really use hotplug memory functionality. So, we can remove
this requirement as well.

This patch allows deferred struct page initialization to be used on all
platforms with the memblock allocator.

Tested on x86, arm64, and sparc. Also verified that the code compiles on
PPC with CONFIG_MEMORY_HOTPLUG disabled.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 arch/powerpc/Kconfig | 1 -
 arch/s390/Kconfig    | 1 -
 arch/x86/Kconfig     | 1 -
 mm/Kconfig           | 7 +------
 4 files changed, 1 insertion(+), 9 deletions(-)