Message ID | 1272439375.24542.94.camel@pasglop (mailing list archive) |
---|---|
State | Accepted, archived |
Commit | 75c1d539ea13117cbe95e2c343e52af67d735145 |
Delegated to: | Benjamin Herrenschmidt |
Headers | show |
On Wed, Apr 28, 2010 at 2:22 AM, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: > So we tried to speed things up a bit using flush_hash_pages() directly > but that falls over on 603 of course meaning we fail to flush the TLB > properly and we may even end up having it corrupt memory randomly by > accessing a hash table that doesn't exist. > > This removes the "optimization" by always going through flush_tlb_page() > for now at least. > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > --- > > Somebody with a 603 or e300 core based FSL SoC to try this out for me ? > > It's obviously completely untested :-) > > Cheers, > Ben. > > diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c > index b9243e7..95774b4 100644 > --- a/arch/powerpc/mm/pgtable_32.c > +++ b/arch/powerpc/mm/pgtable_32.c > @@ -385,11 +385,7 @@ static int __change_page_attr(struct page *page, pgprot_t prot) > return -EINVAL; > __set_pte_at(&init_mm, address, kpte, mk_pte(page, prot), 0); > wmb(); > -#ifdef CONFIG_PPC_STD_MMU > - flush_hash_pages(0, address, pmd_val(*kpmd), 1); > -#else > flush_tlb_page(NULL, address); > -#endif > pte_unmap(kpte); > > return 0; > > > _______________________________________________ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org > https://lists.ozlabs.org/listinfo/linuxppc-dev > This change works me on a 834x(e300) platform, tested with lmbench and a production-ready application with 2.6.33.3. Xianghua
On Wed, 28 Apr 2010 14:15:50 -0500 Xianghua Xiao <xiaoxianghua@gmail.com> wrote: > On Wed, Apr 28, 2010 at 2:22 AM, Benjamin Herrenschmidt > <benh@kernel.crashing.org> wrote: > > So we tried to speed things up a bit using flush_hash_pages() directly > > but that falls over on 603 of course meaning we fail to flush the TLB > > properly and we may even end up having it corrupt memory randomly by > > accessing a hash table that doesn't exist. > > > > This removes the "optimization" by always going through flush_tlb_page() > > for now at least. > > > > Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> > > --- > > Somebody with a 603 or e300 core based FSL SoC to try this out for me ? > > > This change works me on a 834x(e300) platform, tested with lmbench and > a production-ready application with 2.6.33.3. moi aussi, Ben: booted this on an 8377 and started a kernel build. It seems to be progressing normally, I'm just not going to wait for it to complete before hitting send :). Kim
On Wed, 2010-04-28 at 14:15 -0500, Xianghua Xiao wrote: > This change works me on a 834x(e300) platform, tested with lmbench and > a production-ready application with 2.6.33.3. But have you tested that DEBUG_PAGEALLOC actually works ? :-) A way to do that is to - get_free_pages a page - read from it - free it - write to it It should oops on the write, and I suspect that without my patch it doesn't. Cheers, Ben.
Any updates on this ? I need something similar. Thanks, Maindoor. --- On Thu, 4/29/10, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: From: Benjamin Herrenschmidt <benh@kernel.crashing.org> Subject: Re: [PATCH] Fix DEBUG_PAGEALLOC on 603/e300 To: "Xianghua Xiao" <xiaoxianghua@gmail.com> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org> Date: Thursday, April 29, 2010, 3:59 AM On Wed, 2010-04-28 at 14:15 -0500, Xianghua Xiao wrote: > This change works me on a 834x(e300) platform, tested with lmbench and > a production-ready application with 2.6.33.3. But have you tested that DEBUG_PAGEALLOC actually works ? :-) A way to do that is to - get_free_pages a page - read from it - free it - write to it It should oops on the write, and I suspect that without my patch it doesn't. Cheers, Ben.
On Sun, May 2, 2010 at 5:27 AM, Maindoor <sanjeevfiles@yahoo.com> wrote: > Any updates on this ? I need something similar. > > Thanks, > Maindoor. > > --- On *Thu, 4/29/10, Benjamin Herrenschmidt <benh@kernel.crashing.org>*wrote: > > > From: Benjamin Herrenschmidt <benh@kernel.crashing.org> > Subject: Re: [PATCH] Fix DEBUG_PAGEALLOC on 603/e300 > To: "Xianghua Xiao" <xiaoxianghua@gmail.com> > Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org> > Date: Thursday, April 29, 2010, 3:59 AM > > On Wed, 2010-04-28 at 14:15 -0500, Xianghua Xiao wrote: > > This change works me on a 834x(e300) platform, tested with lmbench and > > a production-ready application with 2.6.33.3. > > But have you tested that DEBUG_PAGEALLOC actually works ? :-) > > A way to do that is to > > - get_free_pages a page > - read from it > - free it > - write to it > > It should oops on the write, and I suspect that without my patch it > doesn't. > > Cheers, > Ben. > > > _______________________________________________ > Linuxppc-dev mailing list > Linuxppc-dev@lists.ozlabs.org<http://mc/compose?to=Linuxppc-dev@lists.ozlabs.org> > https://lists.ozlabs.org/listinfo/linuxppc-dev > > > Ok here is the test result I did this morning, in short with/without Ben's patch it will oops if you write to a freed page. 1. With DEBUG_PAGEALLOC on, after applied Ben's patch and wrote to a freed page, it will oops: -------------------------------------- Read from allocated page Write to freed page Unable to handle kernel paging request for data at address 0xce33c000 Faulting instruction address: 0xd004d07c Oops: Kernel access of bad area, sig: 11 [#1] PREEMPT DEBUG_PAGEALLOC 834x SYS Modules linked in: e300page(+) NIP: d004d07c LR: d004d074 CTR: 00000001 REGS: cf231e30 TRAP: 0300 Not tainted (2.6.33.3-rt17) MSR: 00009032 <EE,ME,IR,DR> CR: 24000422 XER: 20000000 DAR: ce33c000, DSISR: 22000000 TASK = ce312e10[1395] 'insmod' THREAD: cf230000 GPR00: 00000061 cf231ee0 ce312e10 0000001a 00003b0d ffffffff c047c4d6 00004000 GPR08: c047c8c4 ce33c000 00003fff ce312e10 24000422 100bbc1c 0fffd000 ffffffff GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 00000000 100b90a0 GPR24: 4800e4ac bfd2ff80 c043d388 00000000 d004d000 c0480000 cf230000 d0050000 NIP [d004d07c] e300page_init+0x7c/0xb8 [e300page] LR [d004d074] e300page_init+0x74/0xb8 [e300page] Call Trace: [cf231ee0] [d004d074] e300page_init+0x74/0xb8 [e300page] (unreliable) [cf231ef0] [c00038b4] do_one_initcall+0x54/0x210 [cf231f20] [c0058ba4] sys_init_module+0x120/0x240 [cf231f40] [c0012afc] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0xfe5baa0 LR = 0x10027de0 Instruction dump: 4e800020 3c60d005 3863a094 48000031 807fa214 38800000 48000045 3c60d005 3863a0b4 48000019 813fa214 38000061 <98090000> 4bffffb8 00000000 3d60c035 ---[ end trace 4b412bd05f6f6d8d ]--- # rmmod e300page rmmod: e300page: Device or resource busy 2. Without Ben's patch, there is also oops observed: # insmod ./e300page.ko Oops: Kernel access of bad area, sig: 11 [#1] PREEMPT DEBUG_PAGEALLOC 834x SYS Modules linked in: e300page(+) NIP: d004d07c LR: d004d074 CTR: 00000001 REGS: ce30be30 TRAP: 0300 Not tainted (2.6.33.3-rt17) MSR: 00009032 <EE,ME,IR,DR> CR: 22000422 XER: 20000000 DAR: ce172000, DSISR: 22000000 TASK = ce306120[1392] 'insmod' THREAD: ce30a000 GPR00: 00000061 ce30bee0 ce306120 0000001a 00003b0d ffffffff c047c4d6 00004000 GPR08: c047c8c4 ce172000 00003fff ce306120 22000422 100bbc1c 0fffd000 ffffffff GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 00000000 100b90a0 GPR24: 4800e4ac bf876f10 c043d388 00000000 d004d000 c0480000 ce30a000 d0050000 NIP [d004d07c] e300page_init+0x7c/0xb8 [e300page] LR [d004d074] e300page_init+0x74/0xb8 [e300page] Call Trace: [ce30bee0] [d004d074] e300page_init+0x74/0xb8 [e300page] (unreliable) [ce30bef0] [c00038b4] do_one_initcall+0x54/0x210 [ce30bf20] [c0058bb0] sys_init_module+0x120/0x240 [ce30bf40] [c0012afc] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0xfe5baa0 LR = 0x10027de0 Instruction dump: 4e800020 3c60d005 3863a094 48000031 807fa214 38800000 48000045 3c60d005 3863a0b4 48000019 813fa214 38000061 <98090000> 4bffffb8 00000000 3d60c035 ---[ end trace a05c5135ddb62734 ]--- Segmentation fault Checking dmesg(where you can see the printk msg from the module): Read from allocated page Write to freed page Unable to handle kernel paging request for data at address 0xce172000 Faulting instruction address: 0xd004d07c Oops: Kernel access of bad area, sig: 11 [#1] PREEMPT DEBUG_PAGEALLOC 834x SYS Modules linked in: e300page(+) NIP: d004d07c LR: d004d074 CTR: 00000001 REGS: ce30be30 TRAP: 0300 Not tainted (2.6.33.3-rt17) MSR: 00009032 <EE,ME,IR,DR> CR: 22000422 XER: 20000000 DAR: ce172000, DSISR: 22000000 TASK = ce306120[1392] 'insmod' THREAD: ce30a000 GPR00: 00000061 ce30bee0 ce306120 0000001a 00003b0d ffffffff c047c4d6 00004000 GPR08: c047c8c4 ce172000 00003fff ce306120 22000422 100bbc1c 0fffd000 ffffffff GPR16: 00000001 00000000 007fff00 00000000 00000000 0fffa1a0 00000000 100b90a0 GPR24: 4800e4ac bf876f10 c043d388 00000000 d004d000 c0480000 ce30a000 d0050000 NIP [d004d07c] e300page_init+0x7c/0xb8 [e300page] LR [d004d074] e300page_init+0x74/0xb8 [e300page] Call Trace: [ce30bee0] [d004d074] e300page_init+0x74/0xb8 [e300page] (unreliable) [ce30bef0] [c00038b4] do_one_initcall+0x54/0x210 [ce30bf20] [c0058bb0] sys_init_module+0x120/0x240 [ce30bf40] [c0012afc] ret_from_syscall+0x0/0x38 --- Exception: c01 at 0xfe5baa0 LR = 0x10027de0 Instruction dump: 4e800020 3c60d005 3863a094 48000031 807fa214 38800000 48000045 3c60d005 3863a0b4 48000019 813fa214 38000061 <98090000> 4bffffb8 00000000 3d60c035 ---[ end trace a05c5135ddb62734 ]--- # lsmod Module Size Used by Tainted: G e300page 1549 1 # rmmod e300page rmmod: e300page: Device or resource busy Module source is below: #include <linux/module.h> #include <linux/slab.h> #include <linux/init.h> static int __init e300page_init (void) { static char *kpage; char ch; if (!(kpage = (char *)__get_free_pages (GFP_KERNEL, 0))) { printk (KERN_ERR " __get_free_pages failed\n"); return 0; } printk(KERN_INFO "Read from allocated page\n"); ch = kpage[0]; free_pages ((unsigned long)kpage, 0); printk(KERN_INFO "Write to freed page\n"); kpage[0]='a'; return 0; } static void __exit e300page_exit (void) { printk (KERN_INFO "e300page exiting\n"); } module_init (e300page_init); module_exit (e300page_exit); MODULE_AUTHOR ("Xianghua Xiao"); MODULE_DESCRIPTION ("Test PageAlloc on e300"); MODULE_LICENSE ("GPL v2");
On Mon, 2010-05-03 at 09:19 -0500, Xianghua Xiao wrote: > Ok here is the test result I did this morning, in short with/without > Ben's patch it will oops if you write to a freed page. Probably lucky enough that the free_pages() is pushing the stuff out of the TLB. At least my patch prevents writing to a non-existing hash table though, thus avoiding memory corruption. Cheers, Ben.
I'm not sure how good of a test this is. I tried the same thing and it seems to workimmediately ie. it gives an oops as soon as "I change the attribute and write to thepage". But when I try to write to it at a later stage as part of an ioctl or after about5 mins, this feature does not seem to work. What if the page is brought in again due to a fault, does the attribute remain the same ? How can I verify this ? RegardsMaindoor. --- On Mon, 5/3/10, Xianghua Xiao <xiaoxianghua@gmail.com> wrote: From: Xianghua Xiao <xiaoxianghua@gmail.com> Subject: Re: [PATCH] Fix DEBUG_PAGEALLOC on 603/e300 To: "Maindoor" <sanjeevfiles@yahoo.com> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org> Date: Monday, May 3, 2010, 7:49 PM On Sun, May 2, 2010 at 5:27 AM, Maindoor <sanjeevfiles@yahoo.com> wrote: Any updates on this ? I need something similar. Thanks, Maindoor. --- On Thu, 4/29/10, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: From: Benjamin Herrenschmidt <benh@kernel.crashing.org> Subject: Re: [PATCH] Fix DEBUG_PAGEALLOC on 603/e300 To: "Xianghua Xiao" <xiaoxianghua@gmail.com> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org> Date: Thursday, April 29, 2010, 3:59 AM On Wed, 2010-04-28 at 14:15 -0500, Xianghua Xiao wrote: > This change works me on a 834x(e300) platform, tested with lmbench and > a production-ready application with 2.6.33.3. But have you tested that DEBUG_PAGEALLOC actually works ? :-) A way to do that is to - get_free_pages a page - read from it - free it - write to it It should oops on the write, and I suspect that without my patch it doesn't. Cheers, Ben.
Kind of working, I'll confirm this tomorrow.Was discussed earlier on ppcdev. Had to make somechange in fault handler, for this to work. Regards,Maindoor. --- On Thu, 5/6/10, Maindoor <sanjeevfiles@yahoo.com> wrote: From: Maindoor <sanjeevfiles@yahoo.com> Subject: Re: [PATCH] Fix DEBUG_PAGEALLOC on 603/e300 To: "Xianghua Xiao" <xiaoxianghua@gmail.com> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org> Date: Thursday, May 6, 2010, 1:53 PM I'm not sure how good of a test this is. I tried the same thing and it seems to workimmediately ie. it gives an oops as soon as "I change the attribute and write to thepage". But when I try to write to it at a later stage as part of an ioctl or after about5 mins, this feature does not seem to work. What if the page is brought in again due to a fault, does the attribute remain the same ? How can I verify this ? RegardsMaindoor. --- On Mon, 5/3/10, Xianghua Xiao <xiaoxianghua@gmail.com> wrote: From: Xianghua Xiao <xiaoxianghua@gmail.com> Subject: Re: [PATCH] Fix DEBUG_PAGEALLOC on 603/e300 To: "Maindoor" <sanjeevfiles@yahoo.com> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org> Date: Monday, May 3, 2010, 7:49 PM On Sun, May 2, 2010 at 5:27 AM, Maindoor <sanjeevfiles@yahoo.com> wrote: Any updates on this ? I need something similar. Thanks, Maindoor. --- On Thu, 4/29/10, Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote: From: Benjamin Herrenschmidt <benh@kernel.crashing.org> Subject: Re: [PATCH] Fix DEBUG_PAGEALLOC on 603/e300 To: "Xianghua Xiao" <xiaoxianghua@gmail.com> Cc: "linuxppc-dev" <linuxppc-dev@lists.ozlabs.org> Date: Thursday, April 29, 2010, 3:59 AM On Wed, 2010-04-28 at 14:15 -0500, Xianghua Xiao wrote: > This change works me on a 834x(e300) platform, tested with lmbench and > a production-ready application with 2.6.33.3. But have you tested that DEBUG_PAGEALLOC actually works ? :-) A way to do that is to - get_free_pages a page - read from it - free it - write to it It should oops on the write, and I suspect that without my patch it doesn't. Cheers, Ben.
diff --git a/arch/powerpc/mm/pgtable_32.c b/arch/powerpc/mm/pgtable_32.c index b9243e7..95774b4 100644 --- a/arch/powerpc/mm/pgtable_32.c +++ b/arch/powerpc/mm/pgtable_32.c @@ -385,11 +385,7 @@ static int __change_page_attr(struct page *page, pgprot_t prot) return -EINVAL; __set_pte_at(&init_mm, address, kpte, mk_pte(page, prot), 0); wmb(); -#ifdef CONFIG_PPC_STD_MMU - flush_hash_pages(0, address, pmd_val(*kpmd), 1); -#else flush_tlb_page(NULL, address); -#endif pte_unmap(kpte); return 0;
So we tried to speed things up a bit using flush_hash_pages() directly but that falls over on 603 of course meaning we fail to flush the TLB properly and we may even end up having it corrupt memory randomly by accessing a hash table that doesn't exist. This removes the "optimization" by always going through flush_tlb_page() for now at least. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> --- Somebody with a 603 or e300 core based FSL SoC to try this out for me ? It's obviously completely untested :-) Cheers, Ben.