Message ID | 87vc60na89.fsf@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
On Thu, 2013-05-30 at 13:57 +0530, Aneesh Kumar K.V wrote: > + /* FIXME!!, will fail with when we enable hugepage > support */ Just fix that to say "Transparent huge pages" as normal huge pages should work fine unless I'm missing something. Hugh, any chance you can give that a spin ? Cheers, Ben.
On Thu, 30 May 2013, Benjamin Herrenschmidt wrote: > On Thu, 2013-05-30 at 13:57 +0530, Aneesh Kumar K.V wrote: > > + /* FIXME!!, will fail with when we enable hugepage > > support */ > > Just fix that to say "Transparent huge pages" as normal huge pages > should work fine unless I'm missing something. > > Hugh, any chance you can give that a spin ? Sure, it's now under way. If all goes well, I'll give you a progress report in about 15 hours time; but given the variance in how long it took to hit, I won't feel fully confident until this time tomorrow, when I'll update you again. Thank you both for the great response! Hugh
On Thu, 30 May 2013, Hugh Dickins wrote: > On Thu, 30 May 2013, Benjamin Herrenschmidt wrote: > > On Thu, 2013-05-30 at 13:57 +0530, Aneesh Kumar K.V wrote: > > > + /* FIXME!!, will fail with when we enable hugepage > > > support */ > > > > Just fix that to say "Transparent huge pages" as normal huge pages > > should work fine unless I'm missing something. > > > > Hugh, any chance you can give that a spin ? > > Sure, it's now under way. If all goes well, I'll give you a > progress report in about 15 hours time; but given the variance in > how long it took to hit, I won't feel fully confident until this > time tomorrow, when I'll update you again. Still running fine. I'll leave it running a few more hours to make sure, and then try switching to the patch Aneesh sent afterwards - or say if you'd prefer me to switch over to that one immediately. Hugh
On Thu, 2013-05-30 at 22:05 -0700, Hugh Dickins wrote: > > Sure, it's now under way. If all goes well, I'll give you a > > progress report in about 15 hours time; but given the variance in > > how long it took to hit, I won't feel fully confident until this > > time tomorrow, when I'll update you again. > > Still running fine. I'll leave it running a few more hours to make > sure, and then try switching to the patch Aneesh sent afterwards - > or say if you'd prefer me to switch over to that one immediately. The patch you are running on is what I'll send to Linus for 3.10 (+/- cosmetics). Aneesh second patch is a much larger rework which will be needed for THP but that will wait for 3.11. I'm happy for you to test it but I first want to make sure it's solid with the 3.10 fix :-) Thanks ! Cheers, Ben.
On Fri, 31 May 2013, Benjamin Herrenschmidt wrote: > On Thu, 2013-05-30 at 22:05 -0700, Hugh Dickins wrote: > > > Sure, it's now under way. If all goes well, I'll give you a > > > progress report in about 15 hours time; but given the variance in > > > how long it took to hit, I won't feel fully confident until this > > > time tomorrow, when I'll update you again. > > > > Still running fine. I'll leave it running a few more hours to make > > sure, and then try switching to the patch Aneesh sent afterwards - > > or say if you'd prefer me to switch over to that one immediately. > > The patch you are running on is what I'll send to Linus for 3.10 (+/- > cosmetics). Aneesh second patch is a much larger rework which will be > needed for THP but that will wait for 3.11. I'm happy for you to test it > but I first want to make sure it's solid with the 3.10 fix :-) That makes sense. The first patch ran fine without incident for 25 hours, I say it's good. I've now stopped that run and started another with the second patch in. Hugh
On Fri, 2013-05-31 at 14:45 +0530, Aneesh Kumar K.V wrote: > > The patch you are running on is what I'll send to Linus for 3.10 (+/- > > cosmetics). Aneesh second patch is a much larger rework which will be > > needed for THP but that will wait for 3.11. I'm happy for you to test it > > but I first want to make sure it's solid with the 3.10 fix :-) BTW. One concern I still have is that Hugh identified the bad commit to be: 7e74c3921ad9610c0b49f28b8fc69f7480505841 "powerpc: Fix hpte_decode to use the correct decoding for page sizes". However, you introduce the return on HPTE not found earlier, in b1022fbd293564de91596b8775340cf41ad5214c "powerpc: Decode the pte-lp-encoding bits correctly." So while I'm still happy with the current band-aid for 3.10 and am about to send it to Linus, the above *does* seem to indicate that there is also something wrong with the "Fix hpte_decode..." commit, which might not actually get the page size right... Can you investigate ? Cheers, Ben.
Benjamin Herrenschmidt <benh@au1.ibm.com> writes: > On Fri, 2013-05-31 at 14:45 +0530, Aneesh Kumar K.V wrote: > >> > The patch you are running on is what I'll send to Linus for 3.10 (+/- >> > cosmetics). Aneesh second patch is a much larger rework which will be >> > needed for THP but that will wait for 3.11. I'm happy for you to test it >> > but I first want to make sure it's solid with the 3.10 fix :-) > > BTW. One concern I still have is that Hugh identified the bad commit > to be: > > 7e74c3921ad9610c0b49f28b8fc69f7480505841 > "powerpc: Fix hpte_decode to use the correct decoding for page sizes". > > However, you introduce the return on HPTE not found earlier, in > > b1022fbd293564de91596b8775340cf41ad5214c > "powerpc: Decode the pte-lp-encoding bits correctly." > > So while I'm still happy with the current band-aid for 3.10 and am > about to send it to Linus, the above *does* seem to indicate that > there is also something wrong with the "Fix hpte_decode..." commit, > which might not actually get the page size right... > > Can you investigate ? 7e74c3921ad9610c0b49f28b8fc69f7480505841 "powerpc: Fix hpte_decode to use the correct decoding for page sizes" changes should only impact hpte_decode. We don't change the details of hpte_actual_psize at all in this patch. That means we should see a difference only with kexec right ?. Hugh, Will you be able to double check whether 7e74c3921ad9610c0b49f28b8fc69f7480505841 is the bad commit. The one before that is what we changed in the patch that fixed your problem. -aneesh
On Sun, 2 Jun 2013, Aneesh Kumar K.V wrote: > Benjamin Herrenschmidt <benh@au1.ibm.com> writes: > > On Fri, 2013-05-31 at 14:45 +0530, Aneesh Kumar K.V wrote: > > > >> > The patch you are running on is what I'll send to Linus for 3.10 (+/- > >> > cosmetics). Aneesh second patch is a much larger rework which will be > >> > needed for THP but that will wait for 3.11. I'm happy for you to test it > >> > but I first want to make sure it's solid with the 3.10 fix :-) > > > > BTW. One concern I still have is that Hugh identified the bad commit > > to be: > > > > 7e74c3921ad9610c0b49f28b8fc69f7480505841 > > "powerpc: Fix hpte_decode to use the correct decoding for page sizes". > > > > However, you introduce the return on HPTE not found earlier, in > > > > b1022fbd293564de91596b8775340cf41ad5214c > > "powerpc: Decode the pte-lp-encoding bits correctly." > > > > So while I'm still happy with the current band-aid for 3.10 and am > > about to send it to Linus, the above *does* seem to indicate that > > there is also something wrong with the "Fix hpte_decode..." commit, > > which might not actually get the page size right... > > > > Can you investigate ? > > 7e74c3921ad9610c0b49f28b8fc69f7480505841 > "powerpc: Fix hpte_decode to use the correct decoding for page sizes" > changes should only impact hpte_decode. We don't change the details > of hpte_actual_psize at all in this patch. That means we should see a > difference only with kexec right ?. > > Hugh, > > Will you be able to double check whether > 7e74c3921ad9610c0b49f28b8fc69f7480505841 is the bad commit. The one > before that is what we changed in the patch that fixed your problem. You are absolutely right. I just set b1022fbd29 going, expecting to answer you tomorrow: but got a Segmentation fault in 20 minutes (quicker than ever seen before). It looks as if I was running some other kernel for the last stage of my bisection: I can't see how that came about, but it's not very interesting now - you got it right. Prior to trying that, I had been running your second patch, 9f70fd8cfe, and that tested out successfully for 50 hours before I stopped it. Hugh
diff --git a/arch/powerpc/mm/hash_native_64.c b/arch/powerpc/mm/hash_native_64.c index 6a2aead..6d1bd81 100644 --- a/arch/powerpc/mm/hash_native_64.c +++ b/arch/powerpc/mm/hash_native_64.c @@ -336,11 +336,19 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp, hpte_v = hptep->v; actual_psize = hpte_actual_psize(hptep, psize); + /* + * We need to invalidate the TLB always because hpte_remove doesn't do + * a tlb invalidate. If a hash bucket gets full, we "evict" a more/less + * random entry from it. When we do that we don't invalidate the TLB + * (hpte_remove) because we assume the old translation is still technically + * "valid". + */ if (actual_psize < 0) { - native_unlock_hpte(hptep); - return -1; + /* FIXME!!, will fail with when we enable hugepage support */ + actual_psize = psize; + ret = -1; + goto err_out; } - /* Even if we miss, we need to invalidate the TLB */ if (!HPTE_V_COMPARE(hpte_v, want_v)) { DBG_LOW(" -> miss\n"); ret = -1; @@ -350,6 +358,7 @@ static long native_hpte_updatepp(unsigned long slot, unsigned long newpp, hptep->r = (hptep->r & ~(HPTE_R_PP | HPTE_R_N)) | (newpp & (HPTE_R_PP | HPTE_R_N | HPTE_R_C)); } +err_out: native_unlock_hpte(hptep); /* Ensure it is out of the tlb too. */ @@ -408,8 +417,9 @@ static void native_hpte_updateboltedpp(unsigned long newpp, unsigned long ea, panic("could not find page to bolt\n"); hptep = htab_address + slot; actual_psize = hpte_actual_psize(hptep, psize); + /* FIXME!! can this happen for bolted entry ? */ if (actual_psize < 0) - return; + actual_psize = psize; /* Update the HPTE */ hptep->r = (hptep->r & ~(HPTE_R_PP | HPTE_R_N)) | @@ -437,21 +447,28 @@ static void native_hpte_invalidate(unsigned long slot, unsigned long vpn, hpte_v = hptep->v; actual_psize = hpte_actual_psize(hptep, psize); + /* + * We need to invalidate the TLB always because hpte_remove doesn't do + * a tlb invalidate. If a hash bucket gets full, we "evict" a more/less + * random entry from it. When we do that we don't invalidate the TLB + * (hpte_remove) because we assume the old translation is still technically + * "valid". + */ if (actual_psize < 0) { + /* FIXME!!, will fail with when we enable hugepage support */ + actual_psize = psize; native_unlock_hpte(hptep); - local_irq_restore(flags); - return; + goto err_out; } - /* Even if we miss, we need to invalidate the TLB */ if (!HPTE_V_COMPARE(hpte_v, want_v)) native_unlock_hpte(hptep); else /* Invalidate the hpte. NOTE: this also unlocks it */ hptep->v = 0; +err_out: /* Invalidate the TLB */ tlbie(vpn, psize, actual_psize, ssize, local); - local_irq_restore(flags); }