Patchwork [RFC,v2,3/4] powerpc: Don't bolt the hpte in kernel_map_linear_page()

Submitter Li Zhong
Date April 12, 2013, 2:16 a.m.
Message ID <1365733021-28912-4-git-send-email-zhong@linux.vnet.ibm.com>
Permalink /patch/235956/
State Rejected, archived
Delegated to: Michael Ellerman

Comments

Li Zhong - April 12, 2013, 2:16 a.m.
It seems that kernel_unmap_linear_page() only checks whether there
is a map in the linear_map_hash_slots array, so we may not need
to bolt the hpte.

Signed-off-by: Li Zhong <zhong@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash_utils_64.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Paul Mackerras - April 15, 2013, 3:50 a.m.
On Fri, Apr 12, 2013 at 10:16:59AM +0800, Li Zhong wrote:
> It seems that kernel_unmap_linear_page() only checks whether there
> is a map in the linear_map_hash_slots array, so we may not need
> to bolt the hpte.

I don't exactly understand your rationale here, but I don't think it's
safe not to have linear mapping pages bolted.  Basically, if a page
will be used in the process of calling hash_page to demand-fault an
HPTE into the hash table, then that page needs to be bolted, otherwise
we can get an infinite recursion of HPT misses.  That includes all
kernel stack pages, among other things, so I think we need to leave
the HPTE_V_BOLTED in there.

Paul.
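
[Editor's note] Paul's failure mode can be illustrated with a toy, self-contained C model. Everything here is hypothetical (the names handle_miss, stack_mapped, and MAX_DEPTH are invented for the example and do not exist in the kernel): if a page that the miss handler itself depends on, such as its kernel stack, can itself be unmapped, then handling one hash-table miss just raises another, without bound.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in bound: past this depth we declare the recursion unbounded. */
#define MAX_DEPTH 16

/*
 * Toy model of demand-faulting an HPTE. The handler needs its own
 * stack page mapped to run at all; if that page's HPTE could have
 * been evicted (i.e. it was not bolted), entering the handler faults
 * again before it can install anything.
 */
static int handle_miss(bool stack_mapped, int depth)
{
	if (depth > MAX_DEPTH)
		return -1;	/* models infinite recursion of HPT misses */
	if (!stack_mapped)	/* touching the handler's stack faults too */
		return handle_miss(stack_mapped, depth + 1);
	return depth;		/* fault resolved at this depth */
}
```

With a bolted (always-present) stack page the fault resolves immediately; with an evictable one the model never terminates, which is the recursion Paul describes.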
Benjamin Herrenschmidt - April 15, 2013, 6:56 a.m.
On Mon, 2013-04-15 at 13:50 +1000, Paul Mackerras wrote:
> On Fri, Apr 12, 2013 at 10:16:59AM +0800, Li Zhong wrote:
> > It seems that kernel_unmap_linear_page() only checks whether there
> > is a map in the linear_map_hash_slots array, so we may not need
> > to bolt the hpte.
> 
> I don't exactly understand your rationale here, but I don't think it's
> safe not to have linear mapping pages bolted.  Basically, if a page
> will be used in the process of calling hash_page to demand-fault an
> HPTE into the hash table, then that page needs to be bolted, otherwise
> we can get an infinite recursion of HPT misses.  That includes all
> kernel stack pages, among other things, so I think we need to leave
> the HPTE_V_BOLTED in there.

I suspect Li's confusion comes from the fact that he doesn't realize
that we might evict random hash slots. If the linear mapping hash
entries could only be thrown out via kernel_unmap_linear_page(), then
his comment would make sense. However, this isn't the case.

Li: When faulting something in, if both the primary and secondary
buckets are full, we "somewhat randomly" evict the content of a slot and
replace it. However we only do that on non-bolted slots.

This is why the linear mapping (and the vmemmap) must be bolted.

Cheers,
Ben.
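
[Editor's note] Ben's eviction rule can be sketched in self-contained C. This is an illustrative toy, not the real HPTE layout: the flag values, bucket size, and the function evict_and_insert() are all invented for the example. The point it demonstrates is that eviction starts from a pseudo-random slot and only ever replaces entries that are not bolted, so a bolted entry can never be thrown out this way.

```c
#include <assert.h>
#include <stdlib.h>

#define SLOTS_PER_BUCKET 8
#define HPTE_V_VALID  0x1UL	/* illustrative flag values */
#define HPTE_V_BOLTED 0x10UL

struct hpte { unsigned long v; };

/*
 * Toy model of inserting into a full hash bucket: starting from a
 * "somewhat random" slot, walk the bucket and evict the first
 * non-bolted entry, replacing it with the new one. Returns the slot
 * used, or -1 if every slot is bolted and nothing can be evicted.
 */
static int evict_and_insert(struct hpte *bucket, unsigned long vflags)
{
	int i, slot = rand() % SLOTS_PER_BUCKET;

	for (i = 0; i < SLOTS_PER_BUCKET; i++) {
		struct hpte *h = &bucket[slot];
		if (!(h->v & HPTE_V_BOLTED)) {
			h->v = HPTE_V_VALID | vflags;	/* evict & replace */
			return slot;
		}
		slot = (slot + 1) % SLOTS_PER_BUCKET;
	}
	return -1;	/* all slots bolted: insertion must go elsewhere */
}
```

A bucket whose entries all carry HPTE_V_BOLTED can never lose an entry to this path, which is what keeps the linear mapping and vmemmap translations safe from random eviction.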
Li Zhong - April 15, 2013, 8:15 a.m.
On Mon, 2013-04-15 at 08:56 +0200, Benjamin Herrenschmidt wrote:
> On Mon, 2013-04-15 at 13:50 +1000, Paul Mackerras wrote:
> > On Fri, Apr 12, 2013 at 10:16:59AM +0800, Li Zhong wrote:
> > > It seems that kernel_unmap_linear_page() only checks whether there
> > > is a map in the linear_map_hash_slots array, so we may not need
> > > to bolt the hpte.
> > 

Hi Paul, Ben

Thank you both for the comments and detailed information. I'll keep it
bolted in the next version. If you have time, please help to check
whether my understanding below is correct.

Thanks, Zhong

> > I don't exactly understand your rationale here, but I don't think it's
> > safe not to have linear mapping pages bolted.  Basically, if a page
> > will be used in the process of calling hash_page to demand-fault an
> > HPTE into the hash table, then that page needs to be bolted, otherwise
> > we can get an infinite recursion of HPT misses.  

So the infinite recursion happens like below?

        fault for PAGE A
              |
        hash_page for PAGE A
              |
        some page B needed by hash_page processing is removed by others,
        before inserting the HPTE
              |
        fault for PAGE B
              |
        hash_page for PAGE B, and so on forever

> > That includes all
> > kernel stack pages, among other things, so I think we need to leave
> > the HPTE_V_BOLTED in there.
> 
> I suspect Li's confusion comes from the fact that he doesn't realize
> that we might evict random hash slots. If the linear mapping hash
> entries could only be thrown out via kernel_unmap_linear_page(), then
> his comment would make sense. However, this isn't the case.
> 
> Li: When faulting something in, if both the primary and secondary
> buckets are full, we "somewhat randomly" evict the content of a slot and
> replace it. However we only do that on non-bolted slots.

So the eviction code is implemented in ppc_md.hpte_remove(), which may
be called by __hash_huge_page() and by the asm code at htab_call_hpte_remove?

> This is why the linear mapping (and the vmemmap) must be bolted.

If not, it could result in infinite recursion like the above?

> Cheers,
> Ben.
> 
>
Benjamin Herrenschmidt - April 15, 2013, 11:27 a.m.
On Mon, 2013-04-15 at 16:15 +0800, Li Zhong wrote:

> So the eviction code is implemented in ppc_md.hpte_remove(), which may
> be called by __hash_huge_page() and by the asm code at htab_call_hpte_remove?
> 
> > This is why the linear mapping (and the vmemmap) must be bolted.
> 
> If not, it could result in infinite recursion like the above?

Potentially. We don't expect to fault linear mapping or vmemmap entries
on demand; we aren't equipped to do it, and we occasionally have code
paths that access the linear mapping and cannot afford to have SRR0 and
SRR1 clobbered by a page fault.

Cheers,
Ben.
Li Zhong - April 16, 2013, 2:51 a.m.
On Mon, 2013-04-15 at 13:27 +0200, Benjamin Herrenschmidt wrote:
> On Mon, 2013-04-15 at 16:15 +0800, Li Zhong wrote:
> 
> > So the eviction code is implemented in ppc_md.hpte_remove(), which may
> > be called by __hash_huge_page() and by the asm code at htab_call_hpte_remove?
> > 
> > > This is why the linear mapping (and the vmemmap) must be bolted.
> > 
> > If not, it could result in infinite recursion like the above?
> 
> Potentially. We don't expect to fault linear mapping or vmemmap entries
> on demand; we aren't equipped to do it, and we occasionally have code
> paths that access the linear mapping and cannot afford to have SRR0 and
> SRR1 clobbered by a page fault.

Thank you for the education :)

Thanks, Zhong

> 
> Cheers,
> Ben.
> 
>

Patch

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 716f42b..a7f54f0 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1281,7 +1281,7 @@  static void kernel_map_linear_page(unsigned long vaddr, unsigned long lmi)
 	if (!vsid)
 		return;
 	ret = ppc_md.hpte_insert(hpteg, vpn, __pa(vaddr),
-				 mode, HPTE_V_BOLTED,
+				 mode, 0,
 				 mmu_linear_psize, mmu_kernel_ssize);
 	BUG_ON (ret < 0);
 	spin_lock(&linear_map_hash_lock);