
powerpc: thp: Add write barrier after updating the valid bit

Message ID 1405435937-24115-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com (mailing list archive)
State Superseded

Commit Message

Aneesh Kumar K.V July 15, 2014, 2:52 p.m. UTC
With hugepages, we store the hpte valid information in the pte page
whose address is stored in the second half of the PMD. Use a
write barrier to make sure that clearing pmd busy bit and updating
hpte valid info are ordered properly.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/pgtable-ppc64.h | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Benjamin Herrenschmidt July 22, 2014, 5:27 a.m. UTC | #1
On Tue, 2014-07-15 at 20:22 +0530, Aneesh Kumar K.V wrote:
> With hugepages, we store the hpte valid information in the pte page
> whose address is stored in the second half of the PMD. Use a
> write barrier to make sure that clearing pmd busy bit and updating
> hpte valid info are ordered properly.
> 
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  arch/powerpc/include/asm/pgtable-ppc64.h | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
> index eb9261024f51..558beb760062 100644
> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
> @@ -394,6 +394,12 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
>  					unsigned int index, unsigned int hidx)
>  {
>  	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
> +	/*
> +	 * The hpte valid is stored in the pgtable whose address is in the
> +	 * second half of the PMD. Order this against clearing of the busy bit in
> +	 * huge pmd.
> +	 */
> +	smp_wmb();
>  }

A better place for this would be right before the last write to the PMD
(that's also clearing BUSY) in __hash_page_thp(). Basically, it's the
normal lock ordering that's missing here, nothing specific to
mark_hpte_slot_valid() but instead, any state relative to the BUSY bit
in the PMD (including the actual hash writes in update_pp etc...)

>  struct page *realmode_pfn_to_page(unsigned long pfn);
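Ben's suggestion above — treat the busy-bit clear as an unlock, with the barrier immediately before the final PMD store — can be modeled in user space with C11 atomics. This is only an illustrative sketch: the names, the slot-array layout, and the single-word PMD are stand-ins, not the actual kernel code, and the release fence stands in for the kernel's write barrier.

```c
#include <stdatomic.h>

#define PMD_BUSY 0x1ul

/* Illustrative stand-ins for the real kernel structures. */
static unsigned char hpte_slot_array[16];   /* lives in the pte page        */
static _Atomic unsigned long pmd_val;       /* the huge PMD, BUSY bit held  */

/*
 * Modeled on the placement Ben suggests: perform all the hash-side
 * updates first, then issue the barrier immediately before the final
 * PMD store that also clears BUSY -- i.e. ordinary unlock ordering.
 */
static void hash_page_thp_sketch(unsigned int index, unsigned int hidx)
{
	/* state published while we hold the BUSY "lock" */
	hpte_slot_array[index] = (unsigned char)(hidx << 4 | 0x1 << 3);

	/* the unlock: barrier, then the store that clears BUSY */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&pmd_val,
			      atomic_load_explicit(&pmd_val,
						   memory_order_relaxed)
			      & ~PMD_BUSY,
			      memory_order_relaxed);
}
```

With this shape, any state written under BUSY — the slot array, hash updates, anything else — is covered by the one fence, which is why it is more general than a barrier inside mark_hpte_slot_valid() alone.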
Aneesh Kumar K.V July 22, 2014, 6:53 p.m. UTC | #2
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Tue, 2014-07-15 at 20:22 +0530, Aneesh Kumar K.V wrote:
>> With hugepages, we store the hpte valid information in the pte page
>> whose address is stored in the second half of the PMD. Use a
>> write barrier to make sure that clearing pmd busy bit and updating
>> hpte valid info are ordered properly.
>> 
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/include/asm/pgtable-ppc64.h | 6 ++++++
>>  1 file changed, 6 insertions(+)
>> 
>> diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
>> index eb9261024f51..558beb760062 100644
>> --- a/arch/powerpc/include/asm/pgtable-ppc64.h
>> +++ b/arch/powerpc/include/asm/pgtable-ppc64.h
>> @@ -394,6 +394,12 @@ static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
>>  					unsigned int index, unsigned int hidx)
>>  {
>>  	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
>> +	/*
>> +	 * The hpte valid is stored in the pgtable whose address is in the
>> +	 * second half of the PMD. Order this against clearing of the busy bit in
>> +	 * huge pmd.
>> +	 */
>> +	smp_wmb();
>>  }
>
> A better place for this would be right before the last write to the PMD
> (that's also clearing BUSY) in __hash_page_thp(). Basically, it's the
> normal lock ordering that's missing here, nothing specific to
> mark_hpte_slot_valid() but instead, any state relative to the BUSY bit
> in the PMD (including the actual hash writes in update_pp etc...)
>

IIUC updatepp already has the required barriers, i.e. in updatepp we do
tlbie, which should take care of the ordering, right?

Now, the reason I moved that smp_wmb() to mark_hpte_slot_valid() was to
pair it with the smp_rmb() in get_hpte_slot_array().

-aneesh
Benjamin Herrenschmidt July 22, 2014, 9:55 p.m. UTC | #3
On Wed, 2014-07-23 at 00:23 +0530, Aneesh Kumar K.V wrote:
> > A better place for this would be right before the last write to the PMD
> > (that's also clearing BUSY) in __hash_page_thp(). Basically, it's the
> > normal lock ordering that's missing here, nothing specific to
> > mark_hpte_slot_valid() but instead, any state relative to the BUSY bit
> > in the PMD (including the actual hash writes in update_pp etc...)
> >
> 
> IIUC updatepp already has the required barriers, i.e. in updatepp we do
> tlbie, which should take care of the ordering, right?

Only if it succeeds but that doesn't matter, I'd rather we get the
semantics right. The clearing of the busy bit is an unlock, it should
have the appropriate barriers like it does in other variants of hash
page.
> 
> Now, the reason I moved that smp_wmb() to mark_hpte_slot_valid() was to
> pair it with the smp_rmb() in get_hpte_slot_array().

Which is also probably in the wrong place. Care to explain to me the
exact relationship ?

Ben.
Aneesh Kumar K.V July 29, 2014, 6:55 a.m. UTC | #4
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Wed, 2014-07-23 at 00:23 +0530, Aneesh Kumar K.V wrote:
>> > A better place for this would be right before the last write to the PMD
>> > (that's also clearing BUSY) in __hash_page_thp(). Basically, it's the
>> > normal lock ordering that's missing here, nothing specific to
>> > mark_hpte_slot_valid() but instead, any state relative to the BUSY bit
>> > in the PMD (including the actual hash writes in update_pp etc...)
>> >
>> 
>> IIUC updatepp already has the required barriers, i.e. in updatepp we do
>> tlbie, which should take care of the ordering, right?
>
> Only if it succeeds but that doesn't matter, I'd rather we get the
> semantics right. The clearing of the busy bit is an unlock, it should
> have the appropriate barriers like it does in other variants of hash
> page.

ok

>> 
>> Now, the reason I moved that smp_wmb() to mark_hpte_slot_valid() was to
>> pair it with the smp_rmb() in get_hpte_slot_array().
>
> Which is also probably in the wrong place. Care to explain to me the
> exact relationship ?

We want to make sure that, for usage like the one below, we don't reorder
the load.

if (pmd_trans_huge(*pmdp)) {
	get_hpte_slot_array(pmdp);
}

-aneesh
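The smp_wmb()/smp_rmb() pairing Aneesh describes can also be sketched with C11 fences. Again, every name here is a simplified stand-in (a plain flag models the pmd_trans_huge() check, a release fence models smp_wmb(), an acquire fence models smp_rmb()); this is not the kernel's code, just the ordering relationship:

```c
#include <stdatomic.h>

static unsigned char hpte_slot_array[16];
static _Atomic int pmd_is_trans_huge;   /* stands in for pmd_trans_huge(*pmdp) */

/* Writer: publish the slot info before the huge-PMD state becomes visible. */
static void mark_hpte_slot_valid_sketch(unsigned int index, unsigned int hidx)
{
	hpte_slot_array[index] = (unsigned char)(hidx << 4 | 0x1 << 3);
	atomic_thread_fence(memory_order_release);   /* models smp_wmb() */
	atomic_store_explicit(&pmd_is_trans_huge, 1, memory_order_relaxed);
}

/* Reader: check the flag, then order the slot-array load after that check. */
static int get_hpte_slot_sketch(unsigned int index)
{
	if (!atomic_load_explicit(&pmd_is_trans_huge, memory_order_relaxed))
		return -1;
	atomic_thread_fence(memory_order_acquire);   /* models smp_rmb() */
	return hpte_slot_array[index];
}
```

If the reader observes the flag set, the acquire/release pair guarantees it also observes the slot-array write — which is exactly the load reordering the if (pmd_trans_huge(*pmdp)) pattern above needs to rule out.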
Benjamin Herrenschmidt July 29, 2014, 7 a.m. UTC | #5
On Tue, 2014-07-29 at 12:25 +0530, Aneesh Kumar K.V wrote:
> We want to make sure that, for usage like the one below, we don't reorder
> the load.
> 
> if (pmd_trans_huge(*pmdp)) {
> 	get_hpte_slot_array(pmdp);
> }

Shouldn't we also make sure that we don't have the lock set? (In case
it's in the middle of being updated.)

Cheers,
Ben.
Aneesh Kumar K.V July 29, 2014, 10:37 a.m. UTC | #6
Benjamin Herrenschmidt <benh@kernel.crashing.org> writes:

> On Tue, 2014-07-29 at 12:25 +0530, Aneesh Kumar K.V wrote:
>> We want to make sure that, for usage like the one below, we don't reorder
>> the load.
>> 
>> if (pmd_trans_huge(*pmdp)) {
>> 	get_hpte_slot_array(pmdp);
>> }
>
> Shouldn't we also make sure that we don't have the lock set? (In case
> it's in the middle of being updated.)

The race against withdraw is not a real problem. We use
get_hpte_slot_array() in huge page invalidate and hash page. In the first
case we are holding pmd_trans_huge_lock and in the latter we have the
_PAGE_BUSY check.

-aneesh

Patch

diff --git a/arch/powerpc/include/asm/pgtable-ppc64.h b/arch/powerpc/include/asm/pgtable-ppc64.h
index eb9261024f51..558beb760062 100644
--- a/arch/powerpc/include/asm/pgtable-ppc64.h
+++ b/arch/powerpc/include/asm/pgtable-ppc64.h
@@ -394,6 +394,12 @@  static inline void mark_hpte_slot_valid(unsigned char *hpte_slot_array,
 					unsigned int index, unsigned int hidx)
 {
 	hpte_slot_array[index] = hidx << 4 | 0x1 << 3;
+	/*
+	 * The hpte valid is stored in the pgtable whose address is in the
+	 * second half of the PMD. Order this against clearing of the busy bit in
+	 * huge pmd.
+	 */
+	smp_wmb();
 }
 
 struct page *realmode_pfn_to_page(unsigned long pfn);