[1/3] powerpc: mm: make _PAGE_NUMA take effect

Message ID 1386140348-7854-2-git-send-email-pingfank@linux.vnet.ibm.com (mailing list archive)
State Superseded
Headers show

Commit Message

Pingfan Liu Dec. 4, 2013, 6:59 a.m. UTC
To enable the do_numa_page(), we should not fix _PAGE_NUMA in
hash_page(), so bail out for the case of pte_numa().

Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
---
 arch/powerpc/mm/hash_utils_64.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Aneesh Kumar K.V Dec. 5, 2013, 10:53 a.m. UTC | #1
Liu Ping Fan <kernelfans@gmail.com> writes:

> To enable the do_numa_page(), we should not fix _PAGE_NUMA in
> hash_page(), so bail out for the case of pte_numa().
>
> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
> ---
>  arch/powerpc/mm/hash_utils_64.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
> index fb176e9..9bf1195 100644
> --- a/arch/powerpc/mm/hash_utils_64.c
> +++ b/arch/powerpc/mm/hash_utils_64.c
> @@ -1033,7 +1033,7 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
>
>  	/* Get PTE and page size from page tables */
>  	ptep = find_linux_pte_or_hugepte(pgdir, ea, &hugeshift);
> -	if (ptep == NULL || !pte_present(*ptep)) {
> +	if (ptep == NULL || !pte_present(*ptep) || pte_numa(*ptep)) {
>  		DBG_LOW(" no PTE !\n");
>  		rc = 1;
>  		goto bail;

Why? All the hash routines do check for _PAGE_PRESENT via the access
variable.

-aneesh
Benjamin Herrenschmidt Dec. 9, 2013, 12:31 a.m. UTC | #2
On Thu, 2013-12-05 at 16:23 +0530, Aneesh Kumar K.V wrote:
> Liu Ping Fan <kernelfans@gmail.com> writes:
> 
> > To enable the do_numa_page(), we should not fix _PAGE_NUMA in
> > hash_page(), so bail out for the case of pte_numa().

For some reason I don't have 2/3 and 3/3 in my mbox (though I do have
them on patchwork) so I'll reply to this one.

Overall, your statement that this is a faster path needs to be backed up
with numbers.

The code is complicated enough as it is; such additional mess in the low
level hashing code requires a good justification, and also a
demonstration that it doesn't add overhead to the normal hash path.

Cheers,
Ben.
Pingfan Liu Dec. 9, 2013, 6:17 a.m. UTC | #3
On Mon, Dec 9, 2013 at 8:31 AM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Thu, 2013-12-05 at 16:23 +0530, Aneesh Kumar K.V wrote:
>> Liu Ping Fan <kernelfans@gmail.com> writes:
>>
>> > To enable the do_numa_page(), we should not fix _PAGE_NUMA in
>> > hash_page(), so bail out for the case of pte_numa().
>
> For some reason I don't have 2/3 and 3/3 in my mbox (though I do have
> them on patchwork) so I'll reply to this one.
>
> Overall, your statement that this is a faster path needs to be backed up
> with numbers.
>
> The code is complicated enough as it is; such additional mess in the low
> level hashing code requires a good justification, and also a
> demonstration that it doesn't add overhead to the normal hash path.
>
For the test, is it ok to have a user application copy pages where
all pages are PG_mlocked?

Thanks and regards,
Pingfan
Benjamin Herrenschmidt Dec. 9, 2013, 6:19 a.m. UTC | #4
On Mon, 2013-12-09 at 14:17 +0800, Liu ping fan wrote:
> On Mon, Dec 9, 2013 at 8:31 AM, Benjamin Herrenschmidt
> <benh@kernel.crashing.org> wrote:
> > On Thu, 2013-12-05 at 16:23 +0530, Aneesh Kumar K.V wrote:
> >> Liu Ping Fan <kernelfans@gmail.com> writes:
> >>
> >> > To enable the do_numa_page(), we should not fix _PAGE_NUMA in
> >> > hash_page(), so bail out for the case of pte_numa().
> >
> > For some reason I don't have 2/3 and 3/3 in my mbox (though I do have
> > them on patchwork) so I'll reply to this one.
> >
> > Overall, your statement that this is a faster path needs to be backed up
> > with numbers.
> >
> > The code is complicated enough as it is; such additional mess in the low
> > level hashing code requires a good justification, and also a
> > demonstration that it doesn't add overhead to the normal hash path.
> >
> For the test, is it ok to have a user application copy pages where
> all pages are PG_mlocked?

If that specific scenario is relevant in practice, then yes, though also
demonstrate the lack of regression with a more typical workload such as a
kernel compile.

Cheers,
Ben.
Pingfan Liu Dec. 11, 2013, 8:50 a.m. UTC | #5
On Thu, Dec 5, 2013 at 6:53 PM, Aneesh Kumar K.V
<aneesh.kumar@linux.vnet.ibm.com> wrote:
> Liu Ping Fan <kernelfans@gmail.com> writes:
>
>> To enable the do_numa_page(), we should not fix _PAGE_NUMA in
>> hash_page(), so bail out for the case of pte_numa().
>>
>> Signed-off-by: Liu Ping Fan <pingfank@linux.vnet.ibm.com>
>> ---
>>  arch/powerpc/mm/hash_utils_64.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
>> index fb176e9..9bf1195 100644
>> --- a/arch/powerpc/mm/hash_utils_64.c
>> +++ b/arch/powerpc/mm/hash_utils_64.c
>> @@ -1033,7 +1033,7 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
>>
>>       /* Get PTE and page size from page tables */
>>       ptep = find_linux_pte_or_hugepte(pgdir, ea, &hugeshift);
>> -     if (ptep == NULL || !pte_present(*ptep)) {
>> +     if (ptep == NULL || !pte_present(*ptep) || pte_numa(*ptep)) {
>>               DBG_LOW(" no PTE !\n");
>>               rc = 1;
>>               goto bail;
>
> Why? All the hash routines do check for _PAGE_PRESENT via the access
> variable.
>
Going through __hash_page_4K (4K on 4K HW), I do not find such a check.
Am I wrong? Otherwise I will send out a patch to fix that.

Thanks and regards,
Pingfan
> -aneesh
>
Benjamin Herrenschmidt Dec. 11, 2013, 9:50 a.m. UTC | #6
On Wed, 2013-12-11 at 16:50 +0800, Liu ping fan wrote:
> > Why? All the hash routines do check for _PAGE_PRESENT via the access
> > variable.
> >
> Going through __hash_page_4K (4K on 4K HW), I do not find such a check.
> Am I wrong? Otherwise I will send out a patch to fix that.

We pass a bitmask of flags to check, which are tested by doing an "andc"
of the PTE against that mask and checking whether anything is left...

Ben.
Pingfan Liu Dec. 12, 2013, 2:19 a.m. UTC | #7
On Wed, Dec 11, 2013 at 5:50 PM, Benjamin Herrenschmidt
<benh@kernel.crashing.org> wrote:
> On Wed, 2013-12-11 at 16:50 +0800, Liu ping fan wrote:
>> > Why? All the hash routines do check for _PAGE_PRESENT via the access
>> > variable.
>> >
>> Going through __hash_page_4K (4K on 4K HW), I do not find such a check.
>> Am I wrong? Otherwise I will send out a patch to fix that.
>
> We pass a bitmask of flags to check, which are tested by doing an "andc"
> of the PTE against that mask and checking whether anything is left...
>
Oh, I see it, thank you very much.

Regards,
Pingfan

Patch

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index fb176e9..9bf1195 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -1033,7 +1033,7 @@ int hash_page(unsigned long ea, unsigned long access, unsigned long trap)
 
 	/* Get PTE and page size from page tables */
 	ptep = find_linux_pte_or_hugepte(pgdir, ea, &hugeshift);
-	if (ptep == NULL || !pte_present(*ptep)) {
+	if (ptep == NULL || !pte_present(*ptep) || pte_numa(*ptep)) {
 		DBG_LOW(" no PTE !\n");
 		rc = 1;
 		goto bail;