[V5,16/17] mm: Let arch choose the initial value of task size

Message ID 1490153823-29241-17-git-send-email-aneesh.kumar@linux.vnet.ibm.com (mailing list archive)
State Superseded
Commit Message

Aneesh Kumar K.V March 22, 2017, 3:37 a.m. UTC
As we start supporting larger address spaces (>128TB), we want to give the
architecture control over the maximum task size of an application, which may
differ from TASK_SIZE. For example, ppc64 needs to track the base page size of
each segment, and this information is copied from mm_context_t to the PACA on
each context switch. If we know that the application has not used an address
range above 128TB, we only need to copy the details for the 128TB range to the
PACA. This improves context switch performance by avoiding a larger copy.

Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: linux-mm@kvack.org
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
---
 fs/exec.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)
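
To make the motivation concrete, here is a rough userspace model of the copy the commit message describes. It is only a sketch: the struct names, the 1TB slice granularity, the 4-bit page-size encoding and the 512TB ceiling are assumptions chosen for illustration, not the actual ppc64 layout.

```c
#include <stddef.h>
#include <string.h>

#define SLICE_SHIFT 40                        /* assumed: one slice per 1TB */
#define MAX_ADDR    (1ULL << 49)              /* assumed ceiling: 512TB */
#define MAX_SLICES  (MAX_ADDR >> SLICE_SHIFT)

/* Assumed layout: 4 bits of page-size index per slice, two slices per byte. */
struct mm_context_model {
	unsigned long long addr_limit;        /* highest address the task may use */
	unsigned char slice_psize[MAX_SLICES / 2];
};

/* Hypothetical stand-in for the per-CPU area that receives the copy. */
struct paca_model {
	unsigned char slice_psize[MAX_SLICES / 2];
};

/* Bytes of slice data needed to describe addresses below addr_limit. */
static size_t slice_bytes(unsigned long long addr_limit)
{
	return (size_t)((addr_limit >> SLICE_SHIFT) / 2);
}

/* Context-switch copy: only the part of the array the task can reach. */
static size_t copy_to_paca(struct paca_model *paca,
			   const struct mm_context_model *ctx)
{
	size_t n = slice_bytes(ctx->addr_limit);

	/* Never copy more than the destination array holds. */
	if (n > sizeof(paca->slice_psize))
		n = sizeof(paca->slice_psize);
	memcpy(paca->slice_psize, ctx->slice_psize, n);
	return n;
}
```

Under these assumptions a task limited to 128TB copies a quarter of what a task using the full modelled range would copy, which is the saving the per-mm limit is meant to enable.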

Comments

Michael Ellerman March 28, 2017, 11:17 a.m. UTC | #1
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:

> As we start supporting larger address space (>128TB), we want to give
> architecture a control on max task size of an application which is different
> from the TASK_SIZE. For ex: ppc64 needs to track the base page size of a segment
> and it is copied from mm_context_t to PACA on each context switch. If we know that
> application has not used an address range above 128TB we only need to copy
> details about 128TB range to PACA. This will help in improving context switch
> performance by avoiding larger copy operation.
>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: linux-mm@kvack.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/exec.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)

I'll need an ACK at least on this from someone in mm land.

I assume there's no way I can merge patch 17 without this?

> diff --git a/fs/exec.c b/fs/exec.c
> index 65145a3df065..5550a56d03c3 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1308,6 +1308,14 @@ void would_dump(struct linux_binprm *bprm, struct file *file)
>  }
>  EXPORT_SYMBOL(would_dump);
>  
> +#ifndef arch_init_task_size
> +static inline void arch_init_task_size(void)
> +{
> +	current->mm->task_size = TASK_SIZE;
> +}
> +#define arch_init_task_size arch_init_task_size

I don't think you need the #define in the fallback case; it's
just extra noise.

> +#endif
> +
>  void setup_new_exec(struct linux_binprm * bprm)
>  {
>  	arch_pick_mmap_layout(current->mm);

cheers
Aneesh Kumar K.V March 28, 2017, 3:22 p.m. UTC | #2
On Tuesday 28 March 2017 04:47 PM, Michael Ellerman wrote:
> "Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com> writes:
>
>> As we start supporting larger address space (>128TB), we want to give
>> architecture a control on max task size of an application which is different
>> from the TASK_SIZE. For ex: ppc64 needs to track the base page size of a segment
>> and it is copied from mm_context_t to PACA on each context switch. If we know that
>> application has not used an address range above 128TB we only need to copy
>> details about 128TB range to PACA. This will help in improving context switch
>> performance by avoiding larger copy operation.
>>
>> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
>> Cc: linux-mm@kvack.org
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
>> ---
>>  fs/exec.c | 10 +++++++++-
>>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> I'll need an ACK at least on this from someone in mm land.
>
> I assume there's no way I can merge patch 17 without this?

That is correct.

I didn't Cc linux-mm on the rest of the patches. They are all ppc64
specific and can be found at

https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-March/155726.html


-aneesh
Anshuman Khandual March 29, 2017, 9:20 a.m. UTC | #3
On Wednesday 22 March 2017 09:07 AM, Aneesh Kumar K.V wrote:
> As we start supporting larger address space (>128TB), we want to give
> architecture a control on max task size of an application which is different
> from the TASK_SIZE. For ex: ppc64 needs to track the base page size of a segment
> and it is copied from mm_context_t to PACA on each context switch. If we know that
> application has not used an address range above 128TB we only need to copy
> details about 128TB range to PACA. This will help in improving context switch
> performance by avoiding larger copy operation.
>
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: linux-mm@kvack.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>   fs/exec.c | 10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
>
> diff --git a/fs/exec.c b/fs/exec.c
> index 65145a3df065..5550a56d03c3 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1308,6 +1308,14 @@ void would_dump(struct linux_binprm *bprm, struct file *file)
>   }
>   EXPORT_SYMBOL(would_dump);
>   
> +#ifndef arch_init_task_size
> +static inline void arch_init_task_size(void)
> +{
> +	current->mm->task_size = TASK_SIZE;
> +}
> +#define arch_init_task_size arch_init_task_size
> +#endif

Why not a proper CONFIG_ARCH_DEFINED_TASK_SIZE kind of option for this?
Also, are there no assumptions in other places about current->mm->task_size
being TASK_SIZE which might get broken?
Anshuman Khandual March 30, 2017, 3:03 a.m. UTC | #4
On 03/22/2017 09:07 AM, Aneesh Kumar K.V wrote:
> As we start supporting larger address space (>128TB), we want to give
> architecture a control on max task size of an application which is different
> from the TASK_SIZE. For ex: ppc64 needs to track the base page size of a segment
> and it is copied from mm_context_t to PACA on each context switch. If we know that
> application has not used an address range above 128TB we only need to copy
> details about 128TB range to PACA. This will help in improving context switch
> performance by avoiding larger copy operation.
> 
> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
> Cc: linux-mm@kvack.org
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
> ---
>  fs/exec.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/exec.c b/fs/exec.c
> index 65145a3df065..5550a56d03c3 100644
> --- a/fs/exec.c
> +++ b/fs/exec.c
> @@ -1308,6 +1308,14 @@ void would_dump(struct linux_binprm *bprm, struct file *file)
>  }
>  EXPORT_SYMBOL(would_dump);
>  
> +#ifndef arch_init_task_size
> +static inline void arch_init_task_size(void)
> +{
> +	current->mm->task_size = TASK_SIZE;
> +}
> +#define arch_init_task_size arch_init_task_size
> +#endif

Why not a proper CONFIG_ARCH_DEFINED_TASK_SIZE kind of option for
this? Also, are there no assumptions in other places about
current->mm->task_size being TASK_SIZE which might get broken?
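
For comparison, the Kconfig-style alternative suggested above might look roughly like the sketch below. Only the symbol name CONFIG_ARCH_DEFINED_TASK_SIZE comes from the thread; the struct, the limits and the explicit mm argument are simplifications made so the fragment compiles in userspace.

```c
/* Stand-in for the kernel type; only the relevant field is modelled. */
struct mm_struct {
	unsigned long long task_size;
};

#define TASK_SIZE (1ULL << 49)  /* hypothetical maximum: 512TB */

/* Uncomment to model an architecture that selects the option in Kconfig. */
/* #define CONFIG_ARCH_DEFINED_TASK_SIZE 1 */

#ifdef CONFIG_ARCH_DEFINED_TASK_SIZE
/* The architecture would supply this; the body is a hypothetical example. */
static void arch_init_task_size(struct mm_struct *mm)
{
	mm->task_size = 1ULL << 47;  /* start with a 128TB limit */
}
#else
/* Generic default: every task starts with the full TASK_SIZE. */
static void arch_init_task_size(struct mm_struct *mm)
{
	mm->task_size = TASK_SIZE;
}
#endif
```

The trade-off is the usual one: a Kconfig symbol is visible and greppable, while the `#ifndef` idiom used in the patch needs no new option and keeps the override next to the arch header that defines it.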
Patch

diff --git a/fs/exec.c b/fs/exec.c
index 65145a3df065..5550a56d03c3 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1308,6 +1308,14 @@  void would_dump(struct linux_binprm *bprm, struct file *file)
 }
 EXPORT_SYMBOL(would_dump);
 
+#ifndef arch_init_task_size
+static inline void arch_init_task_size(void)
+{
+	current->mm->task_size = TASK_SIZE;
+}
+#define arch_init_task_size arch_init_task_size
+#endif
+
 void setup_new_exec(struct linux_binprm * bprm)
 {
 	arch_pick_mmap_layout(current->mm);
@@ -1327,7 +1335,7 @@  void setup_new_exec(struct linux_binprm * bprm)
 	 * depend on TIF_32BIT which is only updated in flush_thread() on
 	 * some architectures like powerpc
 	 */
-	current->mm->task_size = TASK_SIZE;
+	arch_init_task_size();
 
 	/* install the new credentials */
 	if (!uid_eq(bprm->cred->uid, current_euid()) ||
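
As an illustration of how an architecture could hook into this, the following userspace sketch models the override. The types, the `current` stand-in, DEFAULT_MAP_WINDOW and both limits are hypothetical; the real arch-side override lives in the ppc64 patches of the series and is not reproduced here.

```c
/* Stand-ins for kernel types; only the field this patch touches is modelled. */
struct mm_struct {
	unsigned long long task_size;
};
struct task_struct {
	struct mm_struct *mm;
};

static struct mm_struct demo_mm;
static struct task_struct demo_task = { &demo_mm };
#define current (&demo_task)

#define TASK_SIZE          (1ULL << 49)  /* hypothetical full range: 512TB */
#define DEFAULT_MAP_WINDOW (1ULL << 47)  /* hypothetical initial limit: 128TB */

/* "Arch header" part: supply the hook and announce it via a same-named macro. */
static inline void arch_init_task_size(void)
{
	current->mm->task_size = DEFAULT_MAP_WINDOW;
}
#define arch_init_task_size arch_init_task_size

/* "Generic" part, as in fs/exec.c: compiled only if no arch hook exists. */
#ifndef arch_init_task_size
static inline void arch_init_task_size(void)
{
	current->mm->task_size = TASK_SIZE;
}
#endif

/* Mimics the call site in setup_new_exec(). */
static unsigned long long demo_exec(void)
{
	arch_init_task_size();
	return current->mm->task_size;
}
```

Because the arch definition comes first and defines the macro, the generic fallback is skipped and `demo_exec()` yields the smaller initial limit. The fallback here omits the self-referential `#define`, in line with the review comment that it is not needed in that branch.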