um: Optimize Flush TLB for force/fork case

Message ID 20181207090553.26251-1-anton.ivanov@cambridgegreys.com
State Accepted

Commit Message

Anton Ivanov Dec. 7, 2018, 9:05 a.m.
From: Anton Ivanov <anton.ivanov@cambridgegreys.com>

When UML handles a fork, the page tables need to be brought up
to date. That was done using brute force - a full TLB flush.

This is actually unnecessary, because the mapped-in mappings are
all still correct; the only mappings which need to be updated
after a fork are any unmaps (so that paging works) and
any pending protection changes.

This optimization squeezes up to 3% out of a full kernel rebuild
time under memory pressure.

Signed-off-by: Anton Ivanov <anton.ivanov@cambridgegreys.com>
---
 arch/um/kernel/tlb.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

Comments

Anton Ivanov Dec. 7, 2018, 9:11 a.m. | #1
On 12/7/18 9:05 AM, anton.ivanov@cambridgegreys.com wrote:
> [commit message and patch quoted in full; see above and the Patch section below]

This completes the tlb rework - it is incremental on top of the previous 
series, but will apply cleanly to a pristine kernel as well.

All in all, after all the beatings, morale improved by a measly 
5-7% for the whole series on heavy fork-dependent workloads.

There is no point in continuing the beatings; it does not look like there 
is anything substantial left to extract out of the TLB code itself unless 
the "incoming" state after fork improves somehow.

Patch

diff --git a/arch/um/kernel/tlb.c b/arch/um/kernel/tlb.c
index 9ca902df243a..8347161c2ae0 100644
--- a/arch/um/kernel/tlb.c
+++ b/arch/um/kernel/tlb.c
@@ -242,10 +242,11 @@  static inline int update_pte_range(pmd_t *pmd, unsigned long addr,
 		prot = ((r ? UM_PROT_READ : 0) | (w ? UM_PROT_WRITE : 0) |
 			(x ? UM_PROT_EXEC : 0));
 		if (hvc->force || pte_newpage(*pte)) {
-			if (pte_present(*pte))
-				ret = add_mmap(addr, pte_val(*pte) & PAGE_MASK,
-					       PAGE_SIZE, prot, hvc);
-			else
+			if (pte_present(*pte)) {
+				if (pte_newpage(*pte))
+					ret = add_mmap(addr, pte_val(*pte) & PAGE_MASK,
+						       PAGE_SIZE, prot, hvc);
+			} else
 				ret = add_munmap(addr, PAGE_SIZE, hvc);
 		} else if (pte_newprot(*pte))
 			ret = add_mprotect(addr, PAGE_SIZE, prot, hvc);
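
The effect of the hunk above can be sketched as a small user-space model of the per-PTE decision. This is a hypothetical simplification: `fake_pte`, `choose_op`, and `enum op` are illustrative stand-ins; UML's real `pte_present()`/`pte_newpage()`/`pte_newprot()` test bits in live page table entries, and `add_mmap()`/`add_munmap()`/`add_mprotect()` queue host operations rather than returning a tag.

```c
#include <assert.h>
#include <stdbool.h>

/* Which host operation the flush would queue for one PTE. */
enum op { OP_NONE, OP_MMAP, OP_MUNMAP, OP_MPROTECT };

/* Stand-in for the flags the real pte_*() helpers test. */
struct fake_pte {
	bool present;	/* page is mapped in */
	bool newpage;	/* mapping changed: needs (re)mmap or munmap */
	bool newprot;	/* only the protection bits changed */
};

/*
 * After the patch: under force (e.g. the fork case), an entry that is
 * present but not a new page is left alone instead of being re-mmapped;
 * only genuinely new mappings, unmaps, and protection changes are acted on.
 */
static enum op choose_op(const struct fake_pte *pte, bool force)
{
	if (force || pte->newpage) {
		if (pte->present) {
			if (pte->newpage)
				return OP_MMAP;	/* genuinely new mapping */
			return OP_NONE;		/* already correct: skip */
		}
		return OP_MUNMAP;		/* gone: unmap so paging works */
	}
	if (pte->newprot)
		return OP_MPROTECT;		/* protection change only */
	return OP_NONE;
}
```

The `OP_NONE` branch for a present-but-unchanged PTE under force is exactly what the patch adds; before it, that case fell through to `add_mmap()` on every forced flush.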