[2/7] arm64: mm: accelerate pagefault when VM_FAULT_BADACCESS

Message ID: 20240402075142.196265-3-wangkefeng.wang@huawei.com
State: Handled Elsewhere
Series: arch/mm/fault: accelerate pagefault when badaccess

Commit Message

Kefeng Wang April 2, 2024, 7:51 a.m. UTC
The vm_flags of the vma are already checked under the per-VMA lock. If
the access is bad, set fault to VM_FAULT_BADACCESS directly and handle
the error; there is no need to call lock_mm_and_find_vma() and check
vm_flags again. Latency drops by 34% in lmbench
'lat_sig -P 1 prot lat_sig'.

Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
---
 arch/arm64/mm/fault.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
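
The path this patch shortens is the one taken when a VMA exists but its
vm_flags do not permit the access. The following userspace sketch is not
the lmbench source; it is a minimal illustration, assuming a Linux-like
system, of how such a fault can be triggered and timed: it writes to a
read-only anonymous page and recovers in a signal handler. The helper
name measure_badaccess_ns() and the handler on_fault() are hypothetical
names introduced here for illustration.

```c
#define _GNU_SOURCE
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <sys/mman.h>
#include <time.h>
#include <unistd.h>

static sigjmp_buf fault_env;
static volatile sig_atomic_t fault_count;

/* Signal handler: count the fault and jump back past the faulting store. */
static void on_fault(int sig)
{
	(void)sig;
	fault_count++;
	siglongjmp(fault_env, 1);
}

/*
 * Hypothetical helper (not part of lmbench or the kernel): write to a
 * read-only page `iterations` times and return the average cost, in
 * nanoseconds, of one permission fault plus signal delivery.
 * Returns -1.0 if setup fails or the expected faults did not occur.
 */
static double measure_badaccess_ns(int iterations)
{
	struct sigaction sa, old_segv, old_bus;
	struct timespec t0, t1;
	volatile char *page;
	volatile int i;
	void *map;
	long page_size = sysconf(_SC_PAGESIZE);

	if (iterations <= 0 || page_size <= 0)
		return -1.0;

	/* The mapping exists, but it is not writable. */
	map = mmap(NULL, (size_t)page_size, PROT_READ,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED)
		return -1.0;
	page = map;

	sa.sa_handler = on_fault;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = 0;
	sigaction(SIGSEGV, &sa, &old_segv);
	sigaction(SIGBUS, &sa, &old_bus);	/* some systems raise SIGBUS */

	fault_count = 0;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iterations; i++) {
		/* savemask=1 so the blocked signal is unblocked on return. */
		if (sigsetjmp(fault_env, 1) == 0)
			page[0] = 1;	/* store to a read-only page: faults */
	}
	clock_gettime(CLOCK_MONOTONIC, &t1);

	sigaction(SIGSEGV, &old_segv, NULL);
	sigaction(SIGBUS, &old_bus, NULL);
	munmap(map, (size_t)page_size);

	if (fault_count != iterations)
		return -1.0;

	return ((double)(t1.tv_sec - t0.tv_sec) * 1e9 +
		(double)(t1.tv_nsec - t0.tv_nsec)) / iterations;
}
```

Calling measure_badaccess_ns(100000) before and after applying the patch
gives a rough comparison; absolute numbers will differ from the lmbench
figure quoted above, since the measurement loop is not the same.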

Comments

Suren Baghdasaryan April 3, 2024, 5:19 a.m. UTC | #1
On Tue, Apr 2, 2024 at 12:53 AM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
> The vm_flags of vma already checked under per-VMA lock, if it is a
> bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
> no need to lock_mm_and_find_vma() and check vm_flags again, the latency
> time reduce 34% in lmbench 'lat_sig -P 1 prot lat_sig'.

The change makes sense to me. Per-VMA lock is enough to keep
vma->vm_flags stable, so no need to retry with mmap_lock.

>
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>

Reviewed-by: Suren Baghdasaryan <surenb@google.com>

> ---
>  arch/arm64/mm/fault.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 9bb9f395351a..405f9aa831bd 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>
>         if (!(vma->vm_flags & vm_flags)) {
>                 vma_end_read(vma);
> -               goto lock_mmap;
> +               fault = VM_FAULT_BADACCESS;
> +               count_vm_vma_lock_event(VMA_LOCK_SUCCESS);

nit: VMA_LOCK_SUCCESS accounting here seems correct to me but
unrelated to the main change. Either splitting into a separate patch
or mentioning this additional fixup in the changelog would be helpful.

> +               goto done;
>         }
>         fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>         if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> --
> 2.27.0
>
Suren Baghdasaryan April 3, 2024, 5:30 a.m. UTC | #2
On Tue, Apr 2, 2024 at 10:19 PM Suren Baghdasaryan <surenb@google.com> wrote:
>
> On Tue, Apr 2, 2024 at 12:53 AM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
> >
> > The vm_flags of vma already checked under per-VMA lock, if it is a
> > bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
> > no need to lock_mm_and_find_vma() and check vm_flags again, the latency
> > time reduce 34% in lmbench 'lat_sig -P 1 prot lat_sig'.
>
> The change makes sense to me. Per-VMA lock is enough to keep
> vma->vm_flags stable, so no need to retry with mmap_lock.
>
> >
> > Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>
> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>
> > ---
> >  arch/arm64/mm/fault.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index 9bb9f395351a..405f9aa831bd 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> >
> >         if (!(vma->vm_flags & vm_flags)) {
> >                 vma_end_read(vma);
> > -               goto lock_mmap;
> > +               fault = VM_FAULT_BADACCESS;
> > +               count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>
> nit: VMA_LOCK_SUCCESS accounting here seems correct to me but
> unrelated to the main change. Either splitting into a separate patch
> or mentioning this additional fixup in the changelog would be helpful.

The above nit applies to all the patches after this one, so I won't
comment on each one separately. If you decide to split or adjust the
changelog please do that for each patch.

>
> > +               goto done;
> >         }
> >         fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
> >         if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
> > --
> > 2.27.0
> >
Kefeng Wang April 3, 2024, 6:13 a.m. UTC | #3
On 2024/4/3 13:30, Suren Baghdasaryan wrote:
> On Tue, Apr 2, 2024 at 10:19 PM Suren Baghdasaryan <surenb@google.com> wrote:
>>
>> On Tue, Apr 2, 2024 at 12:53 AM Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>>>
>>> The vm_flags of vma already checked under per-VMA lock, if it is a
>>> bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
>>> no need to lock_mm_and_find_vma() and check vm_flags again, the latency
>>> time reduce 34% in lmbench 'lat_sig -P 1 prot lat_sig'.
>>
>> The change makes sense to me. Per-VMA lock is enough to keep
>> vma->vm_flags stable, so no need to retry with mmap_lock.
>>
>>>
>>> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
>>
>> Reviewed-by: Suren Baghdasaryan <surenb@google.com>
>>
>>> ---
>>>   arch/arm64/mm/fault.c | 4 +++-
>>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>>> index 9bb9f395351a..405f9aa831bd 100644
>>> --- a/arch/arm64/mm/fault.c
>>> +++ b/arch/arm64/mm/fault.c
>>> @@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>>>
>>>          if (!(vma->vm_flags & vm_flags)) {
>>>                  vma_end_read(vma);
>>> -               goto lock_mmap;
>>> +               fault = VM_FAULT_BADACCESS;
>>> +               count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
>>
>> nit: VMA_LOCK_SUCCESS accounting here seems correct to me but
>> unrelated to the main change. Either splitting into a separate patch
>> or mentioning this additional fixup in the changelog would be helpful.
> 
> The above nit applies to all the patches after this one, so I won't
> comment on each one separately. If you decide to split or adjust the
> changelog please do that for each patch.

I will update the changelog for each patch. Thanks for your review and
suggestions.

> 
>>
>>> +               goto done;
>>>          }
>>>          fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>>>          if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))
>>> --
>>> 2.27.0
>>>
Catalin Marinas April 3, 2024, 6:32 p.m. UTC | #4
On Tue, Apr 02, 2024 at 03:51:37PM +0800, Kefeng Wang wrote:
> The vm_flags of vma already checked under per-VMA lock, if it is a
> bad access, directly set fault to VM_FAULT_BADACCESS and handle error,
> no need to lock_mm_and_find_vma() and check vm_flags again, the latency
> time reduce 34% in lmbench 'lat_sig -P 1 prot lat_sig'.
> 
> Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
> ---
>  arch/arm64/mm/fault.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index 9bb9f395351a..405f9aa831bd 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -572,7 +572,9 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  
>  	if (!(vma->vm_flags & vm_flags)) {
>  		vma_end_read(vma);
> -		goto lock_mmap;
> +		fault = VM_FAULT_BADACCESS;
> +		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
> +		goto done;
>  	}
>  	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
>  	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))

I think this makes sense. A concurrent modification of vma->vm_flags
(e.g. mprotect()) would do a vma_start_write(), so no need to recheck
again with the mmap lock held.

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Patch

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 9bb9f395351a..405f9aa831bd 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -572,7 +572,9 @@  static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 
 	if (!(vma->vm_flags & vm_flags)) {
 		vma_end_read(vma);
-		goto lock_mmap;
+		fault = VM_FAULT_BADACCESS;
+		count_vm_vma_lock_event(VMA_LOCK_SUCCESS);
+		goto done;
 	}
 	fault = handle_mm_fault(vma, addr, mm_flags | FAULT_FLAG_VMA_LOCK, regs);
 	if (!(fault & (VM_FAULT_RETRY | VM_FAULT_COMPLETED)))