
[RESEND,1/1] um: oops on accessing a non-present page in the vmalloc area

Message ID: 20240223140435.1240-1-petrtesarik@huaweicloud.com
State: Changes Requested
Series: [RESEND,1/1] um: oops on accessing a non-present page in the vmalloc area

Commit Message

Petr Tesarik Feb. 23, 2024, 2:04 p.m. UTC
From: Petr Tesarik <petr.tesarik1@huawei-partners.com>

If a segmentation fault is caused by accessing an address in the vmalloc
area, check that the target page is present.

Currently, if the kernel hits a guard page in the vmalloc area, UML blindly
assumes that the fault is caused by a stale mapping and will be fixed by
flush_tlb_kernel_vm(). Unsurprisingly, if the fault is caused by accessing
a guard page, no mapping is created, and when the faulting instruction is
restarted, it will cause exactly the same fault again, effectively creating
an infinite loop.

Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
---
 arch/um/kernel/trap.c | 4 ++++
 1 file changed, 4 insertions(+)
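
To make the failure mode concrete (this illustration is not part of the
submission): any kernel-mode access that lands in a vmalloc guard page can
trigger the loop described above. A minimal, hypothetical reproducer in
kernel code would be:

	#include <linux/vmalloc.h>

	static void guard_page_repro(void)
	{
		/* one page of vmalloc memory; the page after it is an unmapped guard page */
		char *p = vmalloc(PAGE_SIZE);
		volatile char *guard;

		if (!p)
			return;

		/*
		 * Touch one byte past the allocation. Before the fix, UML's
		 * segv() sees a kernel fault inside [start_vm, end_vm), calls
		 * flush_tlb_kernel_vm() and restarts the access, but no mapping
		 * exists for the guard page, so the same fault repeats forever.
		 */
		guard = p + PAGE_SIZE;
		(void)*guard;

		vfree(p);
	}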

Comments

Petr Tesarik March 12, 2024, 3:07 p.m. UTC | #1
On 2/23/2024 3:04 PM, Petr Tesarik wrote:
> From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> 
> If a segmentation fault is caused by accessing an address in the vmalloc
> area, check that the target page is present.
> 
> Currently, if the kernel hits a guard page in the vmalloc area, UML blindly
> assumes that the fault is caused by a stale mapping and will be fixed by
> flush_tlb_kernel_vm(). Unsurprisingly, if the fault is caused by accessing
> a guard page, no mapping is created, and when the faulting instruction is
> restarted, it will cause exactly the same fault again, effectively creating
> an infinite loop.

Ping. Any comment on this fix?

Petr T

> 
> Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> ---
>  arch/um/kernel/trap.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
> index 6d8ae86ae978..d5b85f1bfe33 100644
> --- a/arch/um/kernel/trap.c
> +++ b/arch/um/kernel/trap.c
> @@ -206,11 +206,15 @@ unsigned long segv(struct faultinfo fi, unsigned long ip, int is_user,
>  	int err;
>  	int is_write = FAULT_WRITE(fi);
>  	unsigned long address = FAULT_ADDRESS(fi);
> +	pte_t *pte;
>  
>  	if (!is_user && regs)
>  		current->thread.segv_regs = container_of(regs, struct pt_regs, regs);
>  
>  	if (!is_user && (address >= start_vm) && (address < end_vm)) {
> +		pte = virt_to_pte(&init_mm, address);
> +		if (!pte_present(*pte))
> +			page_fault_oops(regs, address, ip);
>  		flush_tlb_kernel_vm();
>  		goto out;
>  	}
Petr Tesarik March 18, 2024, 1:09 p.m. UTC | #2
On 3/12/2024 4:07 PM, Petr Tesarik wrote:
> On 2/23/2024 3:04 PM, Petr Tesarik wrote:
>> From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>>
>> If a segmentation fault is caused by accessing an address in the vmalloc
>> area, check that the target page is present.
>>
>> Currently, if the kernel hits a guard page in the vmalloc area, UML blindly
>> assumes that the fault is caused by a stale mapping and will be fixed by
>> flush_tlb_kernel_vm(). Unsurprisingly, if the fault is caused by accessing
>> a guard page, no mapping is created, and when the faulting instruction is
>> restarted, it will cause exactly the same fault again, effectively creating
>> an infinite loop.
> 
> Ping. Any comment on this fix?

I don't think I have seen a reply from you. If you did comment, then
your email has not reached me.

Please, can you confirm you have seen my patch?

Kind regards
Petr T

> Petr T
> 
>>
>> Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>> ---
>>  arch/um/kernel/trap.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
>> index 6d8ae86ae978..d5b85f1bfe33 100644
>> --- a/arch/um/kernel/trap.c
>> +++ b/arch/um/kernel/trap.c
>> @@ -206,11 +206,15 @@ unsigned long segv(struct faultinfo fi, unsigned long ip, int is_user,
>>  	int err;
>>  	int is_write = FAULT_WRITE(fi);
>>  	unsigned long address = FAULT_ADDRESS(fi);
>> +	pte_t *pte;
>>  
>>  	if (!is_user && regs)
>>  		current->thread.segv_regs = container_of(regs, struct pt_regs, regs);
>>  
>>  	if (!is_user && (address >= start_vm) && (address < end_vm)) {
>> +		pte = virt_to_pte(&init_mm, address);
>> +		if (!pte_present(*pte))
>> +			page_fault_oops(regs, address, ip);
>>  		flush_tlb_kernel_vm();
>>  		goto out;
>>  	}
>
Richard Weinberger March 19, 2024, 10:18 p.m. UTC | #3
----- Original Message -----
> From: "Petr Tesarik" <petrtesarik@huaweicloud.com>
> To: "richard" <richard@nod.at>, "anton ivanov" <anton.ivanov@cambridgegreys.com>, "Johannes Berg"
> <johannes@sipsolutions.net>, "linux-um" <linux-um@lists.infradead.org>, "linux-kernel" <linux-kernel@vger.kernel.org>
> CC: "Roberto Sassu" <roberto.sassu@huaweicloud.com>, "petr" <petr@tesarici.cz>
> Sent: Monday, March 18, 2024 14:09:07
> Subject: Re: [PATCH RESEND 1/1] um: oops on accessing a non-present page in the vmalloc area

> On 3/12/2024 4:07 PM, Petr Tesarik wrote:
>> On 2/23/2024 3:04 PM, Petr Tesarik wrote:
>>> From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>>>
>>> If a segmentation fault is caused by accessing an address in the vmalloc
>>> area, check that the target page is present.
>>>
>>> Currently, if the kernel hits a guard page in the vmalloc area, UML blindly
>>> assumes that the fault is caused by a stale mapping and will be fixed by
>>> flush_tlb_kernel_vm(). Unsurprisingly, if the fault is caused by accessing
>>> a guard page, no mapping is created, and when the faulting instruction is
>>> restarted, it will cause exactly the same fault again, effectively creating
>>> an infinite loop.
>> 
>> Ping. Any comment on this fix?
> 
> I don't think I have seen a reply from you. If you did comment, then
> your email has not reached me.
> 
> Please, can you confirm you have seen my patch?

Yes. I'm just way behind my maintainer schedule. :-/

Thanks,
//richard
Petr Tesarik March 20, 2024, 1:58 p.m. UTC | #4
On 3/19/2024 11:18 PM, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "Petr Tesarik" <petrtesarik@huaweicloud.com>
>> To: "richard" <richard@nod.at>, "anton ivanov" <anton.ivanov@cambridgegreys.com>, "Johannes Berg"
>> <johannes@sipsolutions.net>, "linux-um" <linux-um@lists.infradead.org>, "linux-kernel" <linux-kernel@vger.kernel.org>
>> CC: "Roberto Sassu" <roberto.sassu@huaweicloud.com>, "petr" <petr@tesarici.cz>
>> Sent: Monday, March 18, 2024 14:09:07
>> Subject: Re: [PATCH RESEND 1/1] um: oops on accessing a non-present page in the vmalloc area
> 
>> On 3/12/2024 4:07 PM, Petr Tesarik wrote:
>>> On 2/23/2024 3:04 PM, Petr Tesarik wrote:
>>>> From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>>>>
>>>> If a segmentation fault is caused by accessing an address in the vmalloc
>>>> area, check that the target page is present.
>>>>
>>>> Currently, if the kernel hits a guard page in the vmalloc area, UML blindly
>>>> assumes that the fault is caused by a stale mapping and will be fixed by
>>>> flush_tlb_kernel_vm(). Unsurprisingly, if the fault is caused by accessing
>>>> a guard page, no mapping is created, and when the faulting instruction is
>>>> restarted, it will cause exactly the same fault again, effectively creating
>>>> an infinite loop.
>>>
>>> Ping. Any comment on this fix?
>>
>> I don't think I have seen a reply from you. If you did comment, then
>> your email has not reached me.
>>
>> Please, can you confirm you have seen my patch?
> 
> Yes. I'm just way behind my maintainer schedule. :-/

Understood. Thank you for your reply.

By the way, are you looking for more people to help with the amount of work?

Petr T
Richard Weinberger March 20, 2024, 2:09 p.m. UTC | #5
----- Original Message -----
> From: "Petr Tesarik" <petrtesarik@huaweicloud.com>
>> Yes. I'm just way behind my maintainer schedule. :-/
> 
> Understood. Thank you for your reply.
> 
> By the way, are you looking for more people to help with the amount of work?

Yes, help is always welcome!
Johannes and Anton already do a great job, but more maintenance power is always good.
You could help with reviewing patches, testing stuff, etc. :-)
It's not that UML itself is a lot of work; it's just that $dayjob is not UML-related at all...

Thanks,
//richard
Anton Ivanov March 20, 2024, 3:14 p.m. UTC | #6
On 20/03/2024 14:09, Richard Weinberger wrote:
> ----- Original Message -----
>> From: "Petr Tesarik" <petrtesarik@huaweicloud.com>
>>> Yes. I'm just way behind my maintainer schedule. :-/
>>
>> Understood. Thank you for your reply.
>>
>> By the way, are you looking for more people to help with the amount of work?
> 
> Yes, help is always welcome!
> Johannes and Anton already do a great job, but more maintenance power is always good.
> You could help with reviewing patches, testing stuff, etc. :-)
> It's not that UML itself is a lot of work; it's just that $dayjob is not UML-related at all...

Same here.

> 
> Thanks,
> //richard
> 
>
David Gow March 21, 2024, 4:44 a.m. UTC | #7
On Fri, 23 Feb 2024 at 22:07, Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
>
> From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>
> If a segmentation fault is caused by accessing an address in the vmalloc
> area, check that the target page is present.
>
> Currently, if the kernel hits a guard page in the vmalloc area, UML blindly
> assumes that the fault is caused by a stale mapping and will be fixed by
> flush_tlb_kernel_vm(). Unsurprisingly, if the fault is caused by accessing
> a guard page, no mapping is created, and when the faulting instruction is
> restarted, it will cause exactly the same fault again, effectively creating
> an infinite loop.
>
> Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
> ---
>  arch/um/kernel/trap.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
> index 6d8ae86ae978..d5b85f1bfe33 100644
> --- a/arch/um/kernel/trap.c
> +++ b/arch/um/kernel/trap.c
> @@ -206,11 +206,15 @@ unsigned long segv(struct faultinfo fi, unsigned long ip, int is_user,
>         int err;
>         int is_write = FAULT_WRITE(fi);
>         unsigned long address = FAULT_ADDRESS(fi);
> +       pte_t *pte;
>
>         if (!is_user && regs)
>                 current->thread.segv_regs = container_of(regs, struct pt_regs, regs);
>
>         if (!is_user && (address >= start_vm) && (address < end_vm)) {
> +               pte = virt_to_pte(&init_mm, address);
> +               if (!pte_present(*pte))
> +                       page_fault_oops(regs, address, ip);

page_fault_oops() appears to be private to arch/x86/mm/fault.c, so
can't be used here?
Also, it accepts struct pt_regs*, not struct uml_pt_regs*, so would
need to at least handle the type difference here.
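
For reference, the two signatures in play look roughly like this (paraphrased
from the kernel sources of that time rather than quoted from the thread, so
treat the exact prototypes as approximate):

	/* arch/x86/mm/fault.c: file-local, hence not callable from arch/um */
	static noinline void page_fault_oops(struct pt_regs *regs,
					     unsigned long error_code,
					     unsigned long address);

	/* arch/um/kernel/trap.c: segv() works with UML's own register layout */
	unsigned long segv(struct faultinfo fi, unsigned long ip, int is_user,
			   struct uml_pt_regs *regs);

Note also that the patch invokes page_fault_oops(regs, address, ip), i.e. with
argument types and order that do not match the x86 prototype above, which is
consistent with Petr's later explanation that a UML-specific helper from an
earlier, unapplied patch was the intended callee.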

Could we equally avoid the infinite loop here by putting the
'flush_tlb_kernel_vm(); goto out;' behind an if (pte_present(...))
check, and let the rest of the UML checks panic or oops if required?
(Actually OOPSing where we can under UML would be nice to do at some
point anyway, but that is a bigger issue than just fixing a bug, IMO.)
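
A minimal sketch of what that restructuring could look like in segv() (untested
and only meant to make the suggestion concrete, not a submitted patch):

	if (!is_user && (address >= start_vm) && (address < end_vm)) {
		pte_t *pte = virt_to_pte(&init_mm, address);

		/* only a stale-but-present mapping can be fixed by a TLB flush */
		if (pte && pte_present(*pte)) {
			flush_tlb_kernel_vm();
			goto out;
		}
		/*
		 * Non-present PTE (e.g. a vmalloc guard page): fall through to
		 * the existing failure handling instead of retrying forever.
		 */
	}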

Or am I lacking a prerequisite patch or applying this to the wrong
version (or otherwise missing something), as it definitely doesn't
build here.

Cheers,
-- David
Petr Tesarik March 21, 2024, 5:30 p.m. UTC | #8
On 3/21/2024 5:44 AM, David Gow wrote:
> On Fri, 23 Feb 2024 at 22:07, Petr Tesarik <petrtesarik@huaweicloud.com> wrote:
>>
>> From: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>>
>> If a segmentation fault is caused by accessing an address in the vmalloc
>> area, check that the target page is present.
>>
>> Currently, if the kernel hits a guard page in the vmalloc area, UML blindly
>> assumes that the fault is caused by a stale mapping and will be fixed by
>> flush_tlb_kernel_vm(). Unsurprisingly, if the fault is caused by accessing
>> a guard page, no mapping is created, and when the faulting instruction is
>> restarted, it will cause exactly the same fault again, effectively creating
>> an infinite loop.
>>
>> Signed-off-by: Petr Tesarik <petr.tesarik1@huawei-partners.com>
>> ---
>>  arch/um/kernel/trap.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
>> index 6d8ae86ae978..d5b85f1bfe33 100644
>> --- a/arch/um/kernel/trap.c
>> +++ b/arch/um/kernel/trap.c
>> @@ -206,11 +206,15 @@ unsigned long segv(struct faultinfo fi, unsigned long ip, int is_user,
>>         int err;
>>         int is_write = FAULT_WRITE(fi);
>>         unsigned long address = FAULT_ADDRESS(fi);
>> +       pte_t *pte;
>>
>>         if (!is_user && regs)
>>                 current->thread.segv_regs = container_of(regs, struct pt_regs, regs);
>>
>>         if (!is_user && (address >= start_vm) && (address < end_vm)) {
>> +               pte = virt_to_pte(&init_mm, address);
>> +               if (!pte_present(*pte))
>> +                       page_fault_oops(regs, address, ip);
> 
> page_fault_oops() appears to be private to arch/x86/mm/fault.c, so
> can't be used here?
> Also, it accepts struct pt_regs*, not struct uml_pt_regs*, so would
> need to at least handle the type difference here.

Argh, you're right. This was originally a two-patch series, but Richard
wanted improvements in the implementation which would require more
effort; see here:

http://lists.infradead.org/pipermail/linux-um/2024-January/006406.html

So I wanted to fix only the infinite loop, but in the meantime I forgot
about the dependency on the first patch:

http://lists.infradead.org/pipermail/linux-um/2023-December/006380.html

That's because a quick git grep page_fault_oops found the function. It
was my mistake that I did not notice earlier that the hit was the other,
x86-private page_fault_oops().

OK, please forget about this patch for now; I must rework it.

> Could we equally avoid the infinite loop here by putting the
> 'flush_tlb_kernel_vm(); goto out;' behind an if (pte_present(...))
> check, and let the rest of the UML checks panic or oops if required?
> (Actually OOPSing where we can under UML would be nice to do at some
> point anyway, but that is a bigger issue than just fixing a bug, IMO.)

Yes, that would be the best quick fix until I get to implementing all
the bells and whistles (oops_* helpers, notification chains, tainting,
etc.).

Petr T

> Or am I lacking a prerequisite patch or applying this to the wrong
> version (or otherwise missing something), as it definitely doesn't
> build here.
> 
> Cheers,
> -- David

Patch

diff --git a/arch/um/kernel/trap.c b/arch/um/kernel/trap.c
index 6d8ae86ae978..d5b85f1bfe33 100644
--- a/arch/um/kernel/trap.c
+++ b/arch/um/kernel/trap.c
@@ -206,11 +206,15 @@  unsigned long segv(struct faultinfo fi, unsigned long ip, int is_user,
 	int err;
 	int is_write = FAULT_WRITE(fi);
 	unsigned long address = FAULT_ADDRESS(fi);
+	pte_t *pte;
 
 	if (!is_user && regs)
 		current->thread.segv_regs = container_of(regs, struct pt_regs, regs);
 
 	if (!is_user && (address >= start_vm) && (address < end_vm)) {
+		pte = virt_to_pte(&init_mm, address);
+		if (!pte_present(*pte))
+			page_fault_oops(regs, address, ip);
 		flush_tlb_kernel_vm();
 		goto out;
 	}