diff mbox series

linux-user: do setrlimit selectively

Message ID 20180904210036.7317-1-jcmvbkbc@gmail.com
State New
Headers show
Series linux-user: do setrlimit selectively | expand

Commit Message

Max Filippov Sept. 4, 2018, 9 p.m. UTC
When running 32-bit guest on 64-bit host setrlimit guest calls that
affect memory resources (RLIMIT_{AS,DATA,STACK}) don't always make sense
as is. They may result in QEMU lockup because mprotect call in
page_unprotect would fail with ENOMEM error code, causing infinite loop
of SIGSEGV. E.g. it happens when running libstdc++ testsuite for xtensa
target on x86_64 host.

Don't call host setrlimit for memory-related resources when running
32-bit guest on 64-bit host.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
---
 linux-user/syscall.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

Comments

Peter Maydell Sept. 4, 2018, 9:13 p.m. UTC | #1
On 4 September 2018 at 22:00, Max Filippov <jcmvbkbc@gmail.com> wrote:
> When running 32-bit guest on 64-bit host setrlimit guest calls that
> affect memory resources (RLIMIT_{AS,DATA,STACK}) don't always make sense
> as is. They may result in QEMU lockup because mprotect call in
> page_unprotect would fail with ENOMEM error code, causing infinite loop
> of SIGSEGV. E.g. it happens when running libstdc++ testsuite for xtensa
> target on x86_64 host.
>
> Don't call host setrlimit for memory-related resources when running
> 32-bit guest on 64-bit host.

I think the issue here is not related to 32-on-64 but to the fact
that we just pass through the memory rlimits. What we should ideally
be doing is tracking the actual guest memory allocations sufficiently
that we can then apply the rlimits at the QEMU level, so that guest
allocations that breach limits can be failed, without ever causing
QEMU's own alloactions to fail. (There's a bug in launchpad somewhere
about this -- an autoconf test for some obscure printf bug tries
to set a low rlimit and makes QEMU fall over.)

Unfortunately adding tracking of how much memory the guest has
allocated is probably non-trivial...

thanks
-- PMM
Max Filippov Sept. 4, 2018, 10:02 p.m. UTC | #2
On Tue, Sep 4, 2018 at 2:13 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 4 September 2018 at 22:00, Max Filippov <jcmvbkbc@gmail.com> wrote:
>> When running 32-bit guest on 64-bit host setrlimit guest calls that
>> affect memory resources (RLIMIT_{AS,DATA,STACK}) don't always make sense
>> as is. They may result in QEMU lockup because mprotect call in
>> page_unprotect would fail with ENOMEM error code, causing infinite loop
>> of SIGSEGV. E.g. it happens when running libstdc++ testsuite for xtensa
>> target on x86_64 host.
>>
>> Don't call host setrlimit for memory-related resources when running
>> 32-bit guest on 64-bit host.
>
> I think the issue here is not related to 32-on-64 but to the fact
> that we just pass through the memory rlimits. What we should ideally
> be doing is tracking the actual guest memory allocations sufficiently
> that we can then apply the rlimits at the QEMU level, so that guest
> allocations that breach limits can be failed, without ever causing
> QEMU's own alloactions to fail.

In a sense we do it by limiting 32-bit guest to 32 or less bits of the address
space, that's why it should be rather safe to just ignore setrlimit calls in
32-on-64 case.
Peter Maydell Sept. 4, 2018, 10:10 p.m. UTC | #3
On 4 September 2018 at 23:02, Max Filippov <jcmvbkbc@gmail.com> wrote:
> On Tue, Sep 4, 2018 at 2:13 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
>> I think the issue here is not related to 32-on-64 but to the fact
>> that we just pass through the memory rlimits. What we should ideally
>> be doing is tracking the actual guest memory allocations sufficiently
>> that we can then apply the rlimits at the QEMU level, so that guest
>> allocations that breach limits can be failed, without ever causing
>> QEMU's own alloactions to fail.
>
> In a sense we do it by limiting 32-bit guest to 32 or less bits of the address
> space, that's why it should be rather safe to just ignore setrlimit calls in
> 32-on-64 case.

I'm not sure why you think that we should treat 32-on-64 differently.
You could make a case for always ignoring setrlimit calls: if we
ever hit the limit it's as likely to be by failing a QEMU internal
allocation as a guest one, so not to imposing the limit at all
would avoid QEMU failing then. But that would apply in both the
32-on-64 and also 32-on-32 and 64-on-64 cases too.

thanks
-- PMM
Max Filippov Sept. 4, 2018, 10:26 p.m. UTC | #4
On Tue, Sep 4, 2018 at 3:10 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 4 September 2018 at 23:02, Max Filippov <jcmvbkbc@gmail.com> wrote:
>> On Tue, Sep 4, 2018 at 2:13 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
>>> I think the issue here is not related to 32-on-64 but to the fact
>>> that we just pass through the memory rlimits. What we should ideally
>>> be doing is tracking the actual guest memory allocations sufficiently
>>> that we can then apply the rlimits at the QEMU level, so that guest
>>> allocations that breach limits can be failed, without ever causing
>>> QEMU's own alloactions to fail.
>>
>> In a sense we do it by limiting 32-bit guest to 32 or less bits of the address
>> space, that's why it should be rather safe to just ignore setrlimit calls in
>> 32-on-64 case.
>
> I'm not sure why you think that we should treat 32-on-64 differently.

I'm not sure, it's just that I have a case that I'd like to fix, and
it's 32-on 64.

> You could make a case for always ignoring setrlimit calls: if we
> ever hit the limit it's as likely to be by failing a QEMU internal
> allocation as a guest one, so not to imposing the limit at all
> would avoid QEMU failing then. But that would apply in both the
> 32-on-64 and also 32-on-32 and 64-on-64 cases too.

That's what I did initially, but it feels somewhat unsafe in 64-on-64 case.
My expectation is that limits set by 64-bit guest should be somewhat
suitable for the 64-bit host, is it wrong?
Peter Maydell Sept. 4, 2018, 10:55 p.m. UTC | #5
On 4 September 2018 at 23:26, Max Filippov <jcmvbkbc@gmail.com> wrote:
> On Tue, Sep 4, 2018 at 3:10 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
>> On 4 September 2018 at 23:02, Max Filippov <jcmvbkbc@gmail.com> wrote:

>> You could make a case for always ignoring setrlimit calls: if we
>> ever hit the limit it's as likely to be by failing a QEMU internal
>> allocation as a guest one, so not to imposing the limit at all
>> would avoid QEMU failing then. But that would apply in both the
>> 32-on-64 and also 32-on-32 and 64-on-64 cases too.
>
> That's what I did initially, but it feels somewhat unsafe in 64-on-64 case.
> My expectation is that limits set by 64-bit guest should be somewhat
> suitable for the 64-bit host, is it wrong?

It doesn't matter what the limit the guest sets is -- the
problem is that if we hit it then chances are good it'll cause
a QEMU allocation to fail and then we'll deadlock. The limit
might be entirely reasonable for the guest program (which
presumably has a plan for handling the failure) but QEMU
itself can't cope with hitting the limit.

https://bugs.launchpad.net/qemu/+bug/1163034 is the bug I
mentioned in my earlier email, by the way.

thanks
-- PMM
Max Filippov Sept. 5, 2018, 12:08 a.m. UTC | #6
On Tue, Sep 4, 2018 at 3:55 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> On 4 September 2018 at 23:26, Max Filippov <jcmvbkbc@gmail.com> wrote:
>> On Tue, Sep 4, 2018 at 3:10 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
>>> On 4 September 2018 at 23:02, Max Filippov <jcmvbkbc@gmail.com> wrote:
>
>>> You could make a case for always ignoring setrlimit calls: if we
>>> ever hit the limit it's as likely to be by failing a QEMU internal
>>> allocation as a guest one, so not to imposing the limit at all
>>> would avoid QEMU failing then. But that would apply in both the
>>> 32-on-64 and also 32-on-32 and 64-on-64 cases too.
>>
>> That's what I did initially, but it feels somewhat unsafe in 64-on-64 case.
>> My expectation is that limits set by 64-bit guest should be somewhat
>> suitable for the 64-bit host, is it wrong?
>
> It doesn't matter what the limit the guest sets is -- the
> problem is that if we hit it then chances are good it'll cause
> a QEMU allocation to fail and then we'll deadlock. The limit
> might be entirely reasonable for the guest program (which
> presumably has a plan for handling the failure) but QEMU
> itself can't cope with hitting the limit.

Ok, the deadlock (and other weird behavior) is a separate problem:
QEMU doesn't check results of resource allocation functions and
then gets surprised. Errors must be handled and QEMU must be
terminated properly.

As to hitting the limit, I don't see anything wrong with it as long as
the limits are relevant. What I've seen so far is application calling
setrlimit to catch its own bugs and limit bug impact on the rest of
the system. 64-on-64 would probably set up a relevant limit,
32-on-64 -- unlikely.

> https://bugs.launchpad.net/qemu/+bug/1163034 is the bug I
> mentioned in my earlier email, by the way.

It looks like the proposed patch would fix the issue if it were
32-on-64 case without making it worse for the 64-on-64 case?
Peter Maydell Sept. 5, 2018, 1:40 a.m. UTC | #7
On 5 September 2018 at 01:08, Max Filippov <jcmvbkbc@gmail.com> wrote:
> Ok, the deadlock (and other weird behavior) is a separate problem:
> QEMU doesn't check results of resource allocation functions and
> then gets surprised. Errors must be handled and QEMU must be
> terminated properly.
>
> As to hitting the limit, I don't see anything wrong with it as long as
> the limits are relevant.

The limit is relevant *to the guest process*. It's not relevant
to QEMU's own allocations. If you're running directly under
the host kernel and you set a memory rlimit, it doesn't mean
that the kernel suddenly starts failing its own internal
allocations.

It would obviously be better if we fixed the problems of
allocations failing causing non-obvious problems (if nothing
else it would make debugging of things which triggered QEMU
asserts easier). But the guest should not be able to cause
QEMU's internal allocations to fail by setting the rlimit.

> What I've seen so far is application calling
> setrlimit to catch its own bugs and limit bug impact on the rest of
> the system. 64-on-64 would probably set up a relevant limit,
> 32-on-64 -- unlikely.

>> https://bugs.launchpad.net/qemu/+bug/1163034 is the bug I
>> mentioned in my earlier email, by the way.
>
> It looks like the proposed patch would fix the issue if it were
> 32-on-64 case without making it worse for the 64-on-64 case?

I still think 32-on-64 is a red herring here. If ignoring
client attempts to set the memory rlimits makes sense on
32-on-64 it makes sense for all configurations.

thanks
-- PMM
Max Filippov Sept. 5, 2018, 1:51 a.m. UTC | #8
On Tue, Sep 4, 2018 at 6:40 PM, Peter Maydell <peter.maydell@linaro.org> wrote:
> I still think 32-on-64 is a red herring here. If ignoring
> client attempts to set the memory rlimits makes sense on
> 32-on-64 it makes sense for all configurations.

Ok, I don't have anything against ignoring guest setrlimit completely, so
if that variant has better chances for positive review I'll gladly send it.
diff mbox series

Patch

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 850b72a0c760..693a6c8aa7bb 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -9272,7 +9272,14 @@  static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
             rlim.rlim_cur = target_to_host_rlim(target_rlim->rlim_cur);
             rlim.rlim_max = target_to_host_rlim(target_rlim->rlim_max);
             unlock_user_struct(target_rlim, arg2, 0);
-            return get_errno(setrlimit(resource, &rlim));
+            if (HOST_LONG_BITS <= TARGET_LONG_BITS ||
+                (resource != RLIMIT_DATA &&
+                 resource != RLIMIT_AS &&
+                 resource != RLIMIT_STACK)) {
+                return get_errno(setrlimit(resource, &rlim));
+            } else {
+                return 0;
+            }
         }
 #endif
 #ifdef TARGET_NR_getrlimit