Patchwork linux-user: fix wait* syscall status returns

Submitter Alexander Graf
Date Nov. 23, 2011, 8:38 p.m.
Message ID <1322080707-13527-1-git-send-email-agraf@suse.de>
Permalink /patch/127383/
State New

Comments

Alexander Graf - Nov. 23, 2011, 8:38 p.m.
When calling wait4 or waitpid with a status pointer and WNOHANG, the
syscall can potentially not modify the status pointer input. Now if we
have guest code like:

  int status = 0;
  waitpid(pid, &status, WNOHANG);
  if (status)
     <breakage>

then we have to make sure that, if status did not change, the guest still
sees its own initialized status variable instead of our uninitialized one.
We fail to do so today, as we proxy everything through an uninitialized host
status variable, which for me always ended up containing the last error code.

This patch fixes some test cases when building yast2-core in OBS for ARM.

Signed-off-by: Alexander Graf <agraf@suse.de>
---
 linux-user/syscall.c |    8 +++++++-
 1 files changed, 7 insertions(+), 1 deletions(-)
Peter Maydell - Nov. 23, 2011, 9:55 p.m.
On 23 November 2011 20:38, Alexander Graf <agraf@suse.de> wrote:
> When calling wait4 or waitpid with a status pointer and WNOHANG, the
> syscall can potentially not modify the status pointer input. Now if we
> have guest code like:
>
>  int status = 0;
>  waitpid(pid, &status, WNOHANG);
>  if (status)
>     <breakage>
>
> then we have to make sure that in case status did not change we actually
> return the guest's initialized status variable instead of our own uninitialized.
> We fail to do so today, as we proxy everything through an uninitialized status
> variable which for me ended up always containing the last error code.
>
> This patch fixes some test cases when building yast2-core in OBS for ARM.
>
> Signed-off-by: Alexander Graf <agraf@suse.de>
> ---
>  linux-user/syscall.c |    8 +++++++-
>  1 files changed, 7 insertions(+), 1 deletions(-)
>
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index 3e6f3bd..f86fe4a 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -4833,7 +4833,10 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>  #ifdef TARGET_NR_waitpid
>     case TARGET_NR_waitpid:
>         {
> -            int status;
> +            int status = 0;
> +            if (arg2) {
> +                get_user_s32(status, arg2);
> +            }
>             ret = get_errno(waitpid(arg1, &status, arg3));
>             if (!is_error(ret) && arg2
>                 && put_user_s32(host_to_target_waitstatus(status), arg2))

If the problem is that waitpid() can return success without writing to
status, then this code is still not right because we will get the
initial target waitstatus into status, and then pass it through
host_to_target_waitstatus(), possibly modifying it, before writing
it back to guest memory.

I think waitpid() will always and only write to status if the return
value is > 0 (ie it's a PID, not 0 or -1). So I think the right fix
for this problem is to have the if() protecting the put_user_s32()
read "if (ret && !is_error(ret) && ...".

(ret == 0 is of course the WNOHANG-and-no-child-yet case you are hitting.)
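
A minimal sketch of the suggested guard, outside of QEMU, may make the
intent clearer. It is not the actual syscall.c code: emulated_waitpid(),
to_target_waitstatus() and poll_running_child() are hypothetical names, a
plain int pointer stands in for guest memory (the real code goes through
put_user_s32()), and "ret > 0" is used as a simplification of
"ret && !is_error(ret)".

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for host_to_target_waitstatus(); identity here. */
static int to_target_waitstatus(int host_status)
{
    return host_status;
}

/* guest_status plays the role of the guest memory that arg2 points to. */
static pid_t emulated_waitpid(pid_t pid, int *guest_status, int options)
{
    int status = 0;
    pid_t ret = waitpid(pid, &status, options);

    /* Write back only when the host actually reported a child (ret > 0).
     * For ret == 0 (WNOHANG, nothing ready) and ret == -1 (error) the
     * guest's variable is left exactly as the guest initialized it. */
    if (ret > 0 && guest_status) {
        *guest_status = to_target_waitstatus(status);
    }
    return ret;
}

/* Polls a child that is still running and returns the guest-visible
 * status afterwards, or -1 if the setup did not behave as assumed. */
static int poll_running_child(int initial_guest_status)
{
    int guest_status = initial_guest_status;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        pause();            /* child: block until it is killed */
        _exit(0);
    }

    pid_t ret = emulated_waitpid(pid, &guest_status, WNOHANG);

    kill(pid, SIGKILL);     /* clean up the child again */
    waitpid(pid, NULL, 0);

    return ret == 0 ? guest_status : -1;
}
```

With this guard, poll_running_child(42) should hand back 42, i.e. whatever
the guest put into its status variable before the call.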

> @@ -6389,6 +6392,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>                 rusage_ptr = &rusage;
>             else
>                 rusage_ptr = NULL;
> +            if (status_ptr) {
> +                get_user_s32(status, status_ptr);
> +            }
>             ret = get_errno(wait4(arg1, &status, arg3, rusage_ptr));
>             if (!is_error(ret)) {
>                 if (status_ptr) {

...and similarly here.

-- PMM
Alexander Graf - Nov. 23, 2011, 10:03 p.m.
On 23.11.2011, at 22:55, Peter Maydell <peter.maydell@linaro.org> wrote:

> On 23 November 2011 20:38, Alexander Graf <agraf@suse.de> wrote:
>> When calling wait4 or waitpid with a status pointer and WNOHANG, the
>> syscall can potentially not modify the status pointer input. Now if we
>> have guest code like:
>> 
>>  int status = 0;
>>  waitpid(pid, &status, WNOHANG);
>>  if (status)
>>     <breakage>
>> 
>> then we have to make sure that in case status did not change we actually
>> return the guest's initialized status variable instead of our own uninitialized.
>> We fail to do so today, as we proxy everything through an uninitialized status
>> variable which for me ended up always containing the last error code.
>> 
>> This patch fixes some test cases when building yast2-core in OBS for ARM.
>> 
>> Signed-off-by: Alexander Graf <agraf@suse.de>
>> ---
>>  linux-user/syscall.c |    8 +++++++-
>>  1 files changed, 7 insertions(+), 1 deletions(-)
>> 
>> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> index 3e6f3bd..f86fe4a 100644
>> --- a/linux-user/syscall.c
>> +++ b/linux-user/syscall.c
>> @@ -4833,7 +4833,10 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>>  #ifdef TARGET_NR_waitpid
>>     case TARGET_NR_waitpid:
>>         {
>> -            int status;
>> +            int status = 0;
>> +            if (arg2) {
>> +                get_user_s32(status, arg2);
>> +            }
>>             ret = get_errno(waitpid(arg1, &status, arg3));
>>             if (!is_error(ret) && arg2
>>                 && put_user_s32(host_to_target_waitstatus(status), arg2))
> 
> If the problem is that waitpid() can return success without writing to
> status, then this code is still not right because we will get the
> initial target waitstatus into status, and then pass it through
> host_to_target_waitstatus(), possibly modifying it, before writing
> it back to guest memory.

Yes. Maybe we should add a check if input_state != output_state and only then do the conversion?

> 
> I think waitpid() will always and only write to status if the return
> value is > 0 (ie it's a PID, not 0 or -1). So I think the right fix
> for this problem is to have the if() protecting the put_user_s32()
> read "if (ret && !is_error(ret) && ...".
> 
> (ret == 0 is of course the WNOHANG-and-no-child-yet case you are hitting.)

The man page wasn't really clear here. It sounded as if you can also get 0 as return value and still have status change. That's why I jumped through this hoop in the first place :)


Alex

> 
>> @@ -6389,6 +6392,9 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>>                 rusage_ptr = &rusage;
>>             else
>>                 rusage_ptr = NULL;
>> +            if (status_ptr) {
>> +                get_user_s32(status, status_ptr);
>> +            }
>>             ret = get_errno(wait4(arg1, &status, arg3, rusage_ptr));
>>             if (!is_error(ret)) {
>>                 if (status_ptr) {
> 
> ...and similarly here.
> 
> -- PMM
>
Peter Maydell - Nov. 23, 2011, 10:34 p.m.
On 23 November 2011 22:03, Alexander Graf <agraf@suse.de> wrote:
> On 23.11.2011, at 22:55, Peter Maydell <peter.maydell@linaro.org> wrote:
>> If the problem is that waitpid() can return success without writing to
>> status, then this code is still not right because we will get the
>> initial target waitstatus into status, and then pass it through
>> host_to_target_waitstatus(), possibly modifying it, before writing
>> it back to guest memory.
>
> Yes. Maybe we should add a check if input_state != output_state and
> only then do the conversion?

I'm not sure this works (unless you go to the effort of implementing
target_to_host_waitstatus() which seems overkill to me) but I'm not
entirely sure what you're proposing to compare to what.

>> I think waitpid() will always and only write to status if the return
>> value is > 0 (ie it's a PID, not 0 or -1). So I think the right fix
>> for this problem is to have the if() protecting the put_user_s32()
>> read "if (ret && !is_error(ret) && ...".
>>
>> (ret == 0 is of course the WNOHANG-and-no-child-yet case you are hitting.)
>
> The man page wasn't really clear here. It sounded as if you can also
> get 0 as return value and still have status change. That's why I
> jumped through this hoop in the first place :)

I think POSIX is clear enough. Either
 * a child status is available, return pid and write to status
 * WNOHANG & not ready, return 0 (don't write to status)
 * -1 and set errno (don't write to status)

(strictly, whether status is written for EINTR is unpredictable, but I
think we can assume nobody relies on this and I bet the kernel folks
wouldn't worry too much about changing that between versions...)
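
The three cases can be observed on the host with a small program. This is
only a sketch under a few assumptions: a Linux host, SIGCHLD not ignored,
and made-up names (SENTINEL and the case_*() functions are illustrative,
not anything from QEMU). Each function returns 1 if the host behaved as
described in the list above.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define SENTINEL 0x5a5a   /* arbitrary marker to detect writes to status */

/* Case 1: a child status is available -> pid returned, status written. */
static int case_child_exited(void)
{
    int status = SENTINEL;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(7);
    }
    pid_t ret = waitpid(pid, &status, 0);
    return ret == pid && WIFEXITED(status) && WEXITSTATUS(status) == 7;
}

/* Case 2: WNOHANG and child not ready -> 0 returned, status untouched. */
static int case_wnohang_not_ready(void)
{
    int status = SENTINEL;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        pause();                 /* stay alive until killed */
        _exit(0);
    }
    pid_t ret = waitpid(pid, &status, WNOHANG);
    int untouched = (ret == 0 && status == SENTINEL);

    kill(pid, SIGKILL);          /* clean up */
    waitpid(pid, NULL, 0);
    return untouched;
}

/* Case 3: error -> -1 returned, errno set, status untouched. */
static int case_error(void)
{
    int status = SENTINEL;
    pid_t pid = fork();

    if (pid < 0) {
        return -1;
    }
    if (pid == 0) {
        _exit(0);
    }
    waitpid(pid, NULL, 0);       /* reap it, so pid is no longer our child */

    errno = 0;
    pid_t ret = waitpid(pid, &status, WNOHANG);
    return ret == -1 && errno == ECHILD && status == SENTINEL;
}
```

The EINTR corner case mentioned below is deliberately not covered here.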

-- PMM
Alexander Graf - Nov. 23, 2011, 11 p.m.
On 23.11.2011, at 23:34, Peter Maydell wrote:

> On 23 November 2011 22:03, Alexander Graf <agraf@suse.de> wrote:
>> On 23.11.2011, at 22:55, Peter Maydell <peter.maydell@linaro.org> wrote:
>>> If the problem is that waitpid() can return success without writing to
>>> status, then this code is still not right because we will get the
>>> initial target waitstatus into status, and then pass it through
>>> host_to_target_waitstatus(), possibly modifying it, before writing
>>> it back to guest memory.
>> 
>> Yes. Maybe we should add a check if input_state != output_state and
>> only then do the conversion?
> 
> I'm not sure this works (unless you go to the effort of implementing
> target_to_host_waitstatus() which seems overkill to me) but I'm not
> entirely sure what you're proposing to compare to what.

It's an integer. If the number is the same before and after the wait syscall, we can safely assume that it's the same thing, so we don't have to convert it, no?

> 
>>> I think waitpid() will always and only write to status if the return
>>> value is > 0 (ie it's a PID, not 0 or -1). So I think the right fix
>>> for this problem is to have the if() protecting the put_user_s32()
>>> read "if (ret && !is_error(ret) && ...".
>>> 
>>> (ret == 0 is of course the WNOHANG-and-no-child-yet case you are hitting.)
>> 
>> The man page wasn't really clear here. It sounded as if you can also
>> get 0 as return value and still have status change. That's why I
>> jumped through this hoop in the first place :)
> 
> I think POSIX is clear enough. Either
> * a child status is available, return pid and write to status
> * WNOHANG & not ready, return 0 (don't write to status)
> * -1 and set errno (don't write to status)
> 
> (strictly, whether status is written for EINTR is unpredictable, but I
> think we can assume nobody relies on this and I bet the kernel folks
> wouldn't worry too much about changing that between versions...)

Hrm, ok. I'll change it to your version then.


Alex
Peter Maydell - Nov. 23, 2011, 11:02 p.m.
On 23 November 2011 23:00, Alexander Graf <agraf@suse.de> wrote:
>
> On 23.11.2011, at 23:34, Peter Maydell wrote:
>
>> On 23 November 2011 22:03, Alexander Graf <agraf@suse.de> wrote:
>>> Yes. Maybe we should add a check if input_state != output_state and
>>> only then do the conversion?
>>
>> I'm not sure this works (unless you go to the effort of implementing
>> target_to_host_waitstatus() which seems overkill to me) but I'm not
>> entirely sure what you're proposing to compare to what.
>
> It's an integer. If the number is the same before and after the wait
> syscall, we can safely assume that it's the same thing, so we don't
> have to convert it, no?

But you don't know whether it's the same before and after because
wait() didn't write to it [=> don't write to guest memory] or if
wait() did write to it but happened to write it as the same value
it had before [=> do write to guest memory].

-- PMM
Alexander Graf - Nov. 23, 2011, 11:31 p.m.
On 24.11.2011, at 00:02, Peter Maydell wrote:

> On 23 November 2011 23:00, Alexander Graf <agraf@suse.de> wrote:
>> 
>> On 23.11.2011, at 23:34, Peter Maydell wrote:
>> 
>>> On 23 November 2011 22:03, Alexander Graf <agraf@suse.de> wrote:
>>>> Yes. Maybe we should add a check if input_state != output_state and
>>>> only then do the conversion?
>>> 
>>> I'm not sure this works (unless you go to the effort of implementing
>>> target_to_host_waitstatus() which seems overkill to me) but I'm not
>>> entirely sure what you're proposing to compare to what.
>> 
>> It's an integer. If the number is the same before and after the wait
>> syscall, we can safely assume that it's the same thing, so we don't
>> have to convert it, no?
> 
> But you don't know whether it's the same before and after because
> wait() didn't write to it [=> don't write to guest memory] or if
> wait() did write to it but happened to write it as the same value
> it had before [=> do write to guest memory].

If it was the same value before, it will still be the same value in guest memory.

  get_guest_s32(status, status_ptr);
  old_status = status;
  wait(...)
  if (old_status != status) {
    status = convert_status(status);
    put_guest_s32(status, status_ptr);
  }

If the values are identical, it's safe to assume that we don't have to convert. And the value will already be in guest memory :)


Alex
Peter Maydell - Nov. 23, 2011, 11:48 p.m.
On 23 November 2011 23:31, Alexander Graf <agraf@suse.de> wrote:
> If it was the same value before, it will still be the same value in guest memory.
>
>  get_guest_s32(status, status_ptr);
>  old_status = status;
>  wait(...)
>  if (old_status != status) {
>    status = convert_status(status);
>    put_guest_s32(status, status_ptr);
>  }

Picking some concrete numbers as an illustration; obviously
they're not really sensible status values:

Suppose guest memory contains the value 1, and that
convert_status(1) == 2. Now if you come out of wait()
and status == 1 (ie old_status == status), then either:
 (a) wait() didn't write to status => do nothing
 (b) wait() did write to status => since convert_status(1) == 2
     we need to write 2 to guest memory

For this approach to work you have to have a conversion
function from guest to host status, I think.
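
The same illustration as a small self-contained simulation. Everything in
it is made up for the example: convert_status() only implements the
"1 becomes 2" mapping from above, fake_wait() replaces the real syscall,
and guest memory is just an int passed by value.

```c
#include <stdbool.h>

/* Hypothetical host-to-guest conversion: 1 -> 2, everything else as is. */
static int convert_status(int host_status)
{
    return host_status == 1 ? 2 : host_status;
}

/* Fake wait(): writes host_value to *status only if a child was reaped.
 * Returns a made-up pid in that case, 0 otherwise (the WNOHANG case). */
static int fake_wait(int *status, bool child_reaped, int host_value)
{
    if (child_reaped) {
        *status = host_value;
        return 1234;
    }
    return 0;
}

/* Compare-before-and-after approach; returns resulting guest memory. */
static int compare_approach(int guest_mem, bool child_reaped, int host_value)
{
    int status = guest_mem;          /* preload from guest memory */
    int old_status = status;

    fake_wait(&status, child_reaped, host_value);
    if (old_status != status) {
        guest_mem = convert_status(status);
    }
    return guest_mem;
}

/* Return-value approach; returns resulting guest memory. */
static int retval_approach(int guest_mem, bool child_reaped, int host_value)
{
    int status = 0;
    int ret = fake_wait(&status, child_reaped, host_value);

    if (ret > 0) {
        guest_mem = convert_status(status);
    }
    return guest_mem;
}
```

With guest memory holding 1, case (a) is compare_approach(1, false, 0) and
leaves 1 in place, which is right. Case (b) is compare_approach(1, true, 1):
the host wrote 1, the comparison sees no change, and guest memory stays 1
although it should become 2. retval_approach(1, true, 1) yields 2.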

-- PMM

Patch

diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index 3e6f3bd..f86fe4a 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -4833,7 +4833,10 @@  abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
 #ifdef TARGET_NR_waitpid
     case TARGET_NR_waitpid:
         {
-            int status;
+            int status = 0;
+            if (arg2) {
+                get_user_s32(status, arg2);
+            }
             ret = get_errno(waitpid(arg1, &status, arg3));
             if (!is_error(ret) && arg2
                 && put_user_s32(host_to_target_waitstatus(status), arg2))
@@ -6389,6 +6392,9 @@  abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
                 rusage_ptr = &rusage;
             else
                 rusage_ptr = NULL;
+            if (status_ptr) {
+                get_user_s32(status, status_ptr);
+            }
             ret = get_errno(wait4(arg1, &status, arg3, rusage_ptr));
             if (!is_error(ret)) {
                 if (status_ptr) {