linux-user: Improve strace output for read() and getcwd()
diff mbox series

Message ID 20191121201448.GA3133@ls3530.fritz.box
State New
Headers show
Series
  • linux-user: Improve strace output for read() and getcwd()
Related show

Commit Message

Helge Deller Nov. 21, 2019, 8:14 p.m. UTC
The strace functionality in qemu-user lacks the possibility to trace
which real values get returned to pointers in userspace by syscalls.

For example, the read() and getcwd() syscalls currently only show the
destination address where the syscalls should put the return values:
2532 read(3,0xff80038c,512) = 512
2532 getcwd(0x18180,4096) = 9

With the patch below, one now can specify in print_syscall_late() which
syscalls should be executed first, before they get printed.
After adding the read() and getcwd() syscalls, we now get this output in
with strace instead:
1708 read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512
1708 getcwd("/usr/bin",4096) = 9

This patch adds just the framework with the respective implemenations for
read() and getcwd(). If applied, more functions can be added easily later.

Signed-off-by: Helge Deller <deller@gmx.de>

Comments

no-reply@patchew.org Nov. 21, 2019, 11:15 p.m. UTC | #1
Patchew URL: https://patchew.org/QEMU/20191121201448.GA3133@ls3530.fritz.box/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Subject: [PATCH] linux-user: Improve strace output for read() and getcwd()
Type: series
Message-id: 20191121201448.GA3133@ls3530.fritz.box

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/cover.1574375668.git.dirty@apple.com -> patchew/cover.1574375668.git.dirty@apple.com
Switched to a new branch 'test'
e367aa3 linux-user: Improve strace output for read() and getcwd()

=== OUTPUT BEGIN ===
WARNING: line over 80 characters
#37: FILE: linux-user/strace.c:64:
+UNUSED static void print_encoded_string(abi_long addr, unsigned int maxlen, int last);

ERROR: storage class should be at the beginning of the declaration
#37: FILE: linux-user/strace.c:64:
+UNUSED static void print_encoded_string(abi_long addr, unsigned int maxlen, int last);

total: 1 errors, 1 warnings, 137 lines checked

Commit e367aa385965 (linux-user: Improve strace output for read() and getcwd()) has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/20191121201448.GA3133@ls3530.fritz.box/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Aleksandar Markovic Nov. 23, 2019, 11:34 a.m. UTC | #2
On Thursday, November 21, 2019, Helge Deller <deller@gmx.de> wrote:

> The strace functionality in qemu-user lacks the possibility to trace
> which real values get returned to pointers in userspace by syscalls.
>
> For example, the read() and getcwd() syscalls currently only show the
> destination address where the syscalls should put the return values:
> 2532 read(3,0xff80038c,512) = 512
> 2532 getcwd(0x18180,4096) = 9
>
> With the patch below, one now can specify in print_syscall_late() which
> syscalls should be executed first, before they get printed.
> After adding the read() and getcwd() syscalls, we now get this output in
> with strace instead:
> 1708 read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> = 512
> 1708 getcwd("/usr/bin",4096) = 9
>
> This patch adds just the framework with the respective implemenations for
> read() and getcwd(). If applied, more functions can be added easily later.
>
>
Great out-of-the-box idea! However, there are some things that are not
thought over yet. There is a good reason why strace happens "early": if
something crash-like happens during syscall translation, we still have
trace of original target syscall. In case of "late" flavor, we don't have
anything. Another potentially problematic aspect is that end user certainly
should know whether the trace was done "early" or "late" - otherwise, there
will be, for certain, cases of misinterpretation / misleading / confusion
of end users.

Yours,
Aleksandar




> Signed-off-by: Helge Deller <deller@gmx.de>
>
> diff --git a/linux-user/strace.c b/linux-user/strace.c
> index de43238fa4..ff254732af 100644
> --- a/linux-user/strace.c
> +++ b/linux-user/strace.c
> @@ -61,6 +61,7 @@ UNUSED static void print_open_flags(abi_long, int);
>  UNUSED static void print_syscall_prologue(const struct syscallname *);
>  UNUSED static void print_syscall_epilogue(const struct syscallname *);
>  UNUSED static void print_string(abi_long, int);
> +UNUSED static void print_encoded_string(abi_long addr, unsigned int
> maxlen, int last);
>  UNUSED static void print_buf(abi_long addr, abi_long len, int last);
>  UNUSED static void print_raw_param(const char *, abi_long, int);
>  UNUSED static void print_timeval(abi_ulong, int);
> @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct syscallname *sc)
>      gemu_log(")");
>  }
>
> +#define MAX_ENCODED_CHARS 32
> +static void
> +print_encoded_string(abi_long addr, unsigned int maxlen, int last)
> +{
> +    unsigned int maxout;
> +    char *s, *str;
> +
> +    s = lock_user_string(addr);
> +    if (s == NULL) {
> +        /* can't get string out of it, so print it as pointer */
> +        print_pointer(addr, last);
> +        return;
> +    }
> +
> +    str = s;
> +    gemu_log("\"");
> +    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
> +    while (maxout--) {
> +        unsigned char c = *str++;
> +        if (isprint(c)) {
> +            gemu_log("%c", c);
> +        } else {
> +            gemu_log("\\%o", (unsigned int) c);
> +        }
> +    }
> +    unlock_user(s, addr, 0);
> +
> +    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "",
> +                    get_comma(last));
> +}
> +
>  static void
>  print_string(abi_long addr, int last)
>  {
> @@ -1633,6 +1665,19 @@ print_futimesat(const struct syscallname *name,
>  }
>  #endif
>
> +#ifdef TARGET_NR_getcwd
> +static void
> +print_getcwd(const struct syscallname *name,
> +    abi_long arg0, abi_long arg1, abi_long arg2,
> +    abi_long arg3, abi_long arg4, abi_long arg5)
> +{
> +    print_syscall_prologue(name);
> +    print_string(arg0, 0);
> +    print_raw_param("%u", arg1, 1);
> +    print_syscall_epilogue(name);
> +}
> +#endif
> +
>  #ifdef TARGET_NR_settimeofday
>  static void
>  print_settimeofday(const struct syscallname *name,
> @@ -2428,6 +2473,20 @@ print_fstatat64(const struct syscallname *name,
>  #define print_newfstatat    print_fstatat64
>  #endif
>
> +#ifdef TARGET_NR_read
> +static void
> +print_read(const struct syscallname *name,
> +    abi_long arg0, abi_long arg1, abi_long arg2,
> +    abi_long arg3, abi_long arg4, abi_long arg5)
> +{
> +    print_syscall_prologue(name);
> +    print_raw_param("%d", arg0, 0);
> +    print_encoded_string(arg1, arg2, 0);
> +    print_raw_param("%u", arg2, 1);
> +    print_syscall_epilogue(name);
> +}
> +#endif
> +
>  #ifdef TARGET_NR_readlink
>  static void
>  print_readlink(const struct syscallname *name,
> diff --git a/linux-user/strace.list b/linux-user/strace.list
> index d49a1e92a8..220b1f4c46 100644
> --- a/linux-user/strace.list
> +++ b/linux-user/strace.list
> @@ -272,7 +272,7 @@
>  { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
>  #endif
>  #ifdef TARGET_NR_getcwd
> -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
> +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
>  #endif
>  #ifdef TARGET_NR_getdents
>  { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
> @@ -1080,7 +1080,7 @@
>  { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
>  #endif
>  #ifdef TARGET_NR_read
> -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
> +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
>  #endif
>  #ifdef TARGET_NR_readahead
>  { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
> diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> index ce399a55f0..c0079ca2b7 100644
> --- a/linux-user/syscall.c
> +++ b/linux-user/syscall.c
> @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void *cpu_env, int
> num, abi_long arg1,
>      return ret;
>  }
>
> +/*
> + * True if this syscall should be printed after having called the native
> + * syscall, so that values which are fed back to userspace gets printed.
> + */
> +static int print_syscall_late(int syscall)
> +{
> +    switch (syscall) {
> +    case TARGET_NR_getcwd:
> +    case TARGET_NR_read:
> +        return 1;
> +    default:
> +        return 0;
> +    }
> +}
> +
>  abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>                      abi_long arg2, abi_long arg3, abi_long arg4,
>                      abi_long arg5, abi_long arg6, abi_long arg7,
> @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env, int num,
> abi_long arg1,
>                           arg2, arg3, arg4, arg5, arg6, arg7, arg8);
>
>      if (unlikely(do_strace)) {
> -        print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> +        int late_printing;
> +        late_printing = print_syscall_late(num);
> +        if (!late_printing) {
> +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> +        }
>          ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>                            arg5, arg6, arg7, arg8);
> +        if (late_printing) {
> +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> +        }
>          print_syscall_ret(num, ret);
>      } else {
>          ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>
>
Helge Deller Nov. 24, 2019, 11:54 a.m. UTC | #3
On 23.11.19 12:34, Aleksandar Markovic wrote:
> On Thursday, November 21, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de>> wrote:
>
>     The strace functionality in qemu-user lacks the possibility to trace
>     which real values get returned to pointers in userspace by syscalls.
>
>     For example, the read() and getcwd() syscalls currently only show the
>     destination address where the syscalls should put the return values:
>     2532 read(3,0xff80038c,512) = 512
>     2532 getcwd(0x18180,4096) = 9
>
>     With the patch below, one now can specify in print_syscall_late() which
>     syscalls should be executed first, before they get printed.
>     After adding the read() and getcwd() syscalls, we now get this output in
>     with strace instead:
>     1708 read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512
>     1708 getcwd("/usr/bin",4096) = 9
>
>     This patch adds just the framework with the respective implemenations for
>     read() and getcwd(). If applied, more functions can be added easily later.
>
>
> Great out-of-the-box idea! However, there are some things that are
> not thought over yet. There is a good reason why strace happens
> "early": if something crash-like happens during syscall translation,
> we still have trace of original target syscall. In case of "late"
> flavor, we don't have anything. Another potentially problematic
> aspect is that end user certainly should know whether the trace was
> done "early" or "late" - otherwise, there will be, for certain, cases
> of misinterpretation / misleading / confusion of end users.

Thanks for the feedback - I'm shortly sending a v2 version which
prints the syscall name with an open parenthesis, e.g. "read(" or "getcwd(",
before calling the syscall. That way you can see where it crashed/hangs...

Helge

>
> Yours,
> Aleksandar
>
>
>  
>
>     Signed-off-by: Helge Deller <deller@gmx.de <mailto:deller@gmx.de>>
>
>     diff --git a/linux-user/strace.c b/linux-user/strace.c
>     index de43238fa4..ff254732af 100644
>     --- a/linux-user/strace.c
>     +++ b/linux-user/strace.c
>     @@ -61,6 +61,7 @@ UNUSED static void print_open_flags(abi_long, int);
>      UNUSED static void print_syscall_prologue(const struct syscallname *);
>      UNUSED static void print_syscall_epilogue(const struct syscallname *);
>      UNUSED static void print_string(abi_long, int);
>     +UNUSED static void print_encoded_string(abi_long addr, unsigned int maxlen, int last);
>      UNUSED static void print_buf(abi_long addr, abi_long len, int last);
>      UNUSED static void print_raw_param(const char *, abi_long, int);
>      UNUSED static void print_timeval(abi_ulong, int);
>     @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct syscallname *sc)
>          gemu_log(")");
>      }
>
>     +#define MAX_ENCODED_CHARS 32
>     +static void
>     +print_encoded_string(abi_long addr, unsigned int maxlen, int last)
>     +{
>     +    unsigned int maxout;
>     +    char *s, *str;
>     +
>     +    s = lock_user_string(addr);
>     +    if (s == NULL) {
>     +        /* can't get string out of it, so print it as pointer */
>     +        print_pointer(addr, last);
>     +        return;
>     +    }
>     +
>     +    str = s;
>     +    gemu_log("\"");
>     +    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
>     +    while (maxout--) {
>     +        unsigned char c = *str++;
>     +        if (isprint(c)) {
>     +            gemu_log("%c", c);
>     +        } else {
>     +            gemu_log("\\%o", (unsigned int) c);
>     +        }
>     +    }
>     +    unlock_user(s, addr, 0);
>     +
>     +    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "",
>     +                    get_comma(last));
>     +}
>     +
>      static void
>      print_string(abi_long addr, int last)
>      {
>     @@ -1633,6 +1665,19 @@ print_futimesat(const struct syscallname *name,
>      }
>      #endif
>
>     +#ifdef TARGET_NR_getcwd
>     +static void
>     +print_getcwd(const struct syscallname *name,
>     +    abi_long arg0, abi_long arg1, abi_long arg2,
>     +    abi_long arg3, abi_long arg4, abi_long arg5)
>     +{
>     +    print_syscall_prologue(name);
>     +    print_string(arg0, 0);
>     +    print_raw_param("%u", arg1, 1);
>     +    print_syscall_epilogue(name);
>     +}
>     +#endif
>     +
>      #ifdef TARGET_NR_settimeofday
>      static void
>      print_settimeofday(const struct syscallname *name,
>     @@ -2428,6 +2473,20 @@ print_fstatat64(const struct syscallname *name,
>      #define print_newfstatat    print_fstatat64
>      #endif
>
>     +#ifdef TARGET_NR_read
>     +static void
>     +print_read(const struct syscallname *name,
>     +    abi_long arg0, abi_long arg1, abi_long arg2,
>     +    abi_long arg3, abi_long arg4, abi_long arg5)
>     +{
>     +    print_syscall_prologue(name);
>     +    print_raw_param("%d", arg0, 0);
>     +    print_encoded_string(arg1, arg2, 0);
>     +    print_raw_param("%u", arg2, 1);
>     +    print_syscall_epilogue(name);
>     +}
>     +#endif
>     +
>      #ifdef TARGET_NR_readlink
>      static void
>      print_readlink(const struct syscallname *name,
>     diff --git a/linux-user/strace.list b/linux-user/strace.list
>     index d49a1e92a8..220b1f4c46 100644
>     --- a/linux-user/strace.list
>     +++ b/linux-user/strace.list
>     @@ -272,7 +272,7 @@
>      { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
>      #endif
>      #ifdef TARGET_NR_getcwd
>     -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
>     +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
>      #endif
>      #ifdef TARGET_NR_getdents
>      { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
>     @@ -1080,7 +1080,7 @@
>      { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
>      #endif
>      #ifdef TARGET_NR_read
>     -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
>     +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
>      #endif
>      #ifdef TARGET_NR_readahead
>      { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
>     diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>     index ce399a55f0..c0079ca2b7 100644
>     --- a/linux-user/syscall.c
>     +++ b/linux-user/syscall.c
>     @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
>          return ret;
>      }
>
>     +/*
>     + * True if this syscall should be printed after having called the native
>     + * syscall, so that values which are fed back to userspace gets printed.
>     + */
>     +static int print_syscall_late(int syscall)
>     +{
>     +    switch (syscall) {
>     +    case TARGET_NR_getcwd:
>     +    case TARGET_NR_read:
>     +        return 1;
>     +    default:
>     +        return 0;
>     +    }
>     +}
>     +
>      abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>                          abi_long arg2, abi_long arg3, abi_long arg4,
>                          abi_long arg5, abi_long arg6, abi_long arg7,
>     @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>                               arg2, arg3, arg4, arg5, arg6, arg7, arg8);
>
>          if (unlikely(do_strace)) {
>     -        print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>     +        int late_printing;
>     +        late_printing = print_syscall_late(num);
>     +        if (!late_printing) {
>     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>     +        }
>              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>                                arg5, arg6, arg7, arg8);
>     +        if (late_printing) {
>     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>     +        }
>              print_syscall_ret(num, ret);
>          } else {
>              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>
Aleksandar Markovic Nov. 24, 2019, 12:10 p.m. UTC | #4
On Sunday, November 24, 2019, Helge Deller <deller@gmx.de> wrote:

> On 23.11.19 12:34, Aleksandar Markovic wrote:
> > On Thursday, November 21, 2019, Helge Deller <deller@gmx.de <mailto:
> deller@gmx.de>> wrote:
> >
> >     The strace functionality in qemu-user lacks the possibility to trace
> >     which real values get returned to pointers in userspace by syscalls.
> >
> >     For example, the read() and getcwd() syscalls currently only show the
> >     destination address where the syscalls should put the return values:
> >     2532 read(3,0xff80038c,512) = 512
> >     2532 getcwd(0x18180,4096) = 9
> >
> >     With the patch below, one now can specify in print_syscall_late()
> which
> >     syscalls should be executed first, before they get printed.
> >     After adding the read() and getcwd() syscalls, we now get this
> output in
> >     with strace instead:
> >     1708 read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> = 512
> >     1708 getcwd("/usr/bin",4096) = 9
> >
> >     This patch adds just the framework with the respective
> implemenations for
> >     read() and getcwd(). If applied, more functions can be added easily
> later.
> >
> >
> > Great out-of-the-box idea! However, there are some things that are
> > not thought over yet. There is a good reason why strace happens
> > "early": if something crash-like happens during syscall translation,
> > we still have trace of original target syscall. In case of "late"
> > flavor, we don't have anything. Another potentially problematic
> > aspect is that end user certainly should know whether the trace was
> > done "early" or "late" - otherwise, there will be, for certain, cases
> > of misinterpretation / misleading / confusion of end users.
>
> Thanks for the feedback - I'm shortly sending a v2 version which
> prints the syscall name with an open parenthesis, e.g. "read(" or
> "getcwd(",
> before calling the syscall. That way you can see where it crashed/hangs...
>
>
... but I cannot see passed arguments (for example, whether NULL is passed,
or somrthing that looks like a real address)...



> Helge
>
> >
> > Yours,
> > Aleksandar
> >
> >
> >
> >
> >     Signed-off-by: Helge Deller <deller@gmx.de <mailto:deller@gmx.de>>
> >
> >     diff --git a/linux-user/strace.c b/linux-user/strace.c
> >     index de43238fa4..ff254732af 100644
> >     --- a/linux-user/strace.c
> >     +++ b/linux-user/strace.c
> >     @@ -61,6 +61,7 @@ UNUSED static void print_open_flags(abi_long, int);
> >      UNUSED static void print_syscall_prologue(const struct syscallname
> *);
> >      UNUSED static void print_syscall_epilogue(const struct syscallname
> *);
> >      UNUSED static void print_string(abi_long, int);
> >     +UNUSED static void print_encoded_string(abi_long addr, unsigned int
> maxlen, int last);
> >      UNUSED static void print_buf(abi_long addr, abi_long len, int last);
> >      UNUSED static void print_raw_param(const char *, abi_long, int);
> >      UNUSED static void print_timeval(abi_ulong, int);
> >     @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct
> syscallname *sc)
> >          gemu_log(")");
> >      }
> >
> >     +#define MAX_ENCODED_CHARS 32
> >     +static void
> >     +print_encoded_string(abi_long addr, unsigned int maxlen, int last)
> >     +{
> >     +    unsigned int maxout;
> >     +    char *s, *str;
> >     +
> >     +    s = lock_user_string(addr);
> >     +    if (s == NULL) {
> >     +        /* can't get string out of it, so print it as pointer */
> >     +        print_pointer(addr, last);
> >     +        return;
> >     +    }
> >     +
> >     +    str = s;
> >     +    gemu_log("\"");
> >     +    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
> >     +    while (maxout--) {
> >     +        unsigned char c = *str++;
> >     +        if (isprint(c)) {
> >     +            gemu_log("%c", c);
> >     +        } else {
> >     +            gemu_log("\\%o", (unsigned int) c);
> >     +        }
> >     +    }
> >     +    unlock_user(s, addr, 0);
> >     +
> >     +    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "",
> >     +                    get_comma(last));
> >     +}
> >     +
> >      static void
> >      print_string(abi_long addr, int last)
> >      {
> >     @@ -1633,6 +1665,19 @@ print_futimesat(const struct syscallname
> *name,
> >      }
> >      #endif
> >
> >     +#ifdef TARGET_NR_getcwd
> >     +static void
> >     +print_getcwd(const struct syscallname *name,
> >     +    abi_long arg0, abi_long arg1, abi_long arg2,
> >     +    abi_long arg3, abi_long arg4, abi_long arg5)
> >     +{
> >     +    print_syscall_prologue(name);
> >     +    print_string(arg0, 0);
> >     +    print_raw_param("%u", arg1, 1);
> >     +    print_syscall_epilogue(name);
> >     +}
> >     +#endif
> >     +
> >      #ifdef TARGET_NR_settimeofday
> >      static void
> >      print_settimeofday(const struct syscallname *name,
> >     @@ -2428,6 +2473,20 @@ print_fstatat64(const struct syscallname
> *name,
> >      #define print_newfstatat    print_fstatat64
> >      #endif
> >
> >     +#ifdef TARGET_NR_read
> >     +static void
> >     +print_read(const struct syscallname *name,
> >     +    abi_long arg0, abi_long arg1, abi_long arg2,
> >     +    abi_long arg3, abi_long arg4, abi_long arg5)
> >     +{
> >     +    print_syscall_prologue(name);
> >     +    print_raw_param("%d", arg0, 0);
> >     +    print_encoded_string(arg1, arg2, 0);
> >     +    print_raw_param("%u", arg2, 1);
> >     +    print_syscall_epilogue(name);
> >     +}
> >     +#endif
> >     +
> >      #ifdef TARGET_NR_readlink
> >      static void
> >      print_readlink(const struct syscallname *name,
> >     diff --git a/linux-user/strace.list b/linux-user/strace.list
> >     index d49a1e92a8..220b1f4c46 100644
> >     --- a/linux-user/strace.list
> >     +++ b/linux-user/strace.list
> >     @@ -272,7 +272,7 @@
> >      { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
> >      #endif
> >      #ifdef TARGET_NR_getcwd
> >     -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
> >     +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
> >      #endif
> >      #ifdef TARGET_NR_getdents
> >      { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
> >     @@ -1080,7 +1080,7 @@
> >      { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
> >      #endif
> >      #ifdef TARGET_NR_read
> >     -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
> >     +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
> >      #endif
> >      #ifdef TARGET_NR_readahead
> >      { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
> >     diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> >     index ce399a55f0..c0079ca2b7 100644
> >     --- a/linux-user/syscall.c
> >     +++ b/linux-user/syscall.c
> >     @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void *cpu_env,
> int num, abi_long arg1,
> >          return ret;
> >      }
> >
> >     +/*
> >     + * True if this syscall should be printed after having called the
> native
> >     + * syscall, so that values which are fed back to userspace gets
> printed.
> >     + */
> >     +static int print_syscall_late(int syscall)
> >     +{
> >     +    switch (syscall) {
> >     +    case TARGET_NR_getcwd:
> >     +    case TARGET_NR_read:
> >     +        return 1;
> >     +    default:
> >     +        return 0;
> >     +    }
> >     +}
> >     +
> >      abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
> >                          abi_long arg2, abi_long arg3, abi_long arg4,
> >                          abi_long arg5, abi_long arg6, abi_long arg7,
> >     @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env, int num,
> abi_long arg1,
> >                               arg2, arg3, arg4, arg5, arg6, arg7, arg8);
> >
> >          if (unlikely(do_strace)) {
> >     -        print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> >     +        int late_printing;
> >     +        late_printing = print_syscall_late(num);
> >     +        if (!late_printing) {
> >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> >     +        }
> >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
> >                                arg5, arg6, arg7, arg8);
> >     +        if (late_printing) {
> >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> >     +        }
> >              print_syscall_ret(num, ret);
> >          } else {
> >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
> >
>
>
Helge Deller Nov. 24, 2019, 1:04 p.m. UTC | #5
On 24.11.19 13:10, Aleksandar Markovic wrote:
>
>
> On Sunday, November 24, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de>> wrote:
>
>     On 23.11.19 12:34, Aleksandar Markovic wrote:
>     > On Thursday, November 21, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>> wrote:
>     >
>     >     The strace functionality in qemu-user lacks the possibility to trace
>     >     which real values get returned to pointers in userspace by syscalls.
>     >
>     >     For example, the read() and getcwd() syscalls currently only show the
>     >     destination address where the syscalls should put the return values:
>     >     2532 read(3,0xff80038c,512) = 512
>     >     2532 getcwd(0x18180,4096) = 9
>     >
>     >     With the patch below, one now can specify in print_syscall_late() which
>     >     syscalls should be executed first, before they get printed.
>     >     After adding the read() and getcwd() syscalls, we now get this output in
>     >     with strace instead:
>     >     1708 read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512
>     >     1708 getcwd("/usr/bin",4096) = 9
>     >
>     >     This patch adds just the framework with the respective implemenations for
>     >     read() and getcwd(). If applied, more functions can be added easily later.
>     >
>     >
>     > Great out-of-the-box idea! However, there are some things that are
>     > not thought over yet. There is a good reason why strace happens
>     > "early": if something crash-like happens during syscall translation,
>     > we still have trace of original target syscall. In case of "late"
>     > flavor, we don't have anything. Another potentially problematic
>     > aspect is that end user certainly should know whether the trace was
>     > done "early" or "late" - otherwise, there will be, for certain, cases
>     > of misinterpretation / misleading / confusion of end users.
>
>     Thanks for the feedback - I'm shortly sending a v2 version which
>     prints the syscall name with an open parenthesis, e.g. "read(" or "getcwd(",
>     before calling the syscall. That way you can see where it crashed/hangs...
>
>
> ... but I cannot see passed arguments (for example, whether NULL is passed, or somrthing that looks like a real address)...

Right.
And you won't see that in native strace either...
If you have an idea, just let me know.

Helge

>
>  
>
>     Helge
>
>     >
>     > Yours,
>     > Aleksandar
>     >
>     >
>     >  
>     >
>     >     Signed-off-by: Helge Deller <deller@gmx.de <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>>
>     >
>     >     diff --git a/linux-user/strace.c b/linux-user/strace.c
>     >     index de43238fa4..ff254732af 100644
>     >     --- a/linux-user/strace.c
>     >     +++ b/linux-user/strace.c
>     >     @@ -61,6 +61,7 @@ UNUSED static void print_open_flags(abi_long, int);
>     >      UNUSED static void print_syscall_prologue(const struct syscallname *);
>     >      UNUSED static void print_syscall_epilogue(const struct syscallname *);
>     >      UNUSED static void print_string(abi_long, int);
>     >     +UNUSED static void print_encoded_string(abi_long addr, unsigned int maxlen, int last);
>     >      UNUSED static void print_buf(abi_long addr, abi_long len, int last);
>     >      UNUSED static void print_raw_param(const char *, abi_long, int);
>     >      UNUSED static void print_timeval(abi_ulong, int);
>     >     @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct syscallname *sc)
>     >          gemu_log(")");
>     >      }
>     >
>     >     +#define MAX_ENCODED_CHARS 32
>     >     +static void
>     >     +print_encoded_string(abi_long addr, unsigned int maxlen, int last)
>     >     +{
>     >     +    unsigned int maxout;
>     >     +    char *s, *str;
>     >     +
>     >     +    s = lock_user_string(addr);
>     >     +    if (s == NULL) {
>     >     +        /* can't get string out of it, so print it as pointer */
>     >     +        print_pointer(addr, last);
>     >     +        return;
>     >     +    }
>     >     +
>     >     +    str = s;
>     >     +    gemu_log("\"");
>     >     +    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
>     >     +    while (maxout--) {
>     >     +        unsigned char c = *str++;
>     >     +        if (isprint(c)) {
>     >     +            gemu_log("%c", c);
>     >     +        } else {
>     >     +            gemu_log("\\%o", (unsigned int) c);
>     >     +        }
>     >     +    }
>     >     +    unlock_user(s, addr, 0);
>     >     +
>     >     +    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "",
>     >     +                    get_comma(last));
>     >     +}
>     >     +
>     >      static void
>     >      print_string(abi_long addr, int last)
>     >      {
>     >     @@ -1633,6 +1665,19 @@ print_futimesat(const struct syscallname *name,
>     >      }
>     >      #endif
>     >
>     >     +#ifdef TARGET_NR_getcwd
>     >     +static void
>     >     +print_getcwd(const struct syscallname *name,
>     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
>     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
>     >     +{
>     >     +    print_syscall_prologue(name);
>     >     +    print_string(arg0, 0);
>     >     +    print_raw_param("%u", arg1, 1);
>     >     +    print_syscall_epilogue(name);
>     >     +}
>     >     +#endif
>     >     +
>     >      #ifdef TARGET_NR_settimeofday
>     >      static void
>     >      print_settimeofday(const struct syscallname *name,
>     >     @@ -2428,6 +2473,20 @@ print_fstatat64(const struct syscallname *name,
>     >      #define print_newfstatat    print_fstatat64
>     >      #endif
>     >
>     >     +#ifdef TARGET_NR_read
>     >     +static void
>     >     +print_read(const struct syscallname *name,
>     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
>     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
>     >     +{
>     >     +    print_syscall_prologue(name);
>     >     +    print_raw_param("%d", arg0, 0);
>     >     +    print_encoded_string(arg1, arg2, 0);
>     >     +    print_raw_param("%u", arg2, 1);
>     >     +    print_syscall_epilogue(name);
>     >     +}
>     >     +#endif
>     >     +
>     >      #ifdef TARGET_NR_readlink
>     >      static void
>     >      print_readlink(const struct syscallname *name,
>     >     diff --git a/linux-user/strace.list b/linux-user/strace.list
>     >     index d49a1e92a8..220b1f4c46 100644
>     >     --- a/linux-user/strace.list
>     >     +++ b/linux-user/strace.list
>     >     @@ -272,7 +272,7 @@
>     >      { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
>     >      #endif
>     >      #ifdef TARGET_NR_getcwd
>     >     -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
>     >     +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
>     >      #endif
>     >      #ifdef TARGET_NR_getdents
>     >      { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
>     >     @@ -1080,7 +1080,7 @@
>     >      { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
>     >      #endif
>     >      #ifdef TARGET_NR_read
>     >     -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
>     >     +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
>     >      #endif
>     >      #ifdef TARGET_NR_readahead
>     >      { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
>     >     diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>     >     index ce399a55f0..c0079ca2b7 100644
>     >     --- a/linux-user/syscall.c
>     >     +++ b/linux-user/syscall.c
>     >     @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
>     >          return ret;
>     >      }
>     >
>     >     +/*
>     >     + * True if this syscall should be printed after having called the native
>     >     + * syscall, so that values which are fed back to userspace gets printed.
>     >     + */
>     >     +static int print_syscall_late(int syscall)
>     >     +{
>     >     +    switch (syscall) {
>     >     +    case TARGET_NR_getcwd:
>     >     +    case TARGET_NR_read:
>     >     +        return 1;
>     >     +    default:
>     >     +        return 0;
>     >     +    }
>     >     +}
>     >     +
>     >      abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>     >                          abi_long arg2, abi_long arg3, abi_long arg4,
>     >                          abi_long arg5, abi_long arg6, abi_long arg7,
>     >     @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>     >                               arg2, arg3, arg4, arg5, arg6, arg7, arg8);
>     >
>     >          if (unlikely(do_strace)) {
>     >     -        print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>     >     +        int late_printing;
>     >     +        late_printing = print_syscall_late(num);
>     >     +        if (!late_printing) {
>     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>     >     +        }
>     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>     >                                arg5, arg6, arg7, arg8);
>     >     +        if (late_printing) {
>     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>     >     +        }
>     >              print_syscall_ret(num, ret);
>     >          } else {
>     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>     >
>
Aleksandar Markovic Nov. 24, 2019, 1:31 p.m. UTC | #6
On Sunday, November 24, 2019, Helge Deller <deller@gmx.de> wrote:

> On 24.11.19 13:10, Aleksandar Markovic wrote:
> >
> >
> > On Sunday, November 24, 2019, Helge Deller <deller@gmx.de <mailto:
> deller@gmx.de>> wrote:
> >
> >     On 23.11.19 12:34, Aleksandar Markovic wrote:
> >     > On Thursday, November 21, 2019, Helge Deller <deller@gmx.de
> <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>>
> wrote:
> >     >
> >     >     The strace functionality in qemu-user lacks the possibility to
> trace
> >     >     which real values get returned to pointers in userspace by
> syscalls.
> >     >
> >     >     For example, the read() and getcwd() syscalls currently only
> show the
> >     >     destination address where the syscalls should put the return
> values:
> >     >     2532 read(3,0xff80038c,512) = 512
> >     >     2532 getcwd(0x18180,4096) = 9
> >     >
> >     >     With the patch below, one now can specify in
> print_syscall_late() which
> >     >     syscalls should be executed first, before they get printed.
> >     >     After adding the read() and getcwd() syscalls, we now get this
> output in
> >     >     with strace instead:
> >     >     1708 read(3,"\177ELF\1\2\1\3\0\0\0\
> 0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512
> >     >     1708 getcwd("/usr/bin",4096) = 9
> >     >
> >     >     This patch adds just the framework with the respective
> implemenations for
> >     >     read() and getcwd(). If applied, more functions can be added
> easily later.
> >     >
> >     >
> >     > Great out-of-the-box idea! However, there are some things that are
> >     > not thought over yet. There is a good reason why strace happens
> >     > "early": if something crash-like happens during syscall
> translation,
> >     > we still have trace of original target syscall. In case of "late"
> >     > flavor, we don't have anything. Another potentially problematic
> >     > aspect is that end user certainly should know whether the trace was
> >     > done "early" or "late" - otherwise, there will be, for certain,
> cases
> >     > of misinterpretation / misleading / confusion of end users.
> >
> >     Thanks for the feedback - I'm shortly sending a v2 version which
> >     prints the syscall name with an open parenthesis, e.g. "read(" or
> "getcwd(",
> >     before calling the syscall. That way you can see where it
> crashed/hangs...
> >
> >
> > ... but I cannot see passed arguments (for example, whether NULL is
> passed, or somrthing that looks like a real address)...
>
> Right.
> And you won't see that in native strace either...
> If you have an idea, just let me know.
>
>
Helge, I do have an idea how to amend your proposal that should give
us, believe it or not, "best of both worlds", but let's wait few days, I
don't have any dev setup ar hand at the moment to check it in action, and
on top of that I am swamped with other unrelated tasks. In the meantime,
maybe other guys will say something too, or perhaps you would have
something new as well.

Still, by no means, I see this as a (potential) *huge* improvement of QEMU
strace. Thanks!

Yours,
Aleksandar



> Helge
>
> >
> >
> >
> >     Helge
> >
> >     >
> >     > Yours,
> >     > Aleksandar
> >     >
> >     >
> >     >
> >     >
> >     >     Signed-off-by: Helge Deller <deller@gmx.de <mailto:
> deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>>
> >     >
> >     >     diff --git a/linux-user/strace.c b/linux-user/strace.c
> >     >     index de43238fa4..ff254732af 100644
> >     >     --- a/linux-user/strace.c
> >     >     +++ b/linux-user/strace.c
> >     >     @@ -61,6 +61,7 @@ UNUSED static void
> print_open_flags(abi_long, int);
> >     >      UNUSED static void print_syscall_prologue(const struct
> syscallname *);
> >     >      UNUSED static void print_syscall_epilogue(const struct
> syscallname *);
> >     >      UNUSED static void print_string(abi_long, int);
> >     >     +UNUSED static void print_encoded_string(abi_long addr,
> unsigned int maxlen, int last);
> >     >      UNUSED static void print_buf(abi_long addr, abi_long len, int
> last);
> >     >      UNUSED static void print_raw_param(const char *, abi_long,
> int);
> >     >      UNUSED static void print_timeval(abi_ulong, int);
> >     >     @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct
> syscallname *sc)
> >     >          gemu_log(")");
> >     >      }
> >     >
> >     >     +#define MAX_ENCODED_CHARS 32
> >     >     +static void
> >     >     +print_encoded_string(abi_long addr, unsigned int maxlen, int
> last)
> >     >     +{
> >     >     +    unsigned int maxout;
> >     >     +    char *s, *str;
> >     >     +
> >     >     +    s = lock_user_string(addr);
> >     >     +    if (s == NULL) {
> >     >     +        /* can't get string out of it, so print it as pointer
> */
> >     >     +        print_pointer(addr, last);
> >     >     +        return;
> >     >     +    }
> >     >     +
> >     >     +    str = s;
> >     >     +    gemu_log("\"");
> >     >     +    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
> >     >     +    while (maxout--) {
> >     >     +        unsigned char c = *str++;
> >     >     +        if (isprint(c)) {
> >     >     +            gemu_log("%c", c);
> >     >     +        } else {
> >     >     +            gemu_log("\\%o", (unsigned int) c);
> >     >     +        }
> >     >     +    }
> >     >     +    unlock_user(s, addr, 0);
> >     >     +
> >     >     +    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." :
> "",
> >     >     +                    get_comma(last));
> >     >     +}
> >     >     +
> >     >      static void
> >     >      print_string(abi_long addr, int last)
> >     >      {
> >     >     @@ -1633,6 +1665,19 @@ print_futimesat(const struct
> syscallname *name,
> >     >      }
> >     >      #endif
> >     >
> >     >     +#ifdef TARGET_NR_getcwd
> >     >     +static void
> >     >     +print_getcwd(const struct syscallname *name,
> >     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
> >     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
> >     >     +{
> >     >     +    print_syscall_prologue(name);
> >     >     +    print_string(arg0, 0);
> >     >     +    print_raw_param("%u", arg1, 1);
> >     >     +    print_syscall_epilogue(name);
> >     >     +}
> >     >     +#endif
> >     >     +
> >     >      #ifdef TARGET_NR_settimeofday
> >     >      static void
> >     >      print_settimeofday(const struct syscallname *name,
> >     >     @@ -2428,6 +2473,20 @@ print_fstatat64(const struct
> syscallname *name,
> >     >      #define print_newfstatat    print_fstatat64
> >     >      #endif
> >     >
> >     >     +#ifdef TARGET_NR_read
> >     >     +static void
> >     >     +print_read(const struct syscallname *name,
> >     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
> >     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
> >     >     +{
> >     >     +    print_syscall_prologue(name);
> >     >     +    print_raw_param("%d", arg0, 0);
> >     >     +    print_encoded_string(arg1, arg2, 0);
> >     >     +    print_raw_param("%u", arg2, 1);
> >     >     +    print_syscall_epilogue(name);
> >     >     +}
> >     >     +#endif
> >     >     +
> >     >      #ifdef TARGET_NR_readlink
> >     >      static void
> >     >      print_readlink(const struct syscallname *name,
> >     >     diff --git a/linux-user/strace.list b/linux-user/strace.list
> >     >     index d49a1e92a8..220b1f4c46 100644
> >     >     --- a/linux-user/strace.list
> >     >     +++ b/linux-user/strace.list
> >     >     @@ -272,7 +272,7 @@
> >     >      { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
> >     >      #endif
> >     >      #ifdef TARGET_NR_getcwd
> >     >     -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
> >     >     +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
> >     >      #endif
> >     >      #ifdef TARGET_NR_getdents
> >     >      { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
> >     >     @@ -1080,7 +1080,7 @@
> >     >      { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
> >     >      #endif
> >     >      #ifdef TARGET_NR_read
> >     >     -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
> >     >     +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
> >     >      #endif
> >     >      #ifdef TARGET_NR_readahead
> >     >      { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
> >     >     diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> >     >     index ce399a55f0..c0079ca2b7 100644
> >     >     --- a/linux-user/syscall.c
> >     >     +++ b/linux-user/syscall.c
> >     >     @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void
> *cpu_env, int num, abi_long arg1,
> >     >          return ret;
> >     >      }
> >     >
> >     >     +/*
> >     >     + * True if this syscall should be printed after having called
> the native
> >     >     + * syscall, so that values which are fed back to userspace
> gets printed.
> >     >     + */
> >     >     +static int print_syscall_late(int syscall)
> >     >     +{
> >     >     +    switch (syscall) {
> >     >     +    case TARGET_NR_getcwd:
> >     >     +    case TARGET_NR_read:
> >     >     +        return 1;
> >     >     +    default:
> >     >     +        return 0;
> >     >     +    }
> >     >     +}
> >     >     +
> >     >      abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
> >     >                          abi_long arg2, abi_long arg3, abi_long
> arg4,
> >     >                          abi_long arg5, abi_long arg6, abi_long
> arg7,
> >     >     @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env,
> int num, abi_long arg1,
> >     >                               arg2, arg3, arg4, arg5, arg6, arg7,
> arg8);
> >     >
> >     >          if (unlikely(do_strace)) {
> >     >     -        print_syscall(num, arg1, arg2, arg3, arg4, arg5,
> arg6);
> >     >     +        int late_printing;
> >     >     +        late_printing = print_syscall_late(num);
> >     >     +        if (!late_printing) {
> >     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5,
> arg6);
> >     >     +        }
> >     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3,
> arg4,
> >     >                                arg5, arg6, arg7, arg8);
> >     >     +        if (late_printing) {
> >     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5,
> arg6);
> >     >     +        }
> >     >              print_syscall_ret(num, ret);
> >     >          } else {
> >     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3,
> arg4,
> >     >
> >
>
>
Aleksandar Markovic Nov. 25, 2019, 1:46 p.m. UTC | #7
On Sun, Nov 24, 2019 at 2:31 PM Aleksandar Markovic
<aleksandar.m.mail@gmail.com> wrote:
>
>
>
> On Sunday, November 24, 2019, Helge Deller <deller@gmx.de> wrote:
>>
>> On 24.11.19 13:10, Aleksandar Markovic wrote:
>> >
>> >
>> > On Sunday, November 24, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de>> wrote:
>> >
>> >     On 23.11.19 12:34, Aleksandar Markovic wrote:
>> >     > On Thursday, November 21, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>> wrote:
>> >     >
>> >     >     The strace functionality in qemu-user lacks the possibility to trace
>> >     >     which real values get returned to pointers in userspace by syscalls.
>> >     >
>> >     >     For example, the read() and getcwd() syscalls currently only show the
>> >     >     destination address where the syscalls should put the return values:
>> >     >     2532 read(3,0xff80038c,512) = 512
>> >     >     2532 getcwd(0x18180,4096) = 9
>> >     >
>> >     >     With the patch below, one now can specify in print_syscall_late() which
>> >     >     syscalls should be executed first, before they get printed.
>> >     >     After adding the read() and getcwd() syscalls, we now get this output in
>> >     >     with strace instead:
>> >     >     1708 read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512
>> >     >     1708 getcwd("/usr/bin",4096) = 9
>> >     >
>> >     >     This patch adds just the framework with the respective implemenations for
>> >     >     read() and getcwd(). If applied, more functions can be added easily later.
>> >     >
>> >     >
>> >     > Great out-of-the-box idea! However, there are some things that are
>> >     > not thought over yet. There is a good reason why strace happens
>> >     > "early": if something crash-like happens during syscall translation,
>> >     > we still have trace of original target syscall. In case of "late"
>> >     > flavor, we don't have anything. Another potentially problematic
>> >     > aspect is that end user certainly should know whether the trace was
>> >     > done "early" or "late" - otherwise, there will be, for certain, cases
>> >     > of misinterpretation / misleading / confusion of end users.
>> >
>> >     Thanks for the feedback - I'm shortly sending a v2 version which
>> >     prints the syscall name with an open parenthesis, e.g. "read(" or "getcwd(",
>> >     before calling the syscall. That way you can see where it crashed/hangs...
>> >
>> >
>> > ... but I cannot see passed arguments (for example, whether NULL is passed, or somrthing that looks like a real address)...
>>
>> Right.
>> And you won't see that in native strace either...
>> If you have an idea, just let me know.
>>
>
> Helge, I do have an idea how to amend your proposal that should give us, believe it or not, "best of both worlds", but let's wait few days, I don't have any dev setup ar hand at the moment to check it in action, and on top of that I am swamped with other unrelated tasks. In the meantime, maybe other guys will say something too, or perhaps you would have something new as well.
>
> Still, by no means, I see this as a (potential) *huge* improvement of QEMU strace. Thanks!
>

Helge, Laurent,

I don't have time to experiment, but I will write down here my
suggestion to extending Helge's system in a textual form, based on
example from prevous Helge's patches:

1. CURRENT STATE:

early: read(3,0xff80038c,512)
late: = 512

early: getcwd(0x18180,4096)
late: = 9

2. HELGE'S PROPOSAL 1:

late: read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
= 512
late: getcwd("/usr/bin",4096) = 9

3. HELGE'S PROPOSAL 2:

early: read
late: (3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
= 512
early: getcwd
late: ("/usr/bin",4096) = 9

4. NEW PROPOSAL :

early: read(3,0xff80038c,512)
late: = 512  [(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)]
early: getcwd(0x18180,4096)
late: = 9  [("/usr/bin",4096)]

In other words, we would print for (selected) system calls content of
the parameters both before and after actual ioctl
translation/execution.

Sincerely,
Aleksandar

> Yours,
> Aleksandar
>
>
>>
>> Helge
>>
>> >
>> >
>> >
>> >     Helge
>> >
>> >     >
>> >     > Yours,
>> >     > Aleksandar
>> >     >
>> >     >
>> >     >
>> >     >
>> >     >     Signed-off-by: Helge Deller <deller@gmx.de <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>>
>> >     >
>> >     >     diff --git a/linux-user/strace.c b/linux-user/strace.c
>> >     >     index de43238fa4..ff254732af 100644
>> >     >     --- a/linux-user/strace.c
>> >     >     +++ b/linux-user/strace.c
>> >     >     @@ -61,6 +61,7 @@ UNUSED static void print_open_flags(abi_long, int);
>> >     >      UNUSED static void print_syscall_prologue(const struct syscallname *);
>> >     >      UNUSED static void print_syscall_epilogue(const struct syscallname *);
>> >     >      UNUSED static void print_string(abi_long, int);
>> >     >     +UNUSED static void print_encoded_string(abi_long addr, unsigned int maxlen, int last);
>> >     >      UNUSED static void print_buf(abi_long addr, abi_long len, int last);
>> >     >      UNUSED static void print_raw_param(const char *, abi_long, int);
>> >     >      UNUSED static void print_timeval(abi_ulong, int);
>> >     >     @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct syscallname *sc)
>> >     >          gemu_log(")");
>> >     >      }
>> >     >
>> >     >     +#define MAX_ENCODED_CHARS 32
>> >     >     +static void
>> >     >     +print_encoded_string(abi_long addr, unsigned int maxlen, int last)
>> >     >     +{
>> >     >     +    unsigned int maxout;
>> >     >     +    char *s, *str;
>> >     >     +
>> >     >     +    s = lock_user_string(addr);
>> >     >     +    if (s == NULL) {
>> >     >     +        /* can't get string out of it, so print it as pointer */
>> >     >     +        print_pointer(addr, last);
>> >     >     +        return;
>> >     >     +    }
>> >     >     +
>> >     >     +    str = s;
>> >     >     +    gemu_log("\"");
>> >     >     +    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
>> >     >     +    while (maxout--) {
>> >     >     +        unsigned char c = *str++;
>> >     >     +        if (isprint(c)) {
>> >     >     +            gemu_log("%c", c);
>> >     >     +        } else {
>> >     >     +            gemu_log("\\%o", (unsigned int) c);
>> >     >     +        }
>> >     >     +    }
>> >     >     +    unlock_user(s, addr, 0);
>> >     >     +
>> >     >     +    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "",
>> >     >     +                    get_comma(last));
>> >     >     +}
>> >     >     +
>> >     >      static void
>> >     >      print_string(abi_long addr, int last)
>> >     >      {
>> >     >     @@ -1633,6 +1665,19 @@ print_futimesat(const struct syscallname *name,
>> >     >      }
>> >     >      #endif
>> >     >
>> >     >     +#ifdef TARGET_NR_getcwd
>> >     >     +static void
>> >     >     +print_getcwd(const struct syscallname *name,
>> >     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
>> >     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
>> >     >     +{
>> >     >     +    print_syscall_prologue(name);
>> >     >     +    print_string(arg0, 0);
>> >     >     +    print_raw_param("%u", arg1, 1);
>> >     >     +    print_syscall_epilogue(name);
>> >     >     +}
>> >     >     +#endif
>> >     >     +
>> >     >      #ifdef TARGET_NR_settimeofday
>> >     >      static void
>> >     >      print_settimeofday(const struct syscallname *name,
>> >     >     @@ -2428,6 +2473,20 @@ print_fstatat64(const struct syscallname *name,
>> >     >      #define print_newfstatat    print_fstatat64
>> >     >      #endif
>> >     >
>> >     >     +#ifdef TARGET_NR_read
>> >     >     +static void
>> >     >     +print_read(const struct syscallname *name,
>> >     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
>> >     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
>> >     >     +{
>> >     >     +    print_syscall_prologue(name);
>> >     >     +    print_raw_param("%d", arg0, 0);
>> >     >     +    print_encoded_string(arg1, arg2, 0);
>> >     >     +    print_raw_param("%u", arg2, 1);
>> >     >     +    print_syscall_epilogue(name);
>> >     >     +}
>> >     >     +#endif
>> >     >     +
>> >     >      #ifdef TARGET_NR_readlink
>> >     >      static void
>> >     >      print_readlink(const struct syscallname *name,
>> >     >     diff --git a/linux-user/strace.list b/linux-user/strace.list
>> >     >     index d49a1e92a8..220b1f4c46 100644
>> >     >     --- a/linux-user/strace.list
>> >     >     +++ b/linux-user/strace.list
>> >     >     @@ -272,7 +272,7 @@
>> >     >      { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
>> >     >      #endif
>> >     >      #ifdef TARGET_NR_getcwd
>> >     >     -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
>> >     >     +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
>> >     >      #endif
>> >     >      #ifdef TARGET_NR_getdents
>> >     >      { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
>> >     >     @@ -1080,7 +1080,7 @@
>> >     >      { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
>> >     >      #endif
>> >     >      #ifdef TARGET_NR_read
>> >     >     -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
>> >     >     +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
>> >     >      #endif
>> >     >      #ifdef TARGET_NR_readahead
>> >     >      { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
>> >     >     diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>> >     >     index ce399a55f0..c0079ca2b7 100644
>> >     >     --- a/linux-user/syscall.c
>> >     >     +++ b/linux-user/syscall.c
>> >     >     @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
>> >     >          return ret;
>> >     >      }
>> >     >
>> >     >     +/*
>> >     >     + * True if this syscall should be printed after having called the native
>> >     >     + * syscall, so that values which are fed back to userspace gets printed.
>> >     >     + */
>> >     >     +static int print_syscall_late(int syscall)
>> >     >     +{
>> >     >     +    switch (syscall) {
>> >     >     +    case TARGET_NR_getcwd:
>> >     >     +    case TARGET_NR_read:
>> >     >     +        return 1;
>> >     >     +    default:
>> >     >     +        return 0;
>> >     >     +    }
>> >     >     +}
>> >     >     +
>> >     >      abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>> >     >                          abi_long arg2, abi_long arg3, abi_long arg4,
>> >     >                          abi_long arg5, abi_long arg6, abi_long arg7,
>> >     >     @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>> >     >                               arg2, arg3, arg4, arg5, arg6, arg7, arg8);
>> >     >
>> >     >          if (unlikely(do_strace)) {
>> >     >     -        print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>> >     >     +        int late_printing;
>> >     >     +        late_printing = print_syscall_late(num);
>> >     >     +        if (!late_printing) {
>> >     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>> >     >     +        }
>> >     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>> >     >                                arg5, arg6, arg7, arg8);
>> >     >     +        if (late_printing) {
>> >     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>> >     >     +        }
>> >     >              print_syscall_ret(num, ret);
>> >     >          } else {
>> >     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>> >     >
>> >
>>
Helge Deller Nov. 25, 2019, 8:40 p.m. UTC | #8
On 25.11.19 14:46, Aleksandar Markovic wrote:
> On Sun, Nov 24, 2019 at 2:31 PM Aleksandar Markovic
> <aleksandar.m.mail@gmail.com> wrote:
>>
>>
>>
>> On Sunday, November 24, 2019, Helge Deller <deller@gmx.de> wrote:
>>>
>>> On 24.11.19 13:10, Aleksandar Markovic wrote:
>>>>
>>>>
>>>> On Sunday, November 24, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de>> wrote:
>>>>
>>>>     On 23.11.19 12:34, Aleksandar Markovic wrote:
>>>>     > On Thursday, November 21, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>> wrote:
>>>>     >
>>>>     >     The strace functionality in qemu-user lacks the possibility to trace
>>>>     >     which real values get returned to pointers in userspace by syscalls.
>>>>     >
>>>>     >     For example, the read() and getcwd() syscalls currently only show the
>>>>     >     destination address where the syscalls should put the return values:
>>>>     >     2532 read(3,0xff80038c,512) = 512
>>>>     >     2532 getcwd(0x18180,4096) = 9
>>>>     >
>>>>     >     With the patch below, one now can specify in print_syscall_late() which
>>>>     >     syscalls should be executed first, before they get printed.
>>>>     >     After adding the read() and getcwd() syscalls, we now get this output in
>>>>     >     with strace instead:
>>>>     >     1708 read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512
>>>>     >     1708 getcwd("/usr/bin",4096) = 9
>>>>     >
>>>>     >     This patch adds just the framework with the respective implemenations for
>>>>     >     read() and getcwd(). If applied, more functions can be added easily later.
>>>>     >
>>>>     >
>>>>     > Great out-of-the-box idea! However, there are some things that are
>>>>     > not thought over yet. There is a good reason why strace happens
>>>>     > "early": if something crash-like happens during syscall translation,
>>>>     > we still have trace of original target syscall. In case of "late"
>>>>     > flavor, we don't have anything. Another potentially problematic
>>>>     > aspect is that end user certainly should know whether the trace was
>>>>     > done "early" or "late" - otherwise, there will be, for certain, cases
>>>>     > of misinterpretation / misleading / confusion of end users.
>>>>
>>>>     Thanks for the feedback - I'm shortly sending a v2 version which
>>>>     prints the syscall name with an open parenthesis, e.g. "read(" or "getcwd(",
>>>>     before calling the syscall. That way you can see where it crashed/hangs...
>>>>
>>>>
>>>> ... but I cannot see passed arguments (for example, whether NULL is passed, or somrthing that looks like a real address)...
>>>
>>> Right.
>>> And you won't see that in native strace either...
>>> If you have an idea, just let me know.
>>>
>>
>> Helge, I do have an idea how to amend your proposal that should give us, believe it or not, "best of both worlds", but let's wait few days, I don't have any dev setup ar hand at the moment to check it in action, and on top of that I am swamped with other unrelated tasks. In the meantime, maybe other guys will say something too, or perhaps you would have something new as well.
>>
>> Still, by no means, I see this as a (potential) *huge* improvement of QEMU strace. Thanks!
>>
>
> Helge, Laurent,
>
> I don't have time to experiment, but I will write down here my
> suggestion to extending Helge's system in a textual form, based on
> example from prevous Helge's patches:
>
> 1. CURRENT STATE:
>
> early: read(3,0xff80038c,512)
> late: = 512
>
> early: getcwd(0x18180,4096)
> late: = 9
>
> 2. HELGE'S PROPOSAL 1:
>
> late: read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> = 512
> late: getcwd("/usr/bin",4096) = 9
>
> 3. HELGE'S PROPOSAL 2:
>
> early: read
> late: (3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> = 512
> early: getcwd
> late: ("/usr/bin",4096) = 9

My v2 patch actually includes the opening parenthesis in the early part:
early: read(
late: 3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512

The open parenthesis should make it visible, that the call hasn't happened yet.

> 4. NEW PROPOSAL :
>
> early: read(3,0xff80038c,512)
> late: = 512  [(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)]
> early: getcwd(0x18180,4096)
> late: = 9  [("/usr/bin",4096)]
>
> In other words, we would print for (selected) system calls content of
> the parameters both before and after actual ioctl
> translation/execution.

I don't like this.
IMHO it just adds unnecessary noise and differs from what a native strace shows.

Helge


>>>>     >
>>>>     >     Signed-off-by: Helge Deller <deller@gmx.de <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>>
>>>>     >
>>>>     >     diff --git a/linux-user/strace.c b/linux-user/strace.c
>>>>     >     index de43238fa4..ff254732af 100644
>>>>     >     --- a/linux-user/strace.c
>>>>     >     +++ b/linux-user/strace.c
>>>>     >     @@ -61,6 +61,7 @@ UNUSED static void print_open_flags(abi_long, int);
>>>>     >      UNUSED static void print_syscall_prologue(const struct syscallname *);
>>>>     >      UNUSED static void print_syscall_epilogue(const struct syscallname *);
>>>>     >      UNUSED static void print_string(abi_long, int);
>>>>     >     +UNUSED static void print_encoded_string(abi_long addr, unsigned int maxlen, int last);
>>>>     >      UNUSED static void print_buf(abi_long addr, abi_long len, int last);
>>>>     >      UNUSED static void print_raw_param(const char *, abi_long, int);
>>>>     >      UNUSED static void print_timeval(abi_ulong, int);
>>>>     >     @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct syscallname *sc)
>>>>     >          gemu_log(")");
>>>>     >      }
>>>>     >
>>>>     >     +#define MAX_ENCODED_CHARS 32
>>>>     >     +static void
>>>>     >     +print_encoded_string(abi_long addr, unsigned int maxlen, int last)
>>>>     >     +{
>>>>     >     +    unsigned int maxout;
>>>>     >     +    char *s, *str;
>>>>     >     +
>>>>     >     +    s = lock_user_string(addr);
>>>>     >     +    if (s == NULL) {
>>>>     >     +        /* can't get string out of it, so print it as pointer */
>>>>     >     +        print_pointer(addr, last);
>>>>     >     +        return;
>>>>     >     +    }
>>>>     >     +
>>>>     >     +    str = s;
>>>>     >     +    gemu_log("\"");
>>>>     >     +    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
>>>>     >     +    while (maxout--) {
>>>>     >     +        unsigned char c = *str++;
>>>>     >     +        if (isprint(c)) {
>>>>     >     +            gemu_log("%c", c);
>>>>     >     +        } else {
>>>>     >     +            gemu_log("\\%o", (unsigned int) c);
>>>>     >     +        }
>>>>     >     +    }
>>>>     >     +    unlock_user(s, addr, 0);
>>>>     >     +
>>>>     >     +    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "",
>>>>     >     +                    get_comma(last));
>>>>     >     +}
>>>>     >     +
>>>>     >      static void
>>>>     >      print_string(abi_long addr, int last)
>>>>     >      {
>>>>     >     @@ -1633,6 +1665,19 @@ print_futimesat(const struct syscallname *name,
>>>>     >      }
>>>>     >      #endif
>>>>     >
>>>>     >     +#ifdef TARGET_NR_getcwd
>>>>     >     +static void
>>>>     >     +print_getcwd(const struct syscallname *name,
>>>>     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
>>>>     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
>>>>     >     +{
>>>>     >     +    print_syscall_prologue(name);
>>>>     >     +    print_string(arg0, 0);
>>>>     >     +    print_raw_param("%u", arg1, 1);
>>>>     >     +    print_syscall_epilogue(name);
>>>>     >     +}
>>>>     >     +#endif
>>>>     >     +
>>>>     >      #ifdef TARGET_NR_settimeofday
>>>>     >      static void
>>>>     >      print_settimeofday(const struct syscallname *name,
>>>>     >     @@ -2428,6 +2473,20 @@ print_fstatat64(const struct syscallname *name,
>>>>     >      #define print_newfstatat    print_fstatat64
>>>>     >      #endif
>>>>     >
>>>>     >     +#ifdef TARGET_NR_read
>>>>     >     +static void
>>>>     >     +print_read(const struct syscallname *name,
>>>>     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
>>>>     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
>>>>     >     +{
>>>>     >     +    print_syscall_prologue(name);
>>>>     >     +    print_raw_param("%d", arg0, 0);
>>>>     >     +    print_encoded_string(arg1, arg2, 0);
>>>>     >     +    print_raw_param("%u", arg2, 1);
>>>>     >     +    print_syscall_epilogue(name);
>>>>     >     +}
>>>>     >     +#endif
>>>>     >     +
>>>>     >      #ifdef TARGET_NR_readlink
>>>>     >      static void
>>>>     >      print_readlink(const struct syscallname *name,
>>>>     >     diff --git a/linux-user/strace.list b/linux-user/strace.list
>>>>     >     index d49a1e92a8..220b1f4c46 100644
>>>>     >     --- a/linux-user/strace.list
>>>>     >     +++ b/linux-user/strace.list
>>>>     >     @@ -272,7 +272,7 @@
>>>>     >      { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
>>>>     >      #endif
>>>>     >      #ifdef TARGET_NR_getcwd
>>>>     >     -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
>>>>     >     +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
>>>>     >      #endif
>>>>     >      #ifdef TARGET_NR_getdents
>>>>     >      { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
>>>>     >     @@ -1080,7 +1080,7 @@
>>>>     >      { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
>>>>     >      #endif
>>>>     >      #ifdef TARGET_NR_read
>>>>     >     -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
>>>>     >     +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
>>>>     >      #endif
>>>>     >      #ifdef TARGET_NR_readahead
>>>>     >      { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
>>>>     >     diff --git a/linux-user/syscall.c b/linux-user/syscall.c
>>>>     >     index ce399a55f0..c0079ca2b7 100644
>>>>     >     --- a/linux-user/syscall.c
>>>>     >     +++ b/linux-user/syscall.c
>>>>     >     @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
>>>>     >          return ret;
>>>>     >      }
>>>>     >
>>>>     >     +/*
>>>>     >     + * True if this syscall should be printed after having called the native
>>>>     >     + * syscall, so that values which are fed back to userspace gets printed.
>>>>     >     + */
>>>>     >     +static int print_syscall_late(int syscall)
>>>>     >     +{
>>>>     >     +    switch (syscall) {
>>>>     >     +    case TARGET_NR_getcwd:
>>>>     >     +    case TARGET_NR_read:
>>>>     >     +        return 1;
>>>>     >     +    default:
>>>>     >     +        return 0;
>>>>     >     +    }
>>>>     >     +}
>>>>     >     +
>>>>     >      abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>>>>     >                          abi_long arg2, abi_long arg3, abi_long arg4,
>>>>     >                          abi_long arg5, abi_long arg6, abi_long arg7,
>>>>     >     @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
>>>>     >                               arg2, arg3, arg4, arg5, arg6, arg7, arg8);
>>>>     >
>>>>     >          if (unlikely(do_strace)) {
>>>>     >     -        print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>>>>     >     +        int late_printing;
>>>>     >     +        late_printing = print_syscall_late(num);
>>>>     >     +        if (!late_printing) {
>>>>     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>>>>     >     +        }
>>>>     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>>>>     >                                arg5, arg6, arg7, arg8);
>>>>     >     +        if (late_printing) {
>>>>     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
>>>>     >     +        }
>>>>     >              print_syscall_ret(num, ret);
>>>>     >          } else {
>>>>     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
>>>>     >
>>>>
>>>
Aleksandar Markovic Nov. 25, 2019, 8:54 p.m. UTC | #9
On Mon, Nov 25, 2019 at 9:40 PM Helge Deller <deller@gmx.de> wrote:
>
> On 25.11.19 14:46, Aleksandar Markovic wrote:
> > On Sun, Nov 24, 2019 at 2:31 PM Aleksandar Markovic
> > <aleksandar.m.mail@gmail.com> wrote:
> >>
> >>
> >>
> >> On Sunday, November 24, 2019, Helge Deller <deller@gmx.de> wrote:
> >>>
> >>> On 24.11.19 13:10, Aleksandar Markovic wrote:
> >>>>
> >>>>
> >>>> On Sunday, November 24, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de>> wrote:
> >>>>
> >>>>     On 23.11.19 12:34, Aleksandar Markovic wrote:
> >>>>     > On Thursday, November 21, 2019, Helge Deller <deller@gmx.de <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>> wrote:
> >>>>     >
> >>>>     >     The strace functionality in qemu-user lacks the possibility to trace
> >>>>     >     which real values get returned to pointers in userspace by syscalls.
> >>>>     >
> >>>>     >     For example, the read() and getcwd() syscalls currently only show the
> >>>>     >     destination address where the syscalls should put the return values:
> >>>>     >     2532 read(3,0xff80038c,512) = 512
> >>>>     >     2532 getcwd(0x18180,4096) = 9
> >>>>     >
> >>>>     >     With the patch below, one now can specify in print_syscall_late() which
> >>>>     >     syscalls should be executed first, before they get printed.
> >>>>     >     After adding the read() and getcwd() syscalls, we now get this output in
> >>>>     >     with strace instead:
> >>>>     >     1708 read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512
> >>>>     >     1708 getcwd("/usr/bin",4096) = 9
> >>>>     >
> >>>>     >     This patch adds just the framework with the respective implemenations for
> >>>>     >     read() and getcwd(). If applied, more functions can be added easily later.
> >>>>     >
> >>>>     >
> >>>>     > Great out-of-the-box idea! However, there are some things that are
> >>>>     > not thought over yet. There is a good reason why strace happens
> >>>>     > "early": if something crash-like happens during syscall translation,
> >>>>     > we still have trace of original target syscall. In case of "late"
> >>>>     > flavor, we don't have anything. Another potentially problematic
> >>>>     > aspect is that end user certainly should know whether the trace was
> >>>>     > done "early" or "late" - otherwise, there will be, for certain, cases
> >>>>     > of misinterpretation / misleading / confusion of end users.
> >>>>
> >>>>     Thanks for the feedback - I'm shortly sending a v2 version which
> >>>>     prints the syscall name with an open parenthesis, e.g. "read(" or "getcwd(",
> >>>>     before calling the syscall. That way you can see where it crashed/hangs...
> >>>>
> >>>>
> >>>> ... but I cannot see passed arguments (for example, whether NULL is passed, or somrthing that looks like a real address)...
> >>>
> >>> Right.
> >>> And you won't see that in native strace either...
> >>> If you have an idea, just let me know.
> >>>
> >>
> >> Helge, I do have an idea how to amend your proposal that should give us, believe it or not, "best of both worlds", but let's wait few days, I don't have any dev setup ar hand at the moment to check it in action, and on top of that I am swamped with other unrelated tasks. In the meantime, maybe other guys will say something too, or perhaps you would have something new as well.
> >>
> >> Still, by no means, I see this as a (potential) *huge* improvement of QEMU strace. Thanks!
> >>
> >
> > Helge, Laurent,
> >
> > I don't have time to experiment, but I will write down here my
> > suggestion to extending Helge's system in a textual form, based on
> > example from prevous Helge's patches:
> >
> > 1. CURRENT STATE:
> >
> > early: read(3,0xff80038c,512)
> > late: = 512
> >
> > early: getcwd(0x18180,4096)
> > late: = 9
> >
> > 2. HELGE'S PROPOSAL 1:
> >
> > late: read(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> > = 512
> > late: getcwd("/usr/bin",4096) = 9
> >
> > 3. HELGE'S PROPOSAL 2:
> >
> > early: read
> > late: (3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)
> > = 512
> > early: getcwd
> > late: ("/usr/bin",4096) = 9
>
> My v2 patch actually includes the opening parenthesis in the early part:
> early: read(
> late: 3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512) = 512
>
> The open parenthesis should make it visible, that the call hasn't happened yet.
>

Helge, with due respect, you are mixed up here. The purpose of QEMU
strace is to trace *target* syscalls, not *host* syscalls. Target
syscall read() happened.

> > 4. NEW PROPOSAL :
> >
> > early: read(3,0xff80038c,512)
> > late: = 512  [(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)]
> > early: getcwd(0x18180,4096)
> > late: = 9  [("/usr/bin",4096)]
> >
> > In other words, we would print for (selected) system calls content of
> > the parameters both before and after actual ioctl
> > translation/execution.
>
> I don't like this.
> IMHO it just adds unnecessary noise and differs from what a native strace shows.
>

QEMU strace doesn't have to be the same as native strace. Actually,
they could be used even at the same time: QEMU provides info on target
syscall, strace on host syscall. Do not forget that not all syscalls
are implemeted as one-to-one-correspondence between target and host
syscalls - combination of QEMU strace and native strace is precious at
some moments.

In any case, I think that requiring QEMU strace to look exactly like
native strace is a self-imposed limitation, that does not have enough
justification. Think of it, wouldn't it be nice to surpass strace,
like in these cases above?

Yours,
Aleksandar

> Helge
>
>
> >>>>     >
> >>>>     >     Signed-off-by: Helge Deller <deller@gmx.de <mailto:deller@gmx.de> <mailto:deller@gmx.de <mailto:deller@gmx.de>>>
> >>>>     >
> >>>>     >     diff --git a/linux-user/strace.c b/linux-user/strace.c
> >>>>     >     index de43238fa4..ff254732af 100644
> >>>>     >     --- a/linux-user/strace.c
> >>>>     >     +++ b/linux-user/strace.c
> >>>>     >     @@ -61,6 +61,7 @@ UNUSED static void print_open_flags(abi_long, int);
> >>>>     >      UNUSED static void print_syscall_prologue(const struct syscallname *);
> >>>>     >      UNUSED static void print_syscall_epilogue(const struct syscallname *);
> >>>>     >      UNUSED static void print_string(abi_long, int);
> >>>>     >     +UNUSED static void print_encoded_string(abi_long addr, unsigned int maxlen, int last);
> >>>>     >      UNUSED static void print_buf(abi_long addr, abi_long len, int last);
> >>>>     >      UNUSED static void print_raw_param(const char *, abi_long, int);
> >>>>     >      UNUSED static void print_timeval(abi_ulong, int);
> >>>>     >     @@ -1204,6 +1205,37 @@ print_syscall_epilogue(const struct syscallname *sc)
> >>>>     >          gemu_log(")");
> >>>>     >      }
> >>>>     >
> >>>>     >     +#define MAX_ENCODED_CHARS 32
> >>>>     >     +static void
> >>>>     >     +print_encoded_string(abi_long addr, unsigned int maxlen, int last)
> >>>>     >     +{
> >>>>     >     +    unsigned int maxout;
> >>>>     >     +    char *s, *str;
> >>>>     >     +
> >>>>     >     +    s = lock_user_string(addr);
> >>>>     >     +    if (s == NULL) {
> >>>>     >     +        /* can't get string out of it, so print it as pointer */
> >>>>     >     +        print_pointer(addr, last);
> >>>>     >     +        return;
> >>>>     >     +    }
> >>>>     >     +
> >>>>     >     +    str = s;
> >>>>     >     +    gemu_log("\"");
> >>>>     >     +    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
> >>>>     >     +    while (maxout--) {
> >>>>     >     +        unsigned char c = *str++;
> >>>>     >     +        if (isprint(c)) {
> >>>>     >     +            gemu_log("%c", c);
> >>>>     >     +        } else {
> >>>>     >     +            gemu_log("\\%o", (unsigned int) c);
> >>>>     >     +        }
> >>>>     >     +    }
> >>>>     >     +    unlock_user(s, addr, 0);
> >>>>     >     +
> >>>>     >     +    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "",
> >>>>     >     +                    get_comma(last));
> >>>>     >     +}
> >>>>     >     +
> >>>>     >      static void
> >>>>     >      print_string(abi_long addr, int last)
> >>>>     >      {
> >>>>     >     @@ -1633,6 +1665,19 @@ print_futimesat(const struct syscallname *name,
> >>>>     >      }
> >>>>     >      #endif
> >>>>     >
> >>>>     >     +#ifdef TARGET_NR_getcwd
> >>>>     >     +static void
> >>>>     >     +print_getcwd(const struct syscallname *name,
> >>>>     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
> >>>>     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
> >>>>     >     +{
> >>>>     >     +    print_syscall_prologue(name);
> >>>>     >     +    print_string(arg0, 0);
> >>>>     >     +    print_raw_param("%u", arg1, 1);
> >>>>     >     +    print_syscall_epilogue(name);
> >>>>     >     +}
> >>>>     >     +#endif
> >>>>     >     +
> >>>>     >      #ifdef TARGET_NR_settimeofday
> >>>>     >      static void
> >>>>     >      print_settimeofday(const struct syscallname *name,
> >>>>     >     @@ -2428,6 +2473,20 @@ print_fstatat64(const struct syscallname *name,
> >>>>     >      #define print_newfstatat    print_fstatat64
> >>>>     >      #endif
> >>>>     >
> >>>>     >     +#ifdef TARGET_NR_read
> >>>>     >     +static void
> >>>>     >     +print_read(const struct syscallname *name,
> >>>>     >     +    abi_long arg0, abi_long arg1, abi_long arg2,
> >>>>     >     +    abi_long arg3, abi_long arg4, abi_long arg5)
> >>>>     >     +{
> >>>>     >     +    print_syscall_prologue(name);
> >>>>     >     +    print_raw_param("%d", arg0, 0);
> >>>>     >     +    print_encoded_string(arg1, arg2, 0);
> >>>>     >     +    print_raw_param("%u", arg2, 1);
> >>>>     >     +    print_syscall_epilogue(name);
> >>>>     >     +}
> >>>>     >     +#endif
> >>>>     >     +
> >>>>     >      #ifdef TARGET_NR_readlink
> >>>>     >      static void
> >>>>     >      print_readlink(const struct syscallname *name,
> >>>>     >     diff --git a/linux-user/strace.list b/linux-user/strace.list
> >>>>     >     index d49a1e92a8..220b1f4c46 100644
> >>>>     >     --- a/linux-user/strace.list
> >>>>     >     +++ b/linux-user/strace.list
> >>>>     >     @@ -272,7 +272,7 @@
> >>>>     >      { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
> >>>>     >      #endif
> >>>>     >      #ifdef TARGET_NR_getcwd
> >>>>     >     -{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
> >>>>     >     +{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
> >>>>     >      #endif
> >>>>     >      #ifdef TARGET_NR_getdents
> >>>>     >      { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
> >>>>     >     @@ -1080,7 +1080,7 @@
> >>>>     >      { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
> >>>>     >      #endif
> >>>>     >      #ifdef TARGET_NR_read
> >>>>     >     -{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
> >>>>     >     +{ TARGET_NR_read, "read" , NULL, print_read, NULL },
> >>>>     >      #endif
> >>>>     >      #ifdef TARGET_NR_readahead
> >>>>     >      { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
> >>>>     >     diff --git a/linux-user/syscall.c b/linux-user/syscall.c
> >>>>     >     index ce399a55f0..c0079ca2b7 100644
> >>>>     >     --- a/linux-user/syscall.c
> >>>>     >     +++ b/linux-user/syscall.c
> >>>>     >     @@ -12069,6 +12069,21 @@ static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
> >>>>     >          return ret;
> >>>>     >      }
> >>>>     >
> >>>>     >     +/*
> >>>>     >     + * True if this syscall should be printed after having called the native
> >>>>     >     + * syscall, so that values which are fed back to userspace gets printed.
> >>>>     >     + */
> >>>>     >     +static int print_syscall_late(int syscall)
> >>>>     >     +{
> >>>>     >     +    switch (syscall) {
> >>>>     >     +    case TARGET_NR_getcwd:
> >>>>     >     +    case TARGET_NR_read:
> >>>>     >     +        return 1;
> >>>>     >     +    default:
> >>>>     >     +        return 0;
> >>>>     >     +    }
> >>>>     >     +}
> >>>>     >     +
> >>>>     >      abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
> >>>>     >                          abi_long arg2, abi_long arg3, abi_long arg4,
> >>>>     >                          abi_long arg5, abi_long arg6, abi_long arg7,
> >>>>     >     @@ -12095,9 +12110,16 @@ abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
> >>>>     >                               arg2, arg3, arg4, arg5, arg6, arg7, arg8);
> >>>>     >
> >>>>     >          if (unlikely(do_strace)) {
> >>>>     >     -        print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> >>>>     >     +        int late_printing;
> >>>>     >     +        late_printing = print_syscall_late(num);
> >>>>     >     +        if (!late_printing) {
> >>>>     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> >>>>     >     +        }
> >>>>     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
> >>>>     >                                arg5, arg6, arg7, arg8);
> >>>>     >     +        if (late_printing) {
> >>>>     >     +            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
> >>>>     >     +        }
> >>>>     >              print_syscall_ret(num, ret);
> >>>>     >          } else {
> >>>>     >              ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
> >>>>     >
> >>>>
> >>>
>
Laurent Vivier Nov. 26, 2019, 8:08 a.m. UTC | #10
Le 25/11/2019 à 21:54, Aleksandar Markovic a écrit :
>>> 4. NEW PROPOSAL :
>>>
>>> early: read(3,0xff80038c,512)
>>> late: = 512  [(3,"\177ELF\1\2\1\3\0\0\0\0\0\0\0\0\0\3\0\17\0\0\0\1\0\2bl\0\0\04"...,512)]
>>> early: getcwd(0x18180,4096)
>>> late: = 9  [("/usr/bin",4096)]
>>>
>>> In other words, we would print for (selected) system calls content of
>>> the parameters both before and after actual ioctl
>>> translation/execution.
>> I don't like this.
>> IMHO it just adds unnecessary noise and differs from what a native strace shows.
>>
> QEMU strace doesn't have to be the same as native strace. Actually,
> they could be used even at the same time: QEMU provides info on target
> syscall, strace on host syscall. Do not forget that not all syscalls
> are implemeted as one-to-one-correspondence between target and host
> syscalls - combination of QEMU strace and native strace is precious at
> some moments.
> 
> In any case, I think that requiring QEMU strace to look exactly like
> native strace is a self-imposed limitation, that does not have enough
> justification. Think of it, wouldn't it be nice to surpass strace,
> like in these cases above?

I'd like to have a QEMU_STRACE formatting similar to the host strace
formatting because it makes debugging easier: we can run the same
program in guest and host and compare the strace output to see where it
differs and thus where the problem appears.

Thanks,
Laurent

Patch
diff mbox series

diff --git a/linux-user/strace.c b/linux-user/strace.c
index de43238fa4..ff254732af 100644
--- a/linux-user/strace.c
+++ b/linux-user/strace.c
@@ -61,6 +61,7 @@  UNUSED static void print_open_flags(abi_long, int);
 UNUSED static void print_syscall_prologue(const struct syscallname *);
 UNUSED static void print_syscall_epilogue(const struct syscallname *);
 UNUSED static void print_string(abi_long, int);
+UNUSED static void print_encoded_string(abi_long addr, unsigned int maxlen, int last);
 UNUSED static void print_buf(abi_long addr, abi_long len, int last);
 UNUSED static void print_raw_param(const char *, abi_long, int);
 UNUSED static void print_timeval(abi_ulong, int);
@@ -1204,6 +1205,37 @@  print_syscall_epilogue(const struct syscallname *sc)
     gemu_log(")");
 }

+#define MAX_ENCODED_CHARS 32
+static void
+print_encoded_string(abi_long addr, unsigned int maxlen, int last)
+{
+    unsigned int maxout;
+    char *s, *str;
+
+    s = lock_user_string(addr);
+    if (s == NULL) {
+        /* can't get string out of it, so print it as pointer */
+        print_pointer(addr, last);
+        return;
+    }
+
+    str = s;
+    gemu_log("\"");
+    maxout = MIN(maxlen, MAX_ENCODED_CHARS);
+    while (maxout--) {
+        unsigned char c = *str++;
+        if (isprint(c)) {
+            gemu_log("%c", c);
+        } else {
+            gemu_log("\\%o", (unsigned int) c);
+        }
+    }
+    unlock_user(s, addr, 0);
+
+    gemu_log("\"%s%s", maxlen > MAX_ENCODED_CHARS ? "..." : "",
+                    get_comma(last));
+}
+
 static void
 print_string(abi_long addr, int last)
 {
@@ -1633,6 +1665,19 @@  print_futimesat(const struct syscallname *name,
 }
 #endif

+#ifdef TARGET_NR_getcwd
+static void
+print_getcwd(const struct syscallname *name,
+    abi_long arg0, abi_long arg1, abi_long arg2,
+    abi_long arg3, abi_long arg4, abi_long arg5)
+{
+    print_syscall_prologue(name);
+    print_string(arg0, 0);
+    print_raw_param("%u", arg1, 1);
+    print_syscall_epilogue(name);
+}
+#endif
+
 #ifdef TARGET_NR_settimeofday
 static void
 print_settimeofday(const struct syscallname *name,
@@ -2428,6 +2473,20 @@  print_fstatat64(const struct syscallname *name,
 #define print_newfstatat    print_fstatat64
 #endif

+#ifdef TARGET_NR_read
+static void
+print_read(const struct syscallname *name,
+    abi_long arg0, abi_long arg1, abi_long arg2,
+    abi_long arg3, abi_long arg4, abi_long arg5)
+{
+    print_syscall_prologue(name);
+    print_raw_param("%d", arg0, 0);
+    print_encoded_string(arg1, arg2, 0);
+    print_raw_param("%u", arg2, 1);
+    print_syscall_epilogue(name);
+}
+#endif
+
 #ifdef TARGET_NR_readlink
 static void
 print_readlink(const struct syscallname *name,
diff --git a/linux-user/strace.list b/linux-user/strace.list
index d49a1e92a8..220b1f4c46 100644
--- a/linux-user/strace.list
+++ b/linux-user/strace.list
@@ -272,7 +272,7 @@ 
 { TARGET_NR_getcpu, "getcpu" , "%s(%p,%d)", NULL, NULL },
 #endif
 #ifdef TARGET_NR_getcwd
-{ TARGET_NR_getcwd, "getcwd" , "%s(%p,%d)", NULL, NULL },
+{ TARGET_NR_getcwd, "getcwd" , NULL, print_getcwd, NULL },
 #endif
 #ifdef TARGET_NR_getdents
 { TARGET_NR_getdents, "getdents" , NULL, NULL, NULL },
@@ -1080,7 +1080,7 @@ 
 { TARGET_NR_quotactl, "quotactl" , NULL, NULL, NULL },
 #endif
 #ifdef TARGET_NR_read
-{ TARGET_NR_read, "read" , "%s(%d,%#x,%d)", NULL, NULL },
+{ TARGET_NR_read, "read" , NULL, print_read, NULL },
 #endif
 #ifdef TARGET_NR_readahead
 { TARGET_NR_readahead, "readahead" , NULL, NULL, NULL },
diff --git a/linux-user/syscall.c b/linux-user/syscall.c
index ce399a55f0..c0079ca2b7 100644
--- a/linux-user/syscall.c
+++ b/linux-user/syscall.c
@@ -12069,6 +12069,21 @@  static abi_long do_syscall1(void *cpu_env, int num, abi_long arg1,
     return ret;
 }

+/*
+ * True if this syscall should be printed after having called the native
+ * syscall, so that values which are fed back to userspace gets printed.
+ */
+static int print_syscall_late(int syscall)
+{
+    switch (syscall) {
+    case TARGET_NR_getcwd:
+    case TARGET_NR_read:
+        return 1;
+    default:
+        return 0;
+    }
+}
+
 abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
                     abi_long arg2, abi_long arg3, abi_long arg4,
                     abi_long arg5, abi_long arg6, abi_long arg7,
@@ -12095,9 +12110,16 @@  abi_long do_syscall(void *cpu_env, int num, abi_long arg1,
                          arg2, arg3, arg4, arg5, arg6, arg7, arg8);

     if (unlikely(do_strace)) {
-        print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
+        int late_printing;
+        late_printing = print_syscall_late(num);
+        if (!late_printing) {
+            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
+        }
         ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,
                           arg5, arg6, arg7, arg8);
+        if (late_printing) {
+            print_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
+        }
         print_syscall_ret(num, ret);
     } else {
         ret = do_syscall1(cpu_env, num, arg1, arg2, arg3, arg4,