simplify debugging in-container tests (and regular tests)
diff mbox series

Message ID xny2xhg2t4.fsf@greed.delorie.com
State New
Headers show
Series
  • simplify debugging in-container tests (and regular tests)
Related show

Commit Message

DJ Delorie Oct. 18, 2019, 9:29 p.m. UTC
The attached patch helps you debug tests, especially in-container
tests.  Demo first...

dj@envy pts/27 ~/tools/upstream/glibc.debugtests.build
$ WAIT_FOR_DEBUGGER=1 make test t=nss/tst-nss-test3
/envy/dj/bin/make -r PARALLELMFLAGS="" -C ../glibc.debugtests objdir=`pwd` test
. . .
../scripts/evaluate-test.sh nss/tst-nss-test3 $? false false > /envy/dj/tools/upstream/glibc.debugtests.build/nss/tst-nss-test3.test-result
Waiting for debugger, test process is pid 9643
gdb -se /envy/dj/tools/upstream/glibc.debugtests.build/nss/tst-nss-test3 /envy/dj/tools/upstream/glibc.debugtests.build/nss/tst-nss-test3 9643
(gdb) set sysroot /envy/dj/tools/upstream/glibc.debugtests.build/testroot.root
(gdb) set wait_for_debugger = 0
(gdb) b do_test
(gdb) c

Then cut-n-paste those into another window...

dj@envy pts/30 ~/tools/upstream/glibc.debugtests.build
$ gdb -se /envy/dj/tools/upstream/glibc.debugtests.build/nss/tst-nss-test3 /envy/dj/tools/upstream/glibc.debugtests.build/nss/tst-nss-test3 9643
GNU gdb (GDB) Fedora 8.3-1.fc30
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-redhat-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from /envy/dj/tools/upstream/glibc.debugtests.build/nss/tst-nss-test3...
Attaching to program: /envy/dj/tools/upstream/glibc.debugtests.build/nss/tst-nss-test3, process 9643
Error while mapping shared library sections:
Could not open `target:/lib64/libc.so.6' as an executable file: Operation not permitted
Error while mapping shared library sections:
Could not open `target:/lib64/ld-linux-x86-64.so.2' as an executable file: Operation not permitted

warning: Unable to find dynamic linker breakpoint function.
GDB will be unable to debug shared library initializers
and track explicitly loaded dynamic code.

warning: Target and debugger are in different PID namespaces; thread lists and other data are likely unreliable.  Connect to gdbserver inside the container.
run_test_function (argc=argc@entry=1, argv=argv@entry=0x7fffe3a9d578, config=<optimized out>, 
    config=<optimized out>) at support_test_main.c:205
205         ;

[ gdb ] set sysroot /envy/dj/tools/upstream/glibc.debugtests.build/testroot.root
Reading symbols from /envy/dj/tools/upstream/glibc.debugtests.build/testroot.root/lib64/libc.so.6...
Reading symbols from /envy/dj/tools/upstream/glibc.debugtests.build/testroot.root/lib64/ld-linux-x86-64.so.2...

[ gdb ] set wait_for_debugger = 0

[ gdb ] b do_test
Breakpoint 1 at 0x402970: file tst-nss-test3.c, line 116.

[ gdb ] c
Continuing.

Breakpoint 1, do_test () at tst-nss-test3.c:116
116       setgrent ();

[ gdb ] n
119       for (g = getgrent () ;

[ gdb ] s
getgrent () at ../nss/getXXent.c:81
81        __libc_lock_lock (lock);

[ gdb ] 
__lll_cas_lock (futex=0x7fc5eaf6f300 <lock>) at ../sysdeps/unix/sysv/linux/x86/lowlevellock.h:47
47        asm volatile ("cmpxchgl %2, %1"

[ gdb ] where
#0  __lll_cas_lock (futex=0x7fc5eaf6f300 <lock>) at ../sysdeps/unix/sysv/linux/x86/lowlevellock.h:47
#1  getgrent () at ../nss/getXXent.c:81
#2  0x0000000000402982 in do_test () at tst-nss-test3.c:119
#3  0x0000000000403165 in support_test_main (argc=1, argv=0x7fffe3a9d578, config=config@entry=0x7fffe3a9d440)
    at support_test_main.c:378
#4  0x00000000004022e5 in main (argc=<optimized out>, argv=<optimized out>) at ../support/test-driver.c:168



From 39ba1c48e97fb75133f92863561f7dfda23d8d1a Mon Sep 17 00:00:00 2001
From: DJ Delorie <dj@redhat.com>
Date: Wed, 2 Oct 2019 14:46:46 -0400
Subject: Add wait-for-debugger test harness hooks

If WAIT_FOR_DEBUGGER is set in the environment, any test that
runs will print some useful gdb information and wait for gdb
to attach to it and clear the "wait_for_debugger" variable.

Comments

Florian Weimer Oct. 21, 2019, 12:31 p.m. UTC | #1
* DJ Delorie:

> +  while (wait_for_debugger)
> +    ;

Can you put a usleep call into the loop or something like that?
Hard busy-waiting seems to be a bit excessive.

(pause does not fail with EINTR when attaching GDB, so that doesn't
work.)

Thanks,
Florian
Carlos O'Donell Oct. 21, 2019, 1:18 p.m. UTC | #2
On 10/18/19 5:29 PM, DJ Delorie wrote:
> The attached patch helps you debug tests, especially in-container
> tests.  Demo first...

The demo was totally awesome! I really like seeing this kind of thing in action.
I don't particularly like all the gdb warnings, but we know what we're doing
here so most of them are actually harmless because it's a matching runtime
to the sysroot you declare.

I have a few high-level comments about the design. I'll put them inline,
and comment here:

* Use a boolean WAIT_FOR_DEBUGGER supporting 1 or 0 value (true and false respectively).

* Have the child write a *.x file for gdb to just run. Simplifies things.

Thoughts?

> From 39ba1c48e97fb75133f92863561f7dfda23d8d1a Mon Sep 17 00:00:00 2001
> From: DJ Delorie <dj@redhat.com>
> Date: Wed, 2 Oct 2019 14:46:46 -0400
> Subject: Add wait-for-debugger test harness hooks
> 
> If WAIT_FOR_DEBUGGER is set in the environment, any test that
> runs will print some useful gdb information and wait for gdb
> to attach to it and clear the "wait_for_debugger" variable.

OK.

> 
> diff --git a/support/support_test_main.c b/support/support_test_main.c
> index 05ad92e688..c94b0333b7 100644
> --- a/support/support_test_main.c
> +++ b/support/support_test_main.c
> @@ -19,6 +19,7 @@
>  #include <support/test-driver.h>
>  #include <support/check.h>
>  #include <support/temp_file-internal.h>
> +#include <support/support.h>

OK.

>  
>  #include <assert.h>
>  #include <errno.h>
> @@ -176,10 +177,32 @@ signal_handler (int sig)
>    exit (1);
>  }
>  
> +static volatile int wait_for_debugger = 0;

Why is this volatile?

> +
>  /* Run test_function or test_function_argv.  */
>  static int
>  run_test_function (int argc, char **argv, const struct test_config *config)
>  {
> +  if (getenv("WAIT_FOR_DEBUGGER") != NULL)

It is a problem IMO, that you can't easily flip
WAIT_FOR_DEBUGGER=1/0 which is what I'd want to do with
my own testing.

That is to say it's much easier to up-arrow and flip it
to zero, than delete the entire env.

Likewise it's much clearer in a shell script to have:
"... WAIT_FOR_DEBUGGER=$WFD ..."
than:
"... $WFD ..."
where the entire env var is hidden in the local shell variable
(or worse more complex conditionals).

Could we please parse this as a boolean 0/1?

> +    {
> +      pid_t mypid;
> +      mypid = getpid();
> +      if (mypid < 3)
> +	{
> +	  const char *outside_pid = getenv("PID_OUTSIDE_CONTAINER");
> +	  if (outside_pid)
> +	    mypid = atoi (outside_pid);
> +	}
> +      wait_for_debugger = 1;
> +      fprintf(stderr, "Waiting for debugger, test process is pid %d\n", mypid);
> +      fprintf(stderr, "gdb -se %s %s %d\n", argv[0], argv[0], mypid);
> +      fprintf(stderr, "(gdb) set sysroot %s/testroot.root\n", support_objdir_root);
> +      fprintf(stderr, "(gdb) set wait_for_debugger = 0\n");
> +      fprintf(stderr, "(gdb) b do_test\n");
> +      fprintf(stderr, "(gdb) c\n");

This should be output to a file in $objdir in the same location
as the test result file, and be loadable via gdb -x file.

We should not have to cut and paste this by hand. The test driver
has to know the test file location, so we can just write to there
e.g. $test.x

Then tell the user to do:

gdb -x /path/to/$test.x

That should be all they need to do?

> +    }
> +  while (wait_for_debugger)
> +    ;

Florian already mentioned a small usleep here. Just kinder on the host
system.

>    if (config->test_function != NULL)
>      return config->test_function ();
>    else if (config->test_function_argv != NULL)
> @@ -229,6 +252,11 @@ support_test_main (int argc, char **argv, const struct test_config *config)
>    unsigned int timeoutfactor = 1;
>    pid_t termpid;
>  
> +  /* If we're debugging the test, we need to disable timeouts and use
> +     the initial pid (esp if we're running inside a container).  */
> +  if (getenv("WAIT_FOR_DEBUGGER") != NULL)
> +    direct = 1;

OK. Perfect.

> +
>    if (!config->no_mallopt)
>      {
>        /* Make uses of freed and uninitialized memory known.  Do not
> diff --git a/support/test-container.c b/support/test-container.c
> index 9c42d3ae2f..5d08979df3 100644
> --- a/support/test-container.c
> +++ b/support/test-container.c
> @@ -676,6 +676,9 @@ main (int argc, char **argv)
>    char *so_base;
>    int do_postclean = 0;
>  
> +  int pipes[2];
> +  char pid_buf[20];

OK.

> +
>    uid_t original_uid;
>    gid_t original_gid;
>    /* If set, the test runs as root instead of the user running the testsuite.  */
> @@ -999,6 +1002,11 @@ main (int argc, char **argv)
>    if (chdir (new_cwd_path) < 0)
>      FAIL_EXIT1 ("Can't cd to new %s - ", new_cwd_path);
>  
> +  /* This is to pass the "outside" PID to the child, which will be PID
> +     1.  */
> +  if (pipe2 (pipes, O_CLOEXEC) < 0)
> +    FAIL_EXIT1 ("Can't create pid pipe");

OK.

> +
>    /* To complete the containerization, we need to fork () at least
>       once.  We can't exec, nor can we somehow link the new child to
>       our parent.  So we run the child and propogate it's exit status
> @@ -1010,6 +1018,12 @@ main (int argc, char **argv)
>      {
>        /* Parent.  */
>        int status;
> +
> +      /* Send the child's "outside" pid to it.  */
> +      write (pipes[1], &child, sizeof(child));
> +      close (pipes[0]);
> +      close (pipes[1]);

OK.

> +
>        waitpid (child, &status, 0);
>  
>        if (WIFEXITED (status))
> @@ -1028,6 +1042,14 @@ main (int argc, char **argv)
>    /* The rest is the child process, which is now PID 1 and "in" the
>       new root.  */
>  
> +  /* Get our "outside" pid from our parent.  We use this to help with
> +     debugging from outside the container.  */
> +  read (pipes[0], &child, sizeof(child));
> +  close (pipes[0]);
> +  close (pipes[1]);
> +  sprintf (pid_buf, "%lu", (long unsigned)child);
> +  setenv ("PID_OUTSIDE_CONTAINER", pid_buf, 0);

OK.

> +
>    maybe_xmkdir ("/tmp", 0755);
>  
>    /* Now that we're pid 1 (effectively "root") we can mount /proc  */
>

Patch
diff mbox series

diff --git a/support/support_test_main.c b/support/support_test_main.c
index 05ad92e688..c94b0333b7 100644
--- a/support/support_test_main.c
+++ b/support/support_test_main.c
@@ -19,6 +19,7 @@ 
 #include <support/test-driver.h>
 #include <support/check.h>
 #include <support/temp_file-internal.h>
+#include <support/support.h>
 
 #include <assert.h>
 #include <errno.h>
@@ -176,10 +177,32 @@  signal_handler (int sig)
   exit (1);
 }
 
+static volatile int wait_for_debugger = 0;
+
 /* Run test_function or test_function_argv.  */
 static int
 run_test_function (int argc, char **argv, const struct test_config *config)
 {
+  if (getenv("WAIT_FOR_DEBUGGER") != NULL)
+    {
+      pid_t mypid;
+      mypid = getpid();
+      if (mypid < 3)
+	{
+	  const char *outside_pid = getenv("PID_OUTSIDE_CONTAINER");
+	  if (outside_pid)
+	    mypid = atoi (outside_pid);
+	}
+      wait_for_debugger = 1;
+      fprintf(stderr, "Waiting for debugger, test process is pid %d\n", mypid);
+      fprintf(stderr, "gdb -se %s %s %d\n", argv[0], argv[0], mypid);
+      fprintf(stderr, "(gdb) set sysroot %s/testroot.root\n", support_objdir_root);
+      fprintf(stderr, "(gdb) set wait_for_debugger = 0\n");
+      fprintf(stderr, "(gdb) b do_test\n");
+      fprintf(stderr, "(gdb) c\n");
+    }
+  while (wait_for_debugger)
+    ;
   if (config->test_function != NULL)
     return config->test_function ();
   else if (config->test_function_argv != NULL)
@@ -229,6 +252,11 @@  support_test_main (int argc, char **argv, const struct test_config *config)
   unsigned int timeoutfactor = 1;
   pid_t termpid;
 
+  /* If we're debugging the test, we need to disable timeouts and use
+     the initial pid (esp if we're running inside a container).  */
+  if (getenv("WAIT_FOR_DEBUGGER") != NULL)
+    direct = 1;
+
   if (!config->no_mallopt)
     {
       /* Make uses of freed and uninitialized memory known.  Do not
diff --git a/support/test-container.c b/support/test-container.c
index 9c42d3ae2f..5d08979df3 100644
--- a/support/test-container.c
+++ b/support/test-container.c
@@ -676,6 +676,9 @@  main (int argc, char **argv)
   char *so_base;
   int do_postclean = 0;
 
+  int pipes[2];
+  char pid_buf[20];
+
   uid_t original_uid;
   gid_t original_gid;
   /* If set, the test runs as root instead of the user running the testsuite.  */
@@ -999,6 +1002,11 @@  main (int argc, char **argv)
   if (chdir (new_cwd_path) < 0)
     FAIL_EXIT1 ("Can't cd to new %s - ", new_cwd_path);
 
+  /* This is to pass the "outside" PID to the child, which will be PID
+     1.  */
+  if (pipe2 (pipes, O_CLOEXEC) < 0)
+    FAIL_EXIT1 ("Can't create pid pipe");
+
   /* To complete the containerization, we need to fork () at least
      once.  We can't exec, nor can we somehow link the new child to
      our parent.  So we run the child and propogate it's exit status
@@ -1010,6 +1018,12 @@  main (int argc, char **argv)
     {
       /* Parent.  */
       int status;
+
+      /* Send the child's "outside" pid to it.  */
+      write (pipes[1], &child, sizeof(child));
+      close (pipes[0]);
+      close (pipes[1]);
+
       waitpid (child, &status, 0);
 
       if (WIFEXITED (status))
@@ -1028,6 +1042,14 @@  main (int argc, char **argv)
   /* The rest is the child process, which is now PID 1 and "in" the
      new root.  */
 
+  /* Get our "outside" pid from our parent.  We use this to help with
+     debugging from outside the container.  */
+  read (pipes[0], &child, sizeof(child));
+  close (pipes[0]);
+  close (pipes[1]);
+  sprintf (pid_buf, "%lu", (long unsigned)child);
+  setenv ("PID_OUTSIDE_CONTAINER", pid_buf, 0);
+
   maybe_xmkdir ("/tmp", 0755);
 
   /* Now that we're pid 1 (effectively "root") we can mount /proc  */