
[RFA,PR,middle-end/61118] Improve tree CFG accuracy for setjmp/longjmp

Message ID 93b4b8a7-c6fe-65c0-0609-62ebee669967@redhat.com
State New

Commit Message

Jeff Law Feb. 28, 2018, 12:16 a.m. UTC
Richi, you worked on 57147 which touches on the issues here.  Your
thoughts would be greatly appreciated.


So 61118 is one of several bugs related to the clobbered-by-longjmp warning.

In 61118 we are unable to coalesce all the objects in the key
partitions.  To remove the relevant PHIs we have to create two
assignments to the key pseudos.

Pseudos with more than one assignment are subject to the
clobbered-by-longjmp analysis:

/* True if register REGNO was alive at a place where `setjmp' was
   called and was set more than once or is an argument.  Such regs may
   be clobbered by `longjmp'.  */

static bool
regno_clobbered_at_setjmp (bitmap setjmp_crosses, int regno)
{
  /* There appear to be cases where some local vars never reach the
     backend but have bogus regnos.  */
  if (regno >= max_reg_num ())
    return false;

  return ((REG_N_SETS (regno) > 1
           || REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN (cfun)),
                               regno))
          && REGNO_REG_SET_P (setjmp_crosses, regno));
}


The fact that no path sets the pseudo more than once is not considered.
If there is more than one static set of the pseudo, then it is
considered for a possible warning.
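
For illustration, here's a minimal sketch (mine, not from the PR) of
that distinction.  "x" has two static sets, but no execution path
assigns it twice; its pseudo still ends up with REG_N_SETS > 1, so it
is fodder for the analysis above and can draw a -Wclobbered warning
because it is live across the setjmp:

#include <setjmp.h>

extern jmp_buf env;
extern void maybe_longjmp (void);  /* may call longjmp (env, 1) */

int
f (int flag)
{
  int x;
  if (flag)
    x = 1;    /* one static set */
  else
    x = 2;    /* a second static set, on a disjoint path */
  if (setjmp (env) == 0)
    maybe_longjmp ();
  return x;   /* x is live across the setjmp call */
}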

--


I looked at the propagations which led to the inability to coalesce.
They all seemed valid to me.  We have always allowed copy propagation to
replace one pseudo with another as long as neither has
SSA_NAME_USED_IN_ABNORMAL_PHI set.

We have a PHI like

x1(ab) = (x0, x3 (ab))

x0 is not marked as abnormal because the edge isn't abnormal and thus we
can propagate into the x0 argument of the PHI.  This is consistent with
behavior since, well, forever.  We propagate a value for x0 resulting
in something like

x1(ab) = (y0, x3 (ab))


Where y0 is still live across the PHI.  Thus the partition for x1/x3,
etc. conflicts with the partition for y0 and they cannot be coalesced.
This leads to the multiple assignments to the pseudo for the x1/x3
partition.  I briefly looked at marking all the PHI arguments as
abnormal when the destination is abnormal, but it just doesn't seem right.
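
To make the conflict concrete, a sketch with invented names (the later
use of y0 is what keeps it live):

  y0 = ...
  x0 = y0                    <- copy, so we propagate y0 for x0 below
  x1(ab) = (y0, x3 (ab))     <- y0 now live across the abnormal PHI
  ...
  use (y0)

Out-of-ssa must then emit a copy from y0 into the x1/x3 partition,
which is precisely the second static set of that partition's pseudo.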

Anyway, I'd already been looking at 21161 and was aware that the CFGs
we're building in the presence of setjmp/longjmp were slightly inaccurate.

In particular, a longjmp returns to the point immediately after the
setjmp, not to the setjmp itself.  But our CFG building has the edge
from the abnormal dispatcher going to the block containing the setjmp call.

This creates unnecessary irreducible loops.  It turns out that if we fix
the tree CFG, then lifetimes become more accurate (and more
constrained).  The more constrained, more accurate lifetime information
is enough to allow things to coalesce the way we want and everything for
61118 just works.
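
To restate the setjmp semantics at issue as a minimal C sketch (names
invented):

#include <setjmp.h>

extern jmp_buf env;
extern void work (void);   /* may call longjmp (env, 1) */

int
run (void)
{
  if (setjmp (env) != 0)   /* a longjmp resumes here, at the point
                              after the call, not at the setjmp call
                              itself */
    return 1;
  work ();
  return 0;
}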

It's actually pretty easy to fix the CFG.  We just need to recognize
that a "returns twice" function returns not to the call, but to the
point immediately after the call.  So if we have a call to a returns
twice function that ends a block with a single successor, when we wire
up the abnormal dispatcher, we target the single successor rather than
the block containing the returns-twice call.

This compromises the test gcc.dg/torture/pr57147-2.c


Prior to this change the CFG looks like

     2
    / \
   3<->4
   |
   R

Where block #3 contains the setjmp.  The edges 2->4, 3->4 and 4->3 are
abnormals.  Block #4 is the abnormal dispatcher.

Eventually we remove the edge from 2->3 because the last statement in
block #2 is a call to a non-returning function.  But we leave the abnormal
edge 2->4 (on purpose) resulting in:


     2
     |
  +->4
  |  |
  +--3
     |
     R

The test then proceeds to verify there is a call to setjmp in the
resulting .optimized dump -- which there is because block #3 remains
reachable.


With this change the CFG looks like:



     2
    / \
   3-->4
   |  /
   | /
   |/
   R


Where the edges 2->4 and 3->4 and 4->R are abnormals.  Block #4 is still
the dispatcher and the setjmp is still in block #3.

We realize block #2 ends with a call to a noreturn function and again we
remove the 2->3 edge.  That makes block #3 unreachable and it gets
removed, resulting in:

    2
    |
    4
    |
    R

Where 2->4 and 4->R are still abnormal edges.  With bb3 becoming
unreachable, the setjmp is unreachable and gets removed, thus breaking
the scan part of the test.




If we review the source of the test:


struct __jmp_buf_tag {};
typedef struct __jmp_buf_tag jmp_buf[1];
extern int _setjmp (struct __jmp_buf_tag __env[1]);

jmp_buf g_return_jmp_buf;

void SetNaClSwitchExpectations (void)
{
  __builtin_longjmp (g_return_jmp_buf, 1);
}
void TestSyscall(void)
{
  SetNaClSwitchExpectations();
  _setjmp (g_return_jmp_buf);
}


We can easily see that the call to _setjmp can never be reached given
that we consider the longjmp call as non-returning.  So AFAICT
everything is as it should be.  I think the right thing is to just
remove this compromised test.

--



The regression test from pr61118 disables -ftracer, as -ftracer creates
an additional assignment to key objects which gets carried through into
RTL, thus triggering the problem all over again.  My RTL fixes for 21161
do not fix this.  So if the patch is accepted I propose we keep 61118
open, but without the gcc-8 regression marker.  It's still a deficiency
that -ftracer can trigger a bogus clobbered-by-longjmp warning.
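
For intuition (my sketch, not from the PR): tail duplication copies a
shared block onto each incoming path, so a single static set of a key
object becomes two static sets, and REG_N_SETS > 1 re-triggers the
heuristic quoted above even though no path assigns the object twice:

   before -ftracer:          after tail duplication:

    A   B                     A            B
     \ /                      |            |
      C: x = ...              C1: x = ...  C2: x = ...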

This has been bootstrapped and regression tested on x86_64.

Thoughts?  OK for the trunk?

Jeff
PR middle-end/61118
	* tree-cfg.c (handle_abnormal_edges): Accept new argument.
	(make_edges): Callers of handle_abnormal_edges changed.

	* gcc.dg/torture/pr61118.c: New test.
	* gcc.dg/torture/pr57147-2.c: Remove compromised test.

Comments

Richard Biener Feb. 28, 2018, 10:43 a.m. UTC | #1
On Wed, Feb 28, 2018 at 1:16 AM, Jeff Law <law@redhat.com> wrote:
> Richi, you worked on 57147 which touches on the issues here.  Your
> thoughts would be greatly appreciated.
>
>
> So 61118 is one of several bugs related to the clobbered-by-longjmp warning.
>
> In 61118 we are unable to coalesce all the objects in the key
> partitions.  To remove the relevant PHIs we have to create two
> assignments to the key pseudos.
>
> Pseudos with more than one assignment are subject to the
> clobbered-by-longjmp analysis:
>
> /* True if register REGNO was alive at a place where `setjmp' was
>    called and was set more than once or is an argument.  Such regs may
>    be clobbered by `longjmp'.  */
>
> static bool
> regno_clobbered_at_setjmp (bitmap setjmp_crosses, int regno)
> {
>   /* There appear to be cases where some local vars never reach the
>      backend but have bogus regnos.  */
>   if (regno >= max_reg_num ())
>     return false;
>
>   return ((REG_N_SETS (regno) > 1
>            || REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN
> (cfun)),
>                                regno))
>           && REGNO_REG_SET_P (setjmp_crosses, regno));
> }
>
>
> The fact that no path sets the pseudo more than once is not considered.
> If there is more than one static set of the pseudo, then it is
> considered for possible warning.
>
> --
>
>
> I looked at the propagations which led to the inability to coalesce.
> They all seemed valid to me.  We have always allowed copy propagation to
> replace one pseudo with another as long as neither has
> SSA_NAME_USED_IN_ABNORMAL_PHI set.
>
> We have a PHI like
>
> x1(ab) = (x0, x3 (ab))
>
> x0 is not marked as abnormal because the edge isn't abnormal and thus we
> can propagate into the x0 argument of the PHI.  This is consistent with
> behavior since, well, forever.   We propagate a value for x0 resulting
> in something like
>
> x1(ab) = (y0, x3 (ab))
>
>
> Where y0 is still live across the PHI.  Thus the partition for x1/x3,
> etc. conflicts with the partition for y0 and they cannot be coalesced.
> This leads to the multiple assignments to the pseudo for the x1/x3
> partition.  I briefly looked at marking all the PHI arguments as abnormal
> when the destination is abnormal, but it just doesn't seem right.
>
> Anyway, I'd already been looking at 21161 and was aware that the CFGs
> we're building in the presence of setjmp/longjmp were slightly inaccurate.
>
> In particular, a longjmp returns to the point immediately after the
> setjmp, not to the setjmp itself.  But our CFG building has the edge
> from the abnormal dispatcher going to the block containing the setjmp call.

Yeah...  for SJLJ EH we get this right via __builtin_setjmp_receiver.

> This creates unnecessary irreducible loops.  It turns out that if we fix
> the tree CFG, then lifetimes become more accurate (and more
> constrained).  The more constrained, more accurate lifetime information
> is enough to allow things to coalesce the way we want and everything for
> 61118 just works.

Sounds good.

> It's actually pretty easy to fix the CFG.  We just need to recognize
> that a "returns twice" function returns not to the call, but to the
> point immediately after the call.  So if we have a call to a returns
> twice function that ends a block with a single successor, when we wire
> up the abnormal dispatcher, we target the single successor rather than
> the block containing the returns-twice call.

Hmm, I think you need to check whether the successor has a single
predecessor, not whether we have a single successor (we always have
that unless setjmp also throws).  If you fix that, you still keep the
CFG "incorrect" when there are multiple predecessors, so I think in
addition to properly creating the edges you have to work on the BB
building part to ensure that there's a single-predecessor block after
returns-twice function calls.  Note that currently we force returns-twice
to be the first (and only) stmt of a block -- your fix would relax this:
returns-twice no longer needs to start a new BB.
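
One way to read that suggestion, as an untested sketch against the
patch below, is to key the redirection off the successor having no
other predecessors as well:

                  if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
                      && single_succ_p (bb)
                      && single_pred_p (single_succ (bb)))
                    target_after_setjmp = true;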

-               handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
-                                      &ab_edge_call, false);
+               {
+                 bool target_after_setjmp = false;
+
+                 /* If the returns twice statement looks like a setjmp
+                    call at the end of a block with a single successor
+                    then we want the edge from the dispatcher to target
+                    that single successor.  That more accurately reflects
+                    actual control flow.  The more accurate CFG also
+                    results in fewer false positive warnings.  */
+                 if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
+                     && gimple_call_fndecl (call_stmt)
+                     && setjmp_call_p (gimple_call_fndecl (call_stmt))
+                     && single_succ_p (bb))
+                   target_after_setjmp = true;
+                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
+                                        &ab_edge_call, false,
+                                        target_after_setjmp);
+               }

I don't exactly get the hops you jump through here -- I think it's
better to split the returns-twice (always last stmt of a block after
the fixing) and the setjmp-receiver (always first stmt of a block) cases.
So, remove the handling of returns-twice from the above case and
handle returns-twice via

  gimple *last = last_stmt (bb);
  if (last && ...)

also handle all returns-twice calls this way, not only setjmp_call_p.
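
Filling in the elision with my own guess at the shape (reusing the new
parameter from the patch; with the BB building fixed, the block would
be guaranteed a single successor here, and the setjmp-receiver case
stays separate as the first stmt of a block):

  gimple *last = last_stmt (bb);
  if (last
      && is_gimple_call (last)
      && (gimple_call_flags (last) & ECF_RETURNS_TWICE))
    handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
                           &ab_edge_call, false, true);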

>
> This compromises the test gcc.dg/torture/pr57147-2.c
>
>
> Prior to this change the CFG looks like
>
>      2
>     / \
>    3<->4
>    |
>    R
>
> Where block #3 contains the setjmp.  The edges 2->4, 3->4 and 4->3 are
> abnormals.  Block #4 is the abnormal dispatcher.
>
> Eventually we remove the edge from 2->3 because the last statement in
> block #2 is a call to a non-returning function.  But we leave the abnormal
> edge 2->4 (on purpose) resulting in:
>
>
>      2
>      |
>   +->4
>   |  |
>   +--3
>      |
>      R
>
> The test then proceeds to verify there is a call to setjmp in the
> resulting .optimized dump -- which there is because block #3 remains
> reachable.
>
>
> With this change the CFG looks like:
>
>
>
>      2
>     / \
>    3-->4
>    |  /
>    | /
>    |/
>    R
>
>
> Where the edges 2->4 and 3->4 and 4->R are abnormals.  Block #4 is still
> the dispatcher and the setjmp is still in block #3.
>
> We realize block #2 ends with a call to a noreturn function and again we
> remove the 2->3 edge.  That makes block #3 unreachable and it gets
> removed, resulting in:
>
>     2
>     |
>     4
>     |
>     R
>
> Where 2->4 and 4->R are still abnormal edges.  With bb3 becoming
> unreachable, the setjmp is unreachable and gets removed, thus breaking
> the scan part of the test.
>
>
>
>
> If we review the source of the test:
>
>
> struct __jmp_buf_tag {};
> typedef struct __jmp_buf_tag jmp_buf[1];
> extern int _setjmp (struct __jmp_buf_tag __env[1]);
>
> jmp_buf g_return_jmp_buf;
>
> void SetNaClSwitchExpectations (void)
> {
>   __builtin_longjmp (g_return_jmp_buf, 1);
> }
> void TestSyscall(void)
> {
>   SetNaClSwitchExpectations();
>   _setjmp (g_return_jmp_buf);
> }
>
>
> We can easily see that the call to _setjmp can never be reached given
> that we consider the longjmp call as non-returning.  So AFAICT
> everything is as it should be.  I think the right thing is to just
> remove this compromised test.

I agree.  Bonus points if you look at PR57147 and see if the testcase
was misreduced (maybe it was just for an ICE so we can keep it
and just remove the dump scanning?)

Richard.

> --
>
>
>
> The regression test from pr61118 disables -ftracer, as -ftracer creates
> an additional assignment to key objects which gets carried through into
> RTL, thus triggering the problem all over again.  My RTL fixes for 21161
> do not fix this.  So if the patch is accepted I propose we keep 61118
> open, but without the gcc-8 regression marker.  It's still a deficiency
> that -ftracer can trigger a bogus clobbered-by-longjmp warning.
>
> This has been bootstrapped and regression tested on x86_64.
>
> Thoughts?  OK for the trunk?
>
> Jeff
>
>         PR middle-end/61118
>         * tree-cfg.c (handle_abnormal_edges): Accept new argument.
>         (make_edges): Callers of handle_abnormal_edges changed.
>
>         * gcc.dg/torture/pr61118.c: New test.
>         * gcc.dg/torture/pr57147-2.c: Remove compromised test.
>
>
> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
> index b87e48d..551195a 100644
> --- a/gcc/tree-cfg.c
> +++ b/gcc/tree-cfg.c
> @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
>  #include "fold-const.h"
>  #include "trans-mem.h"
>  #include "stor-layout.h"
> +#include "calls.h"
>  #include "print-tree.h"
>  #include "cfganal.h"
>  #include "gimple-fold.h"
> @@ -776,13 +777,22 @@ get_abnormal_succ_dispatcher (basic_block bb)
>  static void
>  handle_abnormal_edges (basic_block *dispatcher_bbs,
>                        basic_block for_bb, int *bb_to_omp_idx,
> -                      auto_vec<basic_block> *bbs, bool computed_goto)
> +                      auto_vec<basic_block> *bbs, bool computed_goto,
> +                      bool target_after_setjmp)
>  {
>    basic_block *dispatcher = dispatcher_bbs + (computed_goto ? 1 : 0);
>    unsigned int idx = 0;
> -  basic_block bb;
> +  basic_block bb, target_bb;
>    bool inner = false;
>
> +  /* Determine the block the abnormal dispatcher will transfer
> +     control to.  It may be FOR_BB, or in some cases it may be the
> +     single successor of FOR_BB.  */
> +  if (target_after_setjmp)
> +    target_bb = single_succ (for_bb);
> +  else
> +    target_bb = for_bb;
> +
>    if (bb_to_omp_idx)
>      {
>        dispatcher = dispatcher_bbs + 2 * bb_to_omp_idx[for_bb->index];
> @@ -878,7 +888,7 @@ handle_abnormal_edges (basic_block *dispatcher_bbs,
>         }
>      }
>
> -  make_edge (*dispatcher, for_bb, EDGE_ABNORMAL);
> +  make_edge (*dispatcher, target_bb, EDGE_ABNORMAL);
>  }
>
>  /* Creates outgoing edges for BB.  Returns 1 when it ends with an
> @@ -1075,11 +1085,11 @@ make_edges (void)
>                  potential target for a computed goto or a non-local goto.  */
>               if (FORCED_LABEL (target))
>                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> -                                      &ab_edge_goto, true);
> +                                      &ab_edge_goto, true, false);
>               if (DECL_NONLOCAL (target))
>                 {
>                   handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> -                                        &ab_edge_call, false);
> +                                        &ab_edge_call, false, false);
>                   break;
>                 }
>             }
> @@ -1094,8 +1104,24 @@ make_edges (void)
>                   && ((gimple_call_flags (call_stmt) & ECF_RETURNS_TWICE)
>                       || gimple_call_builtin_p (call_stmt,
>                                                 BUILT_IN_SETJMP_RECEIVER)))
> -               handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> -                                      &ab_edge_call, false);
> +               {
> +                 bool target_after_setjmp = false;
> +
> +                 /* If the returns twice statement looks like a setjmp
> +                    call at the end of a block with a single successor
> +                    then we want the edge from the dispatcher to target
> +                    that single successor.  That more accurately reflects
> +                    actual control flow.  The more accurate CFG also
> +                    results in fewer false positive warnings.  */
> +                 if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
> +                     && gimple_call_fndecl (call_stmt)
> +                     && setjmp_call_p (gimple_call_fndecl (call_stmt))
> +                     && single_succ_p (bb))
> +                   target_after_setjmp = true;
> +                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> +                                        &ab_edge_call, false,
> +                                        target_after_setjmp);
> +               }
>             }
>         }
>
> diff --git a/gcc/testsuite/gcc.dg/torture/pr57147-2.c b/gcc/testsuite/gcc.dg/torture/pr57147-2.c
> deleted file mode 100644
> index fc5fb39..0000000
> --- a/gcc/testsuite/gcc.dg/torture/pr57147-2.c
> +++ /dev/null
> @@ -1,22 +0,0 @@
> -/* { dg-do compile } */
> -/* { dg-options "-fdump-tree-optimized" } */
> -/* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } { "" } } */
> -/* { dg-require-effective-target indirect_jumps } */
> -
> -struct __jmp_buf_tag {};
> -typedef struct __jmp_buf_tag jmp_buf[1];
> -extern int _setjmp (struct __jmp_buf_tag __env[1]);
> -
> -jmp_buf g_return_jmp_buf;
> -
> -void SetNaClSwitchExpectations (void)
> -{
> -  __builtin_longjmp (g_return_jmp_buf, 1);
> -}
> -void TestSyscall(void)
> -{
> -  SetNaClSwitchExpectations();
> -  _setjmp (g_return_jmp_buf);
> -}
> -
> -/* { dg-final { scan-tree-dump "setjmp" "optimized" } } */
> diff --git a/gcc/testsuite/gcc.dg/torture/pr61118.c b/gcc/testsuite/gcc.dg/torture/pr61118.c
> new file mode 100644
> index 0000000..12be892
> --- /dev/null
> +++ b/gcc/testsuite/gcc.dg/torture/pr61118.c
> @@ -0,0 +1,652 @@
> +/* { dg-options "-Wextra -fno-tracer" } */
> +typedef unsigned char __u_char;
> +typedef unsigned short int __u_short;
> +typedef unsigned int __u_int;
> +typedef unsigned long int __u_long;
> +typedef signed char __int8_t;
> +typedef unsigned char __uint8_t;
> +typedef signed short int __int16_t;
> +typedef unsigned short int __uint16_t;
> +typedef signed int __int32_t;
> +typedef unsigned int __uint32_t;
> +typedef signed long int __int64_t;
> +typedef unsigned long int __uint64_t;
> +typedef long int __quad_t;
> +typedef unsigned long int __u_quad_t;
> +typedef unsigned long int __dev_t;
> +typedef unsigned int __uid_t;
> +typedef unsigned int __gid_t;
> +typedef unsigned long int __ino_t;
> +typedef unsigned long int __ino64_t;
> +typedef unsigned int __mode_t;
> +typedef unsigned long int __nlink_t;
> +typedef long int __off_t;
> +typedef long int __off64_t;
> +typedef int __pid_t;
> +typedef struct { int __val[2]; } __fsid_t;
> +typedef long int __clock_t;
> +typedef unsigned long int __rlim_t;
> +typedef unsigned long int __rlim64_t;
> +typedef unsigned int __id_t;
> +typedef long int __time_t;
> +typedef unsigned int __useconds_t;
> +typedef long int __suseconds_t;
> +typedef int __daddr_t;
> +typedef int __key_t;
> +typedef int __clockid_t;
> +typedef void * __timer_t;
> +typedef long int __blksize_t;
> +typedef long int __blkcnt_t;
> +typedef long int __blkcnt64_t;
> +typedef unsigned long int __fsblkcnt_t;
> +typedef unsigned long int __fsblkcnt64_t;
> +typedef unsigned long int __fsfilcnt_t;
> +typedef unsigned long int __fsfilcnt64_t;
> +typedef long int __fsword_t;
> +typedef long int __ssize_t;
> +typedef long int __syscall_slong_t;
> +typedef unsigned long int __syscall_ulong_t;
> +typedef __off64_t __loff_t;
> +typedef __quad_t *__qaddr_t;
> +typedef char *__caddr_t;
> +typedef long int __intptr_t;
> +typedef unsigned int __socklen_t;
> +static __inline unsigned int
> +__bswap_32 (unsigned int __bsx)
> +{
> +  return __builtin_bswap32 (__bsx);
> +}
> +static __inline __uint64_t
> +__bswap_64 (__uint64_t __bsx)
> +{
> +  return __builtin_bswap64 (__bsx);
> +}
> +typedef long unsigned int size_t;
> +typedef __time_t time_t;
> +struct timespec
> +  {
> +    __time_t tv_sec;
> +    __syscall_slong_t tv_nsec;
> +  };
> +typedef __pid_t pid_t;
> +struct sched_param
> +  {
> +    int __sched_priority;
> +  };
> +struct __sched_param
> +  {
> +    int __sched_priority;
> +  };
> +typedef unsigned long int __cpu_mask;
> +typedef struct
> +{
> +  __cpu_mask __bits[1024 / (8 * sizeof (__cpu_mask))];
> +} cpu_set_t;
> +extern int __sched_cpucount (size_t __setsize, const cpu_set_t *__setp)
> +  __attribute__ ((__nothrow__ , __leaf__));
> +extern cpu_set_t *__sched_cpualloc (size_t __count) __attribute__ ((__nothrow__ , __leaf__)) ;
> +extern void __sched_cpufree (cpu_set_t *__set) __attribute__ ((__nothrow__ , __leaf__));
> +extern int sched_setparam (__pid_t __pid, const struct sched_param *__param)
> +     __attribute__ ((__nothrow__ , __leaf__));
> +extern int sched_getparam (__pid_t __pid, struct sched_param *__param) __attribute__ ((__nothrow__ , __leaf__));
> +extern int sched_setscheduler (__pid_t __pid, int __policy,
> +          const struct sched_param *__param) __attribute__ ((__nothrow__ , __leaf__));
> +extern int sched_getscheduler (__pid_t __pid) __attribute__ ((__nothrow__ , __leaf__));
> +extern int sched_yield (void) __attribute__ ((__nothrow__ , __leaf__));
> +extern int sched_get_priority_max (int __algorithm) __attribute__ ((__nothrow__ , __leaf__));
> +extern int sched_get_priority_min (int __algorithm) __attribute__ ((__nothrow__ , __leaf__));
> +extern int sched_rr_get_interval (__pid_t __pid, struct timespec *__t) __attribute__ ((__nothrow__ , __leaf__));
> +typedef __clock_t clock_t;
> +typedef __clockid_t clockid_t;
> +typedef __timer_t timer_t;
> +struct tm
> +{
> +  int tm_sec;
> +  int tm_min;
> +  int tm_hour;
> +  int tm_mday;
> +  int tm_mon;
> +  int tm_year;
> +  int tm_wday;
> +  int tm_yday;
> +  int tm_isdst;
> +  long int tm_gmtoff;
> +  const char *tm_zone;
> +};
> +struct itimerspec
> +  {
> +    struct timespec it_interval;
> +    struct timespec it_value;
> +  };
> +struct sigevent;
> +extern clock_t clock (void) __attribute__ ((__nothrow__ , __leaf__));
> +extern time_t time (time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
> +extern double difftime (time_t __time1, time_t __time0)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
> +extern time_t mktime (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
> +extern size_t strftime (char *__restrict __s, size_t __maxsize,
> +   const char *__restrict __format,
> +   const struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
> +typedef struct __locale_struct
> +{
> +  struct __locale_data *__locales[13];
> +  const unsigned short int *__ctype_b;
> +  const int *__ctype_tolower;
> +  const int *__ctype_toupper;
> +  const char *__names[13];
> +} *__locale_t;
> +typedef __locale_t locale_t;
> +extern size_t strftime_l (char *__restrict __s, size_t __maxsize,
> +     const char *__restrict __format,
> +     const struct tm *__restrict __tp,
> +     __locale_t __loc) __attribute__ ((__nothrow__ , __leaf__));
> +extern struct tm *gmtime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
> +extern struct tm *localtime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
> +extern struct tm *gmtime_r (const time_t *__restrict __timer,
> +       struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
> +extern struct tm *localtime_r (const time_t *__restrict __timer,
> +          struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
> +extern char *asctime (const struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
> +extern char *ctime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
> +extern char *asctime_r (const struct tm *__restrict __tp,
> +   char *__restrict __buf) __attribute__ ((__nothrow__ , __leaf__));
> +extern char *ctime_r (const time_t *__restrict __timer,
> +        char *__restrict __buf) __attribute__ ((__nothrow__ , __leaf__));
> +extern char *__tzname[2];
> +extern int __daylight;
> +extern long int __timezone;
> +extern char *tzname[2];
> +extern void tzset (void) __attribute__ ((__nothrow__ , __leaf__));
> +extern int daylight;
> +extern long int timezone;
> +extern int stime (const time_t *__when) __attribute__ ((__nothrow__ , __leaf__));
> +extern time_t timegm (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
> +extern time_t timelocal (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
> +extern int dysize (int __year) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
> +extern int nanosleep (const struct timespec *__requested_time,
> +        struct timespec *__remaining);
> +extern int clock_getres (clockid_t __clock_id, struct timespec *__res) __attribute__ ((__nothrow__ , __leaf__));
> +extern int clock_gettime (clockid_t __clock_id, struct timespec *__tp) __attribute__ ((__nothrow__ , __leaf__));
> +extern int clock_settime (clockid_t __clock_id, const struct timespec *__tp)
> +     __attribute__ ((__nothrow__ , __leaf__));
> +extern int clock_nanosleep (clockid_t __clock_id, int __flags,
> +       const struct timespec *__req,
> +       struct timespec *__rem);
> +extern int clock_getcpuclockid (pid_t __pid, clockid_t *__clock_id) __attribute__ ((__nothrow__ , __leaf__));
> +extern int timer_create (clockid_t __clock_id,
> +    struct sigevent *__restrict __evp,
> +    timer_t *__restrict __timerid) __attribute__ ((__nothrow__ , __leaf__));
> +extern int timer_delete (timer_t __timerid) __attribute__ ((__nothrow__ , __leaf__));
> +extern int timer_settime (timer_t __timerid, int __flags,
> +     const struct itimerspec *__restrict __value,
> +     struct itimerspec *__restrict __ovalue) __attribute__ ((__nothrow__ , __leaf__));
> +extern int timer_gettime (timer_t __timerid, struct itimerspec *__value)
> +     __attribute__ ((__nothrow__ , __leaf__));
> +extern int timer_getoverrun (timer_t __timerid) __attribute__ ((__nothrow__ , __leaf__));
> +typedef unsigned long int pthread_t;
> +union pthread_attr_t
> +{
> +  char __size[56];
> +  long int __align;
> +};
> +typedef union pthread_attr_t pthread_attr_t;
> +typedef struct __pthread_internal_list
> +{
> +  struct __pthread_internal_list *__prev;
> +  struct __pthread_internal_list *__next;
> +} __pthread_list_t;
> +typedef union
> +{
> +  struct __pthread_mutex_s
> +  {
> +    int __lock;
> +    unsigned int __count;
> +    int __owner;
> +    unsigned int __nusers;
> +    int __kind;
> +    short __spins;
> +    short __elision;
> +    __pthread_list_t __list;
> +  } __data;
> +  char __size[40];
> +  long int __align;
> +} pthread_mutex_t;
> +typedef union
> +{
> +  char __size[4];
> +  int __align;
> +} pthread_mutexattr_t;
> +typedef union
> +{
> +  struct
> +  {
> +    int __lock;
> +    unsigned int __futex;
> +    __extension__ unsigned long long int __total_seq;
> +    __extension__ unsigned long long int __wakeup_seq;
> +    __extension__ unsigned long long int __woken_seq;
> +    void *__mutex;
> +    unsigned int __nwaiters;
> +    unsigned int __broadcast_seq;
> +  } __data;
> +  char __size[48];
> +  __extension__ long long int __align;
> +} pthread_cond_t;
> +typedef union
> +{
> +  char __size[4];
> +  int __align;
> +} pthread_condattr_t;
> +typedef unsigned int pthread_key_t;
> +typedef int pthread_once_t;
> +typedef union
> +{
> +  struct
> +  {
> +    int __lock;
> +    unsigned int __nr_readers;
> +    unsigned int __readers_wakeup;
> +    unsigned int __writer_wakeup;
> +    unsigned int __nr_readers_queued;
> +    unsigned int __nr_writers_queued;
> +    int __writer;
> +    int __shared;
> +    unsigned long int __pad1;
> +    unsigned long int __pad2;
> +    unsigned int __flags;
> +  } __data;
> +  char __size[56];
> +  long int __align;
> +} pthread_rwlock_t;
> +typedef union
> +{
> +  char __size[8];
> +  long int __align;
> +} pthread_rwlockattr_t;
> +typedef volatile int pthread_spinlock_t;
> +typedef union
> +{
> +  char __size[32];
> +  long int __align;
> +} pthread_barrier_t;
> +typedef union
> +{
> +  char __size[4];
> +  int __align;
> +} pthread_barrierattr_t;
> +typedef long int __jmp_buf[8];
> +enum
> +{
> +  PTHREAD_CREATE_JOINABLE,
> +  PTHREAD_CREATE_DETACHED
> +};
> +enum
> +{
> +  PTHREAD_MUTEX_TIMED_NP,
> +  PTHREAD_MUTEX_RECURSIVE_NP,
> +  PTHREAD_MUTEX_ERRORCHECK_NP,
> +  PTHREAD_MUTEX_ADAPTIVE_NP
> +  ,
> +  PTHREAD_MUTEX_NORMAL = PTHREAD_MUTEX_TIMED_NP,
> +  PTHREAD_MUTEX_RECURSIVE = PTHREAD_MUTEX_RECURSIVE_NP,
> +  PTHREAD_MUTEX_ERRORCHECK = PTHREAD_MUTEX_ERRORCHECK_NP,
> +  PTHREAD_MUTEX_DEFAULT = PTHREAD_MUTEX_NORMAL
> +};
> +enum
> +{
> +  PTHREAD_MUTEX_STALLED,
> +  PTHREAD_MUTEX_STALLED_NP = PTHREAD_MUTEX_STALLED,
> +  PTHREAD_MUTEX_ROBUST,
> +  PTHREAD_MUTEX_ROBUST_NP = PTHREAD_MUTEX_ROBUST
> +};
> +enum
> +{
> +  PTHREAD_PRIO_NONE,
> +  PTHREAD_PRIO_INHERIT,
> +  PTHREAD_PRIO_PROTECT
> +};
> +enum
> +{
> +  PTHREAD_RWLOCK_PREFER_READER_NP,
> +  PTHREAD_RWLOCK_PREFER_WRITER_NP,
> +  PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP,
> +  PTHREAD_RWLOCK_DEFAULT_NP = PTHREAD_RWLOCK_PREFER_READER_NP
> +};
> +enum
> +{
> +  PTHREAD_INHERIT_SCHED,
> +  PTHREAD_EXPLICIT_SCHED
> +};
> +enum
> +{
> +  PTHREAD_SCOPE_SYSTEM,
> +  PTHREAD_SCOPE_PROCESS
> +};
> +enum
> +{
> +  PTHREAD_PROCESS_PRIVATE,
> +  PTHREAD_PROCESS_SHARED
> +};
> +struct _pthread_cleanup_buffer
> +{
> +  void (*__routine) (void *);
> +  void *__arg;
> +  int __canceltype;
> +  struct _pthread_cleanup_buffer *__prev;
> +};
> +enum
> +{
> +  PTHREAD_CANCEL_ENABLE,
> +  PTHREAD_CANCEL_DISABLE
> +};
> +enum
> +{
> +  PTHREAD_CANCEL_DEFERRED,
> +  PTHREAD_CANCEL_ASYNCHRONOUS
> +};
> +extern int pthread_create (pthread_t *__restrict __newthread,
> +      const pthread_attr_t *__restrict __attr,
> +      void *(*__start_routine) (void *),
> +      void *__restrict __arg) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 3)));
> +extern void pthread_exit (void *__retval) __attribute__ ((__noreturn__));
> +extern int pthread_join (pthread_t __th, void **__thread_return);
> +extern int pthread_detach (pthread_t __th) __attribute__ ((__nothrow__ , __leaf__));
> +extern pthread_t pthread_self (void) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
> +extern int pthread_equal (pthread_t __thread1, pthread_t __thread2)
> +  __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
> +extern int pthread_attr_init (pthread_attr_t *__attr) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_attr_destroy (pthread_attr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_attr_getdetachstate (const pthread_attr_t *__attr,
> +     int *__detachstate)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_attr_setdetachstate (pthread_attr_t *__attr,
> +     int __detachstate)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_attr_getguardsize (const pthread_attr_t *__attr,
> +          size_t *__guardsize)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_attr_setguardsize (pthread_attr_t *__attr,
> +          size_t __guardsize)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_attr_getschedparam (const pthread_attr_t *__restrict __attr,
> +           struct sched_param *__restrict __param)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_attr_setschedparam (pthread_attr_t *__restrict __attr,
> +           const struct sched_param *__restrict
> +           __param) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_attr_getschedpolicy (const pthread_attr_t *__restrict
> +     __attr, int *__restrict __policy)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_attr_setschedpolicy (pthread_attr_t *__attr, int __policy)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_attr_getinheritsched (const pthread_attr_t *__restrict
> +      __attr, int *__restrict __inherit)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_attr_setinheritsched (pthread_attr_t *__attr,
> +      int __inherit)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_attr_getscope (const pthread_attr_t *__restrict __attr,
> +      int *__restrict __scope)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_attr_setscope (pthread_attr_t *__attr, int __scope)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_attr_getstackaddr (const pthread_attr_t *__restrict
> +          __attr, void **__restrict __stackaddr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2))) __attribute__ ((__deprecated__));
> +extern int pthread_attr_setstackaddr (pthread_attr_t *__attr,
> +          void *__stackaddr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1))) __attribute__ ((__deprecated__));
> +extern int pthread_attr_getstacksize (const pthread_attr_t *__restrict
> +          __attr, size_t *__restrict __stacksize)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_attr_setstacksize (pthread_attr_t *__attr,
> +          size_t __stacksize)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_attr_getstack (const pthread_attr_t *__restrict __attr,
> +      void **__restrict __stackaddr,
> +      size_t *__restrict __stacksize)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2, 3)));
> +extern int pthread_attr_setstack (pthread_attr_t *__attr, void *__stackaddr,
> +      size_t __stacksize) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_setschedparam (pthread_t __target_thread, int __policy,
> +      const struct sched_param *__param)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (3)));
> +extern int pthread_getschedparam (pthread_t __target_thread,
> +      int *__restrict __policy,
> +      struct sched_param *__restrict __param)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 3)));
> +extern int pthread_setschedprio (pthread_t __target_thread, int __prio)
> +     __attribute__ ((__nothrow__ , __leaf__));
> +extern int pthread_once (pthread_once_t *__once_control,
> +    void (*__init_routine) (void)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_setcancelstate (int __state, int *__oldstate);
> +extern int pthread_setcanceltype (int __type, int *__oldtype);
> +extern int pthread_cancel (pthread_t __th);
> +extern void pthread_testcancel (void);
> +typedef struct
> +{
> +  struct
> +  {
> +    __jmp_buf __cancel_jmp_buf;
> +    int __mask_was_saved;
> +  } __cancel_jmp_buf[1];
> +  void *__pad[4];
> +} __pthread_unwind_buf_t __attribute__ ((__aligned__));
> +struct __pthread_cleanup_frame
> +{
> +  void (*__cancel_routine) (void *);
> +  void *__cancel_arg;
> +  int __do_it;
> +  int __cancel_type;
> +};
> +extern void __pthread_register_cancel (__pthread_unwind_buf_t *__buf)
> +     ;
> +extern void __pthread_unregister_cancel (__pthread_unwind_buf_t *__buf)
> +  ;
> +extern void __pthread_unwind_next (__pthread_unwind_buf_t *__buf)
> +     __attribute__ ((__noreturn__))
> +     __attribute__ ((__weak__))
> +     ;
> +struct __jmp_buf_tag;
> +extern int __sigsetjmp (struct __jmp_buf_tag *__env, int __savemask) __attribute__ ((__nothrow__));
> +extern int pthread_mutex_init (pthread_mutex_t *__mutex,
> +          const pthread_mutexattr_t *__mutexattr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutex_destroy (pthread_mutex_t *__mutex)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutex_trylock (pthread_mutex_t *__mutex)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutex_lock (pthread_mutex_t *__mutex)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutex_timedlock (pthread_mutex_t *__restrict __mutex,
> +        const struct timespec *__restrict
> +        __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_mutex_unlock (pthread_mutex_t *__mutex)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutex_getprioceiling (const pthread_mutex_t *
> +      __restrict __mutex,
> +      int *__restrict __prioceiling)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_mutex_setprioceiling (pthread_mutex_t *__restrict __mutex,
> +      int __prioceiling,
> +      int *__restrict __old_ceiling)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 3)));
> +extern int pthread_mutex_consistent (pthread_mutex_t *__mutex)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutexattr_init (pthread_mutexattr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutexattr_destroy (pthread_mutexattr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutexattr_getpshared (const pthread_mutexattr_t *
> +      __restrict __attr,
> +      int *__restrict __pshared)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_mutexattr_setpshared (pthread_mutexattr_t *__attr,
> +      int __pshared)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutexattr_gettype (const pthread_mutexattr_t *__restrict
> +          __attr, int *__restrict __kind)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_mutexattr_settype (pthread_mutexattr_t *__attr, int __kind)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutexattr_getprotocol (const pthread_mutexattr_t *
> +       __restrict __attr,
> +       int *__restrict __protocol)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_mutexattr_setprotocol (pthread_mutexattr_t *__attr,
> +       int __protocol)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutexattr_getprioceiling (const pthread_mutexattr_t *
> +          __restrict __attr,
> +          int *__restrict __prioceiling)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_mutexattr_setprioceiling (pthread_mutexattr_t *__attr,
> +          int __prioceiling)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_mutexattr_getrobust (const pthread_mutexattr_t *__attr,
> +     int *__robustness)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_mutexattr_setrobust (pthread_mutexattr_t *__attr,
> +     int __robustness)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlock_init (pthread_rwlock_t *__restrict __rwlock,
> +    const pthread_rwlockattr_t *__restrict
> +    __attr) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlock_destroy (pthread_rwlock_t *__rwlock)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlock_rdlock (pthread_rwlock_t *__rwlock)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlock_tryrdlock (pthread_rwlock_t *__rwlock)
> +  __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlock_timedrdlock (pthread_rwlock_t *__restrict __rwlock,
> +           const struct timespec *__restrict
> +           __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_rwlock_wrlock (pthread_rwlock_t *__rwlock)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlock_trywrlock (pthread_rwlock_t *__rwlock)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlock_timedwrlock (pthread_rwlock_t *__restrict __rwlock,
> +           const struct timespec *__restrict
> +           __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_rwlock_unlock (pthread_rwlock_t *__rwlock)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlockattr_init (pthread_rwlockattr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlockattr_destroy (pthread_rwlockattr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlockattr_getpshared (const pthread_rwlockattr_t *
> +       __restrict __attr,
> +       int *__restrict __pshared)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_rwlockattr_setpshared (pthread_rwlockattr_t *__attr,
> +       int __pshared)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_rwlockattr_getkind_np (const pthread_rwlockattr_t *
> +       __restrict __attr,
> +       int *__restrict __pref)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_rwlockattr_setkind_np (pthread_rwlockattr_t *__attr,
> +       int __pref) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_cond_init (pthread_cond_t *__restrict __cond,
> +         const pthread_condattr_t *__restrict __cond_attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_cond_destroy (pthread_cond_t *__cond)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_cond_signal (pthread_cond_t *__cond)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_cond_broadcast (pthread_cond_t *__cond)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_cond_wait (pthread_cond_t *__restrict __cond,
> +         pthread_mutex_t *__restrict __mutex)
> +     __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_cond_timedwait (pthread_cond_t *__restrict __cond,
> +       pthread_mutex_t *__restrict __mutex,
> +       const struct timespec *__restrict __abstime)
> +     __attribute__ ((__nonnull__ (1, 2, 3)));
> +extern int pthread_condattr_init (pthread_condattr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_condattr_destroy (pthread_condattr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_condattr_getpshared (const pthread_condattr_t *
> +     __restrict __attr,
> +     int *__restrict __pshared)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_condattr_setpshared (pthread_condattr_t *__attr,
> +     int __pshared) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_condattr_getclock (const pthread_condattr_t *
> +          __restrict __attr,
> +          __clockid_t *__restrict __clock_id)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_condattr_setclock (pthread_condattr_t *__attr,
> +          __clockid_t __clock_id)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_spin_init (pthread_spinlock_t *__lock, int __pshared)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_spin_destroy (pthread_spinlock_t *__lock)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_spin_lock (pthread_spinlock_t *__lock)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_spin_trylock (pthread_spinlock_t *__lock)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_spin_unlock (pthread_spinlock_t *__lock)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_barrier_init (pthread_barrier_t *__restrict __barrier,
> +     const pthread_barrierattr_t *__restrict
> +     __attr, unsigned int __count)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_barrier_destroy (pthread_barrier_t *__barrier)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_barrier_wait (pthread_barrier_t *__barrier)
> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_barrierattr_init (pthread_barrierattr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_barrierattr_destroy (pthread_barrierattr_t *__attr)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_barrierattr_getpshared (const pthread_barrierattr_t *
> +        __restrict __attr,
> +        int *__restrict __pshared)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
> +extern int pthread_barrierattr_setpshared (pthread_barrierattr_t *__attr,
> +        int __pshared)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_key_create (pthread_key_t *__key,
> +          void (*__destr_function) (void *))
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
> +extern int pthread_key_delete (pthread_key_t __key) __attribute__ ((__nothrow__ , __leaf__));
> +extern void *pthread_getspecific (pthread_key_t __key) __attribute__ ((__nothrow__ , __leaf__));
> +extern int pthread_setspecific (pthread_key_t __key,
> +    const void *__pointer) __attribute__ ((__nothrow__ , __leaf__)) ;
> +extern int pthread_getcpuclockid (pthread_t __thread_id,
> +      __clockid_t *__clock_id)
> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2)));
> +extern int pthread_atfork (void (*__prepare) (void),
> +      void (*__parent) (void),
> +      void (*__child) (void)) __attribute__ ((__nothrow__ , __leaf__));
> +extern __inline __attribute__ ((__gnu_inline__)) int
> +__attribute__ ((__nothrow__ , __leaf__)) pthread_equal (pthread_t __thread1, pthread_t __thread2)
> +{
> +  return __thread1 == __thread2;
> +}
> +void cleanup_fn(void *mutex);
> +typedef struct {
> +  size_t progress;
> +  size_t total;
> +  pthread_mutex_t mutex;
> +  pthread_cond_t cond;
> +  double min_wait;
> +} dmnsn_future;
> +void
> +dmnsn_future_wait(dmnsn_future *future, double progress)
> +{
> +  pthread_mutex_lock(&future->mutex);
> +  while ((double)future->progress/future->total < progress) {
> +    if (progress < future->min_wait) {
> +      future->min_wait = progress;
> +    }
> +    do { __pthread_unwind_buf_t __cancel_buf; void (*__cancel_routine) (void *) = (cleanup_fn); void *__cancel_arg = (&future->mutex); int __not_first_call = __sigsetjmp ((struct __jmp_buf_tag *) (void *) __cancel_buf.__cancel_jmp_buf, 0); if (__builtin_expect ((__not_first_call), 0)) { __cancel_routine (__cancel_arg); __pthread_unwind_next (&__cancel_buf); } __pthread_register_cancel (&__cancel_buf); do {;
> +    pthread_cond_wait(&future->cond, &future->mutex);
> +    do { } while (0); } while (0); __pthread_unregister_cancel (&__cancel_buf); if (0) __cancel_routine (__cancel_arg); } while (0);
> +  }
> +  pthread_mutex_unlock(&future->mutex);
> +}
>
Richard Biener Feb. 28, 2018, 10:48 a.m. UTC | #2
On Wed, Feb 28, 2018 at 11:43 AM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Wed, Feb 28, 2018 at 1:16 AM, Jeff Law <law@redhat.com> wrote:
>> Richi, you worked on 57147 which touches on the issues here.  Your
>> thoughts would be greatly appreciated.
>>
>>
>> So 61118 is one of several bugs related to the clobbered-by-longjmp warning.
>>
>> In 61118 we are unable to coalesce all the objects in the key
>> partitions.  To remove the relevant PHIs we have to create two
>> assignments to the key pseudos.
>>
>> Pseudos with more than one assignment are subject to the
>> clobbered-by-longjmp analysis:
>>
>> /* True if register REGNO was alive at a place where `setjmp' was
>>    called and was set more than once or is an argument.  Such regs may
>>    be clobbered by `longjmp'.  */
>>
>> static bool
>> regno_clobbered_at_setjmp (bitmap setjmp_crosses, int regno)
>> {
>>   /* There appear to be cases where some local vars never reach the
>>      backend but have bogus regnos.  */
>>   if (regno >= max_reg_num ())
>>     return false;
>>
>>   return ((REG_N_SETS (regno) > 1
>>            || REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN
>> (cfun)),
>>                                regno))
>>           && REGNO_REG_SET_P (setjmp_crosses, regno));
>> }
>>
>>
>> The fact that no path sets the pseudo more than once is not considered.
>> If there is more than one static set of the pseudo, then it is
>> considered for possible warning.
>>
>> --
>>
>>
>> I looked at the propagations which led to the inability to coalesce.
>> They all seemed valid to me.  We have always allowed copy propagation to
>> replace one pseudo with another as long as neither has
>> SSA_NAME_USED_IN_ABNORMAL_PHI set.
>>
>> We have a PHI like
>>
>> x1(ab) = (x0, x3 (ab))
>>
>> x0 is not marked as abnormal because the edge isn't abnormal and thus we
>> can propagate into the x0 argument of the PHI.  This is consistent with
>> behavior since, well, forever.   We propagate a value for x0 resulting
>> in something like
>>
>> x1(ab) = (y0, x3 (ab))
>>
>>
>> Where y0 is still live across the PHI.  Thus the partition for x1/x3,
>> etc. conflicts with the partition for y0 and they cannot be coalesced.
>> This leads to the multiple assignments to the pseudo for the x1/x3
>> partition.  I briefly looked at marking all the PHI arguments as abnormal
>> when the destination is abnormal, but it just doesn't seem right.
>>
>> Anyway, I'd already been looking at 21161 and was aware that the CFGs
>> we're building in the presence of setjmp/longjmp were slightly inaccurate.
>>
>> In particular, a longjmp returns to the point immediately after the
>> setjmp, not to the setjmp itself.  But our CFG building has the edge
>> from the abnormal dispatcher going to the block containing the setjmp call.
>
> Yeah...  for SJLJ EH we get this right via __builtin_setjmp_receiver.
>
>> This creates unnecessary irreducible loops.  It turns out that if we fix
>> the tree CFG, then lifetimes become more accurate (and more
>> constrained).  The more constrained, more accurate lifetime information
>> is enough to allow things to coalesce the way we want and everything for
>> 61118 just works.
>
> Sounds good.

Oh - and to mention it, we have one long-standing issue left over from
my trying to fix things here, which is that RTL doesn't have all those
abnormal edges for setjmp and friends, because we throw away abnormal
edges during RTL expansion and expect to recompute them...

IIRC there's a bug report tracking this issue, and the fix is to stop
removing abnormal edges and instead transition them to RTL:

          /* At the moment not all abnormal edges match the RTL
             representation.  It is safe to remove them here as
             find_many_sub_basic_blocks will rediscover them.
             In the future we should get this fixed properly.  */
          if ((e->flags & EDGE_ABNORMAL)
              && !(e->flags & EDGE_SIBCALL))
            remove_edge (e);

not sure if we can use an edge flag to mark those we want to preserve.
But I don't understand the comment very well - why would any abnormal
edges not "match" the RTL representation?

>> It's actually pretty easy to fix the CFG.  We just need to recognize
>> that a "returns twice" function returns not to the call, but to the
>> point immediately after the call.  So if we have a call to a returns
>> twice function that ends a block with a single successor, when we wire
>> up the abnormal dispatcher, we target the single successor rather than
>> the block containing the returns-twice call.
>
> Hmm, I think you need to check whether the successor has a single
> predecessor, not whether we have a single successor (we always have
> that unless setjmp also throws).  If you fix that you keep the CFG
> "incorrect" if there are multiple predecessors so I think in addition
> to properly creating the edges you have to work on the BB building
> part to ensure that there's a single-predecessor block after
> returns-twice function calls.  Note that currently we force returns-twice
> to be the first (and only) stmt of a block -- your fix would relax this,
> returns-twice no longer needs to start a new BB.
>
> -               handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> -                                      &ab_edge_call, false);
> +               {
> +                 bool target_after_setjmp = false;
> +
> +                 /* If the returns twice statement looks like a setjmp
> +                    call at the end of a block with a single successor
> +                    then we want the edge from the dispatcher to target
> +                    that single successor.  That more accurately reflects
> +                    actual control flow.  The more accurate CFG also
> +                    results in fewer false positive warnings.  */
> +                 if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
> +                     && gimple_call_fndecl (call_stmt)
> +                     && setjmp_call_p (gimple_call_fndecl (call_stmt))
> +                     && single_succ_p (bb))
> +                   target_after_setjmp = true;
> +                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> +                                        &ab_edge_call, false,
> +                                        target_after_setjmp);
> +               }
>
> I don't exactly get the hops you jump through here -- I think it's
> better to split the returns-twice (always last stmt of a block after
> the fixing) and the setjmp-receiver (always first stmt of a block) cases.
> So, remove the handling of returns-twice from the above case and
> handle returns-twice via
>
>   gimple *last = last_stmt (bb);
>   if (last && ...)
>
> also handle all returns-twice calls this way, not only setjmp_call_p.
>
>>
>> This compromises the test gcc.dg/torture/57147-2.c
>>
>>
>> Prior to this change the CFG looks like
>>
>>      2
>>     / \
>>    3<->4
>>    |
>>    R
>>
>> Where block #3 contains the setjmp.  The edges 2->4, 3->4 and 4->3 are
>> abnormals.  Block #4 is the abnormal dispatcher.
>>
>> Eventually we remove the edge from 2->3 because the last statement in
>> block #2 is to a non-returning function call.  But we leave the abnormal
>> edge 2->4 (on purpose) resulting in:
>>
>>
>>      2
>>      |
>>   +->4
>>   |  |
>>   +--3
>>      |
>>      R
>>
>> The test then proceeds to verify there is a call to setjmp in the
>> resulting .optimized dump -- which there is because block #3 remains
>> reachable.
>>
>>
>> With this change the CFG looks like:
>>
>>
>>
>>      2
>>     / \
>>    3-->4
>>    |  /
>>    | /
>>    |/
>>    R
>>
>>
>> Where the edges 2->4 and 3->4 and 4->R are abnormals.  Block #4 is still
>> the dispatcher and the setjmp is still in block #3.
>>
>> We realize block #2 ends with a call to a noreturn function and again we
>> remove the 2->3 edge.  That makes block #3 unreachable and it gets
>> removed, resulting in:
>>
>>     2
>>     |
>>     4
>>     |
>>     R
>>
>> Where 2->4 and 4->R are still abnormal edges.  With bb3 becoming
>> unreachable, the setjmp is unreachable and gets removed thus breaking
>> the scan part of the test.
>>
>>
>>
>>
>> If we review the source of the test:
>>
>>
>> struct __jmp_buf_tag {};
>> typedef struct __jmp_buf_tag jmp_buf[1];
>> extern int _setjmp (struct __jmp_buf_tag __env[1]);
>>
>> jmp_buf g_return_jmp_buf;
>>
>> void SetNaClSwitchExpectations (void)
>> {
>>   __builtin_longjmp (g_return_jmp_buf, 1);
>> }
>> void TestSyscall(void)
>> {
>>   SetNaClSwitchExpectations();
>>   _setjmp (g_return_jmp_buf);
>> }
>>
>>
>> We can easily see that the call to _setjmp can never be reached given
>> that we consider the longjmp call as non-returning.  So AFAICT
>> everything is as should be expected.  I think the right thing is to just
>> remove this compromised test.
>
> I agree.  Bonus points if you look at PR57147 and see if the testcase
> was misreduced (maybe it was just for an ICE so we can keep it
> and just remove the dump scanning?)
>
> Richard.
>
>> --
>>
>>
>>
>> The regression test from pr61118 disables -ftracer, as -ftracer creates
>> an additional assignment to key objects which gets carried through into
>> RTL, thus triggering the problem all over again.  My RTL fixes for 21161
>> do not fix this.  So if the patch is accepted I propose we keep 61118
>> open, but without the gcc-8 regression marker.  It's still a deficiency
>> that -ftracer can trigger a bogus clobbered-by-longjmp warning.
>>
>> This has been bootstrapped and regression tested on x86_64.
>>
>> Thoughts?  OK for the trunk?
>>
>> Jeff
>>
>>         PR middle-end/61118
>>         * tree-cfg.c (handle_abnormal_edges): Accept new argument.
>>         (make_edges): Callers of handle_abnormal_edges changed.
>>
>>         * gcc.dg/torture/pr61118.c: New test.
>>         * gcc.dg/torture/pr57147-2.c: Remove compromised test.
>>
>>
>> diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
>> index b87e48d..551195a 100644
>> --- a/gcc/tree-cfg.c
>> +++ b/gcc/tree-cfg.c
>> @@ -35,6 +35,7 @@ along with GCC; see the file COPYING3.  If not see
>>  #include "fold-const.h"
>>  #include "trans-mem.h"
>>  #include "stor-layout.h"
>> +#include "calls.h"
>>  #include "print-tree.h"
>>  #include "cfganal.h"
>>  #include "gimple-fold.h"
>> @@ -776,13 +777,22 @@ get_abnormal_succ_dispatcher (basic_block bb)
>>  static void
>>  handle_abnormal_edges (basic_block *dispatcher_bbs,
>>                        basic_block for_bb, int *bb_to_omp_idx,
>> -                      auto_vec<basic_block> *bbs, bool computed_goto)
>> +                      auto_vec<basic_block> *bbs, bool computed_goto,
>> +                      bool target_after_setjmp)
>>  {
>>    basic_block *dispatcher = dispatcher_bbs + (computed_goto ? 1 : 0);
>>    unsigned int idx = 0;
>> -  basic_block bb;
>> +  basic_block bb, target_bb;
>>    bool inner = false;
>>
>> +  /* Determine the block the abnormal dispatcher will transfer
>> +     control to.  It may be FOR_BB, or in some cases it may be the
>> +     single successor of FOR_BB.  */
>> +  if (target_after_setjmp)
>> +    target_bb = single_succ (for_bb);
>> +  else
>> +    target_bb = for_bb;
>> +
>>    if (bb_to_omp_idx)
>>      {
>>        dispatcher = dispatcher_bbs + 2 * bb_to_omp_idx[for_bb->index];
>> @@ -878,7 +888,7 @@ handle_abnormal_edges (basic_block *dispatcher_bbs,
>>         }
>>      }
>>
>> -  make_edge (*dispatcher, for_bb, EDGE_ABNORMAL);
>> +  make_edge (*dispatcher, target_bb, EDGE_ABNORMAL);
>>  }
>>
>>  /* Creates outgoing edges for BB.  Returns 1 when it ends with an
>> @@ -1075,11 +1085,11 @@ make_edges (void)
>>                  potential target for a computed goto or a non-local goto.  */
>>               if (FORCED_LABEL (target))
>>                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
>> -                                      &ab_edge_goto, true);
>> +                                      &ab_edge_goto, true, false);
>>               if (DECL_NONLOCAL (target))
>>                 {
>>                   handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
>> -                                        &ab_edge_call, false);
>> +                                        &ab_edge_call, false, false);
>>                   break;
>>                 }
>>             }
>> @@ -1094,8 +1104,24 @@ make_edges (void)
>>                   && ((gimple_call_flags (call_stmt) & ECF_RETURNS_TWICE)
>>                       || gimple_call_builtin_p (call_stmt,
>>                                                 BUILT_IN_SETJMP_RECEIVER)))
>> -               handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
>> -                                      &ab_edge_call, false);
>> +               {
>> +                 bool target_after_setjmp = false;
>> +
>> +                 /* If the returns twice statement looks like a setjmp
>> +                    call at the end of a block with a single successor
>> +                    then we want the edge from the dispatcher to target
>> +                    that single successor.  That more accurately reflects
>> +                    actual control flow.  The more accurate CFG also
>> +                    results in fewer false positive warnings.  */
>> +                 if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
>> +                     && gimple_call_fndecl (call_stmt)
>> +                     && setjmp_call_p (gimple_call_fndecl (call_stmt))
>> +                     && single_succ_p (bb))
>> +                   target_after_setjmp = true;
>> +                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
>> +                                        &ab_edge_call, false,
>> +                                        target_after_setjmp);
>> +               }
>>             }
>>         }
>>
>> diff --git a/gcc/testsuite/gcc.dg/torture/pr57147-2.c b/gcc/testsuite/gcc.dg/torture/pr57147-2.c
>> deleted file mode 100644
>> index fc5fb39..0000000
>> --- a/gcc/testsuite/gcc.dg/torture/pr57147-2.c
>> +++ /dev/null
>> @@ -1,22 +0,0 @@
>> -/* { dg-do compile } */
>> -/* { dg-options "-fdump-tree-optimized" } */
>> -/* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } { "" } } */
>> -/* { dg-require-effective-target indirect_jumps } */
>> -
>> -struct __jmp_buf_tag {};
>> -typedef struct __jmp_buf_tag jmp_buf[1];
>> -extern int _setjmp (struct __jmp_buf_tag __env[1]);
>> -
>> -jmp_buf g_return_jmp_buf;
>> -
>> -void SetNaClSwitchExpectations (void)
>> -{
>> -  __builtin_longjmp (g_return_jmp_buf, 1);
>> -}
>> -void TestSyscall(void)
>> -{
>> -  SetNaClSwitchExpectations();
>> -  _setjmp (g_return_jmp_buf);
>> -}
>> -
>> -/* { dg-final { scan-tree-dump "setjmp" "optimized" } } */
>> diff --git a/gcc/testsuite/gcc.dg/torture/pr61118.c b/gcc/testsuite/gcc.dg/torture/pr61118.c
>> new file mode 100644
>> index 0000000..12be892
>> --- /dev/null
>> +++ b/gcc/testsuite/gcc.dg/torture/pr61118.c
>> @@ -0,0 +1,652 @@
>> +/* { dg-options "-Wextra -fno-tracer" } */
>> +typedef unsigned char __u_char;
>> +typedef unsigned short int __u_short;
>> +typedef unsigned int __u_int;
>> +typedef unsigned long int __u_long;
>> +typedef signed char __int8_t;
>> +typedef unsigned char __uint8_t;
>> +typedef signed short int __int16_t;
>> +typedef unsigned short int __uint16_t;
>> +typedef signed int __int32_t;
>> +typedef unsigned int __uint32_t;
>> +typedef signed long int __int64_t;
>> +typedef unsigned long int __uint64_t;
>> +typedef long int __quad_t;
>> +typedef unsigned long int __u_quad_t;
>> +typedef unsigned long int __dev_t;
>> +typedef unsigned int __uid_t;
>> +typedef unsigned int __gid_t;
>> +typedef unsigned long int __ino_t;
>> +typedef unsigned long int __ino64_t;
>> +typedef unsigned int __mode_t;
>> +typedef unsigned long int __nlink_t;
>> +typedef long int __off_t;
>> +typedef long int __off64_t;
>> +typedef int __pid_t;
>> +typedef struct { int __val[2]; } __fsid_t;
>> +typedef long int __clock_t;
>> +typedef unsigned long int __rlim_t;
>> +typedef unsigned long int __rlim64_t;
>> +typedef unsigned int __id_t;
>> +typedef long int __time_t;
>> +typedef unsigned int __useconds_t;
>> +typedef long int __suseconds_t;
>> +typedef int __daddr_t;
>> +typedef int __key_t;
>> +typedef int __clockid_t;
>> +typedef void * __timer_t;
>> +typedef long int __blksize_t;
>> +typedef long int __blkcnt_t;
>> +typedef long int __blkcnt64_t;
>> +typedef unsigned long int __fsblkcnt_t;
>> +typedef unsigned long int __fsblkcnt64_t;
>> +typedef unsigned long int __fsfilcnt_t;
>> +typedef unsigned long int __fsfilcnt64_t;
>> +typedef long int __fsword_t;
>> +typedef long int __ssize_t;
>> +typedef long int __syscall_slong_t;
>> +typedef unsigned long int __syscall_ulong_t;
>> +typedef __off64_t __loff_t;
>> +typedef __quad_t *__qaddr_t;
>> +typedef char *__caddr_t;
>> +typedef long int __intptr_t;
>> +typedef unsigned int __socklen_t;
>> +static __inline unsigned int
>> +__bswap_32 (unsigned int __bsx)
>> +{
>> +  return __builtin_bswap32 (__bsx);
>> +}
>> +static __inline __uint64_t
>> +__bswap_64 (__uint64_t __bsx)
>> +{
>> +  return __builtin_bswap64 (__bsx);
>> +}
>> +typedef long unsigned int size_t;
>> +typedef __time_t time_t;
>> +struct timespec
>> +  {
>> +    __time_t tv_sec;
>> +    __syscall_slong_t tv_nsec;
>> +  };
>> +typedef __pid_t pid_t;
>> +struct sched_param
>> +  {
>> +    int __sched_priority;
>> +  };
>> +struct __sched_param
>> +  {
>> +    int __sched_priority;
>> +  };
>> +typedef unsigned long int __cpu_mask;
>> +typedef struct
>> +{
>> +  __cpu_mask __bits[1024 / (8 * sizeof (__cpu_mask))];
>> +} cpu_set_t;
>> +extern int __sched_cpucount (size_t __setsize, const cpu_set_t *__setp)
>> +  __attribute__ ((__nothrow__ , __leaf__));
>> +extern cpu_set_t *__sched_cpualloc (size_t __count) __attribute__ ((__nothrow__ , __leaf__)) ;
>> +extern void __sched_cpufree (cpu_set_t *__set) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int sched_setparam (__pid_t __pid, const struct sched_param *__param)
>> +     __attribute__ ((__nothrow__ , __leaf__));
>> +extern int sched_getparam (__pid_t __pid, struct sched_param *__param) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int sched_setscheduler (__pid_t __pid, int __policy,
>> +          const struct sched_param *__param) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int sched_getscheduler (__pid_t __pid) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int sched_yield (void) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int sched_get_priority_max (int __algorithm) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int sched_get_priority_min (int __algorithm) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int sched_rr_get_interval (__pid_t __pid, struct timespec *__t) __attribute__ ((__nothrow__ , __leaf__));
>> +typedef __clock_t clock_t;
>> +typedef __clockid_t clockid_t;
>> +typedef __timer_t timer_t;
>> +struct tm
>> +{
>> +  int tm_sec;
>> +  int tm_min;
>> +  int tm_hour;
>> +  int tm_mday;
>> +  int tm_mon;
>> +  int tm_year;
>> +  int tm_wday;
>> +  int tm_yday;
>> +  int tm_isdst;
>> +  long int tm_gmtoff;
>> +  const char *tm_zone;
>> +};
>> +struct itimerspec
>> +  {
>> +    struct timespec it_interval;
>> +    struct timespec it_value;
>> +  };
>> +struct sigevent;
>> +extern clock_t clock (void) __attribute__ ((__nothrow__ , __leaf__));
>> +extern time_t time (time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
>> +extern double difftime (time_t __time1, time_t __time0)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
>> +extern time_t mktime (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
>> +extern size_t strftime (char *__restrict __s, size_t __maxsize,
>> +   const char *__restrict __format,
>> +   const struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
>> +typedef struct __locale_struct
>> +{
>> +  struct __locale_data *__locales[13];
>> +  const unsigned short int *__ctype_b;
>> +  const int *__ctype_tolower;
>> +  const int *__ctype_toupper;
>> +  const char *__names[13];
>> +} *__locale_t;
>> +typedef __locale_t locale_t;
>> +extern size_t strftime_l (char *__restrict __s, size_t __maxsize,
>> +     const char *__restrict __format,
>> +     const struct tm *__restrict __tp,
>> +     __locale_t __loc) __attribute__ ((__nothrow__ , __leaf__));
>> +extern struct tm *gmtime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
>> +extern struct tm *localtime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
>> +extern struct tm *gmtime_r (const time_t *__restrict __timer,
>> +       struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
>> +extern struct tm *localtime_r (const time_t *__restrict __timer,
>> +          struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
>> +extern char *asctime (const struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
>> +extern char *ctime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
>> +extern char *asctime_r (const struct tm *__restrict __tp,
>> +   char *__restrict __buf) __attribute__ ((__nothrow__ , __leaf__));
>> +extern char *ctime_r (const time_t *__restrict __timer,
>> +        char *__restrict __buf) __attribute__ ((__nothrow__ , __leaf__));
>> +extern char *__tzname[2];
>> +extern int __daylight;
>> +extern long int __timezone;
>> +extern char *tzname[2];
>> +extern void tzset (void) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int daylight;
>> +extern long int timezone;
>> +extern int stime (const time_t *__when) __attribute__ ((__nothrow__ , __leaf__));
>> +extern time_t timegm (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
>> +extern time_t timelocal (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int dysize (int __year) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
>> +extern int nanosleep (const struct timespec *__requested_time,
>> +        struct timespec *__remaining);
>> +extern int clock_getres (clockid_t __clock_id, struct timespec *__res) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int clock_gettime (clockid_t __clock_id, struct timespec *__tp) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int clock_settime (clockid_t __clock_id, const struct timespec *__tp)
>> +     __attribute__ ((__nothrow__ , __leaf__));
>> +extern int clock_nanosleep (clockid_t __clock_id, int __flags,
>> +       const struct timespec *__req,
>> +       struct timespec *__rem);
>> +extern int clock_getcpuclockid (pid_t __pid, clockid_t *__clock_id) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int timer_create (clockid_t __clock_id,
>> +    struct sigevent *__restrict __evp,
>> +    timer_t *__restrict __timerid) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int timer_delete (timer_t __timerid) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int timer_settime (timer_t __timerid, int __flags,
>> +     const struct itimerspec *__restrict __value,
>> +     struct itimerspec *__restrict __ovalue) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int timer_gettime (timer_t __timerid, struct itimerspec *__value)
>> +     __attribute__ ((__nothrow__ , __leaf__));
>> +extern int timer_getoverrun (timer_t __timerid) __attribute__ ((__nothrow__ , __leaf__));
>> +typedef unsigned long int pthread_t;
>> +union pthread_attr_t
>> +{
>> +  char __size[56];
>> +  long int __align;
>> +};
>> +typedef union pthread_attr_t pthread_attr_t;
>> +typedef struct __pthread_internal_list
>> +{
>> +  struct __pthread_internal_list *__prev;
>> +  struct __pthread_internal_list *__next;
>> +} __pthread_list_t;
>> +typedef union
>> +{
>> +  struct __pthread_mutex_s
>> +  {
>> +    int __lock;
>> +    unsigned int __count;
>> +    int __owner;
>> +    unsigned int __nusers;
>> +    int __kind;
>> +    short __spins;
>> +    short __elision;
>> +    __pthread_list_t __list;
>> +  } __data;
>> +  char __size[40];
>> +  long int __align;
>> +} pthread_mutex_t;
>> +typedef union
>> +{
>> +  char __size[4];
>> +  int __align;
>> +} pthread_mutexattr_t;
>> +typedef union
>> +{
>> +  struct
>> +  {
>> +    int __lock;
>> +    unsigned int __futex;
>> +    __extension__ unsigned long long int __total_seq;
>> +    __extension__ unsigned long long int __wakeup_seq;
>> +    __extension__ unsigned long long int __woken_seq;
>> +    void *__mutex;
>> +    unsigned int __nwaiters;
>> +    unsigned int __broadcast_seq;
>> +  } __data;
>> +  char __size[48];
>> +  __extension__ long long int __align;
>> +} pthread_cond_t;
>> +typedef union
>> +{
>> +  char __size[4];
>> +  int __align;
>> +} pthread_condattr_t;
>> +typedef unsigned int pthread_key_t;
>> +typedef int pthread_once_t;
>> +typedef union
>> +{
>> +  struct
>> +  {
>> +    int __lock;
>> +    unsigned int __nr_readers;
>> +    unsigned int __readers_wakeup;
>> +    unsigned int __writer_wakeup;
>> +    unsigned int __nr_readers_queued;
>> +    unsigned int __nr_writers_queued;
>> +    int __writer;
>> +    int __shared;
>> +    unsigned long int __pad1;
>> +    unsigned long int __pad2;
>> +    unsigned int __flags;
>> +  } __data;
>> +  char __size[56];
>> +  long int __align;
>> +} pthread_rwlock_t;
>> +typedef union
>> +{
>> +  char __size[8];
>> +  long int __align;
>> +} pthread_rwlockattr_t;
>> +typedef volatile int pthread_spinlock_t;
>> +typedef union
>> +{
>> +  char __size[32];
>> +  long int __align;
>> +} pthread_barrier_t;
>> +typedef union
>> +{
>> +  char __size[4];
>> +  int __align;
>> +} pthread_barrierattr_t;
>> +typedef long int __jmp_buf[8];
>> +enum
>> +{
>> +  PTHREAD_CREATE_JOINABLE,
>> +  PTHREAD_CREATE_DETACHED
>> +};
>> +enum
>> +{
>> +  PTHREAD_MUTEX_TIMED_NP,
>> +  PTHREAD_MUTEX_RECURSIVE_NP,
>> +  PTHREAD_MUTEX_ERRORCHECK_NP,
>> +  PTHREAD_MUTEX_ADAPTIVE_NP
>> +  ,
>> +  PTHREAD_MUTEX_NORMAL = PTHREAD_MUTEX_TIMED_NP,
>> +  PTHREAD_MUTEX_RECURSIVE = PTHREAD_MUTEX_RECURSIVE_NP,
>> +  PTHREAD_MUTEX_ERRORCHECK = PTHREAD_MUTEX_ERRORCHECK_NP,
>> +  PTHREAD_MUTEX_DEFAULT = PTHREAD_MUTEX_NORMAL
>> +};
>> +enum
>> +{
>> +  PTHREAD_MUTEX_STALLED,
>> +  PTHREAD_MUTEX_STALLED_NP = PTHREAD_MUTEX_STALLED,
>> +  PTHREAD_MUTEX_ROBUST,
>> +  PTHREAD_MUTEX_ROBUST_NP = PTHREAD_MUTEX_ROBUST
>> +};
>> +enum
>> +{
>> +  PTHREAD_PRIO_NONE,
>> +  PTHREAD_PRIO_INHERIT,
>> +  PTHREAD_PRIO_PROTECT
>> +};
>> +enum
>> +{
>> +  PTHREAD_RWLOCK_PREFER_READER_NP,
>> +  PTHREAD_RWLOCK_PREFER_WRITER_NP,
>> +  PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP,
>> +  PTHREAD_RWLOCK_DEFAULT_NP = PTHREAD_RWLOCK_PREFER_READER_NP
>> +};
>> +enum
>> +{
>> +  PTHREAD_INHERIT_SCHED,
>> +  PTHREAD_EXPLICIT_SCHED
>> +};
>> +enum
>> +{
>> +  PTHREAD_SCOPE_SYSTEM,
>> +  PTHREAD_SCOPE_PROCESS
>> +};
>> +enum
>> +{
>> +  PTHREAD_PROCESS_PRIVATE,
>> +  PTHREAD_PROCESS_SHARED
>> +};
>> +struct _pthread_cleanup_buffer
>> +{
>> +  void (*__routine) (void *);
>> +  void *__arg;
>> +  int __canceltype;
>> +  struct _pthread_cleanup_buffer *__prev;
>> +};
>> +enum
>> +{
>> +  PTHREAD_CANCEL_ENABLE,
>> +  PTHREAD_CANCEL_DISABLE
>> +};
>> +enum
>> +{
>> +  PTHREAD_CANCEL_DEFERRED,
>> +  PTHREAD_CANCEL_ASYNCHRONOUS
>> +};
>> +extern int pthread_create (pthread_t *__restrict __newthread,
>> +      const pthread_attr_t *__restrict __attr,
>> +      void *(*__start_routine) (void *),
>> +      void *__restrict __arg) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 3)));
>> +extern void pthread_exit (void *__retval) __attribute__ ((__noreturn__));
>> +extern int pthread_join (pthread_t __th, void **__thread_return);
>> +extern int pthread_detach (pthread_t __th) __attribute__ ((__nothrow__ , __leaf__));
>> +extern pthread_t pthread_self (void) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
>> +extern int pthread_equal (pthread_t __thread1, pthread_t __thread2)
>> +  __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
>> +extern int pthread_attr_init (pthread_attr_t *__attr) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_attr_destroy (pthread_attr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_attr_getdetachstate (const pthread_attr_t *__attr,
>> +     int *__detachstate)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_attr_setdetachstate (pthread_attr_t *__attr,
>> +     int __detachstate)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_attr_getguardsize (const pthread_attr_t *__attr,
>> +          size_t *__guardsize)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_attr_setguardsize (pthread_attr_t *__attr,
>> +          size_t __guardsize)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_attr_getschedparam (const pthread_attr_t *__restrict __attr,
>> +           struct sched_param *__restrict __param)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_attr_setschedparam (pthread_attr_t *__restrict __attr,
>> +           const struct sched_param *__restrict
>> +           __param) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_attr_getschedpolicy (const pthread_attr_t *__restrict
>> +     __attr, int *__restrict __policy)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_attr_setschedpolicy (pthread_attr_t *__attr, int __policy)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_attr_getinheritsched (const pthread_attr_t *__restrict
>> +      __attr, int *__restrict __inherit)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_attr_setinheritsched (pthread_attr_t *__attr,
>> +      int __inherit)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_attr_getscope (const pthread_attr_t *__restrict __attr,
>> +      int *__restrict __scope)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_attr_setscope (pthread_attr_t *__attr, int __scope)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_attr_getstackaddr (const pthread_attr_t *__restrict
>> +          __attr, void **__restrict __stackaddr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2))) __attribute__ ((__deprecated__));
>> +extern int pthread_attr_setstackaddr (pthread_attr_t *__attr,
>> +          void *__stackaddr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1))) __attribute__ ((__deprecated__));
>> +extern int pthread_attr_getstacksize (const pthread_attr_t *__restrict
>> +          __attr, size_t *__restrict __stacksize)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_attr_setstacksize (pthread_attr_t *__attr,
>> +          size_t __stacksize)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_attr_getstack (const pthread_attr_t *__restrict __attr,
>> +      void **__restrict __stackaddr,
>> +      size_t *__restrict __stacksize)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2, 3)));
>> +extern int pthread_attr_setstack (pthread_attr_t *__attr, void *__stackaddr,
>> +      size_t __stacksize) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_setschedparam (pthread_t __target_thread, int __policy,
>> +      const struct sched_param *__param)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (3)));
>> +extern int pthread_getschedparam (pthread_t __target_thread,
>> +      int *__restrict __policy,
>> +      struct sched_param *__restrict __param)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 3)));
>> +extern int pthread_setschedprio (pthread_t __target_thread, int __prio)
>> +     __attribute__ ((__nothrow__ , __leaf__));
>> +extern int pthread_once (pthread_once_t *__once_control,
>> +    void (*__init_routine) (void)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_setcancelstate (int __state, int *__oldstate);
>> +extern int pthread_setcanceltype (int __type, int *__oldtype);
>> +extern int pthread_cancel (pthread_t __th);
>> +extern void pthread_testcancel (void);
>> +typedef struct
>> +{
>> +  struct
>> +  {
>> +    __jmp_buf __cancel_jmp_buf;
>> +    int __mask_was_saved;
>> +  } __cancel_jmp_buf[1];
>> +  void *__pad[4];
>> +} __pthread_unwind_buf_t __attribute__ ((__aligned__));
>> +struct __pthread_cleanup_frame
>> +{
>> +  void (*__cancel_routine) (void *);
>> +  void *__cancel_arg;
>> +  int __do_it;
>> +  int __cancel_type;
>> +};
>> +extern void __pthread_register_cancel (__pthread_unwind_buf_t *__buf)
>> +     ;
>> +extern void __pthread_unregister_cancel (__pthread_unwind_buf_t *__buf)
>> +  ;
>> +extern void __pthread_unwind_next (__pthread_unwind_buf_t *__buf)
>> +     __attribute__ ((__noreturn__))
>> +     __attribute__ ((__weak__))
>> +     ;
>> +struct __jmp_buf_tag;
>> +extern int __sigsetjmp (struct __jmp_buf_tag *__env, int __savemask) __attribute__ ((__nothrow__));
>> +extern int pthread_mutex_init (pthread_mutex_t *__mutex,
>> +          const pthread_mutexattr_t *__mutexattr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutex_destroy (pthread_mutex_t *__mutex)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutex_trylock (pthread_mutex_t *__mutex)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutex_lock (pthread_mutex_t *__mutex)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutex_timedlock (pthread_mutex_t *__restrict __mutex,
>> +        const struct timespec *__restrict
>> +        __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_mutex_unlock (pthread_mutex_t *__mutex)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutex_getprioceiling (const pthread_mutex_t *
>> +      __restrict __mutex,
>> +      int *__restrict __prioceiling)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_mutex_setprioceiling (pthread_mutex_t *__restrict __mutex,
>> +      int __prioceiling,
>> +      int *__restrict __old_ceiling)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 3)));
>> +extern int pthread_mutex_consistent (pthread_mutex_t *__mutex)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutexattr_init (pthread_mutexattr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutexattr_destroy (pthread_mutexattr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutexattr_getpshared (const pthread_mutexattr_t *
>> +      __restrict __attr,
>> +      int *__restrict __pshared)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_mutexattr_setpshared (pthread_mutexattr_t *__attr,
>> +      int __pshared)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutexattr_gettype (const pthread_mutexattr_t *__restrict
>> +          __attr, int *__restrict __kind)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_mutexattr_settype (pthread_mutexattr_t *__attr, int __kind)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutexattr_getprotocol (const pthread_mutexattr_t *
>> +       __restrict __attr,
>> +       int *__restrict __protocol)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_mutexattr_setprotocol (pthread_mutexattr_t *__attr,
>> +       int __protocol)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutexattr_getprioceiling (const pthread_mutexattr_t *
>> +          __restrict __attr,
>> +          int *__restrict __prioceiling)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_mutexattr_setprioceiling (pthread_mutexattr_t *__attr,
>> +          int __prioceiling)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_mutexattr_getrobust (const pthread_mutexattr_t *__attr,
>> +     int *__robustness)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_mutexattr_setrobust (pthread_mutexattr_t *__attr,
>> +     int __robustness)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlock_init (pthread_rwlock_t *__restrict __rwlock,
>> +    const pthread_rwlockattr_t *__restrict
>> +    __attr) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlock_destroy (pthread_rwlock_t *__rwlock)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlock_rdlock (pthread_rwlock_t *__rwlock)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlock_tryrdlock (pthread_rwlock_t *__rwlock)
>> +  __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlock_timedrdlock (pthread_rwlock_t *__restrict __rwlock,
>> +           const struct timespec *__restrict
>> +           __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_rwlock_wrlock (pthread_rwlock_t *__rwlock)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlock_trywrlock (pthread_rwlock_t *__rwlock)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlock_timedwrlock (pthread_rwlock_t *__restrict __rwlock,
>> +           const struct timespec *__restrict
>> +           __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_rwlock_unlock (pthread_rwlock_t *__rwlock)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlockattr_init (pthread_rwlockattr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlockattr_destroy (pthread_rwlockattr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlockattr_getpshared (const pthread_rwlockattr_t *
>> +       __restrict __attr,
>> +       int *__restrict __pshared)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_rwlockattr_setpshared (pthread_rwlockattr_t *__attr,
>> +       int __pshared)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_rwlockattr_getkind_np (const pthread_rwlockattr_t *
>> +       __restrict __attr,
>> +       int *__restrict __pref)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_rwlockattr_setkind_np (pthread_rwlockattr_t *__attr,
>> +       int __pref) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_cond_init (pthread_cond_t *__restrict __cond,
>> +         const pthread_condattr_t *__restrict __cond_attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_cond_destroy (pthread_cond_t *__cond)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_cond_signal (pthread_cond_t *__cond)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_cond_broadcast (pthread_cond_t *__cond)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_cond_wait (pthread_cond_t *__restrict __cond,
>> +         pthread_mutex_t *__restrict __mutex)
>> +     __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_cond_timedwait (pthread_cond_t *__restrict __cond,
>> +       pthread_mutex_t *__restrict __mutex,
>> +       const struct timespec *__restrict __abstime)
>> +     __attribute__ ((__nonnull__ (1, 2, 3)));
>> +extern int pthread_condattr_init (pthread_condattr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_condattr_destroy (pthread_condattr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_condattr_getpshared (const pthread_condattr_t *
>> +     __restrict __attr,
>> +     int *__restrict __pshared)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_condattr_setpshared (pthread_condattr_t *__attr,
>> +     int __pshared) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_condattr_getclock (const pthread_condattr_t *
>> +          __restrict __attr,
>> +          __clockid_t *__restrict __clock_id)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_condattr_setclock (pthread_condattr_t *__attr,
>> +          __clockid_t __clock_id)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_spin_init (pthread_spinlock_t *__lock, int __pshared)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_spin_destroy (pthread_spinlock_t *__lock)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_spin_lock (pthread_spinlock_t *__lock)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_spin_trylock (pthread_spinlock_t *__lock)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_spin_unlock (pthread_spinlock_t *__lock)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_barrier_init (pthread_barrier_t *__restrict __barrier,
>> +     const pthread_barrierattr_t *__restrict
>> +     __attr, unsigned int __count)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_barrier_destroy (pthread_barrier_t *__barrier)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_barrier_wait (pthread_barrier_t *__barrier)
>> +     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_barrierattr_init (pthread_barrierattr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_barrierattr_destroy (pthread_barrierattr_t *__attr)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_barrierattr_getpshared (const pthread_barrierattr_t *
>> +        __restrict __attr,
>> +        int *__restrict __pshared)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
>> +extern int pthread_barrierattr_setpshared (pthread_barrierattr_t *__attr,
>> +        int __pshared)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_key_create (pthread_key_t *__key,
>> +          void (*__destr_function) (void *))
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
>> +extern int pthread_key_delete (pthread_key_t __key) __attribute__ ((__nothrow__ , __leaf__));
>> +extern void *pthread_getspecific (pthread_key_t __key) __attribute__ ((__nothrow__ , __leaf__));
>> +extern int pthread_setspecific (pthread_key_t __key,
>> +    const void *__pointer) __attribute__ ((__nothrow__ , __leaf__)) ;
>> +extern int pthread_getcpuclockid (pthread_t __thread_id,
>> +      __clockid_t *__clock_id)
>> +     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2)));
>> +extern int pthread_atfork (void (*__prepare) (void),
>> +      void (*__parent) (void),
>> +      void (*__child) (void)) __attribute__ ((__nothrow__ , __leaf__));
>> +extern __inline __attribute__ ((__gnu_inline__)) int
>> +__attribute__ ((__nothrow__ , __leaf__)) pthread_equal (pthread_t __thread1, pthread_t __thread2)
>> +{
>> +  return __thread1 == __thread2;
>> +}
>> +void cleanup_fn(void *mutex);
>> +typedef struct {
>> +  size_t progress;
>> +  size_t total;
>> +  pthread_mutex_t mutex;
>> +  pthread_cond_t cond;
>> +  double min_wait;
>> +} dmnsn_future;
>> +void
>> +dmnsn_future_wait(dmnsn_future *future, double progress)
>> +{
>> +  pthread_mutex_lock(&future->mutex);
>> +  while ((double)future->progress/future->total < progress) {
>> +    if (progress < future->min_wait) {
>> +      future->min_wait = progress;
>> +    }
>> +    do { __pthread_unwind_buf_t __cancel_buf; void (*__cancel_routine) (void *) = (cleanup_fn); void *__cancel_arg = (&future->mutex); int __not_first_call = __sigsetjmp ((struct __jmp_buf_tag *) (void *) __cancel_buf.__cancel_jmp_buf, 0); if (__builtin_expect ((__not_first_call), 0)) { __cancel_routine (__cancel_arg); __pthread_unwind_next (&__cancel_buf); } __pthread_register_cancel (&__cancel_buf); do {;
>> +    pthread_cond_wait(&future->cond, &future->mutex);
>> +    do { } while (0); } while (0); __pthread_unregister_cancel (&__cancel_buf); if (0) __cancel_routine (__cancel_arg); } while (0);
>> +  }
>> +  pthread_mutex_unlock(&future->mutex);
>> +}
>>
Jeff Law Feb. 28, 2018, 3:46 p.m. UTC | #3
On 02/28/2018 03:48 AM, Richard Biener wrote:
> On Wed, Feb 28, 2018 at 11:43 AM, Richard Biener
> <richard.guenther@gmail.com> wrote:
>> On Wed, Feb 28, 2018 at 1:16 AM, Jeff Law <law@redhat.com> wrote:
>>> Richi, you worked on 57147 which touches on the issues here.  Your
>>> thoughts would be greatly appreciated.
>>>
>>>
>>> So 61118 is one of several bugs related to the clobbered-by-longjmp warning.
>>>
>>> In 61118 is we are unable to coalesce all the objects in the key
>>> partitions.  To remove the relevant PHIs we have to create two
>>> assignments to the key pseudos.
>>>
>>> Pseudos with more than one assignment are subject to the
>>> clobbered-by-longjmp analysis:
>>>
>>>  * True if register REGNO was alive at a place where `setjmp' was
>>>    called and was set more than once or is an argument.  Such regs may
>>>    be clobbered by `longjmp'.  */
>>>
>>> static bool
>>> regno_clobbered_at_setjmp (bitmap setjmp_crosses, int regno)
>>> {
>>>   /* There appear to be cases where some local vars never reach the
>>>      backend but have bogus regnos.  */
>>>   if (regno >= max_reg_num ())
>>>     return false;
>>>
>>>   return ((REG_N_SETS (regno) > 1
>>>            || REGNO_REG_SET_P (df_get_live_out (ENTRY_BLOCK_PTR_FOR_FN
>>> (cfun)),
>>>                                regno))
>>>           && REGNO_REG_SET_P (setjmp_crosses, regno));
>>> }
>>>
>>>
>>> The fact that no path sets the pseudo more than once is not considered.
>>> If there is more than one static set of the pseudo, then it is
>>> considered for possible warning.
>>>
>>> --
>>>
>>>
>>> I looked at the propagations which led to the inability to coalesce.
>>> They all seemed valid to me.  We have always allowed copy propagation to
>>> replace one pseudo with another as long as neither has
>>> SSA_NAME_USED_IN_ABNORMAL_PHI set.
>>>
>>> We have a PHI like
>>>
>>> x1(ab) = (x0, x3 (ab))
>>>
>>> x0 is not marked as abnormal because the edge isn't abnormal and thus we
>>> can propagate into the x0 argument of the PHI.  This is consistent with
>>> behavior since, well, forever.   We propagate a value for x0 resulting
>>> in something like
>>>
>>> x1(b) = (y0, x3 (ab))
>>>
>>>
>>> Where y0 is still live across the PHI.  Thus the partition for x1/x3,
>>> etc conflicts with the partition for y0 and they can not be coalesced.
>>> This leads to the multiple assignments to the pseudo for the x1/x3
>>> partition.  I briefly looked marking all the PHI arguments as abnormal
>>> when the destination is abnormal, but it just doesn't seem right.
>>>
>>> Anyway, I'd already been looking at 21161 and was aware that the CFG's
>>> we're building in presence of setjmp/longjmp were slightly inaccurate.
>>>
>>> In particular, a longjmp returns to the point immediately after the
>>> setjmp, not to the setjmp itself.  But our CFG building has the edge
>>> from the abnormal dispatcher going to the block containing the setjmp call.
>>
>> Yeah...  for SJLJ EH we get this right via __builtin_setjmp_receiver.
>>
>>> This creates unnecessary irreducible loops.  It turns out that if we fix
>>> the tree CFG, then lifetimes become more accurate (and more
>>> constrained).  The more constrained, more accurate lifetime information
>>> is enough to allow things to coalesce the way we want and everything for
>>> 61118 just works.
>>
>> Sounds good.
> 
> Oh - and to mention it, we have one long-standing issue after me trying
> to fix things here which is that RTL doesn't have all those abnormal edges
> for setjmp and friends because we throw away abnormal edges during
> RTL expansion and expect to recompute them...
> 
> IIRC there's a bugreport tracking this issue, and the fix is
> to stop removing abnormal edges and instead transition them to RTL:
> 
>           /* At the moment not all abnormal edges match the RTL
>              representation.  It is safe to remove them here as
>              find_many_sub_basic_blocks will rediscover them.
>              In the future we should get this fixed properly.  */
>           if ((e->flags & EDGE_ABNORMAL)
>               && !(e->flags & EDGE_SIBCALL))
>             remove_edge (e);
> 
> not sure if we can use an edge flag to mark those we want to preserve.
> But I don't understand the comment very well - why would any abnormal
> edges not "match" the RTL representation?
Yeah, I was a bit surprised when I found out about the lack of
coordination between gimple and RTL on this stuff.  While looking at
21161 I got as far as realizing that the gimple edges related to
setjmp/longjmp were dropped on the floor.  Fixing the CFG in gimple
made absolutely no difference, which surprised the hell out of me.

Luckily I had the code stashed and could resurrect it when I realized
61118 was a problem with life analysis/coalescing caused by the slight
incorrectness in the gimple CFG.

For 21161 we end up needing to scan the RTL after the call to find the
condjump.  Then we figure out where the condjump goes.  That allows us
to look at the registers live on the longjmp path as opposed to what's
live at the setjmp call.  That's about 95% complete -- I fat-fingered
and lost the sparc/s390-specific bits, so I need to recreate those.
Ultimately, I can fix 21161 on all targets but mips (which has an
amazingly annoying unspec between the setjmp call and the conditional
branch on its result).
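
The scan itself is the easy part -- roughly the following (an
illustrative sketch only, not the actual patch; setjmp_insn stands for
the CALL_INSN of the setjmp):

  rtx_insn *insn;
  for (insn = NEXT_INSN (setjmp_insn); insn; insn = NEXT_INSN (insn))
    if (JUMP_P (insn))
      break;
  /* INSN, if found, is the conditional jump testing the setjmp result;
     its target identifies the longjmp return path whose live registers
     we actually need to inspect.  */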



WRT 57147, the test that I'm removing was something you derived from the
original plus inspection of tree-cfg.c.  The test for the original issue
is pr57147-1.c and I had already verified that we continue to do the
right thing for it -- I probably should have noted that in the patch
submission.

jeff
Jeff Law Feb. 28, 2018, 5:35 p.m. UTC | #4
On 02/28/2018 03:43 AM, Richard Biener wrote:
> On Wed, Feb 28, 2018 at 1:16 AM, Jeff Law <law@redhat.com> wrote:
[ ... snip ...]
>>
>> Anyway, I'd already been looking at 21161 and was aware that the CFG's
>> we're building in presence of setjmp/longjmp were slightly inaccurate.
>>
>> In particular, a longjmp returns to the point immediately after the
>> setjmp, not to the setjmp itself.  But our CFG building has the edge
>> from the abnormal dispatcher going to the block containing the setjmp call.
> 
> Yeah...  for SJLJ EH we get this right via __builtin_setjmp_receiver.
Right.

And as you noted in a follow-up, we don't carry these abnormal edges from
gimple to RTL.  As a result the gimple and RTL CFGs are both inaccurate,
but in subtly different ways.  I'd like to fix that wart, but probably not
during this cycle.

[ Another snip ]

>> It's actually pretty easy to fix the CFG.  We  just need to recognize
>> that a "returns twice" function returns not to the call, but to the
>> point immediately after the call.  So if we have a call to a returns
>> twice function that ends a block with a single successor, when we wire
>> up the abnormal dispatcher, we target the single successor rather than
>> the block containing the returns-twice call.
> 
> Hmm, I think you need to check whether the successor has a single
> predecessor, not whether we have a single successor (we always have
> that unless setjmp also throws).  If you fix that you keep the CFG
> "incorrect" if there are multiple predecessors so I think in addition
> to properly creating the edges you have to work on the BB building
> part to ensure that there's a single-predecessor block after
> returns-twice function calls.  Note that currently we force returns-twice
> to be the first (and only) stmt of a block -- your fix would relax this,
> returns-twice no longer needs to start a new BB.
The single successor test was strictly my paranoia WRT abnormal/EH edges.

I don't immediately see why the CFG would be incorrect if the successor
of the setjmp block has multiple preds.  But maybe it's something
downstream that I'm not familiar with -- my worry all along with this
work was the gimple->rtl conversion and block/edge discovery at that point.

Verifying the successor has a single pred seems simple enough if done at
the right time.  If it has multiple preds, then we can always fall back
to the less accurate CFG that we've been building for a decade or more.
The less precise CFG works, but the imprecision can impede more precise
analysis as we've seen with this BZ.
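
In patch terms that would just add the extra test to the posted hunk,
something like (untested sketch):

    if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
        && gimple_call_fndecl (call_stmt)
        && setjmp_call_p (gimple_call_fndecl (call_stmt))
        && single_succ_p (bb)
        && single_pred_p (single_succ (bb)))
      target_after_setjmp = true;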



> -               handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> -                                      &ab_edge_call, false);
> +               {
> +                 bool target_after_setjmp = false;
> +
> +                 /* If the returns twice statement looks like a setjmp
> +                    call at the end of a block with a single successor
> +                    then we want the edge from the dispatcher to target
> +                    that single successor.  That more accurately reflects
> +                    actual control flow.  The more accurate CFG also
> +                    results in fewer false positive warnings.  */
> +                 if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
> +                     && gimple_call_fndecl (call_stmt)
> +                     && setjmp_call_p (gimple_call_fndecl (call_stmt))
> +                     && single_succ_p (bb))
> +                   target_after_setjmp = true;
> +                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> +                                        &ab_edge_call, false,
> +                                        target_after_setjmp);
> +               }
> 
> I don't exactly get the hops you jump through here -- I think it's
> better to split the returns-twice (always last stmt of a block after
> the fixing) and the setjmp-receiver (always first stmt of a block) cases.
> So, remove the handling of returns-twice from the above case and
> handle returns-twice via
> 
>   gimple *last = last_stmt (bb);
>   if (last && ...)
> 
> also handle all returns-twice calls this way, not only setjmp_call_p.
> 
>>
[ ... snip ... ]

>> We can easily see that the call to _setjmp can never be reached given
>> that we consider the longjmp call as non-returning.  So AFAICT
>> everything is as should be expected.  I think the right thing is to just
>> remove this compromised test.
> 
> I agree.  Bonus points if you look at PR57147 and see if the testcase
> was misreduced (maybe it was just for an ICE so we can keep it
> and just remove the dump scanning?)
I had actually looked at 57147.  The original test is pr57147-1.c, which
we still handle correctly.  pr57147-2.c doesn't represent any real-world
code.  You took pr57147-1.c and twiddled it slightly based on reviewing
the CFG cleanup code and spotting a potential issue.

We might be able to change pr57147-2.c to scan for changes in the edges
and their flags, and given the simplicity of the test that may not be
terribly fragile.  I wouldn't want to do that in a larger test, though.
Let me give that a whirl.

jeff
Richard Biener March 1, 2018, 10:45 a.m. UTC | #5
On Wed, Feb 28, 2018 at 6:35 PM, Jeff Law <law@redhat.com> wrote:
> On 02/28/2018 03:43 AM, Richard Biener wrote:
>> On Wed, Feb 28, 2018 at 1:16 AM, Jeff Law <law@redhat.com> wrote:
> [ ... snip ...]
>>>
>>> Anyway, I'd already been looking at 21161 and was aware that the CFG's
>>> we're building in presence of setjmp/longjmp were slightly inaccurate.
>>>
>>> In particular, a longjmp returns to the point immediately after the
>>> setjmp, not to the setjmp itself.  But our CFG building has the edge
>>> from the abnormal dispatcher going to the block containing the setjmp call.
>>
>> Yeah...  for SJLJ EH we get this right via __builtin_setjmp_receiver.
> Right.
>
> And as you noted in a follow-up we don't carry these abnormal edges from
> gimple to RTL.  As a result they are both inaccurate, but in subtly
> different ways.  I'd like to fix that wart, but probably not during this
> cycle.

Heh, I keep pushing that back as well.  As you explained in the other
mail, "preserving" the edges isn't so easy, but that's still what we
eventually should do (because we can't really re-create them in the same
optimistic way we could at the beginning).  Maybe we can keep them and
just have a sweep "redirecting" those we kept to/from appropriate places.

> [ Another snip ]
>
>>> It's actually pretty easy to fix the CFG.  We  just need to recognize
>>> that a "returns twice" function returns not to the call, but to the
>>> point immediately after the call.  So if we have a call to a returns
>>> twice function that ends a block with a single successor, when we wire
>>> up the abnormal dispatcher, we target the single successor rather than
>>> the block containing the returns-twice call.
>>
>> Hmm, I think you need to check whether the successor has a single
>> predecessor, not whether we have a single successor (we always have
>> that unless setjmp also throws).  If you fix that you keep the CFG
>> "incorrect" if there are multiple predecessors so I think in addition
>> to properly creating the edges you have to work on the BB building
>> part to ensure that there's a single-predecessor block after
>> returns-twice function calls.  Note that currently we force returns-twice
>> to be the first (and only) stmt of a block -- your fix would relax this,
>> returns-twice no longer needs to start a new BB.
> The single successor test was strictly my paranoia WRT abnormal/EH edges.
>
> I don't immediately see why the CFG would be incorrect if the successor
> of the setjmp block has multiple preds.  But maybe it's something
> downstream that I'm not familiar with -- my worry all along with this
> work was the gimple->rtl conversion and block/edge discovery at that point.

Hmm, right, maybe extra paranoia from me here as well.  Note that I think
even with EH edges out of the setjmp, the abnormal return to the successor
wouldn't be wrong, given it isn't going to throw after the abnormal
return.

So I guess the proper action is to remove the successor check and
not replace it with a single-predecessor-of-the-successor check.

> Verifying the successor has a single pred seems simple enough if done at
> the right time.  If it has multiple preds, then we can always fall back
> to the less accurate CFG that we've been building for a decade or more.
> The less precise CFG works, but the imprecision can impede more precise
> analysis as we've seen with this BZ.

As I said, I'd rather be consistent about what we do here instead of
doing something different depending on context.  So let's drop those
restrictions on the surrounding CFG.
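
In other words the condition in make_edges would shrink to something
like this (just a sketch of the direction, treating all returns-twice
calls alike and leaving the plumbing in handle_abnormal_edges aside):

    gimple *last = last_stmt (bb);
    bool target_after_setjmp
      = (last == call_stmt
         && (gimple_call_flags (call_stmt) & ECF_RETURNS_TWICE));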

>
>
>> -               handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
>> -                                      &ab_edge_call, false);
>> +               {
>> +                 bool target_after_setjmp = false;
>> +
>> +                 /* If the returns twice statement looks like a setjmp
>> +                    call at the end of a block with a single successor
>> +                    then we want the edge from the dispatcher to target
>> +                    that single successor.  That more accurately reflects
>> +                    actual control flow.  The more accurate CFG also
>> +                    results in fewer false positive warnings.  */
>> +                 if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
>> +                     && gimple_call_fndecl (call_stmt)
>> +                     && setjmp_call_p (gimple_call_fndecl (call_stmt))
>> +                     && single_succ_p (bb))
>> +                   target_after_setjmp = true;
>> +                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
>> +                                        &ab_edge_call, false,
>> +                                        target_after_setjmp);
>> +               }
>>
>> I don't exactly get the hops you jump through here -- I think it's
>> better to split the returns-twice (always last stmt of a block after
>> the fixing) and the setjmp-receiver (always first stmt of a block) cases.
>> So, remove the handling of returns-twice from the above case and
>> handle returns-twice via
>>
>>   gimple *last = last_stmt (bb);
>>   if (last && ...)
>>
>> also handle all returns-twice calls this way, not only setjmp_call_p.
>>
>>>
> [ ... snip ... ]
>
>>> We can easily see that the call to _setjmp can never be reached given
>>> that we consider the longjmp call as non-returning.  So AFAICT
>>> everything is as should be expected.  I think the right thing is to just
>>> remove this compromised test.
>>
>> I agree.  Bonus points if you look at PR57147 and see if the testcase
>> was misreduced (maybe it was just for an ICE so we can keep it
>> and just remove the dump scanning?)
> I had actually looked at 57147.  The original test is pr57147-1 which we
> still handle correctly.  pr57147-2 doesn't represent any real world
> code.  You took pr57147-1.c and twiddled it slightly based on reviewing
> the CFG cleanup code and spotting a potential issue.

Ah, I see.

> We might be able to change pr57147-2 to scan for changes in the edges
> and their flags and given the simplicity of the test that may not be
> terribly fragile.  I wouldn't want to do that in a larger test though.
> Let me give that a whirl.
>
> jeff
Jeff Law March 2, 2018, 10:18 p.m. UTC | #6
On 02/28/2018 03:43 AM, Richard Biener wrote:
[ More snipping ]

> 
>> It's actually pretty easy to fix the CFG.  We  just need to recognize
>> that a "returns twice" function returns not to the call, but to the
>> point immediately after the call.  So if we have a call to a returns
>> twice function that ends a block with a single successor, when we wire
>> up the abnormal dispatcher, we target the single successor rather than
>> the block containing the returns-twice call.
> 
> Hmm, I think you need to check whether the successor has a single
> predecessor, not whether we have a single successor (we always have
> that unless setjmp also throws).  If you fix that you keep the CFG
> "incorrect" if there are multiple predecessors so I think in addition
> to properly creating the edges you have to work on the BB building
> part to ensure that there's a single-predecessor block after
> returns-twice function calls.  Note that currently we force returns-twice
> to be the first (and only) stmt of a block -- your fix would relax this,
> returns-twice no longer needs to start a new BB.
So I found the code which makes the setjmp start a new block. But I
haven't found the code which makes setjmp end a block.  I'm going to
have to throw things into the debugger  to find the latter.


We ought to remove the code that makes the setjmp start a new block.
That's just unnecessary.   setjmp certainly needs to end the block though.
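
For illustration, a GIMPLE-style sketch (block numbers invented): the
abnormal return resumes right after the call, and an incoming edge can
only target a block boundary, so the call has to be the last statement
of its block:

  <bb 2>:
  ret = setjmp (&buf);    ;; must be the last stmt of bb 2

  <bb 3>:                 ;; the dispatcher edge can then target bb 3,
  ...                     ;; i.e. the point just after the setjmp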




> 
> -               handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> -                                      &ab_edge_call, false);
> +               {
> +                 bool target_after_setjmp = false;
> +
> +                 /* If the returns twice statement looks like a setjmp
> +                    call at the end of a block with a single successor
> +                    then we want the edge from the dispatcher to target
> +                    that single successor.  That more accurately reflects
> +                    actual control flow.  The more accurate CFG also
> +                    results in fewer false positive warnings.  */
> +                 if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
> +                     && gimple_call_fndecl (call_stmt)
> +                     && setjmp_call_p (gimple_call_fndecl (call_stmt))
> +                     && single_succ_p (bb))
> +                   target_after_setjmp = true;
> +                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
> +                                        &ab_edge_call, false,
> +                                        target_after_setjmp);
> +               }
> 
> I don't exactly get the hops you jump through here -- I think it's
> better to split the returns-twice (always last stmt of a block after
> the fixing) and the setjmp-receiver (always first stmt of a block) cases.
> So, remove the handling of returns-twice from the above case and
> handle returns-twice via
Just wanted to verify the setjmp was the last statement in the block and
the block passed control to a single successor.  If the setjmp is not
the last statement, then having the longjmp pass control to the
successor block potentially skips over statements between the setjmp and
the end of the block.  That obviously would be bad.
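
Concretely, something like this (GIMPLE-style sketch, names invented):

  <bb 2>:
  ret = setjmp (&buf);
  a = b + 1;          ;; if the dispatcher edge targeted bb 3, the
  c = a * 2;          ;; abnormal return would skip these statements

  <bb 3>:
  ...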

As I mentioned before the single_succ_p test was just my paranoia.

Note that GSI can point to a setjmp receiver at this point.  We don't
want to treat that like a setjmp.


> 
>   gimple *last = last_stmt (bb);
>   if (last && ...)
> 
> also handle all returns-twice calls this way, not only setjmp_call_p.
Note that setjmp_call_p returns true for any returns-twice function.  So
we are handling those.
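
(Roughly, paraphrasing gcc/calls.c from memory -- the exact test may
differ between releases:

  bool
  setjmp_call_p (const_tree fndecl)
  {
    if (DECL_IS_RETURNS_TWICE (fndecl))
      return true;
    return (special_function_p (fndecl, 0) & ECF_RETURNS_TWICE) != 0;
  }

so anything carrying the returns_twice attribute is treated the same as
setjmp itself.)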


So I think the open issue with this patch is removal of making the
setjmp start a block and verification that we always have it end the
block.  The latter should allow some simplifications to the code I added
in make_edges and provide a level of consistency that is desirable.

Jeff
Jakub Jelinek March 2, 2018, 11:07 p.m. UTC | #7
On Fri, Mar 02, 2018 at 03:18:05PM -0700, Jeff Law wrote:
> On 02/28/2018 03:43 AM, Richard Biener wrote:
> [ More snipping ]
> 
> > 
> >> It's actually pretty easy to fix the CFG.  We  just need to recognize
> >> that a "returns twice" function returns not to the call, but to the
> >> point immediately after the call.  So if we have a call to a returns
> >> twice function that ends a block with a single successor, when we wire
> >> up the abnormal dispatcher, we target the single successor rather than
> >> the block containing the returns-twice call.
> > 
> > Hmm, I think you need to check whether the successor has a single
> > predecessor, not whether we have a single successor (we always have
> > that unless setjmp also throws).  If you fix that you keep the CFG
> > "incorrect" if there are multiple predecessors so I think in addition
> > to properly creating the edges you have to work on the BB building
> > part to ensure that there's a single-predecessor block after
> > returns-twice function calls.  Note that currently we force returns-twice
> > to be the first (and only) stmt of a block -- your fix would relax this,
> > returns-twice no longer needs to start a new BB.
> So I found the code which makes the setjmp start a new block. But I
> haven't found the code which makes setjmp end a block.  I'm going to
> have to throw things into the debugger  to find the latter.
> 
> 
> We ought to remove the code that makes the setjmp start a new block.
> That's just unnecessary.   setjmp certainly needs to end the block though.

At least in gimple, having setjmp start the block is intentional;
we have abnormal edges from all the (possible) spots which might call
longjmp to an artificial abnormal-dispatcher bb, and from that block
abnormal edges to all the setjmp starts, which is how we emulate the fact
that longjmp might return to a setjmp location.  Edges are created by
handle_abnormal_edges.
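
Schematically (one shared dispatcher per function; AB marks abnormal
edges):

  bb1: ret = setjmp (buf)   <----(AB)----+
  ...                                    |
  bbN: foo ()   -----(AB)----->  dispatcher bb
       (may call longjmp)

i.e. every potential longjmp site feeds the dispatcher, and the
dispatcher currently feeds back to the start of each block containing a
setjmp.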

	Jakub
Jeff Law March 2, 2018, 11:17 p.m. UTC | #8
On 03/02/2018 04:07 PM, Jakub Jelinek wrote:
> On Fri, Mar 02, 2018 at 03:18:05PM -0700, Jeff Law wrote:
>> On 02/28/2018 03:43 AM, Richard Biener wrote:
>> [ More snipping ]
>>
>>>
>>>> It's actually pretty easy to fix the CFG.  We  just need to recognize
>>>> that a "returns twice" function returns not to the call, but to the
>>>> point immediately after the call.  So if we have a call to a returns
>>>> twice function that ends a block with a single successor, when we wire
>>>> up the abnormal dispatcher, we target the single successor rather than
>>>> the block containing the returns-twice call.
>>>
>>> Hmm, I think you need to check whether the successor has a single
>>> predecessor, not whether we have a single successor (we always have
>>> that unless setjmp also throws).  If you fix that you keep the CFG
>>> "incorrect" if there are multiple predecessors so I think in addition
>>> to properly creating the edges you have to work on the BB building
>>> part to ensure that there's a single-predecessor block after
>>> returns-twice function calls.  Note that currently we force returns-twice
>>> to be the first (and only) stmt of a block -- your fix would relax this,
>>> returns-twice no longer needs to start a new BB.
>> So I found the code which makes the setjmp start a new block. But I
>> haven't found the code which makes setjmp end a block.  I'm going to
>> have to throw things into the debugger  to find the latter.
>>
>>
>> We ought to remove the code that makes the setjmp start a new block.
>> That's just unnecessary.   setjmp certainly needs to end the block though.
> 
> At least in gimple, having setjmp start the block is intentional;
> we have abnormal edges from all the (possible) spots which might have
> longjmp to abnormal dispatcher artificial bb and from that block abnormal
> edges to all the sigjmp starts, which is how we emulate the fact that
> longjmp might return in a setjmp location.  Edges are created by
> handle_abnormal_edges.
But the longjmp should not return to the setjmp call, it should return
to the point immediately after the setjmp call.  That's the core of the
issue with 61118.
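
A minimal C illustration of the second return point (do_work is just an
invented stand-in for code that may call longjmp):

  #include <setjmp.h>

  static jmp_buf buf;
  extern void do_work (void);

  int f (void)
  {
    int ret;
    ret = setjmp (buf);   /* first return: ret == 0 */
    /* a longjmp (buf, v) inside do_work resumes here, with ret == v */
    if (ret == 0)
      do_work ();         /* may call longjmp (buf, 1) */
    return ret;
  }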

jeff
Richard Biener March 5, 2018, 2:07 p.m. UTC | #9
On Fri, Mar 2, 2018 at 11:18 PM, Jeff Law <law@redhat.com> wrote:
> On 02/28/2018 03:43 AM, Richard Biener wrote:
> [ More snipping ]
>
>>
>>> It's actually pretty easy to fix the CFG.  We  just need to recognize
>>> that a "returns twice" function returns not to the call, but to the
>>> point immediately after the call.  So if we have a call to a returns
>>> twice function that ends a block with a single successor, when we wire
>>> up the abnormal dispatcher, we target the single successor rather than
>>> the block containing the returns-twice call.
>>
>> Hmm, I think you need to check whether the successor has a single
>> predecessor, not whether we have a single successor (we always have
>> that unless setjmp also throws).  If you fix that you keep the CFG
>> "incorrect" if there are multiple predecessors so I think in addition
>> to properly creating the edges you have to work on the BB building
>> part to ensure that there's a single-predecessor block after
>> returns-twice function calls.  Note that currently we force returns-twice
>> to be the first (and only) stmt of a block -- your fix would relax this,
>> returns-twice no longer needs to start a new BB.
> So I found the code which makes the setjmp start a new block. But I
> haven't found the code which makes setjmp end a block.  I'm going to
> have to throw things into the debugger  to find the latter.

stmt_starts_bb_p

>
> We ought to remove the code that makes the setjmp start a new block.
> That's just unnecessary.   setjmp certainly needs to end the block though.

yes, after your change, of course.  The code in stmt_starts_bb_p
uses ECF_RETURNS_TWICE, so ...

>
>
>
>>
>> -               handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
>> -                                      &ab_edge_call, false);
>> +               {
>> +                 bool target_after_setjmp = false;
>> +
>> +                 /* If the returns twice statement looks like a setjmp
>> +                    call at the end of a block with a single successor
>> +                    then we want the edge from the dispatcher to target
>> +                    that single successor.  That more accurately reflects
>> +                    actual control flow.  The more accurate CFG also
>> +                    results in fewer false positive warnings.  */
>> +                 if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
>> +                     && gimple_call_fndecl (call_stmt)
>> +                     && setjmp_call_p (gimple_call_fndecl (call_stmt))
>> +                     && single_succ_p (bb))
>> +                   target_after_setjmp = true;
>> +                 handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
>> +                                        &ab_edge_call, false,
>> +                                        target_after_setjmp);
>> +               }
>>
>> I don't exactly get the hops you jump through here -- I think it's
>> better to split the returns-twice (always last stmt of a block after
>> the fixing) and the setjmp-receiver (always first stmt of a block) cases.
>> So, remove the handling of returns-twice from the above case and
>> handle returns-twice via
> Just wanted to verify the setjmp was the last statement in the block and
> the block passed control to a single successor.  If the setjmp is not
> the last statement, then having the longjmp pass control to the
> successor block potentially skips over statements between the setjmp and
> the end of the block.  That obviously would be bad.
>
> As I mentioned before the single_succ_p test was just my paranoia.
>
> Note that GSI can point to a setjmp receiver at this point.  We don't
> want to treat that like a setjmp.

True.

>
>>
>>   gimple *last = last_stmt (bb);
>>   if (last && ...)
>>
>> also handle all returns-twice calls this way, not only setjmp_call_p.
> Note that setjmp_call_p returns true for any returns-twice function.  So
> we are handling those.

... that's intended as well I think.

>
> So I think the open issue with this patch is removal of making the
> setjmp start a block and verification that we always have it end the
> block.  The latter should allow some simplifications to the code I added
> in make_edges and provide a level of consistency that is desirable.

We've abstracted that bit into GF_CALL_CTRL_ALTERING which we
compute during CFG build and only ever clear afterwards (so an
indirect call to setjmp via a type not having returns_twice will not
end up ending a BB and will not have abnormal edges associated).
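
(A contrived sketch of that indirect case -- rt and p are invented names:

  extern int rt (void) __attribute__ ((returns_twice));
  int (*p) (void) = rt;   /* plain function pointer type, so the
                             returns_twice attribute is lost */
  ...
  p ();                   /* call flags come from p's type; this call
                             is not GF_CALL_CTRL_ALTERING, doesn't end
                             its BB and gets no abnormal edges */
)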

So I don't think anything besides fixing CFG build is necessary.
Well - the whole RTL transition business of course.

Richard.

> Jeff
>
Michael Matz March 5, 2018, 6:30 p.m. UTC | #10
Hi,

On Wed, 28 Feb 2018, Jeff Law wrote:

> The single successor test was strictly my paranoia WRT abnormal/EH edges.
> 
> I don't immediately see why the CFG would be incorrect if the successor
> of the setjmp block has multiple preds.

Actually, without further conditions I don't see how it would be safe for 
the successor to have multiple preds.  We might have this situation:

bb1: ret = setjmp
bb2: x0 = phi <x1 (bb1), foo(bbX)>

As you noted, the second "return" from setjmp is precisely after the setjmp
call itself, i.e. on the edge bb1->bb2.  If we simply regard it as landing at
the start of bb2, it becomes unclear from which edge bb2 was entered, and
hence the runtime model of PHI nodes breaks down.

So, the mental model would be that a hypothetical setjmp_receiver (which 
isn't hypothetical for SJLJ) would have to be inserted on the bb1->bb2 
edge.  From then normal edge insertion routines take over: because the 
bb1-setjmp block only has a single successor (I know it currently doesn't, 
bear with me) it's actually inserted at the end of bb1 (so that edge 
doesn't need splitting), which is indeed what you want.  Except that to 
have a target for the abnormal edges the setjmp_receiver needs to start 
its own basic block, so you'd need to create a new edge, which then 
implicitly creates the situation that the successor of the setjmp block
only has a single predecessor (until you add the abnormal edges to the 
receiver of course):

bb1 : setjmp; /* fallthrough */
bb1': ret = setjmp_receiver /* fallthrough */
bb2 : x0 = phi<x1(bb1'), foo(bbX)>

While this models what actually happens with setjmp quite fine it poses 
the problem that nothing must be moved between setjmp and setjmp_receiver, 
so it seems indeed better to leave them as just one instruction.  But the 
edge modelling needs to ensure that the above runtime model is followed, 
and so needs to ensure that the actual target of the abnormal edge doesn't 
have multiple predecessors (before adding the abnormal edge that is), so 
the edge must be split I think.

Actually I wonder if it weren't better to regard returns-twice functions
like COND_EXPR, and always create two outgoing edges of such blocks, one 
the normal first-return, the other the (abnormal) second-return.  The 
target block of that second-return would then also be the target of all 
function calls potentially calling longjmp, and so on.  That target block 
would then only have abnormal predecessors, and your problem would have 
also gone away.  In addition a setjmp call would then be normally 
eliminable by unreachable-code analysis (and the second-return target 
block as well eventually)  This would also naturally deal with the 
problem, that while the real second return is after the setjmp call it 
happens before the setting of the return value.


Ciao,
Michael.
Jeff Law March 5, 2018, 7:08 p.m. UTC | #11
On 03/05/2018 11:30 AM, Michael Matz wrote:
> Hi,
> 
> On Wed, 28 Feb 2018, Jeff Law wrote:
> 
>> The single successor test was strictly my paranoia WRT abnormal/EH edges.
>>
>> I don't immediately see why the CFG would be incorrect if the successor
>> of the setjmp block has multiple preds.
> 
> Actually, without further conditions I don't see how it would be safe for 
> the successor to have multiple preds.  We might have this situation:
> 
> bb1: ret = setjmp
> bb2: x0 = phi <x1 (bb1), foo(bbX)>
No.  Can't happen -- we're still building the initial CFG.  There are no
PHI nodes, there are no SSA_NAMEs.

We have two choices we can either target the setjmp itself, which is
what we've been doing and is inaccurate.  Or we can target the point
immediately after the setjmp, which is accurate.

After we have created the CFG, we'll proceed to build dominance
frontiers, compute lifetimes, etc. that are necessary to place the PHI
nodes.




> 
> As you noted the second "return" from setjmp is precisely after the setjmp 
> call itself, i.e. on the edge bb1->bb2.  Simply regarding it as landing at 
> the start of bb2 it becomes unclear from which edge bb2 was entered and 
> hence the runtime model of PHI nodes breaks down.
?!?    Again, we don't have PHIs and we're not simply regarding the
setjmp as landing at the start of BB2.  We are creating an edge from the
dispatcher to BB2.


Jeff
Michael Matz March 5, 2018, 7:30 p.m. UTC | #12
Hi,

On Mon, 5 Mar 2018, Jeff Law wrote:

> >> The single successor test was strictly my paranoia WRT abnormal/EH 
> >> edges.
> >>
> >> I don't immediately see why the CFG would be incorrect if the 
> >> successor of the setjmp block has multiple preds.
> > 
> > Actually, without further conditions I don't see how it would be safe 
> > for the successor to have multiple preds.  We might have this 
> > situation:
> > 
> > bb1: ret = setjmp
> > bb2: x0 = phi <x1 (bb1), foo(bbX)>
> No.  Can't happen -- we're still building the initial CFG.  There are no
> PHI nodes, there are no SSA_NAMEs.

While that is currently true I think it's short-sighted.  Thinking about 
the situation in terms of SSA names and PHI nodes clears up the mind.  In 
addition there is already code which builds (sub-)cfgs when SSA form 
exists (gimple_find_sub_bbs).  Currently that code can't ever generate 
setjmp calls, so it's not an issue.

> We have two choices we can either target the setjmp itself, which is
> what we've been doing and is inaccurate.  Or we can target the point
> immediately after the setjmp, which is accurate.

Not precisely, because the setting of the return value of setjmp does 
happen after both returns.  So moving the whole second-return edge target 
to after the setjmp call (when it includes an LHS) is not correct 
(irrespective of how the situation in the successor BBs looks).

> > As you noted the second "return" from setjmp is precisely after the setjmp 
> > call itself, i.e. on the edge bb1->bb2.  Simply regarding it as landing at 
> > the start of bb2 it becomes unclear from which edge bb2 was entered and 
> > hence the runtime model of PHI nodes breaks down.
> ?!?    Again, we don't have PHIs and we're not simply regarding the
> setjmp as landing at the start of BB2.  We are creating an edge from the
> dispatcher to BB2.

Sure, the dispatcher is in between, but I don't regard it as material for 
the issue at hand: it's really
 ret=setjmp --(ab)-> dispatch --(ab)-> XXX
 any-other-call --(ab)-> dispatch
and the question is what XXX should be.  It should be after setjmp for 
precision, but must be before 'ret='.  I was ignoring the dispatcher and 
just said that setjmp and all calls directly transfer to XXX (and then 
discussed what the XXX may be).

So, even if you choose to ignore SSA names and PHI nodes (which probably is
fine for gcc8) you still have a problem of ignoring the effect on the LHS.


Ciao,
Michael.
Jeff Law March 6, 2018, 3:41 a.m. UTC | #13
On 03/05/2018 12:30 PM, Michael Matz wrote:
> Hi,
> 
> On Mon, 5 Mar 2018, Jeff Law wrote:
> 
>>>> The single successor test was strictly my paranoia WRT abnormal/EH 
>>>> edges.
>>>>
>>>> I don't immediately see why the CFG would be incorrect if the 
>>>> successor of the setjmp block has multiple preds.
>>>
>>> Actually, without further conditions I don't see how it would be safe 
>>> for the successor to have multiple preds.  We might have this 
>>> situation:
>>>
>>> bb1: ret = setjmp
>>> bb2: x0 = phi <x1 (bb1), foo(bbX)>
>> No.  Can't happen -- we're still building the initial CFG.  There are no
>> PHI nodes, there are no SSA_NAMEs.
> 
> While that is currently true I think it's short-sighted.  Thinking about 
> the situation in terms of SSA names and PHI nodes clears up the mind.  In 
> addition there is already code which builds (sub-)cfgs when SSA form 
> exists (gimple_find_sub_bbs).  Currently that code can't ever generate 
> setjmp calls, so it's not an issue.
It's not clearing up anything for me.  Clearly you're onto something
that I'm missing, but I'm still trying to figure out what.

Certainly we have to be careful WRT the implicit set of the return value
of the setjmp call that occurs on the longjmp path.  That's worth
investigating.  I suspect that works today more by accident of having an
incorrect CFG than by design.


> 
>> We have two choices we can either target the setjmp itself, which is
>> what we've been doing and is inaccurate.  Or we can target the point
>> immediately after the setjmp, which is accurate.
> 
> Not precisely, because the setting of the return value of setjmp does 
> happen after both returns.  So moving the whole second-return edge target 
> to after the setjmp call (when it includes an LHS) is not correct 
> (irrespective of how the situation in the successor BBs looks).
But it does or at least it should.  It's implicitly set on the longjmp
side.  If we get this wrong I'd expect we'll see uninit uses in the PHI.
 That's easy enough to instrument and check for.

This aspect of setjmp/longjmp is, in some ways, easier to see in RTL
because the call returns its value in a hard reg which is implicitly set
by the longjmp and we immediately copy it into a pseudo.   Which would
magically DTRT if we had the longjmp edge target the point just after
the setjmp in RTL.
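
Something like this hand-written RTL sketch (x86-ish; insn and register
numbers invented):

  (call_insn 10 ... (set (reg:SI 0 ax)
                         (call (mem:QI (symbol_ref:DI ("setjmp"))) ...)))
  ;; the hard return reg is set on the normal return and set again,
  ;; implicitly, by longjmp on the abnormal return
  (insn 11 ... (set (reg:SI 100 [ ret ]) (reg:SI 0 ax)))
  ;; if the abnormal edge targets the point between insn 10 and insn 11,
  ;; the copy into pseudo 100 happens on both returns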




Jeff
Richard Biener March 6, 2018, 8:57 a.m. UTC | #14
On Tue, Mar 6, 2018 at 4:41 AM, Jeff Law <law@redhat.com> wrote:
> On 03/05/2018 12:30 PM, Michael Matz wrote:
>> Hi,
>>
>> On Mon, 5 Mar 2018, Jeff Law wrote:
>>
>>>>> The single successor test was strictly my paranoia WRT abnormal/EH
>>>>> edges.
>>>>>
>>>>> I don't immediately see why the CFG would be incorrect if the
>>>>> successor of the setjmp block has multiple preds.
>>>>
>>>> Actually, without further conditions I don't see how it would be safe
>>>> for the successor to have multiple preds.  We might have this
>>>> situation:
>>>>
>>>> bb1: ret = setjmp
>>>> bb2: x0 = phi <x1 (bb1), foo(bbX)>
>>> No.  Can't happen -- we're still building the initial CFG.  There are no
>>> PHI nodes, there are no SSA_NAMEs.
>>
>> While that is currently true I think it's short-sighted.  Thinking about
>> the situation in terms of SSA names and PHI nodes clears up the mind.  In
>> addition there is already code which builds (sub-)cfgs when SSA form
>> exists (gimple_find_sub_bbs).  Currently that code can't ever generate
>> setjmp calls, so it's not an issue.
> It's not clearing up anything for me.  Clearly you're onto something
> that I'm missing, but still trying to figure out.
>
> Certainly we have to be careful WRT the implicit set of the return value
> of the setjmp call that occurs on the longjmp path.  That's worth
> investigating.  I suspect that works today more by accident of having an
> incorrect CFG than by design.
>
>
>>
>>> We have two choices we can either target the setjmp itself, which is
>>> what we've been doing and is inaccurate.  Or we can target the point
>>> immediately after the setjmp, which is accurate.
>>
>> Not precisely, because the setting of the return value of setjmp does
>> happen after both returns.  So moving the whole second-return edge target
>> to after the setjmp call (when it includes an LHS) is not correct
>> (irrespective of how the situation in the successor BBs looks).
> But it does or at least it should.  It's implicitly set on the longjmp
> side.  If we get this wrong I'd expect we'll see uninit uses in the PHI.
>  That's easy enough to instrument and check for.
>
> This aspect of setjmp/longjmp is, in some ways, easier to see in RTL
> because the call returns its value in a hard reg which is implicitly set
> by the longjmp and we immediately copy it into a pseudo.   Which would
> magically DTRT if we had the longjmp edge target the point just after
> the setjmp in RTL.

While it's true that the hardreg is set by the callee, the GIMPLE IL
indeed doesn't reflect this (and we have a similar issue with EH
where the exceptional return does _not_ include the assignment
to the LHS but the GIMPLE IL does...).

So with your patch we should see

 ret_1 = setjmp ();
   |                                 \
   |                              AB dispatcher
   |                                    /
   v                                   v
# ret_2 = PHI <ret_1, ret_1(ab)>
...

even w/o a PHI.  So I think we should be fine given we have that
edge from setjmp to the abnormal dispatcher.

Richard.

>
>
>
> Jeff
Michael Matz March 6, 2018, 2:17 p.m. UTC | #15
Hi,

On Mon, 5 Mar 2018, Jeff Law wrote:

> >>> Actually, without further conditions I don't see how it would be safe 
> >>> for the successor to have multiple preds.  We might have this 
> >>> situation:
> >>>
> >>> bb1: ret = setjmp
> >>> bb2: x0 = phi <x1 (bb1), foo(bbX)>
> >> No.  Can't happen -- we're still building the initial CFG.  There are no
> >> PHI nodes, there are no SSA_NAMEs.
> > 
> > While that is currently true I think it's short-sighted.  Thinking about 
> > the situation in terms of SSA names and PHI nodes clears up the mind.  In 
> > addition there is already code which builds (sub-)cfgs when SSA form 
> > exists (gimple_find_sub_bbs).  Currently that code can't ever generate 
> > setjmp calls, so it's not an issue.
> 
> It's not clearing up anything for me.  Clearly you're onto something
> that I'm missing, but still trying to figure out.

I'm saying that if there is SSA form, then having the second-return edge
target the successor block when that block has multiple predecessors would
be incorrect.  Do you agree?  I'm also saying that if it's incorrect when in
SSA form, then it can't be correct (perhaps only very subtly so) without 
SSA form either; I guess that's where we don't agree.

I'm also saying that we currently don't have SSA form while dealing with 
setjmp by more luck and giving up than thought: inlining and va-arg 
expansion do construct sub-cfgs while in SSA.  inlining setjmp calls is 
simply disabled, va-arg expansion doesn't emit setjmp calls into the 
sequence.  But there is nothing inherently magic about SSA form and 
setjmp, so if we're going to fiddle with the CFG form created by setjmp to 
make it more precise, then I think we ought to do it in a way that is 
correct in all intermediate forms, especially at this devel stage, because 
that'd give confidence that it's also correct in the constrained way we're 
needing it.

> Certainly we have to be careful WRT the implicit set of the return value
> of the setjmp call that occurs on the longjmp path.  That's worth
> investigating.  I suspect that works today more by accident of having an
> incorrect CFG than by design.

No, our current imprecise CFG makes it so that the setting of the return 
value is reached by the abnormal edge from the dispatcher, so in a 
data-flow sense it's executed on the first- and second-return case and all 
is well, and so the IL subsumes all the effects that happen in reality 
(and more, which is the reason for the missed optimizations).  You want to 
change the CFG such that it doesn't subsume effects that don't happen in 
reality, and that makes sense, but by that you're making it not reflect 
actions which _do_ happen.

> But it does or at least it should.  It's implicitly set on the longjmp
> side.  If we get this wrong I'd expect we'll see uninit uses in the PHI.
>  That's easy enough to instrument and check for.

Not only uninit uses but conceivably misoptimizations as well.  Let's 
contrive an example which uses gotos to demonstrate the abnormal edges
and let's have the second-return edge to land after the setjmp call (and 
setting of return value):

  ret = setjmp(buf);
second_return:
  if (ret == 0) {             // first return
inside:                       // see below
    call foo(buf)             // might call longjmp
    maybe-goto dispatch       // therefore we have an edge to dispatch
  }
  return;
dispatch:                     // our abnormal dispatcher
  goto second_return;         // go to second-return target

Now, as far as data-flow is concerned, as soon as the ret==0 block is 
entered once, ret will never change from 0 again (there simply is no 
visible set which could change it).  So an optimizer would be correct in 
threading the sequence maybe-goto-dispatch->second_return to inside (bear 
with me):

  ret = setjmp();
  if (ret == 0) {             // first return
inside:                       // see below
    call foo()                // might call longjmp
    maybe-goto inside         // if it does we loop, ret must still be zero ?!
  }
  return;

This currently would not happen: first we don't thread through abnormal 
edges, and propagation of knowledge over abnormal edges is artificially 
limited as well.  But from a pure data-flow perspective the above 
transformation would be correct, meaning that initial IL can't have been 
correct.  You might say in practice even if the above would happen it 
doesn't matter, because at runtime, with a real longjmp, the "maybe-goto 
inside" would in reality land at the setjmp return again, hence setting 
ret, then the test comes along and all is well.  That would probably be 
true currently, but assume foo always calls longjmp, and GCC can figure 
out it's using the same buffer as the above setjmp, then a further 
optimization might do away with the longjmp and just jump directly to the 
inside label: misoptimization achieved.

None of the above events can happen when the second-return target is 
imprecise and before the setjmp.

Now, all that might sound like theoretical games, but I think it highlights
why we should think long and hard about setjmp CFGs and IL before we relax 
it.

FWIW: I think none of the issues I'm thinking about matter when you check 
two things: 1) that the setjmp call has no LHS, and 2) that the target BB 
has a single predecessor.  I was only triggered by the discussion between 
you and Richi about why (2) might not be important.  But you certainly miss
checking (1) as well.  IIRC the testcase you were trying to optimize had 
no LHS, right?

It'd be nice if we try to make the setjmp IL correct For Real (tm) in the 
next devel phase:
1) make setjmp have two outgoing edges, always
2) create a setjmp_receive internal function that has a return value
For 'ret = setjmp(buf)', create this CFG:

  bb1
  ret = setjmp(buf)
   |       \              bb-recv
   |        ----------------\
   |                ret = setjmp_receiver
   |                        /
  normal   /---------------/
  path    /
   |     /
  bb-succ

None of these edges would be abnormal.  bb-recv would be the target for 
edges from all calls that might call longjmp(buf).  Those edges might need 
to be abnormal.  As the above CFG reflects all runtime effects precisely, 
but not which instructions are used to achieve them the expansion to RTL 
will be special.


Ciao,
Michael.
Richard Biener March 6, 2018, 2:43 p.m. UTC | #16
On Tue, Mar 6, 2018 at 3:17 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Mon, 5 Mar 2018, Jeff Law wrote:
>
>> >>> Actually, without further conditions I don't see how it would be safe
>> >>> for the successor to have multiple preds.  We might have this
>> >>> situation:
>> >>>
>> >>> bb1: ret = setjmp
>> >>> bb2: x0 = phi <x1 (bb1), foo(bbX)>
>> >> No.  Can't happen -- we're still building the initial CFG.  There are no
>> >> PHI nodes, there are no SSA_NAMEs.
>> >
>> > While that is currently true I think it's short-sighted.  Thinking about
>> > the situation in terms of SSA names and PHI nodes clears up the mind.  In
>> > addition there is already code which builds (sub-)cfgs when SSA form
>> > exists (gimple_find_sub_bbs).  Currently that code can't ever generate
>> > setjmp calls, so it's not an issue.
>>
>> It's not clearing up anything for me.  Clearly you're onto something
>> that I'm missing, but still trying to figure out.
>
> I'm saying that if there is SSA form, then having the second-return edge
> target the successor block when that block has multiple predecessors would
> be incorrect.  Do you agree?  I'm also saying that if it's incorrect when in
> SSA form, then it can't be correct (perhaps only very subtly so) without
> SSA form either; I guess that's where we don't agree.
>
> I'm also saying that we currently don't have SSA form while dealing with
> setjmp by more luck and giving up than thought: inlining and va-arg
> expansion do construct sub-cfgs while in SSA.  inlining setjmp calls is
> simply disabled, va-arg expansion doesn't emit setjmp calls into the
> sequence.  But there is nothing inherently magic about SSA form and
> setjmp, so if we're going to fiddle with the CFG form created by setjmp to
> make it more precise, then I think we ought to do it in a way that is
> correct in all intermediate forms, especially at this devel stage, because
> that'd give confidence that it's also correct in the constrained way we're
> needing it.
>
>> Certainly we have to be careful WRT the implicit set of the return value
>> of the setjmp call that occurs on the longjmp path.  That's worth
>> investigating.  I suspect that works today more by accident of having an
>> incorrect CFG than by design.
>
> No, our current imprecise CFG makes it so that the setting of the return
> value is reached by the abnormal edge from the dispatcher, so in a
> data-flow sense it's executed on the first- and second-return case and all
> is well, and so the IL subsumes all the effects that happen in reality
> (and more, which is the reason for the missed optimizations).  You want to
> change the CFG such that it doesn't subsume effects that don't happen in
> reality, and that makes sense, but by that you're making it not reflect
> actions which _do_ happen.
>
>> But it does or at least it should.  It's implicitly set on the longjmp
>> side.  If we get this wrong I'd expect we'll see uninit uses in the PHI.
>>  That's easy enough to instrument and check for.
>
> Not only uninit uses but conceivably misoptimizations as well.  Let's
> contrive an example which uses gotos to demonstrate the abnormal edges
> and let's have the second-return edge to land after the setjmp call (and
> setting of return value):
>
>   ret = setjmp(buf);
> second_return:
>   if (ret == 0) {             // first return
> inside:                       // see below
>     call foo(buf)             // might call longjmp
>     maybe-goto dispatch       // therefore we have an edge to dispatch
>   }
>   return;
> dispatch:                     // our abnormal dispatcher
>   goto second_return;         // go to second-return target
>
> Now, as far as data-flow is concerned, as soon as the ret==0 block is
> entered once, ret will never change from 0 again (there simply is no
> visible set which could change it).  So an optimizer would be correct in
> threading the sequence maybe-goto-dispatch->second_return to inside (bear
> with me):
>
>   ret = setjmp();
>   if (ret == 0) {             // first return
> inside:                       // see below
>     call foo()                // might call longjmp
>     maybe-goto inside         // if it does we loop, ret must still be zero ?!
>   }
>   return;
>
> This currently would not happen: first we don't thread through abnormal
> edges, and propagation of knowledge over abnormal edges is artificially
> limited as well.  But from a pure data-flow perspective the above
> transformation would be correct, meaning that initial IL can't have been
> correct.  You might say in practice even if the above would happen it
> doesn't matter, because at runtime, with a real longjmp, the "maybe-goto
> inside" would in reality land at the setjmp return again, hence setting
> ret, then the test comes along and all is well.  That would probably be
> true currently, but assume foo always calls longjmp, and GCC can figure
> out it's using the same buffer as the above setjmp, then a further
> optimization might do away with the longjmp and just jump directly to the
> inside label: misoptimization achieved.
>
> None of the above events can happen when the second-return target is
> imprecise and before the setjmp.
>
> Now that all might sound like theoretical games, but I think it highlights
> why we should think long and hard about setjmp CFGs and IL before we relax
> it.
>
> FWIW: I think none of the issues I'm thinking about matter when you check
> two things: 1) that the setjmp call has no LHS, and 2) that the target BB
> has a single predecessor.  I was only triggered by the discussion between
> you and Richi of why (2) might not be important.  But you certainly miss
> checking (1) as well.  IIRC the testcase you were trying to optimize had
> no LHS, right?
>
> It'd be nice if we try to make the setjmp IL correct For Real (tm) in the
> next devel phase:
> 1) make setjmp have two outgoing edges, always
> 2) create a setjmp_receive internal function that has a return value
> For 'ret = setjmp(buf)', create this CFG:
>
>   bb1
>   ret = setjmp(buf)
>    |       \              bb-recv
>    |        ----------------\
>    |                ret = setjmp_receiver
>    |                        /
>   normal   /---------------/
>   path    /
>    |     /
>   bb-succ
>
> None of these edges would be abnormal.  bb-recv would be the target for
> edges from all calls that might call longjmp(buf).  Those edges might need
> to be abnormal.  As the above CFG reflects all runtime effects precisely,
> but not which instructions are used to achieve them the expansion to RTL
> will be special.

Why do you still have the edge from setjmp to the setjmp receiver?
In your scheme ret is set twice on the longjmp return path, no?  That
is, you have the same issue left as we have with EH returns from a
stmt with a LHS.

We currently have two outgoing edges from setjmp, one which feeds back
to right before the setjmp call via the abnormal dispatcher (so it looks like
a loop).  Jeff's change will make it two outgoing edges to the same
single successor,
one dispatched through the abnormal dispatcher (that also nicely gets
around the limitation of only having a single edge between two blocks...)

Richard.

>
> Ciao,
> Michael.
Michael Matz March 6, 2018, 4:31 p.m. UTC | #17
Hi,

On Tue, 6 Mar 2018, Richard Biener wrote:

> >   bb1
> >   ret = setjmp(buf)
> >    |       \              bb-recv
> >    |        ----------------\
> >    |                ret = setjmp_receiver
> >    |                        /
> >   normal   /---------------/
> >   path    /
> >    |     /
> >   bb-succ
> >
> > None of these edges would be abnormal.  bb-recv would be the target for
> > edges from all calls that might call longjmp(buf).  Those edges might need
> > to be abnormal.  As the above CFG reflects all runtime effects precisely,
> > but not which instructions are used to achieve them the expansion to RTL
> > will be special.
> 
> Why do you still have the edge from setjmp to the setjmp receiver?

Ah, yes, that needs explanation.  Without that edge the receiver hangs in 
the air, so to speak.  But all effects that happened before the setjmp 
invocation also have happened before the second return, so the 
setjmp_receiver needs to be dominated by the setjmp call, and that 
requires a CFG edge.  A different way of thinking about this is that
both "calls" need VDEF/VUSE, and the VUSE of setjmp_receiver needs to be 
the VDEF of the setjmp, so again that edge needs to be there for ordering 
reasons.  At least that's the obvious way of ordering.  Thinking harder 
might make the edge unnecessary after all: all paths leading to longjmps 
need to go through a setjmp, so the call->receiver edges are already 
ordered behind setjmp calls (though not necessarily dominated by them), so 
the receiver is, and so we might be fine.  I'd have to paint some pictures 
on our board to see how this behaves with multiple reaching setjmps.

> In your scheme ret is set twice on the longjmp return path, no?

No, it's set once on the first-return path, and once on the second-return 
path (from longjmp to setjmp_receiver, which sets ret, the set of the 
setjmp call isn't done on the second-return path).  Which is indeed what 
happens in reality, the return register is set once on first-return and 
once on second-return.

> That is, you have the same issue left as we have with EH returns from a 
> stmt with a LHS.

I don't see that, which problem?

> We currently have two outgoing edges from setjmp, one which feeds back 
> to right before the setjmp call via the abnormal dispatcher (so it looks 
> like a loop).  Jeff's change will make it two outgoing edges to the same
> single successor, one dispatched through the abnormal dispatcher (that 
> also nicely gets around the limitation of only having a single edge 
> between two blocks...)

The crucial thing that needs to happen is that all paths from longjmp to 
the normal successor of the setjmp call contain an assignment to LHS.  
The edges out of setjmp aren't the important thing for this, the 
destination of edges from the dispatcher are (because that's the one 
targeted by the longjmp calls).  And IIUC Jeff's patch makes those edges
target something after the LHS-set, and this can't be right.  Make the 
dispatcher set an LHS (and hence have one per setjmp, not one per 
function) and you're effectively ending up with my proposal above.


Ciao,
Michael.
Jeff Law March 7, 2018, 6:01 a.m. UTC | #18
On 03/06/2018 01:57 AM, Richard Biener wrote:
> On Tue, Mar 6, 2018 at 4:41 AM, Jeff Law <law@redhat.com> wrote:
>> On 03/05/2018 12:30 PM, Michael Matz wrote:
>>> Hi,
>>>
>>> On Mon, 5 Mar 2018, Jeff Law wrote:
>>>
>>>>>> The single successor test was strictly my paranoia WRT abnormal/EH
>>>>>> edges.
>>>>>>
>>>>>> I don't immediately see why the CFG would be incorrect if the
>>>>>> successor of the setjmp block has multiple preds.
>>>>>
>>>>> Actually, without further conditions I don't see how it would be safe
>>>>> for the successor to have multiple preds.  We might have this
>>>>> situation:
>>>>>
>>>>> bb1: ret = setjmp
>>>>> bb2: x0 = phi <x1 (bb1), foo(bbX)>
>>>> No.  Can't happen -- we're still building the initial CFG.  There are no
>>>> PHI nodes, there are no SSA_NAMEs.
>>>
>>> While that is currently true I think it's short-sighted.  Thinking about
>>> the situation in terms of SSA names and PHI nodes clears up the mind.  In
>>> addition there is already code which builds (sub-)cfgs when SSA form
>>> exists (gimple_find_sub_bbs).  Currently that code can't ever generate
>>> setjmp calls, so it's not an issue.
>> It's not clearing up anything for me.  Clearly you're onto something
>> that I'm missing, but still trying to figure out.
>>
>> Certainly we have to be careful WRT the implicit set of the return value
>> of the setjmp call that occurs on the longjmp path.  That's worth
>> investigating.  I suspect that works today more by accident of having an
>> incorrect CFG than by design.
>>
>>
>>>
>>>> We have two choices we can either target the setjmp itself, which is
>>>> what we've been doing and is inaccurate.  Or we can target the point
>>>> immediately after the setjmp, which is accurate.
>>>
>>> Not precisely, because the setting of the return value of setjmp does
>>> happen after both returns.  So moving the whole second-return edge target
>>> to after the setjmp call (when it includes an LHS) is not correct
>>> (irrespective of how the situation in the successor BBs looks).
>> But it does or at least it should.  It's implicitly set on the longjmp
>> side.  If we get this wrong I'd expect we'll see uninit uses in the PHI.
>>  That's easy enough to instrument and check for.
>>
>> This aspect of setjmp/longjmp is, in some ways, easier to see in RTL
>> because the call returns its value in a hard reg which is implicitly set
>> by the longjmp and we immediately copy it into a pseudo.   Which would
>> magically DTRT if we had the longjmp edge target the point just after
>> the setjmp in RTL.
> 
> While it's true that the hardreg is set by the callee the GIMPLE IL
> indeed doesn't reflect this (and we have a similar issue with EH
> where the exceptional return does _not_ include the assignment
> to the LHS but the GIMPLE IL does...).
> 
> So with your patch we should see
> 
>  ret_1 = setjmp ();
>    |                                 \
>    |                              AB dispatcher
>    |                                    /
>    v                                   v
> # ret_2 = PHI <ret_1, ret_1(ab)>
> ...
> 
> even w/o a PHI.  So I think we should be fine given we have that
> edge from setjmp to the abnormal dispatcher.
I believe so by nature that the setjmp dominates the longjmp sites and
thus also dominates the dispatcher.  But it's something I want to
explicitly check before resubmitting.

jeff
Peter Bergner March 8, 2018, 12:04 a.m. UTC | #19
On 3/7/18 12:01 AM, Jeff Law wrote:
> I believe so by nature that the setjmp dominates the longjmp sites and
> thus also dominates the dispatcher.  But it's something I want to
> explicitly check before resubmitting.

Are we sure a setjmp has to dominate its longjmp sites?  Couldn't you
have something like:

bb(a):                     bb(b):
  ...                        ...
  setjmp (env)               setjmp (env)
      \                         /
       \                       /
        \                     /
         \                   /
          \                 /
           \               /
                bb(c):
                  ...
                  longjmp (env)

...or:

bb(a):
  ...
  setjmp (env)
  |\
  | \
  |  \
  |   \
  |   bb(b):
  |     ...
  |     setjmp (env)
  |   /
  |  /
  | /
  v
bb(c):
  ...
  longjmp (env)

If so, then the setjmp calls might not dominate the longjmp call.
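
In C, the first shape could come from something like this (g is an
invented stand-in for a function that may call longjmp (env, 1)):

  #include <setjmp.h>

  static jmp_buf env;
  extern void g (void);

  void f (int c)
  {
    if (c)
      setjmp (env);   /* bb(a) */
    else
      setjmp (env);   /* bb(b) */
    g ();             /* bb(c): neither setjmp call dominates this
                         potential longjmp site */
  }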

Peter
Michael Matz March 8, 2018, 12:54 p.m. UTC | #20
Hi,

On Wed, 7 Mar 2018, Peter Bergner wrote:

> On 3/7/18 12:01 AM, Jeff Law wrote:
> > I believe so by nature that the setjmp dominates the longjmp sites and
> > thus also dominates the dispatcher.  But it's something I want to
> > explicitly check before resubmitting.
> 
> Are we sure a setjmp has to dominate its longjmp sites?

No, they don't have to dominate.  For lack of a better term I used something
like "ordered after" in my mails :)

> Couldn't you have something like:

Yeah, exactly.


Ciao,
Michael.
Richard Biener March 8, 2018, 1:22 p.m. UTC | #21
On Thu, Mar 8, 2018 at 1:54 PM, Michael Matz <matz@suse.de> wrote:
> Hi,
>
> On Wed, 7 Mar 2018, Peter Bergner wrote:
>
>> On 3/7/18 12:01 AM, Jeff Law wrote:
>> > I believe so by nature that the setjmp dominates the longjmp sites and
>> > thus also dominates the dispatcher.  But it's something I want to
>> > explicitly check before resubmitting.
>>
>> Are we sure a setjmp has to dominate its longjmp sites?
>
> No, they don't have to dominate.  For lack of better term I used something
> like "ordered after" in my mails :)
>
>> Couldn't you have something like:
>
> Yeah, exactly.

So given all the discussion _iff_ we want to change the CFG we generate then
let's invent a general __builtin_receiver () and lower setjmp to

  setjmp ();
  res = __builtin_receiver ();

and construct a CFG around that.  Remember that IIRC I added the abnormal
edges to and from setjmp to inhibit code-motion across it so whatever CFG
we'll end up with should ensure that there can't be any code-motion optimization
across the above pair nor "inbetween" it.  The straight-forward
CFG of, apart from the fallthru from setjmp to the receiver, a abnormal edge
to the dispatcher from setjmp and an abnormal edge from the dispatcher to
the receiver would do that trick I think.

I'd rather not do that for GCC 8 though.  So to fix the warning can't we do
sth else "good" and move the strange warning code from RTL to GIMPLE?

Or re-do the warning?  Since in the other thread about setjmp side-effects
we concluded that setjmp has to preserve all call-saved regs?  I don't see
that reflected in regno_clobbered_at_setjmp or its caller -- that is,
we should only warn for call clobbered and thus caller-saved regs because
normal return may clobber the spilled values.
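
For reference, the classic shape the warning fires on (a minimal C
sketch; g is an invented stand-in for a callee that may longjmp, and
whether this exact variant warns depends on register allocation):

  #include <setjmp.h>

  static jmp_buf env;
  extern void g (void);

  int f (void)
  {
    int x = 1;
    if (setjmp (env) == 0)
      {
        x = 2;          /* a second static set of x, live across the
                           setjmp: subject to the clobber analysis */
        g ();           /* may call longjmp (env, 1) */
      }
    return x;           /* warning: 'x' might be clobbered by 'longjmp' */
  }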

Not sure if the PR testcase is amongst the cases fixed by such change.

Richard.

>
> Ciao,
> Michael.
Richard Biener March 8, 2018, 1:26 p.m. UTC | #22
On Thu, Mar 8, 2018 at 2:22 PM, Richard Biener
<richard.guenther@gmail.com> wrote:
> On Thu, Mar 8, 2018 at 1:54 PM, Michael Matz <matz@suse.de> wrote:
>> Hi,
>>
>> On Wed, 7 Mar 2018, Peter Bergner wrote:
>>
>>> On 3/7/18 12:01 AM, Jeff Law wrote:
>>> > I believe so by nature that the setjmp dominates the longjmp sites and
>>> > thus also dominates the dispatcher.  But it's something I want to
>>> > explicitly check before resubmitting.
>>>
>>> Are we sure a setjmp has to dominate its longjmp sites?
>>
>> No, they don't have to dominate.  For lack of better term I used something
>> like "ordered after" in my mails :)
>>
>>> Couldn't you have something like:
>>
>> Yeah, exactly.
>
> So given all the discussion _iff_ we want to change the CFG we generate then
> let's invent a general __builtin_receiver () and lower setjmp to
>
>   setjmp ();
>   res = __builtin_receiver ();
>
> and construct a CFG around that.  Remember that IIRC I added the abnormal
> edges to and from setjmp to inhibit code-motion across it so whatever CFG
> we'll end up with should ensure that there can't be any code-motion optimization
> across the above pair nor "in between" it.  The straight-forward
> CFG of, apart from the fallthru from setjmp to the receiver, an abnormal edge
> to the dispatcher from setjmp and an abnormal edge from the dispatcher to
> the receiver would do that trick I think.
>
> I'd rather not do that for GCC 8 though.  So to fix the warning can't we do
> sth else "good" and move the strange warning code from RTL to GIMPLE?
>
> Or re-do the warning?  Since in the other thread about setjmp side-effects
> we concluded that setjmp has to preserve all call-saved regs?  I don't see
> that reflected in regno_clobbered_at_setjmp or its caller -- that is,
> we should only warn for call clobbered and thus caller-saved regs because
> normal return may clobber the spilled values.
>
> Not sure if the PR testcase is amongst the cases fixed by such change.

Ah, slight complication is we warn from IRA but _before_ RA.  Not sure if
there's a place where hardregs are assigned but not yet spilled (and those
hardreg assignments hold in the end) where we can move the warning to.

Richard.

> Richard.
>
>>
>> Ciao,
>> Michael.
Michael Matz March 8, 2018, 1:28 p.m. UTC | #23
Hi,

On Thu, 8 Mar 2018, Richard Biener wrote:

> Or re-do the warning?  Since in the other thread about setjmp side-effects
> we concluded that setjmp has to preserve all call-saved regs?

Even worse.  On SPARC setjmp clobbers even more than just call-clobbered 
regs (!).


Ciao,
Michael.
Jeff Law March 9, 2018, 7:42 p.m. UTC | #24
On 03/08/2018 06:22 AM, Richard Biener wrote:
> On Thu, Mar 8, 2018 at 1:54 PM, Michael Matz <matz@suse.de> wrote:
>> Hi,
>>
>> On Wed, 7 Mar 2018, Peter Bergner wrote:
>>
>>> On 3/7/18 12:01 AM, Jeff Law wrote:
>>>> I believe so by nature that the setjmp dominates the longjmp sites and
>>>> thus also dominates the dispatcher.  But it's something I want to
>>>> explicitly check before resubmitting.
>>>
>>> Are we sure a setjmp has to dominate its longjmp sites?
>>
>> No, they don't have to dominate.  For lack of better term I used something
>> like "ordered after" in my mails :)
>>
>>> Couldn't you have something like:
>>
>> Yeah, exactly.
> 
> So given all the discussion _iff_ we want to change the CFG we generate then
> let's invent a general __builtin_receiver () and lower setjmp to
And I'm seriously thinking we may want to hold off on the fix for 61118
for gcc-8.  We still might want to fix 21161 though.  I still need to
digest all the discussion.


> 
>   setjmp ();
>   res = __builtin_receiver ();
> 
> and construct a CFG around that.  Remember that IIRC I added the abnormal
> edges to and from setjmp to inhibit code-motion across it so whatever CFG
> we'll end up with should ensure that there can't be any code-motion optimization
> across the above pair nor "inbetween" it.  The straight-forward
> CFG of, apart from the fallthru from setjmp to the receiver, a abnormal edge
> to the dispatcher from setjmp and an abnormal edge from the dispatcher to
> the receiver would do that trick I think.
> 
> I'd rather not do that for GCC 8 though.  So to fix the warning can't we do
> sth else "good" and move the strange warning code from RTL to GIMPLE?
Someone mentioned that possibility in one of the related BZs.  The
concern was that the factoring of the handler could really hinder good
dataflow analysis.  We could end up making things worse :(


> 
> Or re-do the warning?  Since in the other thread about setjmp side-effects
> we concluded that setjmp has to preserve all call-saved regs?  I don't see
> that reflected in regno_clobbered_at_setjmp or its caller -- that is,
> we should only warn for call clobbered and thus caller-saved regs because
> normal return may clobber the spilled values.
Possibly.  It's clear from the discussion and multitude of BZs that this
is complex and easily goof'd.

I believe part of the "trick" here is that once we compute (in RTL) the
set of objects live across the setjmp, the allocators then refuse to
allocate those values into call-saved registers (hence the other
discussion thread with Peter and co.).  Of course the RTL analysis gets
this wrong in a roughly similar manner (21161).


> Not sure if the PR testcase is amongst the cases fixed by such change.
Unclear -- it'd likely depend on where we do the analysis.  It's
certainly the case for 61118 that if the analysis happens in RTL
and we haven't addressed our CFG correctness issues, we're going to
fail.

jeff.
Jeff Law March 9, 2018, 7:45 p.m. UTC | #25
On 03/07/2018 05:04 PM, Peter Bergner wrote:
> On 3/7/18 12:01 AM, Jeff Law wrote:
>> I believe so by nature that the setjmp dominates the longjmp sites and
>> thus also dominates the dispatcher.  But it's something I want to
>> explicitly check before resubmitting.
> 
> Are we sure a setjmp has to dominate its longjmp sites?  Couldn't you
> have something like:
> 
> bb(a):                     bb(b):
>   ...                        ...
>   setjmp (env)               setjmp (env)
>       \                         /
>        \                       /
>         \                     /
>          \                   /
>           \                 /
>            \               /
>                 bb(c):
>                   ...
>                   longjmp (env)
> 
> ...or:
> 
> bb(a):
>   ...
>   setjmp (env)
>   |\
>   | \
>   |  \
>   |   \
>   |   bb(b):
>   |     ...
>   |     setjmp (env)
>   |   /
>   |  /
>   | /
>   v
> bb(c):
>   ...
>   longjmp (env)
> 
> If so, then the setjmp calls might not dominate the longjmp call.
Right.  This is one of the cases that needs investigation WRT the value
that would flow into the PHI from the dispatcher.

Jeff
Richard Biener March 9, 2018, 8:20 p.m. UTC | #26
On March 9, 2018 8:42:16 PM GMT+01:00, Jeff Law <law@redhat.com> wrote:
>On 03/08/2018 06:22 AM, Richard Biener wrote:
>> On Thu, Mar 8, 2018 at 1:54 PM, Michael Matz <matz@suse.de> wrote:
>>> Hi,
>>>
>>> On Wed, 7 Mar 2018, Peter Bergner wrote:
>>>
>>>> On 3/7/18 12:01 AM, Jeff Law wrote:
>>>>> I believe so, by nature of the setjmp dominating the longjmp sites
>>>>> and thus also dominating the dispatcher.  But it's something I want
>>>>> to explicitly check before resubmitting.
>>>>
>>>> Are we sure a setjmp has to dominate its longjmp sites?
>>>
>>> No, they don't have to dominate.  For lack of a better term I used
>>> something like "ordered after" in my mails :)
>>>
>>>> Couldn't you have something like:
>>>
>>> Yeah, exactly.
>>
>> So given all the discussion _iff_ we want to change the CFG we generate
>> then let's invent a general __builtin_receiver () and lower setjmp to
>And I'm seriously thinking we may want to hold off on the fix for 61118
>for gcc-8.  We still might want to fix 21161 though.  I still need to
>digest all the discussion.
>
>
>>
>>   setjmp ();
>>   res = __builtin_receiver ();
>>
>> and construct a CFG around that.  Remember that IIRC I added the
>> abnormal edges to and from setjmp to inhibit code-motion across it so
>> whatever CFG we'll end up with should ensure that there can't be any
>> code-motion optimization across the above pair nor "in between" it.
>> The straightforward CFG of, apart from the fallthru from setjmp to the
>> receiver, an abnormal edge to the dispatcher from setjmp and an
>> abnormal edge from the dispatcher to the receiver would do that trick
>> I think.
>>
>> I'd rather not do that for GCC 8 though.  So to fix the warning can't
>> we do something else "good" and move the strange warning code from RTL
>> to GIMPLE?
>Someone mentioned that possibility in one of the related BZs.  The
>concern was that factoring the handler could really hinder good
>dataflow analysis.  We could end up making things worse :(
>
>
>>
>> Or re-do the warning?  Since in the other thread about setjmp
>> side-effects we concluded that setjmp has to preserve all call-saved
>> regs?  I don't see that reflected in regno_clobbered_at_setjmp or its
>> caller -- that is, we should only warn for call-clobbered and thus
>> caller-saved regs because a normal return may clobber the spilled
>> values.
>Possibly.  It's clear from the discussion and multitude of BZs that
>this is complex and easily goof'd.
>
>I believe part of the "trick" here is that once we compute (in RTL) the
>set of objects live across the setjmp, the allocators then refuse to
>allocate those values into call-saved registers (hence the other
>discussion thread with Peter and co.).  Of course the RTL analysis gets
>this wrong in a roughly similar manner (21161).
>
>
>> Not sure if the PR testcase is amongst the cases fixed by such change.
>Unclear -- it'd likely depend on where we do the analysis.  It's
>certainly the case for 61118 that if the analysis happens in RTL and we
>haven't addressed our CFG correctness issues, we're going to fail.

Note that on RTL the CFG does not have any abnormal edges for setjmp and
thus code is freely moved across it.  Maybe addressing that would also
help.
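
(A hypothetical C illustration of why such motion would be unsafe: if a
pass sank the load of "g" below the setjmp, the value returned after the
longjmp would change.)

#include <setjmp.h>

static jmp_buf env;
extern int g;

int
baz (void)
{
  int t = g;                    /* load performed before the setjmp */
  if (setjmp (env))
    return t;                   /* t is unmodified after setjmp, so it
                                   must still hold the old value of g */
  g = 42;
  longjmp (env, 1);
}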

Richard. 

>
>jeff.
diff mbox series

Patch

diff --git a/gcc/tree-cfg.c b/gcc/tree-cfg.c
index b87e48d..551195a 100644
--- a/gcc/tree-cfg.c
+++ b/gcc/tree-cfg.c
@@ -35,6 +35,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "fold-const.h"
 #include "trans-mem.h"
 #include "stor-layout.h"
+#include "calls.h"
 #include "print-tree.h"
 #include "cfganal.h"
 #include "gimple-fold.h"
@@ -776,13 +777,22 @@  get_abnormal_succ_dispatcher (basic_block bb)
 static void
 handle_abnormal_edges (basic_block *dispatcher_bbs,
 		       basic_block for_bb, int *bb_to_omp_idx,
-		       auto_vec<basic_block> *bbs, bool computed_goto)
+		       auto_vec<basic_block> *bbs, bool computed_goto,
+		       bool target_after_setjmp)
 {
   basic_block *dispatcher = dispatcher_bbs + (computed_goto ? 1 : 0);
   unsigned int idx = 0;
-  basic_block bb;
+  basic_block bb, target_bb;
   bool inner = false;
 
+  /* Determine the block the abnormal dispatcher will transfer
+     control to.  It may be FOR_BB, or in some cases it may be the
+     single successor of FOR_BB.  */
+  if (target_after_setjmp)
+    target_bb = single_succ (for_bb);
+  else
+    target_bb = for_bb;
+
   if (bb_to_omp_idx)
     {
       dispatcher = dispatcher_bbs + 2 * bb_to_omp_idx[for_bb->index];
@@ -878,7 +888,7 @@  handle_abnormal_edges (basic_block *dispatcher_bbs,
 	}
     }
 
-  make_edge (*dispatcher, for_bb, EDGE_ABNORMAL);
+  make_edge (*dispatcher, target_bb, EDGE_ABNORMAL);
 }
 
 /* Creates outgoing edges for BB.  Returns 1 when it ends with an
@@ -1075,11 +1085,11 @@  make_edges (void)
 		 potential target for a computed goto or a non-local goto.  */
 	      if (FORCED_LABEL (target))
 		handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
-				       &ab_edge_goto, true);
+				       &ab_edge_goto, true, false);
 	      if (DECL_NONLOCAL (target))
 		{
 		  handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
-					 &ab_edge_call, false);
+					 &ab_edge_call, false, false);
 		  break;
 		}
 	    }
@@ -1094,8 +1104,24 @@  make_edges (void)
 		  && ((gimple_call_flags (call_stmt) & ECF_RETURNS_TWICE)
 		      || gimple_call_builtin_p (call_stmt,
 						BUILT_IN_SETJMP_RECEIVER)))
-		handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
-				       &ab_edge_call, false);
+		{
+		  bool target_after_setjmp = false;
+
+		  /* If the returns twice statement looks like a setjmp
+		     call at the end of a block with a single successor
+		     then we want the edge from the dispatcher to target
+		     that single successor.  That more accurately reflects
+		     actual control flow.  The more accurate CFG also
+		     results in fewer false positive warnings.  */
+		  if (gsi_stmt (gsi_last_nondebug_bb (bb)) == call_stmt
+		      && gimple_call_fndecl (call_stmt)
+		      && setjmp_call_p (gimple_call_fndecl (call_stmt))
+		      && single_succ_p (bb))
+		    target_after_setjmp = true;
+		  handle_abnormal_edges (dispatcher_bbs, bb, bb_to_omp_idx,
+					 &ab_edge_call, false,
+					 target_after_setjmp);
+		}
 	    }
 	}
 
diff --git a/gcc/testsuite/gcc.dg/torture/pr57147-2.c b/gcc/testsuite/gcc.dg/torture/pr57147-2.c
deleted file mode 100644
index fc5fb39..0000000
--- a/gcc/testsuite/gcc.dg/torture/pr57147-2.c
+++ /dev/null
@@ -1,22 +0,0 @@ 
-/* { dg-do compile } */
-/* { dg-options "-fdump-tree-optimized" } */
-/* { dg-skip-if "" { *-*-* } { "-fno-fat-lto-objects" } { "" } } */
-/* { dg-require-effective-target indirect_jumps } */
-
-struct __jmp_buf_tag {};
-typedef struct __jmp_buf_tag jmp_buf[1];
-extern int _setjmp (struct __jmp_buf_tag __env[1]);
-
-jmp_buf g_return_jmp_buf;
-
-void SetNaClSwitchExpectations (void)
-{
-  __builtin_longjmp (g_return_jmp_buf, 1);
-}
-void TestSyscall(void)
-{
-  SetNaClSwitchExpectations();
-  _setjmp (g_return_jmp_buf);
-}
-
-/* { dg-final { scan-tree-dump "setjmp" "optimized" } } */
diff --git a/gcc/testsuite/gcc.dg/torture/pr61118.c b/gcc/testsuite/gcc.dg/torture/pr61118.c
new file mode 100644
index 0000000..12be892
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/torture/pr61118.c
@@ -0,0 +1,652 @@ 
+/* { dg-options "-Wextra -fno-tracer" } */
+typedef unsigned char __u_char;
+typedef unsigned short int __u_short;
+typedef unsigned int __u_int;
+typedef unsigned long int __u_long;
+typedef signed char __int8_t;
+typedef unsigned char __uint8_t;
+typedef signed short int __int16_t;
+typedef unsigned short int __uint16_t;
+typedef signed int __int32_t;
+typedef unsigned int __uint32_t;
+typedef signed long int __int64_t;
+typedef unsigned long int __uint64_t;
+typedef long int __quad_t;
+typedef unsigned long int __u_quad_t;
+typedef unsigned long int __dev_t;
+typedef unsigned int __uid_t;
+typedef unsigned int __gid_t;
+typedef unsigned long int __ino_t;
+typedef unsigned long int __ino64_t;
+typedef unsigned int __mode_t;
+typedef unsigned long int __nlink_t;
+typedef long int __off_t;
+typedef long int __off64_t;
+typedef int __pid_t;
+typedef struct { int __val[2]; } __fsid_t;
+typedef long int __clock_t;
+typedef unsigned long int __rlim_t;
+typedef unsigned long int __rlim64_t;
+typedef unsigned int __id_t;
+typedef long int __time_t;
+typedef unsigned int __useconds_t;
+typedef long int __suseconds_t;
+typedef int __daddr_t;
+typedef int __key_t;
+typedef int __clockid_t;
+typedef void * __timer_t;
+typedef long int __blksize_t;
+typedef long int __blkcnt_t;
+typedef long int __blkcnt64_t;
+typedef unsigned long int __fsblkcnt_t;
+typedef unsigned long int __fsblkcnt64_t;
+typedef unsigned long int __fsfilcnt_t;
+typedef unsigned long int __fsfilcnt64_t;
+typedef long int __fsword_t;
+typedef long int __ssize_t;
+typedef long int __syscall_slong_t;
+typedef unsigned long int __syscall_ulong_t;
+typedef __off64_t __loff_t;
+typedef __quad_t *__qaddr_t;
+typedef char *__caddr_t;
+typedef long int __intptr_t;
+typedef unsigned int __socklen_t;
+static __inline unsigned int
+__bswap_32 (unsigned int __bsx)
+{
+  return __builtin_bswap32 (__bsx);
+}
+static __inline __uint64_t
+__bswap_64 (__uint64_t __bsx)
+{
+  return __builtin_bswap64 (__bsx);
+}
+typedef long unsigned int size_t;
+typedef __time_t time_t;
+struct timespec
+  {
+    __time_t tv_sec;
+    __syscall_slong_t tv_nsec;
+  };
+typedef __pid_t pid_t;
+struct sched_param
+  {
+    int __sched_priority;
+  };
+struct __sched_param
+  {
+    int __sched_priority;
+  };
+typedef unsigned long int __cpu_mask;
+typedef struct
+{
+  __cpu_mask __bits[1024 / (8 * sizeof (__cpu_mask))];
+} cpu_set_t;
+extern int __sched_cpucount (size_t __setsize, const cpu_set_t *__setp)
+  __attribute__ ((__nothrow__ , __leaf__));
+extern cpu_set_t *__sched_cpualloc (size_t __count) __attribute__ ((__nothrow__ , __leaf__)) ;
+extern void __sched_cpufree (cpu_set_t *__set) __attribute__ ((__nothrow__ , __leaf__));
+extern int sched_setparam (__pid_t __pid, const struct sched_param *__param)
+     __attribute__ ((__nothrow__ , __leaf__));
+extern int sched_getparam (__pid_t __pid, struct sched_param *__param) __attribute__ ((__nothrow__ , __leaf__));
+extern int sched_setscheduler (__pid_t __pid, int __policy,
+          const struct sched_param *__param) __attribute__ ((__nothrow__ , __leaf__));
+extern int sched_getscheduler (__pid_t __pid) __attribute__ ((__nothrow__ , __leaf__));
+extern int sched_yield (void) __attribute__ ((__nothrow__ , __leaf__));
+extern int sched_get_priority_max (int __algorithm) __attribute__ ((__nothrow__ , __leaf__));
+extern int sched_get_priority_min (int __algorithm) __attribute__ ((__nothrow__ , __leaf__));
+extern int sched_rr_get_interval (__pid_t __pid, struct timespec *__t) __attribute__ ((__nothrow__ , __leaf__));
+typedef __clock_t clock_t;
+typedef __clockid_t clockid_t;
+typedef __timer_t timer_t;
+struct tm
+{
+  int tm_sec;
+  int tm_min;
+  int tm_hour;
+  int tm_mday;
+  int tm_mon;
+  int tm_year;
+  int tm_wday;
+  int tm_yday;
+  int tm_isdst;
+  long int tm_gmtoff;
+  const char *tm_zone;
+};
+struct itimerspec
+  {
+    struct timespec it_interval;
+    struct timespec it_value;
+  };
+struct sigevent;
+extern clock_t clock (void) __attribute__ ((__nothrow__ , __leaf__));
+extern time_t time (time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
+extern double difftime (time_t __time1, time_t __time0)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
+extern time_t mktime (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
+extern size_t strftime (char *__restrict __s, size_t __maxsize,
+   const char *__restrict __format,
+   const struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
+typedef struct __locale_struct
+{
+  struct __locale_data *__locales[13];
+  const unsigned short int *__ctype_b;
+  const int *__ctype_tolower;
+  const int *__ctype_toupper;
+  const char *__names[13];
+} *__locale_t;
+typedef __locale_t locale_t;
+extern size_t strftime_l (char *__restrict __s, size_t __maxsize,
+     const char *__restrict __format,
+     const struct tm *__restrict __tp,
+     __locale_t __loc) __attribute__ ((__nothrow__ , __leaf__));
+extern struct tm *gmtime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
+extern struct tm *localtime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
+extern struct tm *gmtime_r (const time_t *__restrict __timer,
+       struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
+extern struct tm *localtime_r (const time_t *__restrict __timer,
+          struct tm *__restrict __tp) __attribute__ ((__nothrow__ , __leaf__));
+extern char *asctime (const struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
+extern char *ctime (const time_t *__timer) __attribute__ ((__nothrow__ , __leaf__));
+extern char *asctime_r (const struct tm *__restrict __tp,
+   char *__restrict __buf) __attribute__ ((__nothrow__ , __leaf__));
+extern char *ctime_r (const time_t *__restrict __timer,
+        char *__restrict __buf) __attribute__ ((__nothrow__ , __leaf__));
+extern char *__tzname[2];
+extern int __daylight;
+extern long int __timezone;
+extern char *tzname[2];
+extern void tzset (void) __attribute__ ((__nothrow__ , __leaf__));
+extern int daylight;
+extern long int timezone;
+extern int stime (const time_t *__when) __attribute__ ((__nothrow__ , __leaf__));
+extern time_t timegm (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
+extern time_t timelocal (struct tm *__tp) __attribute__ ((__nothrow__ , __leaf__));
+extern int dysize (int __year) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
+extern int nanosleep (const struct timespec *__requested_time,
+        struct timespec *__remaining);
+extern int clock_getres (clockid_t __clock_id, struct timespec *__res) __attribute__ ((__nothrow__ , __leaf__));
+extern int clock_gettime (clockid_t __clock_id, struct timespec *__tp) __attribute__ ((__nothrow__ , __leaf__));
+extern int clock_settime (clockid_t __clock_id, const struct timespec *__tp)
+     __attribute__ ((__nothrow__ , __leaf__));
+extern int clock_nanosleep (clockid_t __clock_id, int __flags,
+       const struct timespec *__req,
+       struct timespec *__rem);
+extern int clock_getcpuclockid (pid_t __pid, clockid_t *__clock_id) __attribute__ ((__nothrow__ , __leaf__));
+extern int timer_create (clockid_t __clock_id,
+    struct sigevent *__restrict __evp,
+    timer_t *__restrict __timerid) __attribute__ ((__nothrow__ , __leaf__));
+extern int timer_delete (timer_t __timerid) __attribute__ ((__nothrow__ , __leaf__));
+extern int timer_settime (timer_t __timerid, int __flags,
+     const struct itimerspec *__restrict __value,
+     struct itimerspec *__restrict __ovalue) __attribute__ ((__nothrow__ , __leaf__));
+extern int timer_gettime (timer_t __timerid, struct itimerspec *__value)
+     __attribute__ ((__nothrow__ , __leaf__));
+extern int timer_getoverrun (timer_t __timerid) __attribute__ ((__nothrow__ , __leaf__));
+typedef unsigned long int pthread_t;
+union pthread_attr_t
+{
+  char __size[56];
+  long int __align;
+};
+typedef union pthread_attr_t pthread_attr_t;
+typedef struct __pthread_internal_list
+{
+  struct __pthread_internal_list *__prev;
+  struct __pthread_internal_list *__next;
+} __pthread_list_t;
+typedef union
+{
+  struct __pthread_mutex_s
+  {
+    int __lock;
+    unsigned int __count;
+    int __owner;
+    unsigned int __nusers;
+    int __kind;
+    short __spins;
+    short __elision;
+    __pthread_list_t __list;
+  } __data;
+  char __size[40];
+  long int __align;
+} pthread_mutex_t;
+typedef union
+{
+  char __size[4];
+  int __align;
+} pthread_mutexattr_t;
+typedef union
+{
+  struct
+  {
+    int __lock;
+    unsigned int __futex;
+    __extension__ unsigned long long int __total_seq;
+    __extension__ unsigned long long int __wakeup_seq;
+    __extension__ unsigned long long int __woken_seq;
+    void *__mutex;
+    unsigned int __nwaiters;
+    unsigned int __broadcast_seq;
+  } __data;
+  char __size[48];
+  __extension__ long long int __align;
+} pthread_cond_t;
+typedef union
+{
+  char __size[4];
+  int __align;
+} pthread_condattr_t;
+typedef unsigned int pthread_key_t;
+typedef int pthread_once_t;
+typedef union
+{
+  struct
+  {
+    int __lock;
+    unsigned int __nr_readers;
+    unsigned int __readers_wakeup;
+    unsigned int __writer_wakeup;
+    unsigned int __nr_readers_queued;
+    unsigned int __nr_writers_queued;
+    int __writer;
+    int __shared;
+    unsigned long int __pad1;
+    unsigned long int __pad2;
+    unsigned int __flags;
+  } __data;
+  char __size[56];
+  long int __align;
+} pthread_rwlock_t;
+typedef union
+{
+  char __size[8];
+  long int __align;
+} pthread_rwlockattr_t;
+typedef volatile int pthread_spinlock_t;
+typedef union
+{
+  char __size[32];
+  long int __align;
+} pthread_barrier_t;
+typedef union
+{
+  char __size[4];
+  int __align;
+} pthread_barrierattr_t;
+typedef long int __jmp_buf[8];
+enum
+{
+  PTHREAD_CREATE_JOINABLE,
+  PTHREAD_CREATE_DETACHED
+};
+enum
+{
+  PTHREAD_MUTEX_TIMED_NP,
+  PTHREAD_MUTEX_RECURSIVE_NP,
+  PTHREAD_MUTEX_ERRORCHECK_NP,
+  PTHREAD_MUTEX_ADAPTIVE_NP
+  ,
+  PTHREAD_MUTEX_NORMAL = PTHREAD_MUTEX_TIMED_NP,
+  PTHREAD_MUTEX_RECURSIVE = PTHREAD_MUTEX_RECURSIVE_NP,
+  PTHREAD_MUTEX_ERRORCHECK = PTHREAD_MUTEX_ERRORCHECK_NP,
+  PTHREAD_MUTEX_DEFAULT = PTHREAD_MUTEX_NORMAL
+};
+enum
+{
+  PTHREAD_MUTEX_STALLED,
+  PTHREAD_MUTEX_STALLED_NP = PTHREAD_MUTEX_STALLED,
+  PTHREAD_MUTEX_ROBUST,
+  PTHREAD_MUTEX_ROBUST_NP = PTHREAD_MUTEX_ROBUST
+};
+enum
+{
+  PTHREAD_PRIO_NONE,
+  PTHREAD_PRIO_INHERIT,
+  PTHREAD_PRIO_PROTECT
+};
+enum
+{
+  PTHREAD_RWLOCK_PREFER_READER_NP,
+  PTHREAD_RWLOCK_PREFER_WRITER_NP,
+  PTHREAD_RWLOCK_PREFER_WRITER_NONRECURSIVE_NP,
+  PTHREAD_RWLOCK_DEFAULT_NP = PTHREAD_RWLOCK_PREFER_READER_NP
+};
+enum
+{
+  PTHREAD_INHERIT_SCHED,
+  PTHREAD_EXPLICIT_SCHED
+};
+enum
+{
+  PTHREAD_SCOPE_SYSTEM,
+  PTHREAD_SCOPE_PROCESS
+};
+enum
+{
+  PTHREAD_PROCESS_PRIVATE,
+  PTHREAD_PROCESS_SHARED
+};
+struct _pthread_cleanup_buffer
+{
+  void (*__routine) (void *);
+  void *__arg;
+  int __canceltype;
+  struct _pthread_cleanup_buffer *__prev;
+};
+enum
+{
+  PTHREAD_CANCEL_ENABLE,
+  PTHREAD_CANCEL_DISABLE
+};
+enum
+{
+  PTHREAD_CANCEL_DEFERRED,
+  PTHREAD_CANCEL_ASYNCHRONOUS
+};
+extern int pthread_create (pthread_t *__restrict __newthread,
+      const pthread_attr_t *__restrict __attr,
+      void *(*__start_routine) (void *),
+      void *__restrict __arg) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 3)));
+extern void pthread_exit (void *__retval) __attribute__ ((__noreturn__));
+extern int pthread_join (pthread_t __th, void **__thread_return);
+extern int pthread_detach (pthread_t __th) __attribute__ ((__nothrow__ , __leaf__));
+extern pthread_t pthread_self (void) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
+extern int pthread_equal (pthread_t __thread1, pthread_t __thread2)
+  __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__const__));
+extern int pthread_attr_init (pthread_attr_t *__attr) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_attr_destroy (pthread_attr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_attr_getdetachstate (const pthread_attr_t *__attr,
+     int *__detachstate)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_attr_setdetachstate (pthread_attr_t *__attr,
+     int __detachstate)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_attr_getguardsize (const pthread_attr_t *__attr,
+          size_t *__guardsize)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_attr_setguardsize (pthread_attr_t *__attr,
+          size_t __guardsize)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_attr_getschedparam (const pthread_attr_t *__restrict __attr,
+           struct sched_param *__restrict __param)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_attr_setschedparam (pthread_attr_t *__restrict __attr,
+           const struct sched_param *__restrict
+           __param) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_attr_getschedpolicy (const pthread_attr_t *__restrict
+     __attr, int *__restrict __policy)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_attr_setschedpolicy (pthread_attr_t *__attr, int __policy)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_attr_getinheritsched (const pthread_attr_t *__restrict
+      __attr, int *__restrict __inherit)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_attr_setinheritsched (pthread_attr_t *__attr,
+      int __inherit)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_attr_getscope (const pthread_attr_t *__restrict __attr,
+      int *__restrict __scope)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_attr_setscope (pthread_attr_t *__attr, int __scope)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_attr_getstackaddr (const pthread_attr_t *__restrict
+          __attr, void **__restrict __stackaddr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2))) __attribute__ ((__deprecated__));
+extern int pthread_attr_setstackaddr (pthread_attr_t *__attr,
+          void *__stackaddr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1))) __attribute__ ((__deprecated__));
+extern int pthread_attr_getstacksize (const pthread_attr_t *__restrict
+          __attr, size_t *__restrict __stacksize)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_attr_setstacksize (pthread_attr_t *__attr,
+          size_t __stacksize)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_attr_getstack (const pthread_attr_t *__restrict __attr,
+      void **__restrict __stackaddr,
+      size_t *__restrict __stacksize)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2, 3)));
+extern int pthread_attr_setstack (pthread_attr_t *__attr, void *__stackaddr,
+      size_t __stacksize) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_setschedparam (pthread_t __target_thread, int __policy,
+      const struct sched_param *__param)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (3)));
+extern int pthread_getschedparam (pthread_t __target_thread,
+      int *__restrict __policy,
+      struct sched_param *__restrict __param)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2, 3)));
+extern int pthread_setschedprio (pthread_t __target_thread, int __prio)
+     __attribute__ ((__nothrow__ , __leaf__));
+extern int pthread_once (pthread_once_t *__once_control,
+    void (*__init_routine) (void)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_setcancelstate (int __state, int *__oldstate);
+extern int pthread_setcanceltype (int __type, int *__oldtype);
+extern int pthread_cancel (pthread_t __th);
+extern void pthread_testcancel (void);
+typedef struct
+{
+  struct
+  {
+    __jmp_buf __cancel_jmp_buf;
+    int __mask_was_saved;
+  } __cancel_jmp_buf[1];
+  void *__pad[4];
+} __pthread_unwind_buf_t __attribute__ ((__aligned__));
+struct __pthread_cleanup_frame
+{
+  void (*__cancel_routine) (void *);
+  void *__cancel_arg;
+  int __do_it;
+  int __cancel_type;
+};
+extern void __pthread_register_cancel (__pthread_unwind_buf_t *__buf)
+     ;
+extern void __pthread_unregister_cancel (__pthread_unwind_buf_t *__buf)
+  ;
+extern void __pthread_unwind_next (__pthread_unwind_buf_t *__buf)
+     __attribute__ ((__noreturn__))
+     __attribute__ ((__weak__))
+     ;
+struct __jmp_buf_tag;
+extern int __sigsetjmp (struct __jmp_buf_tag *__env, int __savemask) __attribute__ ((__nothrow__));
+extern int pthread_mutex_init (pthread_mutex_t *__mutex,
+          const pthread_mutexattr_t *__mutexattr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutex_destroy (pthread_mutex_t *__mutex)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutex_trylock (pthread_mutex_t *__mutex)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutex_lock (pthread_mutex_t *__mutex)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutex_timedlock (pthread_mutex_t *__restrict __mutex,
+        const struct timespec *__restrict
+        __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_mutex_unlock (pthread_mutex_t *__mutex)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutex_getprioceiling (const pthread_mutex_t *
+      __restrict __mutex,
+      int *__restrict __prioceiling)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_mutex_setprioceiling (pthread_mutex_t *__restrict __mutex,
+      int __prioceiling,
+      int *__restrict __old_ceiling)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 3)));
+extern int pthread_mutex_consistent (pthread_mutex_t *__mutex)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutexattr_init (pthread_mutexattr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutexattr_destroy (pthread_mutexattr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutexattr_getpshared (const pthread_mutexattr_t *
+      __restrict __attr,
+      int *__restrict __pshared)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_mutexattr_setpshared (pthread_mutexattr_t *__attr,
+      int __pshared)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutexattr_gettype (const pthread_mutexattr_t *__restrict
+          __attr, int *__restrict __kind)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_mutexattr_settype (pthread_mutexattr_t *__attr, int __kind)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutexattr_getprotocol (const pthread_mutexattr_t *
+       __restrict __attr,
+       int *__restrict __protocol)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_mutexattr_setprotocol (pthread_mutexattr_t *__attr,
+       int __protocol)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutexattr_getprioceiling (const pthread_mutexattr_t *
+          __restrict __attr,
+          int *__restrict __prioceiling)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_mutexattr_setprioceiling (pthread_mutexattr_t *__attr,
+          int __prioceiling)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_mutexattr_getrobust (const pthread_mutexattr_t *__attr,
+     int *__robustness)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_mutexattr_setrobust (pthread_mutexattr_t *__attr,
+     int __robustness)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlock_init (pthread_rwlock_t *__restrict __rwlock,
+    const pthread_rwlockattr_t *__restrict
+    __attr) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlock_destroy (pthread_rwlock_t *__rwlock)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlock_rdlock (pthread_rwlock_t *__rwlock)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlock_tryrdlock (pthread_rwlock_t *__rwlock)
+  __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlock_timedrdlock (pthread_rwlock_t *__restrict __rwlock,
+           const struct timespec *__restrict
+           __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_rwlock_wrlock (pthread_rwlock_t *__rwlock)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlock_trywrlock (pthread_rwlock_t *__rwlock)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlock_timedwrlock (pthread_rwlock_t *__restrict __rwlock,
+           const struct timespec *__restrict
+           __abstime) __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_rwlock_unlock (pthread_rwlock_t *__rwlock)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlockattr_init (pthread_rwlockattr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlockattr_destroy (pthread_rwlockattr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlockattr_getpshared (const pthread_rwlockattr_t *
+       __restrict __attr,
+       int *__restrict __pshared)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_rwlockattr_setpshared (pthread_rwlockattr_t *__attr,
+       int __pshared)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_rwlockattr_getkind_np (const pthread_rwlockattr_t *
+       __restrict __attr,
+       int *__restrict __pref)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_rwlockattr_setkind_np (pthread_rwlockattr_t *__attr,
+       int __pref) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_cond_init (pthread_cond_t *__restrict __cond,
+         const pthread_condattr_t *__restrict __cond_attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_cond_destroy (pthread_cond_t *__cond)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_cond_signal (pthread_cond_t *__cond)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_cond_broadcast (pthread_cond_t *__cond)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_cond_wait (pthread_cond_t *__restrict __cond,
+         pthread_mutex_t *__restrict __mutex)
+     __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_cond_timedwait (pthread_cond_t *__restrict __cond,
+       pthread_mutex_t *__restrict __mutex,
+       const struct timespec *__restrict __abstime)
+     __attribute__ ((__nonnull__ (1, 2, 3)));
+extern int pthread_condattr_init (pthread_condattr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_condattr_destroy (pthread_condattr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_condattr_getpshared (const pthread_condattr_t *
+     __restrict __attr,
+     int *__restrict __pshared)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_condattr_setpshared (pthread_condattr_t *__attr,
+     int __pshared) __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_condattr_getclock (const pthread_condattr_t *
+          __restrict __attr,
+          __clockid_t *__restrict __clock_id)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_condattr_setclock (pthread_condattr_t *__attr,
+          __clockid_t __clock_id)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_spin_init (pthread_spinlock_t *__lock, int __pshared)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_spin_destroy (pthread_spinlock_t *__lock)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_spin_lock (pthread_spinlock_t *__lock)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_spin_trylock (pthread_spinlock_t *__lock)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_spin_unlock (pthread_spinlock_t *__lock)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_barrier_init (pthread_barrier_t *__restrict __barrier,
+     const pthread_barrierattr_t *__restrict
+     __attr, unsigned int __count)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_barrier_destroy (pthread_barrier_t *__barrier)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_barrier_wait (pthread_barrier_t *__barrier)
+     __attribute__ ((__nothrow__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_barrierattr_init (pthread_barrierattr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_barrierattr_destroy (pthread_barrierattr_t *__attr)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_barrierattr_getpshared (const pthread_barrierattr_t *
+        __restrict __attr,
+        int *__restrict __pshared)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1, 2)));
+extern int pthread_barrierattr_setpshared (pthread_barrierattr_t *__attr,
+        int __pshared)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_key_create (pthread_key_t *__key,
+          void (*__destr_function) (void *))
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (1)));
+extern int pthread_key_delete (pthread_key_t __key) __attribute__ ((__nothrow__ , __leaf__));
+extern void *pthread_getspecific (pthread_key_t __key) __attribute__ ((__nothrow__ , __leaf__));
+extern int pthread_setspecific (pthread_key_t __key,
+    const void *__pointer) __attribute__ ((__nothrow__ , __leaf__)) ;
+extern int pthread_getcpuclockid (pthread_t __thread_id,
+      __clockid_t *__clock_id)
+     __attribute__ ((__nothrow__ , __leaf__)) __attribute__ ((__nonnull__ (2)));
+extern int pthread_atfork (void (*__prepare) (void),
+      void (*__parent) (void),
+      void (*__child) (void)) __attribute__ ((__nothrow__ , __leaf__));
+extern __inline __attribute__ ((__gnu_inline__)) int
+__attribute__ ((__nothrow__ , __leaf__)) pthread_equal (pthread_t __thread1, pthread_t __thread2)
+{
+  return __thread1 == __thread2;
+}
+void cleanup_fn(void *mutex);
+typedef struct {
+  size_t progress;
+  size_t total;
+  pthread_mutex_t mutex;
+  pthread_cond_t cond;
+  double min_wait;
+} dmnsn_future;
+void
+dmnsn_future_wait(dmnsn_future *future, double progress)
+{
+  pthread_mutex_lock(&future->mutex);
+  while ((double)future->progress/future->total < progress) {
+    if (progress < future->min_wait) {
+      future->min_wait = progress;
+    }
+    do { __pthread_unwind_buf_t __cancel_buf; void (*__cancel_routine) (void *) = (cleanup_fn); void *__cancel_arg = (&future->mutex); int __not_first_call = __sigsetjmp ((struct __jmp_buf_tag *) (void *) __cancel_buf.__cancel_jmp_buf, 0); if (__builtin_expect ((__not_first_call), 0)) { __cancel_routine (__cancel_arg); __pthread_unwind_next (&__cancel_buf); } __pthread_register_cancel (&__cancel_buf); do {;
+    pthread_cond_wait(&future->cond, &future->mutex);
+    do { } while (0); } while (0); __pthread_unregister_cancel (&__cancel_buf); if (0) __cancel_routine (__cancel_arg); } while (0);
+  }
+  pthread_mutex_unlock(&future->mutex);
+}