diff mbox series

libgfortran: Fix and simplify IO locking [PR92836]

Message ID 20200131133753.29995-1-blomqvist.janne@gmail.com
State New
Headers show
Series libgfortran: Fix and simplify IO locking [PR92836] | expand

Commit Message

Janne Blomqvist Jan. 31, 2020, 1:37 p.m. UTC
Simplify IO locking in libgfortran.  The new IO implementation avoids
accessing units without locks, as seen in PR 92836.  It also avoids
lock inversion (except for a corner case wrt namelist query when
reading from stdin and outputting to stdout), making it easier to
verify correctness with tools like valgrind or threadsanitizer.  It is
also simplified as the waiting and closed variables are not needed
anymore, making it easier to understand and analyze.

Regtested on x86_64-pc-linux-gnu, Ok for master?
---
 libgfortran/io/close.c     |  17 ++---
 libgfortran/io/io.h        |  57 +-------------
 libgfortran/io/list_read.c |  26 ++++++-
 libgfortran/io/unit.c      | 150 +++++++++++++++----------------------
 libgfortran/io/unix.c      |  89 ++++------------------
 5 files changed, 109 insertions(+), 230 deletions(-)

Comments

Janne Blomqvist Jan. 31, 2020, 10:35 p.m. UTC | #1
On Fri, Jan 31, 2020 at 3:38 PM Janne Blomqvist
<blomqvist.janne@gmail.com> wrote:
>
> Simplify IO locking in libgfortran.  The new IO implementation avoids
> accessing units without locks, as seen in PR 92836.  It also avoids
> lock inversion (except for a corner case wrt namelist query when
> reading from stdin and outputting to stdout), making it easier to
> verify correctness with tools like valgrind or threadsanitizer.  It is
> also simplified as the waiting and closed variables are not needed
> anymore, making it easier to understand and analyze.
>
> Regtested on x86_64-pc-linux-gnu, Ok for master?

And, I forgot the ChangeLog entry. Here it is:

libgfortran/ChangeLog:

2020-01-31  Janne Blomqvist  <jb@gcc.gnu.org>

    PR libfortran/92836
    * io/close.c (st_close): Close unit with unit_lock held.
    * io/io.h (gfc_unit): Remove waiting and closed members.
    (find_unit_locked): New prototype.
    (inc_waiting_locked): Remove.
    (predec_waiting_locked): Remove.
    (dec_waiting_unlocked): Remove.
    * io/list_read.c (nml_query): Avoid deadlock due to lock
    inversion.
    * io/unit.c (newunit_lock): New variable.
    (get_gfc_unit): Add new argument, use it. Remove logic waiting for
    closing logic..
    (find_unit): Use new argument for get_gfc_unit.
    (find_unit_locked): New function.
    (find_or_create_unit): Use new argumet for gfc_get_unit.
    (get_unit): Likewise.
    (init_units): Initialize newunit_lock.
    (close_unit_1): Change meaning of locked argument, remove waiting
    for closing logic.
    (close_unit): Adapt to new close_unit_1 arguments.
    (close_units): Likewwise.
    (newunit_alloc): Use newunit_lock to protect.
    (newunit_free): Likewise.
    * io/unix.c (find_file0): Lock unit before accessing.
    (find_file): Remove waiting for closing logic.
    (flush_all_units_1): Likewise.
    (flush_all_units): Likewise.
Thomas Koenig Feb. 1, 2020, 4:37 p.m. UTC | #2
Hi Janne,

> Simplify IO locking in libgfortran.  The new IO implementation avoids
> accessing units without locks, as seen in PR 92836.  It also avoids
> lock inversion (except for a corner case wrt namelist query when
> reading from stdin and outputting to stdout), making it easier to
> verify correctness with tools like valgrind or threadsanitizer.  It is
> also simplified as the waiting and closed variables are not needed
> anymore, making it easier to understand and analyze.
> 
> Regtested on x86_64-pc-linux-gnu, Ok for master?

I'll look into it, this might take a bit of time.

What are you planning to use as a test case? You can put
multighreading programs into libgomp/testsuite/libgomp.fortran
where they will be executed.

Regards

	Thomas
Janne Blomqvist Feb. 1, 2020, 6:38 p.m. UTC | #3
On Sat, Feb 1, 2020 at 6:37 PM Thomas Koenig <tkoenig@netcologne.de> wrote:
>
> Hi Janne,
>
> > Simplify IO locking in libgfortran.  The new IO implementation avoids
> > accessing units without locks, as seen in PR 92836.  It also avoids
> > lock inversion (except for a corner case wrt namelist query when
> > reading from stdin and outputting to stdout), making it easier to
> > verify correctness with tools like valgrind or threadsanitizer.  It is
> > also simplified as the waiting and closed variables are not needed
> > anymore, making it easier to understand and analyze.
> >
> > Regtested on x86_64-pc-linux-gnu, Ok for master?
>
> I'll look into it, this might take a bit of time.

Thanks for looking into this.

> What are you planning to use as a test case? You can put
> multighreading programs into libgomp/testsuite/libgomp.fortran
> where they will be executed.

In this case I just ran the test program in comment #14 in PR 92836
under valgrind --tool=helgrind (there is still a remaining problem
with the locking of unit_cache, but that can be handled separately,
and I think it's benign). I'm not sure how one would run such a test
as part of the testsuite.

But if anyone has tests that reliably deadlock within a relatively
short time, then yeah, such testcases would be awesome.
Thomas Koenig Feb. 17, 2020, 10:33 p.m. UTC | #4
Hi Janne,

> Simplify IO locking in libgfortran.  The new IO implementation avoids
> accessing units without locks, as seen in PR 92836.  It also avoids
> lock inversion (except for a corner case wrt namelist query when
> reading from stdin and outputting to stdout), making it easier to
> verify correctness with tools like valgrind or threadsanitizer.  It is
> also simplified as the waiting and closed variables are not needed
> anymore, making it easier to understand and analyze.
> 
> Regtested on x86_64-pc-linux-gnu, Ok for master?

Sorry it took me so long to actually look at this. The patch for
PR 93599 took a bit longer than anticipated...

With your patch on top of the one I just submitted, I get
valgrind errors for the asynchronous I/O tests:

$ gfortran -g async_io_1.f90 -pthread
$ valgrind --tool=drd ./a.out
==22685== drd, a thread error detector
==22685== Copyright (C) 2006-2017, and GNU GPL'd, by Bart Van Assche.
==22685== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==22685== Command: ./a.out
==22685==
==22685== Destroying locked mutex: mutex 0x60708f0, recursion count 1, 
owner 1.
==22685==    at 0x4C37B65: pthread_mutex_destroy_intercept 
(drd_pthread_intercepts.c:865)
==22685==    by 0x4C37B65: pthread_mutex_destroy 
(drd_pthread_intercepts.c:875)
==22685==    by 0x50A6CAE: __gthread_mutex_destroy (gthr-default.h:740)
==22685==    by 0x50A6CAE: destroy_unit_mutex (unit.c:253)
==22685==    by 0x50A6CAE: close_unit_1 (unit.c:732)
==22685==    by 0x5091E67: _gfortran_st_close (close.c:108)
==22685==    by 0x4011D3: MAIN__ (async_io_1.f90:25)
==22685==    by 0x401AA9: main (async_io_1.f90:48)
==22685== mutex 0x60708f0 was first observed at:
==22685==    at 0x4C37FD5: pthread_mutex_lock_intercept 
(drd_pthread_intercepts.c:885)
==22685==    by 0x4C37FD5: pthread_mutex_lock (drd_pthread_intercepts.c:898)
==22685==    by 0x50A6039: __gthread_mutex_lock (gthr-default.h:749)
==22685==    by 0x50A6039: insert_unit (unit.c:241)
==22685==    by 0x50A615F: get_gfc_unit (unit.c:353)
==22685==    by 0x509DC93: _gfortran_st_open (open.c:889)
==22685==    by 0x400E4A: MAIN__ (async_io_1.f90:16)
==22685==    by 0x401AA9: main (async_io_1.f90:48)

(plus a few more).  I am currently bootstrapping without the patch
above so I can be sure that this is not an artifact.

However, it looks as if the "locked" argument to close_unit_1 gets
passed the wrong value somehow.  Could you maybe look at that?

Regards

	Thomas
diff mbox series

Patch

diff --git a/libgfortran/io/close.c b/libgfortran/io/close.c
index 8aaa00393e7..3b176285149 100644
--- a/libgfortran/io/close.c
+++ b/libgfortran/io/close.c
@@ -61,20 +61,15 @@  st_close (st_parameter_close *clp)
     find_option (&clp->common, clp->status, clp->status_len,
 		 status_opt, "Bad STATUS parameter in CLOSE statement");
 
-  u = find_unit (clp->common.unit);
+  LOCK (&unit_lock);
+  u = find_unit_locked (clp->common.unit);
 
   if (ASYNC_IO && u && u->au)
     if (async_wait (&(clp->common), u->au))
-      {
-	library_end ();
-	return;
-      }
+      goto done;
 
   if ((clp->common.flags & IOPARM_LIBRETURN_MASK) != IOPARM_LIBRETURN_OK)
-  {
-    library_end ();
-    return;
-  }
+    goto done;
 
   if (u != NULL)
     {
@@ -123,6 +118,8 @@  st_close (st_parameter_close *clp)
 #endif
     }
 
-  /* CLOSE on unconnected unit is legal and a no-op: F95 std., 9.3.5. */ 
+ done:
+  /* CLOSE on unconnected unit is legal and a no-op: F95 std., 9.3.5. */
+  UNLOCK (&unit_lock);
   library_end ();
 }
diff --git a/libgfortran/io/io.h b/libgfortran/io/io.h
index ab4a103787c..ae0ed1b10a9 100644
--- a/libgfortran/io/io.h
+++ b/libgfortran/io/io.h
@@ -681,16 +681,6 @@  typedef struct gfc_unit
   struct async_unit *au;
 
   __gthread_mutex_t lock;
-  /* Number of threads waiting to acquire this unit's lock.
-     When non-zero, close_unit doesn't only removes the unit
-     from the UNIT_ROOT tree, but doesn't free it and the
-     last of the waiting threads will do that.
-     This must be either atomically increased/decreased, or
-     always guarded by UNIT_LOCK.  */
-  int waiting;
-  /* Flag set by close_unit if the unit as been closed.
-     Must be manipulated under unit's lock.  */
-  int closed;
 
   /* For traversing arrays */
   array_loop_spec *ls;
@@ -780,6 +770,9 @@  internal_proto(stash_internal_unit);
 extern gfc_unit *find_unit (int);
 internal_proto(find_unit);
 
+extern gfc_unit *find_unit_locked (int);
+internal_proto(find_unit_locked);
+
 extern gfc_unit *find_or_create_unit (int);
 internal_proto(find_or_create_unit);
 
@@ -973,50 +966,6 @@  internal_proto(size_from_complex_kind);
 extern void free_ionml (st_parameter_dt *);
 internal_proto(free_ionml);
 
-static inline void
-inc_waiting_locked (gfc_unit *u)
-{
-#ifdef HAVE_ATOMIC_FETCH_ADD
-  (void) __atomic_fetch_add (&u->waiting, 1, __ATOMIC_RELAXED);
-#else
-  u->waiting++;
-#endif
-}
-
-static inline int
-predec_waiting_locked (gfc_unit *u)
-{
-#ifdef HAVE_ATOMIC_FETCH_ADD
-  /* Note that the pattern
-
-     if (predec_waiting_locked (u) == 0)
-         // destroy u
-	 
-     could be further optimized by making this be an __ATOMIC_RELEASE,
-     and then inserting a
-
-     __atomic_thread_fence (__ATOMIC_ACQUIRE);
-
-     inside the branch before destroying.  But for now, lets keep it
-     simple.  */
-  return __atomic_add_fetch (&u->waiting, -1, __ATOMIC_ACQ_REL);
-#else
-  return --u->waiting;
-#endif
-}
-
-static inline void
-dec_waiting_unlocked (gfc_unit *u)
-{
-#ifdef HAVE_ATOMIC_FETCH_ADD
-  (void) __atomic_fetch_add (&u->waiting, -1, __ATOMIC_RELAXED);
-#else
-  __gthread_mutex_lock (&unit_lock);
-  u->waiting--;
-  __gthread_mutex_unlock (&unit_lock);
-#endif
-}
-
 
 static inline void
 memset4 (gfc_char4_t *p, gfc_char4_t c, int k)
diff --git a/libgfortran/io/list_read.c b/libgfortran/io/list_read.c
index 77d61421a0f..c337b3c8600 100644
--- a/libgfortran/io/list_read.c
+++ b/libgfortran/io/list_read.c
@@ -2793,7 +2793,31 @@  nml_query (st_parameter_dt *dtp, char c)
   /* Store the current unit and transfer to stdout.  */
 
   temp_unit = dtp->u.p.current_unit;
-  dtp->u.p.current_unit = find_unit (options.stdout_unit);
+
+  /* Since we already hold the lock for stdin (temp_unit), trying to
+     acquire the unit_lock in order to find the stdout unit would be a
+     lock order inversion, which is not allowed as it could cause a
+     deadlock.  Hence first try to lock unit_lock without blocking.
+     If that fails, fall back to unlocking temp_unit and then block
+     waiting for the lock in order to avoid the lock order inversion.
+     This is strictly speaking not correct, as it's possible another
+     thread could lock temp_unit inbetween causing inconsistent reads
+     from it. On the other hand it's a quite esoteric corner case, and
+     trying to read from stdin with multiple threads without other
+     synchronization is in any case a bad idea.  */
+  if (!__gthread_mutex_trylock (&unit_lock))
+    {
+      dtp->u.p.current_unit = find_unit_locked (options.stdout_unit);
+      __gthread_mutex_unlock (&unit_lock);
+    }
+  else
+    {
+      __gthread_mutex_unlock (&temp_unit->lock);
+      __gthread_mutex_lock (&unit_lock);
+      __gthread_mutex_lock (&temp_unit->lock);
+      dtp->u.p.current_unit = find_unit_locked (options.stdout_unit);
+      __gthread_mutex_unlock (&unit_lock);
+    }
 
   if (dtp->u.p.current_unit)
     {
diff --git a/libgfortran/io/unit.c b/libgfortran/io/unit.c
index 0030d7e8701..aa13deeac5a 100644
--- a/libgfortran/io/unit.c
+++ b/libgfortran/io/unit.c
@@ -36,39 +36,30 @@  see the files COPYING3 and COPYING.RUNTIME respectively.  If not, see
    UNIT_LOCK is a master lock, protecting UNIT_ROOT tree and UNIT_CACHE.
    Concurrent use of different units should be supported, so
    each unit has its own lock, LOCK.
-   Open should be atomic with its reopening of units and list_read.c
-   in several places needs find_unit another unit while holding stdin
-   unit's lock, so it must be possible to acquire UNIT_LOCK while holding
-   some unit's lock.  Therefore to avoid deadlocks, it is forbidden
-   to acquire unit's private locks while holding UNIT_LOCK, except
-   for freshly created units (where no other thread can get at their
-   address yet) or when using just trylock rather than lock operation.
-   In addition to unit's private lock each unit has a WAITERS counter
-   and CLOSED flag.  WAITERS counter must be either only
-   atomically incremented/decremented in all places (if atomic builtins
-   are supported), or protected by UNIT_LOCK in all places (otherwise).
-   CLOSED flag must be always protected by unit's LOCK.
-   After finding a unit in UNIT_CACHE or UNIT_ROOT with UNIT_LOCK held,
-   WAITERS must be incremented to avoid concurrent close from freeing
-   the unit between unlocking UNIT_LOCK and acquiring unit's LOCK.
-   Unit freeing is always done under UNIT_LOCK.  If close_unit sees any
-   WAITERS, it doesn't free the unit but instead sets the CLOSED flag
-   and the thread that decrements WAITERS to zero while CLOSED flag is
-   set is responsible for freeing it (while holding UNIT_LOCK).
-   flush_all_units operation is iterating over the unit tree with
-   increasing UNIT_NUMBER while holding UNIT_LOCK and attempting to
-   flush each unit (and therefore needs the unit's LOCK held as well).
-   To avoid deadlocks, it just trylocks the LOCK and if unsuccessful,
-   remembers the current unit's UNIT_NUMBER, unlocks UNIT_LOCK, acquires
-   unit's LOCK and after flushing reacquires UNIT_LOCK and restarts with
-   the smallest UNIT_NUMBER above the last one flushed.
-
-   If find_unit/find_or_create_unit/find_file/get_unit routines return
-   non-NULL, the returned unit has its private lock locked and when the
-   caller is done with it, it must call either unlock_unit or close_unit
-   on it.  unlock_unit or close_unit must be always called only with the
-   private lock held.  */
 
+   Lock ordering: UNIT_LOCK must be acquired before LOCK. It is NOT
+   allowed to acquire UNIT_LOCK while holding LOCK, as that can lead
+   to a deadlock.  There is one exception to this rule, when reading a
+   namelist from stdin GFortran supports an extension where it can
+   print the namelist or elements thereof to stdout. To do this, both
+   stdin and stdout LOCK's must be acquired. See comment in
+   list_read.c for how this is done while avoiding deadlock.
+
+   Opening units: When creating a new unit, one can freely manipulate
+   it's LOCK as no other thread can yet access it. Only after
+   inserting the new unit into the UNIT_ROOT tree and unlocking
+   UNIT_LOCK other thread can access the new unit, and from that point
+   the same rules as for old units apply.
+
+   Opening units with NEWUNIT=: When opening a unit with the NEWUNIT=
+   specifier (and for internal units), the unit number allocator (see
+   below) is protected by newunit_lock. It is not permitted to acquire
+   any other lock with newunit_lock held.
+
+   Closing units: When closing a unit, UNIT_LOCK is first acquired,
+   then the unit's own LOCK, then the unit is closed (without
+   unlocking UNIT_LOCK!) and resources are freed.  Only after
+   everything is done is UNIT_LOCK unlocked.  */
 
 
 /* Table of allocated newunit values.  A simple solution would be to
@@ -91,6 +82,12 @@  static int newunit_size; /* Total number of elements in the newunits array.  */
    allocated and free units. */
 static int newunit_lwi;
 
+#ifdef __GTHREAD_MUTEX_INIT
+static __gthread_mutex_t newunit_lock = __GTHREAD_MUTEX_INIT;
+#else
+static __gthread_mutex_t newunit_lock;
+#endif
+
 /* Unit numbers assigned with NEWUNIT start from here.  */
 #define NEWUNIT_START -10
 
@@ -323,15 +320,15 @@  delete_unit (gfc_unit *old)
    otherwise returns a locked unit. */
 
 static gfc_unit *
-get_gfc_unit (int n, int do_create)
+get_gfc_unit (int n, int do_create, bool locked)
 {
   gfc_unit *p;
   int c, created = 0;
 
   NOTE ("Unit n=%d, do_create = %d", n, do_create);
-  LOCK (&unit_lock);
+  if (!locked)
+    LOCK (&unit_lock);
 
-retry:
   for (c = 0; c < CACHE_SIZE; c++)
     if (unit_cache[c] != NULL && unit_cache[c]->unit_number == n)
       {
@@ -369,41 +366,16 @@  retry:
     {
       /* Newly created units have their lock held already
 	 from insert_unit.  Just unlock UNIT_LOCK and return.  */
-      UNLOCK (&unit_lock);
+      if (!locked)
+	UNLOCK (&unit_lock);
       return p;
     }
 
 found:
   if (p != NULL && (p->child_dtio == 0))
-    {
-      /* Fast path.  */
-      if (! TRYLOCK (&p->lock))
-	{
-	  /* assert (p->closed == 0); */
-	  UNLOCK (&unit_lock);
-	  return p;
-	}
-
-      inc_waiting_locked (p);
-    }
-
-
-  UNLOCK (&unit_lock);
-
-  if (p != NULL && (p->child_dtio == 0))
-    {
-      LOCK (&p->lock);
-      if (p->closed)
-	{
-	  LOCK (&unit_lock);
-	  UNLOCK (&p->lock);
-	  if (predec_waiting_locked (p) == 0)
-	    destroy_unit_mutex (p);
-	  goto retry;
-	}
-
-      dec_waiting_unlocked (p);
-    }
+    LOCK (&p->lock);
+  if (!locked)
+    UNLOCK (&unit_lock);
   return p;
 }
 
@@ -411,14 +383,19 @@  found:
 gfc_unit *
 find_unit (int n)
 {
-  return get_gfc_unit (n, 0);
+  return get_gfc_unit (n, 0, false);
 }
 
+gfc_unit *
+find_unit_locked (int n)
+{
+  return get_gfc_unit (n, 0, true);
+}
 
 gfc_unit *
 find_or_create_unit (int n)
 {
-  return get_gfc_unit (n, 1);
+  return get_gfc_unit (n, 1, false);
 }
 
 
@@ -553,7 +530,7 @@  get_unit (st_parameter_dt *dtp, int do_create)
 
       dtp->u.p.unit_is_internal = 1;
       dtp->common.unit = newunit_alloc ();
-      unit = get_gfc_unit (dtp->common.unit, do_create);
+      unit = get_gfc_unit (dtp->common.unit, do_create, false);
       set_internal_unit (dtp, unit, kind);
       fbuf_init (unit, 128);
       return unit;
@@ -571,10 +548,10 @@  get_unit (st_parameter_dt *dtp, int do_create)
     {
       if (dtp->common.unit > NEWUNIT_START) /* Reserved units.  */
 	return NULL;
-      return get_gfc_unit (dtp->common.unit, 0);
+      return get_gfc_unit (dtp->common.unit, 0, false);
     }
 
-  return get_gfc_unit (dtp->common.unit, do_create);
+  return get_gfc_unit (dtp->common.unit, do_create, false);
 }
 
 
@@ -596,6 +573,7 @@  init_units (void)
 
 #ifndef __GTHREAD_MUTEX_INIT
   __GTHREAD_MUTEX_INIT_FUNCTION (&unit_lock);
+  __GTHREAD_MUTEX_INIT_FUNCTION (&newunit_lock);
 #endif
 
   if (sizeof (max_offset) == 8)
@@ -716,10 +694,13 @@  init_units (void)
 
 
 static int
-close_unit_1 (gfc_unit *u, int locked)
+close_unit_1 (gfc_unit *u, bool locked)
 {
   int i, rc;
 
+  if (!locked)
+    LOCK (&u->lock);
+
   if (ASYNC_IO && u->au)
     async_close (u->au);
 
@@ -730,10 +711,6 @@  close_unit_1 (gfc_unit *u, int locked)
 
   rc = (u->s == NULL) ? 0 : sclose (u->s) == -1;
 
-  u->closed = 1;
-  if (!locked)
-    LOCK (&unit_lock);
-
   for (i = 0; i < CACHE_SIZE; i++)
     if (unit_cache[i] == u)
       unit_cache[i] = NULL;
@@ -752,14 +729,7 @@  close_unit_1 (gfc_unit *u, int locked)
   if (!locked)
     UNLOCK (&u->lock);
 
-  /* If there are any threads waiting in find_unit for this unit,
-     avoid freeing the memory, the last such thread will free it
-     instead.  */
-  if (u->waiting == 0)
-    destroy_unit_mutex (u);
-
-  if (!locked)
-    UNLOCK (&unit_lock);
+  destroy_unit_mutex (u);
 
   return rc;
 }
@@ -774,12 +744,12 @@  unlock_unit (gfc_unit *u)
 
 /* close_unit()-- Close a unit.  The stream is closed, and any memory
    associated with the stream is freed.  Returns nonzero on I/O error.
-   Should be called with the u->lock locked. */
+   Should be called with unit_lock and u->lock locked. */
 
 int
 close_unit (gfc_unit *u)
 {
-  return close_unit_1 (u, 0);
+  return close_unit_1 (u, true);
 }
 
 
@@ -795,7 +765,7 @@  close_units (void)
 {
   LOCK (&unit_lock);
   while (unit_root != NULL)
-    close_unit_1 (unit_root, 1);
+    close_unit_1 (unit_root, false);
   UNLOCK (&unit_lock);
 
   free (newunits);
@@ -903,7 +873,7 @@  finish_last_advance_record (gfc_unit *u)
 int
 newunit_alloc (void)
 {
-  LOCK (&unit_lock);
+  LOCK (&newunit_lock);
   if (!newunits)
     {
       newunits = xcalloc (16, 1);
@@ -917,7 +887,7 @@  newunit_alloc (void)
         {
           newunits[ii] = true;
           newunit_lwi = ii + 1;
-	  UNLOCK (&unit_lock);
+	  UNLOCK (&newunit_lock);
           return -ii + NEWUNIT_START;
         }
     }
@@ -930,7 +900,7 @@  newunit_alloc (void)
   memset (newunits + old_size, 0, old_size);
   newunits[old_size] = true;
   newunit_lwi = old_size + 1;
-    UNLOCK (&unit_lock);
+  UNLOCK (&newunit_lock);
   return -old_size + NEWUNIT_START;
 }
 
@@ -943,7 +913,9 @@  newunit_free (int unit)
 {
   int ind = -unit + NEWUNIT_START;
   assert(ind >= 0 && ind < newunit_size);
+  LOCK (&newunit_lock);
   newunits[ind] = false;
   if (ind < newunit_lwi)
     newunit_lwi = ind;
+  UNLOCK (&newunit_lock);
 }
diff --git a/libgfortran/io/unix.c b/libgfortran/io/unix.c
index 563c7cb64cc..b3297abc1fd 100644
--- a/libgfortran/io/unix.c
+++ b/libgfortran/io/unix.c
@@ -1718,6 +1718,8 @@  find_file0 (gfc_unit *u, FIND_FILE0_DECL)
   if (u == NULL)
     return NULL;
 
+  LOCK (&u->lock);
+
 #ifdef HAVE_WORKING_STAT
   if (u->s != NULL)
     {
@@ -1738,6 +1740,8 @@  find_file0 (gfc_unit *u, FIND_FILE0_DECL)
       return u;
 #endif
 
+  UNLOCK (&u->lock);
+
   v = find_file0 (u->left, FIND_FILE0_ARGS);
   if (v != NULL)
     return v;
@@ -1775,100 +1779,33 @@  find_file (const char *file, gfc_charlen_type file_len)
 #endif
 
   LOCK (&unit_lock);
-retry:
   u = find_file0 (unit_root, FIND_FILE0_ARGS);
-  if (u != NULL)
-    {
-      /* Fast path.  */
-      if (! __gthread_mutex_trylock (&u->lock))
-	{
-	  /* assert (u->closed == 0); */
-	  UNLOCK (&unit_lock);
-	  goto done;
-	}
-
-      inc_waiting_locked (u);
-    }
   UNLOCK (&unit_lock);
-  if (u != NULL)
-    {
-      LOCK (&u->lock);
-      if (u->closed)
-	{
-	  LOCK (&unit_lock);
-	  UNLOCK (&u->lock);
-	  if (predec_waiting_locked (u) == 0)
-	    free (u);
-	  goto retry;
-	}
-
-      dec_waiting_unlocked (u);
-    }
  done:
   free (path);
   return u;
 }
 
-static gfc_unit *
-flush_all_units_1 (gfc_unit *u, int min_unit)
+static void
+flush_all_units_1 (gfc_unit *u)
 {
   while (u != NULL)
     {
-      if (u->unit_number > min_unit)
-	{
-	  gfc_unit *r = flush_all_units_1 (u->left, min_unit);
-	  if (r != NULL)
-	    return r;
-	}
-      if (u->unit_number >= min_unit)
-	{
-	  if (__gthread_mutex_trylock (&u->lock))
-	    return u;
-	  if (u->s)
-	    sflush (u->s);
-	  UNLOCK (&u->lock);
-	}
+      LOCK (&u->lock);
+      if (u->s)
+	sflush (u->s);
+      UNLOCK (&u->lock);
+      flush_all_units_1 (u->left);
       u = u->right;
     }
-  return NULL;
 }
 
 void
 flush_all_units (void)
 {
-  gfc_unit *u;
-  int min_unit = 0;
-
   LOCK (&unit_lock);
-  do
-    {
-      u = flush_all_units_1 (unit_root, min_unit);
-      if (u != NULL)
-	inc_waiting_locked (u);
-      UNLOCK (&unit_lock);
-      if (u == NULL)
-	return;
-
-      LOCK (&u->lock);
-
-      min_unit = u->unit_number + 1;
-
-      if (u->closed == 0)
-	{
-	  sflush (u->s);
-	  LOCK (&unit_lock);
-	  UNLOCK (&u->lock);
-	  (void) predec_waiting_locked (u);
-	}
-      else
-	{
-	  LOCK (&unit_lock);
-	  UNLOCK (&u->lock);
-	  if (predec_waiting_locked (u) == 0)
-	    free (u);
-	}
-    }
-  while (1);
+  flush_all_units_1 (unit_root);
+  UNLOCK (&unit_lock);
 }