diff mbox

[libfortran] Thread safety and simplification of error printing

Message ID BANLkTim5TtOYWtxnzhHLNwKqE-mpwG9pJg@mail.gmail.com
State New
Headers show

Commit Message

Janne Blomqvist May 7, 2011, 4:35 p.m. UTC
Hi,

the error printing functionality (in io/unix.c) st_printf and
st_vprintf are not thread-safe as they use a static buffer. However,
since these routines are used when something has gone wrong, we
shouldn't use malloc() to allocate thread-specific buffers or anything
fancy like that. The way the buffer is used is that it is filled with
vs(n)printf() and then the buffer is written using write(); the
simplest way to fix this is to just print directly using vfprintf(),
which is what this patch does. Also, the patch uses
flockfile/funlockfile in a few places to increase the likelihood of
multiple related messages to end up together on the output.

As stderr is unbuffered and stdout is buffered, this patch also gets
rid of the functionality to choose where to send the error output, as
we don't want to deal with buffering in the error handling path. IMHO
this is no loss, as all targets/shells/batch schedulers/ etc. where
this might be relevant, offer functionality to redirect stderr to
wherever stdout goes. So there is no need for a gfortran-specific
mechanism for this.

Regtested on x86_64-unknown-linux-gnu, Ok for trunk?

2011-05-07  Janne Blomqvist  <jb@gcc.gnu.org>

        * gfortran.texi: Remove GFORTRAN_USE_STDERR documentation.

2011-05-07  Janne Blomqvist  <jb@gcc.gnu.org>

        * config.h.in: Regenerated.
        * configure: Likewise.
        * configure.ac: Check for flockfile, funlockfile.
        * io/unix.c (st_vprintf): Remove.
        (st_printf): Use vfprintf() instead of st_vprintf.
        * libgfortran.h (flockfile,funlockfile): Define to empty if
        functions not available.
        (struct options_t): Remove use_stderr field.
        (st_vprintf): Remove prototype.
        * runtime/environ.c (variable_table[]): Remove GFORTRAN_USE_STDERR
        entry.
        * runtime/error.c (runtime_error): Use vfprintf instead of
        st_vprintf.
        (runtime_error_at): Likewise.
        (runtime_warning_at): Likewise.

Comments

Janne Blomqvist May 8, 2011, 11:40 a.m. UTC | #1
On Sat, May 7, 2011 at 19:35, Janne Blomqvist <blomqvist.janne@gmail.com> wrote:
> Hi,
>
> the error printing functionality (in io/unix.c) st_printf and
> st_vprintf are not thread-safe as they use a static buffer. However,
> since these routines are used when something has gone wrong, we
> shouldn't use malloc() to allocate thread-specific buffers or anything
> fancy like that. The way the buffer is used is that it is filled with
> vs(n)printf() and then the buffer is written using write(); the
> simplest way to fix this is to just print directly using vfprintf(),
> which is what this patch does. Also, the patch uses
> flockfile/funlockfile in a few places to increase the likelihood of
> multiple related messages to end up together on the output.
>
> As stderr is unbuffered and stdout is buffered, this patch also gets
> rid of the functionality to choose where to send the error output, as
> we don't want to deal with buffering in the error handling path. IMHO
> this is no loss, as all targets/shells/batch schedulers/ etc. where
> this might be relevant, offer functionality to redirect stderr to
> wherever stdout goes. So there is no need for a gfortran-specific
> mechanism for this.

While this patch makes error printing thread-safe, it's no longer
async-signal-safe as the stderr lock might lead to a deadlock. So I'm
retracting this patch and thinking some more about this problem.
N.M. Maclaren May 8, 2011, 1:42 p.m. UTC | #2
On May 8 2011, Janne Blomqvist wrote:
>>
>> the error printing functionality (in io/unix.c) st_printf and
>> st_vprintf are not thread-safe as they use a static buffer. ...
>
>While this patch makes error printing thread-safe, it's no longer
>async-signal-safe as the stderr lock might lead to a deadlock. So I'm
>retracting this patch and thinking some more about this problem.

It's theoretically insoluble, given the constraints you are working
under.  Sorry.  It is possible to do reasonably well, but there will
always be likely scenarios where all you can do is to say "Aargh!
I give up."

Both I and the VMS people adopted the ratchet design.  You have N
levels of error recovery, each level allocates all of the resources
it needs before startup, and any exception during level K increases
the level to K+1 and calls the level K+1 error handler.  When you
have an exception at level N, you just die.

That imposes the constraint that all diagnostics have a fixed upper
bound on the resources they need (not just buffer space, but that's
the main one).  It's a real bummer when the system has some critical
resources that you can't reserve, so you have to treat an allocation
failure as an exception, but buffer space is not one such.

That also tackles the thread problem, not very satisfactorily.  If a
resource needs to be locked, you can try to get it for a bit, and
then raise a higher exception if you can't.  And, typically, one or
more of the highest levels are for closing down the process, and
simply suspend any subsequent threads that call them (i.e. just leave
them waiting for a lock they won't get).

Yes, it's not pretty.  But I don't know how to do better.

Regards,
Nick Maclaren.
N.M. Maclaren May 8, 2011, 2:17 p.m. UTC | #3
Sorry - I should have clarified that ANYTHING that can't be used 
independently in multiple threads and at multiple levels in the same thread 
counts as a resource, and that includes stderr.

Regards,
Nick Maclaren.
Janne Blomqvist May 8, 2011, 6:56 p.m. UTC | #4
On Sun, May 8, 2011 at 16:42, N.M. Maclaren <nmm1@cam.ac.uk> wrote:
> On May 8 2011, Janne Blomqvist wrote:
>>>
>>> the error printing functionality (in io/unix.c) st_printf and
>>> st_vprintf are not thread-safe as they use a static buffer. ...
>>
>> While this patch makes error printing thread-safe, it's no longer
>> async-signal-safe as the stderr lock might lead to a deadlock. So I'm
>> retracting this patch and thinking some more about this problem.
>
> It's theoretically insoluble, given the constraints you are working
> under.  Sorry.  It is possible to do reasonably well, but there will
> always be likely scenarios where all you can do is to say "Aargh!
> I give up."

Well, I realize perfection is impossible, so I'm settling for merely
improving the status quo!

> Both I and the VMS people adopted the ratchet design.  You have N
> levels of error recovery, each level allocates all of the resources
> it needs before startup, and any exception during level K increases
> the level to K+1 and calls the level K+1 error handler.  When you
> have an exception at level N, you just die.

To some extent we have a crude version of this, in that when we're
entering many of the fatal error handling functions we do a recursion
check and if that fails, die. Also, in a recent patch of mine
(http://gcc.gnu.org/ml/gcc-patches/2011-05/msg00584.html ) the fatal
signal handler function has been reworked to hopefully deal better
with other signal(s) being delivered before it's done; that code is
modeled after an example in the glibc manual, and I'm a bit unsure if
the recursion check thingy really works or we just end up in an
infinite recursion (that is, do we need to re-set to the default
handler before re-raising? I have a vague memory that the signal
handler for SIGXXX must finish before starting the handler for another
SIGXXX pending signal, which would make the current version safe).

> That imposes the constraint that all diagnostics have a fixed upper
> bound on the resources they need (not just buffer space, but that's
> the main one).  It's a real bummer when the system has some critical
> resources that you can't reserve, so you have to treat an allocation
> failure as an exception, but buffer space is not one such.
>
> That also tackles the thread problem, not very satisfactorily.  If a
> resource needs to be locked, you can try to get it for a bit, and
> then raise a higher exception if you can't.  And, typically, one or
> more of the highest levels are for closing down the process, and
> simply suspend any subsequent threads that call them (i.e. just leave
> them waiting for a lock they won't get).

I think in our case the situation is a bit easier in that we're not
trying to recover from a serious failure, merely print some diagnostic
information without getting stuck in a deadlock.
N.M. Maclaren May 8, 2011, 10:49 p.m. UTC | #5
On May 8 2011, Janne Blomqvist wrote:
>>
>> It's theoretically insoluble, given the constraints you are working
>> under.  Sorry.  It is possible to do reasonably well, but there will
>> always be likely scenarios where all you can do is to say "Aargh!
>> I give up."
>
>Well, I realize perfection is impossible, so I'm settling for merely
>improving the status quo!

Very reasonable, given the problem.  Where the effort increases rapidly
with the reliability, not being too fancy is a sound strategy.

>I think in our case the situation is a bit easier in that we're not
>trying to recover from a serious failure, merely print some diagnostic
>information without getting stuck in a deadlock.

I wasn't trying to do much more!  I was mainly trying to maximise the
chances of files being closed cleanly and appropriate diagnostics
being produced.  Once there was a serious error, I didn't allow the
program to continue.  But I did allow user-written exception handlers
(which could only return - NOT jump out).  Even that was hard enough :-(

Regards,
Nick Maclaren.
diff mbox

Patch

diff --git a/gcc/fortran/gfortran.texi b/gcc/fortran/gfortran.texi
index 1284c3d..c810fe2 100644
--- a/gcc/fortran/gfortran.texi
+++ b/gcc/fortran/gfortran.texi
@@ -579,7 +579,6 @@  Malformed environment variables are silently ignored.
 * GFORTRAN_STDIN_UNIT:: Unit number for standard input
 * GFORTRAN_STDOUT_UNIT:: Unit number for standard output
 * GFORTRAN_STDERR_UNIT:: Unit number for standard error
-* GFORTRAN_USE_STDERR:: Send library output to standard error
 * GFORTRAN_TMPDIR:: Directory for scratch files
 * GFORTRAN_UNBUFFERED_ALL:: Don't buffer I/O for all units.
 * GFORTRAN_UNBUFFERED_PRECONNECTED:: Don't buffer I/O for preconnected units.
@@ -613,14 +612,6 @@  This environment variable can be used to select the unit number
 preconnected to standard error.  This must be a positive integer.
 The default value is 0.
 
-@node GFORTRAN_USE_STDERR
-@section @env{GFORTRAN_USE_STDERR}---Send library output to standard error
-
-This environment variable controls where library output is sent.
-If the first letter is @samp{y}, @samp{Y} or @samp{1}, standard
-error is used.  If the first letter is @samp{n}, @samp{N} or
-@samp{0}, standard output is used.
-
 @node GFORTRAN_TMPDIR
 @section @env{GFORTRAN_TMPDIR}---Directory for scratch files
 
diff --git a/libgfortran/config.h.in b/libgfortran/config.h.in
index 30da5fa..a48381d 100644
--- a/libgfortran/config.h.in
+++ b/libgfortran/config.h.in
@@ -393,6 +393,9 @@ 
 /* Define to 1 if you have the <float.h> header file. */
 #undef HAVE_FLOAT_H
 
+/* Define to 1 if you have the `flockfile' function. */
+#undef HAVE_FLOCKFILE
+
 /* libm includes floor */
 #undef HAVE_FLOOR
 
@@ -441,6 +444,9 @@ 
 /* Define to 1 if you have the `ftruncate' function. */
 #undef HAVE_FTRUNCATE
 
+/* Define to 1 if you have the `funlockfile' function. */
+#undef HAVE_FUNLOCKFILE
+
 /* Define to 1 if you have the `getcwd' function. */
 #undef HAVE_GETCWD
 
diff --git a/libgfortran/configure b/libgfortran/configure
index 833a0e1..9fa2b7f 100755
--- a/libgfortran/configure
+++ b/libgfortran/configure
@@ -16458,7 +16458,7 @@  _ACEOF
 fi
 done
 
-for ac_func in clock_gettime strftime
+for ac_func in clock_gettime strftime flockfile funlockfile
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/libgfortran/configure.ac b/libgfortran/configure.ac
index cf38fb0..e39ae17 100644
--- a/libgfortran/configure.ac
+++ b/libgfortran/configure.ac
@@ -264,7 +264,7 @@  AC_CHECK_FUNCS(sleep time ttyname signal alarm clock access fork execl)
 AC_CHECK_FUNCS(wait setmode execvp pipe dup2 close fdopen strcasestr getrlimit)
 AC_CHECK_FUNCS(gettimeofday stat fstat lstat getpwuid vsnprintf dup getcwd)
 AC_CHECK_FUNCS(localtime_r gmtime_r strerror_r getpwuid_r ttyname_r)
-AC_CHECK_FUNCS(clock_gettime strftime)
+AC_CHECK_FUNCS(clock_gettime strftime flockfile funlockfile)
 
 # Check for glibc backtrace functions
 AC_CHECK_FUNCS(backtrace backtrace_symbols)
diff --git a/libgfortran/io/unix.c b/libgfortran/io/unix.c
index 4e4bc3b..3b80e36 100644
--- a/libgfortran/io/unix.c
+++ b/libgfortran/io/unix.c
@@ -1353,48 +1353,7 @@  error_stream (void)
 }
 
 
-/* st_vprintf()-- vprintf function for error output.  To avoid buffer
-   overruns, we limit the length of the buffer to ST_VPRINTF_SIZE.  2k
-   is big enough to completely fill a 80x25 terminal, so it shuld be
-   OK.  We use a direct write() because it is simpler and least likely
-   to be clobbered by memory corruption.  Writing an error message
-   longer than that is an error.  */
-
-#define ST_VPRINTF_SIZE 2048
-
-int
-st_vprintf (const char *format, va_list ap)
-{
-  static char buffer[ST_VPRINTF_SIZE];
-  int written;
-  int fd;
-
-  fd = options.use_stderr ? STDERR_FILENO : STDOUT_FILENO;
-#ifdef HAVE_VSNPRINTF
-  written = vsnprintf(buffer, ST_VPRINTF_SIZE, format, ap);
-#else
-  written = vsprintf(buffer, format, ap);
-
-  if (written >= ST_VPRINTF_SIZE-1)
-    {
-      /* The error message was longer than our buffer.  Ouch.  Because
-	 we may have messed up things badly, report the error and
-	 quit.  */
-#define ERROR_MESSAGE "Internal error: buffer overrun in st_vprintf()\n"
-      write (fd, buffer, ST_VPRINTF_SIZE-1);
-      write (fd, ERROR_MESSAGE, strlen(ERROR_MESSAGE));
-      sys_exit(2);
-#undef ERROR_MESSAGE
-
-    }
-#endif
-
-  written = write (fd, buffer, written);
-  return written;
-}
-
-/* st_printf()-- printf() function for error output.  This just calls
-   st_vprintf() to do the actual work.  */
+/* st_printf()-- printf() function for error output.  */
 
 int
 st_printf (const char *format, ...)
@@ -1402,7 +1361,7 @@  st_printf (const char *format, ...)
   int written;
   va_list ap;
   va_start (ap, format);
-  written = st_vprintf(format, ap);
+  written = vfprintf (stderr, format, ap);
   va_end (ap);
   return written;
 }
diff --git a/libgfortran/libgfortran.h b/libgfortran/libgfortran.h
index 6cccaca..ca1f462 100644
--- a/libgfortran/libgfortran.h
+++ b/libgfortran/libgfortran.h
@@ -125,6 +125,13 @@  extern int __mingw_snprintf (char *, size_t, const char *, ...)
 #define snprintf(str, size, ...) sprintf (str, __VA_ARGS__)
 #endif
 
+#ifndef HAVE_FLOCKFILE
+#define flockfile(fhandle)
+#endif
+
+#ifndef HAVE_FUNLOCKFILE
+#define funlockfile(fhandle)
+#endif
 
 /* For a library, a standard prefix is a requirement in order to partition
    the namespace.  IPREFIX is for symbols intended to be internal to the
@@ -508,7 +515,7 @@  typedef struct
   int separator_len;
   const char *separator;
 
-  int use_stderr, all_unbuffered, unbuffered_preconnected, default_recl;
+  int all_unbuffered, unbuffered_preconnected, default_recl;
   int fpe, dump_core, backtrace;
 }
 options_t;
@@ -796,9 +803,6 @@  extern int st_printf (const char *, ...)
   __attribute__ ((format (gfc_printf, 1, 2)));
 internal_proto(st_printf);
 
-extern int st_vprintf (const char *, va_list);
-internal_proto(st_vprintf);
-
 extern char * filename_from_unit (int);
 internal_proto(filename_from_unit);
 
diff --git a/libgfortran/runtime/environ.c b/libgfortran/runtime/environ.c
index a6ce645..d70103c 100644
--- a/libgfortran/runtime/environ.c
+++ b/libgfortran/runtime/environ.c
@@ -281,10 +281,6 @@  static variable variable_table[] = {
    "Unit number that will be preconnected to standard error\n"
    "(No preconnection if negative)", 0},
 
-  {"GFORTRAN_USE_STDERR", 1, &options.use_stderr, init_boolean,
-   show_boolean,
-   "Sends library output to standard error instead of standard output.", 0},
-
   {"GFORTRAN_TMPDIR", 0, NULL, init_string, show_string,
    "Directory for scratch files.  Overrides the TMP environment variable\n"
    "If TMP is not set " DEFAULT_TEMPDIR " is used.", 0},
diff --git a/libgfortran/runtime/error.c b/libgfortran/runtime/error.c
index 06c144a..694d668 100644
--- a/libgfortran/runtime/error.c
+++ b/libgfortran/runtime/error.c
@@ -249,11 +249,13 @@  runtime_error (const char *message, ...)
   va_list ap;
 
   recursion_check ();
+  flockfile (stderr);
   st_printf ("Fortran runtime error: ");
   va_start (ap, message);
-  st_vprintf (message, ap);
+  vfprintf (stderr, message, ap);
   va_end (ap);
   st_printf ("\n");
+  funlockfile (stderr);
   sys_exit (2);
 }
 iexport(runtime_error);
@@ -267,12 +269,14 @@  runtime_error_at (const char *where, const char *message, ...)
   va_list ap;
 
   recursion_check ();
+  flockfile (stderr);
   st_printf ("%s\n", where);
   st_printf ("Fortran runtime error: ");
   va_start (ap, message);
-  st_vprintf (message, ap);
+  vfprintf (stderr, message, ap);
   va_end (ap);
   st_printf ("\n");
+  funlockfile (stderr);
   sys_exit (2);
 }
 iexport(runtime_error_at);
@@ -283,12 +287,14 @@  runtime_warning_at (const char *where, const char *message, ...)
 {
   va_list ap;
 
+  flockfile (stderr);
   st_printf ("%s\n", where);
   st_printf ("Fortran runtime warning: ");
   va_start (ap, message);
-  st_vprintf (message, ap);
+  vfprintf (stderr, message, ap);
   va_end (ap);
   st_printf ("\n");
+  funlockfile (stderr);
 }
 iexport(runtime_warning_at);