diff mbox series

[libfortran] Adjust block size for libgfortran for unformatted reads

Message ID 4b6f06e1-e1ad-9489-5311-5be78b1ae1dd@netcologne.de
State New
Headers show
Series [libfortran] Adjust block size for libgfortran for unformatted reads | expand

Commit Message

Thomas Koenig July 7, 2019, 8:13 p.m. UTC
Hello world,

the attached patch sets the I/O block size for unformatted files to
2**17 and makes this, and the block size for formatted files,
adjustable via environment variables.

The main reason is that -fconvert=big-endian was quite slow on
some HPC systems. A bigger buffer should eliminate that.  Also,
People who use unformatted files are likely to write large amounts
of data, so this seems like a good fit.  Finally, some benchmarking
showed that 131072 seemed like a good value to use. Thanks to Jerry
for support.

I didn't change the value for formatted files because, frankly, we are
using a lot of CPU for converting numbers there, so any gain
negligible (unless somebody comes up with a benchmark which says
otherwise).

As this is a change in behavior / new feature, I don't think that
backporting is indicated, but if somebody feels otherwise, please
speak up.

Regression-tested. OK for trunk?

Regards

	Thomas

2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
	(GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.

2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* io/unix.c (BUFFER_SIZE): Delete.
	(BUFFER_SIZE_FORMATTED_DEFAULT): New variable.
	(BUFFER_SIZE_UNFORMATTED_DEFAULT): New variable.
	(unix_stream): Add buffer_size.
	(buf_read): Use s->buffer_size instead of BUFFER_SIZE.
	(buf_write): Likewise.
	(buf_init): Add argument unformatted.  Handle block sizes
	for unformatted vs. formatted, using defaults if provided.
	(fd_to_stream): Add argument unformatted in call to buf_init.
	* libgfortran.h (options_t): Add buffer_size_formatted and
	buffer_size_unformatted.
	* runtime/environ.c (variable_table): Add
	GFORTRAN_BUFFER_SIZE_UNFORMATTED and GFORTRAN_BUFFER_SIZE_FORMATTED.

Comments

Manfred Schwarb July 8, 2019, 8:18 a.m. UTC | #1
Am 07.07.19 um 22:13 schrieb Thomas Koenig:
> Hello world,
>
> the attached patch sets the I/O block size for unformatted files to
> 2**17 and makes this, and the block size for formatted files,
> adjustable via environment variables.
>
> The main reason is that -fconvert=big-endian was quite slow on
> some HPC systems. A bigger buffer should eliminate that.  Also,
> People who use unformatted files are likely to write large amounts
> of data, so this seems like a good fit.  Finally, some benchmarking
> showed that 131072 seemed like a good value to use. Thanks to Jerry
> for support.
>
> I didn't change the value for formatted files because, frankly, we are
> using a lot of CPU for converting numbers there, so any gain
> negligible (unless somebody comes up with a benchmark which says
> otherwise).

formatted write: Did you try writing to an USB stick or similar? I guess
for flash based devices anything below 64k will slow down operation.
Computers like Raspberry Pi and the like often have flash based storage,
and it is not extremely unrealistic to run fortran programs on them.


Cheers.
Manfred

>
> As this is a change in behavior / new feature, I don't think that
> backporting is indicated, but if somebody feels otherwise, please
> speak up.
>
> Regression-tested. OK for trunk?
>
> Regards
>
>     Thomas
>
> 2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>
>
>     PR libfortran/91030
>     * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
>     (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.
>
> 2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>
>
>     PR libfortran/91030
>     * io/unix.c (BUFFER_SIZE): Delete.
>     (BUFFER_SIZE_FORMATTED_DEFAULT): New variable.
>     (BUFFER_SIZE_UNFORMATTED_DEFAULT): New variable.
>     (unix_stream): Add buffer_size.
>     (buf_read): Use s->buffer_size instead of BUFFER_SIZE.
>     (buf_write): Likewise.
>     (buf_init): Add argument unformatted.  Handle block sizes
>     for unformatted vs. formatted, using defaults if provided.
>     (fd_to_stream): Add argument unformatted in call to buf_init.
>     * libgfortran.h (options_t): Add buffer_size_formatted and
>     buffer_size_unformatted.
>     * runtime/environ.c (variable_table): Add
>     GFORTRAN_BUFFER_SIZE_UNFORMATTED and GFORTRAN_BUFFER_SIZE_FORMATTED.
>
Janne Blomqvist July 8, 2019, 1:02 p.m. UTC | #2
On Mon, Jul 8, 2019 at 11:18 AM Manfred Schwarb <manfred99@gmx.ch> wrote:
>
> Am 07.07.19 um 22:13 schrieb Thomas Koenig:
> > Hello world,
> >
> > the attached patch sets the I/O block size for unformatted files to
> > 2**17 and makes this, and the block size for formatted files,
> > adjustable via environment variables.
> >
> > The main reason is that -fconvert=big-endian was quite slow on
> > some HPC systems. A bigger buffer should eliminate that.  Also,
> > People who use unformatted files are likely to write large amounts
> > of data, so this seems like a good fit.  Finally, some benchmarking
> > showed that 131072 seemed like a good value to use. Thanks to Jerry
> > for support.
> >
> > I didn't change the value for formatted files because, frankly, we are
> > using a lot of CPU for converting numbers there, so any gain
> > negligible (unless somebody comes up with a benchmark which says
> > otherwise).
>
> formatted write: Did you try writing to an USB stick or similar? I guess
> for flash based devices anything below 64k will slow down operation.
> Computers like Raspberry Pi and the like often have flash based storage,
> and it is not extremely unrealistic to run fortran programs on them.

Good point. If you happen to have a USB stick handy, can you try the
simple C benchmark program at
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?

(the kernel will coalesce IO's by itself, so the granularity of IO
syscalls is not necessarily the same as the actual IO to devices.
Network filesystems like NFS/Lustre/GPFS may have less latitude here
due to coherency requirements etc.)


--
Janne Blomqvist
Janne Blomqvist July 8, 2019, 1:51 p.m. UTC | #3
On Sun, Jul 7, 2019 at 11:13 PM Thomas Koenig <tkoenig@netcologne.de> wrote:
>
> Hello world,
>
> the attached patch sets the I/O block size for unformatted files to
> 2**17 and makes this, and the block size for formatted files,
> adjustable via environment variables.

Should the default be affected by the page size
(sysconf(_SC_PAGESIZE)) and/or the IO blocksize (we already fstat() a
file when we open it, so we could get st_blksize for free)?

Though one could of course argue those aren't particularly useful;
except for power, everything uses a 4k page size (?), and the IO
blocksize varies from too small (4k, ext4) to too large (1 MB, NFS (or
whatever rsize/wsize is set to), or 4 MB for Lustre).

> 2019-07-07  Thomas König  <tkoenig@gcc.gnu.org>
>
>         PR libfortran/91030
>         * gfortran.texi (GFORTRAN_BUFFER_SIZE_FORMATTED): Document
>         (GFORTRAN_BUFFER_SIZE_FORMATTED): Likewise.

One of these should be _UNFORMATTED?
Steve Kargl July 8, 2019, 4:05 p.m. UTC | #4
On Mon, Jul 08, 2019 at 04:02:17PM +0300, Janne Blomqvist wrote:
> 
> Good point. If you happen to have a USB stick handy, can you try the
> simple C benchmark program at
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?
> 
> (the kernel will coalesce IO's by itself, so the granularity of IO
> syscalls is not necessarily the same as the actual IO to devices.
> Network filesystems like NFS/Lustre/GPFS may have less latitude here
> due to coherency requirements etc.)
> 

Writing to USB is constrained be the speed of the bus.

kernel: ugen6.3: <Kingston DataTraveler 3.0> at usbus6
kernel: umass1 on uhub0
kernel: umass1: <Kingston DataTraveler 3.0, class 0/0, rev 2.10/0.01, addr 3> on usbus6
kernel: da1 at umass-sim1 bus 1 scbus5 target 0 lun 0
kernel: da1: <Kingston DataTraveler 3.0 > Removable Direct Access SPC-4 SCSI device
kernel: da1: Serial Number 0C9D927F0A77F330A89D1210
kernel: da1: 40.000MB/s transfers
kernel: da1: 29510MB (60437492 512 byte sectors)
kernel: da1: quirks=0x2<NO_6_BYTE>

Mounting the thumb drive as a MSDOSFS on FreeBSD.

Test using 100000000 bytes
Block size of file system: 16384
bs =       1024, 8.86 MiB/s
bs =       2048, 7.52 MiB/s
bs =       4096, 7.32 MiB/s
bs =       8192, 7.32 MiB/s
bs =      16384, 7.40 MiB/s
bs =      32768, 7.40 MiB/s
bs =      65536, 5.57 MiB/s
bs =     131072, 7.43 MiB/s
bs =     262144, 5.55 MiB/s
bs =     524288, 7.43 MiB/s
bs =    1048576, 5.56 MiB/s
bs =    2097152, 7.39 MiB/s
bs =    4194304, 7.62 MiB/s
bs =    8388608, 5.58 MiB/s
bs =   16777216, 7.42 MiB/s
bs =   33554432, 5.59 MiB/s
bs =   67108864, 4.77 MiB/s

For the same binary, writing to a UFS2 on a SATAII SSD

Test using 100000000 bytes
Block size of file system: 32768
bs =       1024, 123.09 MiB/s
bs =       2048, 210.30 MiB/s
bs =       4096, 184.75 MiB/s
bs =       8192, 237.28 MiB/s
bs =      16384, 244.42 MiB/s
bs =      32768, 243.20 MiB/s
bs =      65536, 253.77 MiB/s
bs =     131072, 243.32 MiB/s
bs =     262144, 238.81 MiB/s
bs =     524288, 243.87 MiB/s
bs =    1048576, 246.55 MiB/s
bs =    2097152, 242.39 MiB/s
bs =    4194304, 243.68 MiB/s
bs =    8388608, 243.88 MiB/s
bs =   16777216, 242.21 MiB/s
bs =   33554432, 253.19 MiB/s
bs =   67108864, 178.29 MiB/s
Bernhard Reutner-Fischer July 9, 2019, 7:11 p.m. UTC | #5
On Mon, 8 Jul 2019 09:05:04 -0700
Steve Kargl <sgk@troutmask.apl.washington.edu> wrote:

> On Mon, Jul 08, 2019 at 04:02:17PM +0300, Janne Blomqvist wrote:
> > 
> > Good point. If you happen to have a USB stick handy, can you try the
> > simple C benchmark program at
> > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91030#c38 ?
> > 
> > (the kernel will coalesce IO's by itself, so the granularity of IO
> > syscalls is not necessarily the same as the actual IO to devices.
> > Network filesystems like NFS/Lustre/GPFS may have less latitude here
> > due to coherency requirements etc.)

GFORTRAN_BUFFER_SIZE_FORMATTED sounds a bit odd, maybe
GFORTRAN_(UN)FORMATTED_BUFFER_SIZE would read more natural.

And let me note 2 flaws in the benchmark:

>       left = N;
>       w = p;
>       t1 = walltime();
>       while (left > 0)
>         {
>           if (left >= blocksize)
>             to_write = blocksize;
>           else
>             to_write = left;
> 
>           write (fd, w, blocksize);

1) this should write to_write, not blocksize i assume.
2) you don't catch short writes

>           w += to_write;
>           left -= to_write;
>         }

So, short of using iozone, it should probably be more like (modulo
typos):
      left = N;
      w = p;
      t1 = walltime();
      while (left > 0)
        {
          if (left >= blocksize)
            to_write = blocksize;
          else
            to_write = left;
          while (to_write > 0) {
            errno = 0;
            ssize_t wrote = write (fd, w, to_write);
            if (wrote < 0 && errno != EINTR) /* retry EINTR or bail */
              break;
            w += wrote;
            left -= wrote;
            to_write -= wrote;
          }
        }
thanks,
Thomas Koenig July 14, 2019, 10:07 a.m. UTC | #6
OK, so here is a new version.

I think the discussion has shown that enlaring the buffer makes sense,
and that the buffer size for unformatted seems to be too bad.

I've reversed the names of the environment variables according to
Behnard's suggestion.

So, OK for trunk?

Also, what should we do about gcc-9?  I have now come to think
that we should add the environment variables to set the buffer lengths,
but leave the old default (8192).

What do you think?

Regards

	Thomas

2019-07-14  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* gfortran.texi (GFORTRAN_FORMATTED_BUFFER_SIZE): Document
	(GFORTRAN_UNFORMATTED_BUFFER_SIZE): Likewise.

2019-07-14  Thomas König  <tkoenig@gcc.gnu.org>

	PR libfortran/91030
	* io/unix.c (BUFFER_SIZE): Delete.
	(BUFFER_FORMATTED_SIZE_DEFAULT): New variable.
	(BUFFER_UNFORMATTED_SIZE_DEFAULT): New variable.
	(unix_stream): Add buffer_size.
	(buf_read): Use s->buffer_size instead of BUFFER_SIZE.
	(buf_write): Likewise.
	(buf_init): Add argument unformatted.  Handle block sizes
	for unformatted vs. formatted, using defaults if provided.
	(fd_to_stream): Add argument unformatted in call to buf_init.
	* libgfortran.h (options_t): Add buffer_size_formatted and
	buffer_size_unformatted.
	* runtime/environ.c (variable_table): Add
	GFORTRAN_UNFORMATTED_BUFFER_SIZE and
	GFORTRAN_FORMATTED_BUFFER_SIZE.
Thomas Koenig July 14, 2019, 10:08 a.m. UTC | #7
... of course, better with the actual patch.
Steve Kargl July 14, 2019, 5:37 p.m. UTC | #8
On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote:
> OK, so here is a new version.
> 
> I think the discussion has shown that enlaring the buffer makes sense,
> and that the buffer size for unformatted seems to be too bad.
> 
> I've reversed the names of the environment variables according to
> Behnard's suggestion.
> 
> So, OK for trunk?
> 
> Also, what should we do about gcc-9?  I have now come to think
> that we should add the environment variables to set the buffer lengths,
> but leave the old default (8192).
> 
> What do you think?
> 

If you are inclined to back port a portion of the patch to 9-branch,
then bumping up the old default would seem to be the most important
part.  As dje noted, users seem to have an aversion to reading the
documentation, so finding the environment variables may not happen.

Isn't 8192 an internal implementation detail for libgfortran?  Can
bumping it to larger value in 9-branch cause an issue for a normal
user?
Thomas Koenig July 19, 2019, 8:42 p.m. UTC | #9
Hi Steve,

> On Sun, Jul 14, 2019 at 12:07:58PM +0200, Thomas Koenig wrote:
>> OK, so here is a new version.
>>
>> I think the discussion has shown that enlaring the buffer makes sense,
>> and that the buffer size for unformatted seems to be too bad.
>>
>> I've reversed the names of the environment variables according to
>> Behnard's suggestion.
>>
>> So, OK for trunk?
>>
>> Also, what should we do about gcc-9?  I have now come to think
>> that we should add the environment variables to set the buffer lengths,
>> but leave the old default (8192).
>>
>> What do you think?
>>
> 
> If you are inclined to back port a portion of the patch to 9-branch,
> then bumping up the old default would seem to be the most important
> part.  As dje noted, users seem to have an aversion to reading the
> documentation, so finding the environment variables may not happen.
> 
> Isn't 8192 an internal implementation detail for libgfortran?  Can
> bumping it to larger value in 9-branch cause an issue for a normal
> user?

Well, it allocates a bigger memory block, that's all.

Upon reconsideration, I think your point about people not reading
the docs is valid :-|

So, I will commit the patch to trunk over the weekend and to 9.2
a few days afterwards, unless somebody objects.

Regards

	Thomas
diff mbox series

Patch

Index: gcc/fortran/gfortran.texi
===================================================================
--- gcc/fortran/gfortran.texi	(Revision 273183)
+++ gcc/fortran/gfortran.texi	(Arbeitskopie)
@@ -611,6 +611,8 @@  Malformed environment variables are silently ignor
 * GFORTRAN_LIST_SEPARATOR::  Separator for list output
 * GFORTRAN_CONVERT_UNIT::  Set endianness for unformatted I/O
 * GFORTRAN_ERROR_BACKTRACE:: Show backtrace on run-time errors
+* GFORTRAN_BUFFER_SIZE_FORMATTED:: Buffer size for formatted files.
+* GFORTRAN_BUFFER_SIZE_UNFORMATTED:: Buffer size for unformatted files.
 @end menu
 
 @node TMPDIR
@@ -782,6 +784,20 @@  the backtracing, set the variable to @samp{n}, @sa
 Default is to print a backtrace unless the @option{-fno-backtrace}
 compile option was used.
 
+@node GFORTRAN_BUFFER_SIZE_FORMATTED
+@section @env{GFORTRAN_BUFFER_SIZE_FORMATTED}---Set buffer size for formatted I/O
+
+The @env{GFORTRAN_BUFFER_SIZE_FORMATTED} environment variable
+specifies buffer size in bytes to be used for formatted output.
+The default value is 8192.
+
+@node GFORTRAN_BUFFER_SIZE_UNFORMATTED
+@section @env{GFORTRAN_BUFFER_SIZE_UNFORMATTED}---Set buffer size for unformatted I/O
+
+The @env{GFORTRAN_BUFFER_SIZE_UNFORMATTED} environment variable
+specifies buffer size in bytes to be used for unformatted output.
+The default value is 131072.
+
 @c =====================================================================
 @c PART II: LANGUAGE REFERENCE
 @c =====================================================================
Index: libgfortran/io/unix.c
===================================================================
--- libgfortran/io/unix.c	(Revision 273183)
+++ libgfortran/io/unix.c	(Arbeitskopie)
@@ -193,7 +193,8 @@  fallback_access (const char *path, int mode)
 
 /* Unix and internal stream I/O module */
 
-static const int BUFFER_SIZE = 8192;
+static const int BUFFER_SIZE_FORMATTED_DEFAULT = 8192;
+static const int BUFFER_SIZE_UNFORMATTED_DEFAULT = 128*1024;
 
 typedef struct
 {
@@ -205,6 +206,7 @@  typedef struct
   gfc_offset file_length;	/* Length of the file. */
 
   char *buffer;                 /* Pointer to the buffer.  */
+  ssize_t buffer_size;           /* Length of the buffer.  */
   int fd;                       /* The POSIX file descriptor.  */
 
   int active;			/* Length of valid bytes in the buffer */
@@ -592,9 +594,9 @@  buf_read (unix_stream *s, void *buf, ssize_t nbyte
           && raw_seek (s, new_logical, SEEK_SET) < 0)
         return -1;
       s->buffer_offset = s->physical_offset = new_logical;
-      if (to_read <= BUFFER_SIZE/2)
+      if (to_read <= s->buffer_size/2)
         {
-          did_read = raw_read (s, s->buffer, BUFFER_SIZE);
+          did_read = raw_read (s, s->buffer, s->buffer_size);
 	  if (likely (did_read >= 0))
 	    {
 	      s->physical_offset += did_read;
@@ -632,11 +634,11 @@  buf_write (unix_stream *s, const void *buf, ssize_
     s->buffer_offset = s->logical_offset;
 
   /* Does the data fit into the buffer?  As a special case, if the
-     buffer is empty and the request is bigger than BUFFER_SIZE/2,
+     buffer is empty and the request is bigger than s->buffer_size/2,
      write directly. This avoids the case where the buffer would have
      to be flushed at every write.  */
-  if (!(s->ndirty == 0 && nbyte > BUFFER_SIZE/2)
-      && s->logical_offset + nbyte <= s->buffer_offset + BUFFER_SIZE
+  if (!(s->ndirty == 0 && nbyte > s->buffer_size/2)
+      && s->logical_offset + nbyte <= s->buffer_offset + s->buffer_size
       && s->buffer_offset <= s->logical_offset
       && s->buffer_offset + s->ndirty >= s->logical_offset)
     {
@@ -651,7 +653,7 @@  buf_write (unix_stream *s, const void *buf, ssize_
          the request is bigger than the buffer size, write directly
          bypassing the buffer.  */
       buf_flush (s);
-      if (nbyte <= BUFFER_SIZE/2)
+      if (nbyte <= s->buffer_size/2)
         {
           memcpy (s->buffer, buf, nbyte);
           s->buffer_offset = s->logical_offset;
@@ -688,7 +690,7 @@  buf_write (unix_stream *s, const void *buf, ssize_
 static int
 buf_markeor (unix_stream *s)
 {
-  if (s->unbuffered || s->ndirty >= BUFFER_SIZE / 2)
+  if (s->unbuffered || s->ndirty >= s->buffer_size / 2)
     return buf_flush (s);
   return 0;
 }
@@ -765,11 +767,32 @@  static const struct stream_vtable buf_vtable = {
 };
 
 static int
-buf_init (unix_stream *s)
+buf_init (unix_stream *s, bool unformatted)
 {
   s->st.vptr = &buf_vtable;
 
-  s->buffer = xmalloc (BUFFER_SIZE);
+  /* Try to guess a good value for the buffer size.  For formatted
+     I/O, we use so many CPU cycles converting the data that there is
+     more sense in converving memory and especially cache.  For
+     unformatted, a bigger block can have a large impact in some
+     environments.  */
+
+  if (unformatted)
+    {
+      if (options.buffer_size_unformatted > 0)
+	s->buffer_size = options.buffer_size_unformatted;
+      else
+	s->buffer_size = BUFFER_SIZE_UNFORMATTED_DEFAULT;
+    }
+  else
+    {
+      if (options.buffer_size_formatted > 0)
+	s->buffer_size = options.buffer_size_formatted;
+      else
+	s->buffer_size = BUFFER_SIZE_FORMATTED_DEFAULT;
+    }
+
+  s->buffer = xmalloc (s->buffer_size);
   return 0;
 }
 
@@ -1120,13 +1143,13 @@  fd_to_stream (int fd, bool unformatted)
 	   (s->fd == STDIN_FILENO 
 	    || s->fd == STDOUT_FILENO 
 	    || s->fd == STDERR_FILENO)))
-    buf_init (s);
+    buf_init (s, unformatted);
   else
     {
       if (unformatted)
 	{
 	  s->unbuffered = true;
-	  buf_init (s);
+	  buf_init (s, unformatted);
 	}
       else
 	raw_init (s);
Index: libgfortran/libgfortran.h
===================================================================
--- libgfortran/libgfortran.h	(Revision 273183)
+++ libgfortran/libgfortran.h	(Arbeitskopie)
@@ -540,6 +540,7 @@  typedef struct
 
   int all_unbuffered, unbuffered_preconnected;
   int fpe, backtrace;
+  int buffer_size_unformatted, buffer_size_formatted;
 }
 options_t;
 
Index: libgfortran/runtime/environ.c
===================================================================
--- libgfortran/runtime/environ.c	(Revision 273183)
+++ libgfortran/runtime/environ.c	(Arbeitskopie)
@@ -198,6 +198,14 @@  static variable variable_table[] = {
   /* Print out a backtrace if possible on runtime error */
   { "GFORTRAN_ERROR_BACKTRACE", -1, &options.backtrace, init_boolean },
 
+  /* Buffer size for unformatted files.  */
+  { "GFORTRAN_BUFFER_SIZE_UNFORMATTED", 0, &options.buffer_size_unformatted,
+    init_integer },
+
+  /* Buffer size for formatted files.  */
+  { "GFORTRAN_BUFFER_SIZE_FORMATTED", 0, &options.buffer_size_formatted,
+    init_integer },
+
   { NULL, 0, NULL, NULL }
 };