Message ID | CAO9iq9G+fu-ZaCSb-JGLOdxmrXbqSGAqQEcqD6cKx6s4m4o_2Q@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Mon, Feb 13, 2012 at 7:20 PM, Janne Blomqvist <blomqvist.janne@gmail.com> wrote: > Hi, > > the attached patch changes the low-level libgfortran IO dispatching > mechanism to use shared vtables for each stream type, instead of all > the function pointers being replicated for each unit. This is similar > to e.g. how the C++ frontend implements vtables. The benefits are: > > - Slightly smaller heap memory overhead for each unit as only the > vtable pointer needs to be stored, and slightly faster unit > initialization as only the vtable pointer needs to be setup instead of > all the function pointers in the stream struct. > > - Looking at unix.o with readelf, one sees > > Relocation section '.rela.data.rel.ro.local.mem_vtable' at offset > 0x15550 contains 8 entries: > > and similarly for the other vtables; according to > http://www.airs.com/blog/archives/189 this means that after relocation > the page where this data resides may be marked read-only. > > The downside is that the sizes of the .text and .data sections are > increased. Before: > > text data bss dec hex filename > 1116991 6664 592 1124247 112797 > ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so > > After: > > text data bss dec hex filename > 1117487 6936 592 1125015 112a97 > ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so > > > The data section increase is due to the vtables, the text increase is, > I guess, due to the extra pointer dereference when calling the IO > functions. > > Regtested on x86_64-unknown-linux-gnu, Ok for trunk, or 4.8? Certainly not for trunk at this stage. For 4.8: So the trade-off is between faster initialization and smaller heap vs. fewer pointer dereferences? Does this patch fix an actual problem? Does it bring a killer feature? Otherwise, I'd say "if it ain't broke, don't fix it!" Ciao! Steven
On Mon, Feb 13, 2012 at 23:04, Steven Bosscher <stevenb.gcc@gmail.com> wrote: > On Mon, Feb 13, 2012 at 7:20 PM, Janne Blomqvist > <blomqvist.janne@gmail.com> wrote: >> Hi, >> >> the attached patch changes the low-level libgfortran IO dispatching >> mechanism to use shared vtables for each stream type, instead of all >> the function pointers being replicated for each unit. This is similar >> to e.g. how the C++ frontend implements vtables. The benefits are: >> >> - Slightly smaller heap memory overhead for each unit as only the >> vtable pointer needs to be stored, and slightly faster unit >> initialization as only the vtable pointer needs to be setup instead of >> all the function pointers in the stream struct. >> >> - Looking at unix.o with readelf, one sees >> >> Relocation section '.rela.data.rel.ro.local.mem_vtable' at offset >> 0x15550 contains 8 entries: >> >> and similarly for the other vtables; according to >> http://www.airs.com/blog/archives/189 this means that after relocation >> the page where this data resides may be marked read-only. >> >> The downside is that the sizes of the .text and .data sections are >> increased. Before: >> >> text data bss dec hex filename >> 1116991 6664 592 1124247 112797 >> ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so >> >> After: >> >> text data bss dec hex filename >> 1117487 6936 592 1125015 112a97 >> ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so >> >> >> The data section increase is due to the vtables, the text increase is, >> I guess, due to the extra pointer dereference when calling the IO >> functions. >> >> Regtested on x86_64-unknown-linux-gnu, Ok for trunk, or 4.8? > > Certainly not for trunk at this stage. Fair enough. > For 4.8: So the trade-off is between faster initialization and smaller > heap vs. fewer pointer dereferences? From a performance perspective, yes. Also, a multi-threaded program may benefit slightly from fewer cacheline ping-pongs due to the read-only vtables being separated from the RW data in the stream struct. That being said, while there are a few performance issues lurking here and there in libgfortran, virtual function dispatch isn't one of them. I think you'd be hard pressed to come up with a benchmark where you could see a difference. The other advantage is that by putting the vtables in read-only pages, the chance of a buggy program getting a SIGSEGV instead of corrupting the library state is slightly increased. > Does this patch fix an actual > problem? Does it bring a killer feature? No, it's certainly not a "killer feature". In general, considering the maturity of gfortran, I think it's unreasonable to expect that a bordering-on-trivial patch would bring a killer feature. The patch came about as a small experiment I did, and as I considered the result an overall improvement (even if admittedly quite a minor one), I added a ChangeLog and submitted it.
On Fri, Feb 17, 2012 at 12:44 PM, Janne Blomqvist <blomqvist.janne@gmail.com> wrote: > On Mon, Feb 13, 2012 at 23:04, Steven Bosscher <stevenb.gcc@gmail.com> wrote: >> On Mon, Feb 13, 2012 at 7:20 PM, Janne Blomqvist >> <blomqvist.janne@gmail.com> wrote: >>> Hi, >>> >>> the attached patch changes the low-level libgfortran IO dispatching >>> mechanism to use shared vtables for each stream type, instead of all >>> the function pointers being replicated for each unit. This is similar >>> to e.g. how the C++ frontend implements vtables. The benefits are: >>> >>> - Slightly smaller heap memory overhead for each unit as only the >>> vtable pointer needs to be stored, and slightly faster unit >>> initialization as only the vtable pointer needs to be setup instead of >>> all the function pointers in the stream struct. >>> >>> - Looking at unix.o with readelf, one sees >>> >>> Relocation section '.rela.data.rel.ro.local.mem_vtable' at offset >>> 0x15550 contains 8 entries: >>> >>> and similarly for the other vtables; according to >>> http://www.airs.com/blog/archives/189 this means that after relocation >>> the page where this data resides may be marked read-only. >>> >>> The downside is that the sizes of the .text and .data sections are >>> increased. Before: >>> >>> text data bss dec hex filename >>> 1116991 6664 592 1124247 112797 >>> ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so >>> >>> After: >>> >>> text data bss dec hex filename >>> 1117487 6936 592 1125015 112a97 >>> ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so >>> >>> >>> The data section increase is due to the vtables, the text increase is, >>> I guess, due to the extra pointer dereference when calling the IO >>> functions. >>> >>> Regtested on x86_64-unknown-linux-gnu, Ok for trunk, or 4.8? >> >> Certainly not for trunk at this stage. > > Fair enough. > >> For 4.8: So the trade-off is between faster initialization and smaller >> heap vs. fewer pointer dereferences? > > From a performance perspective, yes. Also, a multi-threaded program > may benefit slightly from fewer cacheline ping-pongs due to the > read-only vtables being separated from the RW data in the stream > struct. That being said, while there are a few performance issues > lurking here and there in libgfortran, virtual function dispatch isn't > one of them. I think you'd be hard pressed to come up with a benchmark > where you could see a difference. The cache line thing had crossed my mind, too. I'm not sure if that's a performance blocker for I/O in the end, though. > The other advantage is that by putting the vtables in read-only pages, > the chance of a buggy program getting a SIGSEGV instead of corrupting > the library state is slightly increased. Fair enough, that's a good reason to make this change. >> Does this patch fix an actual >> problem? Does it bring a killer feature? > > No, it's certainly not a "killer feature". In general, considering the > maturity of gfortran, I think it's unreasonable to expect that a > bordering-on-trivial patch would bring a killer feature. > > The patch came about as a small experiment I did, and as I considered > the result an overall improvement (even if admittedly quite a minor > one), I added a ChangeLog and submitted it. Heh, I did not mean to be unreasonable. I really appreciate all the work you're doing on gfortran! :-) Ciao! Steven
PING! (At this point, obviously for trunk only) On Mon, Feb 13, 2012 at 20:20, Janne Blomqvist <blomqvist.janne@gmail.com> wrote: > Hi, > > the attached patch changes the low-level libgfortran IO dispatching > mechanism to use shared vtables for each stream type, instead of all > the function pointers being replicated for each unit. This is similar > to e.g. how the C++ frontend implements vtables. The benefits are: > > - Slightly smaller heap memory overhead for each unit as only the > vtable pointer needs to be stored, and slightly faster unit > initialization as only the vtable pointer needs to be setup instead of > all the function pointers in the stream struct. > > - Looking at unix.o with readelf, one sees > > Relocation section '.rela.data.rel.ro.local.mem_vtable' at offset > 0x15550 contains 8 entries: > > and similarly for the other vtables; according to > http://www.airs.com/blog/archives/189 this means that after relocation > the page where this data resides may be marked read-only. > > The downside is that the sizes of the .text and .data sections are > increased. Before: > > text data bss dec hex filename > 1116991 6664 592 1124247 112797 > ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so > > After: > > text data bss dec hex filename > 1117487 6936 592 1125015 112a97 > ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so > > > The data section increase is due to the vtables, the text increase is, > I guess, due to the extra pointer dereference when calling the IO > functions. > > Regtested on x86_64-unknown-linux-gnu, Ok for trunk, or 4.8? > > 2012-02-13 Janne Blomqvist <jb@gcc.gnu.org> > > * io/unix.h (struct stream): Rename to stream_vtable. > (struct stream): New struct definition. > (sread): Dereference vtable pointer. > (swrite): Likewise. > (sseek): Likewise. > (struncate): Likewise. > (sflush): Likewise. > (sclose): Likewise. > * io/unix.c (raw_vtable): New variable. > (buf_vtable): Likewise. > (mem_vtable): Likewise. > (mem4_vtable): Likewise. > (raw_init): Assign vtable pointer. > (buf_init): Likewise. > (open_internal): Likewise. > (open_internal4): Likewise. > > > > -- > Janne Blomqvist
On 03/15/2012 11:42 AM, Janne Blomqvist wrote: > PING! (At this point, obviously for trunk only) > Yes, OK for trunk. > On Mon, Feb 13, 2012 at 20:20, Janne Blomqvist > <blomqvist.janne@gmail.com> wrote: >> Hi, >> >> the attached patch changes the low-level libgfortran IO dispatching >> mechanism to use shared vtables for each stream type, instead of all >> the function pointers being replicated for each unit. This is similar >> to e.g. how the C++ frontend implements vtables. The benefits are: >> >> - Slightly smaller heap memory overhead for each unit as only the >> vtable pointer needs to be stored, and slightly faster unit >> initialization as only the vtable pointer needs to be setup instead of >> all the function pointers in the stream struct. >> >> - Looking at unix.o with readelf, one sees >> >> Relocation section '.rela.data.rel.ro.local.mem_vtable' at offset >> 0x15550 contains 8 entries: >> >> and similarly for the other vtables; according to >> http://www.airs.com/blog/archives/189 this means that after relocation >> the page where this data resides may be marked read-only. >> >> The downside is that the sizes of the .text and .data sections are >> increased. Before: >> >> text data bss dec hex filename >> 1116991 6664 592 1124247 112797 >> ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so >> >> After: >> >> text data bss dec hex filename >> 1117487 6936 592 1125015 112a97 >> ./x86_64-unknown-linux-gnu/libgfortran/.libs/libgfortran.so >> >> >> The data section increase is due to the vtables, the text increase is, >> I guess, due to the extra pointer dereference when calling the IO >> functions. >> >> Regtested on x86_64-unknown-linux-gnu, Ok for trunk, or 4.8? >> >> 2012-02-13 Janne Blomqvist<jb@gcc.gnu.org> >> >> * io/unix.h (struct stream): Rename to stream_vtable. >> (struct stream): New struct definition. >> (sread): Dereference vtable pointer. >> (swrite): Likewise. >> (sseek): Likewise. >> (struncate): Likewise. >> (sflush): Likewise. >> (sclose): Likewise. >> * io/unix.c (raw_vtable): New variable. >> (buf_vtable): Likewise. >> (mem_vtable): Likewise. >> (mem4_vtable): Likewise. >> (raw_init): Assign vtable pointer. >> (buf_init): Likewise. >> (open_internal): Likewise. >> (open_internal4): Likewise. >> >> >> >> -- >> Janne Blomqvist > > >
diff --git a/libgfortran/io/unix.c b/libgfortran/io/unix.c index 6eef3f9..978c3ff 100644 --- a/libgfortran/io/unix.c +++ b/libgfortran/io/unix.c @@ -401,17 +401,21 @@ raw_close (unix_stream * s) return retval; } +static const struct stream_vtable raw_vtable = { + .read = (void *) raw_read, + .write = (void *) raw_write, + .seek = (void *) raw_seek, + .tell = (void *) raw_tell, + .size = (void *) raw_size, + .trunc = (void *) raw_truncate, + .close = (void *) raw_close, + .flush = (void *) raw_flush +}; + static int raw_init (unix_stream * s) { - s->st.read = (void *) raw_read; - s->st.write = (void *) raw_write; - s->st.seek = (void *) raw_seek; - s->st.tell = (void *) raw_tell; - s->st.size = (void *) raw_size; - s->st.trunc = (void *) raw_truncate; - s->st.close = (void *) raw_close; - s->st.flush = (void *) raw_flush; + s->st.vptr = &raw_vtable; s->buffer = NULL; return 0; @@ -619,17 +623,21 @@ buf_close (unix_stream * s) return raw_close (s); } +static const struct stream_vtable buf_vtable = { + .read = (void *) buf_read, + .write = (void *) buf_write, + .seek = (void *) buf_seek, + .tell = (void *) buf_tell, + .size = (void *) buf_size, + .trunc = (void *) buf_truncate, + .close = (void *) buf_close, + .flush = (void *) buf_flush +}; + static int buf_init (unix_stream * s) { - s->st.read = (void *) buf_read; - s->st.write = (void *) buf_write; - s->st.seek = (void *) buf_seek; - s->st.tell = (void *) buf_tell; - s->st.size = (void *) buf_size; - s->st.trunc = (void *) buf_truncate; - s->st.close = (void *) buf_close; - s->st.flush = (void *) buf_flush; + s->st.vptr = &buf_vtable; s->buffer = get_mem (BUFFER_SIZE); return 0; @@ -872,6 +880,31 @@ mem_close (unix_stream * s) return 0; } +static const struct stream_vtable mem_vtable = { + .read = (void *) mem_read, + .write = (void *) mem_write, + .seek = (void *) mem_seek, + .tell = (void *) mem_tell, + /* buf_size is not a typo, we just reuse an identical + implementation. */ + .size = (void *) buf_size, + .trunc = (void *) mem_truncate, + .close = (void *) mem_close, + .flush = (void *) mem_flush +}; + +static const struct stream_vtable mem4_vtable = { + .read = (void *) mem_read4, + .write = (void *) mem_write4, + .seek = (void *) mem_seek, + .tell = (void *) mem_tell, + /* buf_size is not a typo, we just reuse an identical + implementation. */ + .size = (void *) buf_size, + .trunc = (void *) mem_truncate, + .close = (void *) mem_close, + .flush = (void *) mem_flush +}; /********************************************************************* Public functions -- A reimplementation of this module needs to @@ -895,16 +928,7 @@ open_internal (char *base, int length, gfc_offset offset) s->logical_offset = 0; s->active = s->file_length = length; - s->st.close = (void *) mem_close; - s->st.seek = (void *) mem_seek; - s->st.tell = (void *) mem_tell; - /* buf_size is not a typo, we just reuse an identical - implementation. */ - s->st.size = (void *) buf_size; - s->st.trunc = (void *) mem_truncate; - s->st.read = (void *) mem_read; - s->st.write = (void *) mem_write; - s->st.flush = (void *) mem_flush; + s->st.vptr = &mem_vtable; return (stream *) s; } @@ -926,16 +950,7 @@ open_internal4 (char *base, int length, gfc_offset offset) s->logical_offset = 0; s->active = s->file_length = length; - s->st.close = (void *) mem_close; - s->st.seek = (void *) mem_seek; - s->st.tell = (void *) mem_tell; - /* buf_size is not a typo, we just reuse an identical - implementation. */ - s->st.size = (void *) buf_size; - s->st.trunc = (void *) mem_truncate; - s->st.read = (void *) mem_read4; - s->st.write = (void *) mem_write4; - s->st.flush = (void *) mem_flush; + s->st.vptr = &mem4_vtable; return (stream *) s; } diff --git a/libgfortran/io/unix.h b/libgfortran/io/unix.h index 52f3e0c..f4f3ab6 100644 --- a/libgfortran/io/unix.h +++ b/libgfortran/io/unix.h @@ -28,68 +28,71 @@ see the files COPYING3 and COPYING.RUNTIME respectively. If not, see #include "io.h" - -struct stream +struct stream_vtable { - ssize_t (*read) (struct stream *, void *, ssize_t); - ssize_t (*write) (struct stream *, const void *, ssize_t); - gfc_offset (*seek) (struct stream *, gfc_offset, int); - gfc_offset (*tell) (struct stream *); - gfc_offset (*size) (struct stream *); + ssize_t (* const read) (struct stream *, void *, ssize_t); + ssize_t (* const write) (struct stream *, const void *, ssize_t); + gfc_offset (* const seek) (struct stream *, gfc_offset, int); + gfc_offset (* const tell) (struct stream *); + gfc_offset (* const size) (struct stream *); /* Avoid keyword truncate due to AIX namespace collision. */ - int (*trunc) (struct stream *, gfc_offset); - int (*flush) (struct stream *); - int (*close) (struct stream *); + int (* const trunc) (struct stream *, gfc_offset); + int (* const flush) (struct stream *); + int (* const close) (struct stream *); }; +struct stream +{ + const struct stream_vtable *vptr; +}; /* Inline functions for doing file I/O given a stream. */ static inline ssize_t sread (stream * s, void * buf, ssize_t nbyte) { - return s->read (s, buf, nbyte); + return s->vptr->read (s, buf, nbyte); } static inline ssize_t swrite (stream * s, const void * buf, ssize_t nbyte) { - return s->write (s, buf, nbyte); + return s->vptr->write (s, buf, nbyte); } static inline gfc_offset sseek (stream * s, gfc_offset offset, int whence) { - return s->seek (s, offset, whence); + return s->vptr->seek (s, offset, whence); } static inline gfc_offset stell (stream * s) { - return s->tell (s); + return s->vptr->tell (s); } static inline gfc_offset ssize (stream * s) { - return s->size (s); + return s->vptr->size (s); } static inline int struncate (stream * s, gfc_offset length) { - return s->trunc (s, length); + return s->vptr->trunc (s, length); } static inline int sflush (stream * s) { - return s->flush (s); + return s->vptr->flush (s); } static inline int sclose (stream * s) { - return s->close (s); + return s->vptr->close (s); }