Message ID | 565870B7.2020901@redhat.com |
---|---|
State | New |
Headers | show |
On 11/27/2015 10:03 AM, Carlos O'Donell wrote: > The asprintf/vasprintf implementations calls into _IO_vasprintf > to carry out operations on a string-based FILE* object. When > the work is complete and the buffer filled with the result the > implementation null-terminates and returns the string which is > passed back to the caller of asprintf. > > During underflow the string-based buffer might have grown, and > conversely the buffer might be exactly the right size, but short > by 1-byte for the NULL terminator. Therefore the implementation > chooses to realloc the buffer to return back any excess memory > or get 1-byte more for the NULL terminator. > > In 2000 the implementation was changed by Ulrich Drpper to > make a local deicsion to use realloc vs. malloc/memcpy/free > based on a heuristic and in order to reduce memory fragmentation. > The idea is that if the buffer is only just bigger than what the > user needs the realloc will split the chunk and increase > fragmentation. While this is true, this decision should not be > made at the _IO_vasprintf level. The allocator should be making > this decision based on known fragmentation estimates and a tunable > fragmentation parameter which could be used to control when realloc > splits a chunk and when it decides to malloc/memcpy/free. > > Therefore we remove the local heuristic. This should increase > performance slightly at the cost of fragmentation. Future malloc > work to handle fragmentation should be pursued. If someone measures > fragmentation and tracks it back down to _IO_vasprintf, we add a > comment to guide the reader to what we consider is best practice: > enhance realloc to handle fragmentation decisions. > > OK to checkin? > > No regressions on x86_64. I'm going to drop this until I can get a benchtest to show this is better than before. c.
> I'm going to drop this until I can get a benchtest to show this > is better than before. IMHO if benchmarks simply show it's not clearly worse, then you should go ahead because it simplifies the code. If the motivation for a change is not performance alone, then it doesn't need to demonstrate a performance improvement per se.
On 03/04/2016 01:42 PM, Roland McGrath wrote: >> I'm going to drop this until I can get a benchtest to show this >> is better than before. > > IMHO if benchmarks simply show it's not clearly worse, then you should go > ahead because it simplifies the code. If the motivation for a change is > not performance alone, then it doesn't need to demonstrate a performance > improvement per se. It's hard to tell. I'd like to prove it to myself that it's generating better code. c.
diff --git a/libio/vasprintf.c b/libio/vasprintf.c index 61cdfdd..5b39ee7 100644 --- a/libio/vasprintf.c +++ b/libio/vasprintf.c @@ -41,7 +41,6 @@ _IO_vasprintf (char **result_ptr, const char *format, _IO_va_list args) _IO_strfile sf; int ret; _IO_size_t needed; - _IO_size_t allocated; /* No need to clear the memory here (unlike for open_memstream) since we know we will never seek on the stream. */ string = (char *) malloc (init_string_size); @@ -62,24 +61,18 @@ _IO_vasprintf (char **result_ptr, const char *format, _IO_va_list args) free (sf._sbf._f._IO_buf_base); return ret; } - /* Only use realloc if the size we need is of the same (binary) - order of magnitude then the memory we allocated. */ needed = sf._sbf._f._IO_write_ptr - sf._sbf._f._IO_write_base + 1; - allocated = sf._sbf._f._IO_write_end - sf._sbf._f._IO_write_base; - if ((allocated >> 1) <= needed) - *result_ptr = (char *) realloc (sf._sbf._f._IO_buf_base, needed); - else - { - *result_ptr = (char *) malloc (needed); - if (*result_ptr != NULL) - { - memcpy (*result_ptr, sf._sbf._f._IO_buf_base, needed - 1); - free (sf._sbf._f._IO_buf_base); - } - else - /* We have no choice, use the buffer we already have. */ - *result_ptr = (char *) realloc (sf._sbf._f._IO_buf_base, needed); - } + /* We are either shrinking the buffer, doing nothing, or growing + the buffer by 1 byte for the NULL. In all of these cases it's + fastest to call realloc. At the cost of more memory we could + choose not to realloc and return a larger buffer, but tuning + that requires some thought. At one point this code was expanded + to consider realloc vs. malloc/memcpy/free as a method for + reducing fragmentation caused by realloc, but was later removed + because such decisions are tuning parameters that should be part + of the realloc decisions made by malloc e.g. fragmentation vs. + making a copy. */ + *result_ptr = (char *) realloc (sf._sbf._f._IO_buf_base, needed); if (*result_ptr == NULL) *result_ptr = sf._sbf._f._IO_buf_base; (*result_ptr)[needed - 1] = '\0';