Message ID | c05f8996-6cc6-b9ee-0f94-f78ba0cf3bd8@redhat.com |
---|---|
State | New |
On 04/20/2017 03:53 AM, Florian Weimer wrote:
> +@node Replacing malloc
> +@subsection Replacing @code{malloc}
> +
> +@cindex @code{malloc} replacement
> +@cindex @code{LD_PRELOAD} and @code{malloc}
> +@cindex alternative @code{malloc} implementations
> +@cindex customizing @code{malloc}
> +@cindex interposing @code{malloc}
> +@cindex preempting @code{malloc}
> +@cindex replacing @code{malloc}
> +@Theglibc{} supports replacing the built-in @code{malloc} implementation

I appreciate the liberal use of @cindex here, especially having one that
also begins with "malloc".

Rical

On 20/04/17 11:53, Florian Weimer wrote:
> +@node Replacing malloc
> +@subsection Replacing @code{malloc}
> +
> +@cindex @code{malloc} replacement
> +@cindex @code{LD_PRELOAD} and @code{malloc}
> +@cindex alternative @code{malloc} implementations
> +@cindex customizing @code{malloc}
> +@cindex interposing @code{malloc}
> +@cindex preempting @code{malloc}
> +@cindex replacing @code{malloc}
> +@Theglibc{} supports replacing the built-in @code{malloc} implementation
> +with a different allocator with the same interface.  For dynamically
> +linked programs, this happens through ELF symbol interposition, either
> +using shared object dependencies or @code{LD_PRELOAD}.  For static
> +linking, the @code{malloc} replacement library must be linked in before
> +linking against @code{libc.a} (explicitly or implicitly).
> +

this documentation does not mention known caveats, e.g.

- when wrapping calloc via dlsym, dlsym may call calloc, the
  user has to deal with it,

- similarly any interface that internally may use malloc (in
  the future) better not be used in the malloc implementation.

- malloc may be called when locks are held (e.g. some stdio
  lock during scanf) so synchronizing with anything that might
  also hold the same lock in the malloc implementation may
  deadlock (a more interesting example is probably the dl_load_lock
  while allocating dynamic tls in some cases so a user provided
  malloc should not use dtls)

- some other invariants may not hold internally in libc when
  malloc is called, so the malloc implementation should not
  rely on those. (e.g. posix tsd deallocation calls free which
  might observe tsd in an inconsistent state)

- glibc tries to provide some guarantees that may not work
  with interposed malloc (unlocking malloc locks at
  multi-threaded fork ?)

it would be useful to specify the libc internal malloc
interface contracts, but that is a lot harder than just
listing the set of interfaces that need to be provided.

> +@strong{Note:} Failure to provide a complete set of replacement
> +functions (that is, all the functions used by the application,
> +@theglibc{}, and other linked-in libraries) can lead to static linking
> +failures, and, at run time, to heap corruption and application crashes.
> +
> +The minimum set of functions which has to be provided by a custom
> +@code{malloc} is given in the table below.
> +
> +@table @code
> +@item malloc
> +@item free
> +@item calloc
> +@item realloc
> +@end table
> +
> +These @code{malloc}-related functions are required for @theglibc{} to
> +work.@footnote{Versions of @theglibc{} before 2.25 required that a
> +custom @code{malloc} defines @code{__libc_memalign} (with the same
> +interface as the @code{memalign} function).}
> +
> +The @code{malloc} implementation in @theglibc{} provides additional
> +functionality not used by the library itself, but which is often used by
> +other system libraries and applications.  A general-purpose replacement
> +@code{malloc} implementation should provide definitions of these
> +functions, too.  Their names are listed in the following table.
> +
> +@table @code
> +@item aligned_alloc
> +@item malloc_usable_size
> +@item memalign
> +@item posix_memalign
> +@item pvalloc
> +@item valloc
> +@end table
> +
> +In addition, very old applications may use the obsolete @code{cfree}
> +function.
> +
> +Further @code{malloc}-related functions such as @code{mallopt} or
> +@code{mallinfo} will not have any effect or return incorrect statistics
> +when a replacement @code{malloc} is in use.  However, failure to replace
> +these functions typically does not result in crashes or other incorrect
> +application behavior, but may result in static linking failures.
> +
>  @node Obstacks
>  @subsection Obstacks
>  @cindex obstacks
>

On 04/24/2017 07:17 PM, Szabolcs Nagy wrote:
> On 20/04/17 11:53, Florian Weimer wrote:
>> +@node Replacing malloc
>> +@subsection Replacing @code{malloc}
>> +
>> +@cindex @code{malloc} replacement
>> +@cindex @code{LD_PRELOAD} and @code{malloc}
>> +@cindex alternative @code{malloc} implementations
>> +@cindex customizing @code{malloc}
>> +@cindex interposing @code{malloc}
>> +@cindex preempting @code{malloc}
>> +@cindex replacing @code{malloc}
>> +@Theglibc{} supports replacing the built-in @code{malloc} implementation
>> +with a different allocator with the same interface.  For dynamically
>> +linked programs, this happens through ELF symbol interposition, either
>> +using shared object dependencies or @code{LD_PRELOAD}.  For static
>> +linking, the @code{malloc} replacement library must be linked in before
>> +linking against @code{libc.a} (explicitly or implicitly).
>> +
>
> this documentation does not mention known caveats, e.g.

There are many more, like following ABI requirements regarding alignment
and pointer bit patterns.

> - when wrapping calloc via dlsym, dlsym may call calloc, the
>   user has to deal with it,
>
> - similarly any interface that internally may use malloc (in
>   the future) better not be used in the malloc implementation.
>
> - malloc may be called when locks are held (e.g. some stdio
>   lock during scanf) so synchronizing with anything that might
>   also hold the same lock in the malloc implementation may
>   deadlock (a more interesting example is probably the dl_load_lock
>   while allocating dynamic tls in some cases so a user provided
>   malloc should not use dtls)

It should be able to use initial-exec TLS, though.

Is it really worthwhile to go into such details?  The malloc/fork
interaction is something mostly dependent on malloc implementation details.

I can add a general warning to the documentation that implementing
malloc is not easy, but I'm not sure how helpful it would be.

The symbol list is mainly there because jemalloc forgot to override
__libc_memalign for older glibcs, and we don't want a repeat of that.

Thanks,
Florian

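To make the __libc_memalign point concrete, here is a minimal sketch of the
forwarding definition a replacement allocator could ship for glibc versions
before 2.25.  It assumes the replacement library already defines memalign
(declared in <malloc.h>) with the usual semantics; nothing here is taken from
jemalloc or from the patch, and the prototype is written out by hand because
no public header declares the symbol.

```c
#include <malloc.h>   /* declares memalign */
#include <stddef.h>

/* Not declared in any public header, so declare it here.  The interface
   matches memalign, as the footnote in the patch states.  */
void *__libc_memalign (size_t alignment, size_t size);

/* Forward to the allocator's own memalign so that memory handed out
   through this entry point comes from the same heap as everything
   else and can be released by the replacement free.  */
void *
__libc_memalign (size_t alignment, size_t size)
{
  return memalign (alignment, size);
}
```

With glibc 2.25 and later the definition is simply unused, so keeping it in a
replacement library should be harmless.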
On 03/05/17 10:08, Florian Weimer wrote:
> On 04/24/2017 07:17 PM, Szabolcs Nagy wrote:
>> On 20/04/17 11:53, Florian Weimer wrote:
>>> +@node Replacing malloc
>>> +@subsection Replacing @code{malloc}
>>> +
>>> +@cindex @code{malloc} replacement
>>> +@cindex @code{LD_PRELOAD} and @code{malloc}
>>> +@cindex alternative @code{malloc} implementations
>>> +@cindex customizing @code{malloc}
>>> +@cindex interposing @code{malloc}
>>> +@cindex preempting @code{malloc}
>>> +@cindex replacing @code{malloc}
>>> +@Theglibc{} supports replacing the built-in @code{malloc} implementation
>>> +with a different allocator with the same interface.  For dynamically
>>> +linked programs, this happens through ELF symbol interposition, either
>>> +using shared object dependencies or @code{LD_PRELOAD}.  For static
>>> +linking, the @code{malloc} replacement library must be linked in before
>>> +linking against @code{libc.a} (explicitly or implicitly).
>>> +
>>
>> this documentation does not mention known caveats, e.g.
>
> There are many more, like following ABI requirements regarding alignment and pointer bit patterns.
>
>> - when wrapping calloc via dlsym, dlsym may call calloc, the
>>   user has to deal with it,
>>
>> - similarly any interface that internally may use malloc (in
>>   the future) better not be used in the malloc implementation.
>>
>> - malloc may be called when locks are held (e.g. some stdio
>>   lock during scanf) so synchronizing with anything that might
>>   also hold the same lock in the malloc implementation may
>>   deadlock (a more interesting example is probably the dl_load_lock
>>   while allocating dynamic tls in some cases so a user provided
>>   malloc should not use dtls)
>
> It should be able to use initial-exec TLS, though.
>
> Is it really worthwhile to go into such details?  The malloc/fork interaction is something mostly dependent on
> malloc implementation details.
>
> I can add a general warning to the documentation that implementing malloc is not easy, but I'm not sure how
> helpful it would be.
>

i don't want to block this patch, it is better to specify
the list of symbols to interpose than not saying anything
on the matter.

> The symbol list is mainly there because jemalloc forgot to override __libc_memalign for older glibcs, and we
> don't want a repeat of that.
>

i remember other cases of malloc interposition bugs
mostly related to glibc internals calling malloc in
unexpected situations, so it is worth specifying the
internal interface contract in more detail at some
point.

> Thanks,
> Florian

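On the dynamic TLS versus initial-exec TLS point discussed above, a small
sketch may help.  It shows per-thread allocator state declared with the
initial-exec TLS model through the GCC/Clang tls_model attribute, so that
accesses use a fixed offset from the thread pointer and do not go through
__tls_get_addr, which is the path that can allocate dynamic TLS lazily.  The
struct and function names are hypothetical, and the sketch assumes the
allocator is loaded at program start (linked in or via LD_PRELOAD), not
through a later dlopen.

```c
#include <stddef.h>

/* Hypothetical per-thread statistics kept by a replacement allocator.  */
struct thread_stats
{
  size_t allocations;   /* number of allocation calls on this thread */
  size_t bytes;         /* total bytes requested on this thread */
};

/* initial-exec: the offset from the thread pointer is fixed at load
   time, so touching this variable never triggers a TLS allocation.  */
static __thread struct thread_stats tstats
  __attribute__ ((tls_model ("initial-exec")));

/* Record one allocation; intended to be called from the allocator's
   malloc path.  */
void
note_allocation (size_t size)
{
  tstats.allocations++;
  tstats.bytes += size;
}

/* Accessors for the calling thread's counters.  */
size_t
thread_allocation_count (void)
{
  return tstats.allocations;
}

size_t
thread_allocation_bytes (void)
{
  return tstats.bytes;
}
```

The counters start at zero in every thread because TLS variables without an
initializer are zero-initialized.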
manual: Document replacing malloc [BZ #20424]

2017-04-20  Florian Weimer  <fweimer@redhat.com>

	[BZ #20424]
	* manual/memory.texi (Replacing malloc): New section.
	(Allocating Storage For Program Data): Reference it.
	(The GNU Allocator): Likewise.

diff --git a/manual/memory.texi b/manual/memory.texi
index a39cac8..a256ca0 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -167,6 +167,7 @@ special to @theglibc{} and GNU Compiler.
 * Unconstrained Allocation::    The @code{malloc} facility allows fully general
                                  dynamic allocation.
 * Allocation Debugging::        Finding memory leaks and not freed memory.
+* Replacing malloc::            Using your own @code{malloc}-style allocator.
 * Obstacks::                    Obstacks are less general than malloc
                                  but more efficient and convenient.
 * Variable Size Automatic::     Allocation of variable-sized blocks
@@ -299,6 +300,9 @@ A more detailed technical description of the GNU Allocator is maintained in
 the @glibcadj{} wiki.  See
 @uref{https://sourceware.org/glibc/wiki/MallocInternals}.
 
+It is possible to use your own custom @code{malloc} instead of the
+built-in allocator provided by @theglibc{}.  @xref{Replacing malloc}.
+
 @node Unconstrained Allocation
 @subsection Unconstrained Allocation
 @cindex unconstrained memory allocation
@@ -1898,6 +1902,67 @@ from line 33 in the source file @file{/home/drepper/tst-mtrace.c} four
 times without freeing this memory before the program terminates.
 Whether this is a real problem remains to be investigated.
 
+@node Replacing malloc
+@subsection Replacing @code{malloc}
+
+@cindex @code{malloc} replacement
+@cindex @code{LD_PRELOAD} and @code{malloc}
+@cindex alternative @code{malloc} implementations
+@cindex customizing @code{malloc}
+@cindex interposing @code{malloc}
+@cindex preempting @code{malloc}
+@cindex replacing @code{malloc}
+@Theglibc{} supports replacing the built-in @code{malloc} implementation
+with a different allocator with the same interface.  For dynamically
+linked programs, this happens through ELF symbol interposition, either
+using shared object dependencies or @code{LD_PRELOAD}.  For static
+linking, the @code{malloc} replacement library must be linked in before
+linking against @code{libc.a} (explicitly or implicitly).
+
+@strong{Note:} Failure to provide a complete set of replacement
+functions (that is, all the functions used by the application,
+@theglibc{}, and other linked-in libraries) can lead to static linking
+failures, and, at run time, to heap corruption and application crashes.
+
+The minimum set of functions which has to be provided by a custom
+@code{malloc} is given in the table below.
+
+@table @code
+@item malloc
+@item free
+@item calloc
+@item realloc
+@end table
+
+These @code{malloc}-related functions are required for @theglibc{} to
+work.@footnote{Versions of @theglibc{} before 2.25 required that a
+custom @code{malloc} defines @code{__libc_memalign} (with the same
+interface as the @code{memalign} function).}
+
+The @code{malloc} implementation in @theglibc{} provides additional
+functionality not used by the library itself, but which is often used by
+other system libraries and applications.  A general-purpose replacement
+@code{malloc} implementation should provide definitions of these
+functions, too.  Their names are listed in the following table.
+
+@table @code
+@item aligned_alloc
+@item malloc_usable_size
+@item memalign
+@item posix_memalign
+@item pvalloc
+@item valloc
+@end table
+
+In addition, very old applications may use the obsolete @code{cfree}
+function.
+
+Further @code{malloc}-related functions such as @code{mallopt} or
+@code{mallinfo} will not have any effect or return incorrect statistics
+when a replacement @code{malloc} is in use.  However, failure to replace
+these functions typically does not result in crashes or other incorrect
+application behavior, but may result in static linking failures.
+
 @node Obstacks
 @subsection Obstacks
 @cindex obstacks
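
For readers who want to see what the minimum set from the new section looks
like in practice, the following is a rough sketch of a replacement that
defines exactly malloc, free, calloc and realloc.  It is a deliberately naive
bump allocator over a fixed static pool: free is a no-op, it is not
thread-safe, and the pool size, the 16-byte alignment and the helper names
(bump_alloc, header_t, arena) are assumptions made for illustration, not part
of the patch.  A general-purpose replacement would also need the additional
functions listed in the second table.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ARENA_SIZE ((size_t) 1 << 20)   /* hypothetical 1 MiB pool */
#define ALIGNMENT 16                    /* assumed ABI alignment */

/* Header stored in front of every block.  It is padded to ALIGNMENT
   bytes so the user pointer behind it stays aligned.  */
typedef union
{
  size_t size;                          /* usable size of the block */
  unsigned char pad[ALIGNMENT];
} header_t;

static_assert (sizeof (header_t) == ALIGNMENT,
               "header must preserve block alignment");

static _Alignas (ALIGNMENT) unsigned char arena[ARENA_SIZE];
static size_t arena_used;               /* always a multiple of ALIGNMENT */

/* Carve a fresh block off the pool; memory is never reused.  */
static void *
bump_alloc (size_t size)
{
  if (size > ARENA_SIZE)
    {
      errno = ENOMEM;
      return NULL;
    }
  size_t rounded = (size + ALIGNMENT - 1) & ~(size_t) (ALIGNMENT - 1);
  size_t need = sizeof (header_t) + rounded;
  if (need > ARENA_SIZE - arena_used)
    {
      errno = ENOMEM;
      return NULL;
    }
  header_t *h = (header_t *) (arena + arena_used);
  h->size = rounded;
  arena_used += need;
  return h + 1;
}

void *
malloc (size_t size)
{
  return bump_alloc (size);
}

void
free (void *ptr)
{
  (void) ptr;                           /* a bump allocator never releases */
}

void *
calloc (size_t nmemb, size_t size)
{
  if (size != 0 && nmemb > SIZE_MAX / size)
    {
      errno = ENOMEM;                   /* nmemb * size would overflow */
      return NULL;
    }
  void *p = bump_alloc (nmemb * size);
  if (p != NULL)
    memset (p, 0, nmemb * size);
  return p;
}

void *
realloc (void *ptr, size_t size)
{
  if (ptr == NULL)
    return bump_alloc (size);
  header_t *h = (header_t *) ptr - 1;
  if (size <= h->size)
    return ptr;                         /* existing block is large enough */
  void *fresh = bump_alloc (size);
  if (fresh != NULL)
    memcpy (fresh, ptr, h->size);
  return fresh;
}
```

Built as a shared object (for example `gcc -shared -fPIC -o libbump.so bump.c`,
file names hypothetical), it can be tried on a dynamically linked program with
`LD_PRELOAD=./libbump.so ./app`.  For static linking the object has to appear
before libc.a on the link line, as the section describes.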