manual: Document replacing malloc [BZ #20424]

Submitted by Florian Weimer on April 20, 2017, 10:53 a.m.

Details

Message ID c05f8996-6cc6-b9ee-0f94-f78ba0cf3bd8@redhat.com
State New

Commit Message

Florian Weimer April 20, 2017, 10:53 a.m.
On 04/19/2017 11:20 PM, DJ Delorie wrote:
> Florian Weimer <fw@deneb.enyo.de> writes:
> 
>> * DJ Delorie:
>>
>>> fweimer@redhat.com (Florian Weimer) writes:
>>>> +linked programs, this happens through ELF symbol interposition.  For
>>>> +static linking, the @code{malloc} replacement library must be linked in
>>>> +before linking against @code{libc.a} (explicitly or implicitly).
>>>
>>> Note that in statically linked cases, it's important to make sure that
>>> ALL your API functions are linked in.  If your allocator is a library,
>>> with one function per object, you might get only some of your functions
>>> - and then try to get the rest from glibc, which will of course fail.
>>
>> That's what I meant with “linking failures and crashes at run time”
>> below.
>>
>> I don't think it matters whether you have one object file or multiple
>> ones.
> 
> My point was, just putting your *.a first in the link line is
> insufficient to make sure all your *.o are included, and if
> some-but-not-all of your *.o are pulled from your library, that's an
> additional problem for you to solve.  Or at least watch out for.

Ahh, you mean this:

     “Normally, an archive is searched
      only once in the order that it is specified on the command line.
      If a symbol in that archive is needed to resolve an undefined
      symbol referred to by an object in an archive that appears later on
      the command line, the linker would not be able to resolve that
      reference.”

I think this falls under correctly using the static linker.  I'm not 
sure if we should go into that level of detail in the malloc documentation.

> That would do.  Maybe one general warning about static linking rather
> than repeating it throughout, though?

I added a Note: in the attached patch.

Thanks,
Florian

Comments

ricaljasan April 21, 2017, 2:33 a.m.
On 04/20/2017 03:53 AM, Florian Weimer wrote:
> +@node Replacing malloc
> +@subsection Replacing @code{malloc}
> +
> +@cindex @code{malloc} replacement
> +@cindex @code{LD_PRELOAD} and @code{malloc}
> +@cindex alternative @code{malloc} implementations
> +@cindex customizing @code{malloc}
> +@cindex interposing @code{malloc}
> +@cindex preempting @code{malloc}
> +@cindex replacing @code{malloc}
> +@Theglibc{} supports replacing the built-in @code{malloc} implementation

I appreciate the liberal use of @cindex here, especially having one that
also begins with "malloc".

Rical

Szabolcs Nagy April 24, 2017, 5:17 p.m.
On 20/04/17 11:53, Florian Weimer wrote:
> +@node Replacing malloc
> +@subsection Replacing @code{malloc}
> +
> +@cindex @code{malloc} replacement
> +@cindex @code{LD_PRELOAD} and @code{malloc}
> +@cindex alternative @code{malloc} implementations
> +@cindex customizing @code{malloc}
> +@cindex interposing @code{malloc}
> +@cindex preempting @code{malloc}
> +@cindex replacing @code{malloc}
> +@Theglibc{} supports replacing the built-in @code{malloc} implementation
> +with a different allocator with the same interface.  For dynamically
> +linked programs, this happens through ELF symbol interposition, either
> +using shared object dependencies or @code{LD_PRELOAD}.  For static
> +linking, the @code{malloc} replacement library must be linked in before
> +linking against @code{libc.a} (explicitly or implicitly).
> +

this documentation does not mention known caveats, e.g.

- when wrapping calloc via dlsym, dlsym may itself call calloc;
  the user has to deal with that recursion.

- similarly, any interface that may use malloc internally (now or
  in the future) had better not be used in the malloc implementation.

- malloc may be called while locks are held (e.g. some stdio
  lock during scanf), so synchronizing in the malloc implementation
  with anything that might also hold the same lock may deadlock.
  (a more interesting example is probably dl_load_lock, which is
  held while allocating dynamic tls in some cases, so a user-provided
  malloc should not use dtls.)

- some other invariants may not hold internally in libc when
  malloc is called, so the malloc implementation should not
  rely on those (e.g. posix tsd deallocation calls free, which
  might observe tsd in an inconsistent state).

- glibc tries to provide some guarantees that may not work
  with an interposed malloc (unlocking malloc locks at
  multi-threaded fork?)

it would be useful to specify the libc internal malloc interface
contracts, but that is a lot harder than just listing the set of
interfaces that need to be provided.
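
The first caveat above (dlsym calling calloc while the wrapper is still looking up the real calloc) is commonly handled with a re-entrancy flag and a small static bootstrap buffer. The following is only a sketch of that pattern, not anything taken from the patch. To keep it runnable without a preload library, the dlsym call is replaced by a stand-in, `lookup_next_calloc`, which deliberately allocates during the lookup; all names (`wrapped_calloc`, `bootstrap_calloc`, `is_bootstrap_pointer`, `lookup_scratch`) are hypothetical. The plain static flag is not thread-safe, and a real wrapper would need a thread-safe guard that does not itself allocate.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef void *(*calloc_fn) (size_t, size_t);

void *wrapped_calloc (size_t n, size_t size);

static calloc_fn next_calloc;   /* underlying calloc, once resolved */
static int resolving;           /* nonzero while the lookup is running */

/* Static storage is zero-initialized, so it can serve calloc requests
   that arrive before the underlying allocator is known.  */
static _Alignas (max_align_t) unsigned char bootstrap[4096];
static size_t bootstrap_used;
static_assert (sizeof bootstrap % _Alignof (max_align_t) == 0,
               "bootstrap buffer must hold whole aligned units");

/* Stand-in for (calloc_fn) dlsym (RTLD_NEXT, "calloc") in a real
   preload library.  It allocates during the lookup, as dlsym may.  */
static void *lookup_scratch;
static calloc_fn lookup_next_calloc (void)
{
  lookup_scratch = wrapped_calloc (1, 32);
  return calloc;
}

/* Bump allocation from the bootstrap buffer; never freed.  */
static void *bootstrap_calloc (size_t n, size_t size)
{
  if (size != 0 && n > SIZE_MAX / size)
    return NULL;
  size_t bytes = n * size;
  if (bytes == 0)
    bytes = 1;
  if (bytes > sizeof bootstrap)
    return NULL;
  size_t align = _Alignof (max_align_t);
  bytes = (bytes + align - 1) / align * align;
  if (bytes > sizeof bootstrap - bootstrap_used)
    return NULL;
  void *p = bootstrap + bootstrap_used;
  bootstrap_used += bytes;
  return p;
}

/* A matching free wrapper must skip pointers for which this is true.  */
int is_bootstrap_pointer (const void *p)
{
  return (uintptr_t) p - (uintptr_t) bootstrap < sizeof bootstrap;
}

void *wrapped_calloc (size_t n, size_t size)
{
  if (next_calloc == NULL)
    {
      if (resolving)
        /* Re-entered from the lookup: do not recurse into it.  */
        return bootstrap_calloc (n, size);
      resolving = 1;
      next_calloc = lookup_next_calloc ();
      resolving = 0;
      if (next_calloc == NULL)
        return NULL;
    }
  return next_calloc (n, size);
}
```

The design choice worth noting is that bootstrap memory is never returned to the underlying allocator, so the corresponding free (and realloc) wrappers have to recognize such pointers and leave them alone.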

> +@strong{Note:} Failure to provide a complete set of replacement
> +functions (that is, all the functions used by the application,
> +@theglibc{}, and other linked-in libraries) can lead to static linking
> +failures, and, at run time, to heap corruption and application crashes.
> +
> +The minimum set of functions which has to be provided by a custom
> +@code{malloc} is given in the table below.
> +
> +@table @code
> +@item malloc
> +@item free
> +@item calloc
> +@item realloc
> +@end table
> +
> +These @code{malloc}-related functions are required for @theglibc{} to
> +work.@footnote{Versions of @theglibc{} before 2.25 required that a
> +custom @code{malloc} defines @code{__libc_memalign} (with the same
> +interface as the @code{memalign} function).}
> +
> +The @code{malloc} implementation in @theglibc{} provides additional
> +functionality not used by the library itself, but which is often used by
> +other system libraries and applications.  A general-purpose replacement
> +@code{malloc} implementation should provide definitions of these
> +functions, too.  Their names are listed in the following table.
> +
> +@table @code
> +@item aligned_alloc
> +@item malloc_usable_size
> +@item memalign
> +@item posix_memalign
> +@item pvalloc
> +@item valloc
> +@end table
> +
> +In addition, very old applications may use the obsolete @code{cfree}
> +function.
> +
> +Further @code{malloc}-related functions such as @code{mallopt} or
> +@code{mallinfo} will not have any effect or return incorrect statistics
> +when a replacement @code{malloc} is in use.  However, failure to replace
> +these functions typically does not result in crashes or other incorrect
> +application behavior, but may result in static linking failures.
> +
>  @node Obstacks
>  @subsection Obstacks
>  @cindex obstacks
>
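
For illustration, the four required entry points listed in the quoted patch can be sketched as a toy bump allocator kept in a single translation unit, so that a static link pulls in all of them together. This is a minimal sketch under stated assumptions, not a usable allocator: it is not thread-safe, never reuses memory, and has a fixed arena. The functions carry a hypothetical `toy_` prefix so the sketch can be compiled and exercised without interposing anything; an actual replacement would export them as `malloc`, `free`, `calloc`, and `realloc`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One arena unit.  The union forces maximal alignment, so every block
   that starts on a unit boundary is suitably aligned for any type.  */
union toy_unit { size_t size; max_align_t align; };

#define TOY_UNITS 4096
static union toy_unit toy_arena[TOY_UNITS];
static size_t toy_used;   /* units handed out so far */

void *toy_malloc (size_t size)
{
  if (size > sizeof toy_arena)
    return NULL;
  /* One header unit plus enough units to hold the payload.  */
  size_t units = 1 + (size + sizeof (union toy_unit) - 1)
                     / sizeof (union toy_unit);
  if (units > TOY_UNITS - toy_used)
    return NULL;
  union toy_unit *header = &toy_arena[toy_used];
  header->size = size;
  toy_used += units;
  return header + 1;
}

/* A bump allocator cannot release individual blocks.  */
void toy_free (void *ptr)
{
  (void) ptr;
}

void *toy_calloc (size_t n, size_t size)
{
  if (size != 0 && n > SIZE_MAX / size)
    return NULL;   /* n * size would overflow */
  void *p = toy_malloc (n * size);
  if (p != NULL)
    memset (p, 0, n * size);
  return p;
}

void *toy_realloc (void *ptr, size_t size)
{
  if (ptr == NULL)
    return toy_malloc (size);
  /* Catch pointers that did not come from this arena.  */
  assert ((uintptr_t) ptr - (uintptr_t) toy_arena <= sizeof toy_arena);
  size_t old_size = ((union toy_unit *) ptr - 1)->size;
  if (size <= old_size)
    return ptr;    /* shrinking keeps the block in place */
  void *p = toy_malloc (size);
  if (p != NULL)
    memcpy (p, ptr, old_size);
  return p;
}
```

A general-purpose replacement would also need the additional functions from the second table in the quoted patch (`aligned_alloc`, `posix_memalign`, and so on), which this sketch leaves out.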

Patch

manual: Document replacing malloc [BZ #20424]

2017-04-20  Florian Weimer  <fweimer@redhat.com>

	[BZ #20424]
	* manual/memory.texi (Replacing malloc): New section.
	(Allocating Storage For Program Data): Reference it.
	(The GNU Allocator): Likewise.

diff --git a/manual/memory.texi b/manual/memory.texi
index a39cac8..a256ca0 100644
--- a/manual/memory.texi
+++ b/manual/memory.texi
@@ -167,6 +167,7 @@  special to @theglibc{} and GNU Compiler.
 * Unconstrained Allocation::    The @code{malloc} facility allows fully general
 		 		 dynamic allocation.
 * Allocation Debugging::        Finding memory leaks and not freed memory.
+* Replacing malloc::            Using your own @code{malloc}-style allocator.
 * Obstacks::                    Obstacks are less general than malloc
 				 but more efficient and convenient.
 * Variable Size Automatic::     Allocation of variable-sized blocks
@@ -299,6 +300,9 @@  A more detailed technical description of the GNU Allocator is maintained in
 the @glibcadj{} wiki. See
 @uref{https://sourceware.org/glibc/wiki/MallocInternals}.
 
+It is possible to use your own custom @code{malloc} instead of the
+built-in allocator provided by @theglibc{}.  @xref{Replacing malloc}.
+
 @node Unconstrained Allocation
 @subsection Unconstrained Allocation
 @cindex unconstrained memory allocation
@@ -1898,6 +1902,67 @@  from line 33 in the source file @file{/home/drepper/tst-mtrace.c} four
 times without freeing this memory before the program terminates.
 Whether this is a real problem remains to be investigated.
 
+@node Replacing malloc
+@subsection Replacing @code{malloc}
+
+@cindex @code{malloc} replacement
+@cindex @code{LD_PRELOAD} and @code{malloc}
+@cindex alternative @code{malloc} implementations
+@cindex customizing @code{malloc}
+@cindex interposing @code{malloc}
+@cindex preempting @code{malloc}
+@cindex replacing @code{malloc}
+@Theglibc{} supports replacing the built-in @code{malloc} implementation
+with a different allocator with the same interface.  For dynamically
+linked programs, this happens through ELF symbol interposition, either
+using shared object dependencies or @code{LD_PRELOAD}.  For static
+linking, the @code{malloc} replacement library must be linked in before
+linking against @code{libc.a} (explicitly or implicitly).
+
+@strong{Note:} Failure to provide a complete set of replacement
+functions (that is, all the functions used by the application,
+@theglibc{}, and other linked-in libraries) can lead to static linking
+failures, and, at run time, to heap corruption and application crashes.
+
+The minimum set of functions which has to be provided by a custom
+@code{malloc} is given in the table below.
+
+@table @code
+@item malloc
+@item free
+@item calloc
+@item realloc
+@end table
+
+These @code{malloc}-related functions are required for @theglibc{} to
+work.@footnote{Versions of @theglibc{} before 2.25 required that a
+custom @code{malloc} defines @code{__libc_memalign} (with the same
+interface as the @code{memalign} function).}
+
+The @code{malloc} implementation in @theglibc{} provides additional
+functionality not used by the library itself, but which is often used by
+other system libraries and applications.  A general-purpose replacement
+@code{malloc} implementation should provide definitions of these
+functions, too.  Their names are listed in the following table.
+
+@table @code
+@item aligned_alloc
+@item malloc_usable_size
+@item memalign
+@item posix_memalign
+@item pvalloc
+@item valloc
+@end table
+
+In addition, very old applications may use the obsolete @code{cfree}
+function.
+
+Further @code{malloc}-related functions such as @code{mallopt} or
+@code{mallinfo} will not have any effect or return incorrect statistics
+when a replacement @code{malloc} is in use.  However, failure to replace
+these functions typically does not result in crashes or other incorrect
+application behavior, but may result in static linking failures.
+
 @node Obstacks
 @subsection Obstacks
 @cindex obstacks