diff mbox

Avoid deadlock in malloc on backtrace

Message ID 20150224100249.GA31871@spoyarek.pnq.redhat.com
State New
Headers show

Commit Message

Siddhesh Poyarekar Feb. 24, 2015, 10:02 a.m. UTC
When the malloc subsystem detects some kind of memory corruption,
depending on the configuration it prints the error, a backtrace, a
memory map and then aborts the process.  In this process, the
backtrace() call may result in a call to malloc, resulting in
various kinds of problematic behavior.

In one case, the malloc it calls may detect a corruption and call
backtrace again, and a stack overflow may result due to the infinite
recursion.  In another case, the malloc it calls may deadlock on an
arena lock with the malloc (or free, realloc, etc.) that detected the
corruption.  In yet another case, if the program is linked with
pthreads, backtrace may do a pthread_once initialization, which
deadlocks on itself.

In all these cases, the program exit is not as intended.  This is
avoidable by marking the arena that malloc detected a corruption on,
as unusable.  The following patch does that.  Features of this patch
are as follows:

- A flag is added to the mstate struct of the arena to indicate if the
  arena is corrupt.

- The flag is checked whenever malloc functions try to get a lock on
  an arena.  If the arena is unusable, a NULL is returned, causing the
  malloc to use mmap or try the next arena.

- malloc_printerr sets the corrupt flag on the arena when it detects a
  corruption

- free does not concern itself with the flag at all.  It is not
  important since the backtrace workflow does not need free.  A free
  in a parallel thread may cause another corruption, but that's not
  new

- The flag check and set are not atomic and may race.  This is fine
  since we don't care about contention during the flag check.  We want
  to make sure that the malloc call in the backtrace does not trip on
  itself and all that action happens in the same thread and not across
  threads.

I verified that the test case does not show any regressions due to
this patch.  I also ran the malloc benchmarks and found an
insignificant difference in timings (< 2%).

	* malloc/Makefile (tests): New test case tst-malloc-backtrace.
	* malloc/arena.c (arena_lock): Check if arena is corrupt.
	(reused_arena): Find a non-corrupt arena.
	(heap_trim): Pass arena to unlink.
	* malloc/hooks.c (malloc_check_get_size): Pass arena to
	malloc_printerr.
	(top_check): Likewise.
	(free_check): Likewise.
	(realloc_check): Likewise.
	* malloc/malloc.c (malloc_printerr): Add arena argument.
	(unlink): Likewise.
	(munmap_chunk): Adjust.
	(ARENA_CORRUPTION_BIT): New macro.
	(arena_is_corrupt): Likewise.
	(set_arena_corrupt): Likewise.
	(sysmalloc): Use mmap if there are no usable arenas.
	(_int_malloc): Likewise.
	(__libc_malloc): Don't fail if arena_get returns NULL.
	(_mid_memalign): Likewise.
	(__libc_calloc): Likewise.
	(__libc_realloc): Adjust for additional argument to
	malloc_printerr.
	(_int_free): Likewise.
	(malloc_consolidate): Likewise.
	(_int_realloc): Likewise.
	(_int_memalign): Don't touch corrupt arenas.
	* malloc/tst-malloc-backtrace.c: New test case.
---
 malloc/Makefile               |   5 +-
 malloc/arena.c                |  22 +++++-
 malloc/hooks.c                |  12 ++-
 malloc/malloc.c               | 173 ++++++++++++++++++++++++++----------------
 malloc/tst-malloc-backtrace.c |  50 ++++++++++++
 5 files changed, 186 insertions(+), 76 deletions(-)
 create mode 100644 malloc/tst-malloc-backtrace.c

Comments

Florian Weimer Feb. 24, 2015, 10:31 a.m. UTC | #1
On 02/24/2015 11:02 AM, Siddhesh Poyarekar wrote:
> When the malloc subsystem detects some kind of memory corruption, 
> depending on the configuration it prints the error, a backtrace, a 
> memory map and then aborts the process.  In this process, the 
> backtrace() call may result in a call to malloc, resulting in 
> various kinds of problematic behavior.

Hasn't GCC a backtrace library which does not use malloc?

> In one case, the malloc it calls may detect a corruption and call 
> backtrace again, and a stack overflow may result due to the
> infinite recursion.  In another case, the malloc it calls may
> deadlock on an arena lock with the malloc (or free, realloc, etc.)
> that detected the corruption.  In yet another case, if the program
> is linked with pthreads, backtrace may do a pthread_once
> initialization, which deadlocks on itself.
> 
> In all these cases, the program exit is not as intended.  This is 
> avoidable by marking the arena that malloc detected a corruption
> on, as unusable.

From a security point of view, this is wrong.  glibc should not try to
attempt to do even more work if a heap corruption is detected.  The
proper fix would be to use a coredump handler to print the backtrace,
completely outside the process.  This also has the advantage that the
backtrace can use separate/compressed debugging information.

Maybe from a functionality point of view, this is the right thing to do.

The test case is invalid for multiple reasons: the compiler can detect
that the pointer arithmetic before the allocated buffer is invalid.
There is a use-after-free.  Maybe it's possible to fix this with
-ffreestanding; I don't know if the glibc headers obey that, though.

In the test case, the magic constant “8” should probably be replaced
with 2 * sizeof (size_t).  Please also add a comment what this test
case is testing, it is not immediately obvious.
Siddhesh Poyarekar Feb. 24, 2015, 11:51 a.m. UTC | #2
On Tue, Feb 24, 2015 at 11:31:02AM +0100, Florian Weimer wrote:
> Hasn't GCC a backtrace library which does not use malloc?

Yes, backtrace uses it (unwinder support in libgcc_s).  It doesn't use
malloc directly, but dynamically loading the library requires malloc.

> From a security point of view, this is wrong.  glibc should not try to
> attempt to do even more work if a heap corruption is detected.  The

In the heap corruption workflow, the only extra work it does is to
mark the arena as corrupt, which is to set a single flag in a memory
location that is never used by the chunk allocator.  Everything else
works exactly the way it has been.

> proper fix would be to use a coredump handler to print the backtrace,
> completely outside the process.  This also has the advantage that the
> backtrace can use separate/compressed debugging information.

Agreed, but that would be beyond the scope of glibc.  The other
alternative would be to remove the option completely, but I don't know
if that would be a popular choice.  the backtrace and map dump is
informative, but I don't know how many developers actually understand
it.  Even if we keep it as a non-default option, we'll have to fix the
deadlock and the stack overflow.

> Maybe from a functionality point of view, this is the right thing to do.
> 
> The test case is invalid for multiple reasons: the compiler can detect
> that the pointer arithmetic before the allocated buffer is invalid.
> There is a use-after-free.  Maybe it's possible to fix this with
> -ffreestanding; I don't know if the glibc headers obey that, though.

This is intentional to induce a memory corruption that results in the
deadlock.

Siddhesh
Florian Weimer Feb. 24, 2015, 11:55 a.m. UTC | #3
On 02/24/2015 12:51 PM, Siddhesh Poyarekar wrote:
> Agreed, but that would be beyond the scope of glibc.  The other 
> alternative would be to remove the option completely, but I don't
> know if that would be a popular choice.  the backtrace and map dump
> is informative, but I don't know how many developers actually
> understand it.  Even if we keep it as a non-default option, we'll
> have to fix the deadlock and the stack overflow.

I think we should really consider removal.  Most distributions will
have core dump catchers once they start using a glibc version derived
from master.

Another issue in one of the glibc crash handlers is the user of
getenv, which will not work if a stack overflow has overwritten the
environment.

>> Maybe from a functionality point of view, this is the right thing
>> to do.
>> 
>> The test case is invalid for multiple reasons: the compiler can
>> detect that the pointer arithmetic before the allocated buffer is
>> invalid. There is a use-after-free.  Maybe it's possible to fix
>> this with -ffreestanding; I don't know if the glibc headers obey
>> that, though.
> 
> This is intentional to induce a memory corruption that results in
> the deadlock.

I get that, I was suggesting a way to future-proof the test case
against compiler changes.
Siddhesh Poyarekar Feb. 24, 2015, 12:03 p.m. UTC | #4
On Tue, Feb 24, 2015 at 12:55:59PM +0100, Florian Weimer wrote:
> I think we should really consider removal.  Most distributions will
> have core dump catchers once they start using a glibc version derived
> from master.

I don't need a lot of convincing for this, but I guess we need more
consensus.  Does anybody think that removing the backtrace and maps is
a bad idea?

> I get that, I was suggesting a way to future-proof the test case
> against compiler changes.

Ahh OK, I get it now.

Thanks,
Siddhesh
Florian Weimer Feb. 24, 2015, 12:50 p.m. UTC | #5
On 02/24/2015 01:03 PM, Siddhesh Poyarekar wrote:
> On Tue, Feb 24, 2015 at 12:55:59PM +0100, Florian Weimer wrote:
>> I think we should really consider removal.  Most distributions will
>> have core dump catchers once they start using a glibc version derived
>> from master.
> 
> I don't need a lot of convincing for this, but I guess we need more
> consensus.  Does anybody think that removing the backtrace and maps is
> a bad idea?

Previous bugs filed for this:

https://sourceware.org/bugzilla/show_bug.cgi?id=12189 (CVE-2010-3192)
https://sourceware.org/bugzilla/show_bug.cgi?id=16159

The second bug appears to be the same thing you are fixing, Siddhesh.
Paul Eggert Feb. 24, 2015, 5:51 p.m. UTC | #6
On 02/24/2015 03:51 AM, Siddhesh Poyarekar wrote:
> dynamically loading the library requires malloc

How about dynamically linking _Unwind_Backtrace etc. at startup, rather 
than doing it lazily after heap memory may been corrupted and one really 
needs the backtrace?

It is news to me that 'backtrace' can call 'malloc' the first time that 
one uses 'backtrace'.  Now that I know this, I suppose I should modify 
Emacs to do a no-op call to 'backtrace' first thing, before Emacs does 
anything important.  That way, when Emacs really needs to call 
'backtrace' due to a bad pointer or whatever, we'll have more confidence 
that the call to 'backtrace' will actually work.

But really, we shouldn't have to modify applications to work around this 
problem, and the problem should be fixed in glibc.  'backtrace' 
shouldn't require 'malloc' to work, as we want 'backtrace' to get a 
backtrace as reliably as possible even when part of memory is corrupted.
Siddhesh Poyarekar Feb. 25, 2015, 8:47 a.m. UTC | #7
On Tue, Feb 24, 2015 at 01:50:32PM +0100, Florian Weimer wrote:
> Previous bugs filed for this:
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=12189 (CVE-2010-3192)
> https://sourceware.org/bugzilla/show_bug.cgi?id=16159
> 
> The second bug appears to be the same thing you are fixing, Siddhesh.

Oh yes, in fact I forgot to mention the bz number in my submission.  I
didn't know about the first bug though.

Siddhesh
Siddhesh Poyarekar Feb. 25, 2015, 9:10 a.m. UTC | #8
On Tue, Feb 24, 2015 at 09:51:57AM -0800, Paul Eggert wrote:
> On 02/24/2015 03:51 AM, Siddhesh Poyarekar wrote:
> >dynamically loading the library requires malloc
> 
> How about dynamically linking _Unwind_Backtrace etc. at startup, rather than
> doing it lazily after heap memory may been corrupted and one really needs
> the backtrace?

That seems to be off the table because it complicates bootstrapping:

https://sourceware.org/bugzilla/show_bug.cgi?id=16159

> It is news to me that 'backtrace' can call 'malloc' the first time that one
> uses 'backtrace'.  Now that I know this, I suppose I should modify Emacs to
> do a no-op call to 'backtrace' first thing, before Emacs does anything
> important.  That way, when Emacs really needs to call 'backtrace' due to a
> bad pointer or whatever, we'll have more confidence that the call to
> 'backtrace' will actually work.
> 
> But really, we shouldn't have to modify applications to work around this
> problem, and the problem should be fixed in glibc.  'backtrace' shouldn't
> require 'malloc' to work, as we want 'backtrace' to get a backtrace as
> reliably as possible even when part of memory is corrupted.

But how useful is the backtrace anyway?  Most distributions have abrt
or equivalent to get much more useful information from crashes from
programs and that method seems much more secure since they're doing it
from a separate process context and not from within a process that has
corrupted its heap.

Siddhesh
Florian Weimer Feb. 25, 2015, 9:23 a.m. UTC | #9
On 02/24/2015 06:51 PM, Paul Eggert wrote:

> But really, we shouldn't have to modify applications to work around this
> problem, and the problem should be fixed in glibc.  'backtrace'
> shouldn't require 'malloc' to work, as we want 'backtrace' to get a
> backtrace as reliably as possible even when part of memory is corrupted.

That's simply not possible with an in-process backtrace.  As I already
said, I think it is a bad idea to continue running after detecting a
security violation.  The idea that it's somehow possible to recover from
stack or heap corruption is fundamentally flawed.
Paul Eggert Feb. 26, 2015, 7:32 a.m. UTC | #10
Florian Weimer wrote:
> On 02/24/2015 06:51 PM, Paul Eggert wrote:
>
>> >'backtrace' shouldn't require 'malloc' to work, as we want 'backtrace' to get a
>> >backtrace as reliably as possible even when part of memory is corrupted.

> That's simply not possible with an in-process backtrace.

Sure it is.  I wasn't asking for backtrace to *always* work -- obviously that's 
impossible in general.  I'm asking only that it work as reliably as possible.

Siddhesh Poyarekar wrote:
 > But how useful is the backtrace anyway?

It's used reasonably often by people who report Emacs bugs and who didn't happen 
to have Emacs running under a debugger at the time.  The procedure is documented 
in the Emacs manual, at:

http://www.gnu.org/software/emacs/manual/html_node/emacs/Crashing.html

At this point we have a workaround (namely, have 'main' call 'backtrace' first 
thing, as a no-op) so it's not high priority to fix the problem in glibc. 
Still, it's an annoyance.
Carlos O'Donell Feb. 26, 2015, 7:26 p.m. UTC | #11
On 02/25/2015 04:23 AM, Florian Weimer wrote:
> On 02/24/2015 06:51 PM, Paul Eggert wrote:
> 
>> But really, we shouldn't have to modify applications to work around this
>> problem, and the problem should be fixed in glibc.  'backtrace'
>> shouldn't require 'malloc' to work, as we want 'backtrace' to get a
>> backtrace as reliably as possible even when part of memory is corrupted.
> 
> That's simply not possible with an in-process backtrace.  As I already
> said, I think it is a bad idea to continue running after detecting a
> security violation.  The idea that it's somehow possible to recover from
> stack or heap corruption is fundamentally flawed.

You have one of two cases:

* You are finishing printing an error message and will abort(). The
  process is on it's way to being shut down and it is very hard to stop
  it from doing so without changing read-only code.

* You have called mcheck to setup a handler, or have a SIGABRT handler,
  and will longjmp back to a point in the program execution where you
  will reinitialize the entire runtime or as much as you can.

The latter is a valid and presently supported mode of operation. As a
conservative project we have allowed such behaviours for developers
that are writing robust code. Those developers make a choice about the
safety and trade-off in the face of errors, not all of which are
security issues. Consider rad-hardened systems, robust systems with
failing memory, etc. etc.

Cheers,
Carlos.
Carlos O'Donell Feb. 26, 2015, 10:18 p.m. UTC | #12
On 02/24/2015 05:02 AM, Siddhesh Poyarekar wrote:
> When the malloc subsystem detects some kind of memory corruption,
> depending on the configuration it prints the error, a backtrace, a
> memory map and then aborts the process.  In this process, the
> backtrace() call may result in a call to malloc, resulting in
> various kinds of problematic behavior.

There have been various comments about this patch and I would like to
list some supporting rationale for this patch:

(1) Delaying the abort is bad for security.

Problem: The library should abort() immediately from a security perspective.

Solution: Don't delay, call abort() immediately.

We all agree that delaying the abort is bad from a security perspective.
However, the present solution is a trade-off between providing useful
diagnostics and *then* aborting.

I'm against removing the corruption detection messages because they
are useful for *me*, and when abrt doesn't work or doesn't work
correctly (see 3).

The best argument is that the default should be to call abort()
immediately, and I don't disagree with that, but I still find the
backtrace useful, and supportable, and would argue to keep it in
place as a choice to developers and system integrators.

At present if you want to log something during an abort() sequence
you register a SIGABRT handler, longjmp to somewhere safe, try to
log something, and then _exit(). You may object to this, but it's
a valid sequence of events that we support today, and the only way to
robustly support this without a custom malloc() is to make malloc()
more robust against corruption *or* provide a way for the user to
indicate that robust malloc is now desired (after the return from
the longjmp). Either way the infrastructure added here is needed.

In summary: Be conservative, keep the backtrace, consider making
it non-default, followed by potential switch to prevent abuse of
robust malloc.

(2) Use a simpler backtrace.
- Use a backtrace that doesn't require dlopen, or malloc.

One doesn't exist, and would be hard to create, just look at the
work it took to get the unwinder to the level that it is in gcc
today.

In summary: Not an immediate option.

(3) Use a coredump handler.
- Like the Fedora abrtd.

Doesn't support 3rd party ISV applications that aren't packed
with the OS.

Fedora defaults /etc/abrt/abrt-action-save-package-data.conf to:
ProcessUnpackaged = no

https://github.com/abrt/abrt/wiki/FAQ

Won't change any time soon and therefore means the in-process
backtrace continues to be useful.

(4) During corruption environment variables can't be trusted.
- The default shutdown process looks for LIBC_FATAL_STDERR_ env var
  to determine if it should write to /dev/tty or stderr.

Orthogonal to the issue at hand and should be discussed.

(5) Dynamically link with _Unwind_Backtrace to avoid dlopen's malloc.
- Link dynamically with libgcc_s.so in order to have immediate access
  to the unwinder.

Complicates the bootstrap process. Loading early also penalizes all
applications when only those that want backtrace should have a need
to load libgcc_s.so.

The general argument here, along with Rich's arguments about libpthread
are that de-modulaizing glibc makes maintenance and implementation
easier. This is true, but without care it also makes it brittle as you
start to depend on everything else and need strict adherence to development
protocols to prevent creep.

~~~~

As it is I see this patch as a win.

> In one case, the malloc it calls may detect a corruption and call
> backtrace again, and a stack overflow may result due to the infinite
> recursion.  In another case, the malloc it calls may deadlock on an
> arena lock with the malloc (or free, realloc, etc.) that detected the
> corruption.  In yet another case, if the program is linked with
> pthreads, backtrace may do a pthread_once initialization, which
> deadlocks on itself.
> 
> In all these cases, the program exit is not as intended.  This is
> avoidable by marking the arena that malloc detected a corruption on,
> as unusable.  The following patch does that.  Features of this patch
> are as follows:
> 
> - A flag is added to the mstate struct of the arena to indicate if the
>   arena is corrupt.

OK.

> - The flag is checked whenever malloc functions try to get a lock on
>   an arena.  If the arena is unusable, a NULL is returned, causing the
>   malloc to use mmap or try the next arena.

OK.

> - malloc_printerr sets the corrupt flag on the arena when it detects a
>   corruption

After which the process will eventually call abort or the users' registered
abort function via mcheck?

Which means either a SIGABRT handler runs and longjmp's or a user function
that does not abort are the only ways back to the running program.

OK.

> - free does not concern itself with the flag at all.  It is not
>   important since the backtrace workflow does not need free.  A free
>   in a parallel thread may cause another corruption, but that's not
>   new

OK.

> - The flag check and set are not atomic and may race.  This is fine
>   since we don't care about contention during the flag check.  We want
>   to make sure that the malloc call in the backtrace does not trip on
>   itself and all that action happens in the same thread and not across
>   threads.

OK.

> I verified that the test case does not show any regressions due to
> this patch.  I also ran the malloc benchmarks and found an
> insignificant difference in timings (< 2%).
> 
> 	* malloc/Makefile (tests): New test case tst-malloc-backtrace.
> 	* malloc/arena.c (arena_lock): Check if arena is corrupt.
> 	(reused_arena): Find a non-corrupt arena.
> 	(heap_trim): Pass arena to unlink.
> 	* malloc/hooks.c (malloc_check_get_size): Pass arena to
> 	malloc_printerr.
> 	(top_check): Likewise.
> 	(free_check): Likewise.
> 	(realloc_check): Likewise.
> 	* malloc/malloc.c (malloc_printerr): Add arena argument.
> 	(unlink): Likewise.
> 	(munmap_chunk): Adjust.
> 	(ARENA_CORRUPTION_BIT): New macro.
> 	(arena_is_corrupt): Likewise.
> 	(set_arena_corrupt): Likewise.
> 	(sysmalloc): Use mmap if there are no usable arenas.
> 	(_int_malloc): Likewise.
> 	(__libc_malloc): Don't fail if arena_get returns NULL.
> 	(_mid_memalign): Likewise.
> 	(__libc_calloc): Likewise.
> 	(__libc_realloc): Adjust for additional argument to
> 	malloc_printerr.
> 	(_int_free): Likewise.
> 	(malloc_consolidate): Likewise.
> 	(_int_realloc): Likewise.
> 	(_int_memalign): Don't touch corrupt arenas.
> 	* malloc/tst-malloc-backtrace.c: New test case.

I have a nit about the test case, like Florian I think it's a bit
fragile. I've made a suggestion for an alternate test case. I don't
know if it will work though, but we need a better way to corrupt
the chunk header.

> ---
>  malloc/Makefile               |   5 +-
>  malloc/arena.c                |  22 +++++-
>  malloc/hooks.c                |  12 ++-
>  malloc/malloc.c               | 173 ++++++++++++++++++++++++++----------------
>  malloc/tst-malloc-backtrace.c |  50 ++++++++++++
>  5 files changed, 186 insertions(+), 76 deletions(-)
>  create mode 100644 malloc/tst-malloc-backtrace.c
> 
> diff --git a/malloc/Makefile b/malloc/Makefile
> index 5f68a79..f385d9e 100644
> --- a/malloc/Makefile
> +++ b/malloc/Makefile
> @@ -27,7 +27,7 @@ headers := $(dist-headers) obstack.h mcheck.h
>  tests := mallocbug tst-malloc tst-valloc tst-calloc tst-obstack \
>  	 tst-mallocstate tst-mcheck tst-mallocfork tst-trim1 \
>  	 tst-malloc-usable tst-realloc tst-posix_memalign \
> -	 tst-pvalloc tst-memalign tst-mallopt
> +	 tst-pvalloc tst-memalign tst-mallopt tst-malloc-backtrace

OK.

>  test-srcs = tst-mtrace
>  
>  routines = malloc morecore mcheck mtrace obstack
> @@ -42,6 +42,9 @@ extra-libs-others = $(extra-libs)
>  libmemusage-routines = memusage
>  libmemusage-inhibit-o = $(filter-out .os,$(object-suffixes))
>  
> +$(objpfx)tst-malloc-backtrace: $(common-objpfx)nptl/libpthread.so \
> +			       $(common-objpfx)nptl/libpthread_nonshared.a
> +

OK.

>  # These should be removed by `make clean'.
>  extra-objs = mcheck-init.o libmcheck.a
>  
> diff --git a/malloc/arena.c b/malloc/arena.c
> index 8af51f0..bb1aad0 100644
> --- a/malloc/arena.c
> +++ b/malloc/arena.c
> @@ -99,7 +99,7 @@ int __malloc_initialized = -1;
>    } while (0)
>  
>  #define arena_lock(ptr, size) do {					      \
> -      if (ptr)								      \
> +      if (ptr && !arena_is_corrupt (ptr))				      \

OK.

>          (void) mutex_lock (&ptr->mutex);				      \
>        else								      \
>          ptr = arena_get2 (ptr, (size), NULL);				      \
> @@ -686,7 +686,7 @@ heap_trim (heap_info *heap, size_t pad)
>        if (!prev_inuse (p)) /* consolidate backward */
>          {
>            p = prev_chunk (p);
> -          unlink (p, bck, fwd);
> +          unlink (ar_ptr, p, bck, fwd);

OK.

>          }
>        assert (((unsigned long) ((char *) p + new_size) & (pagesz - 1)) == 0);
>        assert (((char *) p + new_size) == ((char *) heap + heap->size));
> @@ -802,7 +802,7 @@ reused_arena (mstate avoid_arena)
>    result = next_to_use;
>    do
>      {
> -      if (!mutex_trylock (&result->mutex))
> +      if (!arena_is_corrupt (result) && !mutex_trylock (&result->mutex))

OK.

>          goto out;
>  
>        result = result->next;
> @@ -814,7 +814,21 @@ reused_arena (mstate avoid_arena)
>    if (result == avoid_arena)
>      result = result->next;
>  
> -  /* No arena available.  Wait for the next in line.  */
> +  /* Make sure that the arena we get is not corrupted.  */
> +  mstate begin = result;
> +  while (arena_is_corrupt (result) || result == avoid_arena)
> +    {
> +      result = result->next;
> +      if (result == begin)
> +	break;
> +    }

OK.

> +
> +  /* We could not find any arena that was either not corrupted or not the one
> +     we wanted to avoid.  */
> +  if (result == begin || result == avoid_arena)
> +    return NULL;

OK. This is going to happen over and over again, but the pointer chase
performance is probably minimal and we're in a failing scenario.

> +
> +  /* No arena available without contention.  Wait for the next in line.  */
>    LIBC_PROBE (memory_arena_reuse_wait, 3, &result->mutex, result, avoid_arena);
>    (void) mutex_lock (&result->mutex);
>  
> diff --git a/malloc/hooks.c b/malloc/hooks.c
> index 0c4816f..9303fe5 100644
> --- a/malloc/hooks.c
> +++ b/malloc/hooks.c
> @@ -112,7 +112,8 @@ malloc_check_get_size (mchunkptr p)
>        if (c <= 0 || size < (c + 2 * SIZE_SZ))
>          {
>            malloc_printerr (check_action, "malloc_check_get_size: memory corruption",
> -                           chunk2mem (p));
> +                           chunk2mem (p),
> +			   chunk_is_mmapped (p) ? NULL : arena_for_chunk (p));

OK.

>            return 0;
>          }
>      }
> @@ -237,7 +238,8 @@ top_check (void)
>          (char *) t + chunksize (t) == mp_.sbrk_base + main_arena.system_mem)))
>      return 0;
>  
> -  malloc_printerr (check_action, "malloc: top chunk is corrupt", t);
> +  malloc_printerr (check_action, "malloc: top chunk is corrupt", t,
> +		   &main_arena);

OK.

>  
>    /* Try to set up a new top chunk. */
>    brk = MORECORE (0);
> @@ -295,7 +297,8 @@ free_check (void *mem, const void *caller)
>      {
>        (void) mutex_unlock (&main_arena.mutex);
>  
> -      malloc_printerr (check_action, "free(): invalid pointer", mem);
> +      malloc_printerr (check_action, "free(): invalid pointer", mem,
> +		       &main_arena);

OK.

>        return;
>      }
>    if (chunk_is_mmapped (p))
> @@ -333,7 +336,8 @@ realloc_check (void *oldmem, size_t bytes, const void *caller)
>    (void) mutex_unlock (&main_arena.mutex);
>    if (!oldp)
>      {
> -      malloc_printerr (check_action, "realloc(): invalid pointer", oldmem);
> +      malloc_printerr (check_action, "realloc(): invalid pointer", oldmem,
> +		       &main_arena);

OK.

>        return malloc_check (bytes, NULL);
>      }
>    const INTERNAL_SIZE_T oldsize = chunksize (oldp);
> diff --git a/malloc/malloc.c b/malloc/malloc.c
> index f361bad..452f036 100644
> --- a/malloc/malloc.c
> +++ b/malloc/malloc.c
> @@ -1059,7 +1059,7 @@ static void*  _int_realloc(mstate, mchunkptr, INTERNAL_SIZE_T,
>  static void*  _int_memalign(mstate, size_t, size_t);
>  static void*  _mid_memalign(size_t, size_t, void *);
>  
> -static void malloc_printerr(int action, const char *str, void *ptr);
> +static void malloc_printerr(int action, const char *str, void *ptr, mstate av);

OK.

>  
>  static void* internal_function mem2mem_check(void *p, size_t sz);
>  static int internal_function top_check(void);
> @@ -1411,11 +1411,11 @@ typedef struct malloc_chunk *mbinptr;
>  #define last(b)      ((b)->bk)
>  
>  /* Take a chunk off a bin list */
> -#define unlink(P, BK, FD) {                                            \
> +#define unlink(AV, P, BK, FD) {                                            \
>      FD = P->fd;								      \
>      BK = P->bk;								      \
>      if (__builtin_expect (FD->bk != P || BK->fd != P, 0))		      \
> -      malloc_printerr (check_action, "corrupted double-linked list", P);      \
> +      malloc_printerr (check_action, "corrupted double-linked list", P, AV);  \

OK.

>      else {								      \
>          FD->bk = BK;							      \
>          BK->fd = FD;							      \
> @@ -1424,7 +1424,8 @@ typedef struct malloc_chunk *mbinptr;
>  	    if (__builtin_expect (P->fd_nextsize->bk_nextsize != P, 0)	      \
>  		|| __builtin_expect (P->bk_nextsize->fd_nextsize != P, 0))    \
>  	      malloc_printerr (check_action,				      \
> -			       "corrupted double-linked list (not small)", P);\
> +			       "corrupted double-linked list (not small)",    \
> +			       P, AV);					      \

OK.

>              if (FD->fd_nextsize == NULL) {				      \
>                  if (P->fd_nextsize == P)				      \
>                    FD->fd_nextsize = FD->bk_nextsize = FD;		      \
> @@ -1656,6 +1657,15 @@ typedef struct malloc_chunk *mfastbinptr;
>  #define set_noncontiguous(M)   ((M)->flags |= NONCONTIGUOUS_BIT)
>  #define set_contiguous(M)      ((M)->flags &= ~NONCONTIGUOUS_BIT)
>  
> +/* ARENA_CORRUPTION_BIT is set if a memory corruption was detected on the
> +   arena.  Such an arena is no longer used to allocate chunks.  Chunks
> +   allocated in that arena before detecting corruption are not freed.  */
> +
> +#define ARENA_CORRUPTION_BIT (4U)
> +
> +#define arena_is_corrupt(A)	(((A)->flags & ARENA_CORRUPTION_BIT))
> +#define set_arena_corrupt(A)	((A)->flags |= ARENA_CORRUPTION_BIT)

OK.

> +
>  /*
>     Set value of max_fast.
>     Use impossibly small value if 0.
> @@ -2280,8 +2290,9 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
>       rather than expanding top.
>     */
>  
> -  if ((unsigned long) (nb) >= (unsigned long) (mp_.mmap_threshold) &&
> -      (mp_.n_mmaps < mp_.n_mmaps_max))
> +  if (av == NULL
> +      || ((unsigned long) (nb) >= (unsigned long) (mp_.mmap_threshold)
> +	  && (mp_.n_mmaps < mp_.n_mmaps_max)))

OK.

>      {
>        char *mm;           /* return value from mmap call*/
>  
> @@ -2354,6 +2365,10 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
>          }
>      }
>  
> +  /* There are no usable arenas and mmap also failed.  */
> +  if (av == NULL)
> +    return 0;

OK.

> +
>    /* Record incoming configuration of top */
>  
>    old_top = av->top;
> @@ -2528,7 +2543,8 @@ sysmalloc (INTERNAL_SIZE_T nb, mstate av)
>            else if (contiguous (av) && old_size && brk < old_end)
>              {
>                /* Oops!  Someone else killed our space..  Can't touch anything.  */
> -              malloc_printerr (3, "break adjusted to free malloc space", brk);
> +              malloc_printerr (3, "break adjusted to free malloc space", brk,
> +			       av);

OK.

>              }
>  
>            /*
> @@ -2818,7 +2834,7 @@ munmap_chunk (mchunkptr p)
>    if (__builtin_expect (((block | total_size) & (GLRO (dl_pagesize) - 1)) != 0, 0))
>      {
>        malloc_printerr (check_action, "munmap_chunk(): invalid pointer",
> -                       chunk2mem (p));
> +                       chunk2mem (p), NULL);

OK.

>        return;
>      }
>  
> @@ -2888,22 +2904,19 @@ __libc_malloc (size_t bytes)
>  
>    arena_get (ar_ptr, bytes);
>  
> -  if (!ar_ptr)
> -    return 0;
> -

OK.

>    victim = _int_malloc (ar_ptr, bytes);
> -  if (!victim)
> +  /* Retry with another arena only if we were able to find a usable arena
> +     before.  */
> +  if (!victim && ar_ptr != NULL)


OK.

>      {
>        LIBC_PROBE (memory_malloc_retry, 1, bytes);
>        ar_ptr = arena_get_retry (ar_ptr, bytes);
> -      if (__builtin_expect (ar_ptr != NULL, 1))
> -        {
> -          victim = _int_malloc (ar_ptr, bytes);
> -          (void) mutex_unlock (&ar_ptr->mutex);
> -        }
> +      victim = _int_malloc (ar_ptr, bytes);

OK. Because we now have a fallback for ar_ptr == NULL.

>      }
> -  else
> +
> +  if (ar_ptr != NULL)
>      (void) mutex_unlock (&ar_ptr->mutex);

OK.

> +
>    assert (!victim || chunk_is_mmapped (mem2chunk (victim)) ||
>            ar_ptr == arena_for_chunk (mem2chunk (victim)));
>    return victim;
> @@ -2979,6 +2992,11 @@ __libc_realloc (void *oldmem, size_t bytes)
>    /* its size */
>    const INTERNAL_SIZE_T oldsize = chunksize (oldp);
>  
> +  if (chunk_is_mmapped (oldp))
> +    ar_ptr = NULL;
> +  else
> +    ar_ptr = arena_for_chunk (oldp);

OK.

> +
>    /* Little security check which won't hurt performance: the
>       allocator never wrapps around at the end of the address space.
>       Therefore we can exclude some size values which might appear
> @@ -2986,7 +3004,8 @@ __libc_realloc (void *oldmem, size_t bytes)
>    if (__builtin_expect ((uintptr_t) oldp > (uintptr_t) -oldsize, 0)
>        || __builtin_expect (misaligned_chunk (oldp), 0))
>      {
> -      malloc_printerr (check_action, "realloc(): invalid pointer", oldmem);
> +      malloc_printerr (check_action, "realloc(): invalid pointer", oldmem,
> +		       ar_ptr);

OK.

>        return NULL;
>      }
>  
> @@ -3015,10 +3034,8 @@ __libc_realloc (void *oldmem, size_t bytes)
>        return newmem;
>      }
>  
> -  ar_ptr = arena_for_chunk (oldp);
>    (void) mutex_lock (&ar_ptr->mutex);
>  
> -

OK.

>    newp = _int_realloc (ar_ptr, oldp, oldsize, nb);
>  
>    (void) mutex_unlock (&ar_ptr->mutex);
> @@ -3093,22 +3110,18 @@ _mid_memalign (size_t alignment, size_t bytes, void *address)
>      }
>  
>    arena_get (ar_ptr, bytes + alignment + MINSIZE);
> -  if (!ar_ptr)
> -    return 0;
>  

OK.

>    p = _int_memalign (ar_ptr, alignment, bytes);
> -  if (!p)
> +  if (!p && ar_ptr != NULL)
>      {

OK.

>        LIBC_PROBE (memory_memalign_retry, 2, bytes, alignment);
>        ar_ptr = arena_get_retry (ar_ptr, bytes);
> -      if (__builtin_expect (ar_ptr != NULL, 1))
> -        {
> -          p = _int_memalign (ar_ptr, alignment, bytes);
> -          (void) mutex_unlock (&ar_ptr->mutex);
> -        }
> +      p = _int_memalign (ar_ptr, alignment, bytes);

OK.

>      }
> -  else
> +
> +  if (ar_ptr != NULL)
>      (void) mutex_unlock (&ar_ptr->mutex);
> +

OK.

>    assert (!p || chunk_is_mmapped (mem2chunk (p)) ||
>            ar_ptr == arena_for_chunk (mem2chunk (p)));
>    return p;
> @@ -3187,47 +3200,53 @@ __libc_calloc (size_t n, size_t elem_size)
>    sz = bytes;
>  
>    arena_get (av, sz);
> -  if (!av)
> -    return 0;
> -
> -  /* Check if we hand out the top chunk, in which case there may be no
> -     need to clear. */
> +  if (av)
> +    {
> +      /* Check if we hand out the top chunk, in which case there may be no
> +	 need to clear. */
>  #if MORECORE_CLEARS
> -  oldtop = top (av);
> -  oldtopsize = chunksize (top (av));
> +      oldtop = top (av);
> +      oldtopsize = chunksize (top (av));
>  # if MORECORE_CLEARS < 2
> -  /* Only newly allocated memory is guaranteed to be cleared.  */
> -  if (av == &main_arena &&
> -      oldtopsize < mp_.sbrk_base + av->max_system_mem - (char *) oldtop)
> -    oldtopsize = (mp_.sbrk_base + av->max_system_mem - (char *) oldtop);
> +      /* Only newly allocated memory is guaranteed to be cleared.  */
> +      if (av == &main_arena &&
> +	  oldtopsize < mp_.sbrk_base + av->max_system_mem - (char *) oldtop)
> +	oldtopsize = (mp_.sbrk_base + av->max_system_mem - (char *) oldtop);
>  # endif
> -  if (av != &main_arena)
> +      if (av != &main_arena)
> +	{
> +	  heap_info *heap = heap_for_ptr (oldtop);
> +	  if (oldtopsize < (char *) heap + heap->mprotect_size - (char *) oldtop)
> +	    oldtopsize = (char *) heap + heap->mprotect_size - (char *) oldtop;
> +	}
> +#endif
> +    }
> +  else
>      {
> -      heap_info *heap = heap_for_ptr (oldtop);
> -      if (oldtopsize < (char *) heap + heap->mprotect_size - (char *) oldtop)
> -        oldtopsize = (char *) heap + heap->mprotect_size - (char *) oldtop;
> +      /* No usable arenas.  */
> +      oldtop = 0;
> +      oldtopsize = 0;

OK.

>      }
> -#endif
>    mem = _int_malloc (av, sz);
>  
>  
>    assert (!mem || chunk_is_mmapped (mem2chunk (mem)) ||
>            av == arena_for_chunk (mem2chunk (mem)));
>  
> -  if (mem == 0)
> +  if (mem == 0 && av != NULL)
>      {
>        LIBC_PROBE (memory_calloc_retry, 1, sz);
>        av = arena_get_retry (av, sz);
> -      if (__builtin_expect (av != NULL, 1))
> -        {
> -          mem = _int_malloc (av, sz);
> -          (void) mutex_unlock (&av->mutex);
> -        }
> -      if (mem == 0)
> -        return 0;
> +      mem = _int_malloc (av, sz);

OK.

There certainly are at least 3 copies of this pattern, nice to see it go.

>      }
> -  else
> +
> +  if (av != NULL)
>      (void) mutex_unlock (&av->mutex);
> +
> +  /* Allocation failed even after a retry.  */
> +  if (mem == 0)
> +    return 0;
> +
>    p = mem2chunk (mem);
>  
>    /* Two optional cases in which clearing not necessary */
> @@ -3323,6 +3342,16 @@ _int_malloc (mstate av, size_t bytes)
>  
>    checked_request2size (bytes, nb);
>  
> +  /* There are no usable arenas.  Fall back to sysmalloc to get a chunk from
> +     mmap.  */
> +  if (__glibc_unlikely (av == NULL))
> +    {
> +      void *p = sysmalloc (nb, av);
> +      if (p != NULL)
> +	alloc_perturb (p, bytes);
> +      return p;
> +    }

OK.

> +
>    /*
>       If the size qualifies as a fastbin, first check corresponding bin.
>       This code is safe to execute even if av is not yet initialized, so we
> @@ -3348,7 +3377,7 @@ _int_malloc (mstate av, size_t bytes)
>              {
>                errstr = "malloc(): memory corruption (fast)";
>              errout:
> -              malloc_printerr (check_action, errstr, chunk2mem (victim));
> +              malloc_printerr (check_action, errstr, chunk2mem (victim), av);

OK.

>                return NULL;
>              }
>            check_remalloced_chunk (av, victim, nb);
> @@ -3437,7 +3466,7 @@ _int_malloc (mstate av, size_t bytes)
>            if (__builtin_expect (victim->size <= 2 * SIZE_SZ, 0)
>                || __builtin_expect (victim->size > av->system_mem, 0))
>              malloc_printerr (check_action, "malloc(): memory corruption",
> -                             chunk2mem (victim));
> +                             chunk2mem (victim), av);

OK.

>            size = chunksize (victim);
>  
>            /*
> @@ -3584,7 +3613,7 @@ _int_malloc (mstate av, size_t bytes)
>                  victim = victim->fd;
>  
>                remainder_size = size - nb;
> -              unlink (victim, bck, fwd);
> +              unlink (av, victim, bck, fwd);

OK.

>  
>                /* Exhaust */
>                if (remainder_size < MINSIZE)
> @@ -3689,7 +3718,7 @@ _int_malloc (mstate av, size_t bytes)
>                remainder_size = size - nb;
>  
>                /* unlink */
> -              unlink (victim, bck, fwd);
> +              unlink (av, victim, bck, fwd);

OK.

>  
>                /* Exhaust */
>                if (remainder_size < MINSIZE)
> @@ -3829,7 +3858,7 @@ _int_free (mstate av, mchunkptr p, int have_lock)
>      errout:
>        if (!have_lock && locked)
>          (void) mutex_unlock (&av->mutex);
> -      malloc_printerr (check_action, errstr, chunk2mem (p));
> +      malloc_printerr (check_action, errstr, chunk2mem (p), av);

OK.

>        return;
>      }
>    /* We know that each chunk is at least MINSIZE bytes in size or a
> @@ -3967,7 +3996,7 @@ _int_free (mstate av, mchunkptr p, int have_lock)
>        prevsize = p->prev_size;
>        size += prevsize;
>        p = chunk_at_offset(p, -((long) prevsize));
> -      unlink(p, bck, fwd);
> +      unlink(av, p, bck, fwd);

OK.

>      }
>  
>      if (nextchunk != av->top) {
> @@ -3976,7 +4005,7 @@ _int_free (mstate av, mchunkptr p, int have_lock)
>  
>        /* consolidate forward */
>        if (!nextinuse) {
> -	unlink(nextchunk, bck, fwd);
> +	unlink(av, nextchunk, bck, fwd);

OK.

>  	size += nextsize;
>        } else
>  	clear_inuse_bit_at_offset(nextchunk, 0);
> @@ -4137,7 +4166,7 @@ static void malloc_consolidate(mstate av)
>  	    prevsize = p->prev_size;
>  	    size += prevsize;
>  	    p = chunk_at_offset(p, -((long) prevsize));
> -	    unlink(p, bck, fwd);
> +	    unlink(av, p, bck, fwd);

OK.

>  	  }
>  
>  	  if (nextchunk != av->top) {
> @@ -4145,7 +4174,7 @@ static void malloc_consolidate(mstate av)
>  
>  	    if (!nextinuse) {
>  	      size += nextsize;
> -	      unlink(nextchunk, bck, fwd);
> +	      unlink(av, nextchunk, bck, fwd);

OK.

>  	    } else
>  	      clear_inuse_bit_at_offset(nextchunk, 0);
>  
> @@ -4214,7 +4243,7 @@ _int_realloc(mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
>      {
>        errstr = "realloc(): invalid old size";
>      errout:
> -      malloc_printerr (check_action, errstr, chunk2mem (oldp));
> +      malloc_printerr (check_action, errstr, chunk2mem (oldp), av);

OK.

>        return NULL;
>      }
>  
> @@ -4260,7 +4289,7 @@ _int_realloc(mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
>                 (unsigned long) (nb))
>          {
>            newp = oldp;
> -          unlink (next, bck, fwd);
> +          unlink (av, next, bck, fwd);

OK.

>          }
>  
>        /* allocate, copy, free */
> @@ -4455,6 +4484,10 @@ _int_memalign (mstate av, size_t alignment, size_t bytes)
>  static int
>  mtrim (mstate av, size_t pad)
>  {
> +  /* Don't touch corrupt arenas.  */
> +  if (arena_is_corrupt (av))
> +    return 0;

OK.

> +
>    /* Ensure initialization/consolidation */
>    malloc_consolidate (av);
>  
> @@ -4945,8 +4978,14 @@ libc_hidden_def (__libc_mallopt)
>  extern char **__libc_argv attribute_hidden;
>  
>  static void
> -malloc_printerr (int action, const char *str, void *ptr)
> +malloc_printerr (int action, const char *str, void *ptr, mstate ar_ptr)
>  {
> +  /* Avoid using this arena in future.  We do not attempt to synchronize this
> +     with anything else because we minimally want to ensure that __libc_message
> +     gets its resources safely without stumbling on the current corruption.  */
> +  if (ar_ptr)
> +    set_arena_corrupt (ar_ptr);

OK.

> +
>    if ((action & 5) == 5)
>      __libc_message (action & 2, "%s\n", str);
>    else if (action & 1)
> diff --git a/malloc/tst-malloc-backtrace.c b/malloc/tst-malloc-backtrace.c
> new file mode 100644
> index 0000000..796a42f
> --- /dev/null
> +++ b/malloc/tst-malloc-backtrace.c
> @@ -0,0 +1,50 @@
> +/* Verify that backtrace does not deadlock on itself on memory corruption.
> +   Copyright (C) 2015 Free Software Foundation, Inc.
> +   This file is part of the GNU C Library.
> +
> +   The GNU C Library is free software; you can redistribute it and/or
> +   modify it under the terms of the GNU Lesser General Public
> +   License as published by the Free Software Foundation; either
> +   version 2.1 of the License, or (at your option) any later version.
> +
> +   The GNU C Library is distributed in the hope that it will be useful,
> +   but WITHOUT ANY WARRANTY; without even the implied warranty of
> +   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> +   Lesser General Public License for more details.
> +
> +   You should have received a copy of the GNU Lesser General Public
> +   License along with the GNU C Library; if not, see
> +   <http://www.gnu.org/licenses/>.  */
> +
> +
> +#include <stdlib.h>
> +
> +#define SIZE 4096
> +
> +/* Wrap free with a function to prevent gcc from optimizing it out.  */
> +static void
> +__attribute__((noinline))
> +call_free (void *ptr)
> +{
> +  free (ptr);
> +  *(size_t *)(ptr - 8) = 1;

Nit: Document magic constant. Why did you choose it?

Wouldn't a more compiler-safe test use memset to destroy
the inter-block canaries? Knoing ptr1 and ptr2's location
you could more easily zero out the space inbetween without
fear of a SIGSEGV, knowing the pages would be mapped on the
primary heap. This would corrupt the block and that corruption
would be detected at free.

This would not be easily detected or removed by the compiler?

> +}
> +
> +int
> +do_test (void)
> +{
> +  void *ptr1 = malloc (SIZE);
> +  void *ptr2 = malloc (SIZE);
> +
> +  call_free ((void *) ptr1);
> +  ptr1 = malloc (SIZE);
> +
> +  /* Not reached.  The return statement is to put ptr2 into use so that gcc
> +     doesn't optimize out that malloc call.  */
> +  return (ptr1 == ptr2);
> +}
> +
> +#define TEST_FUNCTION do_test ()
> +#define EXPECTED_SIGNAL SIGABRT
> +
> +#include "../test-skeleton.c"
> 

Cheers,
Carlos.
Siddhesh Poyarekar Feb. 27, 2015, 7:16 a.m. UTC | #13
I am not convinced about the utility of backtrace in the heap
corruption cases, so I'm going to try and argue in favour of its
removal.  From the review, Carlos seems OK with the patch but not the
test case, so I'll work on the latter.

TL;DR; there are two distinct issues to think about and we should
tackle them separately:

1. How useful is the backtrace in a memory corruption case

2. How do we ensure that backtrace is robust when called from weird
   contexts, especially from within a signal handler.

On Thu, Feb 26, 2015 at 05:18:04PM -0500, Carlos O'Donell wrote:
> There have been various comments about this patch and I would like to
> list some supporting rationale for this patch:
> 
> (1) Delaying the abort is bad for security.
> 
> Problem: The library should abort() immediately from a security perspective.
> 
> Solution: Don't delay, call abort() immediately.
> 
> We all agree that delaying the abort is bad from a security perspective.
> However, the present solution is a trade-off between providing useful
> diagnostics and *then* aborting.
> 
> I'm against removing the corruption detection messages because they
> are useful for *me*, and when abrt doesn't work or doesn't work
> correctly (see 3).

I don't think the argument is for removing corruption detection
messages.  It is for removing the backtrace and map info from it since
the former touches the heap and the latter is mostly useless without
the former.  The corruption detection message remains and IMO, it
gives sufficient information about what kind of corruption caused the
program exit.

Removing the backtrace removes the information about where in the
program the error occured, but that is mostly useless because the
point at which the corruption is detected is almost never the point
where the corruption happened.

> The best argument is that the default should be to call abort()
> immediately, and I don't disagree with that, but I still find the
> backtrace useful, and supportable, and would argue to keep it in
> place as a choice to developers and system integrators.
> 
> At present if you want to log something during an abort() sequence
> you register a SIGABRT handler, longjmp to somewhere safe, try to
> log something, and then _exit(). You may object to this, but it's
> a valid sequence of events that we support today, and the only way to
> robustly support this without a custom malloc() is to make malloc()
> more robust against corruption *or* provide a way for the user to
> indicate that robust malloc is now desired (after the return from
> the longjmp). Either way the infrastructure added here is needed.
> 
> In summary: Be conservative, keep the backtrace, consider making
> it non-default, followed by potential switch to prevent abuse of
> robust malloc.

I don't disagree with the idea of robustifying backtrace since that is
useful for other applications.  Howeber, to make it truly robust, a
backtrace call should never result in a malloc since it is often
called in a signal context.

We need to ensure that either the dynamic linker does not use malloc
at all in that context, or the dynamic linker does not come into the
picture at all when we're calling backtrace.  Robustifying malloc to
try and robustify backtrace seems like too tall an order.  We'll
continually run into different kinds of issues till we make malloc
completely async-cancel-safe.

> (3) Use a coredump handler.
> - Like the Fedora abrtd.
> 
> Doesn't support 3rd party ISV applications that aren't packed
> with the OS.
> 
> Fedora defaults /etc/abrt/abrt-action-save-package-data.conf to:
> ProcessUnpackaged = no
> 
> https://github.com/abrt/abrt/wiki/FAQ
> 
> Won't change any time soon and therefore means the in-process
> backtrace continues to be useful.

A coredump handler is just gravy on top of the actual feature,
i.e. dumping core, which has been there since before I was born.  A
core dump provides way more information than the backtrace can ever
hope to provide and the fact that we abort on exit, guarantees that we
get a core dump if it is enabled.  If it helps, we can add a message
telling the user to enable core dumps if they haven't already.

> (5) Dynamically link with _Unwind_Backtrace to avoid dlopen's malloc.
> - Link dynamically with libgcc_s.so in order to have immediate access
>   to the unwinder.
> 
> Complicates the bootstrap process. Loading early also penalizes all
> applications when only those that want backtrace should have a need
> to load libgcc_s.so.

A new function backtrace_init() could help break this deadlock.  Or
something similar that loads libgcc_s.so.1 so that backtrace doesn't
result in dynamic linker invocation during the backtrace call.

> > +#include <stdlib.h>
> > +
> > +#define SIZE 4096
> > +
> > +/* Wrap free with a function to prevent gcc from optimizing it out.  */
> > +static void
> > +__attribute__((noinline))
> > +call_free (void *ptr)
> > +{
> > +  free (ptr);
> > +  *(size_t *)(ptr - 8) = 1;
> 
> Nit: Document magic constant. Why did you choose it?
> 
> Wouldn't a more compiler-safe test use memset to destroy
> the inter-block canaries? Knoing ptr1 and ptr2's location
> you could more easily zero out the space inbetween without
> fear of a SIGSEGV, knowing the pages would be mapped on the
> primary heap. This would corrupt the block and that corruption
> would be detected at free.
> 
> This would not be easily detected or removed by the compiler?

OK, I'll see if I can change this.

Siddhesh
Paul Eggert Feb. 27, 2015, 7:33 a.m. UTC | #14
Siddhesh Poyarekar wrote:
> A new function backtrace_init() could help break this deadlock.

Or we could document the current behavior, which is that backtrace (0, 0) 
initializes the backtrace module and avoids the problem.  This is pretty much 
what bleeding-edge Emacs does now.  (Unfortunately this does not work on sparc 
or microblaze but those would presumably be considered bugs.)
Siddhesh Poyarekar Feb. 27, 2015, 8:22 a.m. UTC | #15
On Thu, Feb 26, 2015 at 11:33:49PM -0800, Paul Eggert wrote:
> Siddhesh Poyarekar wrote:
> >A new function backtrace_init() could help break this deadlock.
> 
> Or we could document the current behavior, which is that backtrace (0, 0)
> initializes the backtrace module and avoids the problem.  This is pretty
> much what bleeding-edge Emacs does now.  (Unfortunately this does not work
> on sparc or microblaze but those would presumably be considered bugs.)

Oh great, that is even better since it doesn't need an ABI event.

Siddhesh
Szabolcs Nagy Feb. 27, 2015, 2:20 p.m. UTC | #16
* Carlos O'Donell <carlos@redhat.com> [2015-02-26 17:18:04 -0500]:
> (1) Delaying the abort is bad for security.
> 
> Problem: The library should abort() immediately from a security perspective.
> 
> Solution: Don't delay, call abort() immediately.

i think from security pov immediate crash (__builtin_trap) is the right solution

abort is complex (eg glibc tries to fflush stdio buffers)

> (2) Use a simpler backtrace.
> - Use a backtrace that doesn't require dlopen, or malloc.
> 
> One doesn't exist, and would be hard to create, just look at the
> work it took to get the unwinder to the level that it is in gcc
> today.

dlopen("libgcc_s.so") is bad

when glibc and gcc are not the installed host tools it will use
the wrong library because nobody would expect the libc to open
shared objects of the compiler at runtime so that wont be in the
library path: eg. the testrun.sh in glibc does not expect it

> (5) Dynamically link with _Unwind_Backtrace to avoid dlopen's malloc.
> - Link dynamically with libgcc_s.so in order to have immediate access
>   to the unwinder.

as far as i'm concerned libgcc_s should not exist:
such dependency between the c runtime and the compiler is bound to
break things

(i would be happy with a dummy backtrace)
Rich Felker Feb. 27, 2015, 2:40 p.m. UTC | #17
On Fri, Feb 27, 2015 at 03:20:11PM +0100, Szabolcs Nagy wrote:
> * Carlos O'Donell <carlos@redhat.com> [2015-02-26 17:18:04 -0500]:
> > (1) Delaying the abort is bad for security.
> > 
> > Problem: The library should abort() immediately from a security perspective.
> > 
> > Solution: Don't delay, call abort() immediately.
> 
> i think from security pov immediate crash (__builtin_trap) is the right solution

Absolutely. Even a simple function call (that goes via the PLT) or a
syscall (using a sysenter/syscall pointer in the TCB) is dangerous
when the process's memory space has already been trashed by an
attacker.

> abort is complex (eg glibc tries to fflush stdio buffers)

Yes, and this is a bug. abort is required to be AS-safe, but there is
no way to do the flushing AS-safely. Attempting it in an AS-safe
manner with fallbacks may be possible, but the way it's done right now
is not safe, and such behavior does not seem useful anyway.

> > (2) Use a simpler backtrace.
> > - Use a backtrace that doesn't require dlopen, or malloc.
> > 
> > One doesn't exist, and would be hard to create, just look at the
> > work it took to get the unwinder to the level that it is in gcc
> > today.
> 
> dlopen("libgcc_s.so") is bad
> 
> when glibc and gcc are not the installed host tools it will use
> the wrong library because nobody would expect the libc to open
> shared objects of the compiler at runtime so that wont be in the
> library path: eg. the testrun.sh in glibc does not expect it
> 
> > (5) Dynamically link with _Unwind_Backtrace to avoid dlopen's malloc.
> > - Link dynamically with libgcc_s.so in order to have immediate access
> >   to the unwinder.
> 
> as far as i'm concerned libgcc_s should not exist:
> such dependency between the c runtime and the compiler is bound to
> break things
> 
> (i would be happy with a dummy backtrace)

I agree dlopen is bad but the glibc developers seem to have reasons
they don't want to change for not wanting to add -lgcc_s or -lgcc_eh
(static) related to bootstrapping. Perhaps an alternate would be to
include the unwind code directly in glibc?

BTW an equivalent issue affects pthread_cancel and causes it to crash
the program if libgcc_s.so cannot be loaded for any reason. I believe
there's an open bug report for this.

Rich
Carlos O'Donell Feb. 27, 2015, 4:33 p.m. UTC | #18
On 02/27/2015 09:20 AM, Szabolcs Nagy wrote:
> * Carlos O'Donell <carlos@redhat.com> [2015-02-26 17:18:04 -0500]:
>> (1) Delaying the abort is bad for security.
>>
>> Problem: The library should abort() immediately from a security perspective.
>>
>> Solution: Don't delay, call abort() immediately.
> 
> i think from security pov immediate crash (__builtin_trap) is the right solution
> 
> abort is complex (eg glibc tries to fflush stdio buffers)

I agree that __builtin_trap would be an option.

c.
Mike Frysinger March 2, 2015, 5:30 a.m. UTC | #19
On 26 Feb 2015 17:18, Carlos O'Donell wrote:
> On 02/24/2015 05:02 AM, Siddhesh Poyarekar wrote:
> > When the malloc subsystem detects some kind of memory corruption,
> > depending on the configuration it prints the error, a backtrace, a
> > memory map and then aborts the process.  In this process, the
> > backtrace() call may result in a call to malloc, resulting in
> > various kinds of problematic behavior.
> 
> There have been various comments about this patch and I would like to
> list some supporting rationale for this patch:
> 
> (1) Delaying the abort is bad for security.
> 
> Problem: The library should abort() immediately from a security perspective.
> 
> Solution: Don't delay, call abort() immediately.
> 
> We all agree that delaying the abort is bad from a security perspective.
> However, the present solution is a trade-off between providing useful
> diagnostics and *then* aborting.

i think the current line glibc attempts to walk is reasonable.  there are other
funcs which are arguably more important (__chk_fail & __fortify_fail) that do a
good amount of work (calling into __libc_message).  from a security pov, that is 
all bad mojo.

it might be nice if there was a configure option to enable a paranoid mode 
whereby we die as safely as possible when a failure is detected.  in Gentoo we 
have a custom handler that uses direct syscalls and logs to the system logger
before killing itself (SIGABRT & SIGKILL).

> I'm against removing the corruption detection messages because they
> are useful for *me*, and when abrt doesn't work or doesn't work
> correctly (see 3).

agreed

> The best argument is that the default should be to call abort()
> immediately, and I don't disagree with that, but I still find the
> backtrace useful, and supportable, and would argue to keep it in
> place as a choice to developers and system integrators.

i also find this useful, but we probably want to make sure we don't spend too 
much time on a complicated solution.  the proposed patch didn't seem too bad to 
me.
-mike
Rich Felker March 2, 2015, 5:34 p.m. UTC | #20
On Mon, Mar 02, 2015 at 12:30:51AM -0500, Mike Frysinger wrote:
> On 26 Feb 2015 17:18, Carlos O'Donell wrote:
> > On 02/24/2015 05:02 AM, Siddhesh Poyarekar wrote:
> > > When the malloc subsystem detects some kind of memory corruption,
> > > depending on the configuration it prints the error, a backtrace, a
> > > memory map and then aborts the process.  In this process, the
> > > backtrace() call may result in a call to malloc, resulting in
> > > various kinds of problematic behavior.
> > 
> > There have been various comments about this patch and I would like to
> > list some supporting rationale for this patch:
> > 
> > (1) Delaying the abort is bad for security.
> > 
> > Problem: The library should abort() immediately from a security perspective.
> > 
> > Solution: Don't delay, call abort() immediately.
> > 
> > We all agree that delaying the abort is bad from a security perspective.
> > However, the present solution is a trade-off between providing useful
> > diagnostics and *then* aborting.
> 
> i think the current line glibc attempts to walk is reasonable.  there are other
> funcs which are arguably more important (__chk_fail & __fortify_fail) that do a
> good amount of work (calling into __libc_message).  from a security pov, that is 
> all bad mojo.

And they should all be fixed.

> it might be nice if there was a configure option to enable a paranoid mode 
> whereby we die as safely as possible when a failure is detected.  in Gentoo we 
> have a custom handler that uses direct syscalls and logs to the system logger
> before killing itself (SIGABRT & SIGKILL).

Even syscalls are unsafe on x86 unless you use custom asm. The default
mechanism goes through a syscall vdso pointer in the TCB at %gs:...
which is easily attacker-controlled after stack overflow on a non-main
thread.

The right solution is a HCF instruction or even better a sequence
like the following to avoid trapping by the application:

	push $-1
	push $-1
	xor %edx,%edx
	mov %esp,%ecx
	xor %ebx,%ebx
	mov $SYS_rt_sigprocmask,%eax
	int $128
	hlt

Rich
Mike Frysinger March 2, 2015, 5:57 p.m. UTC | #21
On 02 Mar 2015 12:34, Rich Felker wrote:
> On Mon, Mar 02, 2015 at 12:30:51AM -0500, Mike Frysinger wrote:
> > On 26 Feb 2015 17:18, Carlos O'Donell wrote:
> > > On 02/24/2015 05:02 AM, Siddhesh Poyarekar wrote:
> > > > When the malloc subsystem detects some kind of memory corruption,
> > > > depending on the configuration it prints the error, a backtrace, a
> > > > memory map and then aborts the process.  In this process, the
> > > > backtrace() call may result in a call to malloc, resulting in
> > > > various kinds of problematic behavior.
> > > 
> > > There have been various comments about this patch and I would like to
> > > list some supporting rationale for this patch:
> > > 
> > > (1) Delaying the abort is bad for security.
> > > 
> > > Problem: The library should abort() immediately from a security perspective.
> > > 
> > > Solution: Don't delay, call abort() immediately.
> > > 
> > > We all agree that delaying the abort is bad from a security perspective.
> > > However, the present solution is a trade-off between providing useful
> > > diagnostics and *then* aborting.
> > 
> > i think the current line glibc attempts to walk is reasonable.  there are other
> > funcs which are arguably more important (__chk_fail & __fortify_fail) that do a
> > good amount of work (calling into __libc_message).  from a security pov, that is 
> > all bad mojo.
> 
> And they should all be fixed.

to be clear, while i support adding an optional "paranoid" mode, i'm entirely 
against disabling them by default.  these crash messages have improved the lives 
of users and developers significantly since they were added.  i don't have data 
to back this claim up, but having worked as a developer writing my own code, 
an upstream developer releasing my own packages, and a distro maintainer on a 
wide swath of packages, all continuously since the glibc 2.2.5 days, i think my 
anecdotal memory is sufficient.  as a maintainer of the toolchain, i certainly 
know it helps bucket bugs -- before this message, most random crashes were 
thrown at the toolchain distro maintainers (after all, a crash in malloc must 
surely be the fault of the C library).  now with all the memory corruption 
hooks, we've got a pretty strong/reliable signal that it is entirely the fault 
of the package.  returning to that status quo would be a terrible idea.

if a distro has frameworks in place to replace the functionality of these (e.g. 
a core dump handler), then they can make the decision to also opt in their C 
library to this mode.  on Gentoo, we make the decision based on how hardened the 
user wants their system.

> > it might be nice if there was a configure option to enable a paranoid mode 
> > whereby we die as safely as possible when a failure is detected.  in Gentoo we 
> > have a custom handler that uses direct syscalls and logs to the system logger
> > before killing itself (SIGABRT & SIGKILL).
> 
> Even syscalls are unsafe on x86 unless you use custom asm. The default
> mechanism goes through a syscall vdso pointer in the TCB at %gs:...

i'm aware ... we do force the int80 method in some places because of interaction 
with gs setup as the code will crash otherwise during early init.  although this 
particular case doesn't seem to be accounting for that, so i'll update it too.
-mike
Carlos O'Donell March 5, 2015, 7:22 p.m. UTC | #22
On 03/02/2015 12:57 PM, Mike Frysinger wrote:
> to be clear, while i support adding an optional "paranoid" mode, i'm entirely 
> against disabling them by default.  these crash messages have improved the lives 
> of users and developers significantly since they were added.  i don't have data 
> to back this claim up, but having worked as a developer writing my own code, 
> an upstream developer releasing my own packages, and a distro maintainer on a 
> wide swath of packages, all continuously since the glibc 2.2.5 days, i think my 
> anecdotal memory is sufficient.  as a maintainer of the toolchain, i certainly 
> know it helps bucket bugs -- before this message, most random crashes were 
> thrown at the toolchain distro maintainers (after all, a crash in malloc must 
> surely be the fault of the C library).  now with all the memory corruption 
> hooks, we've got a pretty strong/reliable signal that it is entirely the fault 
> of the package.  returning to that status quo would be a terrible idea.
> 
> if a distro has frameworks in place to replace the functionality of these (e.g. 
> a core dump handler), then they can make the decision to also opt in their C 
> library to this mode.  on Gentoo, we make the decision based on how hardened the 
> user wants their system.

That's a very cogent argument for keeping the status-quo behaviour and fixing
the deadlock, while continuing the discussion over what exactly should be done.

Cheers,
Carlos.
diff mbox

Patch

diff --git a/malloc/Makefile b/malloc/Makefile
index 5f68a79..f385d9e 100644
--- a/malloc/Makefile
+++ b/malloc/Makefile
@@ -27,7 +27,7 @@  headers := $(dist-headers) obstack.h mcheck.h
 tests := mallocbug tst-malloc tst-valloc tst-calloc tst-obstack \
 	 tst-mallocstate tst-mcheck tst-mallocfork tst-trim1 \
 	 tst-malloc-usable tst-realloc tst-posix_memalign \
-	 tst-pvalloc tst-memalign tst-mallopt
+	 tst-pvalloc tst-memalign tst-mallopt tst-malloc-backtrace
 test-srcs = tst-mtrace
 
 routines = malloc morecore mcheck mtrace obstack
@@ -42,6 +42,9 @@  extra-libs-others = $(extra-libs)
 libmemusage-routines = memusage
 libmemusage-inhibit-o = $(filter-out .os,$(object-suffixes))
 
+$(objpfx)tst-malloc-backtrace: $(common-objpfx)nptl/libpthread.so \
+			       $(common-objpfx)nptl/libpthread_nonshared.a
+
 # These should be removed by `make clean'.
 extra-objs = mcheck-init.o libmcheck.a
 
diff --git a/malloc/arena.c b/malloc/arena.c
index 8af51f0..bb1aad0 100644
--- a/malloc/arena.c
+++ b/malloc/arena.c
@@ -99,7 +99,7 @@  int __malloc_initialized = -1;
   } while (0)
 
 #define arena_lock(ptr, size) do {					      \
-      if (ptr)								      \
+      if (ptr && !arena_is_corrupt (ptr))				      \
         (void) mutex_lock (&ptr->mutex);				      \
       else								      \
         ptr = arena_get2 (ptr, (size), NULL);				      \
@@ -686,7 +686,7 @@  heap_trim (heap_info *heap, size_t pad)
       if (!prev_inuse (p)) /* consolidate backward */
         {
           p = prev_chunk (p);
-          unlink (p, bck, fwd);
+          unlink (ar_ptr, p, bck, fwd);
         }
       assert (((unsigned long) ((char *) p + new_size) & (pagesz - 1)) == 0);
       assert (((char *) p + new_size) == ((char *) heap + heap->size));
@@ -802,7 +802,7 @@  reused_arena (mstate avoid_arena)
   result = next_to_use;
   do
     {
-      if (!mutex_trylock (&result->mutex))
+      if (!arena_is_corrupt (result) && !mutex_trylock (&result->mutex))
         goto out;
 
       result = result->next;
@@ -814,7 +814,21 @@  reused_arena (mstate avoid_arena)
   if (result == avoid_arena)
     result = result->next;
 
-  /* No arena available.  Wait for the next in line.  */
+  /* Make sure that the arena we get is not corrupted.  */
+  mstate begin = result;
+  while (arena_is_corrupt (result) || result == avoid_arena)
+    {
+      result = result->next;
+      if (result == begin)
+	break;
+    }
+
+  /* We could not find any arena that was either not corrupted or not the one
+     we wanted to avoid.  */
+  if (result == begin || result == avoid_arena)
+    return NULL;
+
+  /* No arena available without contention.  Wait for the next in line.  */
   LIBC_PROBE (memory_arena_reuse_wait, 3, &result->mutex, result, avoid_arena);
   (void) mutex_lock (&result->mutex);
 
diff --git a/malloc/hooks.c b/malloc/hooks.c
index 0c4816f..9303fe5 100644
--- a/malloc/hooks.c
+++ b/malloc/hooks.c
@@ -112,7 +112,8 @@  malloc_check_get_size (mchunkptr p)
       if (c <= 0 || size < (c + 2 * SIZE_SZ))
         {
           malloc_printerr (check_action, "malloc_check_get_size: memory corruption",
-                           chunk2mem (p));
+                           chunk2mem (p),
+			   chunk_is_mmapped (p) ? NULL : arena_for_chunk (p));
           return 0;
         }
     }
@@ -237,7 +238,8 @@  top_check (void)
         (char *) t + chunksize (t) == mp_.sbrk_base + main_arena.system_mem)))
     return 0;
 
-  malloc_printerr (check_action, "malloc: top chunk is corrupt", t);
+  malloc_printerr (check_action, "malloc: top chunk is corrupt", t,
+		   &main_arena);
 
   /* Try to set up a new top chunk. */
   brk = MORECORE (0);
@@ -295,7 +297,8 @@  free_check (void *mem, const void *caller)
     {
       (void) mutex_unlock (&main_arena.mutex);
 
-      malloc_printerr (check_action, "free(): invalid pointer", mem);
+      malloc_printerr (check_action, "free(): invalid pointer", mem,
+		       &main_arena);
       return;
     }
   if (chunk_is_mmapped (p))
@@ -333,7 +336,8 @@  realloc_check (void *oldmem, size_t bytes, const void *caller)
   (void) mutex_unlock (&main_arena.mutex);
   if (!oldp)
     {
-      malloc_printerr (check_action, "realloc(): invalid pointer", oldmem);
+      malloc_printerr (check_action, "realloc(): invalid pointer", oldmem,
+		       &main_arena);
       return malloc_check (bytes, NULL);
     }
   const INTERNAL_SIZE_T oldsize = chunksize (oldp);
diff --git a/malloc/malloc.c b/malloc/malloc.c
index f361bad..452f036 100644
--- a/malloc/malloc.c
+++ b/malloc/malloc.c
@@ -1059,7 +1059,7 @@  static void*  _int_realloc(mstate, mchunkptr, INTERNAL_SIZE_T,
 static void*  _int_memalign(mstate, size_t, size_t);
 static void*  _mid_memalign(size_t, size_t, void *);
 
-static void malloc_printerr(int action, const char *str, void *ptr);
+static void malloc_printerr(int action, const char *str, void *ptr, mstate av);
 
 static void* internal_function mem2mem_check(void *p, size_t sz);
 static int internal_function top_check(void);
@@ -1411,11 +1411,11 @@  typedef struct malloc_chunk *mbinptr;
 #define last(b)      ((b)->bk)
 
 /* Take a chunk off a bin list */
-#define unlink(P, BK, FD) {                                            \
+#define unlink(AV, P, BK, FD) {                                            \
     FD = P->fd;								      \
     BK = P->bk;								      \
     if (__builtin_expect (FD->bk != P || BK->fd != P, 0))		      \
-      malloc_printerr (check_action, "corrupted double-linked list", P);      \
+      malloc_printerr (check_action, "corrupted double-linked list", P, AV);  \
     else {								      \
         FD->bk = BK;							      \
         BK->fd = FD;							      \
@@ -1424,7 +1424,8 @@  typedef struct malloc_chunk *mbinptr;
 	    if (__builtin_expect (P->fd_nextsize->bk_nextsize != P, 0)	      \
 		|| __builtin_expect (P->bk_nextsize->fd_nextsize != P, 0))    \
 	      malloc_printerr (check_action,				      \
-			       "corrupted double-linked list (not small)", P);\
+			       "corrupted double-linked list (not small)",    \
+			       P, AV);					      \
             if (FD->fd_nextsize == NULL) {				      \
                 if (P->fd_nextsize == P)				      \
                   FD->fd_nextsize = FD->bk_nextsize = FD;		      \
@@ -1656,6 +1657,15 @@  typedef struct malloc_chunk *mfastbinptr;
 #define set_noncontiguous(M)   ((M)->flags |= NONCONTIGUOUS_BIT)
 #define set_contiguous(M)      ((M)->flags &= ~NONCONTIGUOUS_BIT)
 
+/* ARENA_CORRUPTION_BIT is set if a memory corruption was detected on the
+   arena.  Such an arena is no longer used to allocate chunks.  Chunks
+   allocated in that arena before detecting corruption are not freed.  */
+
+#define ARENA_CORRUPTION_BIT (4U)
+
+#define arena_is_corrupt(A)	(((A)->flags & ARENA_CORRUPTION_BIT))
+#define set_arena_corrupt(A)	((A)->flags |= ARENA_CORRUPTION_BIT)
+
 /*
    Set value of max_fast.
    Use impossibly small value if 0.
@@ -2280,8 +2290,9 @@  sysmalloc (INTERNAL_SIZE_T nb, mstate av)
      rather than expanding top.
    */
 
-  if ((unsigned long) (nb) >= (unsigned long) (mp_.mmap_threshold) &&
-      (mp_.n_mmaps < mp_.n_mmaps_max))
+  if (av == NULL
+      || ((unsigned long) (nb) >= (unsigned long) (mp_.mmap_threshold)
+	  && (mp_.n_mmaps < mp_.n_mmaps_max)))
     {
       char *mm;           /* return value from mmap call*/
 
@@ -2354,6 +2365,10 @@  sysmalloc (INTERNAL_SIZE_T nb, mstate av)
         }
     }
 
+  /* There are no usable arenas and mmap also failed.  */
+  if (av == NULL)
+    return 0;
+
   /* Record incoming configuration of top */
 
   old_top = av->top;
@@ -2528,7 +2543,8 @@  sysmalloc (INTERNAL_SIZE_T nb, mstate av)
           else if (contiguous (av) && old_size && brk < old_end)
             {
               /* Oops!  Someone else killed our space..  Can't touch anything.  */
-              malloc_printerr (3, "break adjusted to free malloc space", brk);
+              malloc_printerr (3, "break adjusted to free malloc space", brk,
+			       av);
             }
 
           /*
@@ -2818,7 +2834,7 @@  munmap_chunk (mchunkptr p)
   if (__builtin_expect (((block | total_size) & (GLRO (dl_pagesize) - 1)) != 0, 0))
     {
       malloc_printerr (check_action, "munmap_chunk(): invalid pointer",
-                       chunk2mem (p));
+                       chunk2mem (p), NULL);
       return;
     }
 
@@ -2888,22 +2904,19 @@  __libc_malloc (size_t bytes)
 
   arena_get (ar_ptr, bytes);
 
-  if (!ar_ptr)
-    return 0;
-
   victim = _int_malloc (ar_ptr, bytes);
-  if (!victim)
+  /* Retry with another arena only if we were able to find a usable arena
+     before.  */
+  if (!victim && ar_ptr != NULL)
     {
       LIBC_PROBE (memory_malloc_retry, 1, bytes);
       ar_ptr = arena_get_retry (ar_ptr, bytes);
-      if (__builtin_expect (ar_ptr != NULL, 1))
-        {
-          victim = _int_malloc (ar_ptr, bytes);
-          (void) mutex_unlock (&ar_ptr->mutex);
-        }
+      victim = _int_malloc (ar_ptr, bytes);
     }
-  else
+
+  if (ar_ptr != NULL)
     (void) mutex_unlock (&ar_ptr->mutex);
+
   assert (!victim || chunk_is_mmapped (mem2chunk (victim)) ||
           ar_ptr == arena_for_chunk (mem2chunk (victim)));
   return victim;
@@ -2979,6 +2992,11 @@  __libc_realloc (void *oldmem, size_t bytes)
   /* its size */
   const INTERNAL_SIZE_T oldsize = chunksize (oldp);
 
+  if (chunk_is_mmapped (oldp))
+    ar_ptr = NULL;
+  else
+    ar_ptr = arena_for_chunk (oldp);
+
   /* Little security check which won't hurt performance: the
      allocator never wrapps around at the end of the address space.
      Therefore we can exclude some size values which might appear
@@ -2986,7 +3004,8 @@  __libc_realloc (void *oldmem, size_t bytes)
   if (__builtin_expect ((uintptr_t) oldp > (uintptr_t) -oldsize, 0)
       || __builtin_expect (misaligned_chunk (oldp), 0))
     {
-      malloc_printerr (check_action, "realloc(): invalid pointer", oldmem);
+      malloc_printerr (check_action, "realloc(): invalid pointer", oldmem,
+		       ar_ptr);
       return NULL;
     }
 
@@ -3015,10 +3034,8 @@  __libc_realloc (void *oldmem, size_t bytes)
       return newmem;
     }
 
-  ar_ptr = arena_for_chunk (oldp);
   (void) mutex_lock (&ar_ptr->mutex);
 
-
   newp = _int_realloc (ar_ptr, oldp, oldsize, nb);
 
   (void) mutex_unlock (&ar_ptr->mutex);
@@ -3093,22 +3110,18 @@  _mid_memalign (size_t alignment, size_t bytes, void *address)
     }
 
   arena_get (ar_ptr, bytes + alignment + MINSIZE);
-  if (!ar_ptr)
-    return 0;
 
   p = _int_memalign (ar_ptr, alignment, bytes);
-  if (!p)
+  if (!p && ar_ptr != NULL)
     {
       LIBC_PROBE (memory_memalign_retry, 2, bytes, alignment);
       ar_ptr = arena_get_retry (ar_ptr, bytes);
-      if (__builtin_expect (ar_ptr != NULL, 1))
-        {
-          p = _int_memalign (ar_ptr, alignment, bytes);
-          (void) mutex_unlock (&ar_ptr->mutex);
-        }
+      p = _int_memalign (ar_ptr, alignment, bytes);
     }
-  else
+
+  if (ar_ptr != NULL)
     (void) mutex_unlock (&ar_ptr->mutex);
+
   assert (!p || chunk_is_mmapped (mem2chunk (p)) ||
           ar_ptr == arena_for_chunk (mem2chunk (p)));
   return p;
@@ -3187,47 +3200,53 @@  __libc_calloc (size_t n, size_t elem_size)
   sz = bytes;
 
   arena_get (av, sz);
-  if (!av)
-    return 0;
-
-  /* Check if we hand out the top chunk, in which case there may be no
-     need to clear. */
+  if (av)
+    {
+      /* Check if we hand out the top chunk, in which case there may be no
+	 need to clear. */
 #if MORECORE_CLEARS
-  oldtop = top (av);
-  oldtopsize = chunksize (top (av));
+      oldtop = top (av);
+      oldtopsize = chunksize (top (av));
 # if MORECORE_CLEARS < 2
-  /* Only newly allocated memory is guaranteed to be cleared.  */
-  if (av == &main_arena &&
-      oldtopsize < mp_.sbrk_base + av->max_system_mem - (char *) oldtop)
-    oldtopsize = (mp_.sbrk_base + av->max_system_mem - (char *) oldtop);
+      /* Only newly allocated memory is guaranteed to be cleared.  */
+      if (av == &main_arena &&
+	  oldtopsize < mp_.sbrk_base + av->max_system_mem - (char *) oldtop)
+	oldtopsize = (mp_.sbrk_base + av->max_system_mem - (char *) oldtop);
 # endif
-  if (av != &main_arena)
+      if (av != &main_arena)
+	{
+	  heap_info *heap = heap_for_ptr (oldtop);
+	  if (oldtopsize < (char *) heap + heap->mprotect_size - (char *) oldtop)
+	    oldtopsize = (char *) heap + heap->mprotect_size - (char *) oldtop;
+	}
+#endif
+    }
+  else
     {
-      heap_info *heap = heap_for_ptr (oldtop);
-      if (oldtopsize < (char *) heap + heap->mprotect_size - (char *) oldtop)
-        oldtopsize = (char *) heap + heap->mprotect_size - (char *) oldtop;
+      /* No usable arenas.  */
+      oldtop = 0;
+      oldtopsize = 0;
     }
-#endif
   mem = _int_malloc (av, sz);
 
 
   assert (!mem || chunk_is_mmapped (mem2chunk (mem)) ||
           av == arena_for_chunk (mem2chunk (mem)));
 
-  if (mem == 0)
+  if (mem == 0 && av != NULL)
     {
       LIBC_PROBE (memory_calloc_retry, 1, sz);
       av = arena_get_retry (av, sz);
-      if (__builtin_expect (av != NULL, 1))
-        {
-          mem = _int_malloc (av, sz);
-          (void) mutex_unlock (&av->mutex);
-        }
-      if (mem == 0)
-        return 0;
+      mem = _int_malloc (av, sz);
     }
-  else
+
+  if (av != NULL)
     (void) mutex_unlock (&av->mutex);
+
+  /* Allocation failed even after a retry.  */
+  if (mem == 0)
+    return 0;
+
   p = mem2chunk (mem);
 
   /* Two optional cases in which clearing not necessary */
@@ -3323,6 +3342,16 @@  _int_malloc (mstate av, size_t bytes)
 
   checked_request2size (bytes, nb);
 
+  /* There are no usable arenas.  Fall back to sysmalloc to get a chunk from
+     mmap.  */
+  if (__glibc_unlikely (av == NULL))
+    {
+      void *p = sysmalloc (nb, av);
+      if (p != NULL)
+	alloc_perturb (p, bytes);
+      return p;
+    }
+
   /*
      If the size qualifies as a fastbin, first check corresponding bin.
      This code is safe to execute even if av is not yet initialized, so we
@@ -3348,7 +3377,7 @@  _int_malloc (mstate av, size_t bytes)
             {
               errstr = "malloc(): memory corruption (fast)";
             errout:
-              malloc_printerr (check_action, errstr, chunk2mem (victim));
+              malloc_printerr (check_action, errstr, chunk2mem (victim), av);
               return NULL;
             }
           check_remalloced_chunk (av, victim, nb);
@@ -3437,7 +3466,7 @@  _int_malloc (mstate av, size_t bytes)
           if (__builtin_expect (victim->size <= 2 * SIZE_SZ, 0)
               || __builtin_expect (victim->size > av->system_mem, 0))
             malloc_printerr (check_action, "malloc(): memory corruption",
-                             chunk2mem (victim));
+                             chunk2mem (victim), av);
           size = chunksize (victim);
 
           /*
@@ -3584,7 +3613,7 @@  _int_malloc (mstate av, size_t bytes)
                 victim = victim->fd;
 
               remainder_size = size - nb;
-              unlink (victim, bck, fwd);
+              unlink (av, victim, bck, fwd);
 
               /* Exhaust */
               if (remainder_size < MINSIZE)
@@ -3689,7 +3718,7 @@  _int_malloc (mstate av, size_t bytes)
               remainder_size = size - nb;
 
               /* unlink */
-              unlink (victim, bck, fwd);
+              unlink (av, victim, bck, fwd);
 
               /* Exhaust */
               if (remainder_size < MINSIZE)
@@ -3829,7 +3858,7 @@  _int_free (mstate av, mchunkptr p, int have_lock)
     errout:
       if (!have_lock && locked)
         (void) mutex_unlock (&av->mutex);
-      malloc_printerr (check_action, errstr, chunk2mem (p));
+      malloc_printerr (check_action, errstr, chunk2mem (p), av);
       return;
     }
   /* We know that each chunk is at least MINSIZE bytes in size or a
@@ -3967,7 +3996,7 @@  _int_free (mstate av, mchunkptr p, int have_lock)
       prevsize = p->prev_size;
       size += prevsize;
       p = chunk_at_offset(p, -((long) prevsize));
-      unlink(p, bck, fwd);
+      unlink(av, p, bck, fwd);
     }
 
     if (nextchunk != av->top) {
@@ -3976,7 +4005,7 @@  _int_free (mstate av, mchunkptr p, int have_lock)
 
       /* consolidate forward */
       if (!nextinuse) {
-	unlink(nextchunk, bck, fwd);
+	unlink(av, nextchunk, bck, fwd);
 	size += nextsize;
       } else
 	clear_inuse_bit_at_offset(nextchunk, 0);
@@ -4137,7 +4166,7 @@  static void malloc_consolidate(mstate av)
 	    prevsize = p->prev_size;
 	    size += prevsize;
 	    p = chunk_at_offset(p, -((long) prevsize));
-	    unlink(p, bck, fwd);
+	    unlink(av, p, bck, fwd);
 	  }
 
 	  if (nextchunk != av->top) {
@@ -4145,7 +4174,7 @@  static void malloc_consolidate(mstate av)
 
 	    if (!nextinuse) {
 	      size += nextsize;
-	      unlink(nextchunk, bck, fwd);
+	      unlink(av, nextchunk, bck, fwd);
 	    } else
 	      clear_inuse_bit_at_offset(nextchunk, 0);
 
@@ -4214,7 +4243,7 @@  _int_realloc(mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
     {
       errstr = "realloc(): invalid old size";
     errout:
-      malloc_printerr (check_action, errstr, chunk2mem (oldp));
+      malloc_printerr (check_action, errstr, chunk2mem (oldp), av);
       return NULL;
     }
 
@@ -4260,7 +4289,7 @@  _int_realloc(mstate av, mchunkptr oldp, INTERNAL_SIZE_T oldsize,
                (unsigned long) (nb))
         {
           newp = oldp;
-          unlink (next, bck, fwd);
+          unlink (av, next, bck, fwd);
         }
 
       /* allocate, copy, free */
@@ -4455,6 +4484,10 @@  _int_memalign (mstate av, size_t alignment, size_t bytes)
 static int
 mtrim (mstate av, size_t pad)
 {
+  /* Don't touch corrupt arenas.  */
+  if (arena_is_corrupt (av))
+    return 0;
+
   /* Ensure initialization/consolidation */
   malloc_consolidate (av);
 
@@ -4945,8 +4978,14 @@  libc_hidden_def (__libc_mallopt)
 extern char **__libc_argv attribute_hidden;
 
 static void
-malloc_printerr (int action, const char *str, void *ptr)
+malloc_printerr (int action, const char *str, void *ptr, mstate ar_ptr)
 {
+  /* Avoid using this arena in future.  We do not attempt to synchronize this
+     with anything else because we minimally want to ensure that __libc_message
+     gets its resources safely without stumbling on the current corruption.  */
+  if (ar_ptr)
+    set_arena_corrupt (ar_ptr);
+
   if ((action & 5) == 5)
     __libc_message (action & 2, "%s\n", str);
   else if (action & 1)
diff --git a/malloc/tst-malloc-backtrace.c b/malloc/tst-malloc-backtrace.c
new file mode 100644
index 0000000..796a42f
--- /dev/null
+++ b/malloc/tst-malloc-backtrace.c
@@ -0,0 +1,50 @@ 
+/* Verify that backtrace does not deadlock on itself on memory corruption.
+   Copyright (C) 2015 Free Software Foundation, Inc.
+   This file is part of the GNU C Library.
+
+   The GNU C Library is free software; you can redistribute it and/or
+   modify it under the terms of the GNU Lesser General Public
+   License as published by the Free Software Foundation; either
+   version 2.1 of the License, or (at your option) any later version.
+
+   The GNU C Library is distributed in the hope that it will be useful,
+   but WITHOUT ANY WARRANTY; without even the implied warranty of
+   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+   Lesser General Public License for more details.
+
+   You should have received a copy of the GNU Lesser General Public
+   License along with the GNU C Library; if not, see
+   <http://www.gnu.org/licenses/>.  */
+
+
+#include <stdlib.h>
+
+#define SIZE 4096
+
+/* Wrap free with a function to prevent gcc from optimizing it out.  */
+static void
+__attribute__((noinline))
+call_free (void *ptr)
+{
+  free (ptr);
+  *(size_t *)(ptr - 8) = 1;
+}
+
+int
+do_test (void)
+{
+  void *ptr1 = malloc (SIZE);
+  void *ptr2 = malloc (SIZE);
+
+  call_free ((void *) ptr1);
+  ptr1 = malloc (SIZE);
+
+  /* Not reached.  The return statement is to put ptr2 into use so that gcc
+     doesn't optimize out that malloc call.  */
+  return (ptr1 == ptr2);
+}
+
+#define TEST_FUNCTION do_test ()
+#define EXPECTED_SIGNAL SIGABRT
+
+#include "../test-skeleton.c"