diff mbox

[U-Boot,3/4] malloc_simple: Return NULL on malloc failure rather then calling panic()

Message ID 1423051551-948-4-git-send-email-hdegoede@redhat.com
State Accepted
Delegated to: Simon Glass
Headers show

Commit Message

Hans de Goede Feb. 4, 2015, 12:05 p.m. UTC
All callers of malloc should already do error checking, and may even be able
to continue without the alloc succeeding.

Moreover, common/malloc_simple.c is the only user of .rodata.str1.1 in
common/built-in.o when building the SPL, triggering this gcc bug:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54303

Causing .rodata to grow with e.g. 0xc21 bytes, nullifying all benefits of
using malloc_simple in the first place.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
---
 common/malloc_simple.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Simon Glass Feb. 5, 2015, 3:04 a.m. UTC | #1
On 4 February 2015 at 05:05, Hans de Goede <hdegoede@redhat.com> wrote:
> All callers of malloc should already do error checking, and may even be able
> to continue without the alloc succeeding.
>
> Moreover, common/malloc_simple.c is the only user of .rodata.str1.1 in
> common/built-in.o when building the SPL, triggering this gcc bug:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54303
>
> Causing .rodata to grow with e.g. 0xc21 bytes, nullifying all benefits of
> using malloc_simple in the first place.
>
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>  common/malloc_simple.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Great find, thanks!

Acked-by: Simon Glass <sjg@chromium.org>
Siarhei Siamashka Feb. 5, 2015, 11:24 a.m. UTC | #2
On Wed,  4 Feb 2015 13:05:50 +0100
Hans de Goede <hdegoede@redhat.com> wrote:

> All callers of malloc should already do error checking, and may even be able
> to continue without the alloc succeeding.
> 
> Moreover, common/malloc_simple.c is the only user of .rodata.str1.1 in
> common/built-in.o when building the SPL, triggering this gcc bug:
> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54303
> 
> Causing .rodata to grow with e.g. 0xc21 bytes, nullifying all benefits of
> using malloc_simple in the first place.
> 
> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
> ---
>  common/malloc_simple.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/common/malloc_simple.c b/common/malloc_simple.c
> index afdacff..64ae036 100644
> --- a/common/malloc_simple.c
> +++ b/common/malloc_simple.c
> @@ -19,7 +19,7 @@ void *malloc_simple(size_t bytes)
>  
>  	new_ptr = gd->malloc_ptr + bytes;
>  	if (new_ptr > gd->malloc_limit)
> -		panic("Out of pre-reloc memory");
> +		return NULL;
>  	ptr = map_sysmem(gd->malloc_base + gd->malloc_ptr, bytes);
>  	gd->malloc_ptr = ALIGN(new_ptr, sizeof(new_ptr));
>  	return ptr;

The other patches look great, but I'm not convinced that requiring the
malloc callers to do error checking is such a great idea. This means a
lot of checks in a lot of places with extra code paths instead of just
a single check in one place for the "impossible to happen" critical
failure. I think that we should normally assume that malloc always
succeeds in the production code and the panic may only happen while
debugging.

If the malloc pool is in the DRAM, then we usually have orders of
magnitude more space than necessary. While the code might be still
in the SRAM at the same time (the extra branching code logic for
errors checking may be wasting the scarce SRAM space).

If the malloc pool is in the SRAM and potentially may fail allocations,
then this is a major reliability problem by itself. The malloc pool is
always inefficient, has fragmentation problems, etc. If this is the
case, then IMHO the only right solution is to replace such problematic
dynamic allocations with static reservations in the ".data" section.
Otherwise the reliability critical things (something like Mars rovers
for example) will be sometimes failing. The Murphy law exists for
a reason :-)

The workaround for the GCC compiler bug is orthogonal to this.
Maybe there is some other solution?
Hans de Goede Feb. 5, 2015, 5:53 p.m. UTC | #3
Hi,

On 05-02-15 12:24, Siarhei Siamashka wrote:
> On Wed,  4 Feb 2015 13:05:50 +0100
> Hans de Goede <hdegoede@redhat.com> wrote:
>
>> All callers of malloc should already do error checking, and may even be able
>> to continue without the alloc succeeding.
>>
>> Moreover, common/malloc_simple.c is the only user of .rodata.str1.1 in
>> common/built-in.o when building the SPL, triggering this gcc bug:
>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54303
>>
>> Causing .rodata to grow with e.g. 0xc21 bytes, nullifying all benefits of
>> using malloc_simple in the first place.
>>
>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>> ---
>>   common/malloc_simple.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/common/malloc_simple.c b/common/malloc_simple.c
>> index afdacff..64ae036 100644
>> --- a/common/malloc_simple.c
>> +++ b/common/malloc_simple.c
>> @@ -19,7 +19,7 @@ void *malloc_simple(size_t bytes)
>>
>>   	new_ptr = gd->malloc_ptr + bytes;
>>   	if (new_ptr > gd->malloc_limit)
>> -		panic("Out of pre-reloc memory");
>> +		return NULL;
>>   	ptr = map_sysmem(gd->malloc_base + gd->malloc_ptr, bytes);
>>   	gd->malloc_ptr = ALIGN(new_ptr, sizeof(new_ptr));
>>   	return ptr;
>
> The other patches look great, but I'm not convinced that requiring the
> malloc callers to do error checking is such a great idea. This means a
> lot of checks in a lot of places with extra code paths instead of just
> a single check in one place for the "impossible to happen" critical
> failure. I think that we should normally assume that malloc always
> succeeds in the production code and the panic may only happen while
> debugging.
>
> If the malloc pool is in the DRAM, then we usually have orders of
> magnitude more space than necessary. While the code might be still
> in the SRAM at the same time (the extra branching code logic for
> errors checking may be wasting the scarce SRAM space).
>
> If the malloc pool is in the SRAM and potentially may fail allocations,
> then this is a major reliability problem by itself. The malloc pool is
> always inefficient, has fragmentation problems, etc. If this is the
> case, then IMHO the only right solution is to replace such problematic
> dynamic allocations with static reservations in the ".data" section.
> Otherwise the reliability critical things (something like Mars rovers
> for example) will be sometimes failing. The Murphy law exists for
> a reason :-)

Actually we had a discussion about this at the u-boot miniconf at ELCE,
I was actually advocating for having malloc behave as gmalloc from
glib2 does, so basically what you're advocating for, but I lost the
discussion. All this patch does is bring the simple, light-weight malloc
from malloc_simple.c inline with the regular malloc implementation which
can already fail.

> The workaround for the GCC compiler bug is orthogonal to this.
> Maybe there is some other solution?

To workaround this gcc bug we need to not use any const strings.
I guess we could do something like this and still panic ():

char str[4];

str[0] = 'O';
str[1] = 'O';
str[2] = 'M';
str[3] = 0;

But I think that just removing the panic is better, as it makes
malloc_simple.c consistent with the regular malloc. In the end
we should really get the gcc bug fixed, this will likely
trim down .rodata significantly.

Regards,

Hans


p.s.

I've also seen your reply to 0/4, I will reply when I've some
more time.
Simon Glass Feb. 5, 2015, 6 p.m. UTC | #4
Hi,

On 5 February 2015 at 10:53, Hans de Goede <hdegoede@redhat.com> wrote:
> Hi,
>
>
> On 05-02-15 12:24, Siarhei Siamashka wrote:
>>
>> On Wed,  4 Feb 2015 13:05:50 +0100
>> Hans de Goede <hdegoede@redhat.com> wrote:
>>
>>> All callers of malloc should already do error checking, and may even be
>>> able
>>> to continue without the alloc succeeding.
>>>
>>> Moreover, common/malloc_simple.c is the only user of .rodata.str1.1 in
>>> common/built-in.o when building the SPL, triggering this gcc bug:
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54303
>>>
>>> Causing .rodata to grow with e.g. 0xc21 bytes, nullifying all benefits of
>>> using malloc_simple in the first place.
>>>
>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>> ---
>>>   common/malloc_simple.c | 2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/common/malloc_simple.c b/common/malloc_simple.c
>>> index afdacff..64ae036 100644
>>> --- a/common/malloc_simple.c
>>> +++ b/common/malloc_simple.c
>>> @@ -19,7 +19,7 @@ void *malloc_simple(size_t bytes)
>>>
>>>         new_ptr = gd->malloc_ptr + bytes;
>>>         if (new_ptr > gd->malloc_limit)
>>> -               panic("Out of pre-reloc memory");
>>> +               return NULL;
>>>         ptr = map_sysmem(gd->malloc_base + gd->malloc_ptr, bytes);
>>>         gd->malloc_ptr = ALIGN(new_ptr, sizeof(new_ptr));
>>>         return ptr;
>>
>>
>> The other patches look great, but I'm not convinced that requiring the
>> malloc callers to do error checking is such a great idea. This means a
>> lot of checks in a lot of places with extra code paths instead of just
>> a single check in one place for the "impossible to happen" critical
>> failure. I think that we should normally assume that malloc always
>> succeeds in the production code and the panic may only happen while
>> debugging.
>>
>> If the malloc pool is in the DRAM, then we usually have orders of
>> magnitude more space than necessary. While the code might be still
>> in the SRAM at the same time (the extra branching code logic for
>> errors checking may be wasting the scarce SRAM space).
>>
>> If the malloc pool is in the SRAM and potentially may fail allocations,
>> then this is a major reliability problem by itself. The malloc pool is
>> always inefficient, has fragmentation problems, etc. If this is the
>> case, then IMHO the only right solution is to replace such problematic
>> dynamic allocations with static reservations in the ".data" section.
>> Otherwise the reliability critical things (something like Mars rovers
>> for example) will be sometimes failing. The Murphy law exists for
>> a reason :-)
>
>
> Actually we had a discussion about this at the u-boot miniconf at ELCE,
> I was actually advocating for having malloc behave as gmalloc from
> glib2 does, so basically what you're advocating for, but I lost the
> discussion. All this patch does is bring the simple, light-weight malloc
> from malloc_simple.c inline with the regular malloc implementation which
> can already fail.
>
>> The workaround for the GCC compiler bug is orthogonal to this.
>> Maybe there is some other solution?
>
>
> To workaround this gcc bug we need to not use any const strings.
> I guess we could do something like this and still panic ():
>
> char str[4];
>
> str[0] = 'O';
> str[1] = 'O';
> str[2] = 'M';
> str[3] = 0;
>
> But I think that just removing the panic is better, as it makes
> malloc_simple.c consistent with the regular malloc. In the end
> we should really get the gcc bug fixed, this will likely
> trim down .rodata significantly.

Yes I'd like to apply this one. It works around a really annoying gcc bug.

Hans I think you are right that we can deal with the errors at the top
level, as we do with other things. SPL error handling is probably not
great, but that's a separate issue.

Regards,
Simon
Simon Glass Feb. 10, 2015, 9:33 p.m. UTC | #5
On 5 February 2015 at 11:00, Simon Glass <sjg@chromium.org> wrote:
> Hi,
>
> On 5 February 2015 at 10:53, Hans de Goede <hdegoede@redhat.com> wrote:
>> Hi,
>>
>>
>> On 05-02-15 12:24, Siarhei Siamashka wrote:
>>>
>>> On Wed,  4 Feb 2015 13:05:50 +0100
>>> Hans de Goede <hdegoede@redhat.com> wrote:
>>>
>>>> All callers of malloc should already do error checking, and may even be
>>>> able
>>>> to continue without the alloc succeeding.
>>>>
>>>> Moreover, common/malloc_simple.c is the only user of .rodata.str1.1 in
>>>> common/built-in.o when building the SPL, triggering this gcc bug:
>>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=54303
>>>>
>>>> Causing .rodata to grow with e.g. 0xc21 bytes, nullifying all benefits of
>>>> using malloc_simple in the first place.
>>>>
>>>> Signed-off-by: Hans de Goede <hdegoede@redhat.com>
>>>> ---
>>>>   common/malloc_simple.c | 2 +-
>>>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>>>
>>>> diff --git a/common/malloc_simple.c b/common/malloc_simple.c
>>>> index afdacff..64ae036 100644
>>>> --- a/common/malloc_simple.c
>>>> +++ b/common/malloc_simple.c
>>>> @@ -19,7 +19,7 @@ void *malloc_simple(size_t bytes)
>>>>
>>>>         new_ptr = gd->malloc_ptr + bytes;
>>>>         if (new_ptr > gd->malloc_limit)
>>>> -               panic("Out of pre-reloc memory");
>>>> +               return NULL;
>>>>         ptr = map_sysmem(gd->malloc_base + gd->malloc_ptr, bytes);
>>>>         gd->malloc_ptr = ALIGN(new_ptr, sizeof(new_ptr));
>>>>         return ptr;
>>>
>>>
>>> The other patches look great, but I'm not convinced that requiring the
>>> malloc callers to do error checking is such a great idea. This means a
>>> lot of checks in a lot of places with extra code paths instead of just
>>> a single check in one place for the "impossible to happen" critical
>>> failure. I think that we should normally assume that malloc always
>>> succeeds in the production code and the panic may only happen while
>>> debugging.
>>>
>>> If the malloc pool is in the DRAM, then we usually have orders of
>>> magnitude more space than necessary. While the code might be still
>>> in the SRAM at the same time (the extra branching code logic for
>>> errors checking may be wasting the scarce SRAM space).
>>>
>>> If the malloc pool is in the SRAM and potentially may fail allocations,
>>> then this is a major reliability problem by itself. The malloc pool is
>>> always inefficient, has fragmentation problems, etc. If this is the
>>> case, then IMHO the only right solution is to replace such problematic
>>> dynamic allocations with static reservations in the ".data" section.
>>> Otherwise the reliability critical things (something like Mars rovers
>>> for example) will be sometimes failing. The Murphy law exists for
>>> a reason :-)
>>
>>
>> Actually we had a discussion about this at the u-boot miniconf at ELCE,
>> I was actually advocating for having malloc behave as gmalloc from
>> glib2 does, so basically what you're advocating for, but I lost the
>> discussion. All this patch does is bring the simple, light-weight malloc
>> from malloc_simple.c inline with the regular malloc implementation which
>> can already fail.
>>
>>> The workaround for the GCC compiler bug is orthogonal to this.
>>> Maybe there is some other solution?
>>
>>
>> To workaround this gcc bug we need to not use any const strings.
>> I guess we could do something like this and still panic ():
>>
>> char str[4];
>>
>> str[0] = 'O';
>> str[1] = 'O';
>> str[2] = 'M';
>> str[3] = 0;
>>
>> But I think that just removing the panic is better, as it makes
>> malloc_simple.c consistent with the regular malloc. In the end
>> we should really get the gcc bug fixed, this will likely
>> trim down .rodata significantly.
>
> Yes I'd like to apply this one. It works around a really annoying gcc bug.
>
> Hans I think you are right that we can deal with the errors at the top
> level, as we do with other things. SPL error handling is probably not
> great, but that's a separate issue.
>
> Regards,
> Simon

Applied to u-boot-dm, thanks!
diff mbox

Patch

diff --git a/common/malloc_simple.c b/common/malloc_simple.c
index afdacff..64ae036 100644
--- a/common/malloc_simple.c
+++ b/common/malloc_simple.c
@@ -19,7 +19,7 @@  void *malloc_simple(size_t bytes)
 
 	new_ptr = gd->malloc_ptr + bytes;
 	if (new_ptr > gd->malloc_limit)
-		panic("Out of pre-reloc memory");
+		return NULL;
 	ptr = map_sysmem(gd->malloc_base + gd->malloc_ptr, bytes);
 	gd->malloc_ptr = ALIGN(new_ptr, sizeof(new_ptr));
 	return ptr;