diff mbox

Permit zero-sized qemu_malloc() & friends

Message ID m3skbwxgkp.fsf@crossbow.pond.sub.org
State New
Headers show

Commit Message

Markus Armbruster Nov. 30, 2009, 1:55 p.m. UTC
Commit a7d27b53 made zero-sized allocations a fatal error, deviating
from ISO C's malloc() & friends.  Revert that, but take care never to
return a null pointer, like malloc() & friends may do (it's
implementation defined), because that's another source of bugs.

Rationale: while zero-sized allocations might occasionally be a sign of
something going wrong, they can also be perfectly legitimate.  The
change broke such legitimate uses.  We've found and "fixed" at least one
of them already (commit eb0b64f7, also reverted by this patch), and
another one just popped up: the change broke qcow2 images with virtual
disk size zero, i.e. images that don't hold real data but only VM state
of snapshots.

If a change breaks two uses, it probably breaks more.  As a quick check,
I reviewed the first six of more than 200 uses of qemu_mallocz(),
qemu_malloc() and qemu_realloc() that don't have an argument of the form
sizeof(...) or similar:

* load_elf32(), load_elf64() in hw/elf_ops.h:

    size = ehdr.e_phnum * sizeof(phdr[0]);
    lseek(fd, ehdr.e_phoff, SEEK_SET);
    phdr = qemu_mallocz(size);

  This aborts when the ELF file has no program header table, because
  then e_phnum is zero (TIS ELF 1.2 page 1-6).  Even if we require the
  ELF file to have a program header table, aborting is not an acceptable
  way to reject an unsuitable or malformed ELF file.

* load_elf32(), load_elf64() in hw/elf_ops.h:

            mem_size = ph->p_memsz;
            /* XXX: avoid allocating */
            data = qemu_mallocz(mem_size);

  This aborts when the segment occupies zero bytes in the image file
  (TIS ELF 1.2 page 2-2).

* bdrv_open2() in block.c:

    bs->opaque = qemu_mallocz(drv->instance_size);

  However, vvfat_write_target.instance_size is zero.  Not sure whether
  this actually bites or is "only" a time bomb.

* qemu_aio_get() in block.c:

        acb = qemu_mallocz(pool->aiocb_size);

  No existing instance of BlockDriverAIOCB has a zero aiocb_size.
  Probably okay.

* defaultallocator_create_displaysurface() in console.c has

    surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);

  With Xen, surface->linesize and surface->height come out of
  xenfb_configure_fb(), which set xenfb->width and xenfb->height to
  values obtained from the guest.  If a buggy guest sets one to zero, we
  abort.  Not an good way to handle that.

  Non-Xen uses aren't obviously correct either, but I didn't dig deeper.

* load_device_tree() in device_tree.c:

    *sizep = 0;
    dt_size = get_image_size(filename_path);
    if (dt_size < 0) {
        printf("Unable to get size of device tree file '%s'\n",
            filename_path);
        goto fail;
    }

    /* Expand to 2x size to give enough room for manipulation.  */
    dt_size *= 2;
    /* First allocate space in qemu for device tree */
    fdt = qemu_mallocz(dt_size);

  We abort if the image is empty.  Not an acceptable way to handle that.

Based on this sample, I'm confident there are many more uses broken by
the change.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 block/qcow2-snapshot.c |    5 -----
 qemu-malloc.c          |   14 ++------------
 2 files changed, 2 insertions(+), 17 deletions(-)

Comments

Avi Kivity Nov. 30, 2009, 2:01 p.m. UTC | #1
On 11/30/2009 03:55 PM, Markus Armbruster wrote:
> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
> from ISO C's malloc()&  friends.  Revert that, but take care never to
> return a null pointer, like malloc()&  friends may do (it's
> implementation defined), because that's another source of bugs.
>
> Rationale: while zero-sized allocations might occasionally be a sign of
> something going wrong, they can also be perfectly legitimate.  The
> change broke such legitimate uses.  We've found and "fixed" at least one
> of them already (commit eb0b64f7, also reverted by this patch), and
> another one just popped up: the change broke qcow2 images with virtual
> disk size zero, i.e. images that don't hold real data but only VM state
> of snapshots.
>    

I strongly agree with this patch.  A consistent API improves quality, 
especially when it is also consistent with people's expectations.
Kevin Wolf Nov. 30, 2009, 2:23 p.m. UTC | #2
Am 30.11.2009 14:55, schrieb Markus Armbruster:
> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
> from ISO C's malloc() & friends.  Revert that, but take care never to
> return a null pointer, like malloc() & friends may do (it's
> implementation defined), because that's another source of bugs.
> 
> [...]
> 
> Based on this sample, I'm confident there are many more uses broken by
> the change.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>

Acked-by: Kevin Wolf <kwolf@redhat.com>

We really should fix the root cause once and for all instead of working
around unexpected qemu_malloc behaviour each time when one of the time
bombs has blown up another (possibly production) VM.
Gerd Hoffmann Dec. 1, 2009, 12:40 p.m. UTC | #3
> diff --git a/qemu-malloc.c b/qemu-malloc.c
> index 295d185..aeeb78b 100644
> --- a/qemu-malloc.c
> +++ b/qemu-malloc.c
> @@ -44,22 +44,12 @@ void qemu_free(void *ptr)
>
>   void *qemu_malloc(size_t size)
>   {
> -    if (!size) {
> -        abort();
> -    }
> -    return oom_check(malloc(size));
> +    return oom_check(malloc(size ? size : 1));
>   }

You might want to have a 'static uint8_t zero_length_malloc[0]' and 
return that instead of the magic cookie '1'.  Makes the code more 
readable IMHO and you'll also have symbol in gdb when debugging qemu.

Even more advanced:  Make zero_length_malloc page-sized and 
page-aligned, then munmap int, so dereferencing it actually traps.

>   void *qemu_realloc(void *ptr, size_t size)
>   {
> +    return oom_check(realloc(ptr, size ? size : 1));

qemu_realloc(qemu_malloc(0), 42);

should better work correctly ...

Likewise qemu_free(qemu_malloc(0));

cheers,
   Gerd
Paul Brook Dec. 1, 2009, 12:57 p.m. UTC | #4
> You might want to have a 'static uint8_t zero_length_malloc[0]' and
> return that instead of the magic cookie '1'.  Makes the code more
> readable IMHO and you'll also have symbol in gdb when debugging qemu.

Having multiple malloc return the same pointer sounds like a really bad idea.

Paul
Gerd Hoffmann Dec. 1, 2009, 12:57 p.m. UTC | #5
On 12/01/09 13:40, Gerd Hoffmann wrote:

>> + return oom_check(malloc(size ? size : 1));

>> void *qemu_realloc(void *ptr, size_t size)
>> {
>> + return oom_check(realloc(ptr, size ? size : 1));
>
> qemu_realloc(qemu_malloc(0), 42);
>
> should better work correctly ...

Oops, scratch that.

"return malloc(size ? size : 1)" is very different from "return size ? 
malloc(size) : 1" which my mind somehow made out of the code line above ...

sorry,
   Gerd
Markus Armbruster Dec. 1, 2009, 1:11 p.m. UTC | #6
Gerd Hoffmann <kraxel@redhat.com> writes:

>> diff --git a/qemu-malloc.c b/qemu-malloc.c
>> index 295d185..aeeb78b 100644
>> --- a/qemu-malloc.c
>> +++ b/qemu-malloc.c
>> @@ -44,22 +44,12 @@ void qemu_free(void *ptr)
>>
>>   void *qemu_malloc(size_t size)
>>   {
>> -    if (!size) {
>> -        abort();
>> -    }
>> -    return oom_check(malloc(size));
>> +    return oom_check(malloc(size ? size : 1));
>>   }
>
> You might want to have a 'static uint8_t zero_length_malloc[0]' and
> return that instead of the magic cookie '1'.  Makes the code more
> readable IMHO and you'll also have symbol in gdb when debugging qemu.

Complicates qemu_realloc() and qemu_free() somewhat, and that makes me
think we better do it as a separate commit.  Agree?

> Even more advanced:  Make zero_length_malloc page-sized and
> page-aligned, then munmap int, so dereferencing it actually traps.

Overrunning a malloc'ed buffer very rarely traps, not sure catching this
special case is worth the portability headaches.  If you really want to
catch overruns, you need special tools like valgrind or electric fence
anyway.

>>   void *qemu_realloc(void *ptr, size_t size)
>>   {
>> +    return oom_check(realloc(ptr, size ? size : 1));
>
> qemu_realloc(qemu_malloc(0), 42);
>
> should better work correctly ...
>
> Likewise qemu_free(qemu_malloc(0));
Glauber Costa Dec. 1, 2009, 1:47 p.m. UTC | #7
On Tue, Dec 01, 2009 at 12:57:27PM +0000, Paul Brook wrote:
> > You might want to have a 'static uint8_t zero_length_malloc[0]' and
> > return that instead of the magic cookie '1'.  Makes the code more
> > readable IMHO and you'll also have symbol in gdb when debugging qemu.
> 
> Having multiple malloc return the same pointer sounds like a really bad idea.
And why's that?

Keep in mind that *any* dereference over that address is a bug.

Actually, I very much like Gerd's idea to unmap that address, so the bug
won't hide from us in any circumnstances.
Markus Armbruster Dec. 1, 2009, 2:08 p.m. UTC | #8
Glauber Costa <glommer@redhat.com> writes:

> On Tue, Dec 01, 2009 at 12:57:27PM +0000, Paul Brook wrote:
>> > You might want to have a 'static uint8_t zero_length_malloc[0]' and
>> > return that instead of the magic cookie '1'.  Makes the code more
>> > readable IMHO and you'll also have symbol in gdb when debugging qemu.
>> 
>> Having multiple malloc return the same pointer sounds like a really bad idea.
> And why's that?
>
> Keep in mind that *any* dereference over that address is a bug.
>
> Actually, I very much like Gerd's idea to unmap that address, so the bug
> won't hide from us in any circumnstances.

For what it's worth, it violates the spec for malloc().  For zero-sized
allocations, we may either return a null pointer (but we already decided
we don't want to), or an object different from any other object alive.
Thus, we can't return the same non-null pointer for all zero-sized
allocations.

Chapter and verse: ISO/IEC 9899:1999 7.20.3 Memory management functions

    The order and contiguity of storage allocated by successive calls to
    the calloc, malloc, and realloc functions is unspecified.  The
    pointer returned if the allocation succeeds is suitably aligned so
    that it may be assigned to a pointer to any type of object and then
    used to access such an object or an array of such objects in the
    space allocated (until the space is explicitly deallocated).  The
    lifetime of an allocated object extends from the allocation until
    the deallocation.  Each such allocation shall yield a pointer to an
    object disjoint from any other object.  The pointer returned points
    to the start (lowest byte address) of the allocated space.  If the
    space cannot be allocated, a null pointer is returned.  If the size
    of the space requested is zero, the behavior is implementation-
    defined: either a null pointer is returned, or the behavior is as if
    the size were some nonzero value, except that the returned pointer
    shall not be used to access an object.
Paul Brook Dec. 1, 2009, 2:21 p.m. UTC | #9
On Tuesday 01 December 2009, Glauber Costa wrote:
> On Tue, Dec 01, 2009 at 12:57:27PM +0000, Paul Brook wrote:
> > > You might want to have a 'static uint8_t zero_length_malloc[0]' and
> > > return that instead of the magic cookie '1'.  Makes the code more
> > > readable IMHO and you'll also have symbol in gdb when debugging qemu.
> >
> > Having multiple malloc return the same pointer sounds like a really bad
> > idea.
> 
> And why's that?
> 
> Keep in mind that *any* dereference over that address is a bug.

Dereferencing the address is a bug. However testing the addresses themselves 
for equality is valid. This is much the same reason I think returning NULL 
would be a bad idea.

Paul
Avi Kivity Dec. 1, 2009, 2:34 p.m. UTC | #10
On 12/01/2009 02:40 PM, Gerd Hoffmann wrote:
>
> Even more advanced:  Make zero_length_malloc page-sized and 
> page-aligned, then munmap int, so dereferencing it actually traps.

That will cause vma fragmentation.  Since we don't trap malloc(n)[n] for 
n > 0, we may as well not trap it for n = 0.
Gerd Hoffmann Dec. 1, 2009, 2:47 p.m. UTC | #11
On 12/01/09 15:08, Markus Armbruster wrote:
> For what it's worth, it violates the spec for malloc().  For zero-sized
> allocations, we may either return a null pointer (but we already decided
> we don't want to), or an object different from any other object alive.

Which clearly rules out the fixed memory location for malloc(0).

"malloc(size ? size : 1)" is indeed the easiest way to make sure we 
conform to the spec and also don't return NULL.

cheers,
   Gerd
Gerd Hoffmann Dec. 1, 2009, 2:53 p.m. UTC | #12
On 12/01/09 15:34, Avi Kivity wrote:
> On 12/01/2009 02:40 PM, Gerd Hoffmann wrote:
>>
>> Even more advanced: Make zero_length_malloc page-sized and
>> page-aligned, then munmap int, so dereferencing it actually traps.
>
> That will cause vma fragmentation.

I was thinking about a single unmapped page, which wouldn't cause 
fragmentation.  But that is a bad idea too for other reasons ...

cheers,
   Gerd
Eduardo Habkost Dec. 1, 2009, 3:32 p.m. UTC | #13
Excerpts from Markus Armbruster's message of Mon Nov 30 11:55:34 -0200 2009:
> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
> from ISO C's malloc() & friends.  Revert that, but take care never to
> return a null pointer, like malloc() & friends may do (it's
> implementation defined), because that's another source of bugs.
> 
> Rationale: while zero-sized allocations might occasionally be a sign of
> something going wrong, they can also be perfectly legitimate.  The
> change broke such legitimate uses.  We've found and "fixed" at least one
> of them already (commit eb0b64f7, also reverted by this patch), and
> another one just popped up: the change broke qcow2 images with virtual
> disk size zero, i.e. images that don't hold real data but only VM state
> of snapshots.
> 
> If a change breaks two uses, it probably breaks more.  As a quick check,
> I reviewed the first six of more than 200 uses of qemu_mallocz(),
> qemu_malloc() and qemu_realloc() that don't have an argument of the form
> sizeof(...) or similar:

Acked-by: Eduardo Habkost <ehabkost@redhat.com>

This also makes qemu_realloc(NULL, size) completely equivalent to
qemu_malloc(size), and that's a good thing.
Anthony Liguori Dec. 4, 2009, 4:49 p.m. UTC | #14
Markus Armbruster wrote:
> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
> from ISO C's malloc() & friends.  Revert that, but take care never to
> return a null pointer, like malloc() & friends may do (it's
> implementation defined), because that's another source of bugs.
>
> Rationale: while zero-sized allocations might occasionally be a sign of
> something going wrong, they can also be perfectly legitimate.  The
> change broke such legitimate uses.  We've found and "fixed" at least one
> of them already (commit eb0b64f7, also reverted by this patch), and
> another one just popped up: the change broke qcow2 images with virtual
> disk size zero, i.e. images that don't hold real data but only VM state
> of snapshots.
>   

I still believe that it is poor practice to pass size==0 to *malloc().  
I think actively discouraging this in qemu is a good thing because it's 
a broken idiom.

That said, we've had this change in for a while and it's painfully 
obviously we haven't eliminated all of these instances in qemu.  Knowing 
that we still have occurrences of this and actively exit()'ing when they 
happen is pretty much suicidal in a production environment.  It's not a 
bug, it's just a poor practice.

Here's what I propose.  I think we should introduce a 
CONFIG_DEBUG_ZERO_MALLOC.  When this is set, we should assert(size!=0).  
Otherwise, we should return malloc(1) as Markus' patch does below.

Using the same rules as we follow for -Werror, we should enable 
CONFIG_DEBUG_ZERO_MALLOC for the development tree and disable it for any 
releases.

This helps us make qemu better during development while not 
unnecessarily causing us harm in a production environment.  What happens 
here long term I think remains to be seen.  But for right now, I think 
this is an important change to make for 0.12.

> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  block/qcow2-snapshot.c |    5 -----
>  qemu-malloc.c          |   14 ++------------
>  2 files changed, 2 insertions(+), 17 deletions(-)
>
> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
> index 94cb838..e3b208c 100644
> --- a/block/qcow2-snapshot.c
> +++ b/block/qcow2-snapshot.c
> @@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
>      QCowSnapshot *sn;
>      int i;
>  
> -    if (!s->nb_snapshots) {
> -        *psn_tab = NULL;
> -        return s->nb_snapshots;
> -    }
> -
>   

This does not belong here.

Regards,

Anthony Liguori
Avi Kivity Dec. 5, 2009, 5:07 p.m. UTC | #15
On 12/04/2009 06:49 PM, Anthony Liguori wrote:
>
> I still believe that it is poor practice to pass size==0 to 
> *malloc().  I think actively discouraging this in qemu is a good thing 
> because it's a broken idiom.

Why?  Unless we have a separate array allocator (like C++'s new and 
new[]), we need to support zero-element arrays without pushing the 
burden to callers (in the same way that for () supports zero iteration 
loops without a separate if ()).
Anthony Liguori Dec. 5, 2009, 5:27 p.m. UTC | #16
Avi Kivity wrote:
> On 12/04/2009 06:49 PM, Anthony Liguori wrote:
>>
>> I still believe that it is poor practice to pass size==0 to 
>> *malloc().  I think actively discouraging this in qemu is a good 
>> thing because it's a broken idiom.
>
> Why?  Unless we have a separate array allocator (like C++'s new and 
> new[]), we need to support zero-element arrays without pushing the 
> burden to callers (in the same way that for () supports zero iteration 
> loops without a separate if ()).

If you call malloc(size=0), then it's impossible to differentiate 
between a failed malloc return and a valid result on certain platform 
(like AIX).

So the only correct way would be to write:

array = malloc(size);
if (array == NULL && size != 0) {
    return -ENOMEM;
}

If you were writing portable code.  But that's not what people write.  
You can argue that qemu_malloc() can have any semantics we want and 
while that's true, it doesn't change my above statement.  I think the 
main argument for this behavior in qemu is that people are used to using 
this idiom with malloc() but it's a non-portable practice.

If qemu_malloc() didn't carry the name "malloc()" then semantics with 
size=0 would be a different discussion.  But so far, all qemu_* 
functions tend to behave almost exactly like their C counterparts.  
Relying on the result of size=0 with malloc() is broken.

Regards,

Anthony Liguori
Blue Swirl Dec. 5, 2009, 5:28 p.m. UTC | #17
On Sat, Dec 5, 2009 at 5:07 PM, Avi Kivity <avi@redhat.com> wrote:
> On 12/04/2009 06:49 PM, Anthony Liguori wrote:
>>
>> I still believe that it is poor practice to pass size==0 to *malloc().  I
>> think actively discouraging this in qemu is a good thing because it's a
>> broken idiom.
>
> Why?  Unless we have a separate array allocator (like C++'s new and new[]),
> we need to support zero-element arrays without pushing the burden to callers
> (in the same way that for () supports zero iteration loops without a
> separate if ()).

Running a loop zero or nonzero number of times always has a very clear
and precise meaning. A pointer returned from allocating zero or
nonzero number of items may be completely unusable or usable,
respectively.

I think Laurent's proposal would work. We even could go so far as
rename the current function as qemu_malloc_possibly_broken (and adjust
callers mechanically) and introduce two new versions, which handle the
zero case in clearly advertised ways. Patches would fix the callers to
use the correct one.
Avi Kivity Dec. 5, 2009, 5:40 p.m. UTC | #18
On 12/05/2009 07:27 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> On 12/04/2009 06:49 PM, Anthony Liguori wrote:
>>>
>>> I still believe that it is poor practice to pass size==0 to 
>>> *malloc().  I think actively discouraging this in qemu is a good 
>>> thing because it's a broken idiom.
>>
>> Why?  Unless we have a separate array allocator (like C++'s new and 
>> new[]), we need to support zero-element arrays without pushing the 
>> burden to callers (in the same way that for () supports zero 
>> iteration loops without a separate if ()).
>
> If you call malloc(size=0), then it's impossible to differentiate 
> between a failed malloc return and a valid result on certain platform 
> (like AIX).
>
> So the only correct way would be to write:
>
> array = malloc(size);
> if (array == NULL && size != 0) {
>    return -ENOMEM;
> }
>
> If you were writing portable code.  But that's not what people write.  
> You can argue that qemu_malloc() can have any semantics we want and 
> while that's true, it doesn't change my above statement.  I think the 
> main argument for this behavior in qemu is that people are used to 
> using this idiom with malloc() but it's a non-portable practice.
>
> If qemu_malloc() didn't carry the name "malloc()" then semantics with 
> size=0 would be a different discussion.  But so far, all qemu_* 
> functions tend to behave almost exactly like their C counterparts.  
> Relying on the result of size=0 with malloc() is broken.

A zero-supporting qemu_malloc() is fully compatible with malloc(), we're 
only restricting the possible returns.  So we're not misleading any 
caller.  In fact, taking your argument to the extreme, a malloc 
implementation would need to

    if (size == 0) {
       if (rand() % 2) {
           return NULL;
       } else {
          size = 1;
       }
    }

Note that qemu_malloc() already restricts return values by not returning 
NULL on oom, according to your logic we shouldn't do this (and have two 
tests accompany any variable-size qemu_malloc()).
Avi Kivity Dec. 5, 2009, 5:44 p.m. UTC | #19
On 12/05/2009 07:28 PM, Blue Swirl wrote:
> On Sat, Dec 5, 2009 at 5:07 PM, Avi Kivity<avi@redhat.com>  wrote:
>    
>> On 12/04/2009 06:49 PM, Anthony Liguori wrote:
>>      
>>> I still believe that it is poor practice to pass size==0 to *malloc().  I
>>> think actively discouraging this in qemu is a good thing because it's a
>>> broken idiom.
>>>        
>> Why?  Unless we have a separate array allocator (like C++'s new and new[]),
>> we need to support zero-element arrays without pushing the burden to callers
>> (in the same way that for () supports zero iteration loops without a
>> separate if ()).
>>      
> Running a loop zero or nonzero number of times always has a very clear
> and precise meaning. A pointer returned from allocating zero or
> nonzero number of items may be completely unusable or usable,
> respectively.
>    

Only if you allocate using POSIX malloc().  If you allocate using a 
function that is defined to return a valid pointer for zero length 
allocations, you're happy.

> I think Laurent's proposal would work. We even could go so far as
> rename the current function as qemu_malloc_possibly_broken (and adjust
> callers mechanically) and introduce two new versions, which handle the
> zero case in clearly advertised ways. Patches would fix the callers to
> use the correct one

Good idea.  Let's name the function that returns a valid pointer 
qemu_malloc() (since that's what many callers expect anyway, and it's 
fully backwards compatible), and see who calls 
qemu_malloc_dont_call_me_with_zero().

Realistically, do we need two variants of malloc/realloc/free, or can we 
stick with one that works for callers with a minimum of fuss?
Anthony Liguori Dec. 5, 2009, 5:54 p.m. UTC | #20
Avi Kivity wrote:
> A zero-supporting qemu_malloc() is fully compatible with malloc(), 
> we're only restricting the possible returns.  So we're not misleading 
> any caller.  In fact, taking your argument to the extreme, a malloc 
> implementation would need to

This is really the crux of the whole argument.  You're arguing that 
while most people rely on incorrect idioms with malloc(), the problem is 
not the idioms themselves but the definition of malloc().  The opposing 
argument is that instead of providing a "fixed" version of malloc(), we 
should encourage people to use a proper idiom.

I dislike the entire notion of qemu_malloc().  I've always disliked the 
fact that it abort()s on OOM.  I'd rather see us use a normal malloc() 
and code to that malloc currently which means avoiding size=0 and 
checking NULL results.

However, this is all personal preference and I'd rather focus my energy 
on things that have true functional impact.  Markus raised a valid 
functional problem with the current implementation and I proposed a 
solution that would address that functional problem.  I'd rather see the 
discussion focus on the merits of that solution than revisiting whether 
ANSI got the semantics of malloc() correct in the standards definition.

Regards,

Anthony Liguori
Avi Kivity Dec. 5, 2009, 6:06 p.m. UTC | #21
On 12/05/2009 07:54 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> A zero-supporting qemu_malloc() is fully compatible with malloc(), 
>> we're only restricting the possible returns.  So we're not misleading 
>> any caller.  In fact, taking your argument to the extreme, a malloc 
>> implementation would need to
>
> This is really the crux of the whole argument.  You're arguing that 
> while most people rely on incorrect idioms with malloc(), the problem 
> is not the idioms themselves but the definition of malloc().  The 
> opposing argument is that instead of providing a "fixed" version of 
> malloc(), we should encourage people to use a proper idiom.

When we see a lengthy and error prone idiom we usually provide a 
wrapper.  That wrapper is qemu_malloc().  If you like, don't see it as a 
fixed malloc(), but as qemu's way of allocating memory which is totally 
independent from malloc().

>
> I dislike the entire notion of qemu_malloc().  I've always disliked 
> the fact that it abort()s on OOM.  I'd rather see us use a normal 
> malloc() and code to that malloc currently which means avoiding size=0 
> and checking NULL results.

I believe that's impossible.  Many times, when the guest writes to a 
register, you have no way to indicate failure.  Sometimes you can 
indicate failure (say, time out on IDE write requests) but will likely 
cause much damage to the guest.  Preallocating memory is likely to be 
very difficult and also very wasteful (since you'll have to account for 
the worst case).

Furthermore, Linux under its default configuration will never fail a 
small allocation, instead oom-killing a semi-random process.  So all 
that work will not help on that platform.

>
> However, this is all personal preference and I'd rather focus my 
> energy on things that have true functional impact.  Markus raised a 
> valid functional problem with the current implementation and I 
> proposed a solution that would address that functional problem.  I'd 
> rather see the discussion focus on the merits of that solution than 
> revisiting whether ANSI got the semantics of malloc() correct in the 
> standards definition.
>

Unless ANSI has a say on qemu_malloc(), I think it's worthwhile to get 
that right rather than wrapping every array caller with useless tests.
Laurent Desnogues Dec. 5, 2009, 6:16 p.m. UTC | #22
On Sat, Dec 5, 2009 at 6:44 PM, Avi Kivity <avi@redhat.com> wrote:
[...]
>
>> I think Laurent's proposal would work. We even could go so far as
>> rename the current function as qemu_malloc_possibly_broken (and adjust
>> callers mechanically) and introduce two new versions, which handle the
>> zero case in clearly advertised ways. Patches would fix the callers to
>> use the correct one
>
> Good idea.  Let's name the function that returns a valid pointer
> qemu_malloc() (since that's what many callers expect anyway, and it's fully
> backwards compatible), and see who calls
> qemu_malloc_dont_call_me_with_zero().

Yes, you're always free to follow poor coding rules by breaking strict
API.


Laurent
Anthony Liguori Dec. 5, 2009, 8:58 p.m. UTC | #23
Avi Kivity wrote:
> When we see a lengthy and error prone idiom we usually provide a 
> wrapper.  That wrapper is qemu_malloc().  If you like, don't see it as 
> a fixed malloc(), but as qemu's way of allocating memory which is 
> totally independent from malloc().

We constantly get patches with qemu_malloc() with a NULL check.  Then we 
tell people to remove the NULL check.  It feels very weird to ask people 
to remove error handling.

I can understand the argument that getting OOM right is very difficult 
but it's not impossible.

>>
>> However, this is all personal preference and I'd rather focus my 
>> energy on things that have true functional impact.  Markus raised a 
>> valid functional problem with the current implementation and I 
>> proposed a solution that would address that functional problem.  I'd 
>> rather see the discussion focus on the merits of that solution than 
>> revisiting whether ANSI got the semantics of malloc() correct in the 
>> standards definition.
>>
>
> Unless ANSI has a say on qemu_malloc(), I think it's worthwhile to get 
> that right rather than wrapping every array caller with useless tests.

If you're concerned about array allocation, introduce an array 
allocation function.  Honestly, there's very little reason to open code 
array allocation/manipulation at all.  We should either be using a list 
type or if we really need to, we should introduce a vector type.

Regards,

Anthony Liguori
Avi Kivity Dec. 5, 2009, 10:26 p.m. UTC | #24
On 12/05/2009 10:58 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> When we see a lengthy and error prone idiom we usually provide a 
>> wrapper.  That wrapper is qemu_malloc().  If you like, don't see it 
>> as a fixed malloc(), but as qemu's way of allocating memory which is 
>> totally independent from malloc().
>
> We constantly get patches with qemu_malloc() with a NULL check.  Then 
> we tell people to remove the NULL check.  It feels very weird to ask 
> people to remove error handling.

You prefer to explain to them how to do error handling correctly?

>
> I can understand the argument that getting OOM right is very difficult 
> but it's not impossible.

There are 755 calls to malloc in the code.  And practically every 
syscall can return ENOMEM, including the innocuous KVM_RUN ioctl().  
It's going to be pretty close to impossible to recover from malloc() 
failure, and impossible to recover from KVM_RUN failure (except by 
retrying, which you can assume the kernel already has).  All for 
something which never happens.  I propose that fixing OOM handling is 
going to introduce some errors into the non-error paths, and many errors 
into the error return paths, for zero benefit.

>
>>>
>>> However, this is all personal preference and I'd rather focus my 
>>> energy on things that have true functional impact.  Markus raised a 
>>> valid functional problem with the current implementation and I 
>>> proposed a solution that would address that functional problem.  I'd 
>>> rather see the discussion focus on the merits of that solution than 
>>> revisiting whether ANSI got the semantics of malloc() correct in the 
>>> standards definition.
>>>
>>
>> Unless ANSI has a say on qemu_malloc(), I think it's worthwhile to 
>> get that right rather than wrapping every array caller with useless 
>> tests.
>
> If you're concerned about array allocation, introduce an array 
> allocation function.  Honestly, there's very little reason to open 
> code array allocation/manipulation at all.  We should either be using 
> a list type or if we really need to, we should introduce a vector type.

A NEW(type) and ARRAY_NEW(type, count) marcros would improve type safety 
and plug a dormant buffer overflow due to multiplication overflow, yes.  
Even qemu_calloc() would be an improvement.  But having qemu_malloc() 
not fix the zero length array case which we know we have is 
irresponsible, IMO.
Ian Molton Dec. 5, 2009, 11:08 p.m. UTC | #25
Avi Kivity wrote:

> Only if you allocate using POSIX malloc().  If you allocate using a
> function that is defined to return a valid pointer for zero length
> allocations, you're happy.

Wouldnt it be better to, rather than use a qemu_malloc() that is utterly
counterintuitive in that it has no way to report failure, and behaves in
ways people dont expect, to use normal malloc() and never pass it 0 ?

Seriously, who does that anyway? why call malloc when you dont want the
space? so you can use realloc? 99.99% of the time realloc() is the Wrong
Solution(tm).

stick to what people know, and LART them for misuse of it if necessary.

(Just my 2p)

-Ian
Avi Kivity Dec. 5, 2009, 11:11 p.m. UTC | #26
On 12/06/2009 01:08 AM, Ian Molton wrote:
> Avi Kivity wrote:
>
>    
>> Only if you allocate using POSIX malloc().  If you allocate using a
>> function that is defined to return a valid pointer for zero length
>> allocations, you're happy.
>>      
> Wouldnt it be better to, rather than use a qemu_malloc() that is utterly
> counterintuitive in that it has no way to report failure, and behaves in
> ways people dont expect, to use normal malloc() and never pass it 0 ?
>    

It's not that it doesn't have a way to report failure, it's that it 
doesn't fail.  Do you prefer functions that fail and report it to 
functions that don't fail?

> Seriously, who does that anyway? why call malloc when you dont want the
> space? so you can use realloc? 99.99% of the time realloc() is the Wrong
> Solution(tm).
>    

Read the beginning of the thread.  Basically it's for arrays, malloc(n * 
sizeof(x)).

> stick to what people know, and LART them for misuse of it if necessary.
>    

The LART is a crash, great.
Ian Molton Dec. 5, 2009, 11:25 p.m. UTC | #27
Avi Kivity wrote:

> It's not that it doesn't have a way to report failure, it's that it
> doesn't fail.  Do you prefer functions that fail and report it to
> functions that don't fail?

You have a way of allocating memory that will _never_ fail?

>> Seriously, who does that anyway? why call malloc when you dont want the
>> space? so you can use realloc? 99.99% of the time realloc() is the Wrong
>> Solution(tm).
>>    
> Read the beginning of the thread.  Basically it's for arrays, malloc(n *
> sizeof(x)).

well, make sure n is not 0. Its not that hard. I dont think I've *ever*
had a situation where I wanted to pass 0 to malloc.

>> stick to what people know, and LART them for misuse of it if necessary.
>
> The LART is a crash, great.

No, the LART would be a 'your patch does this wrong, try this:'

-Ian
Markus Armbruster Dec. 6, 2009, 8:12 a.m. UTC | #28
Anthony Liguori <anthony@codemonkey.ws> writes:

> Avi Kivity wrote:
>> On 12/04/2009 06:49 PM, Anthony Liguori wrote:
>>>
>>> I still believe that it is poor practice to pass size==0 to
>>> *malloc().  I think actively discouraging this in qemu is a good
>>> thing because it's a broken idiom.
>>
>> Why?  Unless we have a separate array allocator (like C++'s new and
>> new[]), we need to support zero-element arrays without pushing the
>> burden to callers (in the same way that for () supports zero
>> iteration loops without a separate if ()).
>
> If you call malloc(size=0), then it's impossible to differentiate
> between a failed malloc return and a valid result on certain platform
> (like AIX).
>
> So the only correct way would be to write:
>
> array = malloc(size);
> if (array == NULL && size != 0) {
>    return -ENOMEM;
> }
>
> If you were writing portable code.  But that's not what people write.

Yup.  But

    if (n == 0)
        p = malloc(n * sizeof(struct foo))
    else
        p = NULL;

isn't what people write either.

> You can argue that qemu_malloc() can have any semantics we want and
> while that's true, it doesn't change my above statement.  I think the
> main argument for this behavior in qemu is that people are used to
> using this idiom with malloc() but it's a non-portable practice.

We've decided long ago to disable the trap you quoted by not returning
NULL for empty allocations.  That way it doesn't kill us when people
fail to call qemu_malloc() perfectly every time.

> If qemu_malloc() didn't carry the name "malloc()" then semantics with
> size=0 would be a different discussion.  But so far, all qemu_*
> functions tend to behave almost exactly like their C counterparts.

That's a reason for making qemu_malloc() work for zero arguments,
because its C counterpart does.

> Relying on the result of size=0 with malloc() is broken.

No.  Relying on non-null-ness is broken.  You can write perfectly
portable, working code without doing that.

    p = malloc(n * sizeof(struct foo);
    if (n && !p)
        exit_no_mem();
    for (i = 0; i < n; i++)
        compute_one(p, i);

With qemu_malloc(), the error handling moves into qemu_malloc():

    p = qemu_malloc(n * sizeof(struct foo);
    for (i = 0; i < n; i++)
        compute_one(p, i);
Markus Armbruster Dec. 6, 2009, 8:24 a.m. UTC | #29
Avi Kivity <avi@redhat.com> writes:

> A NEW(type) and ARRAY_NEW(type, count) marcros would improve type
> safety and plug a dormant buffer overflow due to multiplication
> overflow, yes.  Even qemu_calloc() would be an improvement.  But
> having qemu_malloc() not fix the zero length array case which we know
> we have is irresponsible, IMO.

Agree on all counts.
Avi Kivity Dec. 6, 2009, 1:07 p.m. UTC | #30
On 12/06/2009 01:25 AM, Ian Molton wrote:
> Avi Kivity wrote:
>
>    
>> It's not that it doesn't have a way to report failure, it's that it
>> doesn't fail.  Do you prefer functions that fail and report it to
>> functions that don't fail?
>>      
> You have a way of allocating memory that will _never_ fail?
>
>    

Sort of.  Did you look at the code?

>>> Seriously, who does that anyway? why call malloc when you dont want the
>>> space? so you can use realloc? 99.99% of the time realloc() is the Wrong
>>> Solution(tm).
>>>
>>>        
>> Read the beginning of the thread.  Basically it's for arrays, malloc(n *
>> sizeof(x)).
>>      
> well, make sure n is not 0. Its not that hard. I dont think I've *ever*
> had a situation where I wanted to pass 0 to malloc.
>    

There are multiple such cases in the code.

>>> stick to what people know, and LART them for misuse of it if necessary.
>>>        
>> The LART is a crash, great.
>>      
> No, the LART would be a 'your patch does this wrong, try this:'
>    

What about existing usage?  Will you audit all the existing calls?
Ian Molton Dec. 6, 2009, 4:52 p.m. UTC | #31
Markus Armbruster wrote:

>     p = malloc(n * sizeof(struct foo);
>     if (n && !p)
>         exit_no_mem();
>     for (i = 0; i < n; i++)
>         compute_one(p, i);
> 
> With qemu_malloc(), the error handling moves into qemu_malloc():
> 
>     p = qemu_malloc(n * sizeof(struct foo);
>     for (i = 0; i < n; i++)
>         compute_one(p, i);

And you lose the ability to fail gracefully...
Ian Molton Dec. 6, 2009, 4:58 p.m. UTC | #32
Avi Kivity wrote:
> On 12/06/2009 01:25 AM, Ian Molton wrote:
>> Avi Kivity wrote:
>>
>>   
>>> It's not that it doesn't have a way to report failure, it's that it
>>> doesn't fail.  Do you prefer functions that fail and report it to
>>> functions that don't fail?
>>>      
>> You have a way of allocating memory that will _never_ fail?
>
> Sort of. 

'sort of' never ?

> Did you look at the code?

Yes. Its hardly infallible.

>> well, make sure n is not 0. Its not that hard. I dont think I've *ever*
>> had a situation where I wanted to pass 0 to malloc.
> 
> There are multiple such cases in the code.
> 
>>>> stick to what people know, and LART them for misuse of it if necessary.
>>>>        
>>> The LART is a crash, great.
>>>      
>> No, the LART would be a 'your patch does this wrong, try this:'
> 
> What about existing usage?  Will you audit all the existing calls?

mark qemu_malloc as deprecated. don't include new patches that use it.
Plenty of time to fix the broken uses...

-Ian
Avi Kivity Dec. 6, 2009, 5:07 p.m. UTC | #33
On 12/06/2009 06:58 PM, Ian Molton wrote:
> Avi Kivity wrote:
>    
>> On 12/06/2009 01:25 AM, Ian Molton wrote:
>>      
>>> Avi Kivity wrote:
>>>
>>>
>>>        
>>>> It's not that it doesn't have a way to report failure, it's that it
>>>> doesn't fail.  Do you prefer functions that fail and report it to
>>>> functions that don't fail?
>>>>
>>>>          
>>> You have a way of allocating memory that will _never_ fail?
>>>        
>> Sort of.
>>      
> 'sort of' never ?
>
>    
>> Did you look at the code?
>>      
> Yes. Its hardly infallible.
>    

It will never fail on Linux.  On other hosts it prevents a broken oom 
handler from taking the guest down a death spiral.

>> What about existing usage?  Will you audit all the existing calls?
>>      
> mark qemu_malloc as deprecated. don't include new patches that use it.
> Plenty of time to fix the broken uses...
>    

Send patches.  I don't think it's realistic to handle OOM in qemu 
(handling n=0 is easy, but a lot of work for no real gain).
Avi Kivity Dec. 6, 2009, 5:14 p.m. UTC | #34
On 12/06/2009 06:52 PM, Ian Molton wrote:
> Markus Armbruster wrote:
>
>    
>>      p = malloc(n * sizeof(struct foo);
>>      if (n&&  !p)
>>          exit_no_mem();
>>      for (i = 0; i<  n; i++)
>>          compute_one(p, i);
>>
>> With qemu_malloc(), the error handling moves into qemu_malloc():
>>
>>      p = qemu_malloc(n * sizeof(struct foo);
>>      for (i = 0; i<  n; i++)
>>          compute_one(p, i);
>>      
> And you lose the ability to fail gracefully...
>    

We never had it.  Suppose p is allocated in response to an SCSI register 
write, and we allocate a scatter-gather list.  What do you do if you OOM?

1) Introduce an error path that works synchronously off the stack and so 
does not need to allocate memory
2) Don't allocate in the first place, always read guest memory and 
transform it for sglists, preallocate everything else
3) Have a per-thread emergency pool, stall the vcpu until all memory is 
returned to the emergency pool

all of these options are pretty horrible, either to the code or to the 
guest, for something that never happens.
malc Dec. 6, 2009, 5:45 p.m. UTC | #35
On Sun, 6 Dec 2009, Avi Kivity wrote:

> On 12/06/2009 06:52 PM, Ian Molton wrote:
> > Markus Armbruster wrote:
> > 
> >    
> > >      p = malloc(n * sizeof(struct foo);
> > >      if (n&&  !p)
> > >          exit_no_mem();
> > >      for (i = 0; i<  n; i++)
> > >          compute_one(p, i);
> > > 
> > > With qemu_malloc(), the error handling moves into qemu_malloc():
> > > 
> > >      p = qemu_malloc(n * sizeof(struct foo);
> > >      for (i = 0; i<  n; i++)
> > >          compute_one(p, i);
> > >      
> > And you lose the ability to fail gracefully...
> >    
> 
> We never had it.  Suppose p is allocated in response to an SCSI register
> write, and we allocate a scatter-gather list.  What do you do if you OOM?

Uh, please do not generalize.

See for instance 29ddf27b72960d6e6b115cd69812c9c57b2a7b13 the incident was
logged and debugged, no OOM, no abort. Not all scenarios admit this, but
claiming that there are none that do is incorrect.

> 
> 1) Introduce an error path that works synchronously off the stack and so does
> not need to allocate memory
> 2) Don't allocate in the first place, always read guest memory and transform
> it for sglists, preallocate everything else
> 3) Have a per-thread emergency pool, stall the vcpu until all memory is
> returned to the emergency pool
> 
> all of these options are pretty horrible, either to the code or to the guest,
> for something that never happens.
> 
>
malc Dec. 6, 2009, 5:47 p.m. UTC | #36
On Sun, 6 Dec 2009, Avi Kivity wrote:

> On 12/06/2009 06:58 PM, Ian Molton wrote:
> > Avi Kivity wrote:
> >    
> > > On 12/06/2009 01:25 AM, Ian Molton wrote:
> > >      
> > > > Avi Kivity wrote:
> > > > 
> > > > 
> > > >        
> > > > > It's not that it doesn't have a way to report failure, it's that it
> > > > > doesn't fail.  Do you prefer functions that fail and report it to
> > > > > functions that don't fail?
> > > > > 
> > > > >          
> > > > You have a way of allocating memory that will _never_ fail?
> > > >        
> > > Sort of.
> > >      
> > 'sort of' never ?
> > 
> >    
> > > Did you look at the code?
> > >      
> > Yes. Its hardly infallible.
> >    
> 
> It will never fail on Linux.  On other hosts it prevents a broken oom handler
> from taking the guest down a death spiral.

It fails here all the time i'm sorry to say, i have overcommit disabled
(mostly because kpdf when doing a text search tends to overwhelm the VM
subsystem and Linux happily picks X11 as it's OOM kill target)

> 
> > > What about existing usage?  Will you audit all the existing calls?
> > >      
> > mark qemu_malloc as deprecated. don't include new patches that use it.
> > Plenty of time to fix the broken uses...
> >    
> 
> Send patches.  I don't think it's realistic to handle OOM in qemu (handling
> n=0 is easy, but a lot of work for no real gain).
>
Avi Kivity Dec. 6, 2009, 5:59 p.m. UTC | #37
On 12/06/2009 07:47 PM, malc wrote:
>
>> It will never fail on Linux.  On other hosts it prevents a broken oom handler
>> from taking the guest down a death spiral.
>>      
> It fails here all the time i'm sorry to say, i have overcommit disabled
> (mostly because kpdf when doing a text search tends to overwhelm the VM
> subsystem and Linux happily picks X11 as it's OOM kill target)
>    

Right, I meant under the default configuration.  Unfortunately there is 
no good configuration for Linux - without strict overcommit you'll 
invoke the oom killer, with strict overcommit you'll need ridiculous 
amounts of swap because fork() and MAP_PRIVATE require tons of backing 
store.

On my home machine I have

   $ grep Commit /proc/meminfo
   CommitLimit:     7235160 kB
   Committed_AS:    4386172 kB

So, 4GB of virtual memory needed just to run a normal desktop with 
thunderbird+firefox.  Actual anonymous memory is less than 2GB, so you 
could easily run this workload on a 2GB machine without swap, but with 
strict overcommit 2GB RAM + 2GB swap will see failures even though you 
have free RAM.
Avi Kivity Dec. 6, 2009, 6:02 p.m. UTC | #38
On 12/06/2009 07:45 PM, malc wrote:
>
>    
>>> And you lose the ability to fail gracefully...
>>>
>>>        
>> We never had it.  Suppose p is allocated in response to an SCSI register
>> write, and we allocate a scatter-gather list.  What do you do if you OOM?
>>      
> Uh, please do not generalize.
>
>    

Sorry.

> See for instance 29ddf27b72960d6e6b115cd69812c9c57b2a7b13 the incident was
> logged and debugged, no OOM, no abort. Not all scenarios admit this, but
> claiming that there are none that do is incorrect.
>    

Init is pretty easy to handle.  I'm worried about runtime where you 
can't report an error to the guest.  Real hardware doesn't oom.
malc Dec. 6, 2009, 6:09 p.m. UTC | #39
On Sun, 6 Dec 2009, Avi Kivity wrote:

> On 12/06/2009 07:47 PM, malc wrote:
> > 
> > > It will never fail on Linux.  On other hosts it prevents a broken oom
> > > handler
> > > from taking the guest down a death spiral.
> > >      
> > It fails here all the time i'm sorry to say, i have overcommit disabled
> > (mostly because kpdf when doing a text search tends to overwhelm the VM
> > subsystem and Linux happily picks X11 as it's OOM kill target)
> >    
> 
> Right, I meant under the default configuration.  Unfortunately there is no
> good configuration for Linux - without strict overcommit you'll invoke the oom
> killer, with strict overcommit you'll need ridiculous amounts of swap because
> fork() and MAP_PRIVATE require tons of backing store.
> 
> On my home machine I have
> 
>   $ grep Commit /proc/meminfo
>   CommitLimit:     7235160 kB
>   Committed_AS:    4386172 kB
> 
> So, 4GB of virtual memory needed just to run a normal desktop with
> thunderbird+firefox.  Actual anonymous memory is less than 2GB, so you could
> easily run this workload on a 2GB machine without swap, but with strict
> overcommit 2GB RAM + 2GB swap will see failures even though you have free RAM.
> 

Well, i don't have a swap...

~$ cat /proc/meminfo 
MemTotal:       515708 kB
MemFree:          3932 kB
Buffers:         10120 kB
Cached:         365320 kB
SwapCached:          0 kB
Active:         238120 kB
Inactive:       222396 kB
SwapTotal:           0 kB
SwapFree:            0 kB
...
CommitLimit:    464136 kB
Committed_AS:   178920 kB
...
malc Dec. 6, 2009, 6:12 p.m. UTC | #40
On Sun, 6 Dec 2009, Avi Kivity wrote:

> On 12/06/2009 07:45 PM, malc wrote:
> > 
> >    
> > > > And you lose the ability to fail gracefully...
> > > > 
> > > >        
> > > We never had it.  Suppose p is allocated in response to an SCSI register
> > > write, and we allocate a scatter-gather list.  What do you do if you OOM?
> > >      
> > Uh, please do not generalize.
> > 
> >    
> 
> Sorry.
> 
> > See for instance 29ddf27b72960d6e6b115cd69812c9c57b2a7b13 the incident was
> > logged and debugged, no OOM, no abort. Not all scenarios admit this, but
> > claiming that there are none that do is incorrect.
> >    
> 
> Init is pretty easy to handle.  I'm worried about runtime where you can't
> report an error to the guest.  Real hardware doesn't oom.

Here, i do agree, but mostly because most of the users of allocation
functions just themselves returned NULL or -1 or whatever and never
really bothered to report anything, so the addition of OOM check that
you've added made the code less cluttered.
Avi Kivity Dec. 6, 2009, 6:16 p.m. UTC | #41
On 12/06/2009 08:09 PM, malc wrote:
>
> Well, i don't have a swap...
>
> ~$ cat /proc/meminfo
> MemTotal:       515708 kB
> MemFree:          3932 kB
> Buffers:         10120 kB
> Cached:         365320 kB
> SwapCached:          0 kB
> Active:         238120 kB
> Inactive:       222396 kB
> SwapTotal:           0 kB
> SwapFree:            0 kB
> ...
> CommitLimit:    464136 kB
> Committed_AS:   178920 kB
> ...
>    

Out of curiosity, what desktop and apps are you running?  Here firefox 
takes 1.3GB virt size and 377MB RSS, that alone could overwhelm your system.

[avi@firebolt qemu-kvm (master)]$ cat /proc/2698/status
Name:    firefox
State:    S (sleeping)
Tgid:    2698
Pid:    2698
PPid:    2686
TracerPid:    0
Uid:    500    500    500    500
Gid:    500    500    500    500
Utrace:    0
FDSize:    256
Groups:    500 502 512
VmPeak:     1478804 kB
VmSize:     1288100 kB
VmLck:           0 kB
VmHWM:      434788 kB
VmRSS:      386296 kB
VmData:      676384 kB
VmStk:         248 kB
VmExe:          88 kB
VmLib:       62504 kB
VmPTE:        2312 kB
Avi Kivity Dec. 6, 2009, 6:19 p.m. UTC | #42
On 12/06/2009 08:12 PM, malc wrote:
>
>> Init is pretty easy to handle.  I'm worried about runtime where you can't
>> report an error to the guest.  Real hardware doesn't oom.
>>      
> Here, i do agree, but mostly because most of the users of allocation
> functions just themselves returned NULL or -1 or whatever and never
> really bothered to report anything, so the addition of OOM check that
> you've added made the code less cluttered.
>    

My point is that it would take a major rework, and would probably 
involve removing the allocations instead of handling any errors.  For 
example, a scsi device would tell the block device the upper bound of 
aiocbs it could possibly issue, and the maximum number of sg elements in 
a request, and qcow2 (or any other backing format driver) would 
preallocate enough resources to satisfy the worst case.  And we still 
can't handle a syscall returning ENOMEM.
malc Dec. 6, 2009, 6:21 p.m. UTC | #43
On Sun, 6 Dec 2009, Avi Kivity wrote:

> On 12/06/2009 08:09 PM, malc wrote:
> > 
> > Well, i don't have a swap...
> > 
> > ~$ cat /proc/meminfo
> > MemTotal:       515708 kB
> > MemFree:          3932 kB
> > Buffers:         10120 kB
> > Cached:         365320 kB
> > SwapCached:          0 kB
> > Active:         238120 kB
> > Inactive:       222396 kB
> > SwapTotal:           0 kB
> > SwapFree:            0 kB
> > ...
> > CommitLimit:    464136 kB
> > Committed_AS:   178920 kB
> > ...
> >    
> 
> Out of curiosity, what desktop and apps are you running?  Here firefox takes
> 1.3GB virt size and 377MB RSS, that alone could overwhelm your system.

Most of the time my setup is:

IceWM - 4 virtual desktops
XEmacs
Seamonkey 1.1.16
IceDock
A bunch of rxvts

> 
> [avi@firebolt qemu-kvm (master)]$ cat /proc/2698/status
> Name:    firefox
> State:    S (sleeping)
> Tgid:    2698
> Pid:    2698
> PPid:    2686
> TracerPid:    0
> Uid:    500    500    500    500
> Gid:    500    500    500    500
> Utrace:    0
> FDSize:    256
> Groups:    500 502 512
> VmPeak:     1478804 kB
> VmSize:     1288100 kB
> VmLck:           0 kB
> VmHWM:      434788 kB
> VmRSS:      386296 kB
> VmData:      676384 kB
> VmStk:         248 kB
> VmExe:          88 kB
> VmLib:       62504 kB
> VmPTE:        2312 kB
> 

~$ cat /proc/$(pidof seamonkey-bin)/status
Name:   seamonkey-bin
State:  S (sleeping)
Tgid:   2332
Pid:    2332
PPid:   1
TracerPid:      0
Uid:    1000    1000    1000    1000
Gid:    100     100     100     100
FDSize: 256
Groups: 10 11 17 18 19 100 
VmPeak:   151848 kB
VmSize:   150976 kB
VmLck:         0 kB
VmHWM:     57032 kB
VmRSS:     56260 kB
VmData:   105740 kB
VmStk:        84 kB
VmExe:       296 kB
VmLib:     34472 kB
VmPTE:       136 kB
Threads:        5
[...]
Jamie Lokier Dec. 6, 2009, 6:31 p.m. UTC | #44
Ian Molton wrote:
> > Read the beginning of the thread.  Basically it's for arrays, malloc(n *
> > sizeof(x)).
> 
> well, make sure n is not 0. Its not that hard. I dont think I've *ever*
> had a situation where I wanted to pass 0 to malloc.

I would like to remind everyone that sizeof(x) can be 0 too.  For
example, on Linux sizeof(spinlock_t) == 0 on UP.  Anything where you
have a bunch of structure fields which depend on compile time
configuration, or where a type might be replaced by a stub empty
structure, is a possible sizeof(x) == 0.

> Its not that hard.

The fact is there are a number of bugs in qemu where n == 0 is not
checked prior to calling qemu_malloc() at the moment.  None of them
are "hard" to fix - they are rare cases that nobody noticed when
writing them.

Until we have code analysis tools checking for that, bugs of that kind
will probably keep arising.

-- Jamie
Jamie Lokier Dec. 6, 2009, 6:36 p.m. UTC | #45
Avi Kivity wrote:
> A NEW(type) and ARRAY_NEW(type, count) marcros would improve type safety
> and plug a dormant buffer overflow due to multiplication overflow, yes.  
> Even qemu_calloc() would be an improvement.

In my code I regularly use type_alloc(type) and type_free(type, ptr),
giving type safety at both ends (and possibility to optimise
allocations, but that's separate).

If you have ARRAY_NEW(type, count) which permits count to be zero and
returns a non-NULL result, I wonder, why is it ok to convert zero
count to a guaranteed non-NULL unique result, but not do that for
sizeof(type) (or just size)?

-- Jamie
malc Dec. 6, 2009, 6:41 p.m. UTC | #46
On Sun, 6 Dec 2009, Avi Kivity wrote:

> On 12/06/2009 08:12 PM, malc wrote:
> > 
> > > Init is pretty easy to handle.  I'm worried about runtime where you can't
> > > report an error to the guest.  Real hardware doesn't oom.
> > >      
> > Here, i do agree, but mostly because most of the users of allocation
> > functions just themselves returned NULL or -1 or whatever and never
> > really bothered to report anything, so the addition of OOM check that
> > you've added made the code less cluttered.
> >    
> 
> My point is that it would take a major rework, and would probably involve
> removing the allocations instead of handling any errors.  For example, a scsi
> device would tell the block device the upper bound of aiocbs it could possibly
> issue, and the maximum number of sg elements in a request, and qcow2 (or any
> other backing format driver) would preallocate enough resources to satisfy the
> worst case.  And we still can't handle a syscall returning ENOMEM.
> 

Sure. My point is that sometimes failure to allocate is due to bugs,
invalid input etc, and conversion to OOM aborts en masse, might have
not been the best possible route to take, but most likely it was better
than doing nothing.
Ian Molton Dec. 6, 2009, 10:38 p.m. UTC | #47
Avi Kivity wrote:

> Init is pretty easy to handle.  I'm worried about runtime where you
> can't report an error to the guest.  Real hardware doesn't oom.

In the case of the socket reconnect code I posted recently, if the
allocation failed, it would give up trying to reconnect and inform the
user of that chardev that it had closed. Ok, this doesnt help the guest,
but it allows other code to clean up nicely, and we can report the
failure to the host. IMHO thats better than leaving a sysadmin
scratching their head wondering why it suddenly just stopped feeding the
guest entropy and isnt trying to reconnect anymore...

(IMO)

-Ian
Ian Molton Dec. 6, 2009, 10:40 p.m. UTC | #48
malc wrote:

> Well, i don't have a swap...

Indeed, nor do I (machine has an NFS root. swap on NFS is... not good.)
Jamie Lokier Dec. 7, 2009, 2:51 a.m. UTC | #49
Ian Molton wrote:
> Avi Kivity wrote:
> 
> > Init is pretty easy to handle.  I'm worried about runtime where you
> > can't report an error to the guest.  Real hardware doesn't oom.
> 
> In the case of the socket reconnect code I posted recently, if the
> allocation failed, it would give up trying to reconnect and inform the
> user of that chardev that it had closed. Ok, this doesnt help the guest,
> but it allows other code to clean up nicely, and we can report the
> failure to the host. IMHO thats better than leaving a sysadmin
> scratching their head wondering why it suddenly just stopped feeding the
> guest entropy and isnt trying to reconnect anymore...

If the system as a whole runs out of memory so that no-overcommit
malloc() fails on a small alloc, there's a good chance that you won't
be able to send a message to the host (how do you format the QMP
message without malloc?), and if you do manage that, there's a good
chance the host won't be able to receive it (it can't malloc either),
and if it does manage to receive the message, you can be almost
certain that it won't be able to run any GUI operations, send mail,
etc. to inform the admin.

The chances of the path "qemu small alloc -> chardev error -> send QMP
message -> receive QMP message -> parse QMG message -> do something
useful (log/email/UI)" having fully preallocated buffers for every
step, including a preallocated emergency pool for the buffers used by
QMG formatting and parsing, so that it gets all the way past the last
step are very slim indeed.

There's no point writing the code for the first steps, if it's
intractable to make the later steps do something useful.

Btw, as an admin I would really rather the socket reconnection code
keeps trying in that circumstance, if qemu does not simply fall over
due to alloc failing for something else soon after.  The most likely
scenario, imho in a server like that, is to notice it is running out
of memory and kill the real cause (e.g. another runaway process), then
restart all daemons which have died.  I'm not going to notice a
non-fatal message (in the unlikely event it is propagated all the way
up) because there are plenty of other non-fatal messages in normal
use, multiplied by hundreds of guests (across a cluster).  Or, if you
mean the chardev closing causes qemu to terminate - what's the
difference from the current qemu_malloc() behaviour?

I'd rather it behaves like a broken HWRNG if it can't get host
entropy: Don't provide data, and let the guest decide what to do, just
like it does for a broken HWRNG.  Except virtio-rng can report
unavailability rather than simply being broken :-)

-- Jamie
Kevin Wolf Dec. 7, 2009, 8:48 a.m. UTC | #50
Am 05.12.2009 18:27, schrieb Anthony Liguori:
> If qemu_malloc() didn't carry the name "malloc()" then semantics with 
> size=0 would be a different discussion.  But so far, all qemu_* 
> functions tend to behave almost exactly like their C counterparts.  
> Relying on the result of size=0 with malloc() is broken.

Then let's rename it as qemu_getmem, do a sed run over the whole code
and have sane semantics. :-)

Kevin
Ian Molton Dec. 7, 2009, 9:39 a.m. UTC | #51
Jamie Lokier wrote:

> If the system as a whole runs out of memory so that no-overcommit
> malloc() fails on a small alloc, there's a good chance that you won't
> be able to send a message to the host

Send what message to the host? If the malloc in the socet reconnect code
fails, its the code on the host thats going to flag up that malloc failed.

> and if it does manage to receive the message, you can be almost
> certain that it won't be able to run any GUI operations, send mail,
> etc. to inform the admin.

OTOH, If all it does it log it to a file, theres a fair chance it might
succeed.

> There's no point writing the code for the first steps, if it's
> intractable to make the later steps do something useful.

OTOH, a simple printed warning, and closing the socket are fairly likely
to work.

> Btw, as an admin I would really rather the socket reconnection code
> keeps trying in that circumstance, if qemu does not simply fall over
> due to alloc failing for something else soon after.

Surely better to keep running by dropping nonessential services so that
the guest might get a chance to shut down or the host might recover.

> I'd rather it behaves like a broken HWRNG if it can't get host
> entropy: Don't provide data, and let the guest decide what to do, just
> like it does for a broken HWRNG.

It does.

>  Except virtio-rng can report unavailability rather than simply being broken :-)

It could, in theory.

-Ian
Markus Armbruster Dec. 7, 2009, 9:45 a.m. UTC | #52
Ian Molton <ian.molton@collabora.co.uk> writes:

> Markus Armbruster wrote:
>
>>     p = malloc(n * sizeof(struct foo);
>>     if (n && !p)
>>         exit_no_mem();
>>     for (i = 0; i < n; i++)
>>         compute_one(p, i);
>> 
>> With qemu_malloc(), the error handling moves into qemu_malloc():
>> 
>>     p = qemu_malloc(n * sizeof(struct foo);
>>     for (i = 0; i < n; i++)
>>         compute_one(p, i);
>
> And you lose the ability to fail gracefully...

That's a deliberate choice.  It has its drawbacks, it has its
advantages.  And it's not related to the question at hand: permitting
zero arguments.
Avi Kivity Dec. 7, 2009, 9:47 a.m. UTC | #53
On 12/06/2009 08:41 PM, malc wrote:
> Sure. My point is that sometimes failure to allocate is due to bugs,
> invalid input etc, and conversion to OOM aborts en masse, might have
> not been the best possible route to take, but most likely it was better
> than doing nothing.
>    

I agree.  Early oom handling does limit opportunities for recovery in 
the cases where it is possible/easy.  We can have an alternative API 
that doesn't do oom handling for those cases where it is desirable.
Paolo Bonzini Dec. 7, 2009, 9:55 a.m. UTC | #54
On 12/07/2009 10:39 AM, Ian Molton wrote:
> Send what message to the host? If the malloc in the socet reconnect code
> fails, its the code on the host thats going to flag up that malloc failed.

He means to management code.

>> >  and if it does manage to receive the message, you can be almost
>> >  certain that it won't be able to run any GUI operations, send mail,
>> >  etc. to inform the admin.
>
> OTOH, If all it does it log it to a file, theres a fair chance it might
> succeed.

Not necessarily, for example fprintf may fail.  There _may_ be a fair 
chance to succeed if you use write(2) directly, but that's it basically, 
and ENOMEM is always there waiting for you...

Paolo
Markus Armbruster Dec. 7, 2009, 9:56 a.m. UTC | #55
Jamie Lokier <jamie@shareable.org> writes:

> Ian Molton wrote:
>> > Read the beginning of the thread.  Basically it's for arrays, malloc(n *
>> > sizeof(x)).
>> 
>> well, make sure n is not 0. Its not that hard. I dont think I've *ever*
>> had a situation where I wanted to pass 0 to malloc.
>
> I would like to remind everyone that sizeof(x) can be 0 too.  For
> example, on Linux sizeof(spinlock_t) == 0 on UP.  Anything where you
> have a bunch of structure fields which depend on compile time
> configuration, or where a type might be replaced by a stub empty
> structure, is a possible sizeof(x) == 0.

Therefore, outlawing qemu_malloc(0) impacts abstract data types: given
some abstract type T, about which you're not supposed to assume
anything, you can't do qemu_malloc(sizeof(T)).

If anybody here is into outlawing scary zeroes, types with size zero
looks like another one to outlaw.

>> Its not that hard.
>
> The fact is there are a number of bugs in qemu where n == 0 is not
> checked prior to calling qemu_malloc() at the moment.  None of them
> are "hard" to fix - they are rare cases that nobody noticed when
> writing them.

Hard or not so hard, what would "fixing" them buy us?  I keep asking
this question, no takers so far.

> Until we have code analysis tools checking for that, bugs of that kind
> will probably keep arising.

Correct.
Kevin Wolf Dec. 7, 2009, 10:20 a.m. UTC | #56
Am 07.12.2009 10:47, schrieb Avi Kivity:
> On 12/06/2009 08:41 PM, malc wrote:
>> Sure. My point is that sometimes failure to allocate is due to bugs,
>> invalid input etc, and conversion to OOM aborts en masse, might have
>> not been the best possible route to take, but most likely it was better
>> than doing nothing.
>>    
> 
> I agree.  Early oom handling does limit opportunities for recovery in 
> the cases where it is possible/easy.  We can have an alternative API 
> that doesn't do oom handling for those cases where it is desirable.

You could simply use normal malloc as an alternative API there. After
all, the only thing that qemu_malloc adds is the OOM check (and
currently also the zero check).

Kevin
malc Dec. 7, 2009, 11:30 a.m. UTC | #57
On Mon, 30 Nov 2009, Markus Armbruster wrote:

> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
> from ISO C's malloc() & friends.  Revert that, but take care never to
> return a null pointer, like malloc() & friends may do (it's
> implementation defined), because that's another source of bugs.
> 
> Rationale: while zero-sized allocations might occasionally be a sign of
> something going wrong, they can also be perfectly legitimate.  The
> change broke such legitimate uses.  We've found and "fixed" at least one
> of them already (commit eb0b64f7, also reverted by this patch), and
> another one just popped up: the change broke qcow2 images with virtual
> disk size zero, i.e. images that don't hold real data but only VM state
> of snapshots.
> 
> If a change breaks two uses, it probably breaks more.  As a quick check,
> I reviewed the first six of more than 200 uses of qemu_mallocz(),
> qemu_malloc() and qemu_realloc() that don't have an argument of the form
> sizeof(...) or similar:

Bottom line: your point is that there are benign uses of zero allocation.
There are, at the expense of memory consumption/fragmentation and useless
code which, as your investigation shows, involve syscalls and what not.

Outlook from my side of the fence: no one audited the code, no one
knows that all of the uses are benign, abort gives the best automatic
way to know for sure one way or the other.

Rationale for keeping it as is: zero-sized allocations - overwhelming
majority of the times, are not anticipated and not well understood,
aborting gives us the ability to eliminate them in an automatic
fashion.

> 
> * load_elf32(), load_elf64() in hw/elf_ops.h:
> 
>     size = ehdr.e_phnum * sizeof(phdr[0]);
>     lseek(fd, ehdr.e_phoff, SEEK_SET);
>     phdr = qemu_mallocz(size);
> 
>   This aborts when the ELF file has no program header table, because
>   then e_phnum is zero (TIS ELF 1.2 page 1-6).  Even if we require the
>   ELF file to have a program header table, aborting is not an acceptable
>   way to reject an unsuitable or malformed ELF file.

If there's a malformed ELF files that must be supported (and by supported
- user notification is meant) then things should be checked, because having
gargantuan size will force qemu_mallocz to abort via OOM check (even
with Linux and overcommit, since malloc will fallback to mmap which
will fail) and this argument falls apart.


> * load_elf32(), load_elf64() in hw/elf_ops.h:
> 
>             mem_size = ph->p_memsz;
>             /* XXX: avoid allocating */
>             data = qemu_mallocz(mem_size);
> 
>   This aborts when the segment occupies zero bytes in the image file
>   (TIS ELF 1.2 page 2-2).
> 
> * bdrv_open2() in block.c:
> 
>     bs->opaque = qemu_mallocz(drv->instance_size);
> 
>   However, vvfat_write_target.instance_size is zero.  Not sure whether
>   this actually bites or is "only" a time bomb.
> 
> * qemu_aio_get() in block.c:
> 
>         acb = qemu_mallocz(pool->aiocb_size);
> 
>   No existing instance of BlockDriverAIOCB has a zero aiocb_size.
>   Probably okay.
> 
> * defaultallocator_create_displaysurface() in console.c has
> 
>     surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);
> 
>   With Xen, surface->linesize and surface->height come out of
>   xenfb_configure_fb(), which set xenfb->width and xenfb->height to
>   values obtained from the guest.  If a buggy guest sets one to zero, we
>   abort.  Not an good way to handle that.

What is? Let's suppose height is zero, without explicit checks, user
will never know why his screen doesn't update, with abort he will at
least know that something is very wrong.

>   Non-Xen uses aren't obviously correct either, but I didn't dig deeper.
> 
> * load_device_tree() in device_tree.c:
> 
>     *sizep = 0;
>     dt_size = get_image_size(filename_path);
>     if (dt_size < 0) {
>         printf("Unable to get size of device tree file '%s'\n",
>             filename_path);
>         goto fail;
>     }
> 
>     /* Expand to 2x size to give enough room for manipulation.  */
>     dt_size *= 2;
>     /* First allocate space in qemu for device tree */
>     fdt = qemu_mallocz(dt_size);
> 
>   We abort if the image is empty.  Not an acceptable way to handle that.
> 
> Based on this sample, I'm confident there are many more uses broken by
> the change.

Based on this sample i'm confident, we can eventually fix them should those
become the problem.

> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  block/qcow2-snapshot.c |    5 -----
>  qemu-malloc.c          |   14 ++------------
>  2 files changed, 2 insertions(+), 17 deletions(-)
> 
> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
> index 94cb838..e3b208c 100644
> --- a/block/qcow2-snapshot.c
> +++ b/block/qcow2-snapshot.c
> @@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
>      QCowSnapshot *sn;
>      int i;
>  
> -    if (!s->nb_snapshots) {
> -        *psn_tab = NULL;
> -        return s->nb_snapshots;
> -    }
> -
>      sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
>      for(i = 0; i < s->nb_snapshots; i++) {
>          sn_info = sn_tab + i;
> diff --git a/qemu-malloc.c b/qemu-malloc.c
> index 295d185..aeeb78b 100644
> --- a/qemu-malloc.c
> +++ b/qemu-malloc.c
> @@ -44,22 +44,12 @@ void qemu_free(void *ptr)
>  
>  void *qemu_malloc(size_t size)
>  {
> -    if (!size) {
> -        abort();
> -    }
> -    return oom_check(malloc(size));
> +    return oom_check(malloc(size ? size : 1));
>  }
>  
>  void *qemu_realloc(void *ptr, size_t size)
>  {
> -    if (size) {
> -        return oom_check(realloc(ptr, size));
> -    } else {
> -        if (ptr) {
> -            return realloc(ptr, size);
> -        }
> -    }
> -    abort();
> +    return oom_check(realloc(ptr, size ? size : 1));
>  }

http://groups.google.com/group/comp.std.c/browse_thread/thread/4e9af8847613d71f/6f75ad22e0768a0b?q=realloc++group:comp.std.c#6f75ad22e0768a0b
Avi Kivity Dec. 7, 2009, 1:28 p.m. UTC | #58
On 12/07/2009 11:55 AM, Paolo Bonzini wrote:
>> OTOH, If all it does it log it to a file, theres a fair chance it might
>> succeed.
>
>
> Not necessarily, for example fprintf may fail.  There _may_ be a fair 
> chance to succeed if you use write(2) directly, but that's it 
> basically, and ENOMEM is always there waiting for you...

Right, if malloc() failed write(2) is likely to fail as well.  More 
likely, in fact, since malloc() may have unused process memory to draw 
upon whereas write(2) can only allocate kernel memory.
Markus Armbruster Dec. 7, 2009, 2:45 p.m. UTC | #59
malc <av1474@comtv.ru> writes:

> On Mon, 30 Nov 2009, Markus Armbruster wrote:
>
>> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
>> from ISO C's malloc() & friends.  Revert that, but take care never to
>> return a null pointer, like malloc() & friends may do (it's
>> implementation defined), because that's another source of bugs.
>> 
>> Rationale: while zero-sized allocations might occasionally be a sign of
>> something going wrong, they can also be perfectly legitimate.  The
>> change broke such legitimate uses.  We've found and "fixed" at least one
>> of them already (commit eb0b64f7, also reverted by this patch), and
>> another one just popped up: the change broke qcow2 images with virtual
>> disk size zero, i.e. images that don't hold real data but only VM state
>> of snapshots.
>> 
>> If a change breaks two uses, it probably breaks more.  As a quick check,
>> I reviewed the first six of more than 200 uses of qemu_mallocz(),
>> qemu_malloc() and qemu_realloc() that don't have an argument of the form
>> sizeof(...) or similar:
>
> Bottom line: your point is that there are benign uses of zero allocation.
> There are, at the expense of memory consumption/fragmentation and useless
> code which, as your investigation shows, involve syscalls and what not.

I doubt the performance impact is relevant, but since you insist on
discussing it...

First and foremost, any performance argument not backed by measurements
is speculative.

Potential zero allocation is common in code.  Actual zero allocation is
rare at runtime.  The amount of memory consumed is therefore utterly
trivial compared to total dynamic memory use.  Likewise, time spent in
allocating is dwarved by time spent in other invocations of malloc()
several times over.

Adding a special case for avoiding zero allocation is not free.  Unless
zero allocations are sufficiently frequent, that cost exceeds the cost
of the rare zero allocation.

> Outlook from my side of the fence: no one audited the code, no one
> knows that all of the uses are benign, abort gives the best automatic
> way to know for sure one way or the other.
>
> Rationale for keeping it as is: zero-sized allocations - overwhelming
> majority of the times, are not anticipated and not well understood,
> aborting gives us the ability to eliminate them in an automatic
> fashion.

Keeping it as is releasing with known crash bugs, and a known pattern
for unknown crash bugs.  Is that what you want us to do?  I doubt it.

I grant you that there may be allocations we expect never to be empty,
and where things break when they are.  Would you concede that there are
allocations that work just fine when empty?

If you do, we end up with three kinds of allocations: known empty bad,
known empty fine, unknown.  Aborting on known empty bad is okay with me.
Aborting on unknown in developement builds is okay with me, too.  I
don't expect it to be a successful way to find bugs, because empty
allocations are rare.

Initially, all allocations should be treated as "unknown".  I want a way
to mark an allocation as "known empty fine".  I figure you want a way to
mark "known empty bad".

>> * load_elf32(), load_elf64() in hw/elf_ops.h:
>> 
>>     size = ehdr.e_phnum * sizeof(phdr[0]);
>>     lseek(fd, ehdr.e_phoff, SEEK_SET);
>>     phdr = qemu_mallocz(size);
>> 
>>   This aborts when the ELF file has no program header table, because
>>   then e_phnum is zero (TIS ELF 1.2 page 1-6).  Even if we require the
>>   ELF file to have a program header table, aborting is not an acceptable
>>   way to reject an unsuitable or malformed ELF file.
>
> If there's a malformed ELF files that must be supported (and by supported
> - user notification is meant) then things should be checked, because having
> gargantuan size will force qemu_mallocz to abort via OOM check (even
> with Linux and overcommit, since malloc will fallback to mmap which
> will fail) and this argument falls apart.

ELF files with empty PHT are *not* malformed.  Empty PHT is explicitly
permitted by ELF TIS.  Likewise for empty segments.  I already gave you
chapter & verse.

>> * load_elf32(), load_elf64() in hw/elf_ops.h:
>> 
>>             mem_size = ph->p_memsz;
>>             /* XXX: avoid allocating */
>>             data = qemu_mallocz(mem_size);
>> 
>>   This aborts when the segment occupies zero bytes in the image file
>>   (TIS ELF 1.2 page 2-2).
>> 
>> * bdrv_open2() in block.c:
>> 
>>     bs->opaque = qemu_mallocz(drv->instance_size);
>> 
>>   However, vvfat_write_target.instance_size is zero.  Not sure whether
>>   this actually bites or is "only" a time bomb.
>> 
>> * qemu_aio_get() in block.c:
>> 
>>         acb = qemu_mallocz(pool->aiocb_size);
>> 
>>   No existing instance of BlockDriverAIOCB has a zero aiocb_size.
>>   Probably okay.
>> 
>> * defaultallocator_create_displaysurface() in console.c has
>> 
>>     surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);
>> 
>>   With Xen, surface->linesize and surface->height come out of
>>   xenfb_configure_fb(), which set xenfb->width and xenfb->height to
>>   values obtained from the guest.  If a buggy guest sets one to zero, we
>>   abort.  Not an good way to handle that.
>
> What is? Let's suppose height is zero, without explicit checks, user
> will never know why his screen doesn't update, with abort he will at
> least know that something is very wrong.

My point isn't that permitting malloc(0) is the best way to handle it.
It's a better way than aborting, though.

I'm not arguing against treating case 0 specially ever.

>>   Non-Xen uses aren't obviously correct either, but I didn't dig deeper.
>> 
>> * load_device_tree() in device_tree.c:
>> 
>>     *sizep = 0;
>>     dt_size = get_image_size(filename_path);
>>     if (dt_size < 0) {
>>         printf("Unable to get size of device tree file '%s'\n",
>>             filename_path);
>>         goto fail;
>>     }
>> 
>>     /* Expand to 2x size to give enough room for manipulation.  */
>>     dt_size *= 2;
>>     /* First allocate space in qemu for device tree */
>>     fdt = qemu_mallocz(dt_size);
>> 
>>   We abort if the image is empty.  Not an acceptable way to handle that.
>> 
>> Based on this sample, I'm confident there are many more uses broken by
>> the change.
>
> Based on this sample i'm confident, we can eventually fix them should those
> become the problem.

You broke working code.  I'm glad you're confident we can fix it
"eventually".  What about 0.12?  Ship it broken?

>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  block/qcow2-snapshot.c |    5 -----
>>  qemu-malloc.c          |   14 ++------------
>>  2 files changed, 2 insertions(+), 17 deletions(-)
>> 
>> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
>> index 94cb838..e3b208c 100644
>> --- a/block/qcow2-snapshot.c
>> +++ b/block/qcow2-snapshot.c
>> @@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
>>      QCowSnapshot *sn;
>>      int i;
>>  
>> -    if (!s->nb_snapshots) {
>> -        *psn_tab = NULL;
>> -        return s->nb_snapshots;
>> -    }
>> -
>>      sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
>>      for(i = 0; i < s->nb_snapshots; i++) {
>>          sn_info = sn_tab + i;
>> diff --git a/qemu-malloc.c b/qemu-malloc.c
>> index 295d185..aeeb78b 100644
>> --- a/qemu-malloc.c
>> +++ b/qemu-malloc.c
>> @@ -44,22 +44,12 @@ void qemu_free(void *ptr)
>>  
>>  void *qemu_malloc(size_t size)
>>  {
>> -    if (!size) {
>> -        abort();
>> -    }
>> -    return oom_check(malloc(size));
>> +    return oom_check(malloc(size ? size : 1));
>>  }
>>  
>>  void *qemu_realloc(void *ptr, size_t size)
>>  {
>> -    if (size) {
>> -        return oom_check(realloc(ptr, size));
>> -    } else {
>> -        if (ptr) {
>> -            return realloc(ptr, size);
>> -        }
>> -    }
>> -    abort();
>> +    return oom_check(realloc(ptr, size ? size : 1));
>>  }
>
> http://groups.google.com/group/comp.std.c/browse_thread/thread/4e9af8847613d71f/6f75ad22e0768a0b?q=realloc++group:comp.std.c#6f75ad22e0768a0b

--verbose ?
Anthony Liguori Dec. 7, 2009, 3:50 p.m. UTC | #60
Markus Armbruster wrote:
> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
> from ISO C's malloc() & friends.  Revert that, but take care never to
> return a null pointer, like malloc() & friends may do (it's
> implementation defined), because that's another source of bugs.
>   

While it's always fun to argue about standards interpretation, I wanted 
to capture some action items from the discussion that I think there is 
agreement about.  Since I want to make changes for 0.12, I think it 
would be best to try and settle these now so we can do this before -rc2.

For 0.12.0-rc2:

I will send out a patch tonight or tomorrow changing qemu_malloc() to 
return malloc(1) when size=0 only for production builds (via 
--enable-zero-mallocs).  Development trees will maintain their current 
behavior.

For 0.13:

Someone (Marcus?) will introduce four new allocation functions.

type *qemu_new(type, n_types);
type *qemu_new0(type, n_types);

type *qemu_renew(type, mem, n_types);
type *qemu_renew0(type, mem, n_types);

NB: The names are borrowed from glib.  I'm not tied to them.

Will do our best to convert old code to use these functions where ever 
possible.  New code will be required to use these functions unless not 
possible.  n_types==0 is valid.  sizeof(type)==0 is valid.  Both 
circumstances return a unique non-NULL pointer.  If memory allocation 
fails, execution will abort.

The existing qemu_malloc() will maintain it's current behavior (with the 
exception of the relaxed size==0 assertion for production releases).

Does anyone object to this moving forward?

Regards,

Anthony Liguori
Avi Kivity Dec. 7, 2009, 4 p.m. UTC | #61
On 12/07/2009 05:50 PM, Anthony Liguori wrote:
>
> While it's always fun to argue about standards interpretation, I 
> wanted to capture some action items from the discussion that I think 
> there is agreement about.  Since I want to make changes for 0.12, I 
> think it would be best to try and settle these now so we can do this 
> before -rc2.
>
> For 0.12.0-rc2:
>
> I will send out a patch tonight or tomorrow changing qemu_malloc() to 
> return malloc(1) when size=0 only for production builds (via 
> --enable-zero-mallocs).  Development trees will maintain their current 
> behavior.
>

Since active development is ceasing on 0.12, I'd suggest not having 
separate behaviour for devel and production.  Do we want patches for 
n==0 array allocations at this time?

I'd really like to see Markus' patch applied.

> For 0.13:
>
> Someone (Marcus?) will introduce four new allocation functions.
>
> type *qemu_new(type, n_types);
> type *qemu_new0(type, n_types);
>
> type *qemu_renew(type, mem, n_types);
> type *qemu_renew0(type, mem, n_types);
>

I'd like to see separate functions for arrays and single objects, to 
avoid ", 1)" everywhere.

   qemu_new()
   qemu_new0()

   qemu_new_array()
   qemu_new_array0()
   qemu_renew_array()
   qemu_renew_array0()

In addition, Markus' patch should be applied to master to avoid 
regressions while the code is converted.
Anthony Liguori Dec. 7, 2009, 4:06 p.m. UTC | #62
Avi Kivity wrote:
> On 12/07/2009 05:50 PM, Anthony Liguori wrote:
>>
>> While it's always fun to argue about standards interpretation, I 
>> wanted to capture some action items from the discussion that I think 
>> there is agreement about.  Since I want to make changes for 0.12, I 
>> think it would be best to try and settle these now so we can do this 
>> before -rc2.
>>
>> For 0.12.0-rc2:
>>
>> I will send out a patch tonight or tomorrow changing qemu_malloc() to 
>> return malloc(1) when size=0 only for production builds (via 
>> --enable-zero-mallocs).  Development trees will maintain their 
>> current behavior.
>>
>
> Since active development is ceasing on 0.12, I'd suggest not having 
> separate behaviour for devel and production.  Do we want patches for 
> n==0 array allocations at this time?

Covering every qemu_malloc instance this close to the GA is too risky.  
I agree that having separate behavior is less than ideal but I think 
it's the only sane way forward.

> I'd really like to see Markus' patch applied.

For 0.12, that doesn't seem like a possibility.

>> For 0.13:
>>
>> Someone (Marcus?) will introduce four new allocation functions.
>>
>> type *qemu_new(type, n_types);
>> type *qemu_new0(type, n_types);
>>
>> type *qemu_renew(type, mem, n_types);
>> type *qemu_renew0(type, mem, n_types);
>>
>
> I'd like to see separate functions for arrays and single objects, to 
> avoid ", 1)" everywhere.
>
>   qemu_new()
>   qemu_new0()
>
>   qemu_new_array()
>   qemu_new_array0()
>   qemu_renew_array()
>   qemu_renew_array0()

Like I said, I'm not tied to naming.  I'll defer this to whoever 
contributes the patch and signs up for the conversion work.

> In addition, Markus' patch should be applied to master to avoid 
> regressions while the code is converted.

Let's separate that discussion as it's an independent consideration.

Regards,

Anthony Liguori
Avi Kivity Dec. 7, 2009, 4:11 p.m. UTC | #63
On 12/07/2009 06:06 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> On 12/07/2009 05:50 PM, Anthony Liguori wrote:
>>>
>>> While it's always fun to argue about standards interpretation, I 
>>> wanted to capture some action items from the discussion that I think 
>>> there is agreement about.  Since I want to make changes for 0.12, I 
>>> think it would be best to try and settle these now so we can do this 
>>> before -rc2.
>>>
>>> For 0.12.0-rc2:
>>>
>>> I will send out a patch tonight or tomorrow changing qemu_malloc() 
>>> to return malloc(1) when size=0 only for production builds (via 
>>> --enable-zero-mallocs).  Development trees will maintain their 
>>> current behavior.
>>>
>>
>> Since active development is ceasing on 0.12, I'd suggest not having 
>> separate behaviour for devel and production.  Do we want patches for 
>> n==0 array allocations at this time?
>
> Covering every qemu_malloc instance this close to the GA is too 
> risky.  I agree that having separate behavior is less than ideal but I 
> think it's the only sane way forward.
>

I don't understand why.  What's so insane about Markus' patch?  Allowing 
size=0 for both developer and production builds?

It seems like the least risky, least change approach to me.  Exactly 
what we want for 0.12.

>> I'd really like to see Markus' patch applied.
>
> For 0.12, that doesn't seem like a possibility.

Please explain why.

>
>> In addition, Markus' patch should be applied to master to avoid 
>> regressions while the code is converted.
>
> Let's separate that discussion as it's an independent consideration.
>

I've asked for qemu-malloc-discuss@vger.kernel.org to be created for 
this purpose.
Anthony Liguori Dec. 7, 2009, 4:20 p.m. UTC | #64
Avi Kivity wrote:
>> Covering every qemu_malloc instance this close to the GA is too 
>> risky.  I agree that having separate behavior is less than ideal but 
>> I think it's the only sane way forward.
>>
>
> I don't understand why.  What's so insane about Markus' patch?  
> Allowing size=0 for both developer and production builds?

There is a bug here.  Callers are calling qemu_malloc incorrectly.

There is an open discussion about how to address it--fix all callers of 
qemu_malloc() or allow size=0.  Since there isn't agreement, a 
compromise of sticking to the current behavior for the development tree, 
and using the later for production since we can't guarantee the former 
seems reasonable.

> It seems like the least risky, least change approach to me.  Exactly 
> what we want for 0.12.

The risk is that everyone will agree to this approach in the next two 
weeks.  I'm fairly certain no amount of discussion on qemu-devel is 
going to lead to that.

>>> In addition, Markus' patch should be applied to master to avoid 
>>> regressions while the code is converted.
>>
>> Let's separate that discussion as it's an independent consideration.
>>
>
> I've asked for qemu-malloc-discuss@vger.kernel.org to be created for 
> this purpose.

:-)

Regards,

Anthony Liguori
Paul Brook Dec. 7, 2009, 4:24 p.m. UTC | #65
> type *qemu_new(type, n_types);
> type *qemu_new0(type, n_types);
> 
> type *qemu_renew(type, mem, n_types);
> type *qemu_renew0(type, mem, n_types);

It always annoys me having to specify element count for things that aren't 
arrays.

I suggestion a single object allocation function, and an array allocation 
function that allows zero length arrays. Possibly also a must-not-be-zero 
array form if we think it's important.

Note that conversion to object/type based allocation is not always 
straightforward because inheritance means we don't have the final object type 
when doing the allocation.

Paul
Avi Kivity Dec. 7, 2009, 4:26 p.m. UTC | #66
On 12/07/2009 06:20 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>>> Covering every qemu_malloc instance this close to the GA is too 
>>> risky.  I agree that having separate behavior is less than ideal but 
>>> I think it's the only sane way forward.
>>>
>>
>> I don't understand why.  What's so insane about Markus' patch?  
>> Allowing size=0 for both developer and production builds?
>
> There is a bug here.  Callers are calling qemu_malloc incorrectly.
>
> There is an open discussion about how to address it--fix all callers 
> of qemu_malloc() or allow size=0.  Since there isn't agreement, a 
> compromise of sticking to the current behavior for the development 
> tree, and using the later for production since we can't guarantee the 
> former seems reasonable.

If we apply the patch, the callers are no longer incorrect.  Since we're 
winding down development on that tree, I see no reason for the 
production build to be correct and the development tree to be incorrect.

>> It seems like the least risky, least change approach to me.  Exactly 
>> what we want for 0.12.
>
> The risk is that everyone will agree to this approach in the next two 
> weeks.  I'm fairly certain no amount of discussion on qemu-devel is 
> going to lead to that.

You've already agreed to it for the production build.  There's no gain 
in having separate behaviour for development and production, when a tree 
is headed towards production.
Anthony Liguori Dec. 7, 2009, 4:27 p.m. UTC | #67
Paul Brook wrote:
>> type *qemu_new(type, n_types);
>> type *qemu_new0(type, n_types);
>>
>> type *qemu_renew(type, mem, n_types);
>> type *qemu_renew0(type, mem, n_types);
>>     
>
> It always annoys me having to specify element count for things that aren't 
> arrays.
>
> I suggestion a single object allocation function, and an array allocation 
> function that allows zero length arrays. Possibly also a must-not-be-zero 
> array form if we think it's important.
>   

Sure.

> Note that conversion to object/type based allocation is not always 
> straightforward because inheritance means we don't have the final object type 
> when doing the allocation.
>   

That's if you have the base object do the allocation, yup.

> Paul
>
Avi Kivity Dec. 7, 2009, 4:28 p.m. UTC | #68
On 12/07/2009 06:24 PM, Paul Brook wrote:
> Note that conversion to object/type based allocation is not always
> straightforward because inheritance means we don't have the final object type
> when doing the allocation.
>    

Instead of specifying the size, we can specify a constructor function 
(we usually do have an .init).  It's still dissatisfying in that it has 
to duplicate the call to whatever it ends up being called, though.
Anthony Liguori Dec. 7, 2009, 4:32 p.m. UTC | #69
Avi Kivity wrote:
> On 12/07/2009 06:20 PM, Anthony Liguori wrote:
>> Avi Kivity wrote:
>>>> Covering every qemu_malloc instance this close to the GA is too 
>>>> risky.  I agree that having separate behavior is less than ideal 
>>>> but I think it's the only sane way forward.
>>>>
>>>
>>> I don't understand why.  What's so insane about Markus' patch?  
>>> Allowing size=0 for both developer and production builds?
>>
>> There is a bug here.  Callers are calling qemu_malloc incorrectly.
>>
>> There is an open discussion about how to address it--fix all callers 
>> of qemu_malloc() or allow size=0.  Since there isn't agreement, a 
>> compromise of sticking to the current behavior for the development 
>> tree, and using the later for production since we can't guarantee the 
>> former seems reasonable.
>
> If we apply the patch, the callers are no longer incorrect.  Since 
> we're winding down development on that tree, I see no reason for the 
> production build to be correct and the development tree to be incorrect.

The problem with this whole discussion is taking the position that one 
form is "correct" while the other form is "incorrect".  It's not so 
black and white.

I don't think a reasonable person can claim that either form of 
qemu_malloc() is absolutely correct while the other is incorrect.  You 
may think one is better than the other but clearly, that sentiment is 
not universally shared.

If we change the production build, we eliminate the immediate problem.  
For 0.13, we can reduce the usage of qemu_malloc() such that it stops 
becoming an issue.  Then no one ever has to agree about which form is 
better and we can move on with our lives.

Regards,

Anthony Liguori
Avi Kivity Dec. 7, 2009, 4:37 p.m. UTC | #70
On 12/07/2009 06:32 PM, Anthony Liguori wrote:
>> If we apply the patch, the callers are no longer incorrect.  Since 
>> we're winding down development on that tree, I see no reason for the 
>> production build to be correct and the development tree to be incorrect.
>
>
> The problem with this whole discussion is taking the position that one 
> form is "correct" while the other form is "incorrect".  It's not so 
> black and white.
>
> I don't think a reasonable person can claim that either form of 
> qemu_malloc() is absolutely correct while the other is incorrect.  You 
> may think one is better than the other but clearly, that sentiment is 
> not universally shared.
>

I'm less concerned about qemu_malloc() and more about qemu itself 
(though qemu_malloc() as patched fulfils all the standard's requirements).

> If we change the production build, we eliminate the immediate problem. 

What about developers that hit the assert?  Do they send patches that 
fix code that works in production just so they can run in developer mode?

> For 0.13, we can reduce the usage of qemu_malloc() such that it stops 
> becoming an issue.  Then no one ever has to agree about which form is 
> better and we can move on with our lives.

I agree, and the proposed changes are an improvement in their own right.
malc Dec. 7, 2009, 4:55 p.m. UTC | #71
On Mon, 7 Dec 2009, Markus Armbruster wrote:

> malc <av1474@comtv.ru> writes:
> 
> > On Mon, 30 Nov 2009, Markus Armbruster wrote:
> >
> >> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
> >> from ISO C's malloc() & friends.  Revert that, but take care never to
> >> return a null pointer, like malloc() & friends may do (it's
> >> implementation defined), because that's another source of bugs.
> >> 
> >> Rationale: while zero-sized allocations might occasionally be a sign of
> >> something going wrong, they can also be perfectly legitimate.  The
> >> change broke such legitimate uses.  We've found and "fixed" at least one
> >> of them already (commit eb0b64f7, also reverted by this patch), and
> >> another one just popped up: the change broke qcow2 images with virtual
> >> disk size zero, i.e. images that don't hold real data but only VM state
> >> of snapshots.
> >> 
> >> If a change breaks two uses, it probably breaks more.  As a quick check,
> >> I reviewed the first six of more than 200 uses of qemu_mallocz(),
> >> qemu_malloc() and qemu_realloc() that don't have an argument of the form
> >> sizeof(...) or similar:
> >
> > Bottom line: your point is that there are benign uses of zero allocation.
> > There are, at the expense of memory consumption/fragmentation and useless
> > code which, as your investigation shows, involve syscalls and what not.
> 
> I doubt the performance impact is relevant, but since you insist on
> discussing it...
> 
> First and foremost, any performance argument not backed by measurements
> is speculative.
> 
> Potential zero allocation is common in code.  Actual zero allocation is
> rare at runtime.  The amount of memory consumed is therefore utterly
> trivial compared to total dynamic memory use.  Likewise, time spent in
> allocating is dwarved by time spent in other invocations of malloc()
> several times over.
> 
> Adding a special case for avoiding zero allocation is not free.  Unless
> zero allocations are sufficiently frequent, that cost exceeds the cost
> of the rare zero allocation.

I bet you dollars to donuts that testing for zero before calling malloc
is cheaper than the eventual test that is done inside it.

> 
> > Outlook from my side of the fence: no one audited the code, no one
> > knows that all of the uses are benign, abort gives the best automatic
> > way to know for sure one way or the other.
> >
> > Rationale for keeping it as is: zero-sized allocations - overwhelming
> > majority of the times, are not anticipated and not well understood,
> > aborting gives us the ability to eliminate them in an automatic
> > fashion.
> 
> Keeping it as is releasing with known crash bugs, and a known pattern
> for unknown crash bugs.  Is that what you want us to do?  I doubt it.

Yes, it's better than having _silent_ bugs, which, furthermore, have a
good potential of mainfesting themselves a lot further from the "bug"
site.

> 
> I grant you that there may be allocations we expect never to be empty,
> and where things break when they are.  Would you concede that there are
> allocations that work just fine when empty?

I wont dispute that.

> 
> If you do, we end up with three kinds of allocations: known empty bad,
> known empty fine, unknown.  Aborting on known empty bad is okay with me.
> Aborting on unknown in developement builds is okay with me, too.  I
> don't expect it to be a successful way to find bugs, because empty
> allocations are rare.
> 
> Initially, all allocations should be treated as "unknown".  I want a way
> to mark an allocation as "known empty fine".  I figure you want a way to
> mark "known empty bad".
> 
> >> * load_elf32(), load_elf64() in hw/elf_ops.h:
> >> 
> >>     size = ehdr.e_phnum * sizeof(phdr[0]);
> >>     lseek(fd, ehdr.e_phoff, SEEK_SET);
> >>     phdr = qemu_mallocz(size);
> >> 
> >>   This aborts when the ELF file has no program header table, because
> >>   then e_phnum is zero (TIS ELF 1.2 page 1-6).  Even if we require the
> >>   ELF file to have a program header table, aborting is not an acceptable
> >>   way to reject an unsuitable or malformed ELF file.
> >
> > If there's a malformed ELF files that must be supported (and by supported
> > - user notification is meant) then things should be checked, because having
> > gargantuan size will force qemu_mallocz to abort via OOM check (even
> > with Linux and overcommit, since malloc will fallback to mmap which
> > will fail) and this argument falls apart.
> 
> ELF files with empty PHT are *not* malformed.  Empty PHT is explicitly
> permitted by ELF TIS.  Likewise for empty segments.  I already gave you
> chapter & verse.

Uh, it's you who brought the whole malformed issue.

> 
> >> * load_elf32(), load_elf64() in hw/elf_ops.h:
> >> 
> >>             mem_size = ph->p_memsz;
> >>             /* XXX: avoid allocating */
> >>             data = qemu_mallocz(mem_size);
> >> 
> >>   This aborts when the segment occupies zero bytes in the image file
> >>   (TIS ELF 1.2 page 2-2).
> >> 
> >> * bdrv_open2() in block.c:
> >> 
> >>     bs->opaque = qemu_mallocz(drv->instance_size);
> >> 
> >>   However, vvfat_write_target.instance_size is zero.  Not sure whether
> >>   this actually bites or is "only" a time bomb.
> >> 
> >> * qemu_aio_get() in block.c:
> >> 
> >>         acb = qemu_mallocz(pool->aiocb_size);
> >> 
> >>   No existing instance of BlockDriverAIOCB has a zero aiocb_size.
> >>   Probably okay.
> >> 
> >> * defaultallocator_create_displaysurface() in console.c has
> >> 
> >>     surface->data = (uint8_t*) qemu_mallocz(surface->linesize * surface->height);
> >> 
> >>   With Xen, surface->linesize and surface->height come out of
> >>   xenfb_configure_fb(), which set xenfb->width and xenfb->height to
> >>   values obtained from the guest.  If a buggy guest sets one to zero, we
> >>   abort.  Not an good way to handle that.
> >
> > What is? Let's suppose height is zero, without explicit checks, user
> > will never know why his screen doesn't update, with abort he will at
> > least know that something is very wrong.
> 
> My point isn't that permitting malloc(0) is the best way to handle it.
> It's a better way than aborting, though.

And this is precisely the gist of our disagreement.

> 
> I'm not arguing against treating case 0 specially ever.
> 
> >>   Non-Xen uses aren't obviously correct either, but I didn't dig deeper.
> >> 
> >> * load_device_tree() in device_tree.c:
> >> 
> >>     *sizep = 0;
> >>     dt_size = get_image_size(filename_path);
> >>     if (dt_size < 0) {
> >>         printf("Unable to get size of device tree file '%s'\n",
> >>             filename_path);
> >>         goto fail;
> >>     }
> >> 
> >>     /* Expand to 2x size to give enough room for manipulation.  */
> >>     dt_size *= 2;
> >>     /* First allocate space in qemu for device tree */
> >>     fdt = qemu_mallocz(dt_size);
> >> 
> >>   We abort if the image is empty.  Not an acceptable way to handle that.
> >> 
> >> Based on this sample, I'm confident there are many more uses broken by
> >> the change.
> >
> > Based on this sample i'm confident, we can eventually fix them should those
> > become the problem.
> 
> You broke working code.  I'm glad you're confident we can fix it
> "eventually".  What about 0.12?  Ship it broken?

I'm fixing those as they arrive.

> 
> >> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >> ---
> >>  block/qcow2-snapshot.c |    5 -----
> >>  qemu-malloc.c          |   14 ++------------
> >>  2 files changed, 2 insertions(+), 17 deletions(-)
> >> 
> >> diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
> >> index 94cb838..e3b208c 100644
> >> --- a/block/qcow2-snapshot.c
> >> +++ b/block/qcow2-snapshot.c
> >> @@ -381,11 +381,6 @@ int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
> >>      QCowSnapshot *sn;
> >>      int i;
> >>  
> >> -    if (!s->nb_snapshots) {
> >> -        *psn_tab = NULL;
> >> -        return s->nb_snapshots;
> >> -    }
> >> -
> >>      sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
> >>      for(i = 0; i < s->nb_snapshots; i++) {
> >>          sn_info = sn_tab + i;
> >> diff --git a/qemu-malloc.c b/qemu-malloc.c
> >> index 295d185..aeeb78b 100644
> >> --- a/qemu-malloc.c
> >> +++ b/qemu-malloc.c
> >> @@ -44,22 +44,12 @@ void qemu_free(void *ptr)
> >>  
> >>  void *qemu_malloc(size_t size)
> >>  {
> >> -    if (!size) {
> >> -        abort();
> >> -    }
> >> -    return oom_check(malloc(size));
> >> +    return oom_check(malloc(size ? size : 1));
> >>  }
> >>  
> >>  void *qemu_realloc(void *ptr, size_t size)
> >>  {
> >> -    if (size) {
> >> -        return oom_check(realloc(ptr, size));
> >> -    } else {
> >> -        if (ptr) {
> >> -            return realloc(ptr, size);
> >> -        }
> >> -    }
> >> -    abort();
> >> +    return oom_check(realloc(ptr, size ? size : 1));
> >>  }
> >
> > http://groups.google.com/group/comp.std.c/browse_thread/thread/4e9af8847613d71f/6f75ad22e0768a0b?q=realloc++group:comp.std.c#6f75ad22e0768a0b
> 
> --verbose ?

Sigh...

C90 - realloc(non-null, 0) =
      free(non-null); return NULL;

C99 - realloc(non-null, 0) =
      free(non-null); return realloc(NULL, 0); 

GLIBC 2.7 = C90

How anyone can use this interface and it's implementations portably
or "sanely" is beyond me.

Do read the discussion the linked above.
malc Dec. 7, 2009, 4:57 p.m. UTC | #72
On Mon, 7 Dec 2009, Anthony Liguori wrote:

> Markus Armbruster wrote:
> > Commit a7d27b53 made zero-sized allocations a fatal error, deviating
> > from ISO C's malloc() & friends.  Revert that, but take care never to
> > return a null pointer, like malloc() & friends may do (it's
> > implementation defined), because that's another source of bugs.
> >   
> 
> While it's always fun to argue about standards interpretation, I wanted to
> capture some action items from the discussion that I think there is agreement
> about.  Since I want to make changes for 0.12, I think it would be best to try
> and settle these now so we can do this before -rc2.
> 
> For 0.12.0-rc2:
> 
> I will send out a patch tonight or tomorrow changing qemu_malloc() to return
> malloc(1) when size=0 only for production builds (via --enable-zero-mallocs).
> Development trees will maintain their current behavior.
> 
> For 0.13:
> 
> Someone (Marcus?) will introduce four new allocation functions.
> 
> type *qemu_new(type, n_types);
> type *qemu_new0(type, n_types);
> 
> type *qemu_renew(type, mem, n_types);
> type *qemu_renew0(type, mem, n_types);
> 
> NB: The names are borrowed from glib.  I'm not tied to them.
> 
> Will do our best to convert old code to use these functions where ever
> possible.  New code will be required to use these functions unless not
> possible.  n_types==0 is valid.  sizeof(type)==0 is valid.  Both circumstances
> return a unique non-NULL pointer.  If memory allocation fails, execution will
> abort.
> 
> The existing qemu_malloc() will maintain it's current behavior (with the
> exception of the relaxed size==0 assertion for production releases).
> 
> Does anyone object to this moving forward?

Yeah, i object to the split production/development qemu_malloc[z].
Anthony Liguori Dec. 7, 2009, 4:59 p.m. UTC | #73
Avi Kivity wrote:
> What about developers that hit the assert?  Do they send patches that 
> fix code that works in production just so they can run in developer mode?

Sounds like a good way to get developers to help convert from 
qemu_malloc() to qemu_new*() :-)

If it helps, we can do a grep -l through qemu, put up a wiki page with 
all of the files, and allow people to sign up to convert individual 
files.  I'm sure we could convert the whole tree over in a short period 
of time if we parallelize the effort and make some attempt to coordinate.

Regards,

Anthony Liguori
Anthony Liguori Dec. 7, 2009, 5:01 p.m. UTC | #74
malc wrote:
>> Does anyone object to this moving forward?
>>     
>
> Yeah, i object to the split production/development qemu_malloc[z].
>   

It's clear to me that there are still improper callers of qemu_malloc() 
in the tree.  How do you propose we address this for 0.12?

Aborting in a production build is a rather hostile thing to do if it can 
be avoided.

Regards,

Anthony Liguori
Avi Kivity Dec. 7, 2009, 5:07 p.m. UTC | #75
On 12/07/2009 06:59 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> What about developers that hit the assert?  Do they send patches that 
>> fix code that works in production just so they can run in developer 
>> mode?
>
> Sounds like a good way to get developers to help convert from 
> qemu_malloc() to qemu_new*() :-)
>

In 0.12?

My objection was to different behaviour of 0.12 in dev and production modes.
Anthony Liguori Dec. 7, 2009, 5:09 p.m. UTC | #76
Avi Kivity wrote:
> On 12/07/2009 06:59 PM, Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> What about developers that hit the assert?  Do they send patches 
>>> that fix code that works in production just so they can run in 
>>> developer mode?
>>
>> Sounds like a good way to get developers to help convert from 
>> qemu_malloc() to qemu_new*() :-)
>>
>
> In 0.12?
>
> My objection was to different behaviour of 0.12 in dev and production 
> modes.

It's a temporary problem that hopefully will be addressed quickly in the 
0.13 cycle.

Regards,

Anthony Liguori
malc Dec. 7, 2009, 5:09 p.m. UTC | #77
On Mon, 7 Dec 2009, Anthony Liguori wrote:

> malc wrote:
> > > Does anyone object to this moving forward?
> > >     
> > 
> > Yeah, i object to the split production/development qemu_malloc[z].
> >   
> 
> It's clear to me that there are still improper callers of qemu_malloc() in the
> tree.  How do you propose we address this for 0.12?
> 
> Aborting in a production build is a rather hostile thing to do if it can be
> avoided.

The only real issue encountered so far was eb0b64f7a, there are claims
that "maybe there are more", well i can also claim that there are abusers
of the interface that just weren't encoutered yet, and those will 
potentially lead to hard to track bugs, wiped out HDDs, general 
dissatisfaction and so on and so forth. Truth is that no one performed
thorough audit so those are pure speculation.

As for 0.12, go wild, i don't care, but only if it lives in it's own
zero happy branch.
Avi Kivity Dec. 7, 2009, 5:13 p.m. UTC | #78
On 12/07/2009 07:09 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> On 12/07/2009 06:59 PM, Anthony Liguori wrote:
>>> Avi Kivity wrote:
>>>> What about developers that hit the assert?  Do they send patches 
>>>> that fix code that works in production just so they can run in 
>>>> developer mode?
>>>
>>> Sounds like a good way to get developers to help convert from 
>>> qemu_malloc() to qemu_new*() :-)
>>>
>>
>> In 0.12?
>>
>> My objection was to different behaviour of 0.12 in dev and production 
>> modes.
>
> It's a temporary problem that hopefully will be addressed quickly in 
> the 0.13 cycle.
>

I don't understand.  People will develop patches for 0.12 for a while as 
bugs are reported and fixed.
Anthony Liguori Dec. 7, 2009, 5:17 p.m. UTC | #79
Avi Kivity wrote:
> I don't understand.  People will develop patches for 0.12 for a while 
> as bugs are reported and fixed.

What's the exact problem here?

Regards,

Anthony Liguori
Avi Kivity Dec. 7, 2009, 5:19 p.m. UTC | #80
On 12/07/2009 07:17 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> I don't understand.  People will develop patches for 0.12 for a while 
>> as bugs are reported and fixed.
>
> What's the exact problem here?
>

Bug reported against qemu.  Developer develops patch, while testing qemu 
crashes on unrelated assert(size == 0).
Glauber Costa Dec. 7, 2009, 5:32 p.m. UTC | #81
> So the only correct way would be to write:
>
> array = malloc(size);
> if (array == NULL && size != 0) {
>   return -ENOMEM;
> }
>

Of course we can differentiate. A failed malloc will abort, a successful one
will not.

> If you were writing portable code.  But that's not what people write.  You
> can argue that qemu_malloc() can have any semantics we want and while that's
> true, it doesn't change my above statement.  I think the main argument for
> this behavior in qemu is that people are used to using this idiom with
> malloc() but it's a non-portable practice.
>
> If qemu_malloc() didn't carry the name "malloc()" then semantics with size=0
> would be a different discussion.  But so far, all qemu_* functions tend to
> behave almost exactly like their C counterparts.  Relying on the result of
> size=0 with malloc() is broken.

We can change qemu_malloc to qemu_alloc_memory(), or whatever.
But from the moment we do things like abort on failing, we are already
deviating from its C counterpart.
Anthony Liguori Dec. 7, 2009, 5:40 p.m. UTC | #82
Avi Kivity wrote:
> On 12/07/2009 07:17 PM, Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> I don't understand.  People will develop patches for 0.12 for a 
>>> while as bugs are reported and fixed.
>>
>> What's the exact problem here?
>>
>
> Bug reported against qemu.  Developer develops patch, while testing 
> qemu crashes on unrelated assert(size == 0).

1) Developer develops a patch against 0.12, it works, and they submit it 
to upstream.
2) Upstream crashes because of assert(size==0).
3) Developer is publicly shamed for developing against a release instead 
of a git tree.

The problem is (1), not (2).   Not to mention that we won't allow 
qemu_malloc() uses in upstream anymore which should make (2) impossible.

Regards,

Anthony Liguori
Blue Swirl Dec. 7, 2009, 6:12 p.m. UTC | #83
On Mon, Dec 7, 2009 at 5:50 PM, Anthony Liguori <anthony@codemonkey.ws> wrote:
> Markus Armbruster wrote:
>>
>> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
>> from ISO C's malloc() & friends.  Revert that, but take care never to
>> return a null pointer, like malloc() & friends may do (it's
>> implementation defined), because that's another source of bugs.
>>
>
> While it's always fun to argue about standards interpretation, I wanted to
> capture some action items from the discussion that I think there is
> agreement about.  Since I want to make changes for 0.12, I think it would be
> best to try and settle these now so we can do this before -rc2.
>
> For 0.12.0-rc2:
>
> I will send out a patch tonight or tomorrow changing qemu_malloc() to return
> malloc(1) when size=0 only for production builds (via
> --enable-zero-mallocs).  Development trees will maintain their current
> behavior.
>
> For 0.13:
>
> Someone (Marcus?) will introduce four new allocation functions.
>
> type *qemu_new(type, n_types);
> type *qemu_new0(type, n_types);
>
> type *qemu_renew(type, mem, n_types);
> type *qemu_renew0(type, mem, n_types);
>
> NB: The names are borrowed from glib.  I'm not tied to them.
>
> Will do our best to convert old code to use these functions where ever
> possible.  New code will be required to use these functions unless not
> possible.  n_types==0 is valid.  sizeof(type)==0 is valid.  Both
> circumstances return a unique non-NULL pointer.  If memory allocation fails,
> execution will abort.
>
> The existing qemu_malloc() will maintain it's current behavior (with the
> exception of the relaxed size==0 assertion for production releases).
>
> Does anyone object to this moving forward?

Excellent plan.
Avi Kivity Dec. 7, 2009, 6:25 p.m. UTC | #84
On 12/07/2009 07:40 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> On 12/07/2009 07:17 PM, Anthony Liguori wrote:
>>> Avi Kivity wrote:
>>>> I don't understand.  People will develop patches for 0.12 for a 
>>>> while as bugs are reported and fixed.
>>>
>>> What's the exact problem here?
>>>
>>
>> Bug reported against qemu.  Developer develops patch, while testing 
>> qemu crashes on unrelated assert(size == 0).
>
> 1) Developer develops a patch against 0.12, it works, and they submit 
> it to upstream.
> 2) Upstream crashes because of assert(size==0).
> 3) Developer is publicly shamed for developing against a release 
> instead of a git tree.
>
> The problem is (1), not (2).   Not to mention that we won't allow 
> qemu_malloc() uses in upstream anymore which should make (2) impossible.

My problem is with stable-0.12.  Consider upstream fixed.

1) Bug reported against qemu-0.12.0.
2) Developer writes patch against master, submits, all is well except 
for the CODING_STYLE argument it triggers.
3) Developer writes patch against stable-0.12, can't test because 
testing crashes in some place where production doesn't crash.

The patch doesn't have to do anything with memory allocation.  By 
introducing gratuitous differences between developer and production 
mode, you're reducing quality.
Anthony Liguori Dec. 7, 2009, 6:59 p.m. UTC | #85
Avi Kivity wrote:
> My problem is with stable-0.12.  Consider upstream fixed.
>
> 1) Bug reported against qemu-0.12.0.
> 2) Developer writes patch against master, submits, all is well except 
> for the CODING_STYLE argument it triggers.
> 3) Developer writes patch against stable-0.12, can't test because 
> testing crashes in some place where production doesn't crash.

Stable-0.12 always carries a VERSION of 0.12.x where x < 50.  This means 
that the stable-0.12 branch will always behave like a production release.

You don't get -Werror on stable-0.XX and you won't get zero malloc()s 
assert.

Regards,

Anthony Liguori
Avi Kivity Dec. 7, 2009, 7:01 p.m. UTC | #86
On 12/07/2009 08:59 PM, Anthony Liguori wrote:
> Avi Kivity wrote:
>> My problem is with stable-0.12.  Consider upstream fixed.
>>
>> 1) Bug reported against qemu-0.12.0.
>> 2) Developer writes patch against master, submits, all is well except 
>> for the CODING_STYLE argument it triggers.
>> 3) Developer writes patch against stable-0.12, can't test because 
>> testing crashes in some place where production doesn't crash.
>
> Stable-0.12 always carries a VERSION of 0.12.x where x < 50.  This 
> means that the stable-0.12 branch will always behave like a production 
> release.
>
> You don't get -Werror on stable-0.XX and you won't get zero malloc()s 
> assert.
>

That's good enough for me.  Allow 0 for 0.12 and new allocation 
functions for mainline, then?
Anthony Liguori Dec. 7, 2009, 7:07 p.m. UTC | #87
Avi Kivity wrote:
> On 12/07/2009 08:59 PM, Anthony Liguori wrote:
>> Avi Kivity wrote:
>>> My problem is with stable-0.12.  Consider upstream fixed.
>>>
>>> 1) Bug reported against qemu-0.12.0.
>>> 2) Developer writes patch against master, submits, all is well 
>>> except for the CODING_STYLE argument it triggers.
>>> 3) Developer writes patch against stable-0.12, can't test because 
>>> testing crashes in some place where production doesn't crash.
>>
>> Stable-0.12 always carries a VERSION of 0.12.x where x < 50.  This 
>> means that the stable-0.12 branch will always behave like a 
>> production release.
>>
>> You don't get -Werror on stable-0.XX and you won't get zero malloc()s 
>> assert.
>>
>
> That's good enough for me.  Allow 0 for 0.12 and new allocation 
> functions for mainline, then?
Yup.

Regards,

Anthony Liguori
Markus Armbruster Dec. 8, 2009, 8:21 a.m. UTC | #88
malc <av1474@comtv.ru> writes:

> On Mon, 7 Dec 2009, Markus Armbruster wrote:
>
>> malc <av1474@comtv.ru> writes:
>> 
>> > On Mon, 30 Nov 2009, Markus Armbruster wrote:
>> >
>> >> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
>> >> from ISO C's malloc() & friends.  Revert that, but take care never to
>> >> return a null pointer, like malloc() & friends may do (it's
>> >> implementation defined), because that's another source of bugs.
>> >> 
>> >> Rationale: while zero-sized allocations might occasionally be a sign of
>> >> something going wrong, they can also be perfectly legitimate.  The
>> >> change broke such legitimate uses.  We've found and "fixed" at least one
>> >> of them already (commit eb0b64f7, also reverted by this patch), and
>> >> another one just popped up: the change broke qcow2 images with virtual
>> >> disk size zero, i.e. images that don't hold real data but only VM state
>> >> of snapshots.
>> >> 
>> >> If a change breaks two uses, it probably breaks more.  As a quick check,
>> >> I reviewed the first six of more than 200 uses of qemu_mallocz(),
>> >> qemu_malloc() and qemu_realloc() that don't have an argument of the form
>> >> sizeof(...) or similar:
>> >
>> > Bottom line: your point is that there are benign uses of zero allocation.
>> > There are, at the expense of memory consumption/fragmentation and useless
>> > code which, as your investigation shows, involve syscalls and what not.
>> 
>> I doubt the performance impact is relevant, but since you insist on
>> discussing it...
>> 
>> First and foremost, any performance argument not backed by measurements
>> is speculative.
>> 
>> Potential zero allocation is common in code.  Actual zero allocation is
>> rare at runtime.  The amount of memory consumed is therefore utterly
>> trivial compared to total dynamic memory use.  Likewise, time spent in
>> allocating is dwarved by time spent in other invocations of malloc()
>> several times over.
>> 
>> Adding a special case for avoiding zero allocation is not free.  Unless
>> zero allocations are sufficiently frequent, that cost exceeds the cost
>> of the rare zero allocation.
>
> I bet you dollars to donuts that testing for zero before calling malloc
> is cheaper than the eventual test that is done inside it.

Testing for zero before calling malloc() doesn't make the eventual test
that is done inside it go away, so it carries an additional cost.  What
you save by it is the cost of actual zero allocation.  Weight the two by
frequency and compare.

But once again, I doubt the performance impact is relevant.

>> > Outlook from my side of the fence: no one audited the code, no one
>> > knows that all of the uses are benign, abort gives the best automatic
>> > way to know for sure one way or the other.
>> >
>> > Rationale for keeping it as is: zero-sized allocations - overwhelming
>> > majority of the times, are not anticipated and not well understood,
>> > aborting gives us the ability to eliminate them in an automatic
>> > fashion.
>> 
>> Keeping it as is releasing with known crash bugs, and a known pattern
>> for unknown crash bugs.  Is that what you want us to do?  I doubt it.
>
> Yes, it's better than having _silent_ bugs, which, furthermore, have a
> good potential of mainfesting themselves a lot further from the "bug"
> site.
>
>> 
>> I grant you that there may be allocations we expect never to be empty,
>> and where things break when they are.  Would you concede that there are
>> allocations that work just fine when empty?
>
> I wont dispute that.

Good.

>> If you do, we end up with three kinds of allocations: known empty bad,
>> known empty fine, unknown.  Aborting on known empty bad is okay with me.
>> Aborting on unknown in developement builds is okay with me, too.  I
>> don't expect it to be a successful way to find bugs, because empty
>> allocations are rare.
[...]

[On one specific example where malloc(0) is a sign of something weird
going on:]
>> My point isn't that permitting malloc(0) is the best way to handle it.
>> It's a better way than aborting, though.
>
> And this is precisely the gist of our disagreement.

Yup.  When a feature can be used in a safe manner, but its use can
sometimes also be a sign of a bug, then I prefer the program to take its
chances and continue, while you prefer to stop the program cold, so you
can eliminate these scary uses as they occur.  Either choice can
conceivably lead to grief.  We disagree on likelihood and severety of
the grief to expect.

[...]
>> > http://groups.google.com/group/comp.std.c/browse_thread/thread/4e9af8847613d71f/6f75ad22e0768a0b?q=realloc++group:comp.std.c#6f75ad22e0768a0b
>> 
>> --verbose ?
>
> Sigh...
>
> C90 - realloc(non-null, 0) =
>       free(non-null); return NULL;
>
> C99 - realloc(non-null, 0) =
>       free(non-null); return realloc(NULL, 0); 
>
> GLIBC 2.7 = C90

glibc frees since 1999.  From glibc/ChangeLog.10:

1999-04-28  Andreas Jaeger  <aj@arthur.rhein-neckar.de>

	* malloc/malloc.c (REALLOC_ZERO_BYTES_FREES): Define it to follow
	ISO C9x and Unix98.

> How anyone can use this interface and it's implementations portably
> or "sanely" is beyond me.
>
> Do read the discussion the linked above.

Yes, malloc() & friends are tricky for portable programs.  And yes,
bugs do creep into international standards.

But since we wrap malloc() & friends anyway, we can pick a variant of
the semantics we like.  No need to punish ourselves with wrappers that
are hard to use, just because the wrappees are hard to use.

Anyway, I figure this thread has wandered off topic for this list.
Markus Armbruster Dec. 8, 2009, 8:30 a.m. UTC | #89
Anthony Liguori <anthony@codemonkey.ws> writes:

> Markus Armbruster wrote:
>> Commit a7d27b53 made zero-sized allocations a fatal error, deviating
>> from ISO C's malloc() & friends.  Revert that, but take care never to
>> return a null pointer, like malloc() & friends may do (it's
>> implementation defined), because that's another source of bugs.
>>   
>
> While it's always fun to argue about standards interpretation, I
> wanted to capture some action items from the discussion that I think
> there is agreement about.  Since I want to make changes for 0.12, I
> think it would be best to try and settle these now so we can do this
> before -rc2.
>
> For 0.12.0-rc2:
>
> I will send out a patch tonight or tomorrow changing qemu_malloc() to
> return malloc(1) when size=0 only for production builds (via
> --enable-zero-mallocs).  Development trees will maintain their current
> behavior.

I don't think 0.12 needs the compile-time option, it's a production
release after all.  But since I won't actually hurt 0.12, I don't mind.

> For 0.13:
>
> Someone (Marcus?) will introduce four new allocation functions.
>
> type *qemu_new(type, n_types);
> type *qemu_new0(type, n_types);
>
> type *qemu_renew(type, mem, n_types);
> type *qemu_renew0(type, mem, n_types);
>
> NB: The names are borrowed from glib.  I'm not tied to them.

I'll see what I can do post release.

> Will do our best to convert old code to use these functions where ever
> possible.  New code will be required to use these functions unless not
> possible.  n_types==0 is valid.  sizeof(type)==0 is valid.  Both
> circumstances return a unique non-NULL pointer.  If memory allocation
> fails, execution will abort.
>
> The existing qemu_malloc() will maintain it's current behavior (with
> the exception of the relaxed size==0 assertion for production
> releases).
>
> Does anyone object to this moving forward?

Thanks, works for me.
Kevin Wolf Dec. 8, 2009, 9:02 a.m. UTC | #90
Am 07.12.2009 18:09, schrieb malc:
> On Mon, 7 Dec 2009, Anthony Liguori wrote:
> 
>> malc wrote:
>>>> Does anyone object to this moving forward?
>>>>     
>>>
>>> Yeah, i object to the split production/development qemu_malloc[z].
>>>   
>>
>> It's clear to me that there are still improper callers of qemu_malloc() in the
>> tree.  How do you propose we address this for 0.12?
>>
>> Aborting in a production build is a rather hostile thing to do if it can be
>> avoided.
> 
> The only real issue encountered so far was eb0b64f7a, there are claims
> that "maybe there are more", 

702ef63f was the one that started this discussion. And Markus had list
of five other places that can very likely crash with the abort. Every
single crash is one crash too much in a production environment. Turning
the abort off is the only sane option there.

Having the abort in place in the development branch is a concession to
you. Crashes don't have as bad impact there, so we probably can afford
it. I'm fine with that as long as there is a plan to move forward to a
better API.

> well i can also claim that there are abusers
> of the interface that just weren't encoutered yet,

Right, but have you found a single one of them yet? If not, in terms of
committed patches the score is 4:0 for legitimate users broken by
abort() vs. abusers found by abort().

Kevin
malc Dec. 8, 2009, 10:22 a.m. UTC | #91
On Tue, 8 Dec 2009, Markus Armbruster wrote:

> malc <av1474@comtv.ru> writes:
> 

[..snip..]

> glibc frees since 1999.  From glibc/ChangeLog.10:
> 
> 1999-04-28  Andreas Jaeger  <aj@arthur.rhein-neckar.de>
> 
> 	* malloc/malloc.c (REALLOC_ZERO_BYTES_FREES): Define it to follow
> 	ISO C9x and Unix98.

Problem is - this is NOT what C9x where x > 0 says.

> 
> > How anyone can use this interface and it's implementations portably
> > or "sanely" is beyond me.
> >
> > Do read the discussion the linked above.
> 
> Yes, malloc() & friends are tricky for portable programs.  And yes,
> bugs do creep into international standards.
> 
> But since we wrap malloc() & friends anyway, we can pick a variant of
> the semantics we like.  No need to punish ourselves with wrappers that
> are hard to use, just because the wrappees are hard to use.
> 
> Anyway, I figure this thread has wandered off topic for this list.
>
diff mbox

Patch

diff --git a/block/qcow2-snapshot.c b/block/qcow2-snapshot.c
index 94cb838..e3b208c 100644
--- a/block/qcow2-snapshot.c
+++ b/block/qcow2-snapshot.c
@@ -381,11 +381,6 @@  int qcow2_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
     QCowSnapshot *sn;
     int i;
 
-    if (!s->nb_snapshots) {
-        *psn_tab = NULL;
-        return s->nb_snapshots;
-    }
-
     sn_tab = qemu_mallocz(s->nb_snapshots * sizeof(QEMUSnapshotInfo));
     for(i = 0; i < s->nb_snapshots; i++) {
         sn_info = sn_tab + i;
diff --git a/qemu-malloc.c b/qemu-malloc.c
index 295d185..aeeb78b 100644
--- a/qemu-malloc.c
+++ b/qemu-malloc.c
@@ -44,22 +44,12 @@  void qemu_free(void *ptr)
 
 void *qemu_malloc(size_t size)
 {
-    if (!size) {
-        abort();
-    }
-    return oom_check(malloc(size));
+    return oom_check(malloc(size ? size : 1));
 }
 
 void *qemu_realloc(void *ptr, size_t size)
 {
-    if (size) {
-        return oom_check(realloc(ptr, size));
-    } else {
-        if (ptr) {
-            return realloc(ptr, size);
-        }
-    }
-    abort();
+    return oom_check(realloc(ptr, size ? size : 1));
 }
 
 void *qemu_mallocz(size_t size)