diff mbox

qapi-commands.py generates code that uses uninitialized variables

Message ID 20140320192134.8983.86526@loki
State New
Headers show

Commit Message

Michael Roth March 20, 2014, 7:21 p.m. UTC
Quoting Markus Armbruster (2014-03-18 04:32:08)
> Peter Maydell <peter.maydell@linaro.org> writes:
> 
> > This is something clang's -fsanitize=undefined spotted. The
> > code generated by qapi-commands.py in qmp-marshal.c for
> > qmp_marshal_* functions where there are some optional
> > arguments looks like this:
> >
> >     bool has_force = false;
> >     bool force;
> >
> >     mi = qmp_input_visitor_new_strict(QOBJECT(args));
> >     v = qmp_input_get_visitor(mi);
> >     visit_type_str(v, &device, "device", errp);
> >     visit_start_optional(v, &has_force, "force", errp);
> >     if (has_force) {
> >         visit_type_bool(v, &force, "force", errp);
> >     }
> >     visit_end_optional(v, errp);
> >     qmp_input_visitor_cleanup(mi);
> >
> >     if (error_is_set(errp)) {
> >         goto out;
> >     }
> >     qmp_eject(device, has_force, force, errp);
> >
> > In the case where has_force is false, we never initialize
> > force, but then we use it by passing it to qmp_eject.
> > I imagine we don't then actually use the value, but clang
> 
> Use of FOO when !has_FOO is a bug.
> 
> > complains in particular for 'bool' variables because the value
> > that ends up being loaded from memory for 'force' is not either
> > 0 or 1 (being uninitialized stack contents).
> >
> > Anybody understand what the codegenerator is doing well enough
> > to suggest a fix? I'd guess that just initializing the variable either
> > at point of declaration or in an else {) clause of the 'if (has_force)'
> > conditional would suffice, but presumably you need to handle
> > all the possible data types...
> 
> I can give it a try.  Will probably take a while, though.

Could it be as simple as this?:

Pointer-type are special-cased initialized to NULL, so that leaves these guys
in the current set of qapi-defined types that we use as direct arguments for
qmp commands:

  NON-POINTER TYPE: BlockdevOnError
  NON-POINTER TYPE: bool
  NON-POINTER TYPE: DataFormat
  NON-POINTER TYPE: double
  NON-POINTER TYPE: DumpGuestMemoryFormat
  NON-POINTER TYPE: int64_t
  NON-POINTER TYPE: MirrorSyncMode
  NON-POINTER TYPE: NewImageMode
  NON-POINTER TYPE: uint32_t

I'm trying to make sense of whether {0} is a valid initializer in all these
cases, as I saw some references to GCC complaining about cases where you don't
use an initializer for each nested subtype (back in 2002 at least:
http://www.ex-parrot.com/~chris/random/initialise.html), but that doesn't seem
to be the case now.

If that's not safe, we can memset based on sizeof() in the else clause, but
obviously that's sub-optimal.

Comments

Markus Armbruster March 26, 2014, 2:34 p.m. UTC | #1
Michael Roth <mdroth@linux.vnet.ibm.com> writes:

> Quoting Markus Armbruster (2014-03-18 04:32:08)
>> Peter Maydell <peter.maydell@linaro.org> writes:
>> 
>> > This is something clang's -fsanitize=undefined spotted. The
>> > code generated by qapi-commands.py in qmp-marshal.c for
>> > qmp_marshal_* functions where there are some optional
>> > arguments looks like this:
>> >
>> >     bool has_force = false;
>> >     bool force;
>> >
>> >     mi = qmp_input_visitor_new_strict(QOBJECT(args));
>> >     v = qmp_input_get_visitor(mi);
>> >     visit_type_str(v, &device, "device", errp);
>> >     visit_start_optional(v, &has_force, "force", errp);
>> >     if (has_force) {
>> >         visit_type_bool(v, &force, "force", errp);
>> >     }
>> >     visit_end_optional(v, errp);
>> >     qmp_input_visitor_cleanup(mi);
>> >
>> >     if (error_is_set(errp)) {
>> >         goto out;
>> >     }
>> >     qmp_eject(device, has_force, force, errp);
>> >
>> > In the case where has_force is false, we never initialize
>> > force, but then we use it by passing it to qmp_eject.
>> > I imagine we don't then actually use the value, but clang
>> 
>> Use of FOO when !has_FOO is a bug.
>> 
>> > complains in particular for 'bool' variables because the value
>> > that ends up being loaded from memory for 'force' is not either
>> > 0 or 1 (being uninitialized stack contents).
>> >
>> > Anybody understand what the codegenerator is doing well enough
>> > to suggest a fix? I'd guess that just initializing the variable either
>> > at point of declaration or in an else {) clause of the 'if (has_force)'
>> > conditional would suffice, but presumably you need to handle
>> > all the possible data types...
>> 
>> I can give it a try.  Will probably take a while, though.
>
> Could it be as simple as this?:

Possibly :)

> diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
> index 9734ab0..a70482e 100644
> --- a/scripts/qapi-commands.py
> +++ b/scripts/qapi-commands.py
> @@ -99,7 +99,7 @@ bool has_%(argname)s = false;
>                           argname=c_var(argname), argtype=c_type(argtype))
>          else:
>              ret += mcgen('''
> -%(argtype)s %(argname)s;
> +%(argtype)s %(argname)s = {0};
>  ''',
>                           argname=c_var(argname), argtype=c_type(argtype))
>
> Pointer-type are special-cased initialized to NULL, so that leaves these guys
> in the current set of qapi-defined types that we use as direct arguments for
> qmp commands:
>
>   NON-POINTER TYPE: BlockdevOnError
>   NON-POINTER TYPE: bool
>   NON-POINTER TYPE: DataFormat
>   NON-POINTER TYPE: double
>   NON-POINTER TYPE: DumpGuestMemoryFormat
>   NON-POINTER TYPE: int64_t
>   NON-POINTER TYPE: MirrorSyncMode
>   NON-POINTER TYPE: NewImageMode
>   NON-POINTER TYPE: uint32_t
>
> I'm trying to make sense of whether {0} is a valid initializer in all these
> cases, as I saw some references to GCC complaining about cases where you don't
> use an initializer for each nested subtype (back in 2002 at least:
> http://www.ex-parrot.com/~chris/random/initialise.html), but that doesn't seem
> to be the case now.
>
> If that's not safe, we can memset based on sizeof() in the else clause, but
> obviously that's sub-optimal.

A superficial reading of C99 suggests {0} should work as long as 0 can
be assigned to the left hand side when it's of scalar type, or its first
part when it's not.

Predicting what might trigger warnings from random compilers is an
exercise in futility.  For what it's worth, we already have a number of
'{0}' initializers.

If they don't work out here, we can make the conditional enumerate more
(sets of) types.  I wouldn't worry about that now.
Peter Maydell March 28, 2014, 2:19 p.m. UTC | #2
On 20 March 2014 19:21, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> Could it be as simple as this?:
>
> diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
> index 9734ab0..a70482e 100644
> --- a/scripts/qapi-commands.py
> +++ b/scripts/qapi-commands.py
> @@ -99,7 +99,7 @@ bool has_%(argname)s = false;
>                           argname=c_var(argname), argtype=c_type(argtype))
>          else:
>              ret += mcgen('''
> -%(argtype)s %(argname)s;
> +%(argtype)s %(argname)s = {0};
>  ''',
>                           argname=c_var(argname), argtype=c_type(argtype))

Well, clang doesn't complain about this syntax, and it
fixes the warnings about bools (which is good, because
there was a genuine bug in dma-helpers.c that was hiding
in amongst these other similar warnings...)

Tested-by: Peter Maydell <peter.maydell@linaro.org>

thanks
-- PMM
Eric Blake April 11, 2014, 1:40 a.m. UTC | #3
On 03/28/2014 08:19 AM, Peter Maydell wrote:
> On 20 March 2014 19:21, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
>> Could it be as simple as this?:
>>
>> diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
>> index 9734ab0..a70482e 100644
>> --- a/scripts/qapi-commands.py
>> +++ b/scripts/qapi-commands.py
>> @@ -99,7 +99,7 @@ bool has_%(argname)s = false;
>>                           argname=c_var(argname), argtype=c_type(argtype))
>>          else:
>>              ret += mcgen('''
>> -%(argtype)s %(argname)s;
>> +%(argtype)s %(argname)s = {0};
>>  ''',
>>                           argname=c_var(argname), argtype=c_type(argtype))
> 
> Well, clang doesn't complain about this syntax, and it
> fixes the warnings about bools (which is good, because
> there was a genuine bug in dma-helpers.c that was hiding
> in amongst these other similar warnings...)
> 
> Tested-by: Peter Maydell <peter.maydell@linaro.org>

We uncovered a real bug that would be fixed by this patch:
https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01745.html

Is it worth cleaning this up into a formal submission and cc'ing
qemu-stable and/or trying for the 2.0 release?  If made formal, feel
free to add:
Reviewed-by: Eric Blake <eblake@redhat.com>
Peter Maydell April 11, 2014, 7:27 a.m. UTC | #4
On 11 April 2014 02:40, Eric Blake <eblake@redhat.com> wrote:
> We uncovered a real bug that would be fixed by this patch:
> https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01745.html

No, that's a bug in the called code. The API here defines
that for optional parameters, if the have_foo bool is false
then the foo argument isn't set. The generated code
can't know the correct default value (it just happens
to be 0 in the case you point out, but what if the default
speed were 100?) so this must be handled by the called
code.

thanks
-- PMM
Fam Zheng April 11, 2014, 7:48 a.m. UTC | #5
On Fri, 04/11 08:27, Peter Maydell wrote:
> On 11 April 2014 02:40, Eric Blake <eblake@redhat.com> wrote:
> > We uncovered a real bug that would be fixed by this patch:
> > https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01745.html
> 
> No, that's a bug in the called code. The API here defines
> that for optional parameters, if the have_foo bool is false
> then the foo argument isn't set. The generated code
> can't know the correct default value (it just happens
> to be 0 in the case you point out, but what if the default
> speed were 100?) so this must be handled by the called
> code.
> 

Default value for a variable isn't default value for API logic, so apparently
called code must always handle both have_foo and foo. But there is a point to
take this patch from the language perspective, to avoid that an unset variable
is passed as a parameter from generated code.

Thanks,
Fam
Eric Blake April 11, 2014, 1:11 p.m. UTC | #6
On 04/11/2014 01:27 AM, Peter Maydell wrote:
> On 11 April 2014 02:40, Eric Blake <eblake@redhat.com> wrote:
>> We uncovered a real bug that would be fixed by this patch:
>> https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01745.html
> 
> No, that's a bug in the called code. The API here defines
> that for optional parameters, if the have_foo bool is false
> then the foo argument isn't set. The generated code
> can't know the correct default value (it just happens
> to be 0 in the case you point out, but what if the default
> speed were 100?) so this must be handled by the called
> code.

The called code ALSO needs a fix, but guaranteeing that
'have_foo==false' implies 'foo==0' is MUCH nicer than 'have_foo==false'
implies 'foo is indeterminate'.  For this particular caller, an
indeterminate foo had detrimental effects, and a known foo==0 happened
to be the right default.  I agree that we can't always predict the right
default for all callers, but avoiding random behavior can be considered
a bug fix in its own right, and if we make it part of the contract that
callers can rely on zero initialization, we could simplify a lot of
callers that ARE happy with a 0 default.
Peter Maydell April 11, 2014, 1:27 p.m. UTC | #7
On 11 April 2014 14:11, Eric Blake <eblake@redhat.com> wrote:
> The called code ALSO needs a fix, but guaranteeing that
> 'have_foo==false' implies 'foo==0' is MUCH nicer than 'have_foo==false'
> implies 'foo is indeterminate'.  For this particular caller, an
> indeterminate foo had detrimental effects, and a known foo==0 happened
> to be the right default.  I agree that we can't always predict the right
> default for all callers, but avoiding random behavior can be considered
> a bug fix in its own right, and if we make it part of the contract that
> callers can rely on zero initialization, we could simplify a lot of
> callers that ARE happy with a 0 default.

I totally agree, which is why I reported this problem in
the first place; but it's not 2.0 material. I have no problem
with it being cc'd for stable if people want it in a 2.0.x.

thanks
-- PMM
Laszlo Ersek April 11, 2014, 2:01 p.m. UTC | #8
On 03/20/14 20:21, Michael Roth wrote:
> Quoting Markus Armbruster (2014-03-18 04:32:08)
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>>> This is something clang's -fsanitize=undefined spotted. The
>>> code generated by qapi-commands.py in qmp-marshal.c for
>>> qmp_marshal_* functions where there are some optional
>>> arguments looks like this:
>>>
>>>     bool has_force = false;
>>>     bool force;
>>>
>>>     mi = qmp_input_visitor_new_strict(QOBJECT(args));
>>>     v = qmp_input_get_visitor(mi);
>>>     visit_type_str(v, &device, "device", errp);
>>>     visit_start_optional(v, &has_force, "force", errp);
>>>     if (has_force) {
>>>         visit_type_bool(v, &force, "force", errp);
>>>     }
>>>     visit_end_optional(v, errp);
>>>     qmp_input_visitor_cleanup(mi);
>>>
>>>     if (error_is_set(errp)) {
>>>         goto out;
>>>     }
>>>     qmp_eject(device, has_force, force, errp);
>>>
>>> In the case where has_force is false, we never initialize
>>> force, but then we use it by passing it to qmp_eject.
>>> I imagine we don't then actually use the value, but clang
>>
>> Use of FOO when !has_FOO is a bug.
>>
>>> complains in particular for 'bool' variables because the value
>>> that ends up being loaded from memory for 'force' is not either
>>> 0 or 1 (being uninitialized stack contents).
>>>
>>> Anybody understand what the codegenerator is doing well enough
>>> to suggest a fix? I'd guess that just initializing the variable either
>>> at point of declaration or in an else {) clause of the 'if (has_force)'
>>> conditional would suffice, but presumably you need to handle
>>> all the possible data types...
>>
>> I can give it a try.  Will probably take a while, though.
> 
> Could it be as simple as this?:
> 
> diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
> index 9734ab0..a70482e 100644
> --- a/scripts/qapi-commands.py
> +++ b/scripts/qapi-commands.py
> @@ -99,7 +99,7 @@ bool has_%(argname)s = false;
>                           argname=c_var(argname), argtype=c_type(argtype))
>          else:
>              ret += mcgen('''
> -%(argtype)s %(argname)s;
> +%(argtype)s %(argname)s = {0};
>  ''',
>                           argname=c_var(argname), argtype=c_type(argtype))
> 
> Pointer-type are special-cased initialized to NULL, so that leaves these guys
> in the current set of qapi-defined types that we use as direct arguments for
> qmp commands:
> 
>   NON-POINTER TYPE: BlockdevOnError
>   NON-POINTER TYPE: bool
>   NON-POINTER TYPE: DataFormat
>   NON-POINTER TYPE: double
>   NON-POINTER TYPE: DumpGuestMemoryFormat
>   NON-POINTER TYPE: int64_t
>   NON-POINTER TYPE: MirrorSyncMode
>   NON-POINTER TYPE: NewImageMode
>   NON-POINTER TYPE: uint32_t
> 
> I'm trying to make sense of whether {0} is a valid initializer in all these
> cases, as I saw some references to GCC complaining about cases where you don't
> use an initializer for each nested subtype (back in 2002 at least:
> http://www.ex-parrot.com/~chris/random/initialise.html), but that doesn't seem
> to be the case now.
> 
> If that's not safe, we can memset based on sizeof() in the else clause, but
> obviously that's sub-optimal.

{ 0 } is safe. { 0 } is a "universal initializer". If you tell me which
C version we care about this week, I can look up and cite the language
for you. The gist, as far as I remember, is that
- 0 is a good initializer for any scalar type,
- the outermost braces are ignored when initializing a scalar,
- the outermost braces allow initialization of an aggregate (struct or
array) or a union,
- sub-aggregates don't require further braces.

Thanks,
Laszlo
Kevin Wolf April 15, 2014, 9:40 a.m. UTC | #9
Am 11.04.2014 um 15:11 hat Eric Blake geschrieben:
> On 04/11/2014 01:27 AM, Peter Maydell wrote:
> > On 11 April 2014 02:40, Eric Blake <eblake@redhat.com> wrote:
> >> We uncovered a real bug that would be fixed by this patch:
> >> https://lists.gnu.org/archive/html/qemu-devel/2014-04/msg01745.html
> > 
> > No, that's a bug in the called code. The API here defines
> > that for optional parameters, if the have_foo bool is false
> > then the foo argument isn't set. The generated code
> > can't know the correct default value (it just happens
> > to be 0 in the case you point out, but what if the default
> > speed were 100?) so this must be handled by the called
> > code.
> 
> The called code ALSO needs a fix, but guaranteeing that
> 'have_foo==false' implies 'foo==0' is MUCH nicer than 'have_foo==false'
> implies 'foo is indeterminate'.  For this particular caller, an
> indeterminate foo had detrimental effects, and a known foo==0 happened
> to be the right default.  I agree that we can't always predict the right
> default for all callers, but avoiding random behavior can be considered
> a bug fix in its own right, and if we make it part of the contract that
> callers can rely on zero initialization, we could simplify a lot of
> callers that ARE happy with a 0 default.

In fact, I would really like to see support for default values in the
schema, so that generated code can supply it. It wouldn't only mean that
the called code doesn't have to care about it any more, but it also
serves as a better API documentation.

Kevin
Peter Maydell May 20, 2014, 11:46 a.m. UTC | #10
On 20 March 2014 19:21, Michael Roth <mdroth@linux.vnet.ibm.com> wrote:
> Quoting Markus Armbruster (2014-03-18 04:32:08)
>> Peter Maydell <peter.maydell@linaro.org> writes:
>>
>> > This is something clang's -fsanitize=undefined spotted. The
>> > code generated by qapi-commands.py in qmp-marshal.c for
>> > qmp_marshal_* functions where there are some optional
>> > arguments looks like this:
>> >
>> >     bool has_force = false;
>> >     bool force;
>> >
>> >     mi = qmp_input_visitor_new_strict(QOBJECT(args));
>> >     v = qmp_input_get_visitor(mi);
>> >     visit_type_str(v, &device, "device", errp);
>> >     visit_start_optional(v, &has_force, "force", errp);
>> >     if (has_force) {
>> >         visit_type_bool(v, &force, "force", errp);
>> >     }
>> >     visit_end_optional(v, errp);
>> >     qmp_input_visitor_cleanup(mi);
>> >
>> >     if (error_is_set(errp)) {
>> >         goto out;
>> >     }
>> >     qmp_eject(device, has_force, force, errp);
>> >
>> > In the case where has_force is false, we never initialize
>> > force, but then we use it by passing it to qmp_eject.
>> > I imagine we don't then actually use the value, but clang
>>
>> Use of FOO when !has_FOO is a bug.
>>
>> > complains in particular for 'bool' variables because the value
>> > that ends up being loaded from memory for 'force' is not either
>> > 0 or 1 (being uninitialized stack contents).
>> >
>> > Anybody understand what the codegenerator is doing well enough
>> > to suggest a fix? I'd guess that just initializing the variable either
>> > at point of declaration or in an else {) clause of the 'if (has_force)'
>> > conditional would suffice, but presumably you need to handle
>> > all the possible data types...
>>
>> I can give it a try.  Will probably take a while, though.
>
> Could it be as simple as this?:
>
> diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
> index 9734ab0..a70482e 100644
> --- a/scripts/qapi-commands.py
> +++ b/scripts/qapi-commands.py
> @@ -99,7 +99,7 @@ bool has_%(argname)s = false;
>                           argname=c_var(argname), argtype=c_type(argtype))
>          else:
>              ret += mcgen('''
> -%(argtype)s %(argname)s;
> +%(argtype)s %(argname)s = {0};
>  ''',
>                           argname=c_var(argname), argtype=c_type(argtype))
>


What's the status of this? I noticed that clang is now
complaining at compile time as well as runtime:

  CC    tests/test-qmp-marshal.o
tests/test-qmp-marshal.c:181:9: warning: variable 'b' is used
uninitialized whenever 'if' condition is false
[-Wsometimes-uninitialized]
    if (has_b) {
        ^~~~~
tests/test-qmp-marshal.c:188:42: note: uninitialized use occurs here
    retval = qmp_user_def_cmd3(a, has_b, b, &local_err);
                                         ^
tests/test-qmp-marshal.c:181:5: note: remove the 'if' if its condition
is always true
    if (has_b) {
    ^~~~~~~~~~~
tests/test-qmp-marshal.c:170:14: note: initialize the variable 'b' to
silence this warning
    int64_t b;
             ^
              = 0
1 warning generated.


thanks
-- PMM
diff mbox

Patch

diff --git a/scripts/qapi-commands.py b/scripts/qapi-commands.py
index 9734ab0..a70482e 100644
--- a/scripts/qapi-commands.py
+++ b/scripts/qapi-commands.py
@@ -99,7 +99,7 @@  bool has_%(argname)s = false;
                          argname=c_var(argname), argtype=c_type(argtype))
         else:
             ret += mcgen('''
-%(argtype)s %(argname)s;
+%(argtype)s %(argname)s = {0};
 ''',
                          argname=c_var(argname), argtype=c_type(argtype))