diff mbox series

clang-tidy: use g_new() family of functions

Message ID CAJ+F1C+mPu9ftXwgs4UrGTQp4f65-thhvmWoB-kuRmu0=vnFFw@mail.gmail.com
State New
Headers show
Series clang-tidy: use g_new() family of functions | expand

Commit Message

Marc-André Lureau Sept. 5, 2017, 12:14 p.m. UTC
Hi,

I have a series of changes generated with clang-tidy qemu [1] pending
for review [2].

It translates calloc/*malloc*/*realloc() calls to
g_new/g_newa/g_new0/g_renew() where the argument is a sizeof(T) [* N].

This is the first commit to give you an idea:

g_new() advantages (from glib doc):
- the returned pointer is cast to a pointer to the given type.
- care is taken to avoid overflow when calculating the size of the
allocated block.

But it is also shorter&nicer :)

I have not included in the first batch the opportunity to also
translate: alloc(sizeof(*p) * x) to g_new(typeof(*p), x), since it is
arguably not much nicer. But for consistency, I think it would be good
to use g_new(). What do you think?

I splitted the changes amont the various MAINTAINERS entries, but it
is still about 70 patches (without the typeof() changes).


[1] https://github.com/elmarco/clang-tools-extra/
[2] https://github.com/elmarco/qemu/commits/gnew

Comments

Philippe Mathieu-Daudé Sept. 5, 2017, 12:41 p.m. UTC | #1
Hi Marc-André,

On 09/05/2017 09:14 AM, Marc-André Lureau wrote:
> Hi,
> 
> I have a series of changes generated with clang-tidy qemu [1] pending
> for review [2].
> 
> It translates calloc/*malloc*/*realloc() calls to
> g_new/g_newa/g_new0/g_renew() where the argument is a sizeof(T) [* N].
> 
> This is the first commit to give you an idea:
> 
> diff --git a/hw/timer/arm_timer.c b/hw/timer/arm_timer.c
> index 98fddd7ac1..75affcb8a6 100644
> --- a/hw/timer/arm_timer.c
> +++ b/hw/timer/arm_timer.c
> @@ -166,7 +166,7 @@ static arm_timer_state *arm_timer_init(uint32_t freq)
>       arm_timer_state *s;
>       QEMUBH *bh;
> 
> -    s = (arm_timer_state *)g_malloc0(sizeof(arm_timer_state));
> +    s = g_new0(arm_timer_state, 1);
>       s->freq = freq;
>       s->control = TIMER_CTRL_IE;
> 
> g_new() advantages (from glib doc):
> - the returned pointer is cast to a pointer to the given type.
> - care is taken to avoid overflow when calculating the size of the
> allocated block.
> 
> But it is also shorter&nicer :)
> 
> I have not included in the first batch the opportunity to also
> translate: alloc(sizeof(*p) * x) to g_new(typeof(*p), x), since it is
> arguably not much nicer. But for consistency, I think it would be good

"avoid overflow when calculating the size" sounds nicer.

> to use g_new(). What do you think?

Dunno if useful, but Juan Quintela had some issue:

http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg00085.html

> 
> I splitted the changes amont the various MAINTAINERS entries, but it
> is still about 70 patches (without the typeof() changes).
> 
> 
> [1] https://github.com/elmarco/clang-tools-extra/
> [2] https://github.com/elmarco/qemu/commits/gnew
> 
>
Marc-André Lureau Sept. 5, 2017, 12:48 p.m. UTC | #2
On Tue, Sep 5, 2017 at 2:41 PM Philippe Mathieu-Daudé <f4bug@amsat.org>
wrote:

> Hi Marc-André,
>
> On 09/05/2017 09:14 AM, Marc-André Lureau wrote:
> > Hi,
> >
> > I have a series of changes generated with clang-tidy qemu [1] pending
> > for review [2].
> >
> > It translates calloc/*malloc*/*realloc() calls to
> > g_new/g_newa/g_new0/g_renew() where the argument is a sizeof(T) [* N].
> >
> > This is the first commit to give you an idea:
> >
> > diff --git a/hw/timer/arm_timer.c b/hw/timer/arm_timer.c
> > index 98fddd7ac1..75affcb8a6 100644
> > --- a/hw/timer/arm_timer.c
> > +++ b/hw/timer/arm_timer.c
> > @@ -166,7 +166,7 @@ static arm_timer_state *arm_timer_init(uint32_t freq)
> >       arm_timer_state *s;
> >       QEMUBH *bh;
> >
> > -    s = (arm_timer_state *)g_malloc0(sizeof(arm_timer_state));
> > +    s = g_new0(arm_timer_state, 1);
> >       s->freq = freq;
> >       s->control = TIMER_CTRL_IE;
> >
> > g_new() advantages (from glib doc):
> > - the returned pointer is cast to a pointer to the given type.
> > - care is taken to avoid overflow when calculating the size of the
> > allocated block.
> >
> > But it is also shorter&nicer :)
> >
> > I have not included in the first batch the opportunity to also
> > translate: alloc(sizeof(*p) * x) to g_new(typeof(*p), x), since it is
> > arguably not much nicer. But for consistency, I think it would be good
>
> "avoid overflow when calculating the size" sounds nicer.
>
> > to use g_new(). What do you think?
>
> Dunno if useful, but Juan Quintela had some issue:
>
> http://lists.nongnu.org/archive/html/qemu-devel/2017-09/msg00085.html


Thanks for pointing this out.

It looks like the last iteration didn't change it after all:
https://lists.gnu.org/archive/html/qemu-devel/2017-09/msg00076.html


>
> >
> > I splitted the changes amont the various MAINTAINERS entries, but it
> > is still about 70 patches (without the typeof() changes).
> >
> >
> > [1] https://github.com/elmarco/clang-tools-extra/
> > [2] https://github.com/elmarco/qemu/commits/gnew
> >
> >
>
Markus Armbruster Sept. 5, 2017, 12:56 p.m. UTC | #3
Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi,
>
> I have a series of changes generated with clang-tidy qemu [1] pending
> for review [2].
>
> It translates calloc/*malloc*/*realloc() calls to
> g_new/g_newa/g_new0/g_renew() where the argument is a sizeof(T) [* N].

Only for *type* T, or for arbitrary T?

For type T, we've done the transformation repeatedly with Coccinelle,
first in commit b45c03f (commit message appended).  Repeatedly, because
they creep back in.

How does your technique compare to this prior one?

> This is the first commit to give you an idea:
>
> diff --git a/hw/timer/arm_timer.c b/hw/timer/arm_timer.c
> index 98fddd7ac1..75affcb8a6 100644
> --- a/hw/timer/arm_timer.c
> +++ b/hw/timer/arm_timer.c
> @@ -166,7 +166,7 @@ static arm_timer_state *arm_timer_init(uint32_t freq)
>      arm_timer_state *s;
>      QEMUBH *bh;
>
> -    s = (arm_timer_state *)g_malloc0(sizeof(arm_timer_state));
> +    s = g_new0(arm_timer_state, 1);
>      s->freq = freq;
>      s->control = TIMER_CTRL_IE;
>
> g_new() advantages (from glib doc):
> - the returned pointer is cast to a pointer to the given type.
> - care is taken to avoid overflow when calculating the size of the
> allocated block.
>
> But it is also shorter&nicer :)
>
> I have not included in the first batch the opportunity to also
> translate: alloc(sizeof(*p) * x) to g_new(typeof(*p), x), since it is
> arguably not much nicer. But for consistency, I think it would be good
> to use g_new(). What do you think?

p = g_malloc(sizeof(*p)) is idiomatic, and not obviously inferior to
p = g_new(T, 1), where T is the type of *p.  Use whatever is easier to
read, I guess.

But once you add multiplication, g_new() adds something useful: overflow
protection.  Conversion to g_new() might make sense then.

> I splitted the changes amont the various MAINTAINERS entries, but it
> is still about 70 patches (without the typeof() changes).
>
>
> [1] https://github.com/elmarco/clang-tools-extra/
> [2] https://github.com/elmarco/qemu/commits/gnew


commit b45c03f585ea9bb1af76c73e82195418c294919d
Author: Markus Armbruster <armbru@redhat.com>
Date:   Mon Sep 7 10:39:27 2015 +0100

    arm: Use g_new() & friends where that makes obvious sense
    
    g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
    for two reasons.  One, it catches multiplication overflowing size_t.
    Two, it returns T * rather than void *, which lets the compiler catch
    more type errors.
    
    This commit only touches allocations with size arguments of the form
    sizeof(T).
    
    Coccinelle semantic patch:
    
        @@
        type T;
        @@
        -g_malloc(sizeof(T))
        +g_new(T, 1)
        @@
        type T;
        @@
        -g_try_malloc(sizeof(T))
        +g_try_new(T, 1)
        @@
        type T;
        @@
        -g_malloc0(sizeof(T))
        +g_new0(T, 1)
        @@
        type T;
        @@
        -g_try_malloc0(sizeof(T))
        +g_try_new0(T, 1)
        @@
        type T;
        expression n;
        @@
        -g_malloc(sizeof(T) * (n))
        +g_new(T, n)
        @@
        type T;
        expression n;
        @@
        -g_try_malloc(sizeof(T) * (n))
        +g_try_new(T, n)
        @@
        type T;
        expression n;
        @@
        -g_malloc0(sizeof(T) * (n))
        +g_new0(T, n)
        @@
        type T;
        expression n;
        @@
        -g_try_malloc0(sizeof(T) * (n))
        +g_try_new0(T, n)
        @@
        type T;
        expression p, n;
        @@
        -g_realloc(p, sizeof(T) * (n))
        +g_renew(T, p, n)
        @@
        type T;
        expression p, n;
        @@
        -g_try_realloc(p, sizeof(T) * (n))
        +g_try_renew(T, p, n)
        @@
        type T;
        expression n;
        @@
        -(T *)g_new(T, n)
        +g_new(T, n)
        @@
        type T;
        expression n;
        @@
        -(T *)g_new0(T, n)
        +g_new0(T, n)
        @@
        type T;
        expression p, n;
        @@
        -(T *)g_renew(T, p, n)
        +g_renew(T, p, n)
    
    Signed-off-by: Markus Armbruster <armbru@redhat.com>
    Reviewed-by: Eric Blake <eblake@redhat.com>
    Message-id: 1440524394-15640-1-git-send-email-armbru@redhat.com
    Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Marc-André Lureau Sept. 5, 2017, 1:29 p.m. UTC | #4
Hi

On Tue, Sep 5, 2017 at 2:56 PM Markus Armbruster <armbru@redhat.com> wrote:

> Marc-André Lureau <marcandre.lureau@gmail.com> writes:
>
> > Hi,
> >
> > I have a series of changes generated with clang-tidy qemu [1] pending
> > for review [2].
> >
> > It translates calloc/*malloc*/*realloc() calls to
> > g_new/g_newa/g_new0/g_renew() where the argument is a sizeof(T) [* N].
>
> Only for *type* T, or for arbitrary T?
>
>
If sizeof() argument is a type:
https://github.com/elmarco/clang-tools-extra/blob/master/clang-tidy/qemu/UseGnewCheck.cpp#L65

For type T, we've done the transformation repeatedly with Coccinelle,
> first in commit b45c03f (commit message appended).  Repeatedly, because
> they creep back in.
>
> How does your technique compare to this prior one?
>

Well, for one, I have a lot of trouble running coccinelle, mostly because
it is slow and/or eat all my memory. Afaik, coccinelle also has trouble
parsing the code in various cases. I have also trouble writing coccinelle
semantic patch... And, I gave up.

clang-tidy in comparison, is quite fast (it takes a minute to apply on qemu
code base). I find it easier to write a tidy check, and more friendly
command line / output etc:

/tmp/test.c:3:17: warning: use g_new() instead [qemu-use-gnew]
        int *f = (int*)malloc(sizeof(int) * 2);
                 ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
                 g_new(int, 2)

clang-tidy should also respect clang-format rules when modifying the code
(see also
https://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg05762.html).

I also imagine it should be possible to integrate to clang warnings in the
future (similar to libreoffice clang plugins for ex). All in all,
coccinelle or clang-tidy are probably both capable of doing this job, but
clang-tidy is snappier and has more potentials for toolchain integration.

> This is the first commit to give you an idea:
> >
> > diff --git a/hw/timer/arm_timer.c b/hw/timer/arm_timer.c
> > index 98fddd7ac1..75affcb8a6 100644
> > --- a/hw/timer/arm_timer.c
> > +++ b/hw/timer/arm_timer.c
> > @@ -166,7 +166,7 @@ static arm_timer_state *arm_timer_init(uint32_t freq)
> >      arm_timer_state *s;
> >      QEMUBH *bh;
> >
> > -    s = (arm_timer_state *)g_malloc0(sizeof(arm_timer_state));
> > +    s = g_new0(arm_timer_state, 1);
> >      s->freq = freq;
> >      s->control = TIMER_CTRL_IE;
> >
> > g_new() advantages (from glib doc):
> > - the returned pointer is cast to a pointer to the given type.
> > - care is taken to avoid overflow when calculating the size of the
> > allocated block.
> >
> > But it is also shorter&nicer :)
> >
> > I have not included in the first batch the opportunity to also
> > translate: alloc(sizeof(*p) * x) to g_new(typeof(*p), x), since it is
> > arguably not much nicer. But for consistency, I think it would be good
> > to use g_new(). What do you think?
>
> p = g_malloc(sizeof(*p)) is idiomatic, and not obviously inferior to
> p = g_new(T, 1), where T is the type of *p.  Use whatever is easier to
> read, I guess.
>
> But once you add multiplication, g_new() adds something useful: overflow
> protection.  Conversion to g_new() might make sense then.
>
> > I splitted the changes amont the various MAINTAINERS entries, but it
> > is still about 70 patches (without the typeof() changes).
> >
> >
> > [1] https://github.com/elmarco/clang-tools-extra/
> > [2] https://github.com/elmarco/qemu/commits/gnew
>
>
> commit b45c03f585ea9bb1af76c73e82195418c294919d
> Author: Markus Armbruster <armbru@redhat.com>
> Date:   Mon Sep 7 10:39:27 2015 +0100
>
>     arm: Use g_new() & friends where that makes obvious sense
>
>     g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
>     for two reasons.  One, it catches multiplication overflowing size_t.
>     Two, it returns T * rather than void *, which lets the compiler catch
>     more type errors.
>
>     This commit only touches allocations with size arguments of the form
>     sizeof(T).
>
>     Coccinelle semantic patch:
>
>         @@
>         type T;
>         @@
>         -g_malloc(sizeof(T))
>         +g_new(T, 1)
>         @@
>         type T;
>         @@
>         -g_try_malloc(sizeof(T))
>         +g_try_new(T, 1)
>

thanks, I actually forgot the g_try_new() family (I thought we didn't use
it!)

        @@
>         type T;
>         @@
>         -g_malloc0(sizeof(T))
>         +g_new0(T, 1)
>         @@
>         type T;
>         @@
>         -g_try_malloc0(sizeof(T))
>         +g_try_new0(T, 1)
>         @@
>         type T;
>         expression n;
>         @@
>         -g_malloc(sizeof(T) * (n))
>         +g_new(T, n)
>         @@
>         type T;
>         expression n;
>         @@
>         -g_try_malloc(sizeof(T) * (n))
>         +g_try_new(T, n)
>         @@
>         type T;
>         expression n;
>         @@
>         -g_malloc0(sizeof(T) * (n))
>         +g_new0(T, n)
>         @@
>         type T;
>         expression n;
>         @@
>         -g_try_malloc0(sizeof(T) * (n))
>         +g_try_new0(T, n)
>         @@
>         type T;
>         expression p, n;
>         @@
>         -g_realloc(p, sizeof(T) * (n))
>         +g_renew(T, p, n)
>         @@
>         type T;
>         expression p, n;
>         @@
>         -g_try_realloc(p, sizeof(T) * (n))
>         +g_try_renew(T, p, n)
>         @@
>         type T;
>         expression n;
>         @@
>         -(T *)g_new(T, n)
>         +g_new(T, n)
>         @@
>         type T;
>         expression n;
>         @@
>         -(T *)g_new0(T, n)
>         +g_new0(T, n)
>         @@
>         type T;
>         expression p, n;
>         @@
>         -(T *)g_renew(T, p, n)
>         +g_renew(T, p, n)
>
>     Signed-off-by: Markus Armbruster <armbru@redhat.com>
>     Reviewed-by: Eric Blake <eblake@redhat.com>
>     Message-id: 1440524394-15640-1-git-send-email-armbru@redhat.com
>     Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
>
Markus Armbruster Sept. 5, 2017, 3:33 p.m. UTC | #5
Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi
>
> On Tue, Sep 5, 2017 at 2:56 PM Markus Armbruster <armbru@redhat.com> wrote:
>
>> Marc-André Lureau <marcandre.lureau@gmail.com> writes:
>>
>> > Hi,
>> >
>> > I have a series of changes generated with clang-tidy qemu [1] pending
>> > for review [2].
>> >
>> > It translates calloc/*malloc*/*realloc() calls to
>> > g_new/g_newa/g_new0/g_renew() where the argument is a sizeof(T) [* N].
>>
>> Only for *type* T, or for arbitrary T?
>>
>>
> If sizeof() argument is a type:
> https://github.com/elmarco/clang-tools-extra/blob/master/clang-tidy/qemu/UseGnewCheck.cpp#L65
>
> For type T, we've done the transformation repeatedly with Coccinelle,
>> first in commit b45c03f (commit message appended).  Repeatedly, because
>> they creep back in.
>>
>> How does your technique compare to this prior one?
>>
>
> Well, for one, I have a lot of trouble running coccinelle, mostly because
> it is slow and/or eat all my memory.

It's a bit of a pig, but it doesn't get anywhere close to prohibitive
even on my rather modest development box.  Example run with a minor
variation of the semantic patch from commit b45c03f:

$ time spatch --in-place --sp-file g_new.cocci --macro-file scripts/cocci-macro-file.h --use-idutils --dir .
[...]
real	0m42.379s
user	0m38.988s
sys	0m2.179s
$ git-diff --shortstat
 123 files changed, 211 insertions(+), 216 deletions(-)

Note that "--use-idutils --dir ." directs spatch to work only on files
that need work.  There's also --use-glimpse.  You can easily do the
directing by hand with a judicious `git-grep ...`.  Your experience can
be far worse if you feed it source files indiscriminately.

>                                      Afaik, coccinelle also has trouble
> parsing the code in various cases.

Yes, but anything else that doesn't is almost certainly not tackling the
full problem.

Preprocessing C is easy enough.  Parsing preprocessed C is easy enough.
Parsing unpreprocessed C into something that's suitable for transforming
is darn hard.

Doesn't mean that clang-tidy couldn't be doing a better job for our
code.

>                                    I have also trouble writing coccinelle
> semantic patch...

Point taken.

>                   And, I gave up.

I've found it frustrating at times, but overall too useful to pass up.

I'd love to buy a well-written book on the thing.

> clang-tidy in comparison, is quite fast (it takes a minute to apply on qemu

Same order of magnitude as my spatch run above.

> code base). I find it easier to write a tidy check, and more friendly
> command line / output etc:
>
> /tmp/test.c:3:17: warning: use g_new() instead [qemu-use-gnew]
>         int *f = (int*)malloc(sizeof(int) * 2);
>                  ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
>                  g_new(int, 2)
>
> clang-tidy should also respect clang-format rules when modifying the code
> (see also
> https://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg05762.html).
>
> I also imagine it should be possible to integrate to clang warnings in the
> future (similar to libreoffice clang plugins for ex). All in all,
> coccinelle or clang-tidy are probably both capable of doing this job, but
> clang-tidy is snappier and has more potentials for toolchain integration.

Suggest you show us cool things you can do with clang-tidy that haven't
been done with Coccinelle :)

[...]
Marc-André Lureau Sept. 5, 2017, 3:55 p.m. UTC | #6
Hi


> Suggest you show us cool things you can do with clang-tidy that haven't
> been done with Coccinelle :)
>
> Well to do that I would have to have a transformations to do & know the
limits/strength of coccinelle & clang-tidy, I am not there yet... Today, I
prefer invest in clang-tidy for what I need to do.

We already discussed some of the pros/cons of coccinelle vs tidy her and in
the previous round-up series. For ex, clang-tidy is able to evaluate
constant expressions, so you can write generic rules that you can't capture
with coccinelle yet (A + B-1) / B * B:
https://github.com/elmarco/clang-tools-extra/blob/master/clang-tidy/qemu/RoundCheck.cpp
.

However, I think it is more difficult to write clang-tidy transformation
that spans accross various code paths (like adding errors/free/locks etc).
Coccinelle makes that fairly easily apparently.
Markus Armbruster Sept. 5, 2017, 4:23 p.m. UTC | #7
Marc-André Lureau <marcandre.lureau@gmail.com> writes:

> Hi
>
>
>> Suggest you show us cool things you can do with clang-tidy that haven't
>> been done with Coccinelle :)
>>
> Well to do that I would have to have a transformations to do & know the
> limits/strength of coccinelle & clang-tidy, I am not there yet... Today, I
> prefer invest in clang-tidy for what I need to do.

With any luck, you'll soon enough run into something that hasn't been
done with Coccinelle.

> We already discussed some of the pros/cons of coccinelle vs tidy her and in
> the previous round-up series. For ex, clang-tidy is able to evaluate
> constant expressions, so you can write generic rules that you can't capture
> with coccinelle yet (A + B-1) / B * B:
> https://github.com/elmarco/clang-tools-extra/blob/master/clang-tidy/qemu/RoundCheck.cpp
> .
>
> However, I think it is more difficult to write clang-tidy transformation
> that spans accross various code paths (like adding errors/free/locks etc).
> Coccinelle makes that fairly easily apparently.

Fortunately, we can use both.

I'm not so fond of doing the same things with both, though :)
Kevin Wolf Sept. 6, 2017, 8:52 a.m. UTC | #8
Am 05.09.2017 um 17:33 hat Markus Armbruster geschrieben:
> Marc-André Lureau <marcandre.lureau@gmail.com> writes:
> > code base). I find it easier to write a tidy check, and more friendly
> > command line / output etc:
> >
> > /tmp/test.c:3:17: warning: use g_new() instead [qemu-use-gnew]
> >         int *f = (int*)malloc(sizeof(int) * 2);
> >                  ~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
> >                  g_new(int, 2)
> >
> > clang-tidy should also respect clang-format rules when modifying the code
> > (see also
> > https://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg05762.html).
> >
> > I also imagine it should be possible to integrate to clang warnings in the
> > future (similar to libreoffice clang plugins for ex). All in all,
> > coccinelle or clang-tidy are probably both capable of doing this job, but
> > clang-tidy is snappier and has more potentials for toolchain integration.
> 
> Suggest you show us cool things you can do with clang-tidy that haven't
> been done with Coccinelle :)

Well, I don't really know clang-tidy, but if the above is what it does,
that _is_ cool. With Coccinelle scripts you run the script from time to
time and remove any new offenders. But if clang-tidy allows us to turn
any new offender into a build error so that they don't even make it into
master, then to me that's way superior.

Not all users are using clang, obviously, so even if we can
automatically enable this for clang, not everyone would see the
warnings. But if we can make some patchew hosts use it, that would
already be enough to catch those things.

Unless I'm misinterpreting what Marc-André is saying and running
clang-tidy actually requires more than is reasonable for patchew.

Kevin
diff mbox series

Patch

diff --git a/hw/timer/arm_timer.c b/hw/timer/arm_timer.c
index 98fddd7ac1..75affcb8a6 100644
--- a/hw/timer/arm_timer.c
+++ b/hw/timer/arm_timer.c
@@ -166,7 +166,7 @@  static arm_timer_state *arm_timer_init(uint32_t freq)
     arm_timer_state *s;
     QEMUBH *bh;

-    s = (arm_timer_state *)g_malloc0(sizeof(arm_timer_state));
+    s = g_new0(arm_timer_state, 1);
     s->freq = freq;
     s->control = TIMER_CTRL_IE;