diff mbox series

Re: [PATCH] plug memory leaks in warn_parm_array_mismatch (PR 99055)

Message ID 23a920e2-a6e2-f7db-8e75-ff6224f732d2@gmail.com
State New
Headers show
Series Re: [PATCH] plug memory leaks in warn_parm_array_mismatch (PR 99055) | expand

Commit Message

Martin Sebor Feb. 11, 2021, 6:35 p.m. UTC
On 2/11/21 12:59 AM, Richard Biener wrote:
> On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor <msebor@gmail.com> wrote:
>>
>> The attached patch replaces calls to print_generic_expr_to_str() with
>> a helper function that returns a std::string and releases the caller
>> from the responsibility to explicitly free memory.
> 
> I don't like this.  What's the reason to use generic_expr_as_string
> anyway?  We have %E pretty-printers after all.

[Reposting; my first reply was too big.]

The VLA bound is either an expression or the asterisk (*) when newbnd
is null.  The reason for using the function is to avoid duplicating
the warning call and making one with %E and another with "*".

> Couldn't you have
> fixed the leak by doing
> 
> if (newbnd)
>    free (newbndstr);
> 
> etc.?

Yes, I could have.  I considered it and chose not to because this
is much simpler, safer (if newbnd were to change after newbndstr
is set), and more in the "C++ spirit" (RAII).  This code already
uses std::string (attr_access::array_as_string) and so my change
is in line with it.

I understand that you prefer a more C-ish coding style so if you
consider it a prerequisite for accepting this fix I can make
the change conform to it.  See the attached revision (although
I hope you'll see why I chose not to go this route).

For what it's worth, print_generic_expr_to_str() would be improved
by returning std::string.  It would mean including <string> in
most sources in the compiler, which I suspect may not be a popular
suggestion with everyone here, and which is why I didn't make it
to begin with.  But if you're open to it I'm happy to make that
change either for GCC 12 or even now.

> 
>> With the patch installed, Valgrind shows more leaks in this code that
>> I'm not sure what to do about:
>>
>> 1) A tree built by build_type_attribute_qual_variant() called from
>>      attr_access::array_as_string() to build a temporary type only
>>      for the purposes of formatting it.
>>
>> 2) A tree (an attribute list) built by tree_cons() called from
>>      build_attr_access_from_parms() that's used only for the duration
>>      of the caller.
>>
>> Do these temporary trees need to be released somehow or are the leaks
>> expected?
> 
> You should configure GCC with --enable-valgrind-annotations to make
> it aware of our GC.

I did configure with that option:

$ /src/gcc/master/configure --enable-checking=yes 
--enable-languages=all,jit,lto --enable-host-shared 
--enable-valgrind-annotations

Attached to pr99055 is my valgrind output for gcc.dg/Wvla-parameter.c
with the top of trunk and the patch applied:

$ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall 
/src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper 
valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt

Do you not see the same leaks?

Martin

Comments

Richard Biener Feb. 12, 2021, 7:35 a.m. UTC | #1
On Thu, Feb 11, 2021 at 7:35 PM Martin Sebor <msebor@gmail.com> wrote:
>
> On 2/11/21 12:59 AM, Richard Biener wrote:
> > On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor <msebor@gmail.com> wrote:
> >>
> >> The attached patch replaces calls to print_generic_expr_to_str() with
> >> a helper function that returns a std::string and releases the caller
> >> from the responsibility to explicitly free memory.
> >
> > I don't like this.  What's the reason to use generic_expr_as_string
> > anyway?  We have %E pretty-printers after all.
>
> [Reposting; my first reply was too big.]
>
> The VLA bound is either an expression or the asterisk (*) when newbnd
> is null.  The reason for using the function is to avoid duplicating
> the warning call and making one with %E and another with "*".
>
> > Couldn't you have
> > fixed the leak by doing
> >
> > if (newbnd)
> >    free (newbndstr);
> >
> > etc.?
>
> Yes, I could have.  I considered it and chose not to because this
> is much simpler, safer (if newbnd were to change after newbndstr
> is set), and more in the "C++ spirit" (RAII).  This code already
> uses std::string (attr_access::array_as_string) and so my change
> is in line with it.
>
> I understand that you prefer a more C-ish coding style so if you
> consider it a prerequisite for accepting this fix I can make
> the change conform to it.  See the attached revision (although
> I hope you'll see why I chose not to go this route).

I can understand but I still disagree ;)

> For what it's worth, print_generic_expr_to_str() would be improved
> by returning std::string.  It would mean including <string> in
> most sources in the compiler, which I suspect may not be a popular
> suggestion with everyone here, and which is why I didn't make it
> to begin with.  But if you're open to it I'm happy to make that
> change either for GCC 12 or even now.

Well, looking at print_generic_expr_to_str I see

/* Print a single expression T to string, and return it.  */

char *
print_generic_expr_to_str (tree t)
{
  pretty_printer pp;
  dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
  return xstrdup (pp_formatted_text (&pp));
}

which makes me think that this instead should be sth like

class generic_expr_as_str : public pretty_printer
{
   generic_expr_as_str (tree t) { dump_generic_node (&pp, t, 0,
TDF_VOPS|TDF_MEMSYMS, false); }
   operator const char *() { return pp_formatted_text (&pp); }
   pretty_printer pp;
};

As we do seem to have a RAII pretty-printer class already.  That said,
I won't mind using

   pretty_printer pp;
   dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);

and pp_formatted_text () in the users either.

That is, print_generic_expr_to_str looks just like a bad designed hack.
Using std::string there doesn't make it any better.

Don't you agree to that?

So your 2nd variant patch is OK but you might consider just using
a pretty-printer manually and even do away with the xstrdup in that
way entirely!

> >
> >> With the patch installed, Valgrind shows more leaks in this code that
> >> I'm not sure what to do about:
> >>
> >> 1) A tree built by build_type_attribute_qual_variant() called from
> >>      attr_access::array_as_string() to build a temporary type only
> >>      for the purposes of formatting it.
> >>
> >> 2) A tree (an attribute list) built by tree_cons() called from
> >>      build_attr_access_from_parms() that's used only for the duration
> >>      of the caller.
> >>
> >> Do these temporary trees need to be released somehow or are the leaks
> >> expected?
> >
> > You should configure GCC with --enable-valgrind-annotations to make
> > it aware of our GC.
>
> I did configure with that option:
>
> $ /src/gcc/master/configure --enable-checking=yes
> --enable-languages=all,jit,lto --enable-host-shared
> --enable-valgrind-annotations
>
> Attached to pr99055 is my valgrind output for gcc.dg/Wvla-parameter.c
> with the top of trunk and the patch applied:
>
> $ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall
> /src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper
> valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt
>
> Do you not see the same leaks?

Err, well.  --show-leak-kinds=all is probably the cause.  We
definitely do not force-release
all reachable GC allocated memory at program end.  Not sure if
valgrind annotations can
make that obvious to valgrind.  I'm just using --leak-check=full and
thus look for
unreleased and no longer reachable memory.

Richard.

>
> Martin
>
Martin Sebor Feb. 12, 2021, 6:21 p.m. UTC | #2
On 2/12/21 12:35 AM, Richard Biener wrote:
> On Thu, Feb 11, 2021 at 7:35 PM Martin Sebor <msebor@gmail.com> wrote:
>>
>> On 2/11/21 12:59 AM, Richard Biener wrote:
>>> On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor <msebor@gmail.com> wrote:
>>>>
>>>> The attached patch replaces calls to print_generic_expr_to_str() with
>>>> a helper function that returns a std::string and releases the caller
>>>> from the responsibility to explicitly free memory.
>>>
>>> I don't like this.  What's the reason to use generic_expr_as_string
>>> anyway?  We have %E pretty-printers after all.
>>
>> [Reposting; my first reply was too big.]
>>
>> The VLA bound is either an expression or the asterisk (*) when newbnd
>> is null.  The reason for using the function is to avoid duplicating
>> the warning call and making one with %E and another with "*".
>>
>>> Couldn't you have
>>> fixed the leak by doing
>>>
>>> if (newbnd)
>>>     free (newbndstr);
>>>
>>> etc.?
>>
>> Yes, I could have.  I considered it and chose not to because this
>> is much simpler, safer (if newbnd were to change after newbndstr
>> is set), and more in the "C++ spirit" (RAII).  This code already
>> uses std::string (attr_access::array_as_string) and so my change
>> is in line with it.
>>
>> I understand that you prefer a more C-ish coding style so if you
>> consider it a prerequisite for accepting this fix I can make
>> the change conform to it.  See the attached revision (although
>> I hope you'll see why I chose not to go this route).
> 
> I can understand but I still disagree ;)
> 
>> For what it's worth, print_generic_expr_to_str() would be improved
>> by returning std::string.  It would mean including <string> in
>> most sources in the compiler, which I suspect may not be a popular
>> suggestion with everyone here, and which is why I didn't make it
>> to begin with.  But if you're open to it I'm happy to make that
>> change either for GCC 12 or even now.
> 
> Well, looking at print_generic_expr_to_str I see
> 
> /* Print a single expression T to string, and return it.  */
> 
> char *
> print_generic_expr_to_str (tree t)
> {
>    pretty_printer pp;
>    dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
>    return xstrdup (pp_formatted_text (&pp));
> }
> 
> which makes me think that this instead should be sth like
> 
> class generic_expr_as_str : public pretty_printer
> {
>     generic_expr_as_str (tree t) { dump_generic_node (&pp, t, 0,
> TDF_VOPS|TDF_MEMSYMS, false); }
>     operator const char *() { return pp_formatted_text (&pp); }
>     pretty_printer pp;
> };

This wouldn't be a good general solution either (in fact, I'd say
it would make it worse) because the string's lifetime is tied to
the lifetime of the class object, turning memory leaks to uses-
after-free.  Consider:

   const char *str = generic_expr_as_str (node);
   ...
   warning ("node = %s", str);   // str is dangling/invalid

(Trying to "fix" it by replacing the conversion operator with a named
member function like str() doesn't solve the problem.)

> 
> As we do seem to have a RAII pretty-printer class already.  That said,
> I won't mind using
> 
>     pretty_printer pp;
>     dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
> 
> and pp_formatted_text () in the users either.

Okay.

> 
> That is, print_generic_expr_to_str looks just like a bad designed hack.
> Using std::string there doesn't make it any better.
> 
> Don't you agree to that?

I agree that print_generic_expr_to_str() could be improved.  But
a simple API that returns a string representation of a tree node
needs to be easy to use safely and correctly and hard to misuse.
It shouldn't require its clients to explicitly manage the lifetimes
of multiple objects.

But this isn't a new problem, and the solution is as old as C++
itself: have the API return an object of an RAII class like
std::string, or more generally something like std::unique_ptr.
In this case it could even be GCC's auto_vec<char*>, or (less
user-friendly) a garbage collected STRING_CST.

> 
> So your 2nd variant patch is OK but you might consider just using
> a pretty-printer manually and even do away with the xstrdup in that
> way entirely!

Avoiding the xstrdup sounds good to me.  I've changed the code to
use the pretty_printer directly and committed the attached patch.

Martin

> 
>>>
>>>> With the patch installed, Valgrind shows more leaks in this code that
>>>> I'm not sure what to do about:
>>>>
>>>> 1) A tree built by build_type_attribute_qual_variant() called from
>>>>       attr_access::array_as_string() to build a temporary type only
>>>>       for the purposes of formatting it.
>>>>
>>>> 2) A tree (an attribute list) built by tree_cons() called from
>>>>       build_attr_access_from_parms() that's used only for the duration
>>>>       of the caller.
>>>>
>>>> Do these temporary trees need to be released somehow or are the leaks
>>>> expected?
>>>
>>> You should configure GCC with --enable-valgrind-annotations to make
>>> it aware of our GC.
>>
>> I did configure with that option:
>>
>> $ /src/gcc/master/configure --enable-checking=yes
>> --enable-languages=all,jit,lto --enable-host-shared
>> --enable-valgrind-annotations
>>
>> Attached to pr99055 is my valgrind output for gcc.dg/Wvla-parameter.c
>> with the top of trunk and the patch applied:
>>
>> $ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall
>> /src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper
>> valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt
>>
>> Do you not see the same leaks?
> 
> Err, well.  --show-leak-kinds=all is probably the cause.  We
> definitely do not force-release
> all reachable GC allocated memory at program end.  Not sure if
> valgrind annotations can
> make that obvious to valgrind.  I'm just using --leak-check=full and
> thus look for
> unreleased and no longer reachable memory.
> 
> Richard.
> 
>>
>> Martin
>>
Richard Biener Feb. 12, 2021, 7:36 p.m. UTC | #3
On February 12, 2021 7:21:25 PM GMT+01:00, Martin Sebor <msebor@gmail.com> wrote:
>On 2/12/21 12:35 AM, Richard Biener wrote:
>> On Thu, Feb 11, 2021 at 7:35 PM Martin Sebor <msebor@gmail.com>
>wrote:
>>>
>>> On 2/11/21 12:59 AM, Richard Biener wrote:
>>>> On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor <msebor@gmail.com>
>wrote:
>>>>>
>>>>> The attached patch replaces calls to print_generic_expr_to_str()
>with
>>>>> a helper function that returns a std::string and releases the
>caller
>>>>> from the responsibility to explicitly free memory.
>>>>
>>>> I don't like this.  What's the reason to use generic_expr_as_string
>>>> anyway?  We have %E pretty-printers after all.
>>>
>>> [Reposting; my first reply was too big.]
>>>
>>> The VLA bound is either an expression or the asterisk (*) when
>newbnd
>>> is null.  The reason for using the function is to avoid duplicating
>>> the warning call and making one with %E and another with "*".
>>>
>>>> Couldn't you have
>>>> fixed the leak by doing
>>>>
>>>> if (newbnd)
>>>>     free (newbndstr);
>>>>
>>>> etc.?
>>>
>>> Yes, I could have.  I considered it and chose not to because this
>>> is much simpler, safer (if newbnd were to change after newbndstr
>>> is set), and more in the "C++ spirit" (RAII).  This code already
>>> uses std::string (attr_access::array_as_string) and so my change
>>> is in line with it.
>>>
>>> I understand that you prefer a more C-ish coding style so if you
>>> consider it a prerequisite for accepting this fix I can make
>>> the change conform to it.  See the attached revision (although
>>> I hope you'll see why I chose not to go this route).
>> 
>> I can understand but I still disagree ;)
>> 
>>> For what it's worth, print_generic_expr_to_str() would be improved
>>> by returning std::string.  It would mean including <string> in
>>> most sources in the compiler, which I suspect may not be a popular
>>> suggestion with everyone here, and which is why I didn't make it
>>> to begin with.  But if you're open to it I'm happy to make that
>>> change either for GCC 12 or even now.
>> 
>> Well, looking at print_generic_expr_to_str I see
>> 
>> /* Print a single expression T to string, and return it.  */
>> 
>> char *
>> print_generic_expr_to_str (tree t)
>> {
>>    pretty_printer pp;
>>    dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
>>    return xstrdup (pp_formatted_text (&pp));
>> }
>> 
>> which makes me think that this instead should be sth like
>> 
>> class generic_expr_as_str : public pretty_printer
>> {
>>     generic_expr_as_str (tree t) { dump_generic_node (&pp, t, 0,
>> TDF_VOPS|TDF_MEMSYMS, false); }
>>     operator const char *() { return pp_formatted_text (&pp); }
>>     pretty_printer pp;
>> };
>
>This wouldn't be a good general solution either (in fact, I'd say
>it would make it worse) because the string's lifetime is tied to
>the lifetime of the class object, turning memory leaks to uses-
>after-free.  Consider:
>
>   const char *str = generic_expr_as_str (node);
>   ...
>   warning ("node = %s", str);   // str is dangling/invalid
>
>(Trying to "fix" it by replacing the conversion operator with a named
>member function like str() doesn't solve the problem.)

But the issue with std::string is that people will have to use .c_str() because of the gazillion C style interfaces in GCC
and storage of std::string will eventually lead to lots of copying or use the other very same use after free or leak issues. 

std::string isn't the great solution you are presenting it as. 

Richard. 

>> 
>> As we do seem to have a RAII pretty-printer class already.  That
>said,
>> I won't mind using
>> 
>>     pretty_printer pp;
>>     dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
>> 
>> and pp_formatted_text () in the users either.
>
>Okay.
>
>> 
>> That is, print_generic_expr_to_str looks just like a bad designed
>hack.
>> Using std::string there doesn't make it any better.
>> 
>> Don't you agree to that?
>
>I agree that print_generic_expr_to_str() could be improved.  But
>a simple API that returns a string representation of a tree node
>needs to be easy to use safely and correctly and hard to misuse.
>It shouldn't require its clients to explicitly manage the lifetimes
>of multiple objects.
>
>But this isn't a new problem, and the solution is as old as C++
>itself: have the API return an object of an RAII class like
>std::string, or more generally something like std::unique_ptr.
>In this case it could even be GCC's auto_vec<char*>, or (less
>user-friendly) a garbage collected STRING_CST.
>
>> 
>> So your 2nd variant patch is OK but you might consider just using
>> a pretty-printer manually and even do away with the xstrdup in that
>> way entirely!
>
>Avoiding the xstrdup sounds good to me.  I've changed the code to
>use the pretty_printer directly and committed the attached patch.
>
>Martin
>
>> 
>>>>
>>>>> With the patch installed, Valgrind shows more leaks in this code
>that
>>>>> I'm not sure what to do about:
>>>>>
>>>>> 1) A tree built by build_type_attribute_qual_variant() called from
>>>>>       attr_access::array_as_string() to build a temporary type
>only
>>>>>       for the purposes of formatting it.
>>>>>
>>>>> 2) A tree (an attribute list) built by tree_cons() called from
>>>>>       build_attr_access_from_parms() that's used only for the
>duration
>>>>>       of the caller.
>>>>>
>>>>> Do these temporary trees need to be released somehow or are the
>leaks
>>>>> expected?
>>>>
>>>> You should configure GCC with --enable-valgrind-annotations to make
>>>> it aware of our GC.
>>>
>>> I did configure with that option:
>>>
>>> $ /src/gcc/master/configure --enable-checking=yes
>>> --enable-languages=all,jit,lto --enable-host-shared
>>> --enable-valgrind-annotations
>>>
>>> Attached to pr99055 is my valgrind output for
>gcc.dg/Wvla-parameter.c
>>> with the top of trunk and the patch applied:
>>>
>>> $ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall
>>> /src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper
>>>
>valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt
>>>
>>> Do you not see the same leaks?
>> 
>> Err, well.  --show-leak-kinds=all is probably the cause.  We
>> definitely do not force-release
>> all reachable GC allocated memory at program end.  Not sure if
>> valgrind annotations can
>> make that obvious to valgrind.  I'm just using --leak-check=full and
>> thus look for
>> unreleased and no longer reachable memory.
>> 
>> Richard.
>> 
>>>
>>> Martin
>>>
Martin Sebor Feb. 12, 2021, 8:32 p.m. UTC | #4
On 2/12/21 12:36 PM, Richard Biener wrote:
> On February 12, 2021 7:21:25 PM GMT+01:00, Martin Sebor <msebor@gmail.com> wrote:
>> On 2/12/21 12:35 AM, Richard Biener wrote:
>>> On Thu, Feb 11, 2021 at 7:35 PM Martin Sebor <msebor@gmail.com>
>> wrote:
>>>>
>>>> On 2/11/21 12:59 AM, Richard Biener wrote:
>>>>> On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor <msebor@gmail.com>
>> wrote:
>>>>>>
>>>>>> The attached patch replaces calls to print_generic_expr_to_str()
>> with
>>>>>> a helper function that returns a std::string and releases the
>> caller
>>>>>> from the responsibility to explicitly free memory.
>>>>>
>>>>> I don't like this.  What's the reason to use generic_expr_as_string
>>>>> anyway?  We have %E pretty-printers after all.
>>>>
>>>> [Reposting; my first reply was too big.]
>>>>
>>>> The VLA bound is either an expression or the asterisk (*) when
>> newbnd
>>>> is null.  The reason for using the function is to avoid duplicating
>>>> the warning call and making one with %E and another with "*".
>>>>
>>>>> Couldn't you have
>>>>> fixed the leak by doing
>>>>>
>>>>> if (newbnd)
>>>>>      free (newbndstr);
>>>>>
>>>>> etc.?
>>>>
>>>> Yes, I could have.  I considered it and chose not to because this
>>>> is much simpler, safer (if newbnd were to change after newbndstr
>>>> is set), and more in the "C++ spirit" (RAII).  This code already
>>>> uses std::string (attr_access::array_as_string) and so my change
>>>> is in line with it.
>>>>
>>>> I understand that you prefer a more C-ish coding style so if you
>>>> consider it a prerequisite for accepting this fix I can make
>>>> the change conform to it.  See the attached revision (although
>>>> I hope you'll see why I chose not to go this route).
>>>
>>> I can understand but I still disagree ;)
>>>
>>>> For what it's worth, print_generic_expr_to_str() would be improved
>>>> by returning std::string.  It would mean including <string> in
>>>> most sources in the compiler, which I suspect may not be a popular
>>>> suggestion with everyone here, and which is why I didn't make it
>>>> to begin with.  But if you're open to it I'm happy to make that
>>>> change either for GCC 12 or even now.
>>>
>>> Well, looking at print_generic_expr_to_str I see
>>>
>>> /* Print a single expression T to string, and return it.  */
>>>
>>> char *
>>> print_generic_expr_to_str (tree t)
>>> {
>>>     pretty_printer pp;
>>>     dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
>>>     return xstrdup (pp_formatted_text (&pp));
>>> }
>>>
>>> which makes me think that this instead should be sth like
>>>
>>> class generic_expr_as_str : public pretty_printer
>>> {
>>>      generic_expr_as_str (tree t) { dump_generic_node (&pp, t, 0,
>>> TDF_VOPS|TDF_MEMSYMS, false); }
>>>      operator const char *() { return pp_formatted_text (&pp); }
>>>      pretty_printer pp;
>>> };
>>
>> This wouldn't be a good general solution either (in fact, I'd say
>> it would make it worse) because the string's lifetime is tied to
>> the lifetime of the class object, turning memory leaks to uses-
>> after-free.  Consider:
>>
>>    const char *str = generic_expr_as_str (node);
>>    ...
>>    warning ("node = %s", str);   // str is dangling/invalid
>>
>> (Trying to "fix" it by replacing the conversion operator with a named
>> member function like str() doesn't solve the problem.)
> 
> But the issue with std::string is that people will have to use .c_str() because of the gazillion C style interfaces in GCC
> and storage of std::string will eventually lead to lots of copying or use the other very same use after free or leak issues.
> 
> std::string isn't the great solution you are presenting it as.

It's the standard solution, but it isn't the only one.  If we don't
want to use it we can create our own, improved class (I did mention
auto_vec).  I'm only advocating the use of RAII to avoid these hunts
for memory leaks that can be trivially avoided by adopting basic C++
idioms.

Martin

> 
> Richard.
> 
>>>
>>> As we do seem to have a RAII pretty-printer class already.  That
>> said,
>>> I won't mind using
>>>
>>>      pretty_printer pp;
>>>      dump_generic_node (&pp, t, 0, TDF_VOPS|TDF_MEMSYMS, false);
>>>
>>> and pp_formatted_text () in the users either.
>>
>> Okay.
>>
>>>
>>> That is, print_generic_expr_to_str looks just like a bad designed
>> hack.
>>> Using std::string there doesn't make it any better.
>>>
>>> Don't you agree to that?
>>
>> I agree that print_generic_expr_to_str() could be improved.  But
>> a simple API that returns a string representation of a tree node
>> needs to be easy to use safely and correctly and hard to misuse.
>> It shouldn't require its clients to explicitly manage the lifetimes
>> of multiple objects.
>>
>> But this isn't a new problem, and the solution is as old as C++
>> itself: have the API return an object of an RAII class like
>> std::string, or more generally something like std::unique_ptr.
>> In this case it could even be GCC's auto_vec<char*>, or (less
>> user-friendly) a garbage collected STRING_CST.
>>
>>>
>>> So your 2nd variant patch is OK but you might consider just using
>>> a pretty-printer manually and even do away with the xstrdup in that
>>> way entirely!
>>
>> Avoiding the xstrdup sounds good to me.  I've changed the code to
>> use the pretty_printer directly and committed the attached patch.
>>
>> Martin
>>
>>>
>>>>>
>>>>>> With the patch installed, Valgrind shows more leaks in this code
>> that
>>>>>> I'm not sure what to do about:
>>>>>>
>>>>>> 1) A tree built by build_type_attribute_qual_variant() called from
>>>>>>        attr_access::array_as_string() to build a temporary type
>> only
>>>>>>        for the purposes of formatting it.
>>>>>>
>>>>>> 2) A tree (an attribute list) built by tree_cons() called from
>>>>>>        build_attr_access_from_parms() that's used only for the
>> duration
>>>>>>        of the caller.
>>>>>>
>>>>>> Do these temporary trees need to be released somehow or are the
>> leaks
>>>>>> expected?
>>>>>
>>>>> You should configure GCC with --enable-valgrind-annotations to make
>>>>> it aware of our GC.
>>>>
>>>> I did configure with that option:
>>>>
>>>> $ /src/gcc/master/configure --enable-checking=yes
>>>> --enable-languages=all,jit,lto --enable-host-shared
>>>> --enable-valgrind-annotations
>>>>
>>>> Attached to pr99055 is my valgrind output for
>> gcc.dg/Wvla-parameter.c
>>>> with the top of trunk and the patch applied:
>>>>
>>>> $ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall
>>>> /src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper
>>>>
>> valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt
>>>>
>>>> Do you not see the same leaks?
>>>
>>> Err, well.  --show-leak-kinds=all is probably the cause.  We
>>> definitely do not force-release
>>> all reachable GC allocated memory at program end.  Not sure if
>>> valgrind annotations can
>>> make that obvious to valgrind.  I'm just using --leak-check=full and
>>> thus look for
>>> unreleased and no longer reachable memory.
>>>
>>> Richard.
>>>
>>>>
>>>> Martin
>>>>
>
Thomas Schwinge Aug. 6, 2021, 2:09 p.m. UTC | #5
Hi!

I'm working on plugging a memory leak in an entirely different
compartment of GCC, but also ran into this issue:

On 2021-02-12T08:35:52+0100, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
> On Thu, Feb 11, 2021 at 7:35 PM Martin Sebor <msebor@gmail.com> wrote:
>> On 2/11/21 12:59 AM, Richard Biener wrote:
>> > On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor <msebor@gmail.com> wrote:
>> >> [...] Valgrind shows more leaks in this code that
>> >> I'm not sure what to do about:
>> >>
>> >> 1) A tree built by build_type_attribute_qual_variant() called from
>> >>      attr_access::array_as_string() to build a temporary type only
>> >>      for the purposes of formatting it.
>> >>
>> >> 2) A tree (an attribute list) built by tree_cons() called from
>> >>      build_attr_access_from_parms() that's used only for the duration
>> >>      of the caller.
>> >>
>> >> Do these temporary trees need to be released somehow or are the leaks
>> >> expected?
>> >
>> > You should configure GCC with --enable-valgrind-annotations to make
>> > it aware of our GC.
>>
>> I did configure with that option:
>>
>> $ /src/gcc/master/configure --enable-checking=yes
>> --enable-languages=all,jit,lto --enable-host-shared
>> --enable-valgrind-annotations

>> $ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall
>> /src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper
>> valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt
>>
>> Do you not see the same leaks?

I do; also stuff like:

    56 bytes in 1 blocks are still reachable in loss record 152 of 875
       at 0x483DD99: calloc (vg_replace_malloc.c:762)
       by 0x1753240: xcalloc (xmalloc.c:162)
       by 0x669C83: ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long) (ggc-page.c:918)
       by 0x89E07D: ggc_internal_cleared_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long) (ggc-common.c:117)
       by 0xF65D0D: make_node(tree_code) (ggc.h:143)
       by 0xF6632B: build_decl(unsigned int, tree_code, tree_node*, tree_node*) (tree.c:5264)
       by 0xA28ADC: build_builtin_function(unsigned int, char const*, tree_node*, int, built_in_class, char const*, tree_node*) (langhooks.c:681)
       by 0xA29FDD: add_builtin_function(char const*, tree_node*, int, built_in_class, char const*, tree_node*) (langhooks.c:716)
       by 0x622BFB: def_builtin_1(built_in_function, char const*, built_in_class, tree_node*, tree_node*, bool, bool, bool, tree_node*, bool) [clone .constprop.25] (lto-lang.c:650)
       by 0x640709: lto_define_builtins(tree_node*, tree_node*) (omp-builtins.def:46)
       by 0x641EE3: lto_init() (lto-lang.c:1339)
       by 0x61E26A: toplev::main(int, char**) (toplev.c:1921)

... and many, many more.

> Err, well.  --show-leak-kinds=all is probably the cause.

Before finding this email, I too had convinced myself that everying that
came by 'ggc_*' I may ignore, because:

> We
> definitely do not force-release
> all reachable GC allocated memory at program end.

... of this: these blocks simply had not been GCed at program end.

It's however a bit tedious to filter, in my case, 11864 lines of Valgrind
output.

> Not sure if
> valgrind annotations can
> make that obvious to valgrind.

Or, if that's not feasible (I don't know much about Valgrind...), then
instead would it help to force a final GC at program end if we're running
in "valgrind mode"?  If that's a plausible thing to do, would guarding
that by GCC having been configured with '--enable-valgrind-annotations'
be OK, or do we need a '--param', or something else?

> I'm just using --leak-check=full and
> thus look for
> unreleased and no longer reachable memory.

ACK, in my case, that only shows seven errors (not related to my stuff).


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Richard Biener Aug. 6, 2021, 3:10 p.m. UTC | #6
On August 6, 2021 4:09:37 PM GMT+02:00, Thomas Schwinge <thomas@codesourcery.com> wrote:
>Hi!
>
>I'm working on plugging a memory leak in an entirely different
>compartment of GCC, but also ran into this issue:
>
>On 2021-02-12T08:35:52+0100, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>> On Thu, Feb 11, 2021 at 7:35 PM Martin Sebor <msebor@gmail.com> wrote:
>>> On 2/11/21 12:59 AM, Richard Biener wrote:
>>> > On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor <msebor@gmail.com> wrote:
>>> >> [...] Valgrind shows more leaks in this code that
>>> >> I'm not sure what to do about:
>>> >>
>>> >> 1) A tree built by build_type_attribute_qual_variant() called from
>>> >>      attr_access::array_as_string() to build a temporary type only
>>> >>      for the purposes of formatting it.
>>> >>
>>> >> 2) A tree (an attribute list) built by tree_cons() called from
>>> >>      build_attr_access_from_parms() that's used only for the duration
>>> >>      of the caller.
>>> >>
>>> >> Do these temporary trees need to be released somehow or are the leaks
>>> >> expected?
>>> >
>>> > You should configure GCC with --enable-valgrind-annotations to make
>>> > it aware of our GC.
>>>
>>> I did configure with that option:
>>>
>>> $ /src/gcc/master/configure --enable-checking=yes
>>> --enable-languages=all,jit,lto --enable-host-shared
>>> --enable-valgrind-annotations
>
>>> $ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall
>>> /src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper
>>> valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt
>>>
>>> Do you not see the same leaks?
>
>I do; also stuff like:
>
>    56 bytes in 1 blocks are still reachable in loss record 152 of 875
>       at 0x483DD99: calloc (vg_replace_malloc.c:762)
>       by 0x1753240: xcalloc (xmalloc.c:162)
>       by 0x669C83: ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long) (ggc-page.c:918)
>       by 0x89E07D: ggc_internal_cleared_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long) (ggc-common.c:117)
>       by 0xF65D0D: make_node(tree_code) (ggc.h:143)
>       by 0xF6632B: build_decl(unsigned int, tree_code, tree_node*, tree_node*) (tree.c:5264)
>       by 0xA28ADC: build_builtin_function(unsigned int, char const*, tree_node*, int, built_in_class, char const*, tree_node*) (langhooks.c:681)
>       by 0xA29FDD: add_builtin_function(char const*, tree_node*, int, built_in_class, char const*, tree_node*) (langhooks.c:716)
>       by 0x622BFB: def_builtin_1(built_in_function, char const*, built_in_class, tree_node*, tree_node*, bool, bool, bool, tree_node*, bool) [clone .constprop.25] (lto-lang.c:650)
>       by 0x640709: lto_define_builtins(tree_node*, tree_node*) (omp-builtins.def:46)
>       by 0x641EE3: lto_init() (lto-lang.c:1339)
>       by 0x61E26A: toplev::main(int, char**) (toplev.c:1921)
>
>... and many, many more.
>
>> Err, well.  --show-leak-kinds=all is probably the cause.
>
>Before finding this email, I too had convinced myself that everying that
>came by 'ggc_*' I may ignore, because:
>
>> We
>> definitely do not force-release
>> all reachable GC allocated memory at program end.
>
>... of this: these blocks simply had not been GCed at program end.
>
>It's however a bit tedious to filter, in my case, 11864 lines of Valgrind
>output.
>
>> Not sure if
>> valgrind annotations can
>> make that obvious to valgrind.
>
>Or, if that's not feasible (I don't know much about Valgrind...), then
>instead would it help to force a final GC at program end if we're running
>in "valgrind mode"?  If that's a plausible thing to do, would guarding
>that by GCC having been configured with '--enable-valgrind-annotations'
>be OK, or do we need a '--param', or something else?

Well, instead of a final GC we could explicitly release all GC managed memory. 

>> I'm just using --leak-check=full and
>> thus look for
>> unreleased and no longer reachable memory.
>
>ACK, in my case, that only shows seven errors (not related to my stuff).
>
>
>Grüße
> Thomas
>-----------------
>Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
Thomas Schwinge Aug. 17, 2021, 1 p.m. UTC | #7
Hi!

On 2021-08-06T17:10:36+0200, Richard Biener <richard.guenther@gmail.com> wrote:
> On August 6, 2021 4:09:37 PM GMT+02:00, Thomas Schwinge <thomas@codesourcery.com> wrote:
>>I'm working on plugging a memory leak in an entirely different
>>compartment of GCC, but also ran into this issue:
>>
>>On 2021-02-12T08:35:52+0100, Richard Biener via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>>> On Thu, Feb 11, 2021 at 7:35 PM Martin Sebor <msebor@gmail.com> wrote:
>>>> On 2/11/21 12:59 AM, Richard Biener wrote:
>>>> > On Wed, Feb 10, 2021 at 6:16 PM Martin Sebor <msebor@gmail.com> wrote:
>>>> >> [...] Valgrind shows more leaks in this code that
>>>> >> I'm not sure what to do about:
>>>> >>
>>>> >> 1) A tree built by build_type_attribute_qual_variant() called from
>>>> >>      attr_access::array_as_string() to build a temporary type only
>>>> >>      for the purposes of formatting it.
>>>> >>
>>>> >> 2) A tree (an attribute list) built by tree_cons() called from
>>>> >>      build_attr_access_from_parms() that's used only for the duration
>>>> >>      of the caller.
>>>> >>
>>>> >> Do these temporary trees need to be released somehow or are the leaks
>>>> >> expected?
>>>> >
>>>> > You should configure GCC with --enable-valgrind-annotations to make
>>>> > it aware of our GC.
>>>>
>>>> I did configure with that option:
>>>>
>>>> $ /src/gcc/master/configure --enable-checking=yes
>>>> --enable-languages=all,jit,lto --enable-host-shared
>>>> --enable-valgrind-annotations
>>
>>>> $ /build/gcc-master/gcc/xgcc -B /build/gcc-master/gcc -S -Wall
>>>> /src/gcc/master/gcc/testsuite/gcc.dg/Wvla-parameter.c -wrapper
>>>> valgrind,--leak-check=full,--show-leak-kinds=all,--track-origins=yes,--log-file=valgrind-out.txt
>>>>
>>>> Do you not see the same leaks?
>>
>>I do; also stuff like:
>>
>>    56 bytes in 1 blocks are still reachable in loss record 152 of 875
>>       at 0x483DD99: calloc (vg_replace_malloc.c:762)
>>       by 0x1753240: xcalloc (xmalloc.c:162)
>>       by 0x669C83: ggc_internal_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long) (ggc-page.c:918)
>>       by 0x89E07D: ggc_internal_cleared_alloc(unsigned long, void (*)(void*), unsigned long, unsigned long) (ggc-common.c:117)
>>       by 0xF65D0D: make_node(tree_code) (ggc.h:143)
>>       by 0xF6632B: build_decl(unsigned int, tree_code, tree_node*, tree_node*) (tree.c:5264)
>>       by 0xA28ADC: build_builtin_function(unsigned int, char const*, tree_node*, int, built_in_class, char const*, tree_node*) (langhooks.c:681)
>>       by 0xA29FDD: add_builtin_function(char const*, tree_node*, int, built_in_class, char const*, tree_node*) (langhooks.c:716)
>>       by 0x622BFB: def_builtin_1(built_in_function, char const*, built_in_class, tree_node*, tree_node*, bool, bool, bool, tree_node*, bool) [clone .constprop.25] (lto-lang.c:650)
>>       by 0x640709: lto_define_builtins(tree_node*, tree_node*) (omp-builtins.def:46)
>>       by 0x641EE3: lto_init() (lto-lang.c:1339)
>>       by 0x61E26A: toplev::main(int, char**) (toplev.c:1921)
>>
>>... and many, many more.
>>
>>> Err, well.  --show-leak-kinds=all is probably the cause.
>>
>>Before finding this email, I too had convinced myself that everying that
>>came by 'ggc_*' I may ignore, because:
>>
>>> We
>>> definitely do not force-release
>>> all reachable GC allocated memory at program end.
>>
>>... of this: these blocks simply had not been GCed at program end.
>>
>>It's however a bit tedious to filter, in my case, 11864 lines of Valgrind
>>output.

(Actually, might use something like the "mitigated as follows" that I've
added here: <https://gcc.gnu.org/wiki/DebuggingGCC#Valgrind>.)

>>> Not sure if
>>> valgrind annotations can
>>> make that obvious to valgrind.
>>
>>Or, if that's not feasible (I don't know much about Valgrind...), then
>>instead would it help to force a final GC at program end if we're running
>>in "valgrind mode"?  If that's a plausible thing to do, would guarding
>>that by GCC having been configured with '--enable-valgrind-annotations'
>>be OK, or do we need a '--param', or something else?
>
> Well, instead of a final GC we could explicitly release all GC managed memory.

Heh, of course, a "final GC at program end" doesn't help (much), given
that (most of) all the blocks are still reachable via the usual GC roots.

So I tried looking into how we might release all GCC memory
unconditionally, via adapting 'ggc_mark_roots' (to not add back roots via
'ggc_mark_root_tab'), 'clear_marks', 'sweep_pages', 'release_pages',
etc., but couldn't get this to work.  It doesn't help, of course, that I
don't know much about how the GC really works internally.  Possibly my
non-understanding of the "context depth" is highly relevant.

Anyway, this isn't really important for me right now, having otherwise
resolve my original issue, so I'm not intending to spend a lot more time
on this.


Calling 'memory_block_pool::trim (0);' at the end of 'gcc/main.c:main'
does have some effect, too, but isn't sufficient/useful on its own, of
course.


Grüße
 Thomas
-----------------
Siemens Electronic Design Automation GmbH; Anschrift: Arnulfstraße 201, 80634 München; Gesellschaft mit beschränkter Haftung; Geschäftsführer: Thomas Heurung, Frank Thürauf; Sitz der Gesellschaft: München; Registergericht München, HRB 106955
diff mbox series

Patch

PR c/99055 - memory leak in warn_parm_array_mismatch

gcc/c-family/ChangeLog:

	PR c/99055
	* c-warn.c (warn_parm_array_mismatch): Free strings returned from
	print_generic_expr_to_str.

gcc/ChangeLog:

	* tree-pretty-print.c (print_generic_expr_to_str): Update comment.

diff --git a/gcc/c-family/c-warn.c b/gcc/c-family/c-warn.c
index e6e28d9b139..27c7baa61bb 100644
--- a/gcc/c-family/c-warn.c
+++ b/gcc/c-family/c-warn.c
@@ -3636,6 +3636,10 @@  warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
 		    inform (&richloc, "previously declared as %s with bound "
 			    "%<%s%>", curparmstr.c_str (), curbndstr);
 
+		  if (newbnd)
+		    free (const_cast <char *>(newbndstr));
+		  if (curbnd)
+		    free (const_cast <char *>(curbndstr));
 		  continue;
 		}
 	    }
@@ -3646,7 +3650,13 @@  warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
 		 Compare them lexicographically to detect gross mismatches
 		 such as between T[foo()] and T[bar()].  */
 	      if (operand_equal_p (newbnd, curbnd, OEP_LEXICOGRAPHIC))
-		continue;
+		{
+		  if (newbnd)
+		    free (const_cast <char *>(newbndstr));
+		  if (curbnd)
+		    free (const_cast <char *>(curbndstr));
+		  continue;
+		}
 
 	      if (warning_at (newloc, OPT_Wvla_parameter,
 			      "argument %u of type %s declared with "
@@ -3654,8 +3664,18 @@  warn_parm_array_mismatch (location_t origloc, tree fndecl, tree newparms)
 			      parmpos + 1, newparmstr.c_str (), newbndstr))
 		inform (origloc, "previously declared as %s with bound %qs",
 			curparmstr.c_str (), curbndstr);
+
+	      if (newbnd)
+		free (const_cast <char *>(newbndstr));
+	      if (curbnd)
+		free (const_cast <char *>(curbndstr));
 	      continue;
 	    }
+
+	  if (newbnd)
+	    free (const_cast <char *>(newbndstr));
+	  if (curbnd)
+	    free (const_cast <char *>(curbndstr));
 	}
 
       if (newa->minsize == cura->minsize
diff --git a/gcc/tree-pretty-print.c b/gcc/tree-pretty-print.c
index aabe6bb23b9..986f75d1d5f 100644
--- a/gcc/tree-pretty-print.c
+++ b/gcc/tree-pretty-print.c
@@ -169,7 +169,8 @@  print_generic_expr (FILE *file, tree t, dump_flags_t flags)
   pp_flush (tree_pp);
 }
 
-/* Print a single expression T to string, and return it.  */
+/* Print a single expression T to string, and return it.  The caller
+   must free the returned memory.  */
 
 char *
 print_generic_expr_to_str (tree t)