diff mbox

fix memory leak in gengtype

Message ID alpine.LNX.2.00.1104202353490.10280@localhost.localdomain
State New
Headers show

Commit Message

Dimitrios Apostolou April 20, 2011, 9:08 p.m. UTC
Hello list,

while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM 
killed. That's when I noticed that its RAM usage peaks at 150MB, which is 
a bit excessive for parsing a ~500K text file.

The attached patch fixes the leak and gengtype now uses a peak of 4MB 
heap. Hopefully I don't do something wrong, since it took me a while to 
understand those obstacks...


Thanks,
Dimitris


P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu) but 
unfortunately the sparcstation crashed and burned after this, so I can't 
continue the build and report back :-(

Comments

Jeff Law April 20, 2011, 10:55 p.m. UTC | #1
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/20/11 15:08, Dimitrios Apostolou wrote:
> Hello list,
> 
> while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM
> killed. That's when I noticed that its RAM usage peaks at 150MB, which
> is a bit excessive for parsing a ~500K text file.
> 
> The attached patch fixes the leak and gengtype now uses a peak of 4MB
> heap. Hopefully I don't do something wrong, since it took me a while to
> understand those obstacks...
The code in question creates an obstack, allocates (and grows) a single
object on the obstack, then frees the object.  This leaks the underlying
obstack structure itself and potentially any chunks that were too small
to hold the object.

It turns out there's a similar leak in gengtype.c which is fixed in the
same way.

A quick valgrind test shows that prior to your change gengtype leaked
roughly 200M, after your change it leaks about 1.3M and after fixing
gengtype it leaks a little under 300k.

I'll run those changes through the usual tests and check in the changes
assuming they pass those tests.

Thanks for the patch!

> 
> P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu)
> but unfortunately the sparcstation crashed and burned after this, so I
> can't continue the build and report back :-(
:(  My old PA box has similar problems, though it merely overheats
before a bootstrap can complete, so in theory I could coax it to finish
a bootstrap.   Luckily others (particularly John) have stepped in over
the last decade and taken excellent care of the PA port.

Jeff

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNr2RiAAoJEBRtltQi2kC7ryUH/iYvVw8LWZNWc1zSczCOOo8w
T8uyVX6WX+0xjPDA52si34BdCXfKdNDmtQXAVpnRbbTrgT42lj1bTH9c9KLadWEZ
0/FUZQB5VGQTMYah7iDDAfyjUdyRRCZW/YWnbyfAP0UdVTR7xJsjqjjWEetuyyFA
jF6WQYovzWzjssUnKfPnD/WyQxoPm+gihBVw0abhdPpojXcH8uMYrXpZrGLEk0QA
drR0ogL3ZKNJiRMFZQH5NKrhhx76mPiACsRZmCJkXSm+N6GqRsJFE9gGbc7Lwpdn
bVjd1CGo5yYCscEM/yUBS4fclO6aDRRdMbT5/cVsObYXv58WGG1gfk0F6g1GqFs=
=d6SQ
-----END PGP SIGNATURE-----
Dimitrios Apostolou April 20, 2011, 11:35 p.m. UTC | #2
On Wed, 20 Apr 2011, Jeff Law wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 04/20/11 15:08, Dimitrios Apostolou wrote:
>> Hello list,
>>
>> while trying to build gcc-4.6.0 on my sparcstation, I got gengtype OOM
>> killed. That's when I noticed that its RAM usage peaks at 150MB, which
>> is a bit excessive for parsing a ~500K text file.
>>
>> The attached patch fixes the leak and gengtype now uses a peak of 4MB
>> heap. Hopefully I don't do something wrong, since it took me a while to
>> understand those obstacks...
> The code in question creates an obstack, allocates (and grows) a single
> object on the obstack, then frees the object.  This leaks the underlying
> obstack structure itself and potentially any chunks that were too small
> to hold the object.

Plus a whole page which is preallocated by the obstack, if I understand 
correctly. As a result, for each word in the text file we consume 4KB, 
which are never freed.

>
> It turns out there's a similar leak in gengtype.c which is fixed in the
> same way.
>

Nice, thanks for looking deeper into this, I just stopped when memory 
utilisation seemed ok.

> A quick valgrind test shows that prior to your change gengtype leaked
> roughly 200M, after your change it leaks about 1.3M and after fixing
> gengtype it leaks a little under 300k.
>
> I'll run those changes through the usual tests and check in the changes
> assuming they pass those tests.
>
> Thanks for the patch!
>
>>
>> P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu)
>> but unfortunately the sparcstation crashed and burned after this, so I
>> can't continue the build and report back :-(
> :(  My old PA box has similar problems, though it merely overheats
> before a bootstrap can complete, so in theory I could coax it to finish
> a bootstrap.   Luckily others (particularly John) have stepped in over
> the last decade and taken excellent care of the PA port.

If by PA you mean PA-RISC, I remember when I had access to a Visualize 
C200 with gentoo on. I loved the machine, but it had an important issue: 
it was absolutely random if it would power up, when pressing the power 
button. But as long as we never turned it off, it worked ok :-)


Dimitris
Laurynas Biveinis April 21, 2011, 3:44 a.m. UTC | #3
Dimitrios -

The patch is OK with a ChangeLog entry. Also a patch to fix the same
in gengtype.c:matching_file_name_substitute is pre-approved (but it
looks like Jeff will beat you to this :)

> P.S. I was trying to test gcc on a rare arch (sparc-unknown-linux-gnu) but
> unfortunately the sparcstation crashed and burned after this, so I can't
> continue the build and report back :-(

:( Why don't you get yourself a compile farm account?
http://gcc.gnu.org/wiki/CompileFarm
Jeff Law April 21, 2011, 3:13 p.m. UTC | #4
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/20/11 17:35, Dimitrios Apostolou wrote:

> 
> Plus a whole page which is preallocated by the obstack, if I understand
> correctly. As a result, for each word in the text file we consume 4KB,
> which are never freed.
Plausible.  Though I always thought obstacks would release all the
unused chunks as part of the obstack_free call and the obstack structure
itself was always supposed to be "small".

Regardless of what exactly was leaked, the two locations you identified
were leaking a ton of memory.

> 
>>
>> It turns out there's a similar leak in gengtype.c which is fixed in the
>> same way.
>>
> 
> Nice, thanks for looking deeper into this, I just stopped when memory
> utilisation seemed ok.
It's been so long since I had to think about obstacks and I wasn't sure
the leak was really going to be *that* bad, so I decided to run gengtype
under valgrind to verify leak behavior.

FWIW, the remaining 300K are almost all leaks through the hash tables.
It wouldn't take terribly long to verify the death of the hash tables
and insert a suitable htab_delete call.  It probably doesn't matter
much, but I hate leaving the leak, so I'll probably sit down and learn a
little more about gengtype* so I can safely plug the leak hash table leaks.


> 
> If by PA you mean PA-RISC, I remember when I had access to a Visualize
> C200 with gentoo on. I loved the machine, but it had an important issue:
> it was absolutely random if it would power up, when pressing the power
> button. But as long as we never turned it off, it worked ok :-)
I've got a 715, C240 & R390 in the basement.  The R390 overheats.  The
other two worked when I fired them up several years ago.

Anyway, attached is the patch I ultimately checked in.  Same as yours
with the addition of fixing a similar obstack leak in gengtype.c.

jeff
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.11 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNsEmrAAoJEBRtltQi2kC7EykH/2nbMtFrAcorqinQViI9Nvtt
xO7H764lFjTQa4gsKP/gq+RFRM2s8omyI8cgaFABM7kfkFp64ZUspXHXQS/U1PX/
PHlLCxnjmnw/w56VGDRjF8z9MZ30Cc3dU7xfJRUAbRuooYzYrPw5fMivCeQo4axy
+hPEhCHohIOzSjC5+yKnafklfgQdVE1pTc9Cp5LKCTDYAlzvh2vi6FOHf8FF2/1C
Dmqgv5qF8Bpd3tyjXj6+/raTU6RfsGsDcQiQo5fADosjPj+h4iWdoVir3xm4TGoU
mCwsAoywawk2tPBvlbHAMEg5fiia3qjQ73Pmt2Z62WcK/C+BaZJnj4Or/iZ6Xao=
=2JFI
-----END PGP SIGNATURE-----
* gengtype-state.c (read_a_state_token): Fix argument to 
	obstack_free.
	* gengtype.c (matching_file_name_substitute): Likewise.

Index: gengtype-state.c
===================================================================
*** gengtype-state.c	(revision 172644)
--- gengtype-state.c	(working copy)
*************** read_a_state_token (void)
*** 303,309 ****
        obstack_1grow (&id_obstack, (char) 0);
        ids = XOBFINISH (&id_obstack, char *);
        sid = state_ident_by_name (ids, INSERT);
!       obstack_free (&id_obstack, ids);
        ids = NULL;
        tk = XCNEW (struct state_token_st);
        tk->stok_kind = STOK_NAME;
--- 303,309 ----
        obstack_1grow (&id_obstack, (char) 0);
        ids = XOBFINISH (&id_obstack, char *);
        sid = state_ident_by_name (ids, INSERT);
!       obstack_free (&id_obstack, NULL);
        ids = NULL;
        tk = XCNEW (struct state_token_st);
        tk->stok_kind = STOK_NAME;
*************** read_a_state_token (void)
*** 408,414 ****
        tk->stok_file = state_path;
        tk->stok_next = NULL;
        strcpy (tk->stok_un.stok_string, cstr);
!       obstack_free (&bstring_obstack, cstr);
  
        return tk;
      }
--- 408,414 ----
        tk->stok_file = state_path;
        tk->stok_next = NULL;
        strcpy (tk->stok_un.stok_string, cstr);
!       obstack_free (&bstring_obstack, NULL);
  
        return tk;
      }
Index: gengtype.c
===================================================================
*** gengtype.c	(revision 172644)
--- gengtype.c	(working copy)
*************** matching_file_name_substitute (const cha
*** 1943,1949 ****
    obstack_1grow (&str_obstack, '\0');
    rawstr = XOBFINISH (&str_obstack, char *);
    str = xstrdup (rawstr);
!   obstack_free (&str_obstack, rawstr);
    DBGPRINTF ("matched replacement %s", str);
    rawstr = NULL;
    return str;
--- 1943,1949 ----
    obstack_1grow (&str_obstack, '\0');
    rawstr = XOBFINISH (&str_obstack, char *);
    str = xstrdup (rawstr);
!   obstack_free (&str_obstack, NULL);
    DBGPRINTF ("matched replacement %s", str);
    rawstr = NULL;
    return str;
Dimitrios Apostolou April 21, 2011, 6:26 p.m. UTC | #5
On Thu, 21 Apr 2011, Laurynas Biveinis wrote:
>
> :( Why don't you get yourself a compile farm account?
> http://gcc.gnu.org/wiki/CompileFarm
>

Thanks Laurynas, I am absolutely thrilled to see such a variety of 
hardware! I'll try applying, but I'm not sure I'm eligible, my 
contributions to OSS are just a few and mostly simple.


Thanks,
Dimitris
diff mbox

Patch

--- gcc/gengtype-state.c.orig	2011-04-20 23:06:29.000000000 +0300
+++ gcc/gengtype-state.c	2011-04-20 23:12:43.000000000 +0300
@@ -303,7 +303,7 @@ 
       obstack_1grow (&id_obstack, (char) 0);
       ids = XOBFINISH (&id_obstack, char *);
       sid = state_ident_by_name (ids, INSERT);
-      obstack_free (&id_obstack, ids);
+      obstack_free (&id_obstack, NULL);
       ids = NULL;
       tk = XCNEW (struct state_token_st);
       tk->stok_kind = STOK_NAME;
@@ -408,7 +408,7 @@ 
       tk->stok_file = state_path;
       tk->stok_next = NULL;
       strcpy (tk->stok_un.stok_string, cstr);
-      obstack_free (&bstring_obstack, cstr);
+      obstack_free (&bstring_obstack, NULL);
 
       return tk;
     }