Message ID: 57A32741.7010003@redhat.com
State: New
On Thu, Aug 4, 2016 at 1:30 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> Howdy!
>
> As part of my -Walloca-larger-than=life work, I've been running said pass
> over gcc, binutils, and gdb, and trying to fix things along the way.
>
> Particularly irritating and error prone is having to free malloc'd pointers
> on every function exit point.  We end up with a lot of:
>
>   foo(size_t len)
>   {
>     void *p, *m_p = NULL;
>     if (len < HUGE)
>       p = alloca(len);
>     else
>       p = m_p = malloc(len);
>     if (something)
>       goto out;
>     stuff();
>   out:
>     free (m_p);
>   }
>
> ...which nobody really likes.
>
> I've been thinking that for GCC we could have a protected_alloca class whose
> destructor frees any malloc'd memory:
>
>   void foo()
>   {
>     char *p;
>     protected_alloca chunk(50000);
>     p = (char *) chunk.pointer();
>     f(p);
>   }
>
> This would generate:
>
>   void foo() ()
>   {
>     void * _3;
>
>     <bb 2>:
>     _3 = malloc (50000);
>     f (_3);
>
>     <bb 3>:
>     free (_3); [tail call]
>     return;
>   }
>
> Now the problem with this is that the memory allocated by chunk is freed
> when it goes out of scope, which may not be what you want.  For example:
>
>   func()
>   {
>     char *str;
>     {
>       protected_alloca chunk (99999999);
>       // malloc'd pointer will be freed when chunk goes out of scope.
>       str = (char *) chunk.pointer ();
>     }
>     use (str);  // BAD!  Use after free.
>   }

But how's that an issue if the chunk is created at the exact place where
there previously was an alloca?

Your class also will not work when internal_alloc is not inlined and the
alloca path is taken, like when using non-GCC host compilers.

> In the attached patch implementing this class I have provided another idiom
> for avoiding this problem:
>
>   func()
>   {
>     void *ptr;
>     protected_alloca chunk;
>     {
>       chunk.alloc (9999999);
>       str = (char *) chunk.pointer ();
>     }
>     // OK, pointer will be freed on function exit.
>     use (str);
>   }
>
> So I guess it's between annoying gotos and keeping track of multiple exit
> points to a function previously calling alloca, or making sure the
> protected_alloca object always resides in the scope where the memory is
> going to be used.
>
> Is there a better blessed C++ way?  If not, is this OK?

It looks like you want to replace _all_ alloca uses?  What's the point in
doing this at all?  Just to be able to enable the warning during bootstrap?

Having the conditional malloc/alloca will also inhibit optimization like
eliding the malloc or alloca calls completely.

Thanks,
Richard.

> Included is the conversion of tree.c.  More to follow once we agree on a
> solution.
>
> Tested on x86-64 Linux.
>
> Aldy
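[The patch itself is not reproduced in the archive. A minimal sketch of the interface being discussed might look like the following; the names are taken from the quoted usage, but the body is an assumption. In particular, the real patch reportedly keeps an alloca fast path behind an always_inline constructor, which this sketch deliberately omits and replaces with plain malloc, keeping only the RAII "free on scope exit" behavior.]

```cpp
#include <cstdlib>

// Hypothetical sketch of the protected_alloca interface quoted above.
// Every request is heap-allocated here so the sketch stays portable;
// the proposed class additionally used alloca for small requests.
class protected_alloca
{
  void *m_ptr;
public:
  protected_alloca () : m_ptr (nullptr) {}
  explicit protected_alloca (size_t len) : m_ptr (malloc (len)) {}
  ~protected_alloca () { free (m_ptr); }          // freed at scope exit
  void alloc (size_t len) { free (m_ptr); m_ptr = malloc (len); }
  void *pointer () { return m_ptr; }
  // Non-copyable so the destructor frees the chunk exactly once.
  protected_alloca (const protected_alloca &) = delete;
  protected_alloca &operator= (const protected_alloca &) = delete;
};
```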
On 08/04/2016 08:57 AM, Richard Biener wrote:
> On Thu, Aug 4, 2016 at 1:30 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> [...]
>> Now the problem with this is that the memory allocated by chunk is freed
>> when it goes out of scope, which may not be what you want.  For example:
>>
>>   func()
>>   {
>>     char *str;
>>     {
>>       protected_alloca chunk (99999999);
>>       // malloc'd pointer will be freed when chunk goes out of scope.
>>       str = (char *) chunk.pointer ();
>>     }
>>     use (str);  // BAD!  Use after free.
>>   }
>
> But how's that an issue if the chunk is created at the exact place where
> there previously was an alloca?

The pointer can escape if you assign it to a variable outside the scope of
chunk?  Take for instance the following snippet in tree.c:

  {
    ...
    ...
    q = (char *) alloca (9 + 17 + len + 1);
    memcpy (q, file, len + 1);

    snprintf (q + len, 9 + 17 + 1, "_%08X_" HOST_WIDE_INT_PRINT_HEX,
              crc32_string (0, name), get_random_seed (false));

    p = q;
  }

  clean_symbol_name (q);

If you define `protected_alloca chunk(9 + 17 + len + 1)' at the alloca()
call, chunk will be destroyed at the "}", whereas `q' is still being used
outside of that scope.

What I am suggesting for this escaping case is to define "protected_alloca
chunk()" at function scope, and then do chunk.alloc(N) in the spot where
the alloca() call was previously at.

Or am I missing something?

> Your class also will not work when internal_alloc is not inlined and the
> alloca path is taken, like when using non-GCC host compilers.

Does not work, or is not optimal?  Because defining _ALLOCA_INLINE_ to
nothing and forcing no-inline with:

  g++ -c b.cc -fno-inline -fdump-tree-all -O1 -fno-exceptions

I still see correct code.  It's just that it's inefficient, which we
shouldn't care about because bootstrap fixes the non-GCC inlining
problem :).

  protected_alloca chunk(123);
  str = (char *) chunk.pointer();
  use(str);

becomes:

  <bb 2>:
  protected_alloca::protected_alloca (&chunk, 123);
  str_3 = protected_alloca::pointer (&chunk);
  use (str_3);
  protected_alloca::~protected_alloca (&chunk);
  return;

What am I missing?

>> In the attached patch implementing this class I have provided another
>> idiom for avoiding this problem:
>> [...]
>> Is there a better blessed C++ way?  If not, is this OK?
>
> It looks like you want to replace _all_ alloca uses?  What's the point in
> doing this at all?  Just to be able to enable the warning during
> bootstrap?

Well, it did cross my mind to nix anything that had 0 bounds checks, but I
was mostly interested in things like this:

  gcc.c:
	  temp = env.get ("COMPILER_PATH");
	  if (temp)
	    {
	      const char *startp, *endp;
	      char *nstore = (char *) alloca (strlen (temp) + 3);

I was just providing a generic interface for dealing with these cases in
the future, instead of gotoing my way out of it.

> Having the conditional malloc/alloca will also inhibit optimization like
> eliding the malloc or alloca calls completely.

If we can elide the alloca, we can surely elide a conditional alloca /
malloc pair, can't we?

Aldy
How wedded are we to alloca?
On 08/04/2016 12:30 PM, Aldy Hernandez wrote:
> Is there a better blessed C++ way?
std::string
or if I'm reading gcc's vec.h correctly:
auto_vec<char, MAX_ALLOCA_SIZE>
?
(or some typedef of the latter)
The latter would always allocate the stack space, but
my guess is that it wouldn't matter.
Thanks,
Pedro Alves
On 08/04/2016 01:06 PM, Pedro Alves wrote:
> How wedded are we to alloca?
I would think only in the sense of existing codebase usage.
Based on how often alloca usage has resulted in a security vulnerability
it's clear we as developers can't use it correctly on a consistent
basis, thus I'd like to abolish it :-) I'll settle for warning folks
when they use it incorrectly though.
Jeff
On 08/04/2016 08:16 PM, Jeff Law wrote:
> On 08/04/2016 01:06 PM, Pedro Alves wrote:
>> How wedded are we to alloca?
> I would think only in the sense of existing codebase usage.
>
> Based on how often alloca usage has resulted in a security vulnerability
> it's clear we as developers can't use it correctly on a consistent
> basis, thus I'd like to abolish it :-)  I'll settle for warning folks
> when they use it incorrectly though.

Most allocas I've seen in my life were written to simply build strings at
run time, while lazily avoiding to think about writing a "free" call,
rather than having been written for optimizing some fast path, or for
async-signal-safety reasons.

My guess is that auto_vec<char, MAX_ALLOCA_SIZE> covers any
fast-path-handling requirement in gcc.

(Where alloca is being used for async-signal-safety reasons you can't have
a malloc fallback anyway, so that's out of scope for the proposed
protected_alloca too.)

Thanks,
Pedro Alves
On 08/04/2016 09:19 AM, Aldy Hernandez wrote:
>>> Particularly irritating and error prone is having to free malloc'd
>>> pointers on every function exit point.  We end up with a lot of:
>>> [...]
>>> ...which nobody really likes.

Yup.  You'll see that all over glibc.  I don't think we use that idiom all
that much for GCC.

>>> I've been thinking that for GCC we could have a protected_alloca class
>>> whose destructor frees any malloc'd memory:
>>> [...]
>>> Now the problem with this is that the memory allocated by chunk is
>>> freed when it goes out of scope, which may not be what you want.
>>> [...]
>>
>> But how's that an issue if the chunk is created at the exact place
>> where there previously was an alloca?
>
> The pointer can escape if you assign it to a variable outside the scope
> of chunk?  Take for instance the following snippet in tree.c:

Right.  The alloca model (with the exception of VLAs in loops) is that the
storage hangs around until function scope is left.  So releasing on exit
of the scope in which the space was allocated is wrong unless allocation
occurred at function scope.  That doesn't play as well with an RAII model
where objects get destroyed as soon as they leave their lexical scope.

> [... tree.c snippet elided ...]
>
> If you define `protected_alloca chunk(9 + 17 + len + 1)' at the alloca()
> call, chunk will be destroyed at the "}", whereas `q' is still being
> used outside of that scope.
>
> What I am suggesting for this escaping case is to define
> "protected_alloca chunk()" at function scope, and then do chunk.alloc(N)
> in the spot where the alloca() call was previously at.
>
> Or am I missing something?

I think the only thing you're missing is that we probably don't want to be
using alloca in new code anyway.  And if we're going through the trouble
of fixing old code, we ought to just remove alloca.  So building up this
kind of class is probably taking folks in the wrong direction.

Now you might use that opportunity to convert some object to a RAII model
(and I'm a huge fan of that model).

Jeff
On 08/04/2016 01:21 PM, Pedro Alves wrote:
> On 08/04/2016 08:16 PM, Jeff Law wrote:
>> On 08/04/2016 01:06 PM, Pedro Alves wrote:
>>> How wedded are we to alloca?
>> I would think only in the sense of existing codebase usage.
>>
>> Based on how often alloca usage has resulted in a security vulnerability
>> it's clear we as developers can't use it correctly on a consistent
>> basis, thus I'd like to abolish it :-)  I'll settle for warning folks
>> when they use it incorrectly though.
>
> Most allocas I've seen in my life were written to simply build
> strings at run time, while lazily avoiding to think about writing
> a "free" call, rather than having been written for optimizing
> some fast path, or for async-signal safe reasons.

Right -- the problem is if those strings are potentially under user
control, or the bad guys can arrange to overflow a size computation
feeding an alloca, then this stuff becomes a huge gaping security hole.
We've seen this repeatedly within glibc.

Jeff
On 08/04/2016 08:26 PM, Jeff Law wrote:
> On 08/04/2016 01:21 PM, Pedro Alves wrote:
>> Most allocas I've seen in my life were written to simply build
>> strings at run time, while lazily avoiding to think about writing
>> a "free" call, rather than having been written for optimizing
>> some fast path, or for async-signal safe reasons.
> Right -- the problem is if those strings are potentially under user
> control or the bad guys can arrange to overflow a size computation
> feeding an alloca, then this stuff becomes a huge gaping security hole.
> We've seen this repeatedly within glibc.

Yup.

Thanks,
Pedro Alves
On 08/04/2016 05:30 AM, Aldy Hernandez wrote:
> Howdy!
>
> As part of my -Walloca-larger-than=life work, I've been running said
> pass over gcc, binutils, and gdb, and trying to fix things along the way.
> [...]
> So I guess it's between annoying gotos and keeping track of multiple
> exit points to a function previously calling alloca, or making sure the
> protected_alloca object always resides in the scope where the memory is
> going to be used.
>
> Is there a better blessed C++ way?  If not, is this OK?

I was surprised by the always_inline trick.  I suppose it makes sense but
I wouldn't have expected to be able to rely on it.  Out of curiosity I
tested it with other compilers.  It turns out that it only works with
some, like GCC and IBM XLC++, but not others, like Clang or Sun CC.  In
recursive calls, they don't seem to hold on to the memory allocated via
alloca by the class ctor in the caller.

FWIW, rather than creating a new class around alloca and putting effort
into changing code to use it, I think that investing time into replacing
the function's uses with an existing C++ container class (either GCC's
own or one from the STL) might be a better way to go.

Martin
On Thu, Aug 4, 2016 at 5:19 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> On 08/04/2016 08:57 AM, Richard Biener wrote:
>> [...]
>> But how's that an issue if the chunk is created at the exact place
>> where there previously was an alloca?
>
> The pointer can escape if you assign it to a variable outside the scope
> of chunk?  Take for instance the following snippet in tree.c:
> [... tree.c snippet elided ...]
>
> What I am suggesting for this escaping case is to define
> "protected_alloca chunk()" at function scope, and then do chunk.alloc(N)
> in the spot where the alloca() call was previously at.
>
> Or am I missing something?

Ah, ok - alloca memory is only freed at the end of the function.  Only
VLAs are freed at scope boundary.

>> Your class also will not work when internal_alloc is not inlined and
>> the alloca path is taken like when using non-GCC host compilers.
>
> Does not work, or is not optimal?  Because defining _ALLOCA_INLINE_ to
> nothing and forcing no-inline with:
>
>   g++ -c b.cc -fno-inline -fdump-tree-all -O1 -fno-exceptions
>
> I still see correct code.  It's just that it's inefficient, which we
> shouldn't care because bootstrap fixes the non-GCC inlining problem :).

As alloca() frees memory at function return, the code cannot be "correct".

>   protected_alloca chunk(123);
>   str = (char *) chunk.pointer();
>   use(str);
>
> becomes:
>
>   <bb 2>:
>   protected_alloca::protected_alloca (&chunk, 123);
>   str_3 = protected_alloca::pointer (&chunk);
>   use (str_3);
>   protected_alloca::~protected_alloca (&chunk);
>   return;
>
> What am I missing?

The memory is released after internal_alloc returns.  So if you do

  protected_alloca::protected_alloca (&chunk, 123);
  ...
  foo ();  // function needing some stack space or GCC spilling
  ...

you'll corrupt memory.

>> It looks like you want to replace _all_ alloca uses?  What's the point
>> in doing this at all?  Just to be able to enable the warning during
>> bootstrap?
>
> Well, it did cross my mind to nix anything that had 0 bounds checks, but
> I was mostly interested in things like this:
>
>   gcc.c:
> 	  temp = env.get ("COMPILER_PATH");
> 	  if (temp)
> 	    {
> 	      const char *startp, *endp;
> 	      char *nstore = (char *) alloca (strlen (temp) + 3);
>
> I was just providing a generic interface for dealing with these cases in
> the future, instead of gotoing my way out of it.
>
>> Having the conditional malloc/alloca will also inhibit optimization
>> like eliding the malloc or alloca calls completely.
>
> If we can elide the alloca, we can surely elide a conditional alloca /
> malloc pair, can't we?

We can't at the moment.  We don't for malloc (we only remove malloc/free
pairs when the memory is not used); we can elide alloca to use a local
decl instead.

Richard.

> Aldy
> I think only thing you're missing is that we probably don't want to be
> using alloca in new code anyway.  And if we're going through the trouble
> of fixing old code, we ought to just remove alloca.  So building up this
> kind of class is probably taking folks in the wrong direction.

After spending some time combing over all alloca uses in the compiler, I
am very much inclined to agree :).  Pedro's comments up-thread ring true
as well.

Most of the alloca uses in the compiler are actually ad-hoc allocation for
a string that will be quickly discarded or passed to get_identifier().
Very few, if any, are actually on a critical path.  Most uses are things
like building diagnostics.

However, there are a few potentially problematic ones, like alloca()ing
chunks of memory while processing GIMPLE_ASM (chunk * noutputs).  This may
require something more efficient than malloc().  Though I really wonder
how prevalent inline asms are -- would it affect performance in any
measurable way if we replaced these with malloc or GCC's auto_vec<>
(which also uses malloc (or GGC), BTW)?

After looking at all this code, I realize no one size will fit all if we
want something clean.  So, I take my approach back.  I think we're best
served by a couple of different approaches, depending on usage.

My question is, how wedded are we to an alloca-based approach?  Can we
replace the non-critical uses by std::string (diagnostics,
BLAH_check_failed(), building a string to manipulate the output of
env("SOME_VAR"), etc.)?

Then for things like the GCC driver wanting to create a vector of passed
commands, we could use auto_vec<> (*gasp*, using a vector to represent a
vector ;-)).  Note that auto_vec<> uses malloc (or GC), but creating a
malloc'd vector at startup is hardly on the fast path.

For the remaining things we could use either alloca with a malloc
fallback for trivial things like set_user_assembler_name() that only have
one exit point, or take it on a case-by-case basis.

But it seems to me that we can use an STL container for the non-critical
things (which are the majority of them), and something else (perhaps an
RAII thinggie TBD later).

Is this reasonable?  I'd like to know before I spend any time converting
anything to std::string and auto_vec<>.

Thanks for everyone's input.
Aldy

p.s. Yeah yeah, both std::string and auto_vec<> malloc stuff and they're
bloated, but do we care on non-critical things?  We do get cleaner code
as a result...
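[For the non-critical string cases discussed above, a std::string version of the quoted gcc.c COMPILER_PATH idiom could look like the following. This is a sketch, not from the actual patch: the helper name is invented, and the startp/endp path-walking logic from the real code is elided.]

```cpp
#include <string>

// Hypothetical sketch: replace `char *nstore = (char *) alloca
// (strlen (temp) + 3)' with a std::string that owns its storage
// and is freed automatically on every exit path.
static std::string
make_scratch_copy (const char *temp)
{
  std::string nstore (temp);
  nstore.reserve (nstore.size () + 3);  // the headroom the alloca added
  return nstore;                        // no goto/free bookkeeping needed
}
```

The point is not the helper itself but that every early return now releases the buffer without the goto-to-a-free idiom.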
> I was surprised by the always_inline trick.  I suppose it makes
> sense but I wouldn't have expected to be able to rely on it.  Out
> of curiosity I tested it with other compilers.  It turns out that
> it only works with some like GCC and IBM XLC++, but not others
> like Clang or Sun CC.  In recursive calls, they don't seem to
> hold on to the memory allocated via alloca by the class ctor in
> the caller.

Well, it was surprising to me as well, hence the comment.  I expected it
to just work, and when it didn't I had to hunt in the inliner code to
find out why it was selectively inlining:

	case GIMPLE_CALL:
	  /* Refuse to inline alloca call unless user explicitly forced so as
	     this may change program's memory overhead drastically when the
	     function using alloca is called in loop.  In GCC present in
	     SPEC2000 inlining into schedule_block cause it to require 2GB of
	     RAM instead of 256MB.  Don't do so for alloca calls emitted for
	     VLA objects as those can't cause unbounded growth (they're always
	     wrapped inside stack_save/stack_restore regions.  */

As Richi pointed out, if the constructor doesn't get inlined (as you're
seeing in Clang and Sun CC), we could potentially clobber freed memory.
So much for that idea...

> FWIW, rather than creating a new class around alloca and putting
> effort into changing code to use it I think that investing time
> into replacing the function's uses with an existing C++ container
> class (either GCC's own or one from the STL) might be a better
> way to go.

Yes, I suggested a combination of auto_vec<> (GCC's) and std::string
down-thread.

Thanks for checking out the result from other compilers.  Much
appreciated.

Aldy
On 08/05/2016 03:37 PM, Aldy Hernandez wrote:
> My question is, how wedded are we to an alloca based approach?  Can we
> replace the non-critical uses by std::string (diagnostics,
> BLAH_check_failed(), building a string to manipulate the output of
> env("SOME_VAR"), etc?).

My suggestion too.

(I hope you'll pardon my sticking my nose into this.  I'm hoping that
down the road gcc and gdb/binutils end up finding a common ground for
basic C++ utilities, and grow closer together.  And you did say you're
fixing gdb's alloca problems as well. :-))

> Then for things like the GCC driver wanting to create a vector of passed
> commands, we could use auto_vec<> (*gasp*, using a vector to represent a
> vector ;-)).  Note that auto_vec<> uses malloc (or GC) but creating a
> malloc'd vector at startup is hardly on the fast path.

Note that that's why I specifically suggested auto_vec<char,
MAX_ALLOCA_SIZE> for critical-path things.  The second template parameter
of auto_vec specifies the size of the internal storage:

  /* auto_vec is a subclass of vec that automatically manages creating and
     releasing the internal vector. If N is non zero then it has N elements
     of internal storage.  The default is no internal storage, and you
     probably only want to ask for internal storage for vectors on the
     stack because if the size of the vector is larger than the internal
     storage that space is wasted.  */
  template<typename T, size_t N = 0>
  class auto_vec : public vec<T, va_heap>
  {
  ...

So for fast-path cases where you know off hand the usual scenario is
"less than N", you could use an auto_vec<sometype, N> variable on the
stack.  This does not malloc unless you end up needing more than N.  It
actually ends up being quite similar to your protected_alloca.

E.g., yours:

  protected_alloca chunk;
  str = (char *) chunk.pointer ();

vs auto_vec:

  typedef auto_vec<char, MAX_ALLOCA_SIZE> protected_alloca;
  protected_alloca chunk;
  str = chunk.address ();

(Of course a similar approach can be taken with other data structures,
stack -> set/map/hash, etc., but auto_vec is there already, and vectors
should be the default container.)

> For the remaining things we could use either alloca with a malloc
> fallback for trivial things like set_user_assembler_name() that only
> have one exit point, or take it on a case by case basis.

See above.

Thanks,
Pedro Alves
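[The internal-storage idea behind auto_vec<char, N> can be sketched in standalone C++ for readers without GCC's vec.h at hand. The class below is hypothetical -- its name and reserve() API are invented for illustration -- but it shows the same pattern: N bytes of storage inside the object (typically on the stack), with a heap fallback only when a request exceeds N.]

```cpp
#include <cstddef>
#include <cstdlib>

// Standalone sketch of the auto_vec<char, N> pattern: internal storage
// for the common case, malloc only for oversized requests.  GCC's real
// auto_vec (vec.h) has a much richer vector API than this.
template<size_t N>
class stack_buffer
{
  char m_inline[N];   // internal storage, no allocation on the fast path
  char *m_heap;       // heap fallback, owned by this object
public:
  stack_buffer () : m_heap (nullptr) {}
  ~stack_buffer () { free (m_heap); }   // RAII: fallback freed on exit
  char *reserve (size_t len)
  {
    if (len <= N)
      return m_inline;                  // fast path: no malloc at all
    free (m_heap);
    m_heap = (char *) malloc (len);
    return m_heap;
  }
  stack_buffer (const stack_buffer &) = delete;
  stack_buffer &operator= (const stack_buffer &) = delete;
};
```

As with auto_vec, the fast path costs nothing beyond the stack frame, which is what makes it a plausible alloca replacement on critical paths.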
On 08/05/2016 08:37 AM, Aldy Hernandez wrote:
> After looking at all this code, I realize no size will fit all if we
> want something clean.  So, I take my approach back.  I think we're best
> served by a couple different approaches, depending on usage.

Yup.

> My question is, how wed are we to an alloca based approach?  Can we
> replace the non-critical uses by std::string (diagnostics,
> BLAH_check_failed(), building a string to manipulate the output of
> env("SOME_VAR"), etc?).

I don't think we are at all wed to alloca.

> Then for things like the GCC driver wanting to create a vector of passed
> commands, we could use auto_vec<> (*gasp*, using a vector to represent a
> vector ;-)).  Note that auto_vec<> uses malloc (or GC) but creating a
> malloc'd vector at startup is hardly on the fast path.
>
> For the remaining things we could use either alloca with a malloc
> fallback for trivial things like set_user_assembler_name() that only
> have one exit point, or take it on a case by case basis.
>
> But it seems to me that we can use an STL container for the non critical
> things (which are the majority of them), and something else (perhaps an
> RAII thinggie TBD later).
>
> Is this reasonable?  I'd like to know before I spend any time converting
> anything to std::string and auto_vec<>.

Yes, this is all reasonable to me.  I'm a big believer in moving towards
standard library implementations of things.  In this case we get to
remove the allocas *and* make the code easier to grok for those that are
familiar with the C++ standard library.

Jeff
On August 5, 2016 6:23:27 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
> On 08/05/2016 08:37 AM, Aldy Hernandez wrote:
>> [...]
>> Is this reasonable?  I'd like to know before I spend any time
>> converting anything to std::string and auto_vec<>.
>
> Yes, this is all reasonable to me.  I'm a big believer in moving towards
> standard library implementations of things.  In this case we get to
> remove the allocas *and* make the code easier to grok for those that are
> familiar with the C++ standard library.

But at the same time please consider that each malloc costs 1000x more
than an alloca.  So if you replace one alloca that is executed once
during compile, fine.  If you replace one that is done for each symbol
(the get_identifier case), it is not.

Richard.
On August 5, 2016 4:42:48 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote:
>> I was surprised by the always_inline trick.  [...]
>
> Well, it was surprising to me as well, hence the comment.  I expected it
> to just work, and when it didn't I had to hunt in the inliner code to
> find out why it was selectively inlining:
> [... inliner comment elided ...]
>
> As Richi pointed out, if the constructor doesn't get inlined (as you're
> seeing in Clang and Sun CC), we could potentially clobber freed memory.
> So much for that idea...
>
>> FWIW, rather than creating a new class around alloca and putting
>> effort into changing code to use it I think that investing time
>> into replacing the function's uses with an existing C++ container
>> class (either GCC's own or one from the STL) might be a better
>> way to go.
>
> Yes, I suggested a combination of auto_vec<> (gcc's) and std::string
> down thread.

Please don't use std::string.  For string building you can use obstacks.

Richard.

> Thanks for checking out the result from other compilers.  Much
> appreciated.
>
> Aldy
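[For readers unfamiliar with obstacks: string building with the glibc/libiberty obstack.h API looks roughly like this. A sketch only -- the helper is invented, not taken from any actual GCC conversion -- but the obstack calls themselves (obstack_init, obstack_grow, obstack_1grow, obstack_finish, obstack_free) are the real API.]

```cpp
#include <obstack.h>
#include <cstdlib>
#include <cstring>

// The obstack API requires these two macros to be defined by the user.
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

// Hypothetical helper: grow a string piecewise on the obstack, then
// finalize it.  The returned pointer lives until the obstack is freed.
static char *
build_name (struct obstack *ob, const char *file, const char *suffix)
{
  obstack_grow (ob, file, strlen (file));      // append without copying out
  obstack_grow (ob, suffix, strlen (suffix));
  obstack_1grow (ob, '\0');                    // NUL-terminate
  return (char *) obstack_finish (ob);         // close the growing object
}
```

A single `obstack_free (&ob, NULL)` later releases every string built on the obstack at once, which is what makes it far cheaper than per-string malloc/free.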
On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote: > > Please don't use std::string. For string building you can use > obstacks. > Just out of curiosity ... why? I remember there was some discussion about it, what was the conclusion? Is that now a general rule or does it depend on the context where strings are used? Cheers, Oleg
On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo <oleg.endo@t-online.de> wrote: >On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote: > >> >> Please don't use std::string. For string building you can use >> obstacks. >> > >Just out of curiosity ... why? I remember there was some discussion >about it, what was the conclusion? Is that now a general rule or does >it depend on the context where strings are used? Because you make a messy mix of string handling variants. Std::string is not powerful enough to capture all uses, it is vastly more expensive to embed into structs and it pulls in too much headers. (Oh, and I hate I/o streams even more) Richard. >Cheers, >Oleg
On 08/05/2016 04:07 PM, Richard Biener wrote: > On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo <oleg.endo@t-online.de> wrote: >> On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote: >> >>> >>> Please don't use std::string. For string building you can use >>> obstacks. >>> >> >> Just out of curiosity ... why? I remember there was some discussion >> about it, what was the conclusion? Is that now a general rule or does >> it depend on the context where strings are used? > > Because you make a messy mix of string handling variants. Std::string is not powerful enough to capture all uses, it is vastly more expensive to embed into structs and it pulls in too much headers. Is this negotiable? I would prefer to use the standard library implementation of things when possible. At the very least, it's easier for others familiar with the C++ STL. I mean, we are using C++ after all :). I understand a reluctance in cases where it would be very inefficient, or where std::string is not up to speed. In such cases it makes sense to use obstacks or equivalents, but in straightforward things on a non critical path? I would gladly do either, but I would strongly prefer std::string when it does not overly pessimize code. > (Oh, and I hate I/o streams even more) Yeah, but that sounds like a personal preference? ;-). Let me know. I'll do either if it's an agreed upon mandate from the global deities :). Aldy
On 08/05/2016 04:07 PM, Richard Biener wrote: > On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo <oleg.endo@t-online.de> wrote: >> On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote: >> >>> >>> Please don't use std::string. For string building you can use >>> obstacks. >>> >> >> Just out of curiosity ... why? I remember there was some discussion >> about it, what was the conclusion? Is that now a general rule or does >> it depend on the context where strings are used? > > Because you make a messy mix of string handling variants. Std::string is not powerful enough to capture all uses, it is vastly more expensive to embed into structs and it pulls in too much headers. > (Oh, and I hate I/o streams even more) Oh, and there is prior use in ipa-chkp.c, although I suppose it could've crept in: std::string s; /* called_as_built_in checks DECL_NAME to identify calls to builtins. We want instrumented calls to builtins to be recognized by called_as_built_in. Therefore use original DECL_NAME for cloning with no prefixes. */ s = IDENTIFIER_POINTER (DECL_NAME (fndecl)); s += ".chkp"; DECL_NAME (new_decl) = get_identifier (s.c_str ()); You can't tell me obstacks are easier on the eyes for this ;-). Aldy
On August 6, 2016 12:15:26 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote: >On 08/05/2016 04:07 PM, Richard Biener wrote: >> On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo ><oleg.endo@t-online.de> wrote: >>> On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote: >>> >>>> >>>> Please don't use std::string. For string building you can use >>>> obstacks. >>>> >>> >>> Just out of curiosity ... why? I remember there was some discussion >>> about it, what was the conclusion? Is that now a general rule or >does >>> it depend on the context where strings are used? >> >> Because you make a messy mix of string handling variants. >Std::string is not powerful enough to capture all uses, it is vastly >more expensive to embed into structs and it pulls in too much headers. >> (Oh, and I hate I/o streams even more) > >Oh, and there is prior use in ipa-chkp.c, although I suppose it >could've >crept in: Definitely. > > std::string s; > > /* called_as_built_in checks DECL_NAME to identify calls to > builtins. We want instrumented calls to builtins to be > recognized by called_as_built_in. Therefore use original > DECL_NAME for cloning with no prefixes. */ > s = IDENTIFIER_POINTER (DECL_NAME (fndecl)); > s += ".chkp"; > DECL_NAME (new_decl) = get_identifier (s.c_str ()); > >You can't tell me obstacks are easier on the eyes for this ;-). Even strcat is shorter and cheaper. Richard. >Aldy
On 08/06/2016 09:08 AM, Richard Biener wrote: > On August 6, 2016 12:15:26 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote: >> On 08/05/2016 04:07 PM, Richard Biener wrote: >>> On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo >> <oleg.endo@t-online.de> wrote: >>>> On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote: >>>> >>>>> >>>>> Please don't use std::string. For string building you can use >>>>> obstacks. >>>>> >>>> >>>> Just out of curiosity ... why? I remember there was some discussion >>>> about it, what was the conclusion? Is that now a general rule or >> does >>>> it depend on the context where strings are used? >>> >>> Because you make a messy mix of string handling variants. >> Std::string is not powerful enough to capture all uses, it is vastly >> more expensive to embed into structs and it pulls in too much headers. >>> (Oh, and I hate I/o streams even more) >> >> Oh, and there is prior use in ipa-chkp.c, although I suppose it >> could've >> crept in: > > Definitely. > >> >> std::string s; >> >> /* called_as_built_in checks DECL_NAME to identify calls to >> builtins. We want instrumented calls to builtins to be >> recognized by called_as_built_in. Therefore use original >> DECL_NAME for cloning with no prefixes. */ >> s = IDENTIFIER_POINTER (DECL_NAME (fndecl)); >> s += ".chkp"; >> DECL_NAME (new_decl) = get_identifier (s.c_str ()); >> >> You can't tell me obstacks are easier on the eyes for this ;-). > > Even strcat is shorter and cheaper. ISTM that unless the code is performance critical we should be writing code that is easy to understand and hard to get wrong. Thus when we have something that is non-critical and can be handled by std:: routines we should be using them. If performance is important in a particular piece of code an obstack or explicit malloc/free seems better. Can we agree on those as guiding principles and thus boil the argument down to whether or not a particular piece of code is performance critical? jeff
On Mon, Aug 08, 2016 at 11:00:28AM -0600, Jeff Law wrote: > On 08/06/2016 09:08 AM, Richard Biener wrote: > > On August 6, 2016 12:15:26 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote: > > > On 08/05/2016 04:07 PM, Richard Biener wrote: > > > > On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo > > > <oleg.endo@t-online.de> wrote: > > > > > On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote: > > > > > > > > > > > > > > > > > Please don't use std::string. For string building you can use > > > > > > obstacks. > > > > > > > > > > > > > > > > Just out of curiosity ... why? I remember there was some discussion > > > > > about it, what was the conclusion? Is that now a general rule or > > > does > > > > > it depend on the context where strings are used? > > > > > > > > Because you make a messy mix of string handling variants. > > > Std::string is not powerful enough to capture all uses, it is vastly > > > more expensive to embed into structs and it pulls in too much headers. > > > > (Oh, and I hate I/o streams even more) > > > > > > Oh, and there is prior use in ipa-chkp.c, although I suppose it > > > could've > > > crept in: > > > > Definitely. > > > > > > > > std::string s; > > > > > > /* called_as_built_in checks DECL_NAME to identify calls to > > > builtins. We want instrumented calls to builtins to be > > > recognized by called_as_built_in. Therefore use original > > > DECL_NAME for cloning with no prefixes. */ > > > s = IDENTIFIER_POINTER (DECL_NAME (fndecl)); > > > s += ".chkp"; > > > DECL_NAME (new_decl) = get_identifier (s.c_str ()); > > > > > > You can't tell me obstacks are easier on the eyes for this ;-). > > > > Even strcat is shorter and cheaper. > ISTM that unless the code is performance critical we should be writing code > that is easy to understand and hard to get wrong. it seems hard to disagree with that ;) > Thus when we have something that is non-critical and can be handled by std:: > routines we should be using them. 
> > If performance is important in a particular piece of code an obstack or > explicit malloc/free seems better. I'm not totally convinced we have to trade off performance against a good interface. It seems to me it wouldn't be that complicated to build a string class on top of auto_vec that has a similar interface to std::string. I'd really rather not have to write a string class of our own, but I can see some significant advantages to it. First sizeof std::string is 32 on x86_64, a char *, a size_t for the length, and a 16 byte union of a size_t for allocated size and a 16 byte buffer for short strings. I suppose some of this is required by the C++ standard, but I doubt we really need to care about strings longer than 2^32 in gcc. If we put the length and allocated size in the buffer, and have a separate class for stack allocated buffers I think we can have a string that is just sizeof void *. Second it would be useful performance wise to have a std::string_view type class, but that is c++14 or 17? only so we'd need to import it into gcc or something. Trev
On August 8, 2016 7:39:39 PM GMT+02:00, Trevor Saunders <tbsaunde@tbsaunde.org> wrote: >On Mon, Aug 08, 2016 at 11:00:28AM -0600, Jeff Law wrote: >> On 08/06/2016 09:08 AM, Richard Biener wrote: >> > On August 6, 2016 12:15:26 PM GMT+02:00, Aldy Hernandez ><aldyh@redhat.com> wrote: >> > > On 08/05/2016 04:07 PM, Richard Biener wrote: >> > > > On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo >> > > <oleg.endo@t-online.de> wrote: >> > > > > On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote: >> > > > > >> > > > > > >> > > > > > Please don't use std::string. For string building you can >use >> > > > > > obstacks. >> > > > > > >> > > > > >> > > > > Just out of curiosity ... why? I remember there was some >discussion >> > > > > about it, what was the conclusion? Is that now a general >rule or >> > > does >> > > > > it depend on the context where strings are used? >> > > > >> > > > Because you make a messy mix of string handling variants. >> > > Std::string is not powerful enough to capture all uses, it is >vastly >> > > more expensive to embed into structs and it pulls in too much >headers. >> > > > (Oh, and I hate I/o streams even more) >> > > >> > > Oh, and there is prior use in ipa-chkp.c, although I suppose it >> > > could've >> > > crept in: >> > >> > Definitely. >> > >> > > >> > > std::string s; >> > > >> > > /* called_as_built_in checks DECL_NAME to identify calls to >> > > builtins. We want instrumented calls to builtins to be >> > > recognized by called_as_built_in. Therefore use original >> > > DECL_NAME for cloning with no prefixes. */ >> > > s = IDENTIFIER_POINTER (DECL_NAME (fndecl)); >> > > s += ".chkp"; >> > > DECL_NAME (new_decl) = get_identifier (s.c_str ()); >> > > >> > > You can't tell me obstacks are easier on the eyes for this ;-). >> > >> > Even strcat is shorter and cheaper. >> ISTM that unless the code is performance critical we should be >writing code >> that is easy to understand and hard to get wrong. 
> >it seems hard to disagree with that ;) s.c_str () is _not_ an easy to grasp thing. Unless interfaces in GCC generally use C strings I object to using C++ strings for anything. I don't see a single up-side and only downsides. Richard. >> Thus when we have something that is non-critical and can be handled >by std:: >> routines we should be using them. >> >> If performance is important in a particular piece of code an obstack >or >> explicit malloc/free seems better. > >I'm not totally convinced we have to trade off performance against a >good interface. It seems to me it wouldn't be that complicated to >build >a string class on top of auto_vec that has a similar interface to >std::string. > >I'd really rather not have to write a string class of our own, but I >can >see some significant advantages to it. > >First sizeof std::string is 32 on x86_64, a char *, a size_t for the >length, and a 16 byte union of a size_t for allocated size and a 16 >byte >buffer for short strings. I suppose some of this is required by the >C++ >standard, but I doubt we really need to care about strings longer than >2^32 in gcc. If we put the length and allocated size in the buffer, >and >have a separate class for stack allocated buffers I think we can have a >string that is just sizeof void *. > >Second it would be useful performance wise to have a std::string_view >type class, but that is c++14 or 17? only so we'd need to import it >into >gcc or something. > >Trev
On Mon, 2016-08-08 at 13:39 -0400, Trevor Saunders wrote: > First sizeof std::string is 32 on x86_64, a char *, a size_t for the > length, and a 16 byte union of a size_t for allocated size and a 16 > byte buffer for short strings. I suppose some of this is required by > the C++ standard, I recommend checking what others have figured regarding that matter https://github.com/elliotgoodrich/SSO-23 > but I doubt we really need to care about strings longer than 2^32 in > gcc. If we put the length and allocated size in the buffer, > and have a separate class for stack allocated buffers I think we can > have a string that is just sizeof void *. This idea has one flaw in that it does not allow having objects by value. It essentially *requires* one to *always* allocate them on the heap via new/delete. Being able to store objects by value is useful in many situations. So if you do the above, the next logical step that will follow is a smart-pointer-like wrapper that allows value semantics. Because eventually somebody will want that operator == or operator < e.g. for associative containers. > > Second it would be useful performance wise to have a std::string_view > type class, but that is c++14 or 17? only so we'd need to import it > into gcc or something. http://en.cppreference.com/w/cpp/experimental/basic_string_view http://en.cppreference.com/w/cpp/string/basic_string_view It's "funny" that GCC contains a C++ stdlib implementation and so little is actually used by GCC itself. Cheers, Oleg
On 08/05/2016 01:55 PM, Richard Biener wrote:
Hi Richard.
> Please don't use std::string. For string building you can use obstacks.
Alright, let's talk details then, so I can write things up in a way
you approve of.
Take for instance simple uses like all the tree_*check_failed routines,
which I thought were great candidates for std::string -- they're going
to be output to the screen or disk, which is clearly many times more
expensive than the malloc or other overhead of std::string:
  length += strlen ("expected ");
  buffer = tmp = (char *) alloca (length);
  length = 0;
  while ((code = (enum tree_code) va_arg (args, int)))
    {
      const char *prefix = length ? " or " : "expected ";

      strcpy (tmp + length, prefix);
      length += strlen (prefix);
      strcpy (tmp + length, get_tree_code_name (code));
      length += strlen (get_tree_code_name (code));
    }
Do you suggest using obstacks here, or did you have something else in mind?
How about if it's something like the above, but there are multiple exit
points from the function? I mean, we would still need to call
obstack_free() at every exit point, which is what I wanted to avoid for
something as simple as building a throw-away string.
Aldy
On 08/09/2016 03:17 PM, Aldy Hernandez wrote: > On 08/05/2016 01:55 PM, Richard Biener wrote: > > Hi Richard. > >> Please don't use std::string. For string building you can use obstacks. > > Alright let's talk details then so I can write things up in a way you > approve of. > > Take for instance simple uses like all the tree_*check_failed routines, > which I thought were great candidates for std::string-- they're going to > be outputted to the screen or disk which is clearly many times more > expensive than the malloc or overhead of std::string: > > length += strlen ("expected "); > buffer = tmp = (char *) alloca (length); > length = 0; > while ((code = (enum tree_code) va_arg (args, int))) > { > const char *prefix = length ? " or " : "expected "; > > strcpy (tmp + length, prefix); > length += strlen (prefix); > strcpy (tmp + length, get_tree_code_name (code)); > length += strlen (get_tree_code_name (code)); > } > > Do you suggest using obstacks here, or did you have something else in mind? > > How about if it's something like the above but there are multiple exit > points from the function. I mean, we still need to call obstack_free() > on all exit points, which is what I wanted to avoid for something as > simple as building a throw-away string. I think that given our choice to switch to C++, we should use the language in idiomatic ways, and the standard library is part of that. Otherwise the language switch makes even less sense than it did in the first place. Bernd
On Tue, Aug 09, 2016 at 08:33:49PM +0900, Oleg Endo wrote: > On Mon, 2016-08-08 at 13:39 -0400, Trevor Saunders wrote: > > > > First sizeof std::string is 32 on x86_64, a char *, a size_t for the > > length, and a 16 byte union of a size_t for allocated size and a 16 > > byte buffer for short strings. I suppose some of this is required by > > the C++ standard, > > I recommend checking what others have figured regarding that matter > https://github.com/elliotgoodrich/SSO-23 I think one of the big mistakes in the C++ stl strings is that they try to shove stack allocated strings where you want an internal buffer for SSO and a class to put in structs into the same type. imho it would be better to have std::string for the second use case, and a separate std::stack_string or something for the first case. > > but I doubt we really need to care about strings longer than 2^32 in > > gcc. If we put the length and allocated size in the buffer, > > and have a separate class for stack allocated buffers I think we can > > have a string that is just sizeof void *. > > This idea has one flaw in that it does not allow having objects by > value. It essentially *requires* one to *always* allocate them on the > heap via new/delete. Being able to store objects by value is useful in > many situations. So if you do the above, the next logical step that > will follow is a smart-pointer-like wrapper that allows value > semantics. Because eventually somebody will want that operator == or > operator < e.g. for associative containers. If what you want is the ability to put the buffer on the stack instead of the heap then I think a stack_string class that interoperates with your string class is the thing you want. I don't really see anything wrong with a string class being a really fancy smart pointer that has a bunch of useful string stuff on it. As for operator == I'd be fairly ok with that, other than it hiding a O(N) operation in ==. 
> > > > Second it would be useful performance wise to have a std::string_view > > type class, but that is c++14 or 17? only so we'd need to import it > > into gcc or something. > > http://en.cppreference.com/w/cpp/experimental/basic_string_view > http://en.cppreference.com/w/cpp/string/basic_string_view > > It's "funny" that GCC contains a C++ stdlib implementation and so little is actually used by GCC itself. Regrettably necessary, sure, but I'm not sure it's funny. The first big problem with using the stl is that the subset available in C++98 isn't that great; you could maybe hack up libstdc++ so that you can use newer types just without the bits that use newer language features, but that would take some work. The other big problem is that the stl is often too general, and tries to be too simple. std::string is actually a good example of both of those problems. There's really no reason it should use size_t instead of uint32_t for the string length / capacity. It would also be a lot better if it had separate types for strings where you want an internal buffer or don't. Trev
On Tue, Aug 9, 2016 at 3:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: > On 08/05/2016 01:55 PM, Richard Biener wrote: > > Hi Richard. > >> Please don't use std::string. For string building you can use obstacks. > > > Alright let's talk details then so I can write things up in a way you > approve of. > > Take for instance simple uses like all the tree_*check_failed routines, > which I thought were great candidates for std::string-- they're going to be > outputted to the screen or disk which is clearly many times more expensive > than the malloc or overhead of std::string: > > length += strlen ("expected "); > buffer = tmp = (char *) alloca (length); > length = 0; > while ((code = (enum tree_code) va_arg (args, int))) > { > const char *prefix = length ? " or " : "expected "; > > strcpy (tmp + length, prefix); > length += strlen (prefix); > strcpy (tmp + length, get_tree_code_name (code)); > length += strlen (get_tree_code_name (code)); > } > > Do you suggest using obstacks here, or did you have something else in mind? Why would you want to get rid of the alloca here? > How about if it's something like the above but there are multiple exit > points from the function. I mean, we still need to call obstack_free() on > all exit points, which is what I wanted to avoid for something as simple as > building a throw-away string. But you'll end up in internal_error ("tree check: %s, have %s in %s, at %s:%d", buffer, get_tree_code_name (TREE_CODE (node)), function, trim_filename (file), line); where you have an interface using C strings again. It's not that the standard library is bad per-se, it's just using a tool that doesn't fit its uses. This makes the code a messy mix of two concepts which is what I object to. Yes, the above example may have premature optimization applied. Is that a reason to re-write it using std::string? No. Is it a reason to rewrite it using obstracks? No. Just leave it alone. Richard. > > Aldy
On Tue, 2016-08-09 at 13:41 -0400, Trevor Saunders wrote: > If what you want is the ability to put the buffer on the stack > instead of the heap then I think a stack_string class that > interoperates with your string class is the thing you want. I'd avoid a separate stack_string class. Instead use a linear allocator with storage allocated on the stack, e.g.: typedef std::basic_string<char, std::char_traits<char>, linear_allocator<char> my_tmp_string; void bleh (int foo) { linear_allocator<char> buffer (alloca (1024), 1024); tmp_string aaa (buffer); aaa += "test"; aaa += "moon"; ... // what happens when "buffer" runs out of space? // throw? switch to heap alloc? ... } > > I don't really see anything wrong with a string class being a really > fancy smart pointer that has a bunch of useful string stuff on it. That's what std::string basically is? > As for operator == I'd be fairly ok with that, other than it hiding > a O(N) operation in ==. Yes, it makes it more difficult to carry out algorithm complexity analysis using grep... > Regretably necessary sure, but I'm not sure its funny. Hence the quotes. > The first big problem with using the stl is that the subset available > in C++98 isn't that great, you could maybe hack up libstdc++ so that > you can use newer types just without the bits that use newer language > features, but that would take some work. Or just wait until people have agreed to switch to C++11 or C++14. I don't think in practice anybody uses an C++11-incapable GCC to build a newer GCC these days. > The other big problem is that the stl is often too general, and tries > to be too simple. std::string is actually a good example of both of > those problems. There's really no reason it should use size_t > instead of uint32_t for the string length / capacity. Yes, some of the things in the STL are collections of compromises for general-purpose usage. If you're doing something extra fancy, most likely you can outperform the generic STL implementation. 
But very often it's actually not needed. > It would also be a lot better if it had separate types for strings > where you want an internal buffer or don't. Using custom allocators is one way. For example ... template <unsigned int BufferSize> struct linear_allocator_with_buffer; template <unsigned int BufferSize> using stack_string = std::basic_string<char, std::char_traits<char>, linear_allocator_with_buffer<BufferSize>>; But then ... stack_string<32> a = "a"; stack_string<64> b = "b"; b += a; ... will not work because they are different types. One drawback of using custom allocators for that kind of thing is being unable to pass the above stack_string to a function that takes a "const std::string&" because they differ in template parameters. But that's where std::string_view comes in. Cheers, Oleg
On 08/10/2016 04:04 AM, Richard Biener wrote: > On Tue, Aug 9, 2016 at 3:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote: >> On 08/05/2016 01:55 PM, Richard Biener wrote: >> >> Hi Richard. >> >>> Please don't use std::string. For string building you can use obstacks. >> >> >> Alright let's talk details then so I can write things up in a way you >> approve of. >> >> Take for instance simple uses like all the tree_*check_failed routines, >> which I thought were great candidates for std::string-- they're going to be >> outputted to the screen or disk which is clearly many times more expensive >> than the malloc or overhead of std::string: >> >> length += strlen ("expected "); >> buffer = tmp = (char *) alloca (length); >> length = 0; >> while ((code = (enum tree_code) va_arg (args, int))) >> { >> const char *prefix = length ? " or " : "expected "; >> >> strcpy (tmp + length, prefix); >> length += strlen (prefix); >> strcpy (tmp + length, get_tree_code_name (code)); >> length += strlen (get_tree_code_name (code)); >> } >> >> Do you suggest using obstacks here, or did you have something else in mind? > > Why would you want to get rid of the alloca here? Do you know the range for LENGTH in the code above? Is it based on something the user could potentially control (like a variable name, typdef name, etc). If you don't know the length or it's possibly under the control of the user, then this can blow out the stack, which makes the code vulnerable to a stack shifting style attack by which further writes into the stack are actually writing into other parts of the stack, the heap, plt or some other location. Essentially this gives an attacker control over one or more stores to memory, which is often enough of a vulnerability to mount an attack. jeff
On August 10, 2016 8:00:20 PM GMT+02:00, Jeff Law <law@redhat.com> wrote: >On 08/10/2016 04:04 AM, Richard Biener wrote: >> On Tue, Aug 9, 2016 at 3:17 PM, Aldy Hernandez <aldyh@redhat.com> >wrote: >>> On 08/05/2016 01:55 PM, Richard Biener wrote: >>> >>> Hi Richard. >>> >>>> Please don't use std::string. For string building you can use >obstacks. >>> >>> >>> Alright let's talk details then so I can write things up in a way >you >>> approve of. >>> >>> Take for instance simple uses like all the tree_*check_failed >routines, >>> which I thought were great candidates for std::string-- they're >going to be >>> outputted to the screen or disk which is clearly many times more >expensive >>> than the malloc or overhead of std::string: >>> >>> length += strlen ("expected "); >>> buffer = tmp = (char *) alloca (length); >>> length = 0; >>> while ((code = (enum tree_code) va_arg (args, int))) >>> { >>> const char *prefix = length ? " or " : "expected "; >>> >>> strcpy (tmp + length, prefix); >>> length += strlen (prefix); >>> strcpy (tmp + length, get_tree_code_name (code)); >>> length += strlen (get_tree_code_name (code)); >>> } >>> >>> Do you suggest using obstacks here, or did you have something else >in mind? >> >> Why would you want to get rid of the alloca here? >Do you know the range for LENGTH in the code above? Yes, it's a set of tree code names. Is it based on >something the user could potentially control (like a variable name, >typdef name, etc). If you don't know the length or it's possibly under > >the control of the user, then this can blow out the stack, which makes >the code vulnerable to a stack shifting style attack by which further >writes into the stack are actually writing into other parts of the >stack, the heap, plt or some other location. Essentially this gives an > >attacker control over one or more stores to memory, which is often >enough of a vulnerability to mount an attack. Yes, I understand that. The above is not such a case. 
If an attacker can trick me into compiling (and possibly executing) his code then things are lost anyway. No need for a fancy buffer overflow. IMHO the alloca case warrants for a protection on the level of the stack adjustment itself. Richard. > >jeff
On Thu, Aug 11, 2016 at 02:03:29AM +0900, Oleg Endo wrote: > On Tue, 2016-08-09 at 13:41 -0400, Trevor Saunders wrote: > > > If what you want is the ability to put the buffer on the stack > > instead of the heap then I think a stack_string class that > > interoperates with your string class is the thing you want. > > I'd avoid a separate stack_string class. Instead use a linear > allocator with storage allocated on the stack, e.g.: > > typedef std::basic_string<char, std::char_traits<char>, > linear_allocator<char> > my_tmp_string; > > void bleh (int foo) > { > linear_allocator<char> buffer (alloca (1024), 1024); > > tmp_string aaa (buffer); > aaa += "test"; > aaa += "moon"; > ... > > // what happens when "buffer" runs out of space? > // throw? switch to heap alloc? ... > } > > > > > > I don't really see anything wrong with a string class being a really > > fancy smart pointer that has a bunch of useful string stuff on it. > > > That's what std::string basically is? sure, but I'm saying it does that job baddly. > > As for operator == I'd be fairly ok with that, other than it hiding > > a O(N) operation in ==. > > Yes, it makes it more difficult to carry out algorithm complexity > analysis using grep... > > > > Regretably necessary sure, but I'm not sure its funny. > > Hence the quotes. > > > The first big problem with using the stl is that the subset available > > in C++98 isn't that great, you could maybe hack up libstdc++ so that > > you can use newer types just without the bits that use newer language > > features, but that would take some work. > > Or just wait until people have agreed to switch to C++11 or C++14. I > don't think in practice anybody uses an C++11-incapable GCC to build a > newer GCC these days. I'm pretty sure they do, I've seen patches for 4.1 and 4.3 go in this cycle, and rval refs didn't work until 4.4 and that was to an older version of the spec. 
Anyways people apparently use compilers other than gcc to build gcc, and I'm prepared to believe some of the proprietary compilers on ancient unix variants don't support C++11. > > The other big problem is that the stl is often too general, and tries > > to be too simple. std::string is actually a good example of both of > > those problems. There's really no reason it should use size_t > > instead of uint32_t for the string length / capacity. > > Yes, some of the things in the STL are collections of compromises for > general-purpose usage. If you're doing something extra fancy, most > likely you can outperform the generic STL implementation. But very > often it's actually not needed. Well, I'd say the compromises made in std::string make it pretty terrible for general purpose use except where perf and memory doesn't matter at all. std::vector isn't very great size wise either, imho the size / capacity fields would be much better put in the buffer than in the struct itself, to save 2 * sizeof (void *) in sizeof std::vector. I haven't looked at other stl datastructures that much, but those two examples don't fill me with confidence :/ > > It would also be a lot better if it had separate types for strings > where you want an internal buffer or don't. > > Using custom allocators is one way. For example ... > > template <unsigned int BufferSize> struct linear_allocator_with_buffer; > > template <unsigned int BufferSize> > using stack_string = > std::basic_string<char, std::char_traits<char>, > linear_allocator_with_buffer<BufferSize>>; that *is* a different type than std::string with the default template arguments. > But then ... > > stack_string<32> a = "a"; > stack_string<64> b = "b"; > > b += a; > > ... will not work because they are different types. > > One drawback of using custom allocators for that kind of thing is being > unable to pass the above stack_string to a function that takes a "const > std::string&" because they differ in template parameters.
But that's > where std::string_view comes in. or just having your stack_string type convert to the normal string type so that you can pass mutable references. Trev > > > Cheers, > Oleg
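Oleg's linear-allocator sketch can be made concrete. The following is a minimal self-contained version of the idea, not the actual proposal: the names are invented, and silently falling back to the heap when the buffer is exhausted is just one assumed answer to the "what happens when 'buffer' runs out of space?" question above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

// Hypothetical linear allocator that carves allocations out of a
// caller-provided buffer (e.g. one obtained with alloca) and falls
// back to malloc once the buffer is exhausted.
template <typename T>
struct linear_allocator
{
  typedef T value_type;

  char *buf;
  std::size_t cap;
  std::size_t used;

  linear_allocator (void *b, std::size_t n)
    : buf (static_cast<char *> (b)), cap (n), used (0) {}

  template <typename U>
  linear_allocator (const linear_allocator<U> &o)
    : buf (o.buf), cap (o.cap), used (o.used) {}

  T *allocate (std::size_t n)
  {
    std::size_t bytes = n * sizeof (T);
    if (used + bytes <= cap)
      {
        T *p = reinterpret_cast<T *> (buf + used);
        used += bytes;          // bump-pointer allocation, never reused
        return p;
      }
    return static_cast<T *> (std::malloc (bytes));  // assumed fallback
  }

  void deallocate (T *p, std::size_t)
  {
    char *c = reinterpret_cast<char *> (p);
    if (c < buf || c >= buf + cap)
      std::free (p);            // only heap fallbacks are freed
    // In-buffer memory is reclaimed wholesale when the buffer dies.
  }
};

template <typename T, typename U>
bool operator== (const linear_allocator<T> &a, const linear_allocator<U> &b)
{ return a.buf == b.buf; }

template <typename T, typename U>
bool operator!= (const linear_allocator<T> &a, const linear_allocator<U> &b)
{ return !(a == b); }

typedef std::basic_string<char, std::char_traits<char>,
                          linear_allocator<char> > tmp_string;
```

As the follow-ups note, `tmp_string` is a distinct type from `std::string`, so it cannot be passed where a `const std::string&` is expected; that is the drawback being debated.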
On Wed, 2016-08-10 at 21:31 -0400, Trevor Saunders wrote:
> 
> Well, I'd say the compromises made in std::string make it pretty
> terrible for general-purpose use except where perf and memory don't
> matter at all.
> 
> std::vector isn't very great size-wise either; imho the size /
> capacity fields would be much better put in the buffer than in the
> struct itself, to save 2 * sizeof (void *) in sizeof (std::vector).

There's nothing in the standard that says those fields have to go into
the vector struct/class itself and not into the data buffer.  It just
happens to be the way it's implemented in libstdc++.  E.g. libc++
implements some things in different ways to save a few bytes here and
there.

It looks more practical to put those fields into the container itself;
otherwise you'd have to do a nullptr check every time you access the
size field etc.  Although newer compilers optimize away the redundant
nullptr checks, older compilers (which GCC wants to be good with) might
not do it.  In any case, it seems to generate slightly bigger code, at
least in one instance (see attachment).

Anyway, normally there is more data in the vectors than there are
vectors around, so whether sizeof (vector) is 8 or 24 bytes doesn't
matter.

> 
> or just having your stack_string type convert to the normal string
> type so that you can pass mutable references.

But that would imply copying, wouldn't it?

Cheers,
Oleg
On Thu, Aug 11, 2016 at 09:18:34PM +0900, Oleg Endo wrote:
> On Wed, 2016-08-10 at 21:31 -0400, Trevor Saunders wrote:
> > 
> > Well, I'd say the compromises made in std::string make it pretty
> > terrible for general-purpose use except where perf and memory don't
> > matter at all.
> > 
> > std::vector isn't very great size-wise either; imho the size /
> > capacity fields would be much better put in the buffer than in the
> > struct itself, to save 2 * sizeof (void *) in sizeof (std::vector).
> 
> There's nothing in the standard that says those fields have to go into
> the vector struct/class itself and not into the data buffer.  It just
> happens to be the way it's implemented in libstdc++.  E.g. libc++
> implements some things in different ways to save a few bytes here and
> there.
> 
> It looks more practical to put those fields into the container itself;
> otherwise you'd have to do a nullptr check every time you access the
> size field etc.  Although newer compilers optimize away the redundant
> nullptr checks, older compilers (which GCC wants to be good with) might
> not do it.  In any case, it seems to generate slightly bigger code, at
> least in one instance (see attachment).

if you are clever you can have an empty global vector that you point
all the empty vectors at and so eliminate the null check.  Well, I
guess that doesn't work quite as well if the global needs to be
accessed through a plt, sigh, but at least you only need to check
against it for reallocation, which is rather heavy-weight anyway.

> Anyway, normally there is more data in the vectors than there are
> vectors around, so whether sizeof (vector) is 8 or 24 bytes doesn't
> matter.

It matters if you have any empty vectors around.  The only numbers I
can find are https://bugzilla.mozilla.org/show_bug.cgi?id=1007846 but
I seem to remember seeing that for Firefox at least most vectors
contained 0 elements.

> > or just having your stack_string type convert to the normal string
> > type so that you can pass mutable references.
> 
> But that would imply copying, wouldn't it?

Not if you are clever, you can use the same trick gcc uses in vec /
auto_vec with strings.

Trev

> 
> Cheers,
> Oleg
> 
> #if 0
> template <typename T>
> class vector
> {
> public:
>   unsigned int size (void) const { return m_buffer != nullptr ? m_buffer->size : 0; }
>   bool empty (void) const { return size () == 0; }
> 
>   unsigned int capacity (void) const { return m_buffer != nullptr ? m_buffer->capacity : 0; }
> 
>   T* data (void) { return (T*)(m_buffer + 1); }
>   const T* data (void) const { return (const T*)(m_buffer + 1); }
> 
>   T& operator [] (unsigned int i) { return data ()[i]; }
>   const T& operator [] (unsigned int i) const { return data ()[i]; }
> 
> private:
>   struct buffer_header
>   {
>     unsigned int size;
>     unsigned int capacity;
>   };
> 
>   buffer_header* m_buffer;
> };
> 
> 
> int foo (const vector<int>& x)
> {
>   if (x.empty ())
>     return 0;
> 
>   int r = 0;
>   for (unsigned int i = 0; i < x.size (); ++i)
>     r += x[i];
> 
>   return r;
> }
> 
> /*
> -O2 -m2a
> 	mov.l	@r4,r2
> 	tst	r2,r2
> 	bt	.L5
> 	mov.l	@r2,r1
> 	tst	r1,r1
> 	bt.s	.L5
> 	shll2	r1
> 	add	#-4,r1
> 	shlr2	r1
> 	add	#8,r2
> 	mov	#0,r0
> 	add	#1,r1
> 	.align 2
> .L3:
> 	mov.l	@r2+,r3
> 	dt	r1
> 	bf.s	.L3
> 	add	r3,r0
> 	rts/n
> 	.align 1
> .L5:
> 	rts
> 	mov	#0,r0
> */
> #endif
> 
> 
> #if 1
> 
> template <typename T>
> class vector
> {
> public:
>   unsigned int size (void) const { return m_size; }
>   bool empty (void) const { return size () == 0; }
> 
>   unsigned int capacity (void) const { return m_capacity; }
> 
>   T* data (void) { return (T*)(m_buffer); }
>   const T* data (void) const { return (const T*)(m_buffer); }
> 
>   T& operator [] (unsigned int i) { return data ()[i]; }
>   const T& operator [] (unsigned int i) const { return data ()[i]; }
> 
> private:
>   unsigned int m_size;
>   unsigned int m_capacity;
>   T* m_buffer;
> };
> 
> 
> int foo (const vector<int>& x)
> {
>   if (x.empty ())
>     return 0;
> 
>   int r = 0;
>   for (unsigned int i = 0; i < x.size (); ++i)
>     r += x[i];
> 
>   return r;
> }
> 
> /*
> -O2 -m2a
> 	mov.l	@r4,r1
> 	tst	r1,r1
> 	bt.s	.L7
> 	mov	#0,r0
> 	shll2	r1
> 	mov.l	@(8,r4),r2
> 	add	#-4,r1
> 	shlr2	r1
> 	add	#1,r1
> 	.align 2
> .L3:
> 	mov.l	@r2+,r3
> 	dt	r1
> 	bf.s	.L3
> 	add	r3,r0
> .L7:
> 	rts/n
> */
> 
> #endif
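The size trade-off in the attachment can be put in numbers. The sketch below contrasts the two layouts; the field choices are illustrative assumptions (the three-pointer form mirrors how libstdc++'s std::vector is commonly laid out, not its exact definition):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Header-in-buffer layout: an empty vector is just a null pointer, so
// the object itself is one pointer wide.
struct buffer_header { std::uint32_t size; std::uint32_t capacity; };

struct small_vec
{
  buffer_header *m_buffer;

  // The null check Oleg mentions: every size () access pays for the
  // pointer-only representation of the empty vector.
  std::uint32_t size () const { return m_buffer ? m_buffer->size : 0; }
};

// Conventional fields-in-object layout, three pointers wide
// (begin / end / end-of-storage), i.e. 24 bytes on a 64-bit host.
struct fat_vec
{
  int *m_start;
  int *m_finish;
  int *m_end_of_storage;

  std::size_t size () const { return std::size_t (m_finish - m_start); }
};
```

Trevor's point is that the 2 * sizeof (void *) saved per object only matters when many (often empty) vector objects exist; Oleg's counterpoint is the extra branch and the slightly bigger code it generates.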
On 08/10/2016 12:33 PM, Richard Biener wrote:
>>> Why would you want to get rid of the alloca here?
>> 
>> Do you know the range for LENGTH in the code above?
> 
> Yes, it's a set of tree code names.
> 
>> Is it based on
>> something the user could potentially control (like a variable name,
>> typedef name, etc)?  If you don't know the length, or it's possibly
>> under the control of the user, then this can blow out the stack, which
>> makes the code vulnerable to a stack-shifting style attack by which
>> further writes into the stack are actually writing into other parts
>> of the stack, the heap, plt or some other location.  Essentially
>> this gives an attacker control over one or more stores to memory,
>> which is often enough of a vulnerability to mount an attack.
> 
> Yes, I understand that.  The above is not such a case.  If an
> attacker can trick me into compiling (and possibly executing) his
> code then things are lost anyway.  No need for a fancy buffer
> overflow.

I think you're being rather short-sighted here.  GCC is being used in
ways we can't necessarily predict -- which might include compile
servers, JITs, web services, etc.  In those kinds of environments I
think you have a significantly different threat model and GCC itself
becomes part of the attack surface.

There's existing web sites I'm pretty sure I could compromise if I had
the interest, by exploiting bugs in GCC and libiberty.

Jeff
On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
> I think you're being rather short-sighted here.  GCC is being used in
> ways we can't necessarily predict -- which might include compile
> servers, JITs, web services, etc.

For compile server/web services one needs to add the protection outside
of gcc (sandboxing, containers, SELinux, limiting CPU and/or memory,
etc.), because even with very short testcases e.g. in C/C++ one can eat
arbitrary amounts of stack even without any uses of alloca in the
compiler, simply through deep recursion in the parsers etc.  The attack
vector is so big that trying to do something just about alloca is IMHO
pointless, and we really don't want to fight 20 gcc CVEs every day (1:1
with most bugreports).
Alloca is really useful in the compiler IMO; it is significantly faster
than heap allocation, and that is what matters in many places a lot.

	Jakub
On 08/16/2016 10:44 AM, Jakub Jelinek wrote:
> On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
>> I think you're being rather short-sighted here.  GCC is being used in
>> ways we can't necessarily predict -- which might include compile
>> servers, JITs, web services, etc.
> 
> For compile server/web services one needs to add the protection outside
> of gcc (sandboxing, containers, SELinux, limiting CPU and/or memory,
> etc.), because even with very short testcases e.g. in C/C++ one can eat
> arbitrary amounts of stack even without any uses of alloca in the
> compiler, simply through deep recursion in the parsers etc.
Agreed.  However, that doesn't mean we should not be locking down
things like alloca and other attack vectors.

> The attack vector is so big that
> trying to do something just about alloca is IMHO pointless, and we
> really don't want to fight 20 gcc CVEs every day (1:1 with most
> bugreports).
> Alloca is really useful in the compiler IMO, it is significantly faster
> than heap allocation, and that is what matters in many places a lot.
You have to start somewhere, and we have the tools and willingness of
an engineer to tackle part of this problem.  Simply giving up because
it's not a total solution is absurd.

jeff
On 08/16/2016 10:47 AM, Jeff Law wrote:
> On 08/16/2016 10:44 AM, Jakub Jelinek wrote:
>> On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
>>> I think you're being rather short-sighted here.  GCC is being used
>>> in ways we can't necessarily predict -- which might include compile
>>> servers, JITs, web services, etc.
>>
>> For compile server/web services one needs to add the protection
>> outside of gcc (sandboxing, containers, SELinux, limiting CPU and/or
>> memory, etc.), because even with very short testcases e.g. in C/C++
>> one can eat arbitrary amounts of stack even without any uses of
>> alloca in the compiler, simply through deep recursion in the parsers
>> etc.
> Agreed.  However, that doesn't mean we should not be locking down
> things like alloca and other attack vectors.

I think I made this suggestion when Aldy posted his first patch
but it didn't get much traction so let me try again.  Since the
concern is alloca calls with excessively large arguments, would
transforming those (determined both at compile time and at run
time) into pairs of malloc/free calls be an acceptable compromise?

It would seem like a natural complement to the transformation
in the opposite direction, brought up back then, of turning calls
to malloc with small (compile-time) arguments into alloca.

I would expect the malloc optimization to more than outweigh
the performance cost of the alloca to malloc transformation.
Perhaps even to the point of obviating the need for any explicit
alloca calls at all.  With the optimization in place, it seems
that it should even be possible to transparently transform at
least the most basic uses of some C++ containers written in
terms of operator new and delete to use alloca instead when
their sizes were known and under the alloca to malloc threshold.

Martin
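Written out by hand, the run-time half of the suggested transformation amounts to something like the sketch below. The 4096-byte threshold is an assumption for illustration, and the free on every exit path is exactly the chore Aldy's protected_alloca class automates:

```cpp
#include <alloca.h>
#include <cassert>
#include <cstdlib>
#include <cstring>

const size_t MAX_STACK_ALLOC = 4096;  // assumed threshold

// What a plain "p = alloca (n);" would become after an
// alloca-to-malloc transformation: small requests stay on the stack,
// large ones go to the heap and must be freed before returning.
unsigned checksum (const char *src, size_t n)
{
  bool on_heap = n > MAX_STACK_ALLOC;
  void *p;
  if (on_heap)
    p = malloc (n);
  else
    p = alloca (n);             // released automatically on return

  memcpy (p, src, n);
  unsigned sum = 0;
  for (size_t i = 0; i < n; i++)
    sum += ((unsigned char *) p)[i];

  if (on_heap)                  // the per-exit-point bookkeeping
    free (p);
  return sum;
}
```

With a single return this is tolerable; with several early exits the pattern degenerates into the goto-out/free idiom from the start of the thread, which is the motivation for wrapping it in a destructor.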
On Tue, Aug 16, 2016 at 7:54 PM, Martin Sebor <msebor@gmail.com> wrote:
> On 08/16/2016 10:47 AM, Jeff Law wrote:
>> On 08/16/2016 10:44 AM, Jakub Jelinek wrote:
>>> On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
>>>> I think you're being rather short-sighted here.  GCC is being used
>>>> in ways we can't necessarily predict -- which might include compile
>>>> servers, JITs, web services, etc.
>>>
>>> For compile server/web services one needs to add the protection
>>> outside of gcc (sandboxing, containers, SELinux, limiting CPU and/or
>>> memory, etc.), because even with very short testcases e.g. in C/C++
>>> one can eat arbitrary amounts of stack even without any uses of
>>> alloca in the compiler, simply through deep recursion in the parsers
>>> etc.
>>
>> Agreed.  However, that doesn't mean we should not be locking down
>> things like alloca and other attack vectors.
>
> I think I made this suggestion when Aldy posted his first patch
> but it didn't get much traction so let me try again.  Since the
> concern is alloca calls with excessively large arguments, would
> transforming those (determined both at compile time and at run
> time) into pairs of malloc/free calls be an acceptable compromise?
>
> It would seem like a natural complement to the transformation
> in the opposite direction, brought up back then, of turning calls
> to malloc with small (compile-time) arguments into alloca.
>
> I would expect the malloc optimization to more than outweigh
> the performance cost of the alloca to malloc transformation.
> Perhaps even to the point of obviating the need for any explicit
> alloca calls at all.  With the optimization in place, it seems
> that it should even be possible to transparently transform at
> least the most basic uses of some C++ containers written in
> terms of operator new and delete to use alloca instead when
> their sizes were known and under the alloca to malloc threshold.

Please instead work on sth like -fstack-protector for alloca -- it
should be straight-forward to add a runtime test on the stack
adjustment being performed against some magic bound (yeah, needs more
than only GCC support here).

Richard.

> Martin
On 08/17/2016 02:27 AM, Richard Biener wrote:
> On Tue, Aug 16, 2016 at 7:54 PM, Martin Sebor <msebor@gmail.com> wrote:
>> On 08/16/2016 10:47 AM, Jeff Law wrote:
>>> On 08/16/2016 10:44 AM, Jakub Jelinek wrote:
>>>> On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
>>>>> I think you're being rather short-sighted here.  GCC is being used
>>>>> in ways we can't necessarily predict -- which might include
>>>>> compile servers, JITs, web services, etc.
>>>>
>>>> For compile server/web services one needs to add the protection
>>>> outside of gcc (sandboxing, containers, SELinux, limiting CPU
>>>> and/or memory, etc.), because even with very short testcases e.g.
>>>> in C/C++ one can eat arbitrary amounts of stack even without any
>>>> uses of alloca in the compiler, simply through deep recursion in
>>>> the parsers etc.
>>>
>>> Agreed.  However, that doesn't mean we should not be locking down
>>> things like alloca and other attack vectors.
>>
>> I think I made this suggestion when Aldy posted his first patch
>> but it didn't get much traction so let me try again.  Since the
>> concern is alloca calls with excessively large arguments, would
>> transforming those (determined both at compile time and at run
>> time) into pairs of malloc/free calls be an acceptable compromise?
>>
>> It would seem like a natural complement to the transformation
>> in the opposite direction, brought up back then, of turning calls
>> to malloc with small (compile-time) arguments into alloca.
>>
>> I would expect the malloc optimization to more than outweigh
>> the performance cost of the alloca to malloc transformation.
>> Perhaps even to the point of obviating the need for any explicit
>> alloca calls at all.  With the optimization in place, it seems
>> that it should even be possible to transparently transform at
>> least the most basic uses of some C++ containers written in
>> terms of operator new and delete to use alloca instead when
>> their sizes were known and under the alloca to malloc threshold.
>
> Please instead work on sth like -fstack-protector for alloca -- it
> should be straight-forward to add a runtime test on the stack
> adjustment being performed against some magic bound (yeah, needs more
> than only GCC support here).

I agree that would be a useful first step, and certainly a more
attainable goal.  The C++ patch that handled the VLA part (bug 69517)
had to be reverted just before 6.1 was released, but it's still on my
to-do list to add to it a knob to adjust the bound at runtime and
resubmit it.  It should be straightforward to extend a similar
approach to alloca and do the same thing in C (except with trapping
rather than throwing an exception).

Martin
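A user-level approximation of that runtime test can be sketched as below. The bound and all names are invented for illustration; a real -fstack-protector-style check would be emitted by the compiler against the actual remaining stack, with runtime support, rather than living in a macro.

```cpp
#include <alloca.h>
#include <cassert>
#include <cstddef>
#include <new>

// Assumed "magic bound" for a single allocation.  A real
// implementation would compare the stack adjustment against the
// stack space actually left.
const std::size_t ALLOCA_BOUND = 64 * 1024;

// C++ flavor of the check: throw on an oversized request.  A C
// version would trap (e.g. __builtin_trap) instead, as Martin notes.
// Must be a macro: wrapping alloca in a function would release the
// memory when that function returns.
#define CHECKED_ALLOCA(n) \
  ((n) <= ALLOCA_BOUND ? alloca (n) : (throw std::bad_alloc (), (void *) 0))
```

The conditional keeps the alloca in the caller's own frame, which is also why the protected_alloca class from the patch needs its always_inline gymnastics.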
On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote:
> 
> Or just wait until people have agreed to switch to C++11 or C++14.  I
> don't think in practice anybody uses a C++11-incapable GCC to build a
> newer GCC these days.

I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...
I could bootstrap a newer native compiler, if I had to.
On 08/20/2016 03:29 AM, Mike Stump wrote:
> On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote:
>>
>> Or just wait until people have agreed to switch to C++11 or C++14.  I
>> don't think in practice anybody uses a C++11-incapable GCC to build a
>> newer GCC these days.

gdb will drop support for building with a C compiler any week
now, and even though we're starting out with C++03, just like gcc,
it'd be great to require C++11 (or later).  Having gcc itself
switch to C++11 too would make proposing it for gdb so much
easier...

So +1 from me, FWIW.  :-)

> 
> I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...
> I could bootstrap a newer native compiler, if I had to.
> 

Yeah.  I wonder whether the community would in general be fine with
that too.

Thanks,
Pedro Alves
On Sun, Aug 21, 2016 at 09:00:26PM +0100, Pedro Alves wrote:
> On 08/20/2016 03:29 AM, Mike Stump wrote:
> > On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote:
> >> 
> >> Or just wait until people have agreed to switch to C++11 or C++14.  I
> >> don't think in practice anybody uses a C++11-incapable GCC to build a
> >> newer GCC these days.
> 
> gdb will drop support for building with a C compiler any week
> now, and even though we're starting out with C++03, just like gcc,
> it'd be great to require C++11 (or later).  Having gcc itself
> switch to C++11 too would make proposing it for gdb so much
> easier...

huh, I would have sort of expected the opposite: if gdb were to require
C++11 but gcc didn't, then you could still use gdb on antique systems
without a C++11 compiler by first building gcc.

> So +1 from me, FWIW.  :-)

I'd mostly agree; at least requiring a compiler with rvalue references
would be pretty useful.

> > 
> > I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...
> > I could bootstrap a newer native compiler, if I had to.
> > 
> 
> Yeah.  I wonder whether the community would in general be fine with
> that too.

I personally don't have any machines where the system compiler is that
old, but it's worth noting the last C++11 features came in 4.8.1 I
think, and for libstdc++ it's even later.

Trev

> 
> Thanks,
> Pedro Alves
On Mon, Aug 22, 2016 at 9:18 AM, Trevor Saunders <tbsaunde@tbsaunde.org> wrote: > On Sun, Aug 21, 2016 at 09:00:26PM +0100, Pedro Alves wrote: >> On 08/20/2016 03:29 AM, Mike Stump wrote: >> > On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote: >> >> >> >> Or just wait until people have agreed to switch to C++11 or C++14. I >> >> don't think in practice anybody uses an C++11-incapable GCC to build a >> >> newer GCC these days. >> >> gdb will drop support for building with a C compiler any week >> now, and even though we're starting out with C++03, just like gcc, >> it'd be great to require C++11 (or later). Having gcc itself >> switch to C++11 too would make proposing it for gdb so much >> easier... > > huh, I would have sort of expected the oposit, if gdb was to require > C++11, but gcc didn't then you could still use gdb on antique systems > without a C++11 compiler by first building gcc. > >> So +1 from me, FWIW. :-) > > I'd mostly agree, at least requiring a compiler with rvalue references > would be pretty useful. > >> > >> > I use the system gcc 4.4.7 on RHEL to build a newer cross compiler... I could bootstrap a newer native compiler, if I had too. >> > >> >> Yeah. I wonder whether the community would in general be fine with >> that too. > > I personally don't have any machines where the system compiler is that > old, but its worth noting the last C++11 features came in 4.8.1 I think > and for libstdc++ its even later. Most of our auto-testers are still on GCC 4.1 and GCC 4.3 host compilers (all but one actually). The oldest still maintained SLES has GCC 4.3 as host compiler (though GCC 4.8 is available as well, just not as 'gcc'). The "current" SLES has GCC 4.8 as host compiler. So unless there are very compelling reasons to require C++11 I'd rather not go down that route at this very moment. Richard. 
(yeah, time to update some of our autotesters - but then it will most likely be simply retiring them) > Trev > >> >> Thanks, >> Pedro Alves >>
On 8/21/16, Pedro Alves <palves@redhat.com> wrote: > On 08/20/2016 03:29 AM, Mike Stump wrote: >> On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote: >>> >>> Or just wait until people have agreed to switch to C++11 or C++14. I >>> don't think in practice anybody uses an C++11-incapable GCC to build a >>> newer GCC these days. > > gdb will drop support for building with a C compiler any week > now, and even though we're starting out with C++03, just like gcc, > it'd be great to require C++11 (or later). Having gcc itself > switch to C++11 too would make proposing it for gdb so much > easier... > > So +1 from me, FWIW. :-) > >> >> I use the system gcc 4.4.7 on RHEL to build a newer cross compiler... I >> could bootstrap a newer native compiler, if I had too. >> > > Yeah. I wonder whether the community would in general be fine with > that too. > > Thanks, > Pedro Alves > As a rookie programmer considering possibly contributing to GCC in the future once I'm more confident in my abilities, switching to C++11 would increase the barrier for me to contribute. I currently really only know C, and I still have to learn C++ in general, much less all the new features added in C++11. Also, I still occasionally bootstrap with Apple's gcc 4.2.1, which doesn't have C++11 support. That being said, I'd support taking steps to make it easier to optionally compile gcc as C++11 for those that want to do so, such as replacing -Wno-narrowing with -Wnarrowing in the warning flags that gcc uses. Out of curiosity, I tried doing just that, and there were actually fewer new unique warnings than I expected. I've attached a log of them. Eric Gallager
On 22/08/16 13:02, Eric Gallager wrote: > As a rookie programmer considering possibly contributing to GCC in the > future once I'm more confident in my abilities, switching to C++11 > would increase the barrier for me to contribute. I currently really > only know C, and I still have to learn C++ in general, much less all > the new features added in C++11. Also, I still occasionally bootstrap > with Apple's gcc 4.2.1, which doesn't have C++11 support. That being > said, I'd support taking steps to make it easier to optionally compile > gcc as C++11 for those that want to do so, such as replacing > -Wno-narrowing with -Wnarrowing in the warning flags that gcc uses. > Out of curiosity, I tried doing just that, and there were actually > fewer new unique warnings than I expected. I've attached a log of > them. If you are able to bootstrap GCC and run the regression testsuite, you have done 50% of what you need to submit your first patch: https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps The implementation language is often a detail hidden by many layers of abstraction. It is more important to get to know those abstractions, know how to use a debugger and understand how various things fit together in gcc (FEs, MEs, back-ends, gimple, trees, RTL) than the implementation of those things. And the best way to learn is to start with a small project in your area of interest (https://gcc.gnu.org/bugzilla/buglist.cgi?keywords=easyhack&list_id=158576&resolution=---) and persevere. Cheers, Manuel.
On Aug 22, 2016, at 5:02 AM, Eric Gallager <egall@gwmail.gwu.edu> wrote: > > As a rookie programmer considering possibly contributing to GCC in the > future once I'm more confident in my abilities, switching to C++11 > would increase the barrier for me to contribute. I currently really > only know C, and I still have to learn C++ in general, much less all > the new features added in C++11. Don't worry, we don't exploit every feature of the language on every line. Most of the lines likely look just like plain C.
On 8/22/16, Mike Stump <mikestump@comcast.net> wrote: > On Aug 22, 2016, at 5:02 AM, Eric Gallager <egall@gwmail.gwu.edu> wrote: >> >> As a rookie programmer considering possibly contributing to GCC in the >> future once I'm more confident in my abilities, switching to C++11 >> would increase the barrier for me to contribute. I currently really >> only know C, and I still have to learn C++ in general, much less all >> the new features added in C++11. > > Don't worry, we don't exploit every feature of the language on every line. > Most of the lines likely look just like plain C. Right, I suppose that's true. And even if you did start using every feature of the language, there'd still be other work available in areas I'm more comfortable in, such as shell scripts, autoconf macros, and Makefiles... assuming you also continue using the same build system, that is...
diff --git a/gcc/protected-alloca.h b/gcc/protected-alloca.h
new file mode 100644
index 0000000..62f2a7b
--- /dev/null
+++ b/gcc/protected-alloca.h
@@ -0,0 +1,113 @@
+/* Alloca wrapper with a malloc fallback.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef _PROTECTED_ALLOCA_H_
+#define _PROTECTED_ALLOCA_H_
+
+#ifndef MAX_ALLOCA_SIZE
+#define MAX_ALLOCA_SIZE 4096
+#endif
+
+#ifdef __GNUC__
+#define _ALLOCA_INLINE_ __attribute__((always_inline))
+#else
+#define _ALLOCA_INLINE_
+#endif
+
+/* This is a wrapper class for alloca that falls back to malloc if the
+   allocation size is > MAX_ALLOCA_SIZE.  It is meant to replace:
+
+     char *str = (char *) alloca (N);
+
+   by this:
+
+     protected_alloca chunk (N);
+     char *str = (char *) chunk.pointer ();
+
+   or this:
+
+     protected_alloca chunk;
+     chunk.alloc (N);
+     char *str = (char *) chunk.pointer ();
+
+   If N > MAX_ALLOCA_SIZE, malloc is used, and whenever `chunk' goes
+   out of scope, the malloc'd memory will be freed.  Keep in mind that
+   the malloc'd memory gets freed when `chunk' goes out of scope, and
+   may be freed earlier than expected.  For example:
+
+     func()
+     {
+       char *str;
+       {
+         protected_alloca chunk (99999999);
+         // If malloc'd, pointer will be freed when chunk goes out of scope.
+         str = (char *) chunk.pointer ();
+       }
+       use (str); // BAD!  Use after free.
+     }
+
+   In this case, it is best to use the following idiom:
+
+     func()
+     {
+       void *ptr;
+       protected_alloca chunk;
+       {
+         chunk.alloc (9999999);
+         str = (char *) chunk.pointer ();
+       }
+       // OK, pointer will be freed on function exit.
+       use (str);
+     }
+*/
+
+class protected_alloca {
+  void *p;
+  bool on_heap;
+public:
+  // GCC will refuse to inline a function that calls alloca unless
+  // `always_inline' is used, for fear of inlining an alloca call
+  // appearing in a loop.  Inlining this constructor won't make code
+  // any worse than it already is wrt loops.  We know what we're
+  // doing.
+  _ALLOCA_INLINE_ void
+  internal_alloc (size_t size)
+  {
+    if (size < MAX_ALLOCA_SIZE)
+      {
+	p = alloca (size);
+	on_heap = false;
+      }
+    else
+      {
+	p = xmalloc (size);
+	on_heap = true;
+      }
+  }
+  _ALLOCA_INLINE_
+  protected_alloca (size_t size) { internal_alloc (size); }
+  protected_alloca () { p = NULL; on_heap = false; }
+  ~protected_alloca () { if (on_heap) free (p); }
+  __attribute__((always_inline))
+  void alloc (size_t size) { gcc_assert (!p); internal_alloc (size); }
+  void *pointer () { return p; }
+};
+
+#endif // _PROTECTED_ALLOCA_H_
diff --git a/gcc/tree.c b/gcc/tree.c
index 11d3b51..621dcd5 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -62,6 +62,7 @@ along with GCC; see the file COPYING3.  If not see
 #include "print-tree.h"
 #include "ipa-utils.h"
 #include "selftest.h"
+#include "protected-alloca.h"
 
 /* Tree code classes.  */
 
@@ -9639,6 +9640,7 @@ get_file_function_name (const char *type)
   char *buf;
   const char *p;
   char *q;
+  protected_alloca q_alloca;
 
   /* If we already have a name we know to be unique, just use that.  */
   if (first_global_object_name)
@@ -9674,7 +9676,8 @@
       file = LOCATION_FILE (input_location);
       len = strlen (file);
-      q = (char *) alloca (9 + 17 + len + 1);
+      q_alloca.alloc (9 + 17 + len + 1);
+      q = (char *) q_alloca.pointer ();
       memcpy (q, file, len + 1);
 
       snprintf (q + len, 9 + 17 + 1, "_%08X_" HOST_WIDE_INT_PRINT_HEX,
@@ -9684,8 +9687,9 @@
     }
   clean_symbol_name (q);
 
-  buf = (char *) alloca (sizeof (FILE_FUNCTION_FORMAT) + strlen (p)
-			 + strlen (type));
+  protected_alloca buf_alloca (sizeof (FILE_FUNCTION_FORMAT) + strlen (p)
+			       + strlen (type));
+  buf = (char *) buf_alloca.pointer ();
 
   /* Set up the name of the file-level functions we may need.
      Use a global object (which is already required to be unique over
@@ -9709,7 +9713,8 @@ tree_check_failed (const_tree node, const char *file,
 {
   va_list args;
   const char *buffer;
-  unsigned length = 0;
+  protected_alloca buffer_alloca;
+  size_t length = 0;
   enum tree_code code;
 
   va_start (args, function);
@@ -9721,7 +9726,8 @@
       char *tmp;
       va_start (args, function);
       length += strlen ("expected ");
-      buffer = tmp = (char *) alloca (length);
+      buffer_alloca.alloc (length);
+      buffer = tmp = (char *) buffer_alloca.pointer ();
      length = 0;
      while ((code = (enum tree_code) va_arg (args, int)))
	{
@@ -9752,7 +9758,8 @@ tree_not_check_failed (const_tree node, const char *file,
 {
   va_list args;
   char *buffer;
-  unsigned length = 0;
+  protected_alloca buffer_alloca;
+  size_t length = 0;
   enum tree_code code;
 
   va_start (args, function);
@@ -9760,7 +9767,8 @@ tree_not_check_failed (const_tree node, const char *file,
     length += 4 + strlen (get_tree_code_name (code));
   va_end (args);
   va_start (args, function);
-  buffer = (char *) alloca (length);
+  buffer_alloca.alloc (length);
+  buffer = (char *) buffer_alloca.pointer ();
   length = 0;
   while ((code = (enum tree_code) va_arg (args, int)))
     {
@@ -9802,14 +9810,16 @@ tree_range_check_failed (const_tree node, const char *file, int line,
			 enum tree_code c2)
 {
   char *buffer;
-  unsigned length = 0;
+  protected_alloca buffer_alloca;
+  size_t length = 0;
   unsigned int c;
 
   for (c = c1; c <= c2; ++c)
     length += 4 + strlen (get_tree_code_name ((enum tree_code) c));
 
   length += strlen ("expected ");
-  buffer = (char *) alloca (length);
+  buffer_alloca.alloc (length);
+  buffer = (char *) buffer_alloca.pointer ();
   length = 0;
 
   for (c = c1; c <= c2; ++c)
@@ -9863,14 +9873,16 @@ omp_clause_range_check_failed (const_tree node, const char *file, int line,
			       enum omp_clause_code c2)
 {
   char *buffer;
-  unsigned length = 0;
+  protected_alloca buffer_alloca;
+  size_t length = 0;
   unsigned int c;
 
   for (c = c1; c <= c2; ++c)
     length += 4 + strlen (omp_clause_code_name[c]);
 
   length += strlen ("expected ");
-  buffer = (char *) alloca (length);
+  buffer_alloca.alloc (length);
+  buffer = (char *) buffer_alloca.pointer ();
   length = 0;
 
   for (c = c1; c <= c2; ++c)
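For readers outside the GCC tree, the class boils down to the RAII sketch below. It is a portable re-sketch, not the patch itself: small requests come from a fixed in-object buffer instead of alloca (sidestepping the always_inline requirement Richard raises for non-GCC hosts), xmalloc and gcc_assert are replaced by their plain-libc equivalents, and the `heap_allocated` accessor is an invented helper for demonstration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

#define MAX_ALLOCA_SIZE 4096

// Simplified stand-in for the patch's protected_alloca: large
// requests fall back to malloc and are freed by the destructor on
// every exit path -- the "goto out; free (m_p);" chore the class
// exists to eliminate.
class protected_alloca
{
  char buf[MAX_ALLOCA_SIZE];  // stand-in for the alloca'd chunk
  void *p;
  bool on_heap;
public:
  protected_alloca (size_t size)
  {
    if (size < MAX_ALLOCA_SIZE)
      {
        p = buf;
        on_heap = false;
      }
    else
      {
        p = malloc (size);    // xmalloc in the actual patch
        on_heap = true;
      }
  }
  ~protected_alloca () { if (on_heap) free (p); }
  void *pointer () { return p; }
  bool heap_allocated () const { return on_heap; }  // invented helper
};

// The head-of-thread example without the gotos: whichever return
// path is taken, the destructor runs and releases any heap copy.
size_t upcase (const char *s, char *out)
{
  size_t len = strlen (s) + 1;
  protected_alloca chunk (len);
  char *tmp = (char *) chunk.pointer ();
  for (size_t i = 0; i < len; i++)
    tmp[i] = (s[i] >= 'a' && s[i] <= 'z') ? s[i] - 32 : s[i];
  memcpy (out, tmp, len);
  return len - 1;
}
```

The scoping pitfall documented in the header applies unchanged here: the chunk must outlive every use of the pointer it hands out.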