
protected alloca class for malloc fallback

Message ID 57A32741.7010003@redhat.com
State New

Commit Message

Aldy Hernandez Aug. 4, 2016, 11:30 a.m. UTC
Howdy!

As part of my -Walloca-larger-than=life work, I've been running said 
pass over gcc, binutils, and gdb, and trying to fix things along the way.

Particularly irritating and error prone is having to free malloc'd 
pointers on every function exit point.  We end up with a lot of:

foo(size_t len)
{
   void *p, *m_p = NULL;
   if (len < HUGE)
     p = alloca(len);
   else
     p = m_p = malloc(len);
   if (something)
     goto out;
   stuff();
out:
   free (m_p);
}

...which nobody really likes.

I've been thinking that for GCC we could have a protected_alloca class 
whose destructor frees any malloc'd memory:

void foo()
{
   char *p;
   protected_alloca chunk(50000);
   p = (char *) chunk.pointer();
   f(p);
}

This would generate:

void foo() ()
{
   void * _3;

   <bb 2>:
   _3 = malloc (50000);
   f (_3);

   <bb 3>:
   free (_3); [tail call]
   return;
}

Now the problem with this is that the memory allocated by chunk is freed 
when it goes out of scope, which may not be what you want.  For example:

      func()
      {
        char *str;
        {
          protected_alloca chunk (99999999);
          // malloc'd pointer will be freed when chunk goes out of scope.
          str = (char *) chunk.pointer ();
        }
        use (str);  // BAD!  Use after free.
      }

In the attached patch implementing this class I have provided another 
idiom for avoiding this problem:

      func()
      {
        char *str;
        protected_alloca chunk;
        {
          chunk.alloc (9999999);
          str = (char *) chunk.pointer ();
        }
        // OK, pointer will be freed on function exit.
        use (str);
      }

So I guess it's between annoying gotos and keeping track of multiple 
exit points to a function previously calling alloca, or making sure the 
protected_alloca object always resides in the scope where the memory is 
going to be used.

Is there a better blessed C++ way?  If not, is this OK?

Included is the conversion of tree.c.  More to follow once we agree on a 
solution.

Tested on x86-64 Linux.

Aldy
commit fd0078ef60dd75ab488392e0e05b28f27d971bdf
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Thu Aug 4 06:43:37 2016 -0400

    	* protected-alloca.h: New.
    	* tree.c (get_file_function_name): Use protected_alloca.
    	(tree_check_failed): Same.
    	(tree_not_check_failed): Same.
    	(tree_range_check_failed): Same.
    	(omp_clause_range_check_failed): Same.

Comments

Richard Biener Aug. 4, 2016, 12:57 p.m. UTC | #1
On Thu, Aug 4, 2016 at 1:30 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> Howdy!
>
> As part of my -Walloca-larger-than=life work, I've been running said pass
> over gcc, binutils, and gdb, and trying to fix things along the way.
>
> Particularly irritating and error prone is having to free malloc'd pointers
> on every function exit point.  We end up with a lot of:
>
> foo(size_t len)
> {
>   void *p, *m_p = NULL;
>   if (len < HUGE)
>     p = alloca(len);
>   else
>     p = m_p = malloc(len);
>   if (something)
>     goto out;
>   stuff();
> out:
>   free (m_p);
> }
>
> ...which nobody really likes.
>
> I've been thinking that for GCC we could have a protected_alloca class whose
> destructor frees any malloc'd memory:
>
> void foo()
> {
>   char *p;
>   protected_alloca chunk(50000);
>   p = (char *) chunk.pointer();
>   f(p);
> }
>
> This would generate:
>
> void foo() ()
> {
>   void * _3;
>
>   <bb 2>:
>   _3 = malloc (50000);
>   f (_3);
>
>   <bb 3>:
>   free (_3); [tail call]
>   return;
> }
>
> Now the problem with this is that the memory allocated by chunk is freed
> when it goes out of scope, which may not be what you want.  For example:
>
>      func()
>      {
>        char *str;
>        {
>          protected_alloca chunk (99999999);
>          // malloc'd pointer will be freed when chunk goes out of scope.
>          str = (char *) chunk.pointer ();
>        }
>        use (str);  // BAD!  Use after free.
>      }

But how's that an issue if the chunk is created at the exact place where there
previously was an alloca?

Your class also will not work when internal_alloc is not inlined and
the alloca path
is taken like when using non-GCC host compilers.

> In the attached patch implementing this class I have provided another idiom
> for avoiding this problem:
>
>      func()
>      {
>        char *str;
>        protected_alloca chunk;
>        {
>          chunk.alloc (9999999);
>          str = (char *) chunk.pointer ();
>        }
>        // OK, pointer will be freed on function exit.
>        use (str);
>      }
>
> So I guess it's between annoying gotos and keeping track of multiple exit
> points to a function previously calling alloca, or making sure the
> protected_alloca object always resides in the scope where the memory is
> going to be used.
>
> Is there a better blessed C++ way?  If not, is this OK?

It looks like you want to replace _all_ alloca uses?  What's the point
in doing this
at all?  Just to be able to enable the warning during bootstrap?

Having the conditional malloc/alloca will also inhibit optimization like eliding
the malloc or alloca calls completely.

Thanks,
Richard.

> Included is the conversion of tree.c.  More to follow once we agree on a
> solution.
>
> Tested on x86-64 Linux.
>
> Aldy
Aldy Hernandez Aug. 4, 2016, 3:19 p.m. UTC | #2
On 08/04/2016 08:57 AM, Richard Biener wrote:
> On Thu, Aug 4, 2016 at 1:30 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> Howdy!
>>
>> As part of my -Walloca-larger-than=life work, I've been running said pass
>> over gcc, binutils, and gdb, and trying to fix things along the way.
>>
>> Particularly irritating and error prone is having to free malloc'd pointers
>> on every function exit point.  We end up with a lot of:
>>
>> foo(size_t len)
>> {
>>    void *p, *m_p = NULL;
>>    if (len < HUGE)
>>      p = alloca(len);
>>    else
>>      p = m_p = malloc(len);
>>    if (something)
>>      goto out;
>>    stuff();
>> out:
>>    free (m_p);
>> }
>>
>> ...which nobody really likes.
>>
>> I've been thinking that for GCC we could have a protected_alloca class whose
>> destructor frees any malloc'd memory:
>>
>> void foo()
>> {
>>    char *p;
>>    protected_alloca chunk(50000);
>>    p = (char *) chunk.pointer();
>>    f(p);
>> }
>>
>> This would generate:
>>
>> void foo() ()
>> {
>>    void * _3;
>>
>>    <bb 2>:
>>    _3 = malloc (50000);
>>    f (_3);
>>
>>    <bb 3>:
>>    free (_3); [tail call]
>>    return;
>> }
>>
>> Now the problem with this is that the memory allocated by chunk is freed
>> when it goes out of scope, which may not be what you want.  For example:
>>
>>       func()
>>       {
>>         char *str;
>>         {
>>           protected_alloca chunk (99999999);
>>           // malloc'd pointer will be freed when chunk goes out of scope.
>>           str = (char *) chunk.pointer ();
>>         }
>>         use (str);  // BAD!  Use after free.
>>       }
>
> But how's that an issue if the chunk is created at the exact place where there
> previously was an alloca?

The pointer can escape if you assign it to a variable outside the scope 
of chunk?  Take for instance the following snippet in tree.c:

     {
     ...
     ...
       q = (char *) alloca (9 + 17 + len + 1);
       memcpy (q, file, len + 1);

       snprintf (q + len, 9 + 17 + 1, "_%08X_" HOST_WIDE_INT_PRINT_HEX,
		crc32_string (0, name), get_random_seed (false));

       p = q;
     }

     clean_symbol_name (q);

If you define `protected_alloca chunk(9 + 17 + len + 1)' at the alloca() 
call, chunk will be destroyed at the "}", whereas `q' is still being 
used outside of that scope.

What I am suggesting for this escaping case is to define 
"protected_alloca chunk()" at function scope, and then do chunk.alloc(N) 
in the spot where the alloca() call was previously at.

Or am I missing something?

>
> Your class also will not work when internal_alloc is not inlined and
> the alloca path
> is taken like when using non-GCC host compilers.

Does not work, or is not optimal?  Because defining _ALLOCA_INLINE_ to 
nothing and forcing no-inline with:

	g++ -c b.cc -fno-inline -fdump-tree-all  -O1 -fno-exceptions

I still see correct code.  It's just that it's inefficient, which we 
shouldn't care about because bootstrap fixes the non-GCC inlining problem :).

   protected_alloca chunk(123);
   str = (char *) chunk.pointer();
   use(str);

becomes:

   <bb 2>:
   protected_alloca::protected_alloca (&chunk, 123);
   str_3 = protected_alloca::pointer (&chunk);
   use (str_3);
   protected_alloca::~protected_alloca (&chunk);
   return;

What am I missing?

>
>> In the attached patch implementing this class I have provided another idiom
>> for avoiding this problem:
>>
>>       func()
>>       {
>>         char *str;
>>         protected_alloca chunk;
>>         {
>>           chunk.alloc (9999999);
>>           str = (char *) chunk.pointer ();
>>         }
>>         // OK, pointer will be freed on function exit.
>>         use (str);
>>       }
>>
>> So I guess it's between annoying gotos and keeping track of multiple exit
>> points to a function previously calling alloca, or making sure the
>> protected_alloca object always resides in the scope where the memory is
>> going to be used.
>>
>> Is there a better blessed C++ way?  If not, is this OK?
>
> It looks like you want to replace _all_ alloca uses?  What's the point
> in doing this
> at all?  Just to be able to enable the warning during bootstrap?

Well, it did cross my mind to nix anything that had 0 bounds checks, but 
I was mostly interested in things like this:

gcc.c:
   temp = env.get ("COMPILER_PATH");
   if (temp)
     {
       const char *startp, *endp;
       char *nstore = (char *) alloca (strlen (temp) + 3);

I was just providing a generic interface for dealing with these cases in 
the future, instead of gotoing my way out of it.
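(For what it's worth, that kind of scratch-string case reduces to a few lines of std::string -- a hypothetical sketch, not the actual gcc.c code, with the env.get result passed in as a plain pointer:)

```cpp
#include <string>

/* Hypothetical rewrite of the COMPILER_PATH pattern: the alloca'd
   scratch buffer becomes a std::string, which frees itself on every
   exit path, gotos or not.  */
std::string
copy_path_with_slack (const char *temp)
{
  std::string nstore (temp);  /* replaces alloca (strlen (temp) + 3) */
  nstore += "/";              /* appending needs no "+ 3" slack */
  return nstore;
}
```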

>
> Having the conditional malloc/alloca will also inhibit optimization like eliding
> the malloc or alloca calls completely.

If we can elide the alloca, we can surely elide a conditional alloca / 
malloc pair, can't we?

Aldy
Pedro Alves Aug. 4, 2016, 7:06 p.m. UTC | #3
How wedded are we to alloca?

On 08/04/2016 12:30 PM, Aldy Hernandez wrote:

> Is there a better blessed C++ way?

  std::string

or if I'm reading gcc's vec.h correctly:

  auto_vec<char, MAX_ALLOCA_SIZE>

?

(or some typedef of the latter)

The latter would always allocate the stack space, but
my guess is that it wouldn't matter.

Thanks,
Pedro Alves
Jeff Law Aug. 4, 2016, 7:16 p.m. UTC | #4
On 08/04/2016 01:06 PM, Pedro Alves wrote:
> How wedded are we to alloca?
I would think only in the sense of existing codebase usage.

Based on how often alloca usage has resulted in a security vulnerability 
it's clear we as developers can't use it correctly on a consistent 
basis, thus I'd like to abolish it :-)  I'll settle for warning folks 
when they use it incorrectly though.

Jeff
Pedro Alves Aug. 4, 2016, 7:21 p.m. UTC | #5
On 08/04/2016 08:16 PM, Jeff Law wrote:
> On 08/04/2016 01:06 PM, Pedro Alves wrote:
>> How wedded are we to alloca?
> I would think only in the sense of existing codebase usage.
> 
> Based on how often alloca usage has resulted in a security vulnerability
> it's clear we as developers can't use it correctly on a consistent
> basis, thus I'd like to abolish it :-)  I'll settle for warning folks
> when they use it incorrectly though.

Most allocas I've seen in my life were written to simply build
strings at run time, while lazily avoiding having to think about writing
a "free" call, rather than having been written for optimizing
some fast path, or for async-signal safe reasons.

My guess is that auto_vec<char, MAX_ALLOCA_SIZE> covers any
fast-path-handling requirement in gcc.

(Where alloca is being used for async-signal safety reasons you
can't have a malloc fallback anyway, so out of scope for the
proposed protected_alloca too.)

Thanks,
Pedro Alves
Jeff Law Aug. 4, 2016, 7:24 p.m. UTC | #6
On 08/04/2016 09:19 AM, Aldy Hernandez wrote:

>>>
>>> Particularly irritating and error prone is having to free malloc'd
>>> pointers
>>> on every function exit point.  We end up with a lot of:
>>>
>>> foo(size_t len)
>>> {
>>>    void *p, *m_p = NULL;
>>>    if (len < HUGE)
>>>      p = alloca(len);
>>>    else
>>>      p = m_p = malloc(len);
>>>    if (something)
>>>      goto out;
>>>    stuff();
>>> out:
>>>    free (m_p);
>>> }
>>>
>>> ...which nobody really likes.
Yup.  You'll see that all over glibc.  I don't think we use that idiom 
all that much for GCC.

>>>
>>> I've been thinking that for GCC we could have a protected_alloca
>>> class whose
>>> destructor frees any malloc'd memory:
>>>
>>> void foo()
>>> {
>>>    char *p;
>>>    protected_alloca chunk(50000);
>>>    p = (char *) chunk.pointer();
>>>    f(p);
>>> }
>>>
>>> This would generate:
>>>
>>> void foo() ()
>>> {
>>>    void * _3;
>>>
>>>    <bb 2>:
>>>    _3 = malloc (50000);
>>>    f (_3);
>>>
>>>    <bb 3>:
>>>    free (_3); [tail call]
>>>    return;
>>> }
>>>
>>> Now the problem with this is that the memory allocated by chunk is freed
>>> when it goes out of scope, which may not be what you want.  For example:
>>>
>>>       func()
>>>       {
>>>         char *str;
>>>         {
>>>           protected_alloca chunk (99999999);
>>>           // malloc'd pointer will be freed when chunk goes out of
>>> scope.
>>>           str = (char *) chunk.pointer ();
>>>         }
>>>         use (str);  // BAD!  Use after free.
>>>       }
>>
>> But how's that an issue if the chunk is created at the exact place
>> where there
>> previously was an alloca?
>
> The pointer can escape if you assign it to a variable outside the scope
> of chunk?  Take for instance the following snippet in tree.c:
Right.  The alloca model (with the exception of VLAs in loops) is that the 
storage hangs around until function scope is left.  So releasing on exit 
of the scope in which the space was allocated is wrong unless allocation 
occurred at function scope.

That doesn't play as well with an RAII model where objects get destroyed 
as soon as they leave their lexical scope.



So
>
>     {
>     ...
>     ...
>       q = (char *) alloca (9 + 17 + len + 1);
>       memcpy (q, file, len + 1);
>
>       snprintf (q + len, 9 + 17 + 1, "_%08X_" HOST_WIDE_INT_PRINT_HEX,
>         crc32_string (0, name), get_random_seed (false));
>
>       p = q;
>     }
>
>     clean_symbol_name (q);
>
> If you define `protected_alloca chunk(9 + 17 + len + 1)' at the alloca()
> call, chunk will be destroyed at the "}", whereas `q' is still being
> used outside of that scope.
>
> What I am suggesting for this escaping case is to define
> "protected_alloca chunk()" at function scope, and then do chunk.alloc(N)
> in the spot where the alloca() call was previously at.
>
> Or am I missing something?
I think the only thing you're missing is that we probably don't want to be 
using alloca in new code anyway.  And if we're going through the trouble 
of fixing old code, we ought to just remove alloca.  So building up this 
kind of class is probably taking folks in the wrong direction.

Now you might use that opportunity to convert some object to a RAII 
model (and I'm a huge fan of that model).

Jeff
Jeff Law Aug. 4, 2016, 7:26 p.m. UTC | #7
On 08/04/2016 01:21 PM, Pedro Alves wrote:
> On 08/04/2016 08:16 PM, Jeff Law wrote:
>> On 08/04/2016 01:06 PM, Pedro Alves wrote:
>>> How wedded are we to alloca?
>> I would think only in the sense of existing codebase usage.
>>
>> Based on how often alloca usage has resulted in a security vulnerability
>> it's clear we as developers can't use it correctly on a consistent
>> basis, thus I'd like to abolish it :-)  I'll settle for warning folks
>> when they use it incorrectly though.
>
> Most allocas I've seen in my life were written to simply build
> strings at run time, while lazily avoiding having to think about writing
> a "free" call, rather than having been written for optimizing
> some fast path, or for async-signal safe reasons.
Right -- the problem is that if those strings are potentially under user 
control, or the bad guys can arrange to overflow a size computation 
feeding an alloca, then this stuff becomes a huge gaping security hole. 
We've seen this repeatedly within glibc.
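(A toy illustration of the wraparound hazard -- just the unsigned arithmetic, no alloca involved; the "+ 3" mirrors the gcc.c example upthread:)

```cpp
#include <cstddef>
#include <cstdint>

/* If an attacker controls len, a size computation like "len + 3"
   silently wraps, so the allocation is tiny while the subsequent
   copy of len bytes is huge.  */
size_t
request_size (size_t len)
{
  return len + 3;  /* wraps to a small value when len > SIZE_MAX - 3 */
}
```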

Jeff
Pedro Alves Aug. 4, 2016, 7:31 p.m. UTC | #8
On 08/04/2016 08:26 PM, Jeff Law wrote:
> On 08/04/2016 01:21 PM, Pedro Alves wrote:

>> Most allocas I've seen in my life were written to simply build
>> strings at run time, while lazily avoiding having to think about writing
>> a "free" call, rather than having been written for optimizing
>> some fast path, or for async-signal safe reasons.
> Right -- the problem is that if those strings are potentially under user
> control, or the bad guys can arrange to overflow a size computation
> feeding an alloca, then this stuff becomes a huge gaping security hole.
> We've seen this repeatedly within glibc.

Yup.

Thanks,
Pedro Alves
Martin Sebor Aug. 5, 2016, 2:10 a.m. UTC | #9
On 08/04/2016 05:30 AM, Aldy Hernandez wrote:
> Howdy!
>
> As part of my -Walloca-larger-than=life work, I've been running said
> pass over gcc, binutils, and gdb, and trying to fix things along the way.
>
> Particularly irritating and error prone is having to free malloc'd
> pointers on every function exit point.  We end up with a lot of:
>
> foo(size_t len)
> {
>    void *p, *m_p = NULL;
>    if (len < HUGE)
>      p = alloca(len);
>    else
>      p = m_p = malloc(len);
>    if (something)
>      goto out;
>    stuff();
> out:
>    free (m_p);
> }
>
> ...which nobody really likes.
>
> I've been thinking that for GCC we could have a protected_alloca class
> whose destructor frees any malloc'd memory:
>
> void foo()
> {
>    char *p;
>    protected_alloca chunk(50000);
>    p = (char *) chunk.pointer();
>    f(p);
> }
>
> This would generate:
>
> void foo() ()
> {
>    void * _3;
>
>    <bb 2>:
>    _3 = malloc (50000);
>    f (_3);
>
>    <bb 3>:
>    free (_3); [tail call]
>    return;
> }
>
> Now the problem with this is that the memory allocated by chunk is freed
> when it goes out of scope, which may not be what you want.  For example:
>
>       func()
>       {
>         char *str;
>         {
>           protected_alloca chunk (99999999);
>       // malloc'd pointer will be freed when chunk goes out of scope.
>           str = (char *) chunk.pointer ();
>         }
>         use (str);  // BAD!  Use after free.
>       }
>
> In the attached patch implementing this class I have provided another
> idiom for avoiding this problem:
>
>       func()
>       {
>         char *str;
>         protected_alloca chunk;
>         {
>           chunk.alloc (9999999);
>       str = (char *) chunk.pointer ();
>         }
>         // OK, pointer will be freed on function exit.
>         use (str);
>       }
>
> So I guess it's between annoying gotos and keeping track of multiple
> exit points to a function previously calling alloca, or making sure the
> protected_alloca object always resides in the scope where the memory is
> going to be used.
>
> Is there a better blessed C++ way?  If not, is this OK?

I was surprised by the always_inline trick.  I suppose it makes
sense but I wouldn't have expected to be able to rely on it.  Out
of curiosity I tested it with other compilers.  It turns out that
it only works with some like GCC and IBM XLC++, but not others
like Clang or Sun CC.  In recursive calls, they don't seem to
hold on to the memory allocated via alloca by the class ctor in
the caller.

FWIW, rather than creating a new class around alloca and putting
effort into changing code to use it I think that investing time
into replacing the function's uses with an existing C++ container
class (either GCC's own or one from the STL) might be a better
way to go.

Martin
Richard Biener Aug. 5, 2016, 8:17 a.m. UTC | #10
On Thu, Aug 4, 2016 at 5:19 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> On 08/04/2016 08:57 AM, Richard Biener wrote:
>>
>> On Thu, Aug 4, 2016 at 1:30 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>>>
>>> Howdy!
>>>
>>> As part of my -Walloca-larger-than=life work, I've been running said pass
>>> over gcc, binutils, and gdb, and trying to fix things along the way.
>>>
>>> Particularly irritating and error prone is having to free malloc'd
>>> pointers
>>> on every function exit point.  We end up with a lot of:
>>>
>>> foo(size_t len)
>>> {
>>>    void *p, *m_p = NULL;
>>>    if (len < HUGE)
>>>      p = alloca(len);
>>>    else
>>>      p = m_p = malloc(len);
>>>    if (something)
>>>      goto out;
>>>    stuff();
>>> out:
>>>    free (m_p);
>>> }
>>>
>>> ...which nobody really likes.
>>>
>>> I've been thinking that for GCC we could have a protected_alloca class
>>> whose
>>> destructor frees any malloc'd memory:
>>>
>>> void foo()
>>> {
>>>    char *p;
>>>    protected_alloca chunk(50000);
>>>    p = (char *) chunk.pointer();
>>>    f(p);
>>> }
>>>
>>> This would generate:
>>>
>>> void foo() ()
>>> {
>>>    void * _3;
>>>
>>>    <bb 2>:
>>>    _3 = malloc (50000);
>>>    f (_3);
>>>
>>>    <bb 3>:
>>>    free (_3); [tail call]
>>>    return;
>>> }
>>>
>>> Now the problem with this is that the memory allocated by chunk is freed
>>> when it goes out of scope, which may not be what you want.  For example:
>>>
>>>       func()
>>>       {
>>>         char *str;
>>>         {
>>>           protected_alloca chunk (99999999);
>>>           // malloc'd pointer will be freed when chunk goes out of scope.
>>>           str = (char *) chunk.pointer ();
>>>         }
>>>         use (str);  // BAD!  Use after free.
>>>       }
>>
>>
>> But how's that an issue if the chunk is created at the exact place where
>> there
>> previously was an alloca?
>
>
> The pointer can escape if you assign it to a variable outside the scope of
> chunk?  Take for instance the following snippet in tree.c:
>
>     {
>     ...
>     ...
>       q = (char *) alloca (9 + 17 + len + 1);
>       memcpy (q, file, len + 1);
>
>       snprintf (q + len, 9 + 17 + 1, "_%08X_" HOST_WIDE_INT_PRINT_HEX,
>                 crc32_string (0, name), get_random_seed (false));
>
>       p = q;
>     }
>
>     clean_symbol_name (q);
>
> If you define `protected_alloca chunk(9 + 17 + len + 1)' at the alloca()
> call, chunk will be destroyed at the "}", whereas `q' is still being used
> outside of that scope.
>
> What I am suggesting for this escaping case is to define "protected_alloca
> chunk()" at function scope, and then do chunk.alloc(N) in the spot where the
> alloca() call was previously at.
>
> Or am I missing something?

Ah, ok - alloca memory is only freed at the end of the function.  Only VLAs
are freed at scope boundary.

>>
>> Your class also will not work when internal_alloc is not inlined and
>> the alloca path
>> is taken like when using non-GCC host compilers.
>
>
> Does not work, or is not optimal?  Because defining _ALLOCA_INLINE_ to
> nothing and forcing no-inline with:
>
>         g++ -c b.cc -fno-inline -fdump-tree-all  -O1 -fno-exceptions
>
> I still see correct code.  It's just that it's inefficient, which we
> shouldn't care about because bootstrap fixes the non-GCC inlining problem :).

As alloca() frees memory at function return, the code cannot be "correct".

>   protected_alloca chunk(123);
>   str = (char *) chunk.pointer();
>   use(str);
>
> becomes:
>
>   <bb 2>:
>   protected_alloca::protected_alloca (&chunk, 123);
>   str_3 = protected_alloca::pointer (&chunk);
>   use (str_3);
>   protected_alloca::~protected_alloca (&chunk);
>   return;
>
> What am I missing?

The memory is released after internal_alloc returns.  So if you do

  protected_alloca::protected_alloca (&chunk, 123);
  ...
  foo (); // function needing some stack space or GCC spilling
  ...

you'll corrupt memory.

>>
>>> In the attached patch implementing this class I have provided another
>>> idiom
>>> for avoiding this problem:
>>>
>>>       func()
>>>       {
>>>         char *str;
>>>         protected_alloca chunk;
>>>         {
>>>           chunk.alloc (9999999);
>>>           str = (char *) chunk.pointer ();
>>>         }
>>>         // OK, pointer will be freed on function exit.
>>>         use (str);
>>>       }
>>>
>>> So I guess it's between annoying gotos and keeping track of multiple exit
>>> points to a function previously calling alloca, or making sure the
>>> protected_alloca object always resides in the scope where the memory is
>>> going to be used.
>>>
>>> Is there a better blessed C++ way?  If not, is this OK?
>>
>>
>> It looks like you want to replace _all_ alloca uses?  What's the point
>> in doing this
>> at all?  Just to be able to enable the warning during bootstrap?
>
>
> Well, it did cross my mind to nix anything that had 0 bounds checks, but I
> was mostly interested in things like this:
>
> gcc.c:
>   temp = env.get ("COMPILER_PATH");
>   if (temp)
>     {
>       const char *startp, *endp;
>       char *nstore = (char *) alloca (strlen (temp) + 3);
>
> I was just providing a generic interface for dealing with these cases in the
> future, instead of gotoing my way out of it.
>
>>
>> Having the conditional malloc/alloca will also inhibit optimization like
>> eliding
>> the malloc or alloca calls completely.
>
>
> If we can elide the alloca, we can surely elide a conditional alloca /
> malloc pair, can't we?

We can't at the moment.  We don't for malloc (we only remove malloc/free
pairs when the memory is not used); we can elide alloca to use a local
decl instead.
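(Concretely, the alloca elision mentioned here applies when the size is a known constant; a sketch of the before/after shapes, not GCC's actual transformation code:)

```cpp
#include <cstring>

/* Before: char *buf = (char *) alloca (32);  -- dynamic stack allocation.
   After:  an ordinary local array -- the "local decl" form the compiler
   can use when the alloca size is a known constant.  */
char
first_char_copied (const char *s)
{
  char buf[32];                  /* local decl instead of alloca */
  strncpy (buf, s, sizeof buf - 1);
  buf[sizeof buf - 1] = '\0';
  return buf[0];
}
```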

Richard.

>
> Aldy
Aldy Hernandez Aug. 5, 2016, 2:37 p.m. UTC | #11
> I think the only thing you're missing is that we probably don't want to be
> using alloca in new code anyway.  And if we're going through the trouble
> of fixing old code, we ought to just remove alloca.  So building up this
> kind of class is probably taking folks in the wrong direction.

After spending some time combing over all alloca uses in the compiler, I 
am very much inclined to agree :).

Pedro's comments up-thread ring true as well.  Most of the alloca uses 
in the compiler are actually ad-hoc allocations for a string that will be 
quickly discarded or passed to get_identifier().  Very few, if any, are 
actually on a critical path.  Most uses are things like building 
diagnostics.

However, there are a few potentially problematic ones like alloca()ing 
chunks of memory while processing GIMPLE_ASM (chunk * noutputs).  This 
may require something more efficient than malloc().  Though I really 
wonder whether inline asms are prevalent enough that replacing these 
with malloc or GCC's auto_vec<> (which also uses malloc (or GGC) BTW) 
would affect performance in any measurable way.

After looking at all this code, I realize no size will fit all if we 
want something clean.  So, I take my approach back.  I think we're best 
served by a couple different approaches, depending on usage.

My question is, how wed are we to an alloca based approach?  Can we 
replace the non-critical uses by std::string (diagnostics, 
BLAH_check_failed(), building a string to manipulate the output of 
env("SOME_VAR"), etc?).

Then for things like the GCC driver wanting to create a vector of passed 
commands, we could use auto_vec<> (*gasp*, using a vector to represent a 
vector ;-)).  Note that auto_vec<> uses malloc (or GC) but creating a 
malloc'd vector at startup is hardly on the fast path.

For the remaining things we could use either alloca with a malloc 
fallback for trivial things like set_user_assembler_name() that only 
have one exit point, or take it on a case by case basis.

But it seems to me that we can use an STL container for the non critical 
things (which are the majority of them), and something else (perhaps an 
RAII thinggie TBD later).

Is this reasonable?  I'd like to know before I spend any time converting 
anything to std::string and auto_vec<>.

Thanks for everyone's input.
Aldy

p.s. Yeah yeah, both std::string and auto_vec<> malloc stuff and they're 
bloated, but do we care on non-critical things?  We do get cleaner code 
as a result...
Aldy Hernandez Aug. 5, 2016, 2:42 p.m. UTC | #12
> I was surprised by the always_inline trick.  I suppose it makes
> sense but I wouldn't have expected to be able to rely on it.  Out
> of curiosity I tested it with other compilers.  It turns out that
> it only works with some like GCC and IBM XLC++, but not others
> like Clang or Sun CC.  In recursive calls, they don't seem to
> hold on to the memory allocated via alloca by the class ctor in
> the caller.

Well, it was surprising to me as well, hence the comment.  I expected it 
to just work, and when it didn't I had to hunt in the inliner code to 
find out why it was selectively inlining:

     case GIMPLE_CALL:
       /* Refuse to inline alloca call unless user explicitly forced so as
	 this may change program's memory overhead drastically when the
	 function using alloca is called in loop.  In GCC present in
	 SPEC2000 inlining into schedule_block cause it to require 2GB of
	 RAM instead of 256MB.  Don't do so for alloca calls emitted for
	 VLA objects as those can't cause unbounded growth (they're always
	 wrapped inside stack_save/stack_restore regions.  */

As Richi pointed out, if the constructor doesn't get inlined (as you're 
seeing in Clang and Sun CC), we could potentially clobber freed memory. 
So much for that idea...

>
> FWIW, rather than creating a new class around alloca and putting
> effort into changing code to use it I think that investing time
> into replacing the function's uses with an existing C++ container
> class (either GCC's own or one from the STL) might be a better
> way to go.

Yes, I suggested a combination of auto_vec<> (gcc's) and std::string 
down thread.

Thanks for checking out the result from other compilers.  Much appreciated.

Aldy
Pedro Alves Aug. 5, 2016, 3:15 p.m. UTC | #13
On 08/05/2016 03:37 PM, Aldy Hernandez wrote:

> My question is, how wed are we to an alloca based approach?  Can we
> replace the non-critical uses by std::string (diagnostics,
> BLAH_check_failed(), building a string to manipulate the output of
> env("SOME_VAR"), etc?).

My suggestion too.

(I hope you'll pardon my sticking my nose into this.  I'm hoping that
down the road gcc and gdb/binutils end up finding a common ground
for basic C++ utilities, and grow closer together.  And you did say
you're fixing gdb's alloca problems as well.  :-) )

> Then for things like the GCC driver wanting to create a vector of passed
> commands, we could use auto_vec<> (*gasp*, using a vector to represent a
> vector ;-)).  Note that auto_vec<> uses malloc (or GC) but creating a
> malloc'd vector at startup is hardly on the fast path.

Note that that's why I specifically suggested auto_vec<char, MAX_ALLOCA_SIZE>
for critical-path things.

The second template parameter of auto_vec specifies the size of the
internal storage:

 /* auto_vec is a subclass of vec that automatically manages creating and
    releasing the internal vector. If N is non zero then it has N elements of
    internal storage.  The default is no internal storage, and you probably only
    want to ask for internal storage for vectors on the stack because if the
    size of the vector is larger than the internal storage that space is wasted.
    */
 template<typename T, size_t N = 0>
 class auto_vec : public vec<T, va_heap>
 {
 ...

So for fast-path cases where you know offhand the usual scenario is "less
than N", you could use an auto_vec<sometype, N> variable on the stack.  This
does not malloc unless you end up needing more than N.  It actually ends up
being quite similar to your protected_alloca.  E.g.,

yours:

 protected_alloca chunk;
 str = (char *) chunk.pointer (); 

vs auto_vec:

 typedef auto_vec<char, MAX_ALLOCA_SIZE> protected_alloca;

 protected_alloca chunk;
 str = chunk.address (); 

(Of course a similar approach can be taken with other data
structures, stack -> set/map/hash, etc., but auto_vec is there
already, and vectors should be the default container.)
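To make the parallel concrete, here is a minimal standalone analogue of that internal-storage idea (a hedged sketch, not GCC's actual vec.h -- the real auto_vec has a different layout and API): the first N elements live inside the object itself, so nothing touches the heap unless the container outgrows N.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Sketch of auto_vec<T, N>-style internal storage: stack-resident up to
// N elements, falling back to malloc beyond that.  POD element types
// only, since growth uses memcpy.
template<typename T, size_t N>
class stack_buf
{
public:
  stack_buf () : m_data (m_inline), m_len (0), m_cap (N) {}
  ~stack_buf () { if (m_data != m_inline) free (m_data); }

  void push (const T &v)
  {
    if (m_len == m_cap)
      grow ();
    m_data[m_len++] = v;
  }

  T *address () { return m_data; }
  size_t length () const { return m_len; }
  bool on_stack () const { return m_data == m_inline; }

private:
  void grow ()
  {
    // Spill to (or enlarge) a heap buffer; the inline storage is
    // simply abandoned once we outgrow it.
    size_t new_cap = m_cap * 2;
    T *p = (T *) malloc (new_cap * sizeof (T));
    memcpy (p, m_data, m_len * sizeof (T));
    if (m_data != m_inline)
      free (m_data);
    m_data = p;
    m_cap = new_cap;
  }

  T m_inline[N];
  T *m_data;
  size_t m_len;
  size_t m_cap;
};
```

As with auto_vec<char, MAX_ALLOCA_SIZE>, the common small case never calls malloc, and the destructor handles the rare large case at every exit point.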

> 
> For the remaining things we could use either alloca with a malloc
> fallback for trivial things like set_user_assembler_name() that only
> have one exit point, or take it on a case by case basis.

See above.

Thanks,
Pedro Alves

> 
> But it seems to me that we can use an STL container for the non critical
> things (which are the majority of them), and something else (perhaps an
> RAII thinggie TBD later).
> 
> Is this reasonable?  I'd like to know before I spend any time converting
> anything to std::string and auto_vec<>.
> 
> Thanks for everyone's input.
> Aldy
> 
> p.s. Yeah yeah, both std::string and auto_vec<> malloc stuff and they're
> bloated, but do we care on non-critical things?  We do get cleaner code
> as a result...
Jeff Law Aug. 5, 2016, 4:23 p.m. UTC | #14
On 08/05/2016 08:37 AM, Aldy Hernandez wrote:
>
> After looking at all this code, I realize no size will fit all if we
> want something clean.  So, I take my approach back.  I think we're best
> served by a couple different approaches, depending on usage.
Yup.

>
> My question is, how wed are we to an alloca based approach?  Can we
> replace the non-critical uses by std::string (diagnostics,
> BLAH_check_failed(), building a string to manipulate the output of
> env("SOME_VAR"), etc?).
I don't think we are at all wed to alloca.

>
> Then for things like the GCC driver wanting to create a vector of passed
> commands, we could use auto_vec<> (*gasp*, using a vector to represent a
> vector ;-)).  Note that auto_vec<> uses malloc (or GC) but creating a
> malloc'd vector at startup is hardly on the fast path.
>
> For the remaining things we could use either alloca with a malloc
> fallback for trivial things like set_user_assembler_name() that only
> have one exit point, or take it on a case by case basis.
>
> But it seems to me that we can use an STL container for the non critical
> things (which are the majority of them), and something else (perhaps an
> RAII thinggie TBD later).
>
> Is this reasonable?  I'd like to know before I spend any time converting
> anything to std::string and auto_vec<>.
Yes, this is all reasonable to me.  I'm a big believer in moving towards 
standard library implementations of things.  In this case we get to 
remove the allocas *and* make the code easier to grok for those that are 
familiar with the C++ standard library.

Jeff
Richard Biener Aug. 5, 2016, 5:48 p.m. UTC | #15
On August 5, 2016 6:23:27 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 08/05/2016 08:37 AM, Aldy Hernandez wrote:
>>
>> After looking at all this code, I realize no size will fit all if we
>> want something clean.  So, I take my approach back.  I think we're
>best
>> served by a couple different approaches, depending on usage.
>Yup.
>
>>
>> My question is, how wed are we to an alloca based approach?  Can we
>> replace the non-critical uses by std::string (diagnostics,
>> BLAH_check_failed(), building a string to manipulate the output of
>> env("SOME_VAR"), etc?).
>I don't think we are at all wed to alloca.
>
>>
>> Then for things like the GCC driver wanting to create a vector of
>passed
>> commands, we could use auto_vec<> (*gasp*, using a vector to
>represent a
>> vector ;-)).  Note that auto_vec<> uses malloc (or GC) but creating a
>> malloc'd vector at startup is hardly on the fast path.
>>
>> For the remaining things we could use either alloca with a malloc
>> fallback for trivial things like set_user_assembler_name() that only
>> have one exit point, or take it on a case by case basis.
>>
>> But it seems to me that we can use an STL container for the non
>critical
>> things (which are the majority of them), and something else (perhaps
>an
>> RAII thinggie TBD later).
>>
>> Is this reasonable?  I'd like to know before I spend any time
>converting
>> anything to std::string and auto_vec<>.
>Yes, this is all reasonable to me.  I'm a big believer in moving
>towards 
>standard library implementations of things.  In this case we get to 
>remove the allocas *and* make the code easier to grok for those that
>are 
>familiar with the C++ standard library.

But at the same time please consider that each malloc costs 1000x more
than an alloca.  So if you replace one alloca that is executed once
during compile, fine.  If you replace one that is done for each symbol
(the get_identifier case) it is not.

Richard.

>Jeff
Richard Biener Aug. 5, 2016, 5:55 p.m. UTC | #16
On August 5, 2016 4:42:48 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote:
>
>> I was surprised by the always_inline trick.  I suppose it makes
>> sense but I wouldn't have expected to be able to rely on it.  Out
>> of curiosity I tested it with other compilers.  It turns out that
>> it only works with some like GCC and IBM XLC++, but not others
>> like Clang or Sun CC.  In recursive calls, they don't seem to
>> hold on to the memory allocated via alloca by the class ctor in
>> the caller.
>
>Well, it was surprising to me as well, hence the comment.  I expected
>it 
>to just work, and when it didn't I had to hunt in the inliner code to 
>find out why it was selectively inlining:
>
>     case GIMPLE_CALL:
>    /* Refuse to inline alloca call unless user explicitly forced so as
>	 this may change program's memory overhead drastically when the
>	 function using alloca is called in loop.  In GCC present in
>	 SPEC2000 inlining into schedule_block cause it to require 2GB of
>	 RAM instead of 256MB.  Don't do so for alloca calls emitted for
>	 VLA objects as those can't cause unbounded growth (they're always
>	 wrapped inside stack_save/stack_restore regions.  */
>
>As Richi pointed out, if the constructor doesn't get inlined (as you're
>
>seeing in Clang and Sun CC), we could potentially clobber freed memory.
>
>  So much for that idea...
>
>>
>> FWIW, rather than creating a new class around alloca and putting
>> effort into changing code to use it I think that investing time
>> into replacing the function's uses with an existing C++ container
>> class (either GCC's own or one from the STL) might be a better
>> way to go.
>
>Yes, I suggested a combination of auto_vec<> (gcc's) and std::string 
>down thread.

Please don't use std::string.  For string building you can use obstacks.

Richard.

>Thanks for checking out the result from other compilers.  Much
>appreciated.
>
>Aldy
Oleg Endo Aug. 5, 2016, 6:15 p.m. UTC | #17
On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote:

> 
> Please don't use std::string.  For string building you can use
> obstacks.
> 

Just out of curiosity ... why?  I remember there was some discussion
about it, what was the conclusion?  Is that now a general rule or does
it depend on the context where strings are used?

Cheers,
Oleg
Richard Biener Aug. 5, 2016, 8:07 p.m. UTC | #18
On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo <oleg.endo@t-online.de> wrote:
>On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote:
>
>> 
>> Please don't use std::string.  For string building you can use
>> obstacks.
>> 
>
>Just out of curiosity ... why?  I remember there was some discussion
>about it, what was the conclusion?  Is that now a general rule or does
>it depend on the context where strings are used?

Because you make a messy mix of string handling variants.  Std::string
is not powerful enough to capture all uses, it is vastly more expensive
to embed into structs and it pulls in too much headers.
(Oh, and I hate I/o streams even more)

Richard.

>Cheers,
>Oleg
Aldy Hernandez Aug. 6, 2016, 10:09 a.m. UTC | #19
On 08/05/2016 04:07 PM, Richard Biener wrote:
> On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo <oleg.endo@t-online.de> wrote:
>> On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote:
>>
>>>
>>> Please don't use std::string.  For string building you can use
>>> obstacks.
>>>
>>
>> Just out of curiosity ... why?  I remember there was some discussion
>> about it, what was the conclusion?  Is that now a general rule or does
>> it depend on the context where strings are used?
>
> Because you make a messy mix of string handling variants.  Std::string is not powerful enough to capture all uses, it is vastly more expensive to embed into structs and it pulls in too much headers.

Is this negotiable?  I would prefer to use the standard library 
implementation of things when possible.  At the very least, it's easier 
for others familiar with the C++ STL.  I mean, we are using C++ after 
all :).

I understand a reluctance in cases where it would be very inefficient, 
or where std::string is not up to speed.  In such cases it makes sense 
to use obstacks or equivalents, but in straightforward things on a non 
critical path?

I would gladly do either, but I would strongly prefer std::string when 
it does not overly pessimize code.

 > (Oh, and I hate I/o streams even more)

Yeah, but that sounds like a personal preference? ;-).

Let me know.  I'll do either if it's an agreed upon mandate from the 
global deities :).

Aldy
Aldy Hernandez Aug. 6, 2016, 10:15 a.m. UTC | #20
On 08/05/2016 04:07 PM, Richard Biener wrote:
> On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo <oleg.endo@t-online.de> wrote:
>> On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote:
>>
>>>
>>> Please don't use std::string.  For string building you can use
>>> obstacks.
>>>
>>
>> Just out of curiosity ... why?  I remember there was some discussion
>> about it, what was the conclusion?  Is that now a general rule or does
>> it depend on the context where strings are used?
>
> Because you make a messy mix of string handling variants.  Std::string is not powerful enough to capture all uses, it is vastly more expensive to embed into structs and it pulls in too much headers.
> (Oh, and I hate I/o streams even more)

Oh, and there is prior use in ipa-chkp.c, although I suppose it could've 
crept in:

   std::string s;

   /* called_as_built_in checks DECL_NAME to identify calls to
      builtins.  We want instrumented calls to builtins to be
      recognized by called_as_built_in.  Therefore use original
      DECL_NAME for cloning with no prefixes.  */
   s = IDENTIFIER_POINTER (DECL_NAME (fndecl));
   s += ".chkp";
   DECL_NAME (new_decl) = get_identifier (s.c_str ());

You can't tell me obstacks are easier on the eyes for this ;-).

Aldy
Richard Biener Aug. 6, 2016, 3:08 p.m. UTC | #21
On August 6, 2016 12:15:26 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote:
>On 08/05/2016 04:07 PM, Richard Biener wrote:
>> On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo
><oleg.endo@t-online.de> wrote:
>>> On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote:
>>>
>>>>
>>>> Please don't use std::string.  For string building you can use
>>>> obstacks.
>>>>
>>>
>>> Just out of curiosity ... why?  I remember there was some discussion
>>> about it, what was the conclusion?  Is that now a general rule or
>does
>>> it depend on the context where strings are used?
>>
>> Because you make a messy mix of string handling variants. 
>Std::string is not powerful enough to capture all uses, it is vastly
>more expensive to embed into structs and it pulls in too much headers.
>> (Oh, and I hate I/o streams even more)
>
>Oh, and there is prior use in ipa-chkp.c, although I suppose it
>could've 
>crept in:

Definitely.

>
>   std::string s;
>
>   /* called_as_built_in checks DECL_NAME to identify calls to
>      builtins.  We want instrumented calls to builtins to be
>      recognized by called_as_built_in.  Therefore use original
>      DECL_NAME for cloning with no prefixes.  */
>   s = IDENTIFIER_POINTER (DECL_NAME (fndecl));
>   s += ".chkp";
>   DECL_NAME (new_decl) = get_identifier (s.c_str ());
>
>You can't tell me obstacks are easier on the eyes for this ;-).

Even strcat is shorter and cheaper.
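For comparison, a plain-C sketch of the same ".chkp" suffixing (the
helper name is illustrative, not GCC's actual code; 'name' stands in
for IDENTIFIER_POINTER (DECL_NAME (fndecl)), and libiberty's concat ()
would also do the job):

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Build "<name>.chkp" into one malloc'd buffer with C string routines.
// The caller frees the result once get_identifier has copied it.
static char *
chkp_clone_name (const char *name)
{
  size_t len = strlen (name);
  char *s = (char *) malloc (len + strlen (".chkp") + 1);
  memcpy (s, name, len);
  strcpy (s + len, ".chkp");
  return s;
}
```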

Richard.

>Aldy
Jeff Law Aug. 8, 2016, 5 p.m. UTC | #22
On 08/06/2016 09:08 AM, Richard Biener wrote:
> On August 6, 2016 12:15:26 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote:
>> On 08/05/2016 04:07 PM, Richard Biener wrote:
>>> On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo
>> <oleg.endo@t-online.de> wrote:
>>>> On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote:
>>>>
>>>>>
>>>>> Please don't use std::string.  For string building you can use
>>>>> obstacks.
>>>>>
>>>>
>>>> Just out of curiosity ... why?  I remember there was some discussion
>>>> about it, what was the conclusion?  Is that now a general rule or
>> does
>>>> it depend on the context where strings are used?
>>>
>>> Because you make a messy mix of string handling variants.
>> Std::string is not powerful enough to capture all uses, it is vastly
>> more expensive to embed into structs and it pulls in too much headers.
>>> (Oh, and I hate I/o streams even more)
>>
>> Oh, and there is prior use in ipa-chkp.c, although I suppose it
>> could've
>> crept in:
>
> Definitely.
>
>>
>>   std::string s;
>>
>>   /* called_as_built_in checks DECL_NAME to identify calls to
>>      builtins.  We want instrumented calls to builtins to be
>>      recognized by called_as_built_in.  Therefore use original
>>      DECL_NAME for cloning with no prefixes.  */
>>   s = IDENTIFIER_POINTER (DECL_NAME (fndecl));
>>   s += ".chkp";
>>   DECL_NAME (new_decl) = get_identifier (s.c_str ());
>>
>> You can't tell me obstacks are easier on the eyes for this ;-).
>
> Even strcat is shorter and cheaper.
ISTM that unless the code is performance critical we should be writing 
code that is easy to understand and hard to get wrong.

Thus when we have something that is non-critical and can be handled by 
std:: routines we should be using them.

If performance is important in a particular piece of code an obstack or 
explicit malloc/free seems better.

Can we agree on those as guiding principles and thus boil the argument 
down to whether or not a particular piece of code is performance critical?

jeff
Trevor Saunders Aug. 8, 2016, 5:39 p.m. UTC | #23
On Mon, Aug 08, 2016 at 11:00:28AM -0600, Jeff Law wrote:
> On 08/06/2016 09:08 AM, Richard Biener wrote:
> > On August 6, 2016 12:15:26 PM GMT+02:00, Aldy Hernandez <aldyh@redhat.com> wrote:
> > > On 08/05/2016 04:07 PM, Richard Biener wrote:
> > > > On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo
> > > <oleg.endo@t-online.de> wrote:
> > > > > On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote:
> > > > > 
> > > > > > 
> > > > > > Please don't use std::string.  For string building you can use
> > > > > > obstacks.
> > > > > > 
> > > > > 
> > > > > Just out of curiosity ... why?  I remember there was some discussion
> > > > > about it, what was the conclusion?  Is that now a general rule or
> > > does
> > > > > it depend on the context where strings are used?
> > > > 
> > > > Because you make a messy mix of string handling variants.
> > > Std::string is not powerful enough to capture all uses, it is vastly
> > > more expensive to embed into structs and it pulls in too much headers.
> > > > (Oh, and I hate I/o streams even more)
> > > 
> > > Oh, and there is prior use in ipa-chkp.c, although I suppose it
> > > could've
> > > crept in:
> > 
> > Definitely.
> > 
> > > 
> > >   std::string s;
> > > 
> > >   /* called_as_built_in checks DECL_NAME to identify calls to
> > >      builtins.  We want instrumented calls to builtins to be
> > >      recognized by called_as_built_in.  Therefore use original
> > >      DECL_NAME for cloning with no prefixes.  */
> > >   s = IDENTIFIER_POINTER (DECL_NAME (fndecl));
> > >   s += ".chkp";
> > >   DECL_NAME (new_decl) = get_identifier (s.c_str ());
> > > 
> > > You can't tell me obstacks are easier on the eyes for this ;-).
> > 
> > Even strcat is shorter and cheaper.
> ISTM that unless the code is performance critical we should be writing code
> that is easy to understand and hard to get wrong.

it seems hard to disagree with that ;)

> Thus when we have something that is non-critical and can be handled by std::
> routines we should be using them.
> 
> If performance is important in a particular piece of code an obstack or
> explicit malloc/free seems better.

I'm not totally convinced we have to trade off performance against a
good interface.  It seems to me it wouldn't be that complicated to build
a string class on top of auto_vec that has a similar interface to
std::string.

I'd really rather not have to write a string class of our own, but I can
see some significant advantages to it.

First sizeof std::string is 32 on x86_64, a char *, a size_t for the
length, and a 16 byte union of a size_t for allocated size and a 16 byte
buffer for short strings.  I suppose some of this is required by the C++
standard, but I doubt we really need to care about strings longer than
2^32 in gcc.  If we put the length and allocated size in the buffer, and
have a separate class for stack allocated buffers I think we can have a
string that is just sizeof void *.
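A hypothetical sketch of that layout (names invented here, nothing like
this exists in GCC): the length and capacity live in the heap block
itself, so the string object is exactly one pointer wide.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Heap representation: 32-bit length and capacity precede the bytes.
struct str_rep
{
  uint32_t len;
  uint32_t cap;
  char buf[1];  // really 'cap + 1' bytes, allocated inline
};

// Pointer-sized string handle.  Copying is omitted for brevity; a real
// class would need a copy policy to avoid double-free.
class small_string
{
public:
  explicit small_string (const char *s)
  {
    uint32_t n = (uint32_t) strlen (s);
    m_rep = (str_rep *) malloc (sizeof (str_rep) + n);
    m_rep->len = m_rep->cap = n;
    memcpy (m_rep->buf, s, n + 1);
  }
  ~small_string () { free (m_rep); }

  const char *c_str () const { return m_rep->buf; }
  uint32_t length () const { return m_rep->len; }

private:
  str_rep *m_rep;  // sizeof (small_string) == sizeof (void *)
};
```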

Second it would be useful performance wise to have a std::string_view
type class, but that is c++14 or 17? only so we'd need to import it into
gcc or something.

Trev
Richard Biener Aug. 8, 2016, 7:03 p.m. UTC | #24
On August 8, 2016 7:39:39 PM GMT+02:00, Trevor Saunders <tbsaunde@tbsaunde.org> wrote:
>On Mon, Aug 08, 2016 at 11:00:28AM -0600, Jeff Law wrote:
>> On 08/06/2016 09:08 AM, Richard Biener wrote:
>> > On August 6, 2016 12:15:26 PM GMT+02:00, Aldy Hernandez
><aldyh@redhat.com> wrote:
>> > > On 08/05/2016 04:07 PM, Richard Biener wrote:
>> > > > On August 5, 2016 8:15:54 PM GMT+02:00, Oleg Endo
>> > > <oleg.endo@t-online.de> wrote:
>> > > > > On Fri, 2016-08-05 at 19:55 +0200, Richard Biener wrote:
>> > > > > 
>> > > > > > 
>> > > > > > Please don't use std::string.  For string building you can
>use
>> > > > > > obstacks.
>> > > > > > 
>> > > > > 
>> > > > > Just out of curiosity ... why?  I remember there was some
>discussion
>> > > > > about it, what was the conclusion?  Is that now a general
>rule or
>> > > does
>> > > > > it depend on the context where strings are used?
>> > > > 
>> > > > Because you make a messy mix of string handling variants.
>> > > Std::string is not powerful enough to capture all uses, it is
>vastly
>> > > more expensive to embed into structs and it pulls in too much
>headers.
>> > > > (Oh, and I hate I/o streams even more)
>> > > 
>> > > Oh, and there is prior use in ipa-chkp.c, although I suppose it
>> > > could've
>> > > crept in:
>> > 
>> > Definitely.
>> > 
>> > > 
>> > >   std::string s;
>> > > 
>> > >   /* called_as_built_in checks DECL_NAME to identify calls to
>> > >      builtins.  We want instrumented calls to builtins to be
>> > >      recognized by called_as_built_in.  Therefore use original
>> > >      DECL_NAME for cloning with no prefixes.  */
>> > >   s = IDENTIFIER_POINTER (DECL_NAME (fndecl));
>> > >   s += ".chkp";
>> > >   DECL_NAME (new_decl) = get_identifier (s.c_str ());
>> > > 
>> > > You can't tell me obstacks are easier on the eyes for this ;-).
>> > 
>> > Even strcat is shorter and cheaper.
>> ISTM that unless the code is performance critical we should be
>writing code
>> that is easy to understand and hard to get wrong.
>
>it seems hard to disagree with that ;)

s.c_str () is _not_ an easy-to-grasp thing.  Unless interfaces in GCC
generally use C strings I object to using C++ strings for anything.  I
don't see a single upside and only downsides.

Richard.

>> Thus when we have something that is non-critical and can be handled
>by std::
>> routines we should be using them.
>> 
>> If performance is important in a particular piece of code an obstack
>or
>> explicit malloc/free seems better.
>
>I'm not totally convinced we have to trade off performance against a
>good interface.  It seems to me it wouldn't be that complicated to
>build
>a string class on top of auto_vec that has a similar interface to
>std::string.
>
>I'd really rather not have to write a string class of our own, but I
>can
>see some significant advantages to it.
>
>First sizeof std::string is 32 on x86_64, a char *, a size_t for the
>length, and a 16 byte union of a size_t for allocated size and a 16
>byte
>buffer for short strings.  I suppose some of this is required by the
>C++
>standard, but I doubt we really need to care about strings longer than
>2^32 in gcc.  If we put the length and allocated size in the buffer,
>and
>have a separate class for stack allocated buffers I think we can have a
>string that is just sizeof void *.
>
>Second it would be useful performance wise to have a std::string_view
>type class, but that is c++14 or 17? only so we'd need to import it
>into
>gcc or something.
>
>Trev
Oleg Endo Aug. 9, 2016, 11:33 a.m. UTC | #25
On Mon, 2016-08-08 at 13:39 -0400, Trevor Saunders wrote:


> First sizeof std::string is 32 on x86_64, a char *, a size_t for the
> length, and a 16 byte union of a size_t for allocated size and a 16 
> byte buffer for short strings.  I suppose some of this is required by 
> the C++ standard,

I recommend checking what others have figured regarding that matter
https://github.com/elliotgoodrich/SSO-23


>  but I doubt we really need to care about strings longer than 2^32 in
> gcc.  If we put the length and allocated size in the buffer,
> and have a separate class for stack allocated buffers I think we can 
> have a string that is just sizeof void *.

This idea has one flaw in that it does not allow having objects by
value.  It essentially *requires* one to *always* allocate them on the
heap via new/delete.  Being able to store objects by value is useful in
many situations.  So if you do the above, the next logical step that
will follow is a smart-pointer-like wrapper that allows value
semantics.  Because eventually somebody will want that operator == or
operator < e.g. for associative containers.

> 
> Second it would be useful performance wise to have a std::string_view
> type class, but that is c++14 or 17? only so we'd need to import it 
> into gcc or something.

http://en.cppreference.com/w/cpp/experimental/basic_string_view
http://en.cppreference.com/w/cpp/string/basic_string_view

It's "funny" that GCC contains a C++ stdlib implementation and so
little is actually used by GCC itself.

Cheers,
Oleg
Aldy Hernandez Aug. 9, 2016, 1:17 p.m. UTC | #26
On 08/05/2016 01:55 PM, Richard Biener wrote:

Hi Richard.

> Please don't use std::string.  For string building you can use obstacks.

Alright let's talk details then so I can write things up in a way you 
approve of.

Take for instance simple uses like all the tree_*check_failed routines, 
which I thought were great candidates for std::string-- they're going to 
be outputted to the screen or disk which is clearly many times more 
expensive than the malloc or overhead of std::string:

       length += strlen ("expected ");
       buffer = tmp = (char *) alloca (length);
       length = 0;
       while ((code = (enum tree_code) va_arg (args, int)))
	{
	  const char *prefix = length ? " or " : "expected ";

	  strcpy (tmp + length, prefix);
	  length += strlen (prefix);
	  strcpy (tmp + length, get_tree_code_name (code));
	  length += strlen (get_tree_code_name (code));
	}

Do you suggest using obstacks here, or did you have something else in mind?

How about if it's something like the above but there are multiple exit 
points from the function.  I mean, we still need to call obstack_free() 
on all exit points, which is what I wanted to avoid for something as 
simple as building a throw-away string.
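For reference, a hedged sketch of what the obstack variant of the loop
above might look like (illustrative names, not GCC's actual tree.c
code).  The caller owns the obstack, so a single obstack_free releases
everything at whichever exit point is reached:

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <obstack.h>

#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

// Build "expected A or B or ..." in the caller's obstack and return
// the finished, NUL-terminated object.
static char *
build_expected (const char **codes, int n, struct obstack *ob)
{
  for (int i = 0; i < n; i++)
    {
      const char *prefix = i ? " or " : "expected ";
      obstack_grow (ob, prefix, strlen (prefix));
      obstack_grow (ob, codes[i], strlen (codes[i]));
    }
  obstack_1grow (ob, '\0');
  return (char *) obstack_finish (ob);
}
```

Usage would be obstack_init (&ob) up front and obstack_free (&ob, NULL)
once at cleanup, rather than a free on every return path.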

Aldy
Bernd Schmidt Aug. 9, 2016, 1:21 p.m. UTC | #27
On 08/09/2016 03:17 PM, Aldy Hernandez wrote:
> On 08/05/2016 01:55 PM, Richard Biener wrote:
>
> Hi Richard.
>
>> Please don't use std::string.  For string building you can use obstacks.
>
> Alright let's talk details then so I can write things up in a way you
> approve of.
>
> Take for instance simple uses like all the tree_*check_failed routines,
> which I thought were great candidates for std::string-- they're going to
> be outputted to the screen or disk which is clearly many times more
> expensive than the malloc or overhead of std::string:
>
>       length += strlen ("expected ");
>       buffer = tmp = (char *) alloca (length);
>       length = 0;
>       while ((code = (enum tree_code) va_arg (args, int)))
>     {
>       const char *prefix = length ? " or " : "expected ";
>
>       strcpy (tmp + length, prefix);
>       length += strlen (prefix);
>       strcpy (tmp + length, get_tree_code_name (code));
>       length += strlen (get_tree_code_name (code));
>     }
>
> Do you suggest using obstacks here, or did you have something else in mind?
>
> How about if it's something like the above but there are multiple exit
> points from the function.  I mean, we still need to call obstack_free()
> on all exit points, which is what I wanted to avoid for something as
> simple as building a throw-away string.

I think that given our choice to switch to C++, we should use the 
language in idiomatic ways, and the standard library is part of that. 
Otherwise the language switch makes even less sense than it did in the 
first place.


Bernd
Trevor Saunders Aug. 9, 2016, 5:41 p.m. UTC | #28
On Tue, Aug 09, 2016 at 08:33:49PM +0900, Oleg Endo wrote:
> On Mon, 2016-08-08 at 13:39 -0400, Trevor Saunders wrote:
> 
> 
> > First sizeof std::string is 32 on x86_64, a char *, a size_t for the
> > length, and a 16 byte union of a size_t for allocated size and a 16 
> > byte buffer for short strings.  I suppose some of this is required by 
> > the C++ standard,
> 
> I recommend checking what others have figured regarding that matter
> https://github.com/elliotgoodrich/SSO-23

 I think one of the big mistakes in the C++ stl strings is that they
 try to shove two use cases into the same type: stack allocated strings
 where you want an internal buffer for SSO, and a class to put in
 structs.  imho it would be better to have std::string for the second
 use case, and a separate std::stack_string or something for the first
 case.

> >  but I doubt we really need to care about strings longer than 2^32 in
> > gcc.  If we put the length and allocated size in the buffer,
> > and have a separate class for stack allocated buffers I think we can 
> > have a string that is just sizeof void *.
> 
> This idea has one flaw in that it does not allow having objects by
> value.  It essentially *requires* one to *always* allocate them on the
> heap via new/delete.  Being able to store objects by value is useful in
> many situations.  So if you do the above, the next logical step that
> will follow is a smart-pointer-like wrapper that allows value
> semantics.  Because eventually somebody will want that operator == or
> operator < e.g. for associative containers.

 If what you want is the ability to put the buffer on the stack instead
 of the heap then I think a stack_string class that interoperates with
 your string class is the thing you want.

 I don't really see anything wrong with a string class being a really
 fancy smart pointer that has a bunch of useful string stuff on it.  As
 for operator == I'd be fairly ok with that, other than it hiding a O(N)
 operation in ==.

> > 
> > Second it would be useful performance wise to have a std::string_view
> > type class, but that is c++14 or 17? only so we'd need to import it 
> > into gcc or something.
> 
> http://en.cppreference.com/w/cpp/experimental/basic_string_view
> http://en.cppreference.com/w/cpp/string/basic_string_view
> 
> It's "funny" that GCC contains a C++ stdlib implementation and so
> little is actually used by GCC itself.

Regretably necessary sure, but I'm not sure its funny.  The first big
problem with using the stl is that the subset available in C++98 isn't
that great, you could maybe hack up libstdc++ so that you can use newer
types just without the bits that use newer language features, but that
would take some work.  The other big problem is that the stl is often
too general, and tries to be too simple.  std::string is actually a good
example of both of those problems.  There's really no reason it should
use size_t instead of uint32_t for the string length / capacity.  It
would also be a lot better if it had separate types for strings where
you want an internal buffer or don't.

Trev
Richard Biener Aug. 10, 2016, 10:04 a.m. UTC | #29
On Tue, Aug 9, 2016 at 3:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
> On 08/05/2016 01:55 PM, Richard Biener wrote:
>
> Hi Richard.
>
>> Please don't use std::string.  For string building you can use obstacks.
>
>
> Alright let's talk details then so I can write things up in a way you
> approve of.
>
> Take for instance simple uses like all the tree_*check_failed routines,
> which I thought were great candidates for std::string-- they're going to be
> outputted to the screen or disk which is clearly many times more expensive
> than the malloc or overhead of std::string:
>
>       length += strlen ("expected ");
>       buffer = tmp = (char *) alloca (length);
>       length = 0;
>       while ((code = (enum tree_code) va_arg (args, int)))
>         {
>           const char *prefix = length ? " or " : "expected ";
>
>           strcpy (tmp + length, prefix);
>           length += strlen (prefix);
>           strcpy (tmp + length, get_tree_code_name (code));
>           length += strlen (get_tree_code_name (code));
>         }
>
> Do you suggest using obstacks here, or did you have something else in mind?

Why would you want to get rid of the alloca here?

> How about if it's something like the above but there are multiple exit
> points from the function.  I mean, we still need to call obstack_free() on
> all exit points, which is what I wanted to avoid for something as simple as
> building a throw-away string.

But you'll end up in

  internal_error ("tree check: %s, have %s in %s, at %s:%d",
                  buffer, get_tree_code_name (TREE_CODE (node)),
                  function, trim_filename (file), line);

where you have an interface using C strings again.

It's not that the standard library is bad per-se, it's just using a
tool that doesn't
fit its uses.  This makes the code a messy mix of two concepts which is what
I object to.

Yes, the above example may have premature optimization applied.  Is that
a reason to re-write it using std::string?  No.  Is it a reason to rewrite it
using obstacks?  No.  Just leave it alone.

Richard.

>
> Aldy
Oleg Endo Aug. 10, 2016, 5:03 p.m. UTC | #30
On Tue, 2016-08-09 at 13:41 -0400, Trevor Saunders wrote:

> If what you want is the ability to put the buffer on the stack
> instead  of the heap then I think a stack_string class that
> interoperates with  your string class is the thing you want.

I'd avoid a separate stack_string class.  Instead use a linear
allocator with storage allocated on the stack, e.g.:

typedef std::basic_string<char, std::char_traits<char>,
                          linear_allocator<char> >
  my_tmp_string;

void bleh (int foo)
{
  linear_allocator<char> buffer ((char *) alloca (1024), 1024);

  my_tmp_string aaa (buffer);
  aaa += "test";
  aaa += "moon";
  ...

  // what happens when "buffer" runs out of space?
  // throw?  switch to heap alloc? ...
}


> 
>  I don't really see anything wrong with a string class being a really
>  fancy smart pointer that has a bunch of useful string stuff on it. 


That's what std::string basically is?


>  As for operator == I'd be fairly ok with that, other than it hiding 
> a O(N) operation in ==.

Yes, it makes it more difficult to carry out algorithm complexity
analysis using grep...


> Regretably necessary sure, but I'm not sure its funny.

Hence the quotes.

> The first big problem with using the stl is that the subset available 
> in C++98 isn't that great, you could maybe hack up libstdc++ so that 
> you can use newer types just without the bits that use newer language 
> features, but that would take some work.

Or just wait until people have agreed to switch to C++11 or C++14.  I
don't think in practice anybody uses a C++11-incapable GCC to build a
newer GCC these days.

> The other big problem is that the stl is often too general, and tries 
> to be too simple.  std::string is actually a good example of both of 
> those problems.  There's really no reason it should use size_t 
> instead of uint32_t for the string length / capacity.

Yes, some of the things in the STL are collections of compromises for
general-purpose usage.  If you're doing something extra fancy, most
likely you can outperform the generic STL implementation.  But very
often it's actually not needed.


>   It would also be a lot better if it had separate types for strings 
> where you want an internal buffer or don't.

Using custom allocators is one way.  For example ...

template <unsigned int BufferSize> struct linear_allocator_with_buffer;

template <unsigned int BufferSize>
using stack_string =
  std::basic_string<char, std::char_traits<char>,
                    linear_allocator_with_buffer<BufferSize>>;

But then ...

stack_string<32> a = "a";
stack_string<64> b = "b";

b += a;

... will not work because they are different types.

One drawback of using custom allocators for that kind of thing is being
unable to pass the above stack_string to a function that takes a "const
std::string&" because they differ in template parameters.  But that's
where std::string_view comes in.


Cheers,
Oleg
Jeff Law Aug. 10, 2016, 6 p.m. UTC | #31
On 08/10/2016 04:04 AM, Richard Biener wrote:
> On Tue, Aug 9, 2016 at 3:17 PM, Aldy Hernandez <aldyh@redhat.com> wrote:
>> On 08/05/2016 01:55 PM, Richard Biener wrote:
>>
>> Hi Richard.
>>
>>> Please don't use std::string.  For string building you can use obstacks.
>>
>>
>> Alright let's talk details then so I can write things up in a way you
>> approve of.
>>
>> Take for instance simple uses like all the tree_*check_failed routines,
>> which I thought were great candidates for std::string-- they're going to be
>> outputted to the screen or disk which is clearly many times more expensive
>> than the malloc or overhead of std::string:
>>
>>       length += strlen ("expected ");
>>       buffer = tmp = (char *) alloca (length);
>>       length = 0;
>>       while ((code = (enum tree_code) va_arg (args, int)))
>>         {
>>           const char *prefix = length ? " or " : "expected ";
>>
>>           strcpy (tmp + length, prefix);
>>           length += strlen (prefix);
>>           strcpy (tmp + length, get_tree_code_name (code));
>>           length += strlen (get_tree_code_name (code));
>>         }
>>
>> Do you suggest using obstacks here, or did you have something else in mind?
>
> Why would you want to get rid of the alloca here?
Do you know the range for LENGTH in the code above?  Is it based on 
something the user could potentially control (like a variable name, 
typedef name, etc.)?  If you don't know the length or it's possibly under 
the control of the user, then this can blow out the stack, which makes 
the code vulnerable to a stack shifting style attack by which further 
writes into the stack are actually writing into other parts of the 
stack, the heap, plt or some other location.  Essentially this gives an 
attacker control over one or more stores to memory, which is often 
enough of a vulnerability to mount an attack.

jeff
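For readers following the obstack suggestion above: an obstack-based version of the quoted tree_*check_failed loop might look roughly like the sketch below.  This is only an illustration against the glibc obstack API; the function name and the example code-name strings are invented, not taken from GCC.

```cpp
#include <cstdlib>
#include <cstring>
#include <obstack.h>

// obstack requires these two macros before its allocation macros are used.
#define obstack_chunk_alloc malloc
#define obstack_chunk_free free

// Build "expected A or B or ..." by growing the string on an obstack
// instead of pre-measuring lengths for a single alloca buffer.
static char *
build_expected (const char *const *names, int count)
{
  struct obstack ob;
  obstack_init (&ob);
  for (int i = 0; i < count; i++)
    {
      const char *prefix = i ? " or " : "expected ";
      obstack_grow (&ob, prefix, strlen (prefix));
      obstack_grow (&ob, names[i], strlen (names[i]));
    }
  obstack_1grow (&ob, '\0');
  // The caller releases everything at once with obstack_free (&ob, NULL).
  return (char *) obstack_finish (&ob);
}
```

The appeal is that the growth logic disappears: no length pre-pass, no fixed-size guess, and one bulk free at the end.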
Richard Biener Aug. 10, 2016, 6:33 p.m. UTC | #32
On August 10, 2016 8:00:20 PM GMT+02:00, Jeff Law <law@redhat.com> wrote:
>On 08/10/2016 04:04 AM, Richard Biener wrote:
>> On Tue, Aug 9, 2016 at 3:17 PM, Aldy Hernandez <aldyh@redhat.com>
>wrote:
>>> On 08/05/2016 01:55 PM, Richard Biener wrote:
>>>
>>> Hi Richard.
>>>
>>>> Please don't use std::string.  For string building you can use
>obstacks.
>>>
>>>
>>> Alright let's talk details then so I can write things up in a way
>you
>>> approve of.
>>>
>>> Take for instance simple uses like all the tree_*check_failed
>routines,
>>> which I thought were great candidates for std::string-- they're
>going to be
>>> outputted to the screen or disk which is clearly many times more
>expensive
>>> than the malloc or overhead of std::string:
>>>
>>>       length += strlen ("expected ");
>>>       buffer = tmp = (char *) alloca (length);
>>>       length = 0;
>>>       while ((code = (enum tree_code) va_arg (args, int)))
>>>         {
>>>           const char *prefix = length ? " or " : "expected ";
>>>
>>>           strcpy (tmp + length, prefix);
>>>           length += strlen (prefix);
>>>           strcpy (tmp + length, get_tree_code_name (code));
>>>           length += strlen (get_tree_code_name (code));
>>>         }
>>>
>>> Do you suggest using obstacks here, or did you have something else
>in mind?
>>
>> Why would you want to get rid of the alloca here?
>Do you know the range for LENGTH in the code above? 

Yes, it's a set of tree code names.

> Is it based on
>something the user could potentially control (like a variable name,
>typedef name, etc.)?  If you don't know the length or it's possibly under
>
>the control of the user, then this can blow out the stack, which makes 
>the code vulnerable to a stack shifting style attack by which further 
>writes into the stack are actually writing into other parts of the 
>stack, the heap, plt or some other location.  Essentially this gives an
>
>attacker control over one or more stores to memory, which is often 
>enough of a vulnerability to mount an attack.

Yes, I understand that.  The above is not such a case.  If an attacker can trick me into compiling (and possibly executing) his code then things are lost anyway.  No need for a fancy buffer overflow.

IMHO the alloca case warrants protection at the level of the stack adjustment itself.

Richard.

>
>jeff
Trevor Saunders Aug. 11, 2016, 1:31 a.m. UTC | #33
On Thu, Aug 11, 2016 at 02:03:29AM +0900, Oleg Endo wrote:
> On Tue, 2016-08-09 at 13:41 -0400, Trevor Saunders wrote:
> 
> > If what you want is the ability to put the buffer on the stack
> > instead  of the heap then I think a stack_string class that
> > interoperates with  your string class is the thing you want.
> 
> I'd avoid a separate stack_string class.  Instead use a linear
> allocator with storage allocated on the stack, e.g.:
> 
> typedef std::basic_string<char, std::char_traits<char>,
>                           linear_allocator<char>
> my_tmp_string;
> 
> void bleh (int foo)
> {
>   linear_allocator<char> buffer (alloca (1024), 1024);
> 
>   tmp_string aaa (buffer);
>   aaa += "test";
>   aaa += "moon";
>   ...
> 
>   // what happens when "buffer" runs out of space?
>   // throw?  switch to heap alloc? ...
> }
> 
> 
> > 
> >  I don't really see anything wrong with a string class being a really
> >  fancy smart pointer that has a bunch of useful string stuff on it. 
> 
> 
> That's what std::string basically is?

Sure, but I'm saying it does that job badly.

> >  As for operator == I'd be fairly ok with that, other than it hiding 
> > a O(N) operation in ==.
> 
> Yes, it makes it more difficult to carry out algorithm complexity
> analysis using grep...
> 
> 
> > Regretably necessary sure, but I'm not sure its funny.
> 
> Hence the quotes.
> 
> > The first big problem with using the stl is that the subset available 
> > in C++98 isn't that great, you could maybe hack up libstdc++ so that 
> > you can use newer types just without the bits that use newer language 
> > features, but that would take some work.
> 
> Or just wait until people have agreed to switch to C++11 or C++14.  I
> don't think in practice anybody uses a C++11-incapable GCC to build a
> newer GCC these days.

I'm pretty sure they do; I've seen patches for 4.1 and 4.3 go in this
cycle, and rvalue refs didn't work until 4.4, and that was to an older
version of the spec.  Anyway, people apparently use compilers other than
gcc to build gcc, and I'm prepared to believe some of the proprietary
compilers on ancient unix variants don't support C++11.

> > The other big problem is that the stl is often too general, and tries 
> > to be too simple.  std::string is actually a good example of both of 
> > those problems.  There's really no reason it should use size_t 
> > instead of uint32_t for the string length / capacity.
> 
> Yes, some of the things in the STL are collections of compromises for
> general-purpose usage.  If you're doing something extra fancy, most
> likely you can outperform the generic STL implementation.  But very
> often it's actually not needed.

Well, I'd say the compromises made in std::string make it pretty
terrible for general-purpose use except where perf and memory don't
matter at all.

std::vector isn't very great size-wise either; imho the size / capacity
fields would be much better put in the buffer than in the struct itself,
to save 2 * sizeof (void *) in sizeof std::vector.

I haven't looked at other stl data structures that much, but those two
examples don't fill me with confidence :/

> >   It would also be a lot better if it had separate types for strings 
> > where you want an internal buffer or don't.
> 
> Using custom allocators is one way.  For example ...
> 
> template <unsigned int BufferSize> struct linear_allocator_with_buffer;
> 
> template <unsigned int BufferSize>
> using stack_string =
>   std::basic_string<char, std::char_traits<char>,
>                     linear_allocator_with_buffer<BufferSize>>;

that *is* a different type than std::string with the default template
arguments.

> But then ...
> 
> stack_string<32> a = "a";
> stack_string<64> b = "b";
> 
> b += a;
> 
> ... will not work because they are different types.
> 
> One drawback of using custom allocators for that kind of thing is being
> unable to pass the above stack_string to a function that takes a "const
> std::string&" because they differ in template parameters.  But that's
> where std::string_view comes in.

or just having your stack_string type convert to the normal string type
so that you can pass mutable references.

Trev

> 
> 
> Cheers,
> Oleg
Oleg Endo Aug. 11, 2016, 12:18 p.m. UTC | #34
On Wed, 2016-08-10 at 21:31 -0400, Trevor Saunders wrote:

> 
> Well, I'd say the compromises made in std::string make it pretty
> terrible for general-purpose use except where perf and memory don't
> matter at all.
> 
> std::vector isn't very great size wise either, imho the size / 
> capacity fields would be much better put in the buffer than in the 
> struct itself, to save 2 * sizeof (void *) in sizeof std::vector.

There's nothing in the standard that says those fields have to go into
the vector struct/class itself and not into the data buffer.  It just
happens to be the way it's implemented in libstdc++.  E.g. libc++
implements some things in different ways to save a few bytes here and
there.

It looks more practical to put those fields into the container itself,
otherwise you'd have to do a nullptr check every time you access the
size field etc.  Although newer compilers optimize away the redundant
nullptr checks, older compilers (which GCC wants to be good with) might
not do it.  In any case, it seems to generate slightly bigger code, at
least in one instance (see attachment).

Anyway, normally there is more data in the vectors than there are
vectors around, so whether sizeof (vector) is 8 or 24 bytes doesn't
matter.

> 
> or just having your stack_string type convert to the normal string 
> type so that you can pass mutable references.

But that would imply copying, wouldn't it?

Cheers,
Oleg
Trevor Saunders Aug. 11, 2016, 6:02 p.m. UTC | #35
On Thu, Aug 11, 2016 at 09:18:34PM +0900, Oleg Endo wrote:
> On Wed, 2016-08-10 at 21:31 -0400, Trevor Saunders wrote:
> 
> > 
> > Well, I'd say the compromises made in std::string make it pretty
> > terrible for general-purpose use except where perf and memory don't
> > matter at all.
> > 
> > std::vector isn't very great size wise either, imho the size / 
> > capacity fields would be much better put in the buffer than in the 
> > struct itself, to save 2 * sizeof (void *) in sizeof std::vector.
> 
> There's nothing in the standard that says those fields have to go into
> the vector struct/class itself and not into the data buffer.  It just
> happens to be the way it's implemented in libstdc++.  E.g. libc++
> implements some things in different ways to save a few bytes here and
> there.
> 
> It looks more practical to put those fields into the container itself,
> otherwise you'd have to do a nullptr check every time you access the
> size field etc.  Although newer compilers optimize away the redundant
> nullptr checks, older compilers (which GCC wants to be good with) might
> not do it.  In any case, it seems to generate slightly bigger code, at
> least in one instance (see attachment).

If you are clever you can have an empty global vector that you point all
the empty vectors at and so eliminate the null check.  Well, I guess
that doesn't work quite as well if the global needs to be accessed
through a PLT, sigh, but at least you only need to check against it for
reallocation, which is rather heavyweight anyway.
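A minimal sketch of the shared-empty-buffer trick just described (all names are invented for illustration; this is not GCC's actual vec layout):

```cpp
// Every empty vector points at one shared static header instead of
// nullptr, so reading size()/capacity() never needs a null check.
struct buffer_header
{
  unsigned int size;
  unsigned int capacity;
};

static buffer_header empty_buffer = { 0, 0 };

template <typename T>
class small_vec
{
public:
  small_vec () : m_buffer (&empty_buffer) {}   // never null
  unsigned int size () const { return m_buffer->size; }
  unsigned int capacity () const { return m_buffer->capacity; }
  bool empty () const { return size () == 0; }
  // Reallocation would compare m_buffer against &empty_buffer instead of
  // nullptr -- a check only on the (already heavyweight) growth path.

private:
  buffer_header *m_buffer;
};
```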

> Anyway, normally there is more data in the vectors than there are
> vectors around, so whether sizeof (vector) is 8 or 24 bytes doesn't
> matter.

It matters if you have any empty vectors around.  The only numbers I can
find are https://bugzilla.mozilla.org/show_bug.cgi?id=1007846 but I seem
to remember seeing that for Firefox at least most vectors contained 0
elements.

> > 
> > or just having your stack_string type convert to the normal string 
> > type so that you can pass mutable references.
> 
> But that would imply copying, wouldn't it?

Not if you are clever; you can use the same trick gcc uses in vec /
auto_vec with strings.
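The auto_vec-style trick alluded to here is a derived class embedding inline storage that is passed around as a reference to the plain base type, so no copy is needed.  All class and member names below are invented for illustration:

```cpp
#include <cstring>

// Base string type: operates on whatever buffer it was handed.
class str
{
public:
  str (char *buf, unsigned cap) : m_buf (buf), m_len (0), m_cap (cap) {}

  void append (const char *s)
  {
    unsigned n = strlen (s);
    if (m_len + n < m_cap)       // a real version would grow to the heap here
      {
        memcpy (m_buf + m_len, s, n + 1);   // copy including the NUL
        m_len += n;
      }
  }

  const char *c_str () const { return m_buf; }
  unsigned length () const { return m_len; }

protected:
  char *m_buf;
  unsigned m_len, m_cap;
};

// Derived type with inline (stack) storage, usable wherever a `str &`
// is expected -- no conversion, no copy.
template <unsigned N>
class stack_str : public str
{
public:
  stack_str () : str (m_storage, N) { m_storage[0] = '\0'; }

private:
  char m_storage[N];
};

// A function taking the base type accepts any stack_str<N> by reference.
static void use (str &s) { s.append ("world"); }
```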

Trev

> 
> Cheers,
> Oleg

> 
> #if 0
> template <typename T>
> class vector
> {
> public:
>   unsigned int size (void) const { return m_buffer != nullptr ? m_buffer->size : 0; }
>   bool empty (void) const { return size () == 0; }
> 
>   unsigned int capacity (void) const { return m_buffer != nullptr ? m_buffer->capacity : 0; }
> 
>   T* data (void) { return (T*)(m_buffer + 1); }
>   const T* data (void) const { return (const T*)(m_buffer + 1); }
> 
>   T& operator [] (unsigned int i) { return data ()[i]; }
>   const T& operator [] (unsigned int i) const { return data ()[i]; }
> 
> private:
>   struct buffer_header
>   {
>     unsigned int size;
>     unsigned int capacity;
>   };
> 
>   buffer_header* m_buffer;
> };
> 
> 
> int foo (const vector<int>& x)
> {
>   if (x.empty ())
>     return 0;
> 
>   int r = 0;
>   for (unsigned int i = 0; i < x.size (); ++i)
>     r += x[i];
> 
>   return r;
> }
> 
> /*
> -O2 -m2a
> 	mov.l	@r4,r2
> 	tst	r2,r2
> 	bt	.L5
> 	mov.l	@r2,r1
> 	tst	r1,r1
> 	bt.s	.L5
> 	shll2	r1
> 	add	#-4,r1
> 	shlr2	r1
> 	add	#8,r2
> 	mov	#0,r0
> 	add	#1,r1
> 	.align 2
> .L3:
> 	mov.l	@r2+,r3
> 	dt	r1
> 	bf.s	.L3
> 	add	r3,r0
> 	rts/n
> 	.align 1
> .L5:
> 	rts
> 	mov	#0,r0
> */
> #endif
> 
> 
> #if 1
> 
> template <typename T>
> class vector
> {
> public:
>   unsigned int size (void) const { return m_size; }
>   bool empty (void) const { return size () == 0; }
> 
>   unsigned int capacity (void) const { return m_capacity; }
> 
>   T* data (void) { return (T*)(m_buffer); }
>   const T* data (void) const { return (const T*)(m_buffer); }
> 
>   T& operator [] (unsigned int i) { return data ()[i]; }
>   const T& operator [] (unsigned int i) const { return data ()[i]; }
> 
> private:
>   unsigned int m_size;
>   unsigned int m_capacity;
>   T* m_buffer;
> };
> 
> 
> int foo (const vector<int>& x)
> {
>   if (x.empty ())
>     return 0;
> 
>   int r = 0;
>   for (unsigned int i = 0; i < x.size (); ++i)
>     r += x[i];
> 
>   return r;
> }
> 
> /*
> -O2 -m2a
> 	mov.l	@r4,r1
> 	tst	r1,r1
> 	bt.s	.L7
> 	mov	#0,r0
> 	shll2	r1
> 	mov.l	@(8,r4),r2
> 	add	#-4,r1
> 	shlr2	r1
> 	add	#1,r1
> 	.align 2
> .L3:
> 	mov.l	@r2+,r3
> 	dt	r1
> 	bf.s	.L3
> 	add	r3,r0
> .L7:
> 	rts/n
> */
> 
> #endif
>
Jeff Law Aug. 16, 2016, 4:27 p.m. UTC | #36
On 08/10/2016 12:33 PM, Richard Biener wrote:
>>>
>>> Why would you want to get rid of the alloca here?
>> Do you know the range for LENGTH in the code above?
>
> Yes, it's a set of tree code names.
>
> Is it based on
>> something the user could potentially control (like a variable name,
>>  typdef name, etc).  If you don't know the length or it's possibly
>> under
>>
>> the control of the user, then this can blow out the stack, which
>> makes the code vulnerable to a stack shifting style attack by which
>> further writes into the stack are actually writing into other parts
>> of the stack, the heap, plt or some other location.  Essentially
>> this gives an
>>
>> attacker control over one or more stores to memory, which is often
>>  enough of a vulnerability to mount an attack.
>
> Yes, I understand that.  The above is not such a case.  If an
> attacker can trick me into compiling (and possibly executing) his
> code then things are lost anyway.  No need for a fancy buffer
> overflow.
I think you're being rather short-sighted here.  GCC is being used in 
ways we can't necessarily predict -- which might include compile 
servers, JITs, web services, etc.


In those kind of environments I think you have a significantly different 
threat model and GCC itself becomes part of the attack surface.

There's existing web sites I'm pretty sure I could compromise if I had 
the interest by exploiting bugs in GCC and libiberty.

Jeff
Jakub Jelinek Aug. 16, 2016, 4:44 p.m. UTC | #37
On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
> I think you're being rather short-sighted here.  GCC is being used in ways we
> can't necessarily predict -- which might include compile servers, JITs, web
> services, etc.

For compile server/web services one needs to add the protection outside of
gcc (sandboxing, containers, SELinux, limiting CPU and/or memory, etc.),
because even with very short testcases e.g. in C/C++ one can eat arbitrary
amounts of stack even without any uses of alloca in the compiler, simply
through deep recursion in the parsers etc.  The attack vector is so big that
trying to do something just about alloca is IMHO pointless, and we really
don't want to fight 20 gcc CVEs every day (1:1 with most bug reports).
Alloca is really useful in the compiler IMO, it is significantly faster than
heap allocation, and that is what matters in many places a lot.

	Jakub
Jeff Law Aug. 16, 2016, 4:47 p.m. UTC | #38
On 08/16/2016 10:44 AM, Jakub Jelinek wrote:
> On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
>> I think you're being rather short-sighted here.  GCC is being used in ways we
>> can't necessarily predict -- which might include compile servers, JITs, web
>> services, etc.
>
> For compile server/web services one needs to add the protection outside of
> gcc (sandboxing, containers, SELinux, limiting CPU and/or memory, etc.),
> because even with very short testcases e.g. in C/C++ one can eat arbitrary
> amounts of stack even without any uses of alloca in the compiler, simply
> through deep recursion in the parsers etc.
Agreed.  However, that doesn't mean we should not be locking down things 
like alloca and other attack vectors.

> The attack vector is so big that
> trying to do something just about alloca is IMHO pointless, and we really
> don't want to fight 20 gcc CVEs every day (1:1 with most bugreports).
> Alloca is really useful in the compiler IMO, it is significantly faster than
> heap allocation, and that is what matters in many places a lot.
You have to start somewhere and we have the tools and willingness of an 
engineer to tackle part of this problem.  Simply giving up because it's 
not a total solution is absurd.

jeff
Martin Sebor Aug. 16, 2016, 5:54 p.m. UTC | #39
On 08/16/2016 10:47 AM, Jeff Law wrote:
> On 08/16/2016 10:44 AM, Jakub Jelinek wrote:
>> On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
>>> I think you're being rather short-sighted here.  GCC is being used in
>>> ways we
>>> can't necessarily predict -- which might include compile servers,
>>> JITs, web
>>> services, etc.
>>
>> For compile server/web services one needs to add the protection
>> outside of
>> gcc (sandboxing, containers, SELinux, limiting CPU and/or memory, etc.),
>> because even with very short testcases e.g. in C/C++ one can eat
>> arbitrary
>> amounts of stack even without any uses of alloca in the compiler, simply
>> through deep recursion in the parsers etc.
> Agreed.  However, that doesn't mean we should not be locking down things
> like alloca and other attack vectors.

I think I made this suggestion when Aldy posted his first patch
but it didn't get much traction so let me try again.  Since the
concern is alloca calls with excessively large arguments, would
transforming those (determined both at compile time and at run
time) into pairs of malloc/free calls be an acceptable compromise?

It would seem like a natural complement to the transformation
in the opposite direction, brought up back then, of turning calls
to malloc with small (compile-time) arguments into alloca.

I would expect the malloc optimization to more than outweigh
the performance cost of the alloca to malloc transformation.
Perhaps even to the point to obviate the need for any explicit
alloca calls at all.  With the optimization in place, it seems
that it should even be possible to transparently transform at
least the most basic uses of some C++ containers written in
terms of operator new and delete to use alloca instead when
their sizes were known and under the alloca to malloc threshold.

Martin
Richard Biener Aug. 17, 2016, 8:27 a.m. UTC | #40
On Tue, Aug 16, 2016 at 7:54 PM, Martin Sebor <msebor@gmail.com> wrote:
> On 08/16/2016 10:47 AM, Jeff Law wrote:
>>
>> On 08/16/2016 10:44 AM, Jakub Jelinek wrote:
>>>
>>> On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
>>>>
>>>> I think you're being rather short-sighted here.  GCC is being used in
>>>> ways we
>>>> can't necessarily predict -- which might include compile servers,
>>>> JITs, web
>>>> services, etc.
>>>
>>>
>>> For compile server/web services one needs to add the protection
>>> outside of
>>> gcc (sandboxing, containers, SELinux, limiting CPU and/or memory, etc.),
>>> because even with very short testcases e.g. in C/C++ one can eat
>>> arbitrary
>>> amounts of stack even without any uses of alloca in the compiler, simply
>>> through deep recursion in the parsers etc.
>>
>> Agreed.  However, that doesn't mean we should not be locking down things
>> like alloca and other attack vectors.
>
>
> I think I made this suggestion when Aldy posted his first patch
> but it didn't get much traction so let me try again.  Since the
> concern is alloca calls with excessively large arguments, would
> transforming those (determined both at compile time and at run
> time) into pairs of malloc/free calls be an acceptable compromise?
>
> It would seem like a natural complement to the transformation
> in the opposite direction, brought up back then, of turning calls
> to malloc with small (compile-time) arguments into alloca.
>
> I would expect the malloc optimization to more than outweigh
> the performance cost of the alloca to malloc transformation.
> Perhaps even to the point to obviate the need for any explicit
> alloca calls at all.  With the optimization in place, it seems
> that it should even be possible to transparently transform at
> least the most basic uses of some C++ containers written in
> terms of operator new and delete to use alloca instead when
> their sizes were known and under the alloca to malloc threshold.

Please instead work on sth like -fstack-protector for alloca -- it should
be straight-forward to add a runtime test on the stack adjustment
being performed against some magic bound (yeah, needs more than
only GCC support here).
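At the user level, the runtime guard being suggested might be sketched like this (the bound and helper name are invented; the real mechanism would live in the stack-adjustment code the compiler emits, analogous to -fstack-protector):

```cpp
#include <cstdlib>

// Validate a requested dynamic stack adjustment against a magic bound
// and trap instead of silently moving the stack pointer.
static const unsigned long STACK_ADJUST_LIMIT = 1UL << 20;  // 1 MiB, illustrative

static unsigned long
checked_adjust (unsigned long n)
{
  if (n > STACK_ADJUST_LIMIT)
    abort ();  // the real mechanism would trap, like -fstack-protector
  return n;
}

// Usage at an alloca site:
//   char *buf = (char *) __builtin_alloca (checked_adjust (len));
```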

Richard.

> Martin
Martin Sebor Aug. 17, 2016, 1:39 p.m. UTC | #41
On 08/17/2016 02:27 AM, Richard Biener wrote:
> On Tue, Aug 16, 2016 at 7:54 PM, Martin Sebor <msebor@gmail.com> wrote:
>> On 08/16/2016 10:47 AM, Jeff Law wrote:
>>>
>>> On 08/16/2016 10:44 AM, Jakub Jelinek wrote:
>>>>
>>>> On Tue, Aug 16, 2016 at 10:27:58AM -0600, Jeff Law wrote:
>>>>>
>>>>> I think you're being rather short-sighted here.  GCC is being used in
>>>>> ways we
>>>>> can't necessarily predict -- which might include compile servers,
>>>>> JITs, web
>>>>> services, etc.
>>>>
>>>>
>>>> For compile server/web services one needs to add the protection
>>>> outside of
>>>> gcc (sandboxing, containers, SELinux, limiting CPU and/or memory, etc.),
>>>> because even with very short testcases e.g. in C/C++ one can eat
>>>> arbitrary
>>>> amounts of stack even without any uses of alloca in the compiler, simply
>>>> through deep recursion in the parsers etc.
>>>
>>> Agreed.  However, that doesn't mean we should not be locking down things
>>> like alloca and other attack vectors.
>>
>>
>> I think I made this suggestion when Aldy posted his first patch
>> but it didn't get much traction so let me try again.  Since the
>> concern is alloca calls with excessively large arguments, would
>> transforming those (determined both at compile time and at run
>> time) into pairs of malloc/free calls be an acceptable compromise?
>>
>> It would seem like a natural complement to the transformation
>> in the opposite direction, brought up back then, of turning calls
>> to malloc with small (compile-time) arguments into alloca.
>>
>> I would expect the malloc optimization to more than outweigh
>> the performance cost of the alloca to malloc transformation.
>> Perhaps even to the point to obviate the need for any explicit
>> alloca calls at all.  With the optimization in place, it seems
>> that it should even be possible to transparently transform at
>> least the most basic uses of some C++ containers written in
>> terms of operator new and delete to use alloca instead when
>> their sizes were known and under the alloca to malloc threshold.
>
> Please instead work on sth like -fstack-protector for alloca -- it should
> be straight-forward to add a runtime test on the stack adjustment
> being performed against some magic bound (yeah, needs more than
> only GCC support here).

I agree that would be a useful first step, and certainly a more 
attainable goal.  The C++ patch that handled the VLA part (bug
69517) had to be reverted just before 6.1 was released but it's
still on my to do list to add to it a knob to adjust the bound
at runtime and resubmit it.  It should be straightforward to
extend a similar approach to alloca and do the same thing in C
(except with trapping rather than throwing an exception).

Martin
Mike Stump Aug. 20, 2016, 2:29 a.m. UTC | #42
On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote:
> 
> Or just wait until people have agreed to switch to C++11 or C++14.  I
> don't think in practice anybody uses a C++11-incapable GCC to build a
> newer GCC these days.

I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...  I could bootstrap a newer native compiler, if I had to.
Pedro Alves Aug. 21, 2016, 8 p.m. UTC | #43
On 08/20/2016 03:29 AM, Mike Stump wrote:
> On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote:
>>
>> Or just wait until people have agreed to switch to C++11 or C++14.  I
>> don't think in practice anybody uses a C++11-incapable GCC to build a
>> newer GCC these days.

gdb will drop support for building with a C compiler any week
now, and even though we're starting out with C++03, just like gcc,
it'd be great to require C++11 (or later).  Having gcc itself
switch to C++11 too would make proposing it for gdb so much
easier...

So +1 from me, FWIW.  :-)

> 
> I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...  I could bootstrap a newer native compiler, if I had to.
> 

Yeah.  I wonder whether the community would in general be fine with
that too.

Thanks,
Pedro Alves
Trevor Saunders Aug. 22, 2016, 7:18 a.m. UTC | #44
On Sun, Aug 21, 2016 at 09:00:26PM +0100, Pedro Alves wrote:
> On 08/20/2016 03:29 AM, Mike Stump wrote:
> > On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote:
> >>
> >> Or just wait until people have agreed to switch to C++11 or C++14.  I
> >> don't think in practice anybody uses a C++11-incapable GCC to build a
> >> newer GCC these days.
> 
> gdb will drop support for building with a C compiler any week
> now, and even though we're starting out with C++03, just like gcc,
> it'd be great to require C++11 (or later).  Having gcc itself
> switch to C++11 too would make proposing it for gdb so much
> easier...

Huh, I would have sort of expected the opposite: if gdb were to require
C++11 but gcc didn't, then you could still use gdb on antique systems
without a C++11 compiler by first building gcc.

> So +1 from me, FWIW.  :-)

 I'd mostly agree, at least requiring a compiler with rvalue references
 would be pretty useful.

> > 
> > I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...  I could bootstrap a newer native compiler, if I had to.
> > 
> 
> Yeah.  I wonder whether the community would in general be fine with
> that too.

 I personally don't have any machines where the system compiler is that
 old, but it's worth noting the last C++11 features came in 4.8.1 I think
 and for libstdc++ it's even later.

 Trev

> 
> Thanks,
> Pedro Alves
>
Richard Biener Aug. 22, 2016, 7:28 a.m. UTC | #45
On Mon, Aug 22, 2016 at 9:18 AM, Trevor Saunders <tbsaunde@tbsaunde.org> wrote:
> On Sun, Aug 21, 2016 at 09:00:26PM +0100, Pedro Alves wrote:
>> On 08/20/2016 03:29 AM, Mike Stump wrote:
>> > On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote:
>> >>
>> >> Or just wait until people have agreed to switch to C++11 or C++14.  I
>> >> don't think in practice anybody uses a C++11-incapable GCC to build a
>> >> newer GCC these days.
>>
>> gdb will drop support for building with a C compiler any week
>> now, and even though we're starting out with C++03, just like gcc,
>> it'd be great to require C++11 (or later).  Having gcc itself
>> switch to C++11 too would make proposing it for gdb so much
>> easier...
>
> huh, I would have sort of expected the oposit, if gdb was to require
> C++11, but gcc didn't then you could still use gdb on antique systems
> without a C++11 compiler by first building gcc.
>
>> So +1 from me, FWIW.  :-)
>
>  I'd mostly agree, at least requiring a compiler with rvalue references
>  would be pretty useful.
>
>> >
>> > I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...  I could bootstrap a newer native compiler, if I had to.
>> >
>>
>> Yeah.  I wonder whether the community would in general be fine with
>> that too.
>
>  I personally don't have any machines where the system compiler is that
>  old, but its worth noting the last C++11 features came in 4.8.1 I think
>  and for libstdc++ its even later.

Most of our auto-testers are still on GCC 4.1 and GCC 4.3 host compilers
(all but one actually).  The oldest still maintained SLES has GCC 4.3 as
host compiler (though GCC 4.8 is available as well, just not as 'gcc').
The "current" SLES has GCC 4.8 as host compiler.

So unless there are very compelling reasons to require C++11 I'd rather
not go down that route at this very moment.

Richard.

(yeah, time to update some of our autotesters - but then it will most likely
be simply retiring them)

>  Trev
>
>>
>> Thanks,
>> Pedro Alves
>>
Eric Gallager Aug. 22, 2016, 12:02 p.m. UTC | #46
On 8/21/16, Pedro Alves <palves@redhat.com> wrote:
> On 08/20/2016 03:29 AM, Mike Stump wrote:
>> On Aug 10, 2016, at 10:03 AM, Oleg Endo <oleg.endo@t-online.de> wrote:
>>>
>>> Or just wait until people have agreed to switch to C++11 or C++14.  I
>>> don't think in practice anybody uses a C++11-incapable GCC to build a
>>> newer GCC these days.
>
> gdb will drop support for building with a C compiler any week
> now, and even though we're starting out with C++03, just like gcc,
> it'd be great to require C++11 (or later).  Having gcc itself
> switch to C++11 too would make proposing it for gdb so much
> easier...
>
> So +1 from me, FWIW.  :-)
>
>>
>> I use the system gcc 4.4.7 on RHEL to build a newer cross compiler...  I
>> could bootstrap a newer native compiler, if I had to.
>>
>
> Yeah.  I wonder whether the community would in general be fine with
> that too.
>
> Thanks,
> Pedro Alves
>

As a rookie programmer considering possibly contributing to GCC in the
future once I'm more confident in my abilities, switching to C++11
would increase the barrier for me to contribute. I currently really
only know C, and I still have to learn C++ in general, much less all
the new features added in C++11. Also, I still occasionally bootstrap
with Apple's gcc 4.2.1, which doesn't have C++11 support. That being
said, I'd support taking steps to make it easier to optionally compile
gcc as C++11 for those that want to do so, such as replacing
-Wno-narrowing with -Wnarrowing in the warning flags that gcc uses.
Out of curiosity, I tried doing just that, and there were actually
fewer new unique warnings than I expected. I've attached a log of
them.

Eric Gallager
Manuel López-Ibáñez Aug. 22, 2016, 12:58 p.m. UTC | #47
On 22/08/16 13:02, Eric Gallager wrote:
> As a rookie programmer considering possibly contributing to GCC in the
> future once I'm more confident in my abilities, switching to C++11
> would increase the barrier for me to contribute. I currently really
> only know C, and I still have to learn C++ in general, much less all
> the new features added in C++11. Also, I still occasionally bootstrap
> with Apple's gcc 4.2.1, which doesn't have C++11 support. That being
> said, I'd support taking steps to make it easier to optionally compile
> gcc as C++11 for those that want to do so, such as replacing
> -Wno-narrowing with -Wnarrowing in the warning flags that gcc uses.
> Out of curiosity, I tried doing just that, and there were actually
> fewer new unique warnings than I expected. I've attached a log of
> them.

If you are able to bootstrap GCC and run the regression testsuite, you have 
done 50% of what you need to submit your first patch:
 
https://gcc.gnu.org/wiki/GettingStarted#Basics:_Contributing_to_GCC_in_10_easy_steps

The implementation language is often a detail hidden by many layers of 
abstraction. It is more important to get to know those abstractions, know how 
to use a debugger and understand how various things fit together in gcc (FEs, 
MEs, back-ends, gimple, trees, RTL) than the implementation of those things. 
And the best way to learn is to start with a small project in your area of 
interest 
(https://gcc.gnu.org/bugzilla/buglist.cgi?keywords=easyhack&list_id=158576&resolution=---) 
and persevere.

Cheers,

	Manuel.
Mike Stump Aug. 22, 2016, 10:08 p.m. UTC | #48
On Aug 22, 2016, at 5:02 AM, Eric Gallager <egall@gwmail.gwu.edu> wrote:
> 
> As a rookie programmer considering possibly contributing to GCC in the
> future once I'm more confident in my abilities, switching to C++11
> would increase the barrier for me to contribute. I currently really
> only know C, and I still have to learn C++ in general, much less all
> the new features added in C++11. 

Don't worry, we don't exploit every feature of the language on every line.  Most of the lines likely look just like plain C.
Eric Gallager Aug. 23, 2016, 11:16 p.m. UTC | #49
On 8/22/16, Mike Stump <mikestump@comcast.net> wrote:
> On Aug 22, 2016, at 5:02 AM, Eric Gallager <egall@gwmail.gwu.edu> wrote:
>>
>> As a rookie programmer considering possibly contributing to GCC in the
>> future once I'm more confident in my abilities, switching to C++11
>> would increase the barrier for me to contribute. I currently really
>> only know C, and I still have to learn C++ in general, much less all
>> the new features added in C++11.
>
> Don't worry, we don't exploit every feature of the language on every line.
> Most of the lines likely look just like plain C.


Right, I suppose that's true. And even if you did start using every
feature of the language, there'd still be other work available in
areas I'm more comfortable in, such as shell scripts, autoconf macros,
and Makefiles... assuming you also continue using the same build
system, that is...
diff mbox

Patch

diff --git a/gcc/protected-alloca.h b/gcc/protected-alloca.h
new file mode 100644
index 0000000..62f2a7b
--- /dev/null
+++ b/gcc/protected-alloca.h
@@ -0,0 +1,113 @@ 
+/* Alloca wrapper with a malloc fallback.
+   Copyright (C) 2016 Free Software Foundation, Inc.
+   Contributed by Aldy Hernandez <aldyh@redhat.com>.
+
+This file is part of GCC.
+
+GCC is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 3, or (at your option)
+any later version.
+
+GCC is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+
+You should have received a copy of the GNU General Public License
+along with GCC; see the file COPYING3.  If not see
+<http://www.gnu.org/licenses/>.  */
+
+#ifndef GCC_PROTECTED_ALLOCA_H
+#define GCC_PROTECTED_ALLOCA_H
+
+#ifndef MAX_ALLOCA_SIZE
+#define MAX_ALLOCA_SIZE 4096
+#endif
+
+#ifdef __GNUC__
+#define _ALLOCA_INLINE_ __attribute__((always_inline))
+#else
+#define _ALLOCA_INLINE_
+#endif
+
+/* This is a wrapper class for alloca that falls back to malloc if the
+   allocation size is > MAX_ALLOCA_SIZE.  It is meant to replace:
+
+     char *str = (char *) alloca (N);
+
+   by this:
+
+     protected_alloca chunk (N);
+     char *str = (char *) chunk.pointer ();
+
+   or this:
+
+     protected_alloca chunk;
+     chunk.alloc (N);
+     char *str = (char *) chunk.pointer ();
+
+   If N > MAX_ALLOCA_SIZE, malloc is used, and the malloc'd memory is
+   freed when `chunk' goes out of scope.  Keep in mind that this can
+   be earlier than expected if `chunk' is declared in an inner scope.
+   For example:
+
+     func()
+     {
+       char *str;
+       {
+         protected_alloca chunk (99999999);
+	 // If malloc'd, pointer will be freed when chunk goes out of scope.
+         str = (char *) chunk.pointer ();
+       }
+       use (str);  // BAD!  Use after free.
+     }
+
+   In this case, it is best to use the following idiom:
+
+     func()
+     {
+       char *str;
+       protected_alloca chunk;
+       {
+         chunk.alloc (99999999);
+	 str = (char *) chunk.pointer ();
+       }
+       // OK, pointer will be freed on function exit.
+       use (str);
+     }
+*/
+
+class protected_alloca {
+  void *p;
+  bool on_heap;
+public:
+  // GCC will refuse to inline a function that calls alloca unless
+  // `always_inline' is used, for fear of inlining an alloca call
+  // appearing in a loop.  Inlining this function won't make code
+  // any worse than it already is wrt loops.  We know what we're
+  // doing.
+  _ALLOCA_INLINE_ void
+  internal_alloc (size_t size)
+  {
+    if (size < MAX_ALLOCA_SIZE)
+      {
+	p = alloca (size);
+	on_heap = false;
+      }
+    else
+      {
+	p = xmalloc (size);
+	on_heap = true;
+      }
+  }
+  _ALLOCA_INLINE_
+  protected_alloca (size_t size) { internal_alloc (size); }
+  protected_alloca () { p = NULL; on_heap = false; }
+  ~protected_alloca () { if (on_heap) free (p); }
+  _ALLOCA_INLINE_
+  void alloc (size_t size) { gcc_assert (!p); internal_alloc (size); }
+  void *pointer () { return p; }
+};
+
+#endif // GCC_PROTECTED_ALLOCA_H
diff --git a/gcc/tree.c b/gcc/tree.c
index 11d3b51..621dcd5 100644
--- a/gcc/tree.c
+++ b/gcc/tree.c
@@ -62,6 +62,7 @@  along with GCC; see the file COPYING3.  If not see
 #include "print-tree.h"
 #include "ipa-utils.h"
 #include "selftest.h"
+#include "protected-alloca.h"
 
 /* Tree code classes.  */
 
@@ -9639,6 +9640,7 @@  get_file_function_name (const char *type)
   char *buf;
   const char *p;
   char *q;
+  protected_alloca q_alloca;
 
   /* If we already have a name we know to be unique, just use that.  */
   if (first_global_object_name)
@@ -9674,7 +9676,8 @@  get_file_function_name (const char *type)
 	file = LOCATION_FILE (input_location);
 
       len = strlen (file);
-      q = (char *) alloca (9 + 17 + len + 1);
+      q_alloca.alloc (9 + 17 + len + 1);
+      q = (char *) q_alloca.pointer ();
       memcpy (q, file, len + 1);
 
       snprintf (q + len, 9 + 17 + 1, "_%08X_" HOST_WIDE_INT_PRINT_HEX, 
@@ -9684,8 +9687,9 @@  get_file_function_name (const char *type)
     }
 
   clean_symbol_name (q);
-  buf = (char *) alloca (sizeof (FILE_FUNCTION_FORMAT) + strlen (p)
-			 + strlen (type));
+  protected_alloca buf_alloca (sizeof (FILE_FUNCTION_FORMAT) + strlen (p)
+			       + strlen (type));
+  buf = (char *) buf_alloca.pointer ();
 
   /* Set up the name of the file-level functions we may need.
      Use a global object (which is already required to be unique over
@@ -9709,7 +9713,8 @@  tree_check_failed (const_tree node, const char *file,
 {
   va_list args;
   const char *buffer;
-  unsigned length = 0;
+  protected_alloca buffer_alloca;
+  size_t length = 0;
   enum tree_code code;
 
   va_start (args, function);
@@ -9721,7 +9726,8 @@  tree_check_failed (const_tree node, const char *file,
       char *tmp;
       va_start (args, function);
       length += strlen ("expected ");
-      buffer = tmp = (char *) alloca (length);
+      buffer_alloca.alloc (length);
+      buffer = tmp = (char *) buffer_alloca.pointer ();
       length = 0;
       while ((code = (enum tree_code) va_arg (args, int)))
 	{
@@ -9752,7 +9758,8 @@  tree_not_check_failed (const_tree node, const char *file,
 {
   va_list args;
   char *buffer;
-  unsigned length = 0;
+  protected_alloca buffer_alloca;
+  size_t length = 0;
   enum tree_code code;
 
   va_start (args, function);
@@ -9760,7 +9767,8 @@  tree_not_check_failed (const_tree node, const char *file,
     length += 4 + strlen (get_tree_code_name (code));
   va_end (args);
   va_start (args, function);
-  buffer = (char *) alloca (length);
+  buffer_alloca.alloc (length);
+  buffer = (char *) buffer_alloca.pointer ();
   length = 0;
   while ((code = (enum tree_code) va_arg (args, int)))
     {
@@ -9802,14 +9810,16 @@  tree_range_check_failed (const_tree node, const char *file, int line,
 			 enum tree_code c2)
 {
   char *buffer;
-  unsigned length = 0;
+  protected_alloca buffer_alloca;
+  size_t length = 0;
   unsigned int c;
 
   for (c = c1; c <= c2; ++c)
     length += 4 + strlen (get_tree_code_name ((enum tree_code) c));
 
   length += strlen ("expected ");
-  buffer = (char *) alloca (length);
+  buffer_alloca.alloc (length);
+  buffer = (char *) buffer_alloca.pointer ();
   length = 0;
 
   for (c = c1; c <= c2; ++c)
@@ -9863,14 +9873,16 @@  omp_clause_range_check_failed (const_tree node, const char *file, int line,
 			       enum omp_clause_code c2)
 {
   char *buffer;
-  unsigned length = 0;
+  protected_alloca buffer_alloca;
+  size_t length = 0;
   unsigned int c;
 
   for (c = c1; c <= c2; ++c)
     length += 4 + strlen (omp_clause_code_name[c]);
 
   length += strlen ("expected ");
-  buffer = (char *) alloca (length);
+  buffer_alloca.alloc (length);
+  buffer = (char *) buffer_alloca.pointer ();
   length = 0;
 
   for (c = c1; c <= c2; ++c)