diff mbox series

Fix PR 101453: ICE with optimize and large integer constant

Message ID 1626460525-4914-1-git-send-email-apinski@marvell.com
State New
Headers show
Series Fix PR 101453: ICE with optimize and large integer constant | expand

Commit Message

Li, Pan2 via Gcc-patches July 16, 2021, 6:35 p.m. UTC
From: Andrew Pinski <apinski@marvell.com>

The problem is the buffer is too small to hold "-O" and
the interger.  This fixes the problem by use the correct size
instead.

Changes since v1:
* v2: Use HOST_BITS_PER_LONG and just divide by 3 instead of
3.32.

OK? Bootstrapped and tested on x86_64-linux with no regressions.

gcc/c-family/ChangeLog:

	PR c/101453
	* c-common.c (parse_optimize_options): Use the correct
	size for buffer.
---
 gcc/c-family/c-common.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Richard Biener July 16, 2021, 6:49 p.m. UTC | #1
On July 16, 2021 8:35:25 PM GMT+02:00, apinski--- via Gcc-patches <gcc-patches@gcc.gnu.org> wrote:
>From: Andrew Pinski <apinski@marvell.com>
>
>The problem is the buffer is too small to hold "-O" and
>the interger.  This fixes the problem by use the correct size
>instead.
>
>Changes since v1:
>* v2: Use HOST_BITS_PER_LONG and just divide by 3 instead of
>3.32.
>
>OK? Bootstrapped and tested on x86_64-linux with no regressions.

OK. 

Richard. 

>gcc/c-family/ChangeLog:
>
>	PR c/101453
>	* c-common.c (parse_optimize_options): Use the correct
>	size for buffer.
>---
> gcc/c-family/c-common.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
>index 20ec263..e7a54a5 100644
>--- a/gcc/c-family/c-common.c
>+++ b/gcc/c-family/c-common.c
>@@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)
> 
>       if (TREE_CODE (value) == INTEGER_CST)
> 	{
>-	  char buffer[20];
>+	  char buffer[HOST_BITS_PER_LONG / 3 + 4];
> 	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
> 	  vec_safe_push (optimize_args, ggc_strdup (buffer));
> 	}
Segher Boessenkool July 17, 2021, 9:30 p.m. UTC | #2
Hi!

On Fri, Jul 16, 2021 at 11:35:25AM -0700, apinski--- via Gcc-patches wrote:
> --- a/gcc/c-family/c-common.c
> +++ b/gcc/c-family/c-common.c
> @@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)
>  
>        if (TREE_CODE (value) == INTEGER_CST)
>  	{
> -	  char buffer[20];
> +	  char buffer[HOST_BITS_PER_LONG / 3 + 4];
>  	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
>  	  vec_safe_push (optimize_args, ggc_strdup (buffer));
>  	}

This calculation is correct, assuming "value" is non-negative.  You can
easily avoid all of that though:

	  int n = snprintf (0, 0, "-O%ld", (long) TREE_INT_CST_LOW (value));
	  char buffer[n + 1];
	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
	  vec_safe_push (optimize_args, ggc_strdup (buffer));

This creates a variable length allocation on the stack though.  You can
do better (for some value of "better") by using the known longest value:
(again assuming non-negative):

	  int n = snprintf (0, 0, "-O%ld", LONG_MAX);
	  char buffer[n + 1];
	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
	  vec_safe_push (optimize_args, ggc_strdup (buffer));

This does not call snprintf, GCC can evaluate that at compile time.
This pattern works well in portable code.

But we can do better still:

	  char *buffer;
	  asprintf (&buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
	  vec_safe_push (optimize_args, ggc_strdup (buffer));
	  free (buffer);

(asprintf is in libiberty already).  And we could avoid the ggc_strdup
if we made a variant of asprintf that marks the GGC itself (the aprintf
implementation already factors calculating the length needed, it will
be easy to do this).


Segher
Andrew Pinski July 17, 2021, 10:13 p.m. UTC | #3
On Sat, Jul 17, 2021 at 2:31 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> Hi!
>
> On Fri, Jul 16, 2021 at 11:35:25AM -0700, apinski--- via Gcc-patches wrote:
> > --- a/gcc/c-family/c-common.c
> > +++ b/gcc/c-family/c-common.c
> > @@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)
> >
> >        if (TREE_CODE (value) == INTEGER_CST)
> >       {
> > -       char buffer[20];
> > +       char buffer[HOST_BITS_PER_LONG / 3 + 4];
> >         sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
> >         vec_safe_push (optimize_args, ggc_strdup (buffer));
> >       }
>
> This calculation is correct, assuming "value" is non-negative.  You can
> easily avoid all of that though:

Actually it is still correct even for negative values because we are
adding 4 rather than 3 to make sure there is enough room for the sign
character.
So it takes 11 max characters for a signed 32bit integer
(ceil(log(2^31)) + 1) and add the "-O" part and the null character
gives us 14.  32/3 is 10, and then add 4 gives us 14. This is the
exact amount.
So it takes 20 max characters for a signed 64bit integer
(ceil(log(2^63)) + 1) and add the "-O" part and the null character
gives us 23.  64/3 is 21, and then add 4 gives us 25. There are 2
extra bytes used which is ok.


Thanks,
Andrew


>
>           int n = snprintf (0, 0, "-O%ld", (long) TREE_INT_CST_LOW (value));
>           char buffer[n + 1];
>           sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
>           vec_safe_push (optimize_args, ggc_strdup (buffer));
>
> This creates a variable length allocation on the stack though.  You can
> do better (for some value of "better") by using the known longest value:
> (again assuming non-negative):
>
>           int n = snprintf (0, 0, "-O%ld", LONG_MAX);
>           char buffer[n + 1];
>           sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
>           vec_safe_push (optimize_args, ggc_strdup (buffer));
>
> This does not call snprintf, GCC can evaluate that at compile time.
> This pattern works well in portable code.
>
> But we can do better still:
>
>           char *buffer;
>           asprintf (&buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
>           vec_safe_push (optimize_args, ggc_strdup (buffer));
>           free (buffer);
>
> (asprintf is in libiberty already).  And we could avoid the ggc_strdup
> if we made a variant of asprintf that marks the GGC itself (the aprintf
> implementation already factors calculating the length needed, it will
> be easy to do this).
>
>
> Segher
Segher Boessenkool July 18, 2021, 7:32 a.m. UTC | #4
On Sat, Jul 17, 2021 at 03:13:59PM -0700, Andrew Pinski wrote:
> On Sat, Jul 17, 2021 at 2:31 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
> > On Fri, Jul 16, 2021 at 11:35:25AM -0700, apinski--- via Gcc-patches wrote:
> > > --- a/gcc/c-family/c-common.c
> > > +++ b/gcc/c-family/c-common.c
> > > @@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)
> > >
> > >        if (TREE_CODE (value) == INTEGER_CST)
> > >       {
> > > -       char buffer[20];
> > > +       char buffer[HOST_BITS_PER_LONG / 3 + 4];
> > >         sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
> > >         vec_safe_push (optimize_args, ggc_strdup (buffer));
> > >       }
> >
> > This calculation is correct, assuming "value" is non-negative.  You can
> > easily avoid all of that though:
> 
> Actually it is still correct even for negative values because we are
> adding 4 rather than 3 to make sure there is enough room for the sign
> character.

Say your longs have only two bits, one sign and one value (so it is
{-2, -1, 0, 1}).  There you get  char buffer[4]  although you need 5
here if you care about negative numbers: '-', 'O', '-', '1', 0.

But longs are at least 32 bits, of course.  And 14 chars is (just)
enough, because you divide by only 3 now (instead of {}^2 log 10, or
3.32 as before), giving some leeway.

(n/3+4 also fails for longs of sizes 5, 8, or 11 bits, but no more).


This only strengthens my point though: it is much easier and it requires
much less thinking, but most of all it is less error-prone, if you let
the compiler do the tricky bits.  But it is less fun!  :-)


Segher
Andreas Schwab July 18, 2021, 7:49 a.m. UTC | #5
On Jul 18 2021, Segher Boessenkool wrote:

> On Sat, Jul 17, 2021 at 03:13:59PM -0700, Andrew Pinski wrote:
>> On Sat, Jul 17, 2021 at 2:31 PM Segher Boessenkool
>> <segher@kernel.crashing.org> wrote:
>> > On Fri, Jul 16, 2021 at 11:35:25AM -0700, apinski--- via Gcc-patches wrote:
>> > > --- a/gcc/c-family/c-common.c
>> > > +++ b/gcc/c-family/c-common.c
>> > > @@ -5799,7 +5799,7 @@ parse_optimize_options (tree args, bool attr_p)
>> > >
>> > >        if (TREE_CODE (value) == INTEGER_CST)
>> > >       {
>> > > -       char buffer[20];
>> > > +       char buffer[HOST_BITS_PER_LONG / 3 + 4];
>> > >         sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
>> > >         vec_safe_push (optimize_args, ggc_strdup (buffer));
>> > >       }
>> >
>> > This calculation is correct, assuming "value" is non-negative.  You can
>> > easily avoid all of that though:
>> 
>> Actually it is still correct even for negative values because we are
>> adding 4 rather than 3 to make sure there is enough room for the sign
>> character.
>
> Say your longs have only two bits, one sign and one value (so it is
> {-2, -1, 0, 1}).  There you get  char buffer[4]  although you need 5
> here if you care about negative numbers: '-', 'O', '-', '1', 0.
>
> But longs are at least 32 bits, of course.  And 14 chars is (just)
> enough, because you divide by only 3 now (instead of {}^2 log 10, or
> 3.32 as before), giving some leeway.
>
> (n/3+4 also fails for longs of sizes 5, 8, or 11 bits, but no more).

/* Bound on length of the string representing an unsigned integer
   value representable in B bits.  log10 (2.0) < 146/485.  The
   smallest value of B where this bound is not tight is 2621.  */
#define INT_BITS_STRLEN_BOUND(b) (((b) * 146 + 484) / 485)

/* Bound on length of the string representing an integer type or expression T.
   Subtract 1 for the sign bit if T is signed, and then add 1 more for
   a minus sign if needed.

   Because _GL_SIGNED_TYPE_OR_EXPR sometimes returns 1 when its argument is
   unsigned, this macro may overestimate the true bound by one byte when
   applied to unsigned types of size 2, 4, 16, ... bytes.  */
#define INT_STRLEN_BOUND(t)                                     \
  (INT_BITS_STRLEN_BOUND (TYPE_WIDTH (t) - _GL_SIGNED_TYPE_OR_EXPR (t)) \
   + _GL_SIGNED_TYPE_OR_EXPR (t))

Andreas.
diff mbox series

Patch

diff --git a/gcc/c-family/c-common.c b/gcc/c-family/c-common.c
index 20ec263..e7a54a5 100644
--- a/gcc/c-family/c-common.c
+++ b/gcc/c-family/c-common.c
@@ -5799,7 +5799,7 @@  parse_optimize_options (tree args, bool attr_p)
 
       if (TREE_CODE (value) == INTEGER_CST)
 	{
-	  char buffer[20];
+	  char buffer[HOST_BITS_PER_LONG / 3 + 4];
 	  sprintf (buffer, "-O%ld", (long) TREE_INT_CST_LOW (value));
 	  vec_safe_push (optimize_args, ggc_strdup (buffer));
 	}