diff mbox

preprocessor/58580 - preprocessor goes OOM with warning for zero literals

Message ID 87zjpkrx8p.fsf@redhat.com
State New
Headers show

Commit Message

Dodji Seketeli Nov. 4, 2013, 3:40 p.m. UTC
Jakub Jelinek <jakub@redhat.com> writes:

[...]

> Eventually, I think using int for sizes is just a ticking bomb, what if
> somebody uses > 2GB long lines?  Surely, on 32-bit hosts we are unlikely to
> handle it, but why couldn't 64-bit host handle it?  Column info maybe bogus
> in there, sure, but at least we should not crash or overflow buffers on it
> ;).  Anyway, not something needed to be fixed right now, but in the future
> it would be nicer to use size_t and/or ssize_t here.

Yes.  I initially tried to use size_t but found that I'd need to modify
several other places to shutdown many warning because these places where
using int :-(.  So I felt that would be a battle for later.

But I am adding this to my TODO.  I'll send a patch later that
changes this to size_t then, and adjusts the other places that need it
as well.

[...]

>>    context->last_location = diagnostic->location;
>>    s = expand_location_to_spelling_point (diagnostic->location);
>> -  line = location_get_source_line (s);
>> +  line = location_get_source_line (s, line_width);
>
> I think richi didn't like C++ reference arguments to be used that way (and
> perhaps guidelines don't either), because it isn't immediately obvious
> that line_width is modified by the call.  Can you change it to a pointer
> argument instead and pass &line_width?

Sure.  I have done the change in the patch below.  Sorry for this
reflex.  I tend to use pointers like these only in places where we can
allow them to be NULL.

> XNEWVEC or XRESIZEVEC will never return NULL though, so it doesn't have
> to be tested.  Though, the question is if that is what we want, caret
> diagnostics should be optional, if we can't print it, we just won't.

Hmmh.  This particular bug was noticed because of the explicit OOM
message displayed by XNEWVEC/XRESIZEVEC; otherwise, I bet this could
have just felt through the crack for a little longer.  So I'd say let's
just use XNEWVEC/XRESIZEVEC and remove the test, as you first
suggested.  The caret diagnostics functionality as a whole can be
disabled with -fno-diagnostic-show-caret.


[...]

> Otherwise, LGTM.

Thanks.

So here is the patch that bootstraps.

gcc/ChangeLog:

	* input.h (location_get_source_line): Take an additional line_size
	parameter by reference.
	* input.c (get_line): New static function definition.
	(read_line): Take an additional line_length output parameter to be
	set to the size of the line.  Use the new get_line function to
	compute the size of the line returned by fgets, rather than using
	strlen.  Ensure that the buffer is initially zeroed; ensure that
	when growing the buffer too.
	(location_get_source_line): Take an additional output line_len
	parameter.  Update the use of read_line to pass it the line_len
	parameter.
	* diagnostic.c (adjust_line): Take an additional input parameter
	for the length of the line, rather than calculating it with
	strlen.
	(diagnostic_show_locus): Adjust the use of
	location_get_source_line and adjust_line with respect to their new
	signature.  While displaying a line now, do not stop at the first
	null byte.  Rather, display the zero byte as a space and keep
	going until we reach the size of the line.

gcc/testsuite/ChangeLog:

	* c-c++-common/cpp/warning-zero-in-literals-1.c: New test file.
---
 gcc/diagnostic.c                                   |  17 ++--
 gcc/input.c                                        | 100 ++++++++++++++++++---
 gcc/input.h                                        |   3 +-
 .../c-c++-common/cpp/warning-zero-in-literals-1.c  | Bin 0 -> 240 bytes
 4 files changed, 99 insertions(+), 21 deletions(-)
 create mode 100644 gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c

diff --git a/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c b/gcc/testsuite/c-c++-common/cpp/warning-zero-in-literals-1.c
new file mode 100644
index 0000000000000000000000000000000000000000..ff2ed962ac96e47ae05b0b040f4e10b8e09637e2
GIT binary patch
literal 240
zcmdPbSEyD<N!LxuS12e-Ehx%QPAx80sO92PVo*}h*HVDUmM0eFW#*+TDCL#r<R~O(
UBo-wmm!uXcDby-x=?^KT09Xk|)&Kwi

literal 0
HcmV?d00001

Comments

Bernd Edlinger Nov. 4, 2013, 11:43 p.m. UTC | #1
Hi,

On Mon, 4 Nov 2013 16:40:38, Dodji Seketeli wrote:
> +static ssize_t
> +get_line (char **lineptr, size_t *n, FILE *fp)
> +{
> + ssize_t cur_len = 0, len;
> + char buf[16384];
> +
> + if (lineptr == NULL || n == NULL)
> + return -1;
> +
> + if (*lineptr == NULL || *n == 0)
> + {
> + *n = 120;
> + *lineptr = XNEWVEC (char, *n);
> + }
> +
> + len = fread (buf, 1, sizeof buf, fp);
> + if (ferror (fp))
> + return -1;
> +
> + for (;;)
> + {
> + size_t needed;
> + char *t = (char*) memchr (buf, '\n', len);
> + if (t != NULL) len = (t - buf) + 1;
> + if (__builtin_expect (len>= SSIZE_MAX - cur_len, 0))
> + return -1;
> + needed = cur_len + len + 1;
> + if (needed> *n)
> + {
> + char *new_lineptr;
> + if (needed < 2 * *n)
> + needed = 2 * *n;
> + new_lineptr = XRESIZEVEC (char, *lineptr, needed);
> + *lineptr = new_lineptr;
> + *n = needed;
> + }
> + memcpy (*lineptr + cur_len, buf, len);
> + cur_len += len;
> + if (t != NULL)
> + break;
> + len = fread (buf, 1, sizeof buf, fp);
> + if (ferror (fp))
> + return -1;
> + if (len == 0)
> + break;
> + }
> + (*lineptr)[cur_len] = '\0';
> + return cur_len;
> +}
> +
> +/* Reads one line from FILE into a static buffer. LINE_LENGTH is set
> + by this function to the length of the returned line. Note that the
> + returned line can contain several zero bytes. */
> static const char *
> -read_line (FILE *file)
> +read_line (FILE *file, int *line_length)
> {
> static char *string;
> - static size_t string_len;
> + static size_t string_len, cur_len;
> size_t pos = 0;
> char *ptr;
>
> if (!string_len)
> {
> string_len = 200;
> - string = XNEWVEC (char, string_len);
> + string = XCNEWVEC (char, string_len);
> }
> + else
> + memset (string, 0, string_len);

Is this memset still necessary?

If the previous invocation of read_line already had read
some characters of the following line, how is that information
recovered? How is it detected if another file is to be read this time?

>
> - while ((ptr = fgets (string + pos, string_len - pos, file)))
> + ptr = string;
> + cur_len = string_len;
> + while (size_t len = get_line (&ptr, &cur_len, file))
> {
> - size_t len = strlen (string + pos);
> -
> - if (string[pos + len - 1] == '\n')
> + if (ptr[len - 1] == '\n')
> {
> - string[pos + len - 1] = 0;
> + ptr[len - 1] = 0;
> + *line_length = len;
> return string;
> }
> pos += len;
> string = XRESIZEVEC (char, string, string_len * 2);
> string_len *= 2;
> - }
> -
> + ptr = string + pos;

If "ptr" is passed to get_line it will try to reallocate it,
which must fail, right?

Maybe, this line of code is unreachable?

Who is responsible for reallocating "string" get_line or read_line?

> + cur_len = string_len - pos;
> + }
> +
> + *line_length = pos ? string_len : 0;
> return pos ? string : NULL;
> }
>
> /* Return the physical source line that corresponds to xloc in a
> buffer that is statically allocated. The newline is replaced by
> - the null character. */
> + the null character. Note that the line can contain several null
> + characters, so LINE_LEN contains the actual length of the line. */
>
> const char *
> -location_get_source_line (expanded_location xloc)
> +location_get_source_line (expanded_location xloc,
> + int& line_len)
> {
> const char *buffer;
> int lines = 1;
> @@ -132,7 +204,7 @@ location_get_source_line (expanded_location xloc)
> if (!stream)
> return NULL;
>
> - while ((buffer = read_line (stream)) && lines < xloc.line)
> + while ((buffer = read_line (stream, &line_len)) && lines < xloc.line)
> lines++;
>
> fclose (stream);


Regards
Bernd.
diff mbox

Patch

diff --git a/gcc/diagnostic.c b/gcc/diagnostic.c
index 36094a1..0ca7081 100644
--- a/gcc/diagnostic.c
+++ b/gcc/diagnostic.c
@@ -259,12 +259,13 @@  diagnostic_build_prefix (diagnostic_context *context,
    MAX_WIDTH by some margin, then adjust the start of the line such
    that the COLUMN is smaller than MAX_WIDTH minus the margin.  The
    margin is either 10 characters or the difference between the column
-   and the length of the line, whatever is smaller.  */
+   and the length of the line, whatever is smaller.  The length of
+   LINE is given by LINE_WIDTH.  */
 static const char *
-adjust_line (const char *line, int max_width, int *column_p)
+adjust_line (const char *line, int line_width,
+	     int max_width, int *column_p)
 {
   int right_margin = 10;
-  int line_width = strlen (line);
   int column = *column_p;
 
   right_margin = MIN (line_width - column, right_margin);
@@ -284,6 +285,7 @@  diagnostic_show_locus (diagnostic_context * context,
 		       const diagnostic_info *diagnostic)
 {
   const char *line;
+  int line_width;
   char *buffer;
   expanded_location s;
   int max_width;
@@ -297,22 +299,25 @@  diagnostic_show_locus (diagnostic_context * context,
 
   context->last_location = diagnostic->location;
   s = expand_location_to_spelling_point (diagnostic->location);
-  line = location_get_source_line (s);
+  line = location_get_source_line (s, line_width);
   if (line == NULL)
     return;
 
   max_width = context->caret_max_width;
-  line = adjust_line (line, max_width, &(s.column));
+  line = adjust_line (line, line_width, max_width, &(s.column));
 
   pp_newline (context->printer);
   saved_prefix = pp_get_prefix (context->printer);
   pp_set_prefix (context->printer, NULL);
   pp_space (context->printer);
-  while (max_width > 0 && *line != '\0')
+  while (max_width > 0 && line_width > 0)
     {
       char c = *line == '\t' ? ' ' : *line;
+      if (c == '\0')
+	c = ' ';
       pp_character (context->printer, c);
       max_width--;
+      line_width--;
       line++;
     }
   pp_newline (context->printer);
diff --git a/gcc/input.c b/gcc/input.c
index a141a92..2ee7882 100644
--- a/gcc/input.c
+++ b/gcc/input.c
@@ -87,44 +87,116 @@  expand_location_1 (source_location loc,
   return xloc;
 }
 
-/* Reads one line from file into a static buffer.  */
+/* This function reads a line that might contain bytes whose value is
+   zero.  It returns the number of bytes read.  Note that this
+   function has been adapted from getline() and _IO_getdelim() GNU C
+   library functions.  It's been duplicated here because the getline()
+   function is not necessarily present on all platforms.
+
+   LINEPTR points to a buffer that is to contain the line read.
+
+   N points to the size of the the LINEPTR buffer.
+
+   FP points to the file to consider.  */
+
+static ssize_t
+get_line (char **lineptr, size_t *n, FILE *fp)
+{
+  ssize_t cur_len = 0, len;
+  char buf[16384];
+
+  if (lineptr == NULL || n == NULL)
+    return -1;
+
+  if (*lineptr == NULL || *n == 0)
+    {
+      *n = 120;
+      *lineptr = XNEWVEC (char, *n);
+    }
+
+  len = fread (buf, 1, sizeof buf, fp);
+  if (ferror (fp))
+    return -1;
+
+  for (;;)
+    {
+      size_t needed;
+      char *t = (char*) memchr (buf, '\n', len);
+      if (t != NULL) len = (t - buf) + 1;
+      if (__builtin_expect (len >= SSIZE_MAX - cur_len, 0))
+	return -1;
+      needed = cur_len + len + 1;
+      if (needed > *n)
+	{
+	  char *new_lineptr;
+	  if (needed < 2 * *n)
+	    needed = 2 * *n;
+	  new_lineptr = XRESIZEVEC (char, *lineptr, needed);
+	  *lineptr = new_lineptr;
+	  *n = needed;
+	}
+      memcpy (*lineptr + cur_len, buf, len);
+      cur_len += len;
+      if (t != NULL)
+	break;
+      len = fread (buf, 1, sizeof buf, fp);
+      if (ferror (fp))
+	return -1;
+      if (len == 0)
+	break;
+    }
+  (*lineptr)[cur_len] = '\0';
+  return cur_len;
+}
+
+/* Reads one line from FILE into a static buffer.  LINE_LENGTH is set
+   by this function to the length of the returned line.  Note that the
+   returned line can contain several zero bytes.  */
 static const char *
-read_line (FILE *file)
+read_line (FILE *file, int *line_length)
 {
   static char *string;
-  static size_t string_len;
+  static size_t string_len, cur_len;
   size_t pos = 0;
   char *ptr;
 
   if (!string_len)
     {
       string_len = 200;
-      string = XNEWVEC (char, string_len);
+      string = XCNEWVEC (char, string_len);
     }
+  else
+    memset (string, 0, string_len);
 
-  while ((ptr = fgets (string + pos, string_len - pos, file)))
+  ptr = string;
+  cur_len = string_len;
+  while (size_t len = get_line (&ptr, &cur_len, file))
     {
-      size_t len = strlen (string + pos);
-
-      if (string[pos + len - 1] == '\n')
+      if (ptr[len - 1] == '\n')
 	{
-	  string[pos + len - 1] = 0;
+	  ptr[len - 1] = 0;
+	  *line_length = len;
 	  return string;
 	}
       pos += len;
       string = XRESIZEVEC (char, string, string_len * 2);
       string_len *= 2;
-    }
-      
+      ptr = string + pos;
+      cur_len = string_len - pos;
+     }
+
+  *line_length = pos ? string_len : 0;
   return pos ? string : NULL;
 }
 
 /* Return the physical source line that corresponds to xloc in a
    buffer that is statically allocated.  The newline is replaced by
-   the null character.  */
+   the null character.  Note that the line can contain several null
+   characters, so LINE_LEN contains the actual length of the line.  */
 
 const char *
-location_get_source_line (expanded_location xloc)
+location_get_source_line (expanded_location xloc,
+			  int& line_len)
 {
   const char *buffer;
   int lines = 1;
@@ -132,7 +204,7 @@  location_get_source_line (expanded_location xloc)
   if (!stream)
     return NULL;
 
-  while ((buffer = read_line (stream)) && lines < xloc.line)
+  while ((buffer = read_line (stream, &line_len)) && lines < xloc.line)
     lines++;
 
   fclose (stream);
diff --git a/gcc/input.h b/gcc/input.h
index 8fdc7b2..79b3a10 100644
--- a/gcc/input.h
+++ b/gcc/input.h
@@ -37,7 +37,8 @@  extern char builtins_location_check[(BUILTINS_LOCATION
 				     < RESERVED_LOCATION_COUNT) ? 1 : -1];
 
 extern expanded_location expand_location (source_location);
-extern const char *location_get_source_line (expanded_location xloc);
+extern const char *location_get_source_line (expanded_location xloc,
+					     int& line_size);
 extern expanded_location expand_location_to_spelling_point (source_location);
 extern source_location expansion_point_location_if_in_system_header (source_location);