diff mbox series

c-ppoutput: Fix preprocessing ICE on very large line number [PR99325]

Message ID 20210304090343.GP4020736@tucnak
State New
Headers show
Series c-ppoutput: Fix preprocessing ICE on very large line number [PR99325] | expand

Commit Message

Jakub Jelinek March 4, 2021, 9:03 a.m. UTC
Hi!

In libcpp, lines are represented as linenum_type, which is unsigned int.
The following testcases ICE because maybe_print_line_1 is sometimes called
with UNKNOWN_LOCATION (e.g. at pragma eof) and while most of the time
the
       && src_line >= print.src_line
       && src_line < print.src_line + 8
check doesn't succeed for the src_line of 0 from UNKNOWN_LOCATION, when
print.src_line is from very large line numbers (UINT_MAX - 7 and above)
it succeeds (with UB on the compiler side) but src_file is NULL for
UNKNOWN_LOCATION and so the strcmp call ICEs.
As print.src_line can easily wrap around, this patch changes its type
to unsigned int to match libcpp, so that we don't invoke UB in the compiler.
For print.src_line of UINT_MAX - 7 and above, src_line from UNKNOWN_LOCATION
will not pass that test anymore, but when it wraps around to 0, it can,
so I've also added a check for src_loc != UNKNOWN_LOCATION (or, if
preferred, could be src_file != NULL).
Besides fixing the ICE and UB in the compiler, I believe worst case the
patch will cause printing a few more line directives in the preprocessed
source around the wrapping from lines UINT_MAX - 7 to 0 (but less
around the wrapping from INT_MAX to INT_MAX + 1U), but I think those
are exceptional cases (sources with > 2billion lines are rare and
we warn or error on #line > INT_MAX).

Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?

2021-03-03  Jakub Jelinek  <jakub@redhat.com>

	PR c/99325
	* c-ppoutput.c (print): Change src_line type from int to unsigned.
	(token_streamer::stream) Likewise.
	(maybe_print_line_1): Likewise.  Don't strcmp src_file if src_loc is
	UNKNOWN_LOCATION.

	* gcc.dg/cpp/line11.c: New test.
	* gcc.dg/cpp/line12.c: New test.


	Jakub

Comments

Nathan Sidwell March 4, 2021, 12:55 p.m. UTC | #1
On 3/4/21 4:03 AM, Jakub Jelinek wrote:
> Hi!
> 
> In libcpp, lines are represented as linenum_type, which is unsigned int.
> The following testcases ICE because maybe_print_line_1 is sometimes called
> with UNKNOWN_LOCATION (e.g. at pragma eof) and while most of the time
> the
>         && src_line >= print.src_line
>         && src_line < print.src_line + 8
> check doesn't succeed for the src_line of 0 from UNKNOWN_LOCATION, when
> print.src_line is from very large line numbers (UINT_MAX - 7 and above)
> it succeeds (with UB on the compiler side) but src_file is NULL for
> UNKNOWN_LOCATION and so the strcmp call ICEs.
> As print.src_line can easily wrap around, this patch changes its type
> to unsigned int to match libcpp, so that we don't invoke UB in the compiler.
> For print.src_line of UINT_MAX - 7 and above, src_line from UNKNOWN_LOCATION
> will not pass that test anymore, but when it wraps around to 0, it can,
> so I've also added a check for src_loc != UNKNOWN_LOCATION (or, if
> preferred, could be src_file != NULL).
> Besides fixing the ICE and UB in the compiler, I believe worst case the
> patch will cause printing a few more line directives in the preprocessed
> source around the wrapping from lines UINT_MAX - 7 to 0 (but less
> around the wrapping from INT_MAX to INT_MAX + 1U), but I think those
> are exceptional cases (sources with > 2billion lines are rare and
> we warn or error on #line > INT_MAX).
> 
> Bootstrapped/regtested on x86_64-linux and i686-linux, ok for trunk?
> 
> 2021-03-03  Jakub Jelinek  <jakub@redhat.com>
> 
> 	PR c/99325
> 	* c-ppoutput.c (print): Change src_line type from int to unsigned.
> 	(token_streamer::stream) Likewise.
> 	(maybe_print_line_1): Likewise.  Don't strcmp src_file if src_loc is
> 	UNKNOWN_LOCATION.

ok.  thanks!
I think I noticed that signed/unsigned confusion when futzing around 
with modules, but didn't feel like diving in :)

nathan
diff mbox series

Patch

--- gcc/c-family/c-ppoutput.c.jj	2021-01-04 10:25:50.400102848 +0100
+++ gcc/c-family/c-ppoutput.c	2021-03-03 17:37:26.296185908 +0100
@@ -32,7 +32,7 @@  static struct
   FILE *outf;			/* Stream to write to.  */
   const cpp_token *prev;	/* Previous token.  */
   const cpp_token *source;	/* Source token for spacing.  */
-  int src_line;			/* Line number currently being written.  */
+  unsigned src_line;		/* Line number currently being written.  */
   bool printed;			/* True if something output at line.  */
   bool first_time;		/* pp_file_change hasn't been called yet.  */
   bool prev_was_system_token;	/* True if the previous token was a
@@ -213,7 +213,7 @@  token_streamer::stream (cpp_reader *pfil
   /* Subtle logic to output a space if and only if necessary.  */
   if (avoid_paste)
     {
-      int src_line = LOCATION_LINE (loc);
+      unsigned src_line = LOCATION_LINE (loc);
 
       if (print.source == NULL)
 	print.source = token;
@@ -237,7 +237,7 @@  token_streamer::stream (cpp_reader *pfil
     }
   else if (token->flags & PREV_WHITE)
     {
-      int src_line = LOCATION_LINE (loc);
+      unsigned src_line = LOCATION_LINE (loc);
 
       if (src_line != print.src_line
 	  && do_line_adjustments
@@ -437,7 +437,7 @@  static bool
 maybe_print_line_1 (location_t src_loc, FILE *stream)
 {
   bool emitted_line_marker = false;
-  int src_line = LOCATION_LINE (src_loc);
+  unsigned src_line = LOCATION_LINE (src_loc);
   const char *src_file = LOCATION_FILE (src_loc);
 
   /* End the previous line of text.  */
@@ -451,6 +451,7 @@  maybe_print_line_1 (location_t src_loc,
   if (!flag_no_line_commands
       && src_line >= print.src_line
       && src_line < print.src_line + 8
+      && src_loc != UNKNOWN_LOCATION
       && strcmp (src_file, print.src_file) == 0)
     {
       while (src_line > print.src_line)
--- gcc/testsuite/gcc.dg/cpp/line11.c.jj	2021-03-03 17:45:22.353036566 +0100
+++ gcc/testsuite/gcc.dg/cpp/line11.c	2021-03-03 17:47:27.021670988 +0100
@@ -0,0 +1,6 @@ 
+/* PR c/99325 */
+/* { dg-do preprocess } */
+/* { dg-options "-pedantic" } */
+
+#line 4294967295	/* { dg-warning "line number out of range" } */
+#pragma message "foo"
--- gcc/testsuite/gcc.dg/cpp/line12.c.jj	2021-03-03 17:45:24.690011266 +0100
+++ gcc/testsuite/gcc.dg/cpp/line12.c	2021-03-03 17:47:33.121604153 +0100
@@ -0,0 +1,6 @@ 
+/* PR c/99325 */
+/* { dg-do preprocess } */
+/* { dg-options "-pedantic" } */
+
+#line 9223372036854775807	/* { dg-warning "line number out of range" } */
+#pragma message "foo"