diff mbox series

[committed] preprocessor: Fix cpp_avoid_paste for digit separators

Message ID alpine.DEB.2.22.394.2105111855010.21144@digraph.polyomino.org.uk
State New
Series [committed] preprocessor: Fix cpp_avoid_paste for digit separators

Commit Message

Joseph Myers May 11, 2021, 6:55 p.m. UTC
The libcpp function cpp_avoid_paste is used to insert whitespace in
preprocessed output where needed to prevent two consecutive
preprocessing tokens that logically (e.g. when stringized) do not
have whitespace between them from being incorrectly lexed as one when
the preprocessed output is reread by a compiler.
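
As a rough illustration of the general idea (not libcpp's actual code,
which switches on token types rather than spellings), the following
sketch prints a token sequence and adds a space only where two adjacent
tokens would otherwise be lexed differently.  The names would_paste and
print_tokens are hypothetical, and only '+' and '-' are modelled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: would printing b directly after a change how the
// text is lexed?  Only '+' and '-' are modelled in this sketch.
bool would_paste(const std::string& a, const std::string& b) {
    assert(!a.empty() && !b.empty());  // every token has a spelling
    const char c = b.front();
    if (a == "+") return c == '+' || c == '=';              // "++", "+="
    if (a == "-") return c == '-' || c == '=' || c == '>';  // "--", "-=", "->"
    return false;
}

// Print tokens back to back, adding a space only where would_paste asks.
std::string print_tokens(const std::vector<std::string>& toks) {
    std::string out;
    for (std::size_t i = 0; i < toks.size(); ++i) {
        if (i > 0 && would_paste(toks[i - 1], toks[i])) out += ' ';
        out += toks[i];
    }
    return out;
}
```

For the tokens +, +, x (as could arise when a macro expands to +),
print_tokens yields "+ +x" rather than "++x".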

This fails to allow for digit separators, meaning that invalid
code, that has a CPP_NUMBER (from a macro expansion) followed by a
character literal, can result in preprocessed output with a valid use
of digit separators, so that required syntax errors do not occur when
compiling with -save-temps.  Fix this by handling that case in
cpp_avoid_paste (as with other cases in cpp_avoid_paste, this doesn't
try to check whether the language version in use supports digit
separators; it's always OK to have unnecessary whitespace in
preprocessed output).
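
To make the failure concrete, here is a minimal C++14 sketch of the
case above, assuming the tokens left after expanding ZERO in ZERO'0'0
are a number, a character literal, and a number.  Kind, Token,
needs_space, spell and expanded_tokens are hypothetical names for
illustration only; they are not part of libcpp.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical token kinds, loosely named after CPP_NUMBER and CPP_CHAR.
enum class Kind { Number, Char };

struct Token {
    Kind kind;
    std::string spelling;
};

// With C++14 digit separators, 0'0'0 is one valid integer literal, so
// text spelled this way compiles even though the token sequence is invalid.
static_assert(0'0'0 == 0, "digit separators form a single literal");

// Simplified stand-in for the single case this patch adds: a number
// followed by a character literal needs a separating space.
bool needs_space(const Token& a, const Token& b, bool with_fix) {
    return with_fix && a.kind == Kind::Number && b.kind == Kind::Char;
}

// Spell the tokens as preprocessed output would, with or without the fix.
std::string spell(const std::vector<Token>& toks, bool with_fix) {
    std::string out;
    for (std::size_t i = 0; i < toks.size(); ++i) {
        assert(!toks[i].spelling.empty());  // every token has a spelling
        if (i > 0 && needs_space(toks[i - 1], toks[i], with_fix)) out += ' ';
        out += toks[i].spelling;
    }
    return out;
}

// Assumed token sequence after macro expansion of ZERO'0'0.
std::vector<Token> expanded_tokens() {
    return {{Kind::Number, "0"}, {Kind::Char, "'0'"}, {Kind::Number, "0"}};
}
```

Without the extra check, spell(expanded_tokens(), false) gives 0'0'0,
which rereads as a single number; with it, the result is 0 '0'0, which
keeps the tokens separate so the expected syntax error is still reported.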

Note: there are other cases, with various kinds of wide character or
string literal following a CPP_NUMBER, where spurious pasting of
preprocessing tokens can occur but the sequence of tokens remains
invalid both before and after that pasting.  Maybe cpp_avoid_paste
should also handle those cases (and similar cases after a CPP_NAME),
to ensure the sequence of preprocessing tokens in preprocessed output
is exactly right, whether or not it affects whether syntax errors
occur.  This patch only addresses the case with digit separators where
invalid code can fail to be diagnosed without the space inserted.

Bootstrapped with no regressions for x86_64-pc-linux-gnu.  Applied to 

	* lex.c (cpp_avoid_paste): Do not allow pasting CPP_NUMBER with
	CPP_CHAR.

	* g++.dg/cpp1y/digit-sep-paste.C, gcc.dg/c2x-digit-separators-3.c:
	New tests.


diff --git a/gcc/testsuite/g++.dg/cpp1y/digit-sep-paste.C b/gcc/testsuite/g++.dg/cpp1y/digit-sep-paste.C
new file mode 100644
index 00000000000..41fb967ef8d
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp1y/digit-sep-paste.C
@@ -0,0 +1,11 @@ 
+// Test token pasting with digit separators avoided for preprocessed output.
+// { dg-do compile { target c++14 } }
+// { dg-options "-save-temps" }
+
+#define ZERO 0
+
+int
+f ()
+{
+  return ZERO'0'0; /* { dg-error "expected" } */
+}
diff --git a/gcc/testsuite/gcc.dg/c2x-digit-separators-3.c b/gcc/testsuite/gcc.dg/c2x-digit-separators-3.c
new file mode 100644
index 00000000000..cddb88fa880
--- /dev/null
+++ b/gcc/testsuite/gcc.dg/c2x-digit-separators-3.c
@@ -0,0 +1,12 @@ 
+/* Test C2x digit separators.  Test token pasting avoided for preprocessed
+   output.  */
+/* { dg-do compile } */
+/* { dg-options "-std=c2x -save-temps" } */
+
+#define ZERO 0
+
+int
+f (void)
+{
+  return ZERO'0'0; /* { dg-error "expected" } */
+}
diff --git a/libcpp/lex.c b/libcpp/lex.c
index b7ce85a0331..36cd2e30630 100644
--- a/libcpp/lex.c
+++ b/libcpp/lex.c
@@ -3725,6 +3725,7 @@  cpp_avoid_paste (cpp_reader *pfile, const cpp_token *token1,
 				|| b == CPP_NAME
 				|| b == CPP_CHAR || b == CPP_STRING); /* L */
     case CPP_NUMBER:	return (b == CPP_NUMBER || b == CPP_NAME
+				|| b == CPP_CHAR
 				|| c == '.' || c == '+' || c == '-');
 				      /* UCNs */
     case CPP_OTHER:	return ((token1->val.str.text[0] == '\\'