Patchwork [C++] DR1473 - let literal operators be defined with empty user-defined string literal

Submitter Ed Smith-Rowland
Date June 26, 2013, 1:43 p.m.
Message ID <51CAF00D.4070805@verizon.net>
Permalink /patch/254740/
State New

Comments

Ed Smith-Rowland - June 26, 2013, 1:43 p.m.
On 06/25/2013 08:50 AM, Jason Merrill wrote:

I had missed a few files in my patch anyway (I was doing too much at once).

> On 06/25/2013 08:27 AM, Ed Smith-Rowland wrote:
>> +      else if (token->type == CPP_KEYWORD)
>> +    {
>> +      error ("unexpected keyword;"
>> +         " Remove space between quotes and suffix identifier");
>> +      return error_mark_node;
>> +    }
>
> Lower-case 'r' after a semicolon.
Done.
>
> After giving the error, let's try to handle it properly anyway to 
> avoid cascading errors.
>
>> +    if (TREE_STRING_LENGTH (string_tree) > 2)
>
> Why 2?  I would expect TREE_STRING_LENGTH for "" to be 1 (the NUL).
>
The string length is the two quotes (and, as you'll see, the encoding 
prefix length).
>> +        error ("expected empty string after %<operator%> keyword");
>> +        return error_mark_node;
>
> And let's continue after the error here, too.
>
>> +      error ("invalid encoding prefix in literal operator");
>>        return error_mark_node;
>
> And here.
In the patch I am sending you now, I don't return error_mark_node very 
often.  I go on, nonempty strings and bad encoding notwithstanding, and 
produce the correct operator ID, which gets returned.  Correct?
>
> Jason
>
I'm still testing and I might have to tweak the error lines and such.

OK in principle though?

Ed
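
The remark above, that the string length is the two quotes plus the encoding prefix, can be illustrated with a small standalone sketch. The helper name is hypothetical and is not part of GCC; the figures are plain character counts of the source spelling and are not taken from the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper, not part of GCC: length of the source spelling of an
// empty string literal, i.e. the encoding prefix plus the two quotes.
std::size_t empty_literal_spelling_length(const char *encoding_prefix)
{
  return std::strlen(encoding_prefix) + 2;
}

// Exercises the helper for the five standard encoding prefixes.
void check_empty_literal_lengths()
{
  assert(empty_literal_spelling_length("") == 2);    // ""
  assert(empty_literal_spelling_length("L") == 3);   // L""
  assert(empty_literal_spelling_length("u") == 3);   // u""
  assert(empty_literal_spelling_length("U") == 3);   // U""
  assert(empty_literal_spelling_length("u8") == 4);  // u8""
}
```

Under this assumption, a token whose spelling is longer than the prefix plus two quotes holds a non-empty string.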
libcpp:

2013-06-25  Ed Smith-Rowland  <3dw4rd@verizon.net>

	* lex.c: Constrain suffixes treated as concatenated literal and macro
	to just the patterns found in inttypes.h.


gcc/cp:

2013-06-25  Ed Smith-Rowland  <3dw4rd@verizon.net>

	* cp-tree.h (UDLIT_OP_ANSI_PREFIX): Remove space.
	* parser.c (cp_parser_operator()): Parse user-defined string
	literal as literal operator.


gcc/testsuite:

2013-06-25  Ed Smith-Rowland  <3dw4rd@verizon.net>

	* g++.dg/cpp0x/udlit-nospace-neg.C: Adjust.
	* g++.dg/cpp1y/udlit-enc-prefix-neg.C: New.
	* g++.dg/cpp1y/udlit-userdef-string.C: New.
	* g++.dg/cpp1y/complex_literals.h: New.
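
The libcpp ChangeLog entry above restricts the "macro touching a literal" treatment to suffixes that match the inttypes.h patterns. A minimal sketch of such a test follows, assuming "match" means "begins with PRI or SCN". The function name is hypothetical, and this is standalone C++ using std::strncmp, not the libcpp code.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper, not the libcpp code: true when the text following a
// string literal begins with "PRI" or "SCN", the prefixes used by the
// format macros in inttypes.h.
bool looks_like_inttypes_macro(const char *suffix)
{
  return std::strncmp(suffix, "PRI", 3) == 0
         || std::strncmp(suffix, "SCN", 3) == 0;
}

// Exercises the helper on macro names and on ordinary literal suffixes.
void check_suffix_classification()
{
  assert(looks_like_inttypes_macro("PRId64"));
  assert(looks_like_inttypes_macro("SCNu32"));
  assert(!looks_like_inttypes_macro("s"));
  assert(!looks_like_inttypes_macro("_abc"));
  assert(!looks_like_inttypes_macro("PR"));
}
```

A suffix that passes this test would be lexed as a separate token, with a warning. Any other identifier would be taken as a user-defined literal suffix.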
Jason Merrill - June 26, 2013, 9:01 p.m.
On 06/26/2013 09:43 AM, Ed Smith-Rowland wrote:
> +      if (bad_encoding_prefix)
> +	error ("invalid encoding prefix in literal operator");
> +      {
> +	tree string_tree = USERDEF_LITERAL_VALUE (token->u.value);

No need to open a nested block for a declaration now that we're 
compiling as C++.

Otherwise, OK.

Jason
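
For readers unfamiliar with the form under review, here is a small sketch of literal operators declared with the suffix touching the quotes, which is what DR 1473 permits. The suffixes _half and _twice are made up for this example and do not appear in the patch or its tests.

```cpp
#include <cassert>

// Literal operator written with the suffix touching the quotes, the form
// DR 1473 allows.  The suffix _half is made up for this example.
constexpr long double operator""_half(long double value)
{
  return value / 2.0L;
}

// Same form for an integer literal.  The suffix _twice is also made up.
constexpr unsigned long long operator""_twice(unsigned long long value)
{
  return value * 2ULL;
}

// Uses both operators through literals in ordinary expressions.
void check_literal_operators()
{
  assert(3.0_half == 1.5L);
  assert(21_twice == 42ULL);
}
```

Before this change, GCC rejected the no-space spelling with a "missing space" error, as the removed dg-error line in udlit-nospace-neg.C shows.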

Patch

Index: gcc/cp/cp-tree.h
===================================================================
--- gcc/cp/cp-tree.h	(revision 200414)
+++ gcc/cp/cp-tree.h	(working copy)
@@ -4404,7 +4404,7 @@ 
 #define LAMBDANAME_PREFIX "__lambda"
 #define LAMBDANAME_FORMAT LAMBDANAME_PREFIX "%d"
 
-#define UDLIT_OP_ANSI_PREFIX "operator\"\" "
+#define UDLIT_OP_ANSI_PREFIX "operator\"\""
 #define UDLIT_OP_ANSI_FORMAT UDLIT_OP_ANSI_PREFIX "%s"
 #define UDLIT_OP_MANGLED_PREFIX "li"
 #define UDLIT_OP_MANGLED_FORMAT UDLIT_OP_MANGLED_PREFIX "%s"
Index: libcpp/lex.c
===================================================================
--- libcpp/lex.c	(revision 200414)
+++ libcpp/lex.c	(working copy)
@@ -1556,22 +1556,21 @@ 
 
   if (CPP_OPTION (pfile, user_literals))
     {
-      /* According to C++11 [lex.ext]p10, a ud-suffix not starting with an
-	 underscore is ill-formed.  Since this breaks programs using macros
-	 from inttypes.h, we generate a warning and treat the ud-suffix as a
-	 separate preprocessing token.  This approach is under discussion by
-	 the standards committee, and has been adopted as a conforming
-	 extension by other front ends such as clang.
-         A special exception is made for the suffix 's' which will be
-	 standardized as a user-defined literal suffix for strings.  */
-      if (ISALPHA (*cur) && *cur != 's')
+      /* If a string format macro, say from inttypes.h, is placed touching
+	 a string literal it could be parsed as a C++11 user-defined string
+	 literal thus breaking the program.
+	 Since all format macros in inttypes.h start with "PRI" or "SCN"
+	 suffixes beginning with these will be interpreted as macros and the
+	 string and the macro parsed as separate tokens. A warning is issued. */
+      if (ustrcmp (cur, (const unsigned char *) "PRI") == 0
+       || ustrcmp (cur, (const unsigned char *) "SCN") == 0)
 	{
 	  /* Raise a warning, but do not consume subsequent tokens.  */
 	  if (CPP_OPTION (pfile, warn_literal_suffix))
 	    cpp_warning_with_line (pfile, CPP_W_LITERAL_SUFFIX,
 				   token->src_loc, 0,
 				   "invalid suffix on literal; C++11 requires "
-				   "a space between literal and identifier");
+				   "a space between literal and string macro");
 	}
       /* Grab user defined literal suffix.  */
       else if (ISIDST (*cur))
@@ -1689,22 +1688,21 @@ 
 
   if (CPP_OPTION (pfile, user_literals))
     {
-      /* According to C++11 [lex.ext]p10, a ud-suffix not starting with an
-	 underscore is ill-formed.  Since this breaks programs using macros
-	 from inttypes.h, we generate a warning and treat the ud-suffix as a
-	 separate preprocessing token.  This approach is under discussion by
-	 the standards committee, and has been adopted as a conforming
-	 extension by other front ends such as clang.
-         A special exception is made for the suffix 's' which will be
-	 standardized as a user-defined literal suffix for strings.  */
-      if (ISALPHA (*cur) && *cur != 's')
+      /* If a string format macro, say from inttypes.h, is placed touching
+	 a string literal it could be parsed as a C++11 user-defined string
+	 literal thus breaking the program.
+	 Since all format macros in inttypes.h start with "PRI" or "SCN"
+	 suffixes beginning with these will be interpreted as macros and the
+	 string and the macro parsed as separate tokens. A warning is issued. */
+      if (ustrcmp (cur, (const unsigned char *) "PRI") == 0
+       || ustrcmp (cur, (const unsigned char *) "SCN") == 0)
 	{
 	  /* Raise a warning, but do not consume subsequent tokens.  */
 	  if (CPP_OPTION (pfile, warn_literal_suffix))
 	    cpp_warning_with_line (pfile, CPP_W_LITERAL_SUFFIX,
 				   token->src_loc, 0,
 				   "invalid suffix on literal; C++11 requires "
-				   "a space between literal and identifier");
+				   "a space between literal and string macro");
 	}
       /* Grab user defined literal suffix.  */
       else if (ISIDST (*cur))
Index: gcc/cp/parser.c
===================================================================
--- gcc/cp/parser.c	(revision 200415)
+++ gcc/cp/parser.c	(working copy)
@@ -12370,6 +12370,8 @@ 
 {
   tree id = NULL_TREE;
   cp_token *token;
+  bool bad_encoding_prefix = false;
+  int string_len = 2;
 
   /* Peek at the next token.  */
   token = cp_lexer_peek_token (parser->lexer);
@@ -12569,10 +12571,20 @@ 
       cp_parser_require (parser, CPP_CLOSE_SQUARE, RT_CLOSE_SQUARE);
       return ansi_opname (ARRAY_REF);
 
+    case CPP_WSTRING:
+      string_len = 3;
+    case CPP_STRING16:
+    case CPP_STRING32:
+      string_len = 5;
+    case CPP_UTF8STRING:
+      string_len = 4;
+      bad_encoding_prefix = true;
     case CPP_STRING:
       if (cxx_dialect == cxx98)
 	maybe_warn_cpp0x (CPP0X_USER_DEFINED_LITERALS);
-      if (TREE_STRING_LENGTH (token->u.value) > 2)
+      if (bad_encoding_prefix)
+	error ("invalid encoding prefix in literal operator");
+      if (TREE_STRING_LENGTH (token->u.value) > string_len)
 	{
 	  error ("expected empty string after %<operator%> keyword");
 	  return error_mark_node;
@@ -12590,15 +12602,49 @@ 
 	      return cp_literal_operator_id (name);
 	    }
 	}
+      else if (token->type == CPP_KEYWORD)
+	{
+	  error ("unexpected keyword;"
+		 " remove space between quotes and suffix identifier");
+	  return error_mark_node;
+	}
       else
 	{
 	  error ("expected suffix identifier");
 	  return error_mark_node;
 	}
 
+    case CPP_WSTRING_USERDEF:
+      string_len = 3;
+    case CPP_STRING16_USERDEF:
+    case CPP_STRING32_USERDEF:
+      string_len = 5;
+    case CPP_UTF8STRING_USERDEF:
+      string_len = 4;
+      bad_encoding_prefix = true;
     case CPP_STRING_USERDEF:
-      error ("missing space between %<\"\"%> and suffix identifier");
-      return error_mark_node;
+      if (cxx_dialect == cxx98)
+	maybe_warn_cpp0x (CPP0X_USER_DEFINED_LITERALS);
+      if (bad_encoding_prefix)
+	error ("invalid encoding prefix in literal operator");
+      {
+	tree string_tree = USERDEF_LITERAL_VALUE (token->u.value);
+	if (TREE_STRING_LENGTH (string_tree) > string_len)
+	  {
+	    error ("expected empty string after %<operator%> keyword");
+	    return error_mark_node;
+	  }
+	id = USERDEF_LITERAL_SUFFIX_ID (token->u.value);
+	/* Consume the user-defined string literal.  */
+	cp_lexer_consume_token (parser->lexer);
+	if (id != error_mark_node)
+	  {
+	    const char *name = IDENTIFIER_POINTER (id);
+	    return cp_literal_operator_id (name);
+	  }
+	else
+	  return error_mark_node;
+      }
 
     default:
       /* Anything else is an error.  */
Index: gcc/testsuite/g++.dg/cpp0x/udlit-nospace-neg.C
===================================================================
--- gcc/testsuite/g++.dg/cpp0x/udlit-nospace-neg.C	(revision 200414)
+++ gcc/testsuite/g++.dg/cpp0x/udlit-nospace-neg.C	(working copy)
@@ -1,3 +1,5 @@ 
 // { dg-options "-std=c++0x" }
 
-float operator ""_abc(const char*); // { dg-error "missing space between|and suffix identifier" }
+float operator ""_abc(const char*);
+
+int operator""_def(long double);
Index: gcc/testsuite/g++.dg/cpp1y/udlit-enc-prefix-neg.C
===================================================================
--- gcc/testsuite/g++.dg/cpp1y/udlit-enc-prefix-neg.C	(revision 0)
+++ gcc/testsuite/g++.dg/cpp1y/udlit-enc-prefix-neg.C	(working copy)
@@ -0,0 +1,17 @@ 
+// { dg-options -std=c++1y }
+
+int
+operator L""Ls(unsigned long long) // { dg-error "invalid encoding prefix in literal operator" }
+{ return 0; }
+
+int
+operator u""s16(unsigned long long) // { dg-error "invalid encoding prefix in literal operator" }
+{ return 0; }
+
+int
+operator U""s32(unsigned long long) // { dg-error "invalid encoding prefix in literal operator" }
+{ return 0; }
+
+int
+operator u8""u8s(unsigned long long) // { dg-error "invalid encoding prefix in literal operator" }
+{ return 0; }
Index: gcc/testsuite/g++.dg/cpp1y/udlit-userdef-string.C
===================================================================
--- gcc/testsuite/g++.dg/cpp1y/udlit-userdef-string.C	(revision 0)
+++ gcc/testsuite/g++.dg/cpp1y/udlit-userdef-string.C	(working copy)
@@ -0,0 +1,7 @@ 
+// { dg-options -std=c++1y }
+
+#include "complex_literals.h"
+
+auto cx = 1.1if;
+
+auto cn = 123if;
Index: gcc/testsuite/g++.dg/cpp1y/complex_literals.h
===================================================================
--- gcc/testsuite/g++.dg/cpp1y/complex_literals.h	(revision 0)
+++ gcc/testsuite/g++.dg/cpp1y/complex_literals.h	(working copy)
@@ -0,0 +1,12 @@ 
+
+#include <complex>
+
+#pragma GCC system_header
+
+std::complex<float>
+operator""if(long double ximag)
+{ return std::complex<float>(0.0F, static_cast<float>(ximag)); }
+
+std::complex<float>
+operator""if(unsigned long long nimag)
+{ return std::complex<float>(0.0F, static_cast<float>(nimag)); }