c++: P2513R4, char8_t Compatibility and Portability Fix [PR106656]

Message ID 20220924011611.433106-1-polacek@redhat.com
State New

Commit Message

Marek Polacek Sept. 24, 2022, 1:16 a.m. UTC
P0482R6, which added char8_t, didn't allow

  const char arr[] = u8"howdy";

because it said "Declarations of arrays of char may currently be initialized
with UTF-8 string literals. Under this proposal, such initializations would
become ill-formed."  This caused too many issues, so P2513R4 alleviates some
of those compatibility problems.  In particular, "Arrays of char or unsigned
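char may now be initialized with a UTF-8 string literal."  The restriction
has been lifted for array initialization only; implicit conversions, such as
from a UTF-8 string literal to 'const char *', remain ill-formed.  Also, my
reading is that 'signed char' is excluded: an array of signed char still
cannot be initialized with a UTF-8 string literal.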
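
To illustrate the "initialization only" point: a call such as fn (u8"z")
stays an error, so code that needs a 'const char *' view of a UTF-8 literal
still has to ask for it explicitly.  A minimal sketch, assuming a made-up
consumer narrow_length (neither it nor length_of_utf8_literal is part of
the patch):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical consumer of narrow-character data (illustration only).
inline std::size_t narrow_length(const char *s) { return std::strlen(s); }

inline std::size_t length_of_utf8_literal() {
  // narrow_length(u8"z") is rejected in C++20 mode, with or without P2513R4:
  // there is no implicit conversion from const char8_t * to const char *.
  // static_cast is rejected as well; reinterpret_cast is accepted, and
  // reading the bytes through a char pointer is permitted.
  const char *p = reinterpret_cast<const char *>(u8"z");
  return narrow_length(p);
}

// Usage: the literal holds one code unit before the terminating NUL.
inline void check_length() { assert(length_of_utf8_literal() == 1); }
```

In pre-C++20 modes the cast is a no-op, because u8"z" already decays to
'const char *' there.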

This is supposed to be treated as a DR (defect report) against C++20, so it
applies in C++20 mode as well as in C++23.
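
As a sketch of how user code might key off the updated macro value (the
names banner, banner_bytes and check_banner are invented for this example
and do not appear in the patch): the guard uses the UTF-8 literal when
char8_t does not exist at all, or when __cpp_char8_t reports 202207L or
later, and falls back to an ordinary literal otherwise.

```cpp
#include <cassert>
#include <cstring>

// Use the UTF-8 literal when char8_t does not exist (pre-C++20 modes, where
// u8"" already has type array of const char) or when the compiler advertises
// P2513R4 through __cpp_char8_t >= 202207L.
#if !defined(__cpp_char8_t) || __cpp_char8_t >= 202207L
const char banner[] = u8"howdy";
const unsigned char banner_bytes[] = u8"howdy";
#else
// C++20 compiler without the DR: fall back to ordinary literals.
const char banner[] = "howdy";
const unsigned char banner_bytes[] = "howdy";
#endif

// Both arrays hold five code units plus the terminating NUL.
static_assert(sizeof banner == 6, "five code units plus NUL");
static_assert(sizeof banner_bytes == 6, "five code units plus NUL");

// Usage: the char and unsigned char arrays carry the same bytes.
inline void check_banner() {
  assert(std::memcmp(banner, banner_bytes, sizeof banner) == 0);
}
```

The fallback branch only yields the same bytes when the ordinary literal
encoding is ASCII-compatible, which is an assumption of this sketch.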

Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

	PR c++/106656

gcc/c-family/ChangeLog:

	* c-cppbuiltin.cc (c_cpp_builtins): Update value of __cpp_char8_t
	for C++20.

gcc/cp/ChangeLog:

	* typeck2.cc (array_string_literal_compatible_p): Allow
	initializing arrays of char or unsigned char by a UTF-8 string literal.

gcc/testsuite/ChangeLog:

	* g++.dg/cpp23/feat-cxx2b.C: Adjust.
	* g++.dg/cpp2a/feat-cxx2a.C: Likewise.
	* g++.dg/ext/char8_t-feature-test-macro-2.C: Likewise.
	* g++.dg/ext/char8_t-init-2.C: Likewise.
	* g++.dg/cpp2a/char8_t3.C: New test.
	* g++.dg/cpp2a/char8_t4.C: New test.
---
 gcc/c-family/c-cppbuiltin.cc                  |  2 +-
 gcc/cp/typeck2.cc                             |  9 +++++
 gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C       |  4 +-
 gcc/testsuite/g++.dg/cpp2a/char8_t3.C         | 37 +++++++++++++++++++
 gcc/testsuite/g++.dg/cpp2a/char8_t4.C         | 17 +++++++++
 gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C       |  4 +-
 .../g++.dg/ext/char8_t-feature-test-macro-2.C |  4 +-
 gcc/testsuite/g++.dg/ext/char8_t-init-2.C     |  4 +-
 8 files changed, 72 insertions(+), 9 deletions(-)
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/char8_t3.C
 create mode 100644 gcc/testsuite/g++.dg/cpp2a/char8_t4.C


base-commit: f5072839c46acd185f40a5692aca06fac4ed6a48
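
The new char8_t4.C test covers a side effect of the change: f({u8""})
becomes ambiguous when one overload takes a struct holding a char8_t array
and another takes a struct holding a char array.  A rough sketch of how a
caller could disambiguate by naming the aggregate type; the u8_unit alias
and the call_with_* helpers are hypothetical and exist only so the example
also builds where char8_t is unavailable:

```cpp
#include <cassert>

// Hypothetical alias so the sketch also builds without char8_t.
#if defined(__cpp_char8_t)
using u8_unit = char8_t;
#else
using u8_unit = char;
#endif

struct A { u8_unit s[10]; };
struct B { char s[10]; };

inline int f(A) { return 1; }
inline int f(B) { return 2; }

// f({u8""}) is ambiguous once P2513R4 is implemented, because u8"" can then
// initialize both A::s and B::s.  Naming the type selects one overload.
inline int call_with_a() { return f(A{u8""}); }
inline int call_with_b() { return f(B{""}); }

// Usage: each call reaches the overload named at the call site.
inline void check_overloads() {
  assert(call_with_a() == 1);
  assert(call_with_b() == 2);
}
```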

Comments

Jason Merrill Sept. 26, 2022, 5:18 p.m. UTC | #1
On 9/23/22 21:16, Marek Polacek wrote:
> P0482R6, which added char8_t, didn't allow
> 
>    const char arr[] = u8"howdy";
> 
> because it said "Declarations of arrays of char may currently be initialized
> with UTF-8 string literals. Under this proposal, such initializations would
> become ill-formed."  This caused too many issues, so P2513R4 alleviates some
> of those compatibility problems.  In particular, "Arrays of char or unsigned
> char may now be initialized with a UTF-8 string literal."  This restriction
> has been lifted for initialization only, not implicit conversions.  Also,
> my reading is that 'signed char' was excluded from the allowable conversions.
> 
> This is supposed to be treated as a DR in C++20.
> 
> Bootstrapped/regtested on x86_64-pc-linux-gnu, ok for trunk?

OK.

Patch

diff --git a/gcc/c-family/c-cppbuiltin.cc b/gcc/c-family/c-cppbuiltin.cc
index a1557eb23d5..b709f845c81 100644
--- a/gcc/c-family/c-cppbuiltin.cc
+++ b/gcc/c-family/c-cppbuiltin.cc
@@ -1112,7 +1112,7 @@  c_cpp_builtins (cpp_reader *pfile)
       if (flag_threadsafe_statics)
 	cpp_define (pfile, "__cpp_threadsafe_static_init=200806L");
       if (flag_char8_t)
-        cpp_define (pfile, "__cpp_char8_t=201811L");
+	cpp_define (pfile, "__cpp_char8_t=202207L");
 #ifndef THREAD_MODEL_SPEC
       /* Targets that define THREAD_MODEL_SPEC need to define
 	 __STDCPP_THREADS__ in their config/XXX/XXX-c.c themselves.  */
diff --git a/gcc/cp/typeck2.cc b/gcc/cp/typeck2.cc
index 75fd0e2a9bf..739097a9734 100644
--- a/gcc/cp/typeck2.cc
+++ b/gcc/cp/typeck2.cc
@@ -1118,6 +1118,15 @@  array_string_literal_compatible_p (tree type, tree init)
   if (ordinary_char_type_p (to_char_type)
       && ordinary_char_type_p (from_char_type))
     return true;
+
+  /* P2513 (C++20/C++23): "an array of char or unsigned char may
+     be initialized by a UTF-8 string literal, or by such a string
+     literal enclosed in braces."  */
+  if (from_char_type == char8_type_node
+      && (to_char_type == char_type_node
+	  || to_char_type == unsigned_char_type_node))
+    return true;
+
   return false;
 }
 
diff --git a/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C b/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
index d3e40724085..0537e1d24b5 100644
--- a/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
+++ b/gcc/testsuite/g++.dg/cpp23/feat-cxx2b.C
@@ -504,8 +504,8 @@ 
 
 #ifndef __cpp_char8_t
 #  error "__cpp_char8_t"
-#elif __cpp_char8_t != 201811
-#  error "__cpp_char8_t != 201811"
+#elif __cpp_char8_t != 202207
+#  error "__cpp_char8_t != 202207"
 #endif
 
 #ifndef __cpp_designated_initializers
diff --git a/gcc/testsuite/g++.dg/cpp2a/char8_t3.C b/gcc/testsuite/g++.dg/cpp2a/char8_t3.C
new file mode 100644
index 00000000000..071a718c4d0
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/char8_t3.C
@@ -0,0 +1,37 @@ 
+// PR c++/106656 - P2513 - char8_t Compatibility and Portability Fixes
+// { dg-do compile { target c++20 } }
+
+const char *p1 = u8""; // { dg-error "invalid conversion" }
+const unsigned char *p2 = u8""; // { dg-error "invalid conversion" }
+const signed char *p3 = u8""; // { dg-error "invalid conversion" }
+const char *p4 = { u8"" }; // { dg-error "invalid conversion" }
+const unsigned char *p5 = { u8"" }; // { dg-error "invalid conversion" }
+const signed char *p6 = { u8"" }; // { dg-error "invalid conversion" }
+const char *p7 = static_cast<const char *>(u8""); // { dg-error "invalid" }
+const char a1[] = u8"text";
+const unsigned char a2[] = u8"";
+const signed char a3[] = u8""; // { dg-error "cannot initialize array" }
+const char a4[] = { u8"text" };
+const unsigned char a5[] = { u8"" };
+const signed char a6[] = { u8"" }; // { dg-error "cannot initialize array" }
+
+const char *
+resource_id ()
+{
+  static const char res_id[] = u8"";
+  return res_id;
+}
+
+const char8_t x[] = "fail"; // { dg-error "cannot initialize array" }
+
+void fn (const char a[]);
+void
+g ()
+{
+  fn (u8"z"); // { dg-error "invalid conversion" }
+}
+
+char c = u8'c';
+unsigned char uc = u8'c';
+signed char sc = u8'c';
+char8_t c8 = 'c';
diff --git a/gcc/testsuite/g++.dg/cpp2a/char8_t4.C b/gcc/testsuite/g++.dg/cpp2a/char8_t4.C
new file mode 100644
index 00000000000..c18081b66fb
--- /dev/null
+++ b/gcc/testsuite/g++.dg/cpp2a/char8_t4.C
@@ -0,0 +1,17 @@ 
+// PR c++/106656 - P2513 - char8_t Compatibility and Portability Fixes
+// { dg-do compile { target c++20 } }
+// [diff.cpp20.dcl]
+
+struct A {
+	char8_t s[10];
+};
+struct B {
+	char s[10];
+};
+
+void f(A);
+void f(B);
+
+int main() {
+	f({u8""}); // { dg-error "ambiguous" }
+}
diff --git a/gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C b/gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C
index c65ea6bf48a..02f3a377fd0 100644
--- a/gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C
+++ b/gcc/testsuite/g++.dg/cpp2a/feat-cxx2a.C
@@ -504,8 +504,8 @@ 
 
 #ifndef __cpp_char8_t
 #  error "__cpp_char8_t"
-#elif __cpp_char8_t != 201811
-#  error "__cpp_char8_t != 201811"
+#elif __cpp_char8_t != 202207
+#  error "__cpp_char8_t != 202207"
 #endif
 
 #ifndef __cpp_designated_initializers
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-feature-test-macro-2.C b/gcc/testsuite/g++.dg/ext/char8_t-feature-test-macro-2.C
index df1063f6aa1..2d0f9045acf 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-feature-test-macro-2.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-feature-test-macro-2.C
@@ -5,6 +5,6 @@ 
 
 #if !defined(__cpp_char8_t)
 #  error __cpp_char8_t is not defined!
-#elif __cpp_char8_t != 201811
-#  error __cpp_char8_t != 201811
+#elif __cpp_char8_t != 202207
+#  error __cpp_char8_t != 202207
 #endif
diff --git a/gcc/testsuite/g++.dg/ext/char8_t-init-2.C b/gcc/testsuite/g++.dg/ext/char8_t-init-2.C
index c713bc12266..02a96ffe5a4 100644
--- a/gcc/testsuite/g++.dg/ext/char8_t-init-2.C
+++ b/gcc/testsuite/g++.dg/ext/char8_t-init-2.C
@@ -21,7 +21,7 @@  const char8_t (&rca4)[2] = u8"x";
 const char8_t (&rca5)[2] = u"x"; // { dg-error "invalid initialization of reference of type .const char8_t ....... from expression of type .const char16_t ...." "char8_t" }
 
 char ca1[] = "x";
-char ca2[] = u8"x"; // { dg-error "from a string literal with type array of .char8_t." "char8_t" }
+char ca2[] = u8"x";
 char8_t ca3[] = "x"; // { dg-error "from a string literal with type array of .char." "char8_t" }
 char8_t ca4[] = u8"x";
 char8_t ca5[] = u"x"; // { dg-error "from a string literal with type array of .char16_t." "char8_t" }
@@ -30,4 +30,4 @@  signed char sca1[] = "x";
 signed char sca2[] = u8"x"; // { dg-error "from a string literal with type array of .char8_t." "char8_t" }
 
 unsigned char uca1[] = "x";
-unsigned char uca2[] = u8"x"; // { dg-error "from a string literal with type array of .char8_t." "char8_t" }
+unsigned char uca2[] = u8"x";
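
For readers without a GCC tree at hand, the decision made by the typeck2.cc
hunk can be modelled outside the compiler.  This is only a simplified
sketch: CharKind, is_ordinary_char and array_literal_compatible are
hypothetical stand-ins rather than GCC API, and both the identical-type
shortcut and the exact set accepted by ordinary_char_type_p are assumptions
about code not shown in the hunk.

```cpp
#include <cassert>

// Hypothetical stand-ins for GCC's character type nodes; not GCC API.
enum class CharKind { Char, SignedChar, UnsignedChar, Char8, Char16, Char32, WChar };

// Assumed to mirror ordinary_char_type_p: char, signed char, unsigned char.
constexpr bool is_ordinary_char(CharKind k) {
  return k == CharKind::Char || k == CharKind::SignedChar
         || k == CharKind::UnsignedChar;
}

// May an array of `to` be initialized by a string literal whose element
// type is `from`?
constexpr bool array_literal_compatible(CharKind to, CharKind from) {
  return to == from  // assumed: identical element types always match
         || (is_ordinary_char(to) && is_ordinary_char(from))
         || (from == CharKind::Char8  // the P2513R4 case added by the patch
             && (to == CharKind::Char || to == CharKind::UnsignedChar));
}

// Usage: the cases exercised by char8_t-init-2.C after the patch.
inline void check_patch_cases() {
  assert(array_literal_compatible(CharKind::Char, CharKind::Char8));          // ca2
  assert(array_literal_compatible(CharKind::UnsignedChar, CharKind::Char8));  // uca2
  assert(!array_literal_compatible(CharKind::SignedChar, CharKind::Char8));   // sca2
  assert(!array_literal_compatible(CharKind::Char8, CharKind::Char));         // ca3
}
```

The asymmetry is deliberate: a UTF-8 literal may fill a char array, but an
ordinary literal still may not fill a char8_t array.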