Message ID | 20211107001723.2528944-1-wjwray@gmail.com |
---|---|
State | New |
Headers | show |
Series | c++: designated init of char array by string constant [PR55227] | expand |
The fixes test out, as does the FIXME that's fixed based on the fixes... Note that the bug causes bogus rejection of any designated initialization of char array from a string literal, except for the singular case where the string literal initializer size exactly matches the target char array size and is not enclosed in optional braces: typedef struct C { char id[4]; } C; C a = {.id = "abc"}; // g++ accepts iff sizeof(C::c) == sizeof("abc") C b = {.id = {"abc"}}; // g++ rejects valid (gcc accepts) C c = {.id = "a"}; // g++ rejects valid (gcc accepts) I'd expect this to be common in C code bases, so the bug would be hit in any attempt to compile with g++. From the bugzilla comments, it seems that the following 'workaround' is being used: C d = {{.id = "a"}}; // g++ accepts invalid (gcc rejects) which 'works' in this case but is completely borked, consider: struct name {char first[32], second[32], third[32];}; name DMR {{.first = "Dennis"}, {.third = "Ritchie"}}; Only g++ accepts, ignores the designators, interprets as positional, and generates correspondingly invalid output: DMR: .string "Dennis" .zero 25 .string "Ritchie" .zero 24 .zero 32
Hi, thanks for the patch and sorry for the delay in reviewing. On Sat, Nov 06, 2021 at 08:17:23PM -0400, Will Wray via Gcc-patches wrote: > This patch aims to fix https://gcc.gnu.org/bugzilla/show_bug.cgi?id=55227. > > There are two underlying bugs in the designated initialization of char array > fields by string literals that cause: > > (1) Rejection of valid cases with: > (a) brace-enclosed string literal initializer (of any valid size), or > (b) unbraced string literal shorter than the target char array field. > > (2) Acceptance of invalid cases with designators appearing within the braces > of a braced string literal, in which case the bogus 'designator' was being > entirely ignored and the string literal treated as a positional initializer. I also noticed the C++ FE rejects struct A { char x[4]; }; struct B { struct A a; }; struct B b = { .a.x = "abc" }; but the C FE accepts it. But that's for another time. > Please review these changes carefully; there are likely errors of omission, > logic or an anon anomaly. > > The fixes above allow to address a FIXME in cp_complete_array_type: > > /* FIXME: this code is duplicated from reshape_init. > Probably we should just call reshape_init here? */ > > I believe that this was obstructed by the designator bugs (see comment here > https://patchwork.ozlabs.org/project/gcc/list/?series=199783) > > Boostraps/regtests on x86_64-pc-linux-gnu. > > PR c++/55227 > > gcc/cp/ChangeLog: > > * decl.c (reshape_init_r): restrict has_designator_check, I think something like "Only call has_designator_check when first_initializer_p or for the inner constructor element." would be better. > (cp_complete_array_type): do reshape_init on braced-init-list. s/do/Do/ > gcc/testsuite/ChangeLog: > > * g++.dg/cpp2a/desig20.C: New test. > --- > gcc/cp/decl.c | 28 ++++++---------- > gcc/testsuite/g++.dg/cpp2a/desig20.C | 48 ++++++++++++++++++++++++++++ > 2 files changed, 57 insertions(+), 19 deletions(-) > create mode 100644 gcc/testsuite/g++.dg/cpp2a/desig20.C > > diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c > index 947bbfc6637..f01655c5c14 100644 > --- a/gcc/cp/decl.c > +++ b/gcc/cp/decl.c > @@ -6820,6 +6820,7 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, > { > tree str_init = init; > tree stripped_str_init = stripped_init; > + reshape_iter stripd = {}; Since the previous variables spell it "stripped" maybe call it stripped_iter. > /* Strip one level of braces if and only if they enclose a single > element (as allowed by [dcl.init.string]). */ > @@ -6827,7 +6828,8 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, > && TREE_CODE (stripped_str_init) == CONSTRUCTOR > && CONSTRUCTOR_NELTS (stripped_str_init) == 1) > { > - str_init = (*CONSTRUCTOR_ELTS (stripped_str_init))[0].value; > + stripd.cur = CONSTRUCTOR_ELT (stripped_str_init, 0); > + str_init = stripd.cur->value; > stripped_str_init = tree_strip_any_location_wrapper (str_init); > } > > @@ -6836,7 +6838,8 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, > array types (one value per array element). */ > if (TREE_CODE (stripped_str_init) == STRING_CST) > { > - if (has_designator_problem (d, complain)) So the logic here is that... > + if ((first_initializer_p && has_designator_problem (d, complain)) this will complain about a designator in the outermost { }: char arr[] = { [0] = "foo" }; and... > + || (stripd.cur && has_designator_problem (&stripd, complain))) this checks the { } which is at least one level deep, contains a STRING_CST, and this line makes sure that we don't have a designator that would refer to some element of the array we're initializing with the STRING_CST, yes? I.e., something like struct S { char arr[10]; }; S s = { { [2] = "lol" } }; > return error_mark_node; > d->cur++; > return str_init; > @@ -9538,23 +9541,10 @@ cp_complete_array_type (tree *ptype, tree initial_value, bool do_default) > > if (initial_value) > { > - /* An array of character type can be initialized from a > - brace-enclosed string constant. Nice to finally remove this, but let's keep this part of the comment. > - FIXME: this code is duplicated from reshape_init. Probably > - we should just call reshape_init here? */ > - if (char_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (*ptype))) > - && TREE_CODE (initial_value) == CONSTRUCTOR > - && !vec_safe_is_empty (CONSTRUCTOR_ELTS (initial_value))) > - { > - vec<constructor_elt, va_gc> *v = CONSTRUCTOR_ELTS (initial_value); > - tree value = (*v)[0].value; > - STRIP_ANY_LOCATION_WRAPPER (value); > - > - if (TREE_CODE (value) == STRING_CST > - && v->length () == 1) > - initial_value = value; > - } > + if (TREE_CODE (initial_value) == CONSTRUCTOR > + && BRACE_ENCLOSED_INITIALIZER_P (initial_value)) BRACE_ENCLOSED_INITIALIZER_P checks that it gets a CONSTRUCTOR so you can remove the first check. > + initial_value = reshape_init (*ptype, initial_value, > + tf_warning_or_error); > > /* If any of the elements are parameter packs, we can't actually > complete this type now because the array size is dependent. */ > diff --git a/gcc/testsuite/g++.dg/cpp2a/desig20.C b/gcc/testsuite/g++.dg/cpp2a/desig20.C > new file mode 100644 > index 00000000000..03eab764325 > --- /dev/null > +++ b/gcc/testsuite/g++.dg/cpp2a/desig20.C > @@ -0,0 +1,48 @@ > +// PR c++/55227 > +// Test designated initializer for char array by string constant > + > +// { dg-do compile } > +// { dg-options "-pedantic" } FWIW, if you remove the dg-options, -pedantic-errors will be used so you could drop it and then use dg-error instead of dg-warning below but this is OK too. > + > +struct C {char a[2];}; > + > +/* Case a; designated, unbraced, string-literal of the same size as the target > + char array to be initialized; valid and accepted before and after. */ > + > +C a = {.a="a"}; // { dg-warning "designated initializers only available with" "" { target c++17_down } .-0 } .-0 looks weird and you don't need it, just drop it. We should probably test more: - nested structs - anonymous unions - test when the initializer is too long - multidim arrays: char a[][10] = { { "aaa" } }; char a2[][10] = { [0] = { "aaa" } }; They all appear to work, so it's just about extending the test. Hope this is useful... Marek
Thanks for the review Marek; I'll post the updated patch in a follow-on message and on bugzilla. On Mon, Nov 15, 2021 at 8:03 PM Marek Polacek <polacek@redhat.com> wrote: > I also noticed the C++ FE rejects > > struct A { char x[4]; }; > struct B { struct A a; }; > struct B b = { .a.x = "abc" }; > but the C FE accepts it. But that's for another time. Yes, the nested case is invalid for C++, valid for C. c.f. cppreference aggregate init. > > + reshape_iter stripd = {}; > > Since the previous variables spell it "stripped" maybe call it stripped_iter. I've left it as "stripd"; the top level reshape_iter is just "d", non-verbose, so "stripped_d" inappropriately over-verbose. > > @@ -6836,7 +6838,8 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, > > array types (one value per array element). */ > > if (TREE_CODE (stripped_str_init) == STRING_CST) > > { > > - if (has_designator_problem (d, complain)) > > So the logic here is that... Yes, you get the logic exactly... took me a few rounds to get it. > Nice to finally remove this, but let's keep this part of the comment. Agreed, and reinstated. > BRACE_ENCLOSED_INITIALIZER_P checks that it gets a CONSTRUCTOR so you > can remove the first check. Nice, thanks; missed that. > > +// { dg-do compile } > > +// { dg-options "-pedantic" } > > FWIW, if you remove the dg-options, -pedantic-errors will be used so you could > drop it and then use dg-error instead of dg-warning below but this is OK too. I'd copied that from another desigN.C test, now I've copied the simpler: +// { dg-options "" } and removed all of the noisy dg-warning tests > We should probably test more: > - nested structs > - anonymous unions > - test when the initializer is too long > - multidim arrays: Cut-n-paste'd your multidim array test, and added a couple more > Hope this is useful... Very useful, thanks again
V2 Patch https://gcc.gnu.org/bugzilla/attachment.cgi?id=51828 On Wed, Nov 17, 2021 at 10:06 PM will wray <wjwray@gmail.com> wrote: > > Thanks for the review Marek; > I'll post the updated patch in a follow-on message and on bugzilla. > > On Mon, Nov 15, 2021 at 8:03 PM Marek Polacek <polacek@redhat.com> wrote: > > > I also noticed the C++ FE rejects > > > > struct A { char x[4]; }; > > struct B { struct A a; }; > > struct B b = { .a.x = "abc" }; > > but the C FE accepts it. But that's for another time. > > Yes, the nested case is invalid for C++, valid for C. > c.f. cppreference aggregate init. > > > > + reshape_iter stripd = {}; > > > > Since the previous variables spell it "stripped" maybe call it stripped_iter. > > I've left it as "stripd"; the top level reshape_iter is just "d", non-verbose, > so "stripped_d" inappropriately over-verbose. > > > > @@ -6836,7 +6838,8 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, > > > array types (one value per array element). */ > > > if (TREE_CODE (stripped_str_init) == STRING_CST) > > > { > > > - if (has_designator_problem (d, complain)) > > > > So the logic here is that... > > Yes, you get the logic exactly... took me a few rounds to get it. > > > Nice to finally remove this, but let's keep this part of the comment. > > Agreed, and reinstated. > > > BRACE_ENCLOSED_INITIALIZER_P checks that it gets a CONSTRUCTOR so you > > can remove the first check. > > Nice, thanks; missed that. > > > > +// { dg-do compile } > > > +// { dg-options "-pedantic" } > > > > FWIW, if you remove the dg-options, -pedantic-errors will be used so you could > > drop it and then use dg-error instead of dg-warning below but this is OK too. > > I'd copied that from another desigN.C test, now I've copied the simpler: > > +// { dg-options "" } > > and removed all of the noisy dg-warning tests > > > We should probably test more: > > - nested structs > > - anonymous unions > > - test when the initializer is too long > > - multidim arrays: > > Cut-n-paste'd your multidim array test, and added a couple more > > > Hope this is useful... > > Very useful, thanks again
On Wed, Nov 17, 2021 at 10:23:58PM -0500, will wray wrote: > V2 Patch > https://gcc.gnu.org/bugzilla/attachment.cgi?id=51828 Can you please post the v2 here on the mailing list? It will be easier for us to reply. Preferably with the subject adjusted to say [PATCH v2] ... > On Wed, Nov 17, 2021 at 10:06 PM will wray <wjwray@gmail.com> wrote: > > > > Thanks for the review Marek; > > I'll post the updated patch in a follow-on message and on bugzilla. > > > > On Mon, Nov 15, 2021 at 8:03 PM Marek Polacek <polacek@redhat.com> wrote: > > > > > I also noticed the C++ FE rejects > > > > > > struct A { char x[4]; }; > > > struct B { struct A a; }; > > > struct B b = { .a.x = "abc" }; > > > but the C FE accepts it. But that's for another time. > > > > Yes, the nested case is invalid for C++, valid for C. > > c.f. cppreference aggregate init. > > > > > > + reshape_iter stripd = {}; > > > > > > Since the previous variables spell it "stripped" maybe call it stripped_iter. > > > > I've left it as "stripd"; the top level reshape_iter is just "d", non-verbose, > > so "stripped_d" inappropriately over-verbose. > > > > > > @@ -6836,7 +6838,8 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, > > > > array types (one value per array element). */ > > > > if (TREE_CODE (stripped_str_init) == STRING_CST) > > > > { > > > > - if (has_designator_problem (d, complain)) > > > > > > So the logic here is that... > > > > Yes, you get the logic exactly... took me a few rounds to get it. > > > > > Nice to finally remove this, but let's keep this part of the comment. > > > > Agreed, and reinstated. > > > > > BRACE_ENCLOSED_INITIALIZER_P checks that it gets a CONSTRUCTOR so you > > > can remove the first check. > > > > Nice, thanks; missed that. > > > > > > +// { dg-do compile } > > > > +// { dg-options "-pedantic" } > > > > > > FWIW, if you remove the dg-options, -pedantic-errors will be used so you could > > > drop it and then use dg-error instead of dg-warning below but this is OK too. > > > > I'd copied that from another desigN.C test, now I've copied the simpler: > > > > +// { dg-options "" } > > > > and removed all of the noisy dg-warning tests > > > > > We should probably test more: > > > - nested structs > > > - anonymous unions > > > - test when the initializer is too long > > > - multidim arrays: > > > > Cut-n-paste'd your multidim array test, and added a couple more > > > > > Hope this is useful... > > > > Very useful, thanks again > Marek
V2 Patch mailing list post https://gcc.gnu.org/pipermail/gcc-patches/2021-November/584897.html On Thu, Nov 18, 2021 at 10:36 AM Marek Polacek <polacek@redhat.com> wrote: > > On Wed, Nov 17, 2021 at 10:23:58PM -0500, will wray wrote: > > V2 Patch > > https://gcc.gnu.org/bugzilla/attachment.cgi?id=51828 > > Can you please post the v2 here on the mailing list? It will be easier > for us to reply. Preferably with the subject adjusted to say [PATCH v2] ... > > > On Wed, Nov 17, 2021 at 10:06 PM will wray <wjwray@gmail.com> wrote: > > > > > > Thanks for the review Marek; > > > I'll post the updated patch in a follow-on message and on bugzilla. > > > > > > On Mon, Nov 15, 2021 at 8:03 PM Marek Polacek <polacek@redhat.com> wrote: > > > > > > > I also noticed the C++ FE rejects > > > > > > > > struct A { char x[4]; }; > > > > struct B { struct A a; }; > > > > struct B b = { .a.x = "abc" }; > > > > but the C FE accepts it. But that's for another time. > > > > > > Yes, the nested case is invalid for C++, valid for C. > > > c.f. cppreference aggregate init. > > > > > > > > + reshape_iter stripd = {}; > > > > > > > > Since the previous variables spell it "stripped" maybe call it stripped_iter. > > > > > > I've left it as "stripd"; the top level reshape_iter is just "d", non-verbose, > > > so "stripped_d" inappropriately over-verbose. > > > > > > > > @@ -6836,7 +6838,8 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, > > > > > array types (one value per array element). */ > > > > > if (TREE_CODE (stripped_str_init) == STRING_CST) > > > > > { > > > > > - if (has_designator_problem (d, complain)) > > > > > > > > So the logic here is that... > > > > > > Yes, you get the logic exactly... took me a few rounds to get it. > > > > > > > Nice to finally remove this, but let's keep this part of the comment. > > > > > > Agreed, and reinstated. > > > > > > > BRACE_ENCLOSED_INITIALIZER_P checks that it gets a CONSTRUCTOR so you > > > > can remove the first check. > > > > > > Nice, thanks; missed that. > > > > > > > > +// { dg-do compile } > > > > > +// { dg-options "-pedantic" } > > > > > > > > FWIW, if you remove the dg-options, -pedantic-errors will be used so you could > > > > drop it and then use dg-error instead of dg-warning below but this is OK too. > > > > > > I'd copied that from another desigN.C test, now I've copied the simpler: > > > > > > +// { dg-options "" } > > > > > > and removed all of the noisy dg-warning tests > > > > > > > We should probably test more: > > > > - nested structs > > > > - anonymous unions > > > > - test when the initializer is too long > > > > - multidim arrays: > > > > > > Cut-n-paste'd your multidim array test, and added a couple more > > > > > > > Hope this is useful... > > > > > > Very useful, thanks again > > > > Marek >
diff --git a/gcc/cp/decl.c b/gcc/cp/decl.c index 947bbfc6637..f01655c5c14 100644 --- a/gcc/cp/decl.c +++ b/gcc/cp/decl.c @@ -6820,6 +6820,7 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, { tree str_init = init; tree stripped_str_init = stripped_init; + reshape_iter stripd = {}; /* Strip one level of braces if and only if they enclose a single element (as allowed by [dcl.init.string]). */ @@ -6827,7 +6828,8 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, && TREE_CODE (stripped_str_init) == CONSTRUCTOR && CONSTRUCTOR_NELTS (stripped_str_init) == 1) { - str_init = (*CONSTRUCTOR_ELTS (stripped_str_init))[0].value; + stripd.cur = CONSTRUCTOR_ELT (stripped_str_init, 0); + str_init = stripd.cur->value; stripped_str_init = tree_strip_any_location_wrapper (str_init); } @@ -6836,7 +6838,8 @@ reshape_init_r (tree type, reshape_iter *d, tree first_initializer_p, array types (one value per array element). */ if (TREE_CODE (stripped_str_init) == STRING_CST) { - if (has_designator_problem (d, complain)) + if ((first_initializer_p && has_designator_problem (d, complain)) + || (stripd.cur && has_designator_problem (&stripd, complain))) return error_mark_node; d->cur++; return str_init; @@ -9538,23 +9541,10 @@ cp_complete_array_type (tree *ptype, tree initial_value, bool do_default) if (initial_value) { - /* An array of character type can be initialized from a - brace-enclosed string constant. - - FIXME: this code is duplicated from reshape_init. Probably - we should just call reshape_init here? */ - if (char_type_p (TYPE_MAIN_VARIANT (TREE_TYPE (*ptype))) - && TREE_CODE (initial_value) == CONSTRUCTOR - && !vec_safe_is_empty (CONSTRUCTOR_ELTS (initial_value))) - { - vec<constructor_elt, va_gc> *v = CONSTRUCTOR_ELTS (initial_value); - tree value = (*v)[0].value; - STRIP_ANY_LOCATION_WRAPPER (value); - - if (TREE_CODE (value) == STRING_CST - && v->length () == 1) - initial_value = value; - } + if (TREE_CODE (initial_value) == CONSTRUCTOR + && BRACE_ENCLOSED_INITIALIZER_P (initial_value)) + initial_value = reshape_init (*ptype, initial_value, + tf_warning_or_error); /* If any of the elements are parameter packs, we can't actually complete this type now because the array size is dependent. */ diff --git a/gcc/testsuite/g++.dg/cpp2a/desig20.C b/gcc/testsuite/g++.dg/cpp2a/desig20.C new file mode 100644 index 00000000000..03eab764325 --- /dev/null +++ b/gcc/testsuite/g++.dg/cpp2a/desig20.C @@ -0,0 +1,48 @@ +// PR c++/55227 +// Test designated initializer for char array by string constant + +// { dg-do compile } +// { dg-options "-pedantic" } + +struct C {char a[2];}; + +/* Case a; designated, unbraced, string-literal of the same size as the target + char array to be initialized; valid and accepted before and after. */ + +C a = {.a="a"}; // { dg-warning "designated initializers only available with" "" { target c++17_down } .-0 } + +/* Cases b,c,d; designated mismatched-size string literal, or designated braced + string literal (of any size less than or equal to the target char array), + previously rejected ("C99 designator 'a' outside aggregate initializer"). */ + +C b = {.a=""}; // { dg-warning "designated initializers only available with" "" { target c++17_down } .-0 } +C c = {.a={""}}; // { dg-warning "designated initializers only available with" "" { target c++17_down } .-0 } +C d = {.a={"a"}}; // { dg-warning "designated initializers only available with" "" { target c++17_down } .-0 } + +/* Case e; designated char array field and braced, designated array element(s) + (with GNU [N]= extension) valid extension, accepted before and after. */ + +C e = {.a={[0]='a'}}; // { dg-warning "ISO C.. does not allow C99 designated initializers" } +// { dg-warning "designated initializers only available with" "" { target c++17_down } .-1 } + +/* Cases f,g,h; braced string literal, 'designated' within inner braces; + invalid, previously accepted as positional with 'designator' ignored. */ + +C f = {{[0]="a"}}; // { dg-error "C99 designator .0. outside aggregate initializer" } +// { dg-warning "ISO C.. does not allow C99 designated initializers" "C99 desig" { target *-*-* } .-1 } +C g = {{.a="a"}}; // { dg-error "C99 designator .a. outside aggregate initializer" } +// { dg-warning "designated initializers only available with" "" { target c++17_down } .-1 } +C h = {{.b="a"}}; // { dg-error "C99 designator .b. outside aggregate initializer" } +// { dg-warning "designated initializers only available with" "" { target c++17_down } .-1 } + +template <class T> +void t() +{ + C ca[] = { {.a=""}, {.a={""}}, }; +// { dg-warning "designated initializers only available with" "" { target c++17_down } .-1 } +} + +void u() +{ + return t<void>(); +}