[PR,c++/78572] handle array self references in intializers
diff mbox

Message ID ad3c151e-9c4b-c203-89ab-3267417ab352@redhat.com
State New
Headers show

Commit Message

Aldy Hernandez Dec. 20, 2016, 4:25 p.m. UTC
The problem in this PR is that we're trying to initialize an array with 
members of itself:

	int array[10] = { array[3]=5, array[7]=3 };

The C++ front-end, in store_init_value() via cxx_constant_value() and 
friends will transform:

	{array[3] = 5, array[7] = 3}

into:

	{[3]=5, [0]=5, [7]=3, [1]=3}

which looks invalid to output_constructor*(), presumably because it 
isn't sorted by increasing index.

Jakub has even gone further to show that for the following:

	... = { array[3]=5, array[7]=3, array[7]=8, array[7] = 9 };

things get even worse, because we generate code to write twice into [3]:

	{[3]=5, [0]=5, [7]=9, [1]=3, [2]=8, [3]=9}

I took the easy way in cxx_eval_store_expression() and marked the 
expression as non_constant_p if we're trying to set an array from 
members of itself.  This causes us to to generate the initialization of 
the self-referencing array[sub] elements as a CLEANUP_POINT_EXPR:

<<cleanup_point <<< Unknown tree: expr_stmt
   array[0] = array[3] = 5 >>>>>;
<<cleanup_point <<< Unknown tree: expr_stmt
   array[1] = array[7] = 3 >>>>>;

Ultimately this yields correct code:

array:
         .zero   40
...
...

_GLOBAL__sub_I_array:
.LFB1:
         .cfi_startproc
         movl    $5, array+12(%rip)
         movl    $5, array(%rip)
         movl    $3, array+28(%rip)
         movl    $3, array+4(%rip)
	ret

Is this approach acceptable, or should we be marking this as 
non-constant somewhere else in the chain?  Or perhaps another approach 
would be to handle such constructor holes in the in the varasm code?

Aldy
commit 295d93f60bcbec5b9959a7b3656f10aa0df71c9f
Author: Aldy Hernandez <aldyh@redhat.com>
Date:   Tue Dec 20 06:13:26 2016 -0500

            PR c++/78572
            * constexpr.c (cxx_eval_store_expression): Avoid array self
            references in initialization.

Comments

Nathan Sidwell Dec. 20, 2016, 4:48 p.m. UTC | #1
On 12/20/2016 11:25 AM, Aldy Hernandez wrote:
> The problem in this PR is that we're trying to initialize an array with
> members of itself:

> Jakub has even gone further to show that for the following:
>
>     ... = { array[3]=5, array[7]=3, array[7]=8, array[7] = 9 };
>
> things get even worse, because we generate code to write twice into [3]:
>
>     {[3]=5, [0]=5, [7]=9, [1]=3, [2]=8, [3]=9}

I looked at this a couple of weeks ago, and got confused.  It wasn't 
very clear to me how the side-effect assignments interact with any 
implict zero assignments.  (Especially if one then tries to support 
C-style designated initializers).

IIUC the side effects of an initializer are evaluated before the storage 
of the initializer itself (separated by a sequence point).  further it 
seems that default zero initialization of non-explicitly initialized 
elements happens after any side-effected store to that element (and 
hence zaps the sideeffect).  The rules for non-aggregates appear to be 
as-if one builds a temporary object and then copy-constructs it into the 
object -- clearly zapping any and all side-effects.  However for 
aggregate initializations it looked like elements were initialized in 
order -- so the side effect on a later element could overwrite the 
initialization of an earlier element.

It wasn't entirely clear.

nathan
Aldy Hernandez Dec. 20, 2016, 6:52 p.m. UTC | #2
On 12/20/2016 11:48 AM, Nathan Sidwell wrote:
> On 12/20/2016 11:25 AM, Aldy Hernandez wrote:
>> The problem in this PR is that we're trying to initialize an array with
>> members of itself:
>
>> Jakub has even gone further to show that for the following:
>>
>>     ... = { array[3]=5, array[7]=3, array[7]=8, array[7] = 9 };
>>
>> things get even worse, because we generate code to write twice into [3]:
>>
>>     {[3]=5, [0]=5, [7]=9, [1]=3, [2]=8, [3]=9}
>
> I looked at this a couple of weeks ago, and got confused.  It wasn't
> very clear to me how the side-effect assignments interact with any
> implict zero assignments.  (Especially if one then tries to support
> C-style designated initializers).
>
> IIUC the side effects of an initializer are evaluated before the storage
> of the initializer itself (separated by a sequence point).  further it
> seems that default zero initialization of non-explicitly initialized
> elements happens after any side-effected store to that element (and
> hence zaps the sideeffect).  The rules for non-aggregates appear to be
> as-if one builds a temporary object and then copy-constructs it into the
> object -- clearly zapping any and all side-effects.  However for
> aggregate initializations it looked like elements were initialized in
> order -- so the side effect on a later element could overwrite the
> initialization of an earlier element.
>
> It wasn't entirely clear.

Looks like things are happening in the right order for this testcase. 
That is, the zeroing happens by virtue of it being in the global 
section, and the rest of the runtime initialization happening afterwards:

abulafia:/build/t/gcc$ cat a.cc
int array[10] = { array[3]=5, array[7]=3 };

int main()
{
   return 0;
}

assembly:
---------
         .size   array, 40
array:
         .zero   40

...
...

abulafia:/build/t/gcc$ gdb -n a.out -q
Reading symbols from a.out...(no debugging symbols found)...done.
(gdb) b main
Breakpoint 1 at 0x40055b
(gdb) r
Starting program: /home/build/t/gcc/a.out

Breakpoint 1, 0x000000000040055b in main ()
(gdb) x/4x &array
0x601080 <array>:       0x00000005      0x00000003      0x00000000 
0x00000005
(gdb)

The above looks correct.

Now for the next example below I get what I intuitively expect, though I 
don't know if that's how it's defined in the standard:

int array[10] = { array[3]=5, 0x111, 0x222, 0x333 };

(gdb) x/4x &array
0x601040 <array>:       0x00000005      0x00000111      0x00000222 
0x00000005

That is, the array[3]=5 overwrites the last 0x333.  I would expect that...

Is any of this incorrect?

Aldy
Nathan Sidwell Dec. 20, 2016, 8:14 p.m. UTC | #3
On 12/20/2016 01:52 PM, Aldy Hernandez wrote:

> int array[10] = { array[3]=5, 0x111, 0x222, 0x333 };
>
> (gdb) x/4x &array
> 0x601040 <array>:       0x00000005      0x00000111      0x00000222
> 0x00000005
>
> That is, the array[3]=5 overwrites the last 0x333.  I would expect that...

That may be wrong. Using the 4431 draft, [8.5.4]/4 says 'Within the 
initializer-list of a braced-init-list, the initializer-clauses, 
including any that result from pack expansions (14.5.3), are evaluated 
in the order in which they appear.'
It goes on to mention that there are sequence points, plus that the 
ordering holds regardless of the semantics of initialization.

So by my reading, the effects of the above list are:
a) assign 5 to array[3]
b) assign 5 to array[0] -- consume the first initializer
c) assign x111 to array[1] -- consume the second initializer
d) assign 0x222 to array[2] -- consume the third initializer
e) assign 0x333 to array[3] -- consume the fourth intializer,
                                overwrite #a

nathan
Aldy Hernandez Dec. 21, 2016, 5:29 p.m. UTC | #4
On 12/20/2016 03:14 PM, Nathan Sidwell wrote:
> On 12/20/2016 01:52 PM, Aldy Hernandez wrote:
>
>> int array[10] = { array[3]=5, 0x111, 0x222, 0x333 };
>>
>> (gdb) x/4x &array
>> 0x601040 <array>:       0x00000005      0x00000111      0x00000222
>> 0x00000005
>>
>> That is, the array[3]=5 overwrites the last 0x333.  I would expect
>> that...
>
> That may be wrong. Using the 4431 draft, [8.5.4]/4 says 'Within the
> initializer-list of a braced-init-list, the initializer-clauses,
> including any that result from pack expansions (14.5.3), are evaluated
> in the order in which they appear.'
> It goes on to mention that there are sequence points, plus that the
> ordering holds regardless of the semantics of initialization.
>
> So by my reading, the effects of the above list are:
> a) assign 5 to array[3]
> b) assign 5 to array[0] -- consume the first initializer
> c) assign x111 to array[1] -- consume the second initializer
> d) assign 0x222 to array[2] -- consume the third initializer
> e) assign 0x333 to array[3] -- consume the fourth intializer,
>                                overwrite #a

Hmmmm, I see.

Since designated initializers in arrays are not supported in the C++ FE, 
should we sorry() on array initializers using self-reference like above 
the above?  Surely it's better than ICEing.  Or, would you prefer we 
implement the above semantics in the draft for GCC7?

Aldy
Jakub Jelinek Dec. 21, 2016, 5:40 p.m. UTC | #5
On Wed, Dec 21, 2016 at 12:29:37PM -0500, Aldy Hernandez wrote:
> > > int array[10] = { array[3]=5, 0x111, 0x222, 0x333 };
> > > 
> > > (gdb) x/4x &array
> > > 0x601040 <array>:       0x00000005      0x00000111      0x00000222
> > > 0x00000005
> > > 
> > > That is, the array[3]=5 overwrites the last 0x333.  I would expect
> > > that...
> > 
> > That may be wrong. Using the 4431 draft, [8.5.4]/4 says 'Within the
> > initializer-list of a braced-init-list, the initializer-clauses,
> > including any that result from pack expansions (14.5.3), are evaluated
> > in the order in which they appear.'
> > It goes on to mention that there are sequence points, plus that the
> > ordering holds regardless of the semantics of initialization.
> > 
> > So by my reading, the effects of the above list are:
> > a) assign 5 to array[3]
> > b) assign 5 to array[0] -- consume the first initializer
> > c) assign x111 to array[1] -- consume the second initializer
> > d) assign 0x222 to array[2] -- consume the third initializer
> > e) assign 0x333 to array[3] -- consume the fourth intializer,
> >                                overwrite #a
> 
> Hmmmm, I see.
> 
> Since designated initializers in arrays are not supported in the C++ FE,
> should we sorry() on array initializers using self-reference like above the
> above?  Surely it's better than ICEing.  Or, would you prefer we implement
> the above semantics in the draft for GCC7?

The side-effects are unrelated to designed initializers, only [3] = 5
is designed initializer, array[3] = 5 is not, it is just arbitrary
expression evaluated at that point to compute the corresponding element.
We do support designed initializers in C++, but only in a useless way
(require that the designators appear in order without any gaps).

	Jakub

Patch
diff mbox

diff --git a/gcc/cp/constexpr.c b/gcc/cp/constexpr.c
index aedd004..ac279ad 100644
--- a/gcc/cp/constexpr.c
+++ b/gcc/cp/constexpr.c
@@ -3313,6 +3313,15 @@  cxx_eval_store_expression (const constexpr_ctx *ctx, tree t,
 	}
     }
 
+  /* Initializing an array from a variant of itself is a non-constant.
+     This avoids attemps at generating incorrect self references in
+     something like: int foo[10] = { stuff[3]=8 }.  */
+  if (TREE_CODE (target) == ARRAY_REF && object == ctx->object)
+    {
+      *non_constant_p = true;
+      return t;
+    }
+
   /* And then find/build up our initializer for the path to the subobject
      we're initializing.  */
   tree *valp;
diff --git a/gcc/testsuite/g++.dg/pr78572.C b/gcc/testsuite/g++.dg/pr78572.C
new file mode 100644
index 0000000..82ab4e3
--- /dev/null
+++ b/gcc/testsuite/g++.dg/pr78572.C
@@ -0,0 +1,9 @@ 
+// { dg-do compile } */
+
+static int array[10] = { array[3]=5, array[7]=3 };
+
+int
+main ()
+{
+  return 0;
+}