Message ID | 20120206130105.GP18768@tyan-ft48-01.lab.bos.redhat.com |
---|---|
State | New |
Headers | show |
On Mon, Feb 6, 2012 at 2:01 PM, Jakub Jelinek <jakub@redhat.com> wrote: > Hi! > > The attached testcase is miscompiled on arm*, by doing a sibcall when setup > of one argument overwrites incoming arguments used to setup parameters in > later insns. > The reason why > mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap > fails to detect is that the caller has non-zero > crtl->args.pretend_args_size, and in that case the base: > /* The argument block when performing a sibling call is the > incoming argument block. */ > if (pass == 0) > { > argblock = crtl->args.internal_arg_pointer; > argblock > #ifdef STACK_GROWS_DOWNWARD > = plus_constant (argblock, crtl->args.pretend_args_size); > #else > = plus_constant (argblock, -crtl->args.pretend_args_size); > #endif > stored_args_map = sbitmap_alloc (args_size.constant); > sbitmap_zero (stored_args_map); > } > apparently isn't virtual-incoming-rtx, but that plus pretend_args_size > (8 in this case). When we store bits into stored_args_map sbitmap, > we use arg->locate.slot_offset.constant based values (or something different > for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is > testing those bits, it uses just virtual-incoming-rtx offsets (or something > different for ARGS_GROW_DOWNWARD). This patch fixes it by adjusting the > virtual-incoming-rtx relative offset to be actually argblock relative > offset. > > Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the > testcase on arm cross. Ok for trunk? Ok. Thanks, Richard. > 2012-02-06 Jakub Jelinek <jakub@redhat.com> > > PR target/52129 > * calls.c (mem_overlaps_already_clobbered_arg_p): If val is > CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it. > > * gcc.c-torture/execute/pr52129.c: New test. > > --- gcc/calls.c.jj 2012-02-01 14:44:27.000000000 +0100 > +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100 > @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt > return true; > else > i = INTVAL (val); > +#ifdef STACK_GROWS_DOWNWARD > + i -= crtl->args.pretend_args_size; > +#else > + i += crtl->args.pretend_args_size; > +#endif > > #ifdef ARGS_GROW_DOWNWARD > i = -i - size; > --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj 2012-02-06 10:27:50.988876791 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c 2012-02-06 10:25:26.000000000 +0100 > @@ -0,0 +1,28 @@ > +/* PR target/52129 */ > + > +extern void abort (void); > +struct S { void *p; unsigned int q; }; > +struct T { char a[64]; char b[64]; } t; > + > +__attribute__((noinline, noclone)) int > +foo (void *x, struct S s, void *y, void *z) > +{ > + if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17]) > + abort (); > + return 29; > +} > + > +__attribute__((noinline, noclone)) int > +bar (void *x, void *y, void *z, struct S s, int t, struct T *u) > +{ > + return foo (x, s, &u->a[t], &u->b[t]); > +} > + > +int > +main () > +{ > + struct S s = { &t.b[5], 27 }; > + if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29) > + abort (); > + return 0; > +} > > Jakub
Hi Jakub Instead of disabling the sibcall, it could also be a valid tail call optimization by moving the str after ldmia, and change the used register(It should be handled by RA automatically), as following ... add r4, r1, r4, lsl #2 ldmia r2, {r1, r2} str r4, [sp, #48] ... thanks Carrot On Mon, Feb 6, 2012 at 9:01 PM, Jakub Jelinek <jakub@redhat.com> wrote: > Hi! > > The attached testcase is miscompiled on arm*, by doing a sibcall when setup > of one argument overwrites incoming arguments used to setup parameters in > later insns. > The reason why > mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap > fails to detect is that the caller has non-zero > crtl->args.pretend_args_size, and in that case the base: > /* The argument block when performing a sibling call is the > incoming argument block. */ > if (pass == 0) > { > argblock = crtl->args.internal_arg_pointer; > argblock > #ifdef STACK_GROWS_DOWNWARD > = plus_constant (argblock, crtl->args.pretend_args_size); > #else > = plus_constant (argblock, -crtl->args.pretend_args_size); > #endif > stored_args_map = sbitmap_alloc (args_size.constant); > sbitmap_zero (stored_args_map); > } > apparently isn't virtual-incoming-rtx, but that plus pretend_args_size > (8 in this case). When we store bits into stored_args_map sbitmap, > we use arg->locate.slot_offset.constant based values (or something different > for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is > testing those bits, it uses just virtual-incoming-rtx offsets (or something > different for ARGS_GROW_DOWNWARD). This patch fixes it by adjusting the > virtual-incoming-rtx relative offset to be actually argblock relative > offset. > > Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the > testcase on arm cross. Ok for trunk? > > 2012-02-06 Jakub Jelinek <jakub@redhat.com> > > PR target/52129 > * calls.c (mem_overlaps_already_clobbered_arg_p): If val is > CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it. > > * gcc.c-torture/execute/pr52129.c: New test. > > --- gcc/calls.c.jj 2012-02-01 14:44:27.000000000 +0100 > +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100 > @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt > return true; > else > i = INTVAL (val); > +#ifdef STACK_GROWS_DOWNWARD > + i -= crtl->args.pretend_args_size; > +#else > + i += crtl->args.pretend_args_size; > +#endif > > #ifdef ARGS_GROW_DOWNWARD > i = -i - size; > --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj 2012-02-06 10:27:50.988876791 +0100 > +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c 2012-02-06 10:25:26.000000000 +0100 > @@ -0,0 +1,28 @@ > +/* PR target/52129 */ > + > +extern void abort (void); > +struct S { void *p; unsigned int q; }; > +struct T { char a[64]; char b[64]; } t; > + > +__attribute__((noinline, noclone)) int > +foo (void *x, struct S s, void *y, void *z) > +{ > + if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17]) > + abort (); > + return 29; > +} > + > +__attribute__((noinline, noclone)) int > +bar (void *x, void *y, void *z, struct S s, int t, struct T *u) > +{ > + return foo (x, s, &u->a[t], &u->b[t]); > +} > + > +int > +main () > +{ > + struct S s = { &t.b[5], 27 }; > + if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29) > + abort (); > + return 0; > +} > > Jakub
Hi Richard and Jakub Since 4.6 contains the same bug, I would like to back port it to 4.6 branch. Could you approve it for 4.6? Jing and Doug Could you approve it for google/gcc-4_6-mobile branch? thanks Carrot On Mon, Feb 6, 2012 at 9:14 PM, Richard Guenther <richard.guenther@gmail.com> wrote: > On Mon, Feb 6, 2012 at 2:01 PM, Jakub Jelinek <jakub@redhat.com> wrote: >> Hi! >> >> The attached testcase is miscompiled on arm*, by doing a sibcall when setup >> of one argument overwrites incoming arguments used to setup parameters in >> later insns. >> The reason why >> mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap >> fails to detect is that the caller has non-zero >> crtl->args.pretend_args_size, and in that case the base: >> /* The argument block when performing a sibling call is the >> incoming argument block. */ >> if (pass == 0) >> { >> argblock = crtl->args.internal_arg_pointer; >> argblock >> #ifdef STACK_GROWS_DOWNWARD >> = plus_constant (argblock, crtl->args.pretend_args_size); >> #else >> = plus_constant (argblock, -crtl->args.pretend_args_size); >> #endif >> stored_args_map = sbitmap_alloc (args_size.constant); >> sbitmap_zero (stored_args_map); >> } >> apparently isn't virtual-incoming-rtx, but that plus pretend_args_size >> (8 in this case). When we store bits into stored_args_map sbitmap, >> we use arg->locate.slot_offset.constant based values (or something different >> for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is >> testing those bits, it uses just virtual-incoming-rtx offsets (or something >> different for ARGS_GROW_DOWNWARD). This patch fixes it by adjusting the >> virtual-incoming-rtx relative offset to be actually argblock relative >> offset. >> >> Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the >> testcase on arm cross. Ok for trunk? > > Ok. > > Thanks, > Richard. > >> 2012-02-06 Jakub Jelinek <jakub@redhat.com> >> >> PR target/52129 >> * calls.c (mem_overlaps_already_clobbered_arg_p): If val is >> CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it. >> >> * gcc.c-torture/execute/pr52129.c: New test. >> >> --- gcc/calls.c.jj 2012-02-01 14:44:27.000000000 +0100 >> +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100 >> @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt >> return true; >> else >> i = INTVAL (val); >> +#ifdef STACK_GROWS_DOWNWARD >> + i -= crtl->args.pretend_args_size; >> +#else >> + i += crtl->args.pretend_args_size; >> +#endif >> >> #ifdef ARGS_GROW_DOWNWARD >> i = -i - size; >> --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj 2012-02-06 10:27:50.988876791 +0100 >> +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c 2012-02-06 10:25:26.000000000 +0100 >> @@ -0,0 +1,28 @@ >> +/* PR target/52129 */ >> + >> +extern void abort (void); >> +struct S { void *p; unsigned int q; }; >> +struct T { char a[64]; char b[64]; } t; >> + >> +__attribute__((noinline, noclone)) int >> +foo (void *x, struct S s, void *y, void *z) >> +{ >> + if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17]) >> + abort (); >> + return 29; >> +} >> + >> +__attribute__((noinline, noclone)) int >> +bar (void *x, void *y, void *z, struct S s, int t, struct T *u) >> +{ >> + return foo (x, s, &u->a[t], &u->b[t]); >> +} >> + >> +int >> +main () >> +{ >> + struct S s = { &t.b[5], 27 }; >> + if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29) >> + abort (); >> + return 0; >> +} >> >> Jakub
On Thu, Feb 09, 2012 at 04:54:59PM +0800, Carrot Wei wrote: > Since 4.6 contains the same bug, I would like to back port it to 4.6 > branch. Could you approve it for 4.6? Yes, this is ok. I'm sorry for deferring too many 4.6 backports for way too long, will get to it eventually. Jakub
On Thu, Feb 9, 2012 at 12:54 AM, Carrot Wei <carrot@google.com> wrote: > Hi Richard and Jakub > > Since 4.6 contains the same bug, I would like to back port it to 4.6 > branch. Could you approve it for 4.6? > > Jing and Doug > > Could you approve it for google/gcc-4_6-mobile branch? > OK for google/gcc-4_6-mobile and gcc-4_6_2-mobile Jing > thanks > Carrot > > On Mon, Feb 6, 2012 at 9:14 PM, Richard Guenther > <richard.guenther@gmail.com> wrote: >> On Mon, Feb 6, 2012 at 2:01 PM, Jakub Jelinek <jakub@redhat.com> wrote: >>> Hi! >>> >>> The attached testcase is miscompiled on arm*, by doing a sibcall when setup >>> of one argument overwrites incoming arguments used to setup parameters in >>> later insns. >>> The reason why >>> mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap >>> fails to detect is that the caller has non-zero >>> crtl->args.pretend_args_size, and in that case the base: >>> /* The argument block when performing a sibling call is the >>> incoming argument block. */ >>> if (pass == 0) >>> { >>> argblock = crtl->args.internal_arg_pointer; >>> argblock >>> #ifdef STACK_GROWS_DOWNWARD >>> = plus_constant (argblock, crtl->args.pretend_args_size); >>> #else >>> = plus_constant (argblock, -crtl->args.pretend_args_size); >>> #endif >>> stored_args_map = sbitmap_alloc (args_size.constant); >>> sbitmap_zero (stored_args_map); >>> } >>> apparently isn't virtual-incoming-rtx, but that plus pretend_args_size >>> (8 in this case). When we store bits into stored_args_map sbitmap, >>> we use arg->locate.slot_offset.constant based values (or something different >>> for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is >>> testing those bits, it uses just virtual-incoming-rtx offsets (or something >>> different for ARGS_GROW_DOWNWARD). This patch fixes it by adjusting the >>> virtual-incoming-rtx relative offset to be actually argblock relative >>> offset. >>> >>> Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the >>> testcase on arm cross. Ok for trunk? >> >> Ok. >> >> Thanks, >> Richard. >> >>> 2012-02-06 Jakub Jelinek <jakub@redhat.com> >>> >>> PR target/52129 >>> * calls.c (mem_overlaps_already_clobbered_arg_p): If val is >>> CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it. >>> >>> * gcc.c-torture/execute/pr52129.c: New test. >>> >>> --- gcc/calls.c.jj 2012-02-01 14:44:27.000000000 +0100 >>> +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100 >>> @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt >>> return true; >>> else >>> i = INTVAL (val); >>> +#ifdef STACK_GROWS_DOWNWARD >>> + i -= crtl->args.pretend_args_size; >>> +#else >>> + i += crtl->args.pretend_args_size; >>> +#endif >>> >>> #ifdef ARGS_GROW_DOWNWARD >>> i = -i - size; >>> --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj 2012-02-06 10:27:50.988876791 +0100 >>> +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c 2012-02-06 10:25:26.000000000 +0100 >>> @@ -0,0 +1,28 @@ >>> +/* PR target/52129 */ >>> + >>> +extern void abort (void); >>> +struct S { void *p; unsigned int q; }; >>> +struct T { char a[64]; char b[64]; } t; >>> + >>> +__attribute__((noinline, noclone)) int >>> +foo (void *x, struct S s, void *y, void *z) >>> +{ >>> + if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17]) >>> + abort (); >>> + return 29; >>> +} >>> + >>> +__attribute__((noinline, noclone)) int >>> +bar (void *x, void *y, void *z, struct S s, int t, struct T *u) >>> +{ >>> + return foo (x, s, &u->a[t], &u->b[t]); >>> +} >>> + >>> +int >>> +main () >>> +{ >>> + struct S s = { &t.b[5], 27 }; >>> + if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29) >>> + abort (); >>> + return 0; >>> +} >>> >>> Jakub
On Fri, Feb 10, 2012 at 2:13 PM, Jing Yu <jingyu@google.com> wrote: > On Thu, Feb 9, 2012 at 12:54 AM, Carrot Wei <carrot@google.com> wrote: >> Hi Richard and Jakub >> >> Since 4.6 contains the same bug, I would like to back port it to 4.6 >> branch. Could you approve it for 4.6? >> >> Jing and Doug >> >> Could you approve it for google/gcc-4_6-mobile branch? >> > > OK for google/gcc-4_6-mobile and gcc-4_6_2-mobile > > Jing > Bootstrapped/regtested on x86_64-linux and regtested on arm qemu. Committed to both google/gcc-4_6-mobile and google/gcc-4_6_2-mobile. thanks Carrot
--- gcc/calls.c.jj 2012-02-01 14:44:27.000000000 +0100 +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100 @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt return true; else i = INTVAL (val); +#ifdef STACK_GROWS_DOWNWARD + i -= crtl->args.pretend_args_size; +#else + i += crtl->args.pretend_args_size; +#endif #ifdef ARGS_GROW_DOWNWARD i = -i - size; --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj 2012-02-06 10:27:50.988876791 +0100 +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c 2012-02-06 10:25:26.000000000 +0100 @@ -0,0 +1,28 @@ +/* PR target/52129 */ + +extern void abort (void); +struct S { void *p; unsigned int q; }; +struct T { char a[64]; char b[64]; } t; + +__attribute__((noinline, noclone)) int +foo (void *x, struct S s, void *y, void *z) +{ + if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17]) + abort (); + return 29; +} + +__attribute__((noinline, noclone)) int +bar (void *x, void *y, void *z, struct S s, int t, struct T *u) +{ + return foo (x, s, &u->a[t], &u->b[t]); +} + +int +main () +{ + struct S s = { &t.b[5], 27 }; + if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29) + abort (); + return 0; +}