diff mbox

Fix sibcall argument overlap checking if pretend_args_size (PR target/52129)

Message ID 20120206130105.GP18768@tyan-ft48-01.lab.bos.redhat.com
State New
Headers show

Commit Message

Jakub Jelinek Feb. 6, 2012, 1:01 p.m. UTC
Hi!

The attached testcase is miscompiled on arm*, by doing a sibcall when setup
of one argument overwrites incoming arguments used to setup parameters in
later insns.
The reason why
mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap
fails to detect is that the caller has non-zero
crtl->args.pretend_args_size, and in that case the base:
      /* The argument block when performing a sibling call is the
         incoming argument block.  */
      if (pass == 0)
        {
          argblock = crtl->args.internal_arg_pointer;
          argblock
#ifdef STACK_GROWS_DOWNWARD
            = plus_constant (argblock, crtl->args.pretend_args_size);
#else
            = plus_constant (argblock, -crtl->args.pretend_args_size);
#endif
          stored_args_map = sbitmap_alloc (args_size.constant);
          sbitmap_zero (stored_args_map);
        }
apparently isn't virtual-incoming-rtx, but that plus pretend_args_size
(8 in this case).  When we store bits into stored_args_map sbitmap,
we use arg->locate.slot_offset.constant based values (or something different
for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is
testing those bits, it uses just virtual-incoming-rtx offsets (or something
different for ARGS_GROW_DOWNWARD).  This patch fixes it by adjusting the
virtual-incoming-rtx relative offset to be actually argblock relative
offset.

Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the
testcase on arm cross.  Ok for trunk?

2012-02-06  Jakub Jelinek  <jakub@redhat.com>

	PR target/52129
	* calls.c (mem_overlaps_already_clobbered_arg_p): If val is
	CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it.

	* gcc.c-torture/execute/pr52129.c: New test.


	Jakub

Comments

Richard Biener Feb. 6, 2012, 1:14 p.m. UTC | #1
On Mon, Feb 6, 2012 at 2:01 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> The attached testcase is miscompiled on arm*, by doing a sibcall when setup
> of one argument overwrites incoming arguments used to setup parameters in
> later insns.
> The reason why
> mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap
> fails to detect is that the caller has non-zero
> crtl->args.pretend_args_size, and in that case the base:
>      /* The argument block when performing a sibling call is the
>         incoming argument block.  */
>      if (pass == 0)
>        {
>          argblock = crtl->args.internal_arg_pointer;
>          argblock
> #ifdef STACK_GROWS_DOWNWARD
>            = plus_constant (argblock, crtl->args.pretend_args_size);
> #else
>            = plus_constant (argblock, -crtl->args.pretend_args_size);
> #endif
>          stored_args_map = sbitmap_alloc (args_size.constant);
>          sbitmap_zero (stored_args_map);
>        }
> apparently isn't virtual-incoming-rtx, but that plus pretend_args_size
> (8 in this case).  When we store bits into stored_args_map sbitmap,
> we use arg->locate.slot_offset.constant based values (or something different
> for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is
> testing those bits, it uses just virtual-incoming-rtx offsets (or something
> different for ARGS_GROW_DOWNWARD).  This patch fixes it by adjusting the
> virtual-incoming-rtx relative offset to be actually argblock relative
> offset.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the
> testcase on arm cross.  Ok for trunk?

Ok.

Thanks,
Richard.

> 2012-02-06  Jakub Jelinek  <jakub@redhat.com>
>
>        PR target/52129
>        * calls.c (mem_overlaps_already_clobbered_arg_p): If val is
>        CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it.
>
>        * gcc.c-torture/execute/pr52129.c: New test.
>
> --- gcc/calls.c.jj      2012-02-01 14:44:27.000000000 +0100
> +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100
> @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt
>     return true;
>   else
>     i = INTVAL (val);
> +#ifdef STACK_GROWS_DOWNWARD
> +  i -= crtl->args.pretend_args_size;
> +#else
> +  i += crtl->args.pretend_args_size;
> +#endif
>
>  #ifdef ARGS_GROW_DOWNWARD
>   i = -i - size;
> --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj    2012-02-06 10:27:50.988876791 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c       2012-02-06 10:25:26.000000000 +0100
> @@ -0,0 +1,28 @@
> +/* PR target/52129 */
> +
> +extern void abort (void);
> +struct S { void *p; unsigned int q; };
> +struct T { char a[64]; char b[64]; } t;
> +
> +__attribute__((noinline, noclone)) int
> +foo (void *x, struct S s, void *y, void *z)
> +{
> +  if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17])
> +    abort ();
> +  return 29;
> +}
> +
> +__attribute__((noinline, noclone)) int
> +bar (void *x, void *y, void *z, struct S s, int t, struct T *u)
> +{
> +  return foo (x, s, &u->a[t], &u->b[t]);
> +}
> +
> +int
> +main ()
> +{
> +  struct S s = { &t.b[5], 27 };
> +  if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29)
> +    abort ();
> +  return 0;
> +}
>
>        Jakub
Carrot Wei Feb. 7, 2012, 2:10 a.m. UTC | #2
Hi Jakub

Instead of disabling the sibcall, it could also be a valid tail call
optimization by moving the str after ldmia, and change the used
register(It should be handled by RA automatically), as following

         ...
          add     r4, r1, r4, lsl #2
          ldmia   r2, {r1, r2}
          str     r4, [sp, #48]
          ...

thanks
Carrot

On Mon, Feb 6, 2012 at 9:01 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> Hi!
>
> The attached testcase is miscompiled on arm*, by doing a sibcall when setup
> of one argument overwrites incoming arguments used to setup parameters in
> later insns.
> The reason why
> mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap
> fails to detect is that the caller has non-zero
> crtl->args.pretend_args_size, and in that case the base:
>      /* The argument block when performing a sibling call is the
>         incoming argument block.  */
>      if (pass == 0)
>        {
>          argblock = crtl->args.internal_arg_pointer;
>          argblock
> #ifdef STACK_GROWS_DOWNWARD
>            = plus_constant (argblock, crtl->args.pretend_args_size);
> #else
>            = plus_constant (argblock, -crtl->args.pretend_args_size);
> #endif
>          stored_args_map = sbitmap_alloc (args_size.constant);
>          sbitmap_zero (stored_args_map);
>        }
> apparently isn't virtual-incoming-rtx, but that plus pretend_args_size
> (8 in this case).  When we store bits into stored_args_map sbitmap,
> we use arg->locate.slot_offset.constant based values (or something different
> for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is
> testing those bits, it uses just virtual-incoming-rtx offsets (or something
> different for ARGS_GROW_DOWNWARD).  This patch fixes it by adjusting the
> virtual-incoming-rtx relative offset to be actually argblock relative
> offset.
>
> Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the
> testcase on arm cross.  Ok for trunk?
>
> 2012-02-06  Jakub Jelinek  <jakub@redhat.com>
>
>        PR target/52129
>        * calls.c (mem_overlaps_already_clobbered_arg_p): If val is
>        CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it.
>
>        * gcc.c-torture/execute/pr52129.c: New test.
>
> --- gcc/calls.c.jj      2012-02-01 14:44:27.000000000 +0100
> +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100
> @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt
>     return true;
>   else
>     i = INTVAL (val);
> +#ifdef STACK_GROWS_DOWNWARD
> +  i -= crtl->args.pretend_args_size;
> +#else
> +  i += crtl->args.pretend_args_size;
> +#endif
>
>  #ifdef ARGS_GROW_DOWNWARD
>   i = -i - size;
> --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj    2012-02-06 10:27:50.988876791 +0100
> +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c       2012-02-06 10:25:26.000000000 +0100
> @@ -0,0 +1,28 @@
> +/* PR target/52129 */
> +
> +extern void abort (void);
> +struct S { void *p; unsigned int q; };
> +struct T { char a[64]; char b[64]; } t;
> +
> +__attribute__((noinline, noclone)) int
> +foo (void *x, struct S s, void *y, void *z)
> +{
> +  if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17])
> +    abort ();
> +  return 29;
> +}
> +
> +__attribute__((noinline, noclone)) int
> +bar (void *x, void *y, void *z, struct S s, int t, struct T *u)
> +{
> +  return foo (x, s, &u->a[t], &u->b[t]);
> +}
> +
> +int
> +main ()
> +{
> +  struct S s = { &t.b[5], 27 };
> +  if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29)
> +    abort ();
> +  return 0;
> +}
>
>        Jakub
Carrot Wei Feb. 9, 2012, 8:54 a.m. UTC | #3
Hi Richard and Jakub

Since 4.6 contains the same bug, I would like to back port it to 4.6
branch. Could you approve it for 4.6?

Jing and Doug

Could you approve it for google/gcc-4_6-mobile branch?

thanks
Carrot

On Mon, Feb 6, 2012 at 9:14 PM, Richard Guenther
<richard.guenther@gmail.com> wrote:
> On Mon, Feb 6, 2012 at 2:01 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> Hi!
>>
>> The attached testcase is miscompiled on arm*, by doing a sibcall when setup
>> of one argument overwrites incoming arguments used to setup parameters in
>> later insns.
>> The reason why
>> mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap
>> fails to detect is that the caller has non-zero
>> crtl->args.pretend_args_size, and in that case the base:
>>      /* The argument block when performing a sibling call is the
>>         incoming argument block.  */
>>      if (pass == 0)
>>        {
>>          argblock = crtl->args.internal_arg_pointer;
>>          argblock
>> #ifdef STACK_GROWS_DOWNWARD
>>            = plus_constant (argblock, crtl->args.pretend_args_size);
>> #else
>>            = plus_constant (argblock, -crtl->args.pretend_args_size);
>> #endif
>>          stored_args_map = sbitmap_alloc (args_size.constant);
>>          sbitmap_zero (stored_args_map);
>>        }
>> apparently isn't virtual-incoming-rtx, but that plus pretend_args_size
>> (8 in this case).  When we store bits into stored_args_map sbitmap,
>> we use arg->locate.slot_offset.constant based values (or something different
>> for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is
>> testing those bits, it uses just virtual-incoming-rtx offsets (or something
>> different for ARGS_GROW_DOWNWARD).  This patch fixes it by adjusting the
>> virtual-incoming-rtx relative offset to be actually argblock relative
>> offset.
>>
>> Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the
>> testcase on arm cross.  Ok for trunk?
>
> Ok.
>
> Thanks,
> Richard.
>
>> 2012-02-06  Jakub Jelinek  <jakub@redhat.com>
>>
>>        PR target/52129
>>        * calls.c (mem_overlaps_already_clobbered_arg_p): If val is
>>        CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it.
>>
>>        * gcc.c-torture/execute/pr52129.c: New test.
>>
>> --- gcc/calls.c.jj      2012-02-01 14:44:27.000000000 +0100
>> +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100
>> @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt
>>     return true;
>>   else
>>     i = INTVAL (val);
>> +#ifdef STACK_GROWS_DOWNWARD
>> +  i -= crtl->args.pretend_args_size;
>> +#else
>> +  i += crtl->args.pretend_args_size;
>> +#endif
>>
>>  #ifdef ARGS_GROW_DOWNWARD
>>   i = -i - size;
>> --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj    2012-02-06 10:27:50.988876791 +0100
>> +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c       2012-02-06 10:25:26.000000000 +0100
>> @@ -0,0 +1,28 @@
>> +/* PR target/52129 */
>> +
>> +extern void abort (void);
>> +struct S { void *p; unsigned int q; };
>> +struct T { char a[64]; char b[64]; } t;
>> +
>> +__attribute__((noinline, noclone)) int
>> +foo (void *x, struct S s, void *y, void *z)
>> +{
>> +  if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17])
>> +    abort ();
>> +  return 29;
>> +}
>> +
>> +__attribute__((noinline, noclone)) int
>> +bar (void *x, void *y, void *z, struct S s, int t, struct T *u)
>> +{
>> +  return foo (x, s, &u->a[t], &u->b[t]);
>> +}
>> +
>> +int
>> +main ()
>> +{
>> +  struct S s = { &t.b[5], 27 };
>> +  if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29)
>> +    abort ();
>> +  return 0;
>> +}
>>
>>        Jakub
Jakub Jelinek Feb. 9, 2012, 9:04 a.m. UTC | #4
On Thu, Feb 09, 2012 at 04:54:59PM +0800, Carrot Wei wrote:
> Since 4.6 contains the same bug, I would like to back port it to 4.6
> branch. Could you approve it for 4.6?

Yes, this is ok.  I'm sorry for deferring too many 4.6 backports for way too
long, will get to it eventually.

	Jakub
Jing Yu Feb. 10, 2012, 6:13 a.m. UTC | #5
On Thu, Feb 9, 2012 at 12:54 AM, Carrot Wei <carrot@google.com> wrote:
> Hi Richard and Jakub
>
> Since 4.6 contains the same bug, I would like to back port it to 4.6
> branch. Could you approve it for 4.6?
>
> Jing and Doug
>
> Could you approve it for google/gcc-4_6-mobile branch?
>

OK for google/gcc-4_6-mobile and gcc-4_6_2-mobile

Jing

> thanks
> Carrot
>
> On Mon, Feb 6, 2012 at 9:14 PM, Richard Guenther
> <richard.guenther@gmail.com> wrote:
>> On Mon, Feb 6, 2012 at 2:01 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>>> Hi!
>>>
>>> The attached testcase is miscompiled on arm*, by doing a sibcall when setup
>>> of one argument overwrites incoming arguments used to setup parameters in
>>> later insns.
>>> The reason why
>>> mem_overlaps_already_clobbered_arg_p/check_sibcall_argument_overlap
>>> fails to detect is that the caller has non-zero
>>> crtl->args.pretend_args_size, and in that case the base:
>>>      /* The argument block when performing a sibling call is the
>>>         incoming argument block.  */
>>>      if (pass == 0)
>>>        {
>>>          argblock = crtl->args.internal_arg_pointer;
>>>          argblock
>>> #ifdef STACK_GROWS_DOWNWARD
>>>            = plus_constant (argblock, crtl->args.pretend_args_size);
>>> #else
>>>            = plus_constant (argblock, -crtl->args.pretend_args_size);
>>> #endif
>>>          stored_args_map = sbitmap_alloc (args_size.constant);
>>>          sbitmap_zero (stored_args_map);
>>>        }
>>> apparently isn't virtual-incoming-rtx, but that plus pretend_args_size
>>> (8 in this case).  When we store bits into stored_args_map sbitmap,
>>> we use arg->locate.slot_offset.constant based values (or something different
>>> for ARGS_GROW_DOWNWARD, but when mem_overlaps_already_clobbered_arg_p is
>>> testing those bits, it uses just virtual-incoming-rtx offsets (or something
>>> different for ARGS_GROW_DOWNWARD).  This patch fixes it by adjusting the
>>> virtual-incoming-rtx relative offset to be actually argblock relative
>>> offset.
>>>
>>> Bootstrapped/regtested on x86_64-linux and i686-linux and tested on the
>>> testcase on arm cross.  Ok for trunk?
>>
>> Ok.
>>
>> Thanks,
>> Richard.
>>
>>> 2012-02-06  Jakub Jelinek  <jakub@redhat.com>
>>>
>>>        PR target/52129
>>>        * calls.c (mem_overlaps_already_clobbered_arg_p): If val is
>>>        CONST_INT_P, subtract resp. add crtl->args.pretend_args_size to it.
>>>
>>>        * gcc.c-torture/execute/pr52129.c: New test.
>>>
>>> --- gcc/calls.c.jj      2012-02-01 14:44:27.000000000 +0100
>>> +++ gcc/calls.c 2012-02-06 10:19:12.112132905 +0100
>>> @@ -1808,6 +1808,11 @@ mem_overlaps_already_clobbered_arg_p (rt
>>>     return true;
>>>   else
>>>     i = INTVAL (val);
>>> +#ifdef STACK_GROWS_DOWNWARD
>>> +  i -= crtl->args.pretend_args_size;
>>> +#else
>>> +  i += crtl->args.pretend_args_size;
>>> +#endif
>>>
>>>  #ifdef ARGS_GROW_DOWNWARD
>>>   i = -i - size;
>>> --- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj    2012-02-06 10:27:50.988876791 +0100
>>> +++ gcc/testsuite/gcc.c-torture/execute/pr52129.c       2012-02-06 10:25:26.000000000 +0100
>>> @@ -0,0 +1,28 @@
>>> +/* PR target/52129 */
>>> +
>>> +extern void abort (void);
>>> +struct S { void *p; unsigned int q; };
>>> +struct T { char a[64]; char b[64]; } t;
>>> +
>>> +__attribute__((noinline, noclone)) int
>>> +foo (void *x, struct S s, void *y, void *z)
>>> +{
>>> +  if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17])
>>> +    abort ();
>>> +  return 29;
>>> +}
>>> +
>>> +__attribute__((noinline, noclone)) int
>>> +bar (void *x, void *y, void *z, struct S s, int t, struct T *u)
>>> +{
>>> +  return foo (x, s, &u->a[t], &u->b[t]);
>>> +}
>>> +
>>> +int
>>> +main ()
>>> +{
>>> +  struct S s = { &t.b[5], 27 };
>>> +  if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29)
>>> +    abort ();
>>> +  return 0;
>>> +}
>>>
>>>        Jakub
Carrot Wei Feb. 10, 2012, 2:29 p.m. UTC | #6
On Fri, Feb 10, 2012 at 2:13 PM, Jing Yu <jingyu@google.com> wrote:
> On Thu, Feb 9, 2012 at 12:54 AM, Carrot Wei <carrot@google.com> wrote:
>> Hi Richard and Jakub
>>
>> Since 4.6 contains the same bug, I would like to back port it to 4.6
>> branch. Could you approve it for 4.6?
>>
>> Jing and Doug
>>
>> Could you approve it for google/gcc-4_6-mobile branch?
>>
>
> OK for google/gcc-4_6-mobile and gcc-4_6_2-mobile
>
> Jing
>

Bootstrapped/regtested on x86_64-linux and regtested on arm qemu.
Committed to both google/gcc-4_6-mobile and google/gcc-4_6_2-mobile.

thanks
Carrot
diff mbox

Patch

--- gcc/calls.c.jj	2012-02-01 14:44:27.000000000 +0100
+++ gcc/calls.c	2012-02-06 10:19:12.112132905 +0100
@@ -1808,6 +1808,11 @@  mem_overlaps_already_clobbered_arg_p (rt
     return true;
   else
     i = INTVAL (val);
+#ifdef STACK_GROWS_DOWNWARD
+  i -= crtl->args.pretend_args_size;
+#else
+  i += crtl->args.pretend_args_size;
+#endif
 
 #ifdef ARGS_GROW_DOWNWARD
   i = -i - size;
--- gcc/testsuite/gcc.c-torture/execute/pr52129.c.jj	2012-02-06 10:27:50.988876791 +0100
+++ gcc/testsuite/gcc.c-torture/execute/pr52129.c	2012-02-06 10:25:26.000000000 +0100
@@ -0,0 +1,28 @@ 
+/* PR target/52129 */
+
+extern void abort (void);
+struct S { void *p; unsigned int q; };
+struct T { char a[64]; char b[64]; } t;
+
+__attribute__((noinline, noclone)) int
+foo (void *x, struct S s, void *y, void *z)
+{
+  if (x != &t.a[2] || s.p != &t.b[5] || s.q != 27 || y != &t.a[17] || z != &t.b[17])
+    abort ();
+  return 29;
+}
+
+__attribute__((noinline, noclone)) int
+bar (void *x, void *y, void *z, struct S s, int t, struct T *u)
+{
+  return foo (x, s, &u->a[t], &u->b[t]);
+}
+
+int
+main ()
+{
+  struct S s = { &t.b[5], 27 };
+  if (bar (&t.a[2], (void *) 0, (void *) 0, s, 17, &t) != 29)
+    abort ();
+  return 0;
+}