diff mbox

[i386,Pointer,Bounds,Checker,33/x] MPX ABI

Message ID 542C4E8F.8030502@redhat.com
State New
Headers show

Commit Message

Vladimir Makarov Oct. 1, 2014, 6:57 p.m. UTC
On 2014-09-25 5:46 AM, Ilya Enkovich wrote:
> 2014-09-25 1:51 GMT+04:00 Ilya Enkovich <enkovich.gnu@gmail.com>:
>> 2014-09-24 23:09 GMT+04:00 Jeff Law <law@redhat.com>:
>>> On 09/24/14 07:13, Ilya Enkovich wrote:
>>>>
>>>> I tried to generate PARALLEL with all regs set by call.  Here is a
>>>> memset call I got:
>>>>
>>>> (call_insn 23 22 24 2 (set (parallel [
>>>>                   (expr_list:REG_DEP_TRUE (reg:DI 0 ax)
>>>>                       (const_int 0 [0]))
>>>>                   (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
>>>>                       (const_int 0 [0]))
>>>>                   (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
>>>>                       (const_int 0 [0]))
>>>>               ])
>>>>           (call/j (mem:QI (symbol_ref:DI ("memset") [flags 0x41]
>>>
>>> [ snip ]
>>> Looks good.  This is the approved way to handle multiple results of a call.
>>>
>>>>
>>>> During register allocation LRA generated a weird move instruction:
>>>>
>>>> (insn 63 0 0 (set (reg/f:DI 100)
>>>>           (parallel [
>>>>                   (expr_list:REG_DEP_TRUE (reg:DI 0 ax)
>>>>                       (const_int 0 [0]))
>>>>                   (expr_list:REG_DEP_TRUE (reg:BND64 77 bnd0)
>>>>                       (const_int 0 [0]))
>>>>                   (expr_list:REG_DEP_TRUE (reg:BND64 78 bnd1)
>>>>                       (const_int 0 [0]))
>>>>               ])) -1
>>>>        (nil))
>>>>
>>>> Which caused ICE later in LRA.  This move happens because of
>>>> REG_RETURNED (reg/f:DI 100) (see condition in inherit_in_ebb at
>>>> lra-constraints.c:5312).  Thus this code in LRA doesn't accept
>>>> PARALLEL dest for calls.
>>>
>>> This is a bug in LRA then.  Multiple return values aren't heavily used, so
>>> I'm not surprised that its handling was missed in LRA.
>>>
>>> The question now is how to bundle things together in such a way as to make
>>> it easy for Vlad to reproduce and fix this in LRA.
>>>
>>> Jeff
>>
>> I suppose it should be easy to reproduce using the same test case I
>> use and some speudo patch which adds fake return values (e.g. xmm6 and
>> xmm7) to calls.  Will try to make some minimal patch and test Vlad
>> could work with.
>>
>> Ilya
>
> I couldn't reproduce the problem on a small test but chrome build
> shows a lot of errors.  Due to the nature of the problem test's size
> shouldn't matter, so I attach patch which emulates situation with
> bounds regs (but uses xmm5 and xmm6 instead of bnd0 and bnd1) with a
> preprocessed chrome file.
>

The problem is in code introduced by Bernd in IRA and caller-saves.c in 
2012.  It is basically an optimization for functions returning always 
the same result as one argument (e.g. memcpy returning 1st argument).

There are two possible solutions.  First one is to prohibit the 
optimizations when there is a parallel in SET.  Second one is to go 
deeper if the call result is guaranteed in the first element which is 
true for the patch.

For the first solution, the patch would be

             }


The first patch is safer but the second one is ok too.  I have no 
particular preferences.  Whatever we choose, analogous code in 
caller-saves.c should be changed too.

Comments

Uros Bizjak Oct. 10, 2014, 5:10 p.m. UTC | #1
On Wed, Oct 1, 2014 at 8:57 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:

> The problem is in code introduced by Bernd in IRA and caller-saves.c in
> 2012.  It is basically an optimization for functions returning always the
> same result as one argument (e.g. memcpy returning 1st argument).
>
> There are two possible solutions.  First one is to prohibit the
> optimizations when there is a parallel in SET.  Second one is to go deeper
> if the call result is guaranteed in the first element which is true for the
> patch.

I suspect that the first solution will regress
gcc.target/i386/retarg.c on i686 - that testcase test if referred
optimization is effective. All things equal, I think we should go with
the second solution.

> For the first solution, the patch would be
>
> Index: lra-constraints.c
> ===================================================================
> --- lra-constraints.c   (revision 215748)
> +++ lra-constraints.c   (working copy)
> @@ -5348,16 +5348,19 @@
>                   if (GET_CODE (pat) == PARALLEL)
>                     pat = XVECEXP (pat, 0, 0);
>                   dest = SET_DEST (pat);
> -                 start_sequence ();
> -                 emit_move_insn (cheap, copy_rtx (dest));
> -                 restore = get_insns ();
> -                 end_sequence ();
> -                 lra_process_new_insns (curr_insn, NULL, restore,
> -                                        "Inserting call parameter
> restore");
> -                 /* We don't need to save/restore of the pseudo from
> -                    this call.  */
> -                 usage_insns[regno].calls_num = calls_num;
> -                 bitmap_set_bit (&check_only_regs, regno);
> +                 if (REG_P (dest))
> +                   {
> +                     start_sequence ();
> +                     emit_move_insn (cheap, copy_rtx (dest));
> +                     restore = get_insns ();
> +                     end_sequence ();
> +                     lra_process_new_insns (curr_insn, NULL, restore,
> +                                            "Inserting call parameter
> restore");
> +                     /* We don't need to save/restore of the pseudo
> +                        from this call.  */
> +                     usage_insns[regno].calls_num = calls_num;
> +                     bitmap_set_bit (&check_only_regs, regno);
> +                   }
>                 }
>             }
>           to_inherit_num = 0;
>
>
> For the second solution, the patch is
>
>
> Index: lra-constraints.c
> ===================================================================
> --- lra-constraints.c   (revision 215748)
> +++ lra-constraints.c   (working copy)
> @@ -5348,16 +5348,25 @@
>                   if (GET_CODE (pat) == PARALLEL)
>                     pat = XVECEXP (pat, 0, 0);
>                   dest = SET_DEST (pat);
> -                 start_sequence ();
> -                 emit_move_insn (cheap, copy_rtx (dest));
> -                 restore = get_insns ();
> -                 end_sequence ();
> -                 lra_process_new_insns (curr_insn, NULL, restore,
> -                                        "Inserting call parameter
> restore");
> -                 /* We don't need to save/restore of the pseudo from
> -                    this call.  */
> -                 usage_insns[regno].calls_num = calls_num;
> -                 bitmap_set_bit (&check_only_regs, regno);
> +                 if (GET_CODE (dest) == PARALLEL)
> +                   {
> +                     dest = XVECEXP (dest, 0, 0);
> +                     if (GET_CODE (dest) == EXPR_LIST)
> +                       dest = XEXP (dest, 0);
> +                   }
> +                 if (REG_P (dest))
> +                   {
> +                     start_sequence ();
> +                     emit_move_insn (cheap, copy_rtx (dest));
> +                     restore = get_insns ();
> +                     end_sequence ();
> +                     lra_process_new_insns (curr_insn, NULL, restore,
> +                                            "Inserting call parameter
> restore");
> +                     /* We don't need to save/restore of the pseudo from
> +                        this call.      */
> +                     usage_insns[regno].calls_num = calls_num;
> +                     bitmap_set_bit (&check_only_regs, regno);
> +                   }
>                 }
>             }
>
>
> The first patch is safer but the second one is ok too.  I have no particular
> preferences.  Whatever we choose, analogous code in caller-saves.c should be
> changed too.

Uros.
Ilya Enkovich Oct. 10, 2014, 5:29 p.m. UTC | #2
2014-10-10 21:10 GMT+04:00 Uros Bizjak <ubizjak@gmail.com>:
> On Wed, Oct 1, 2014 at 8:57 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>
>> The problem is in code introduced by Bernd in IRA and caller-saves.c in
>> 2012.  It is basically an optimization for functions returning always the
>> same result as one argument (e.g. memcpy returning 1st argument).
>>
>> There are two possible solutions.  First one is to prohibit the
>> optimizations when there is a parallel in SET.  Second one is to go deeper
>> if the call result is guaranteed in the first element which is true for the
>> patch.
>
> I suspect that the first solution will regress
> gcc.target/i386/retarg.c on i686 - that testcase test if referred
> optimization is effective. All things equal, I think we should go with
> the second solution.

The first solutions is in trunk since October 3
(https://gcc.gnu.org/ml/gcc-cvs/2014-10/msg00094.html) and I don't see
such fail.  Patch actually just checks for a case which never occurs
right now and therefore should be quite safe.

Ilya

>
>> For the first solution, the patch would be
>>
>> Index: lra-constraints.c
>> ===================================================================
>> --- lra-constraints.c   (revision 215748)
>> +++ lra-constraints.c   (working copy)
>> @@ -5348,16 +5348,19 @@
>>                   if (GET_CODE (pat) == PARALLEL)
>>                     pat = XVECEXP (pat, 0, 0);
>>                   dest = SET_DEST (pat);
>> -                 start_sequence ();
>> -                 emit_move_insn (cheap, copy_rtx (dest));
>> -                 restore = get_insns ();
>> -                 end_sequence ();
>> -                 lra_process_new_insns (curr_insn, NULL, restore,
>> -                                        "Inserting call parameter
>> restore");
>> -                 /* We don't need to save/restore of the pseudo from
>> -                    this call.  */
>> -                 usage_insns[regno].calls_num = calls_num;
>> -                 bitmap_set_bit (&check_only_regs, regno);
>> +                 if (REG_P (dest))
>> +                   {
>> +                     start_sequence ();
>> +                     emit_move_insn (cheap, copy_rtx (dest));
>> +                     restore = get_insns ();
>> +                     end_sequence ();
>> +                     lra_process_new_insns (curr_insn, NULL, restore,
>> +                                            "Inserting call parameter
>> restore");
>> +                     /* We don't need to save/restore of the pseudo
>> +                        from this call.  */
>> +                     usage_insns[regno].calls_num = calls_num;
>> +                     bitmap_set_bit (&check_only_regs, regno);
>> +                   }
>>                 }
>>             }
>>           to_inherit_num = 0;
>>
>>
>> For the second solution, the patch is
>>
>>
>> Index: lra-constraints.c
>> ===================================================================
>> --- lra-constraints.c   (revision 215748)
>> +++ lra-constraints.c   (working copy)
>> @@ -5348,16 +5348,25 @@
>>                   if (GET_CODE (pat) == PARALLEL)
>>                     pat = XVECEXP (pat, 0, 0);
>>                   dest = SET_DEST (pat);
>> -                 start_sequence ();
>> -                 emit_move_insn (cheap, copy_rtx (dest));
>> -                 restore = get_insns ();
>> -                 end_sequence ();
>> -                 lra_process_new_insns (curr_insn, NULL, restore,
>> -                                        "Inserting call parameter
>> restore");
>> -                 /* We don't need to save/restore of the pseudo from
>> -                    this call.  */
>> -                 usage_insns[regno].calls_num = calls_num;
>> -                 bitmap_set_bit (&check_only_regs, regno);
>> +                 if (GET_CODE (dest) == PARALLEL)
>> +                   {
>> +                     dest = XVECEXP (dest, 0, 0);
>> +                     if (GET_CODE (dest) == EXPR_LIST)
>> +                       dest = XEXP (dest, 0);
>> +                   }
>> +                 if (REG_P (dest))
>> +                   {
>> +                     start_sequence ();
>> +                     emit_move_insn (cheap, copy_rtx (dest));
>> +                     restore = get_insns ();
>> +                     end_sequence ();
>> +                     lra_process_new_insns (curr_insn, NULL, restore,
>> +                                            "Inserting call parameter
>> restore");
>> +                     /* We don't need to save/restore of the pseudo from
>> +                        this call.      */
>> +                     usage_insns[regno].calls_num = calls_num;
>> +                     bitmap_set_bit (&check_only_regs, regno);
>> +                   }
>>                 }
>>             }
>>
>>
>> The first patch is safer but the second one is ok too.  I have no particular
>> preferences.  Whatever we choose, analogous code in caller-saves.c should be
>> changed too.
>
> Uros.
Uros Bizjak Oct. 10, 2014, 5:33 p.m. UTC | #3
On Fri, Oct 10, 2014 at 7:29 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
> 2014-10-10 21:10 GMT+04:00 Uros Bizjak <ubizjak@gmail.com>:
>> On Wed, Oct 1, 2014 at 8:57 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>
>>> The problem is in code introduced by Bernd in IRA and caller-saves.c in
>>> 2012.  It is basically an optimization for functions returning always the
>>> same result as one argument (e.g. memcpy returning 1st argument).
>>>
>>> There are two possible solutions.  First one is to prohibit the
>>> optimizations when there is a parallel in SET.  Second one is to go deeper
>>> if the call result is guaranteed in the first element which is true for the
>>> patch.
>>
>> I suspect that the first solution will regress
>> gcc.target/i386/retarg.c on i686 - that testcase test if referred
>> optimization is effective. All things equal, I think we should go with
>> the second solution.
>
> The first solutions is in trunk since October 3
> (https://gcc.gnu.org/ml/gcc-cvs/2014-10/msg00094.html) and I don't see
> such fail.  Patch actually just checks for a case which never occurs
> right now and therefore should be quite safe.

True, but after MPX patches are committed, PARALLELs will be passed as
call targets. I wonder if the testcase fails then.

Uros.
Jeff Law Oct. 10, 2014, 5:35 p.m. UTC | #4
On 10/10/14 11:33, Uros Bizjak wrote:
> On Fri, Oct 10, 2014 at 7:29 PM, Ilya Enkovich <enkovich.gnu@gmail.com> wrote:
>> 2014-10-10 21:10 GMT+04:00 Uros Bizjak <ubizjak@gmail.com>:
>>> On Wed, Oct 1, 2014 at 8:57 PM, Vladimir Makarov <vmakarov@redhat.com> wrote:
>>>
>>>> The problem is in code introduced by Bernd in IRA and caller-saves.c in
>>>> 2012.  It is basically an optimization for functions returning always the
>>>> same result as one argument (e.g. memcpy returning 1st argument).
>>>>
>>>> There are two possible solutions.  First one is to prohibit the
>>>> optimizations when there is a parallel in SET.  Second one is to go deeper
>>>> if the call result is guaranteed in the first element which is true for the
>>>> patch.
>>>
>>> I suspect that the first solution will regress
>>> gcc.target/i386/retarg.c on i686 - that testcase test if referred
>>> optimization is effective. All things equal, I think we should go with
>>> the second solution.
>>
>> The first solutions is in trunk since October 3
>> (https://gcc.gnu.org/ml/gcc-cvs/2014-10/msg00094.html) and I don't see
>> such fail.  Patch actually just checks for a case which never occurs
>> right now and therefore should be quite safe.
>
> True, but after MPX patches are committed, PARALLELs will be passed as
> call targets. I wonder if the testcase fails then.
Well, before those patches can be committed, they have to survive 
bootstrap and regression testing as a whole :-)

jeff
Ilya Enkovich Oct. 10, 2014, 5:37 p.m. UTC | #5
2014-10-10 21:35 GMT+04:00 Jeff Law <law@redhat.com>:
> On 10/10/14 11:33, Uros Bizjak wrote:
>>
>> On Fri, Oct 10, 2014 at 7:29 PM, Ilya Enkovich <enkovich.gnu@gmail.com>
>> wrote:
>>>
>>> 2014-10-10 21:10 GMT+04:00 Uros Bizjak <ubizjak@gmail.com>:
>>>>
>>>> On Wed, Oct 1, 2014 at 8:57 PM, Vladimir Makarov <vmakarov@redhat.com>
>>>> wrote:
>>>>
>>>>> The problem is in code introduced by Bernd in IRA and caller-saves.c in
>>>>> 2012.  It is basically an optimization for functions returning always
>>>>> the
>>>>> same result as one argument (e.g. memcpy returning 1st argument).
>>>>>
>>>>> There are two possible solutions.  First one is to prohibit the
>>>>> optimizations when there is a parallel in SET.  Second one is to go
>>>>> deeper
>>>>> if the call result is guaranteed in the first element which is true for
>>>>> the
>>>>> patch.
>>>>
>>>>
>>>> I suspect that the first solution will regress
>>>> gcc.target/i386/retarg.c on i686 - that testcase test if referred
>>>> optimization is effective. All things equal, I think we should go with
>>>> the second solution.
>>>
>>>
>>> The first solutions is in trunk since October 3
>>> (https://gcc.gnu.org/ml/gcc-cvs/2014-10/msg00094.html) and I don't see
>>> such fail.  Patch actually just checks for a case which never occurs
>>> right now and therefore should be quite safe.
>>
>>
>> True, but after MPX patches are committed, PARALLELs will be passed as
>> call targets. I wonder if the testcase fails then.
>
> Well, before those patches can be committed, they have to survive bootstrap
> and regression testing as a whole :-)

Surely! Also PARALLEL will not be passed until you pass -mmpx flag.

Ilya

>
> jeff
diff mbox

Patch

Index: lra-constraints.c
===================================================================
--- lra-constraints.c   (revision 215748)
+++ lra-constraints.c   (working copy)
@@ -5348,16 +5348,19 @@ 
                   if (GET_CODE (pat) == PARALLEL)
                     pat = XVECEXP (pat, 0, 0);
                   dest = SET_DEST (pat);
-                 start_sequence ();
-                 emit_move_insn (cheap, copy_rtx (dest));
-                 restore = get_insns ();
-                 end_sequence ();
-                 lra_process_new_insns (curr_insn, NULL, restore,
-                                        "Inserting call parameter 
restore");
-                 /* We don't need to save/restore of the pseudo from
-                    this call.  */
-                 usage_insns[regno].calls_num = calls_num;
-                 bitmap_set_bit (&check_only_regs, regno);
+                 if (REG_P (dest))
+                   {
+                     start_sequence ();
+                     emit_move_insn (cheap, copy_rtx (dest));
+                     restore = get_insns ();
+                     end_sequence ();
+                     lra_process_new_insns (curr_insn, NULL, restore,
+                                            "Inserting call parameter 
restore");
+                     /* We don't need to save/restore of the pseudo
+                        from this call.  */
+                     usage_insns[regno].calls_num = calls_num;
+                     bitmap_set_bit (&check_only_regs, regno);
+                   }
                 }
             }
           to_inherit_num = 0;


For the second solution, the patch is


Index: lra-constraints.c
===================================================================
--- lra-constraints.c   (revision 215748)
+++ lra-constraints.c   (working copy)
@@ -5348,16 +5348,25 @@ 
                   if (GET_CODE (pat) == PARALLEL)
                     pat = XVECEXP (pat, 0, 0);
                   dest = SET_DEST (pat);
-                 start_sequence ();
-                 emit_move_insn (cheap, copy_rtx (dest));
-                 restore = get_insns ();
-                 end_sequence ();
-                 lra_process_new_insns (curr_insn, NULL, restore,
-                                        "Inserting call parameter 
restore");
-                 /* We don't need to save/restore of the pseudo from
-                    this call.  */
-                 usage_insns[regno].calls_num = calls_num;
-                 bitmap_set_bit (&check_only_regs, regno);
+                 if (GET_CODE (dest) == PARALLEL)
+                   {
+                     dest = XVECEXP (dest, 0, 0);
+                     if (GET_CODE (dest) == EXPR_LIST)
+                       dest = XEXP (dest, 0);
+                   }
+                 if (REG_P (dest))
+                   {
+                     start_sequence ();
+                     emit_move_insn (cheap, copy_rtx (dest));
+                     restore = get_insns ();
+                     end_sequence ();
+                     lra_process_new_insns (curr_insn, NULL, restore,
+                                            "Inserting call parameter 
restore");
+                     /* We don't need to save/restore of the pseudo from
+                        this call.      */
+                     usage_insns[regno].calls_num = calls_num;
+                     bitmap_set_bit (&check_only_regs, regno);
+                   }
                 }