diff mbox

[i386] : Fix PR/60193

Message ID CAFULd4b5GVNjfR5QNpAr4Gz9mQgvcfCqrxDyobJ9oHkWq_Zf4g@mail.gmail.com
State New
Headers show

Commit Message

Uros Bizjak Feb. 14, 2014, 2:40 p.m. UTC
On Fri, Feb 14, 2014 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2014-02-14 13:55 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>:
>> Hello!
>>
>>> 2014-02-14  Kai Tietz  <ktietz@redhat.com>
>>>
>>>     PR target/60193
>>>     * config/i386/i386.c (ix86_expand_prologue): Use
>>>     rax register as displacement for restoring %r10, %eax.
>>>
>>> Regression-tested for x86_64-unknown-linux-gnu, and
>>> x86_64-w64-mingw32, and i686-w64-mingw32.  Ok for apply?
>>
>> No, you should check allocate to satisfy x86_64_immediate_operand and
>> put it into a temporary register if not. There is no need to always
>> force constant into a temporary.
>
> Well, in general I would agree to your statement.  But in this case we
> have already the required value in rax-register loaded.  So I don't
> see the advantage of using in case of <2^32 constant for those
> restore-operation.  At least for code-size optimization it looks to me
> better and I am not aware that usage of register is here more
> expensive. I might be wrong about later.

Ah, I was not aware of the fact that eax already holds the value.
However, there were some problems with the patch: eax RTX is
unnecessarily regenerated in the wrong mode, UNITS_PER_WORD should be
subtracted instead of added - you can use displacement+offset
addressing instead.

Something like (untested) attached patch.

Uros.

Comments

Kai Tietz Feb. 14, 2014, 2:50 p.m. UTC | #1
2014-02-14 15:40 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>:
> On Fri, Feb 14, 2014 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>> 2014-02-14 13:55 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>:
>>> Hello!
>>>
>>>> 2014-02-14  Kai Tietz  <ktietz@redhat.com>
>>>>
>>>>     PR target/60193
>>>>     * config/i386/i386.c (ix86_expand_prologue): Use
>>>>     rax register as displacement for restoring %r10, %eax.
>>>>
>>>> Regression-tested for x86_64-unknown-linux-gnu, and
>>>> x86_64-w64-mingw32, and i686-w64-mingw32.  Ok for apply?
>>>
>>> No, you should check allocate to satisfy x86_64_immediate_operand and
>>> put it into a temporary register if not. There is no need to always
>>> force constant into a temporary.
>>
>> Well, in general I would agree to your statement.  But in this case we
>> have already the required value in rax-register loaded.  So I don't
>> see the advantage of using in case of <2^32 constant for those
>> restore-operation.  At least for code-size optimization it looks to me
>> better and I am not aware that usage of register is here more
>> expensive. I might be wrong about later.
>
> Ah, I was not aware of the fact that eax already holds the value.
> However, there were some problems with the patch: eax RTX is
> unnecessarily regenerated in the wrong mode, UNITS_PER_WORD should be
> subtracted instead of added - you can use displacement+offset
> addressing instead.
>
> Something like (untested) attached patch.
>
> Uros.

No, the patch I attached works fine.  To substract here UNITS_PER_WORD
is in fact a bug.  As description see how we modify allocate on
pushing.

So for allocate of x * UNITS_PER_WORD with living rax, and r10, we
will see following stack layout:

[rax saved]: rsp = -1..-UNITS_PER_WORD1;
[r10 saved]: rsp = -UNITS_PER_WORD-1..-2*UNITS_PER_WORD
[reserved-stack]: rsp = -2*UNITS_PER_WORD-1.. -x*UNITS_PER_WORD

So final rsp is -x * UNITS_PER_WORD and the value of allocate is (x -
2) * UNITS_PER_WORD.

To restore r10, we can use [rsp+allocate] as (-2 * UNITS_PER_WORD) is
its location.
To restore rax we need to use [rsp+allocate+UNITS_PER_UNIT] as -
UNITS_PER_WORD is its location.

Regards.
Kai
Uros Bizjak Feb. 14, 2014, 2:54 p.m. UTC | #2
On Fri, Feb 14, 2014 at 3:50 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
> 2014-02-14 15:40 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>:
>> On Fri, Feb 14, 2014 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote:
>>> 2014-02-14 13:55 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>:
>>>> Hello!
>>>>
>>>>> 2014-02-14  Kai Tietz  <ktietz@redhat.com>
>>>>>
>>>>>     PR target/60193
>>>>>     * config/i386/i386.c (ix86_expand_prologue): Use
>>>>>     rax register as displacement for restoring %r10, %eax.
>>>>>
>>>>> Regression-tested for x86_64-unknown-linux-gnu, and
>>>>> x86_64-w64-mingw32, and i686-w64-mingw32.  Ok for apply?
>>>>
>>>> No, you should check allocate to satisfy x86_64_immediate_operand and
>>>> put it into a temporary register if not. There is no need to always
>>>> force constant into a temporary.
>>>
>>> Well, in general I would agree to your statement.  But in this case we
>>> have already the required value in rax-register loaded.  So I don't
>>> see the advantage of using in case of <2^32 constant for those
>>> restore-operation.  At least for code-size optimization it looks to me
>>> better and I am not aware that usage of register is here more
>>> expensive. I might be wrong about later.
>>
>> Ah, I was not aware of the fact that eax already holds the value.
>> However, there were some problems with the patch: eax RTX is
>> unnecessarily regenerated in the wrong mode, UNITS_PER_WORD should be
>> subtracted instead of added - you can use displacement+offset
>> addressing instead.
>>
>> Something like (untested) attached patch.
>>
>> Uros.
>
> No, the patch I attached works fine.  To substract here UNITS_PER_WORD
> is in fact a bug.  As description see how we modify allocate on
> pushing.

This fact was not mentioned in the ChangeLog.

So, simply change

+   t = plus_constant (Pmode, t, -UNITS_PER_WORD);

to

t = plus_constant (Pmode, t, UNITS_PER_WORD);

in my patch, and it should generate correct offset+displacement address.

Please also add the testcase from the PR.

Uros.
diff mbox

Patch

Index: i386.c
===================================================================
--- i386.c	(revision 207780)
+++ i386.c	(working copy)
@@ -11023,13 +11023,12 @@ 
       rtx r10 = NULL;
       rtx (*adjust_stack_insn)(rtx, rtx, rtx);
       const bool sp_is_cfa_reg = (m->fs.cfa_reg == stack_pointer_rtx);
-      bool eax_live = false;
+      bool eax_live = ix86_eax_live_at_start_p ();
       bool r10_live = false;
 
       if (TARGET_64BIT)
         r10_live = (DECL_STATIC_CHAIN (current_function_decl) != 0);
 
-      eax_live = ix86_eax_live_at_start_p ();
       if (eax_live)
 	{
 	  insn = emit_insn (gen_push (eax));
@@ -11084,17 +11083,16 @@ 
 	 works for realigned stack, too.  */
       if (r10_live && eax_live)
         {
-	  t = plus_constant (Pmode, stack_pointer_rtx, allocate);
+	  t = gen_rtx_PLUS (Pmode, stack_pointer_rtx, eax);
 	  emit_move_insn (gen_rtx_REG (word_mode, R10_REG),
 			  gen_frame_mem (word_mode, t));
-	  t = plus_constant (Pmode, stack_pointer_rtx,
-			     allocate - UNITS_PER_WORD);
+	  t = plus_constant (Pmode, t, -UNITS_PER_WORD);
 	  emit_move_insn (gen_rtx_REG (word_mode, AX_REG),
 			  gen_frame_mem (word_mode, t));
 	}
       else if (eax_live || r10_live)
 	{
-	  t = plus_constant (Pmode, stack_pointer_rtx, allocate);
+	  t = gen_rtx_PLUS (Pmode, stack_pointer_rtx, eax);
 	  emit_move_insn (gen_rtx_REG (word_mode,
 				       (eax_live ? AX_REG : R10_REG)),
 			  gen_frame_mem (word_mode, t));