Message ID | CAFULd4b5GVNjfR5QNpAr4Gz9mQgvcfCqrxDyobJ9oHkWq_Zf4g@mail.gmail.com |
---|---|
State | New |
Headers | show |
2014-02-14 15:40 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>: > On Fri, Feb 14, 2014 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote: >> 2014-02-14 13:55 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>: >>> Hello! >>> >>>> 2014-02-14 Kai Tietz <ktietz@redhat.com> >>>> >>>> PR target/60193 >>>> * config/i386/i386.c (ix86_expand_prologue): Use >>>> rax register as displacement for restoring %r10, %eax. >>>> >>>> Regression-tested for x86_64-unknown-linux-gnu, and >>>> x86_64-w64-mingw32, and i686-w64-mingw32. Ok for apply? >>> >>> No, you should check allocate to satisfy x86_64_immediate_operand and >>> put it into a temporary register if not. There is no need to always >>> force constant into a temporary. >> >> Well, in general I would agree to your statement. But in this case we >> have already the required value in rax-register loaded. So I don't >> see the advantage of using in case of <2^32 constant for those >> restore-operation. At least for code-size optimization it looks to me >> better and I am not aware that usage of register is here more >> expensive. I might be wrong about later. > > Ah, I was not aware of the fact that eax already holds the value. > However, there were some problems with the patch: eax RTX is > unnecessarily regenerated in the wrong mode, UNITS_PER_WORD should be > subtracted instead of added - you can use displacement+offset > addressing instead. > > Something like (untested) attached patch. > > Uros. No, the patch I attached works fine. To substract here UNITS_PER_WORD is in fact a bug. As description see how we modify allocate on pushing. So for allocate of x * UNITS_PER_WORD with living rax, and r10, we will see following stack layout: [rax saved]: rsp = -1..-UNITS_PER_WORD1; [r10 saved]: rsp = -UNITS_PER_WORD-1..-2*UNITS_PER_WORD [reserved-stack]: rsp = -2*UNITS_PER_WORD-1.. -x*UNITS_PER_WORD So final rsp is -x * UNITS_PER_WORD and the value of allocate is (x - 2) * UNITS_PER_WORD. To restore r10, we can use [rsp+allocate] as (-2 * UNITS_PER_WORD) is its location. To restore rax we need to use [rsp+allocate+UNITS_PER_UNIT] as - UNITS_PER_WORD is its location. Regards. Kai
On Fri, Feb 14, 2014 at 3:50 PM, Kai Tietz <ktietz70@googlemail.com> wrote: > 2014-02-14 15:40 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>: >> On Fri, Feb 14, 2014 at 2:48 PM, Kai Tietz <ktietz70@googlemail.com> wrote: >>> 2014-02-14 13:55 GMT+01:00 Uros Bizjak <ubizjak@gmail.com>: >>>> Hello! >>>> >>>>> 2014-02-14 Kai Tietz <ktietz@redhat.com> >>>>> >>>>> PR target/60193 >>>>> * config/i386/i386.c (ix86_expand_prologue): Use >>>>> rax register as displacement for restoring %r10, %eax. >>>>> >>>>> Regression-tested for x86_64-unknown-linux-gnu, and >>>>> x86_64-w64-mingw32, and i686-w64-mingw32. Ok for apply? >>>> >>>> No, you should check allocate to satisfy x86_64_immediate_operand and >>>> put it into a temporary register if not. There is no need to always >>>> force constant into a temporary. >>> >>> Well, in general I would agree to your statement. But in this case we >>> have already the required value in rax-register loaded. So I don't >>> see the advantage of using in case of <2^32 constant for those >>> restore-operation. At least for code-size optimization it looks to me >>> better and I am not aware that usage of register is here more >>> expensive. I might be wrong about later. >> >> Ah, I was not aware of the fact that eax already holds the value. >> However, there were some problems with the patch: eax RTX is >> unnecessarily regenerated in the wrong mode, UNITS_PER_WORD should be >> subtracted instead of added - you can use displacement+offset >> addressing instead. >> >> Something like (untested) attached patch. >> >> Uros. > > No, the patch I attached works fine. To substract here UNITS_PER_WORD > is in fact a bug. As description see how we modify allocate on > pushing. This fact was not mentioned in the ChangeLog. So, simply change + t = plus_constant (Pmode, t, -UNITS_PER_WORD); to t = plus_constant (Pmode, t, UNITS_PER_WORD); in my patch, and it should generate correct offset+displacement address. Please also add the testcase from the PR. Uros.
Index: i386.c =================================================================== --- i386.c (revision 207780) +++ i386.c (working copy) @@ -11023,13 +11023,12 @@ rtx r10 = NULL; rtx (*adjust_stack_insn)(rtx, rtx, rtx); const bool sp_is_cfa_reg = (m->fs.cfa_reg == stack_pointer_rtx); - bool eax_live = false; + bool eax_live = ix86_eax_live_at_start_p (); bool r10_live = false; if (TARGET_64BIT) r10_live = (DECL_STATIC_CHAIN (current_function_decl) != 0); - eax_live = ix86_eax_live_at_start_p (); if (eax_live) { insn = emit_insn (gen_push (eax)); @@ -11084,17 +11083,16 @@ works for realigned stack, too. */ if (r10_live && eax_live) { - t = plus_constant (Pmode, stack_pointer_rtx, allocate); + t = gen_rtx_PLUS (Pmode, stack_pointer_rtx, eax); emit_move_insn (gen_rtx_REG (word_mode, R10_REG), gen_frame_mem (word_mode, t)); - t = plus_constant (Pmode, stack_pointer_rtx, - allocate - UNITS_PER_WORD); + t = plus_constant (Pmode, t, -UNITS_PER_WORD); emit_move_insn (gen_rtx_REG (word_mode, AX_REG), gen_frame_mem (word_mode, t)); } else if (eax_live || r10_live) { - t = plus_constant (Pmode, stack_pointer_rtx, allocate); + t = gen_rtx_PLUS (Pmode, stack_pointer_rtx, eax); emit_move_insn (gen_rtx_REG (word_mode, (eax_live ? AX_REG : R10_REG)), gen_frame_mem (word_mode, t));