Message ID | CAFULd4ag80XLNrt0SP=fB8B=ywM=A2k6rP31=VSzz+x7b4mBAQ@mail.gmail.com |
---|---|
State | New |
On Tue, Jun 09, 2015 at 08:09:28PM +0200, Uros Bizjak wrote:
> Please find attached a patch that takes your idea slightly further. We
> find perhaps zero-extended UNSPEC_TP, and copy it for further use. At
> its place, we simply slap const0_rtx. We know that address to

Is that safe? I mean, the address, even if offsettable, can have some
immediate already (seems e.g. the offsettable_memref_p predicate just checks
you can plus_constant some small integer and be recognized again) and if you
turn the %gs: into a const0_rtx, it would fail next decompose.
And when you already have the PLUS which has UNSPEC_TP as one of its
arguments, replacing that PLUS with the other argument is IMHO very easy.
Perhaps you are right that there is no need to copy_rtx, supposedly
the rtx shouldn't be shared with anything and thus can be modified in place.

If -mx32 is a non-issue here, then perhaps my initial patch is good enough?

> Index: config/i386/i386.c
> ===================================================================
> --- config/i386/i386.c (revision 224292)
> +++ config/i386/i386.c (working copy)
> @@ -22858,7 +22858,7 @@ ix86_split_long_move (rtx operands[])
>           Do an lea to the last part and use only one colliding move.  */
>        else if (collisions > 1)
>          {
> -          rtx base;
> +          rtx base, addr, tls_base = NULL_RTX;
>  
>            collisions = 1;
>  
> @@ -22869,10 +22869,52 @@ ix86_split_long_move (rtx operands[])
>            if (GET_MODE (base) != Pmode)
>              base = gen_rtx_REG (Pmode, REGNO (base));
>  
> -          emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0)));
> +          addr = XEXP (part[1][0], 0);
> +          if (TARGET_TLS_DIRECT_SEG_REFS)
> +            {
> +              struct ix86_address parts;
> +              int ok = ix86_decompose_address (addr, &parts);
> +              gcc_assert (ok);
> +              if (parts.seg != SEG_DEFAULT)
> +                {
> +                  /* It is not valid to use %gs: or %fs: in
> +                     lea though, so we need to remove it from the
> +                     address used for lea and add it to each individual
> +                     memory loads instead.  */
> +                  rtx *x = &addr;
> +                  while (GET_CODE (*x) == PLUS)
> +                    {
> +                      for (i = 0; i < 2; i++)
> +                        {
> +                          rtx op = XEXP (*x, i);
> +                          if ((GET_CODE (op) == UNSPEC
> +                               && XINT (op, 1) == UNSPEC_TP)
> +                              || (GET_CODE (op) == ZERO_EXTEND
> +                                  && GET_CODE (XEXP (op, 0)) == UNSPEC
> +                                  && (XINT (XEXP (op, 0), 1)
> +                                      == UNSPEC_TP)))
> +                            {
> +                              tls_base = XEXP (*x, i);
> +                              XEXP (*x, i) = const0_rtx;
> +                              break;
> +                            }
> +                        }
> +
> +                      if (tls_base)
> +                        break;
> +                      x = &XEXP (*x, 0);
> +                    }
> +                  gcc_assert (tls_base);
> +                }
> +            }
> +          emit_insn (gen_rtx_SET (base, addr));
> +          if (tls_base)
> +            base = gen_rtx_PLUS (GET_MODE (base), base, tls_base);
>            part[1][0] = replace_equiv_address (part[1][0], base);
>            for (i = 1; i < nparts; i++)
>              {
> +              if (tls_base)
> +                base = copy_rtx (base);
>                tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i);
>                part[1][i] = replace_equiv_address (part[1][i], tmp);
>              }

	Jakub
On Tue, Jun 9, 2015 at 9:30 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Tue, Jun 09, 2015 at 08:09:28PM +0200, Uros Bizjak wrote:
>> Please find attached a patch that takes your idea slightly further. We
>> find perhaps zero-extended UNSPEC_TP, and copy it for further use. At
>> its place, we simply slap const0_rtx. We know that address to
>
> Is that safe? I mean, the address, even if offsettable, can have some
> immediate already (seems e.g. the offsettable_memref_p predicate just checks
> you can plus_constant some small integer and be recognized again) and if you
> turn the %gs: into a const0_rtx, it would fail next decompose.
> And when you already have the PLUS which has UNSPEC_TP as one of its
> arguments, replacing that PLUS with the other argument is IMHO very easy.
> Perhaps you are right that there is no need to copy_rtx, supposedly
> the rtx shouldn't be shared with anything and thus can be modified in place.

Hm, you are right. I was under the impression that decompose_address can
handle multiple CONST_INT addends, which is unfortunately not the
case.

> If -mx32 is a non-issue here, then perhaps my initial patch is good enough?

It looks to me that if you detect and record the zero-extended UNSPEC_TP,
your original patch would also handle -mx32. Can you please repost your
original patch with the above addition?

Thanks,
Uros.
Uros Bizjak <ubizjak@gmail.com> writes:
> On Tue, Jun 9, 2015 at 9:30 PM, Jakub Jelinek <jakub@redhat.com> wrote:
>> On Tue, Jun 09, 2015 at 08:09:28PM +0200, Uros Bizjak wrote:
>>> Please find attached a patch that takes your idea slightly further. We
>>> find perhaps zero-extended UNSPEC_TP, and copy it for further use. At
>>> its place, we simply slap const0_rtx. We know that address to
>>
>> Is that safe? I mean, the address, even if offsettable, can have some
>> immediate already (seems e.g. the offsettable_memref_p predicate just checks
>> you can plus_constant some small integer and be recognized again) and if you
>> turn the %gs: into a const0_rtx, it would fail next decompose.
>> And when you already have the PLUS which has UNSPEC_TP as one of its
>> arguments, replacing that PLUS with the other argument is IMHO very easy.
>> Perhaps you are right that there is no need to copy_rtx, supposedly
>> the rtx shouldn't be shared with anything and thus can be modified in place.
>
> Hm, you are right. I was under the impression that decompose_address can
> handle multiple CONST_INT addends, which is unfortunately not the
> case.

That's in some ways a feature though. I don't think we want to support
multiple offsets, since that implies having more than one representation
for the same address.

Thanks,
Richard
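
The difference between the two approaches discussed in this thread can be sketched with a toy model. The code below does not use GCC's rtx API: `struct expr`, `mk`, `toy_decompose_ok`, `zero_out_tp` and `splice_out_tp` are all hypothetical names invented for illustration, and `toy_decompose_ok` only imitates the one property mentioned above, namely that a second constant addend is rejected. Under those assumptions, overwriting the thread-pointer term with a constant zero leaves two constants in the sum, while replacing the enclosing PLUS with its other operand leaves one.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for an address expression; this is NOT GCC's rtx.  */
enum kind { REG, CONST, TP, PLUS };

struct expr {
  enum kind kind;
  long value;            /* register number or constant value */
  struct expr *op[2];    /* operands, used only by PLUS */
};

/* Allocate a node; nodes are never freed in this short sketch.  */
static struct expr *mk(enum kind k, long v, struct expr *a, struct expr *b)
{
  struct expr *e = malloc(sizeof *e);
  assert(e != NULL);
  e->kind = k;
  e->value = v;
  e->op[0] = a;
  e->op[1] = b;
  return e;
}

/* Count constant addends and thread-pointer terms in a sum.  */
static void count(const struct expr *e, int *consts, int *tps)
{
  if (e->kind == PLUS) {
    count(e->op[0], consts, tps);
    count(e->op[1], consts, tps);
  } else if (e->kind == CONST) {
    ++*consts;
  } else if (e->kind == TP) {
    ++*tps;
  }
}

/* Toy check: accept at most one constant displacement (assumption
   modelled on the behaviour described in the thread).  */
static int toy_decompose_ok(const struct expr *e)
{
  int consts = 0, tps = 0;
  count(e, &consts, &tps);
  return consts <= 1 && tps <= 1;
}

/* Approach A: overwrite the thread-pointer operand with constant 0.  */
static int zero_out_tp(struct expr **x)
{
  while ((*x)->kind == PLUS) {
    for (int i = 0; i < 2; i++)
      if ((*x)->op[i]->kind == TP) {
        (*x)->op[i] = mk(CONST, 0, NULL, NULL);
        return 1;
      }
    x = &(*x)->op[0];
  }
  return 0;
}

/* Approach B: replace the PLUS holding the thread pointer with its
   other operand, so no extra constant is introduced.  */
static int splice_out_tp(struct expr **x)
{
  while ((*x)->kind == PLUS) {
    for (int i = 0; i < 2; i++)
      if ((*x)->op[i]->kind == TP) {
        *x = (*x)->op[1 - i];
        return 1;
      }
    x = &(*x)->op[0];
  }
  return 0;
}

/* Build (plus (plus (tp) (reg 3)) (const 8)).  */
static struct expr *sample(void)
{
  return mk(PLUS, 0,
            mk(PLUS, 0, mk(TP, 0, NULL, NULL), mk(REG, 3, NULL, NULL)),
            mk(CONST, 8, NULL, NULL));
}
```

Calling `zero_out_tp` on the tree from `sample()` produces a sum with the constants 0 and 8, which `toy_decompose_ok` rejects; calling `splice_out_tp` on a fresh copy produces `(plus (reg 3) (const 8))`, which it accepts.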
Index: config/i386/i386.c
===================================================================
--- config/i386/i386.c (revision 224292)
+++ config/i386/i386.c (working copy)
@@ -22858,7 +22858,7 @@ ix86_split_long_move (rtx operands[])
          Do an lea to the last part and use only one colliding move.  */
       else if (collisions > 1)
         {
-          rtx base;
+          rtx base, addr, tls_base = NULL_RTX;
 
           collisions = 1;
 
@@ -22869,10 +22869,52 @@ ix86_split_long_move (rtx operands[])
           if (GET_MODE (base) != Pmode)
             base = gen_rtx_REG (Pmode, REGNO (base));
 
-          emit_insn (gen_rtx_SET (base, XEXP (part[1][0], 0)));
+          addr = XEXP (part[1][0], 0);
+          if (TARGET_TLS_DIRECT_SEG_REFS)
+            {
+              struct ix86_address parts;
+              int ok = ix86_decompose_address (addr, &parts);
+              gcc_assert (ok);
+              if (parts.seg != SEG_DEFAULT)
+                {
+                  /* It is not valid to use %gs: or %fs: in
+                     lea though, so we need to remove it from the
+                     address used for lea and add it to each individual
+                     memory loads instead.  */
+                  rtx *x = &addr;
+                  while (GET_CODE (*x) == PLUS)
+                    {
+                      for (i = 0; i < 2; i++)
+                        {
+                          rtx op = XEXP (*x, i);
+                          if ((GET_CODE (op) == UNSPEC
+                               && XINT (op, 1) == UNSPEC_TP)
+                              || (GET_CODE (op) == ZERO_EXTEND
+                                  && GET_CODE (XEXP (op, 0)) == UNSPEC
+                                  && (XINT (XEXP (op, 0), 1)
+                                      == UNSPEC_TP)))
+                            {
+                              tls_base = XEXP (*x, i);
+                              XEXP (*x, i) = const0_rtx;
+                              break;
+                            }
+                        }
+
+                      if (tls_base)
+                        break;
+                      x = &XEXP (*x, 0);
+                    }
+                  gcc_assert (tls_base);
+                }
+            }
+          emit_insn (gen_rtx_SET (base, addr));
+          if (tls_base)
+            base = gen_rtx_PLUS (GET_MODE (base), base, tls_base);
           part[1][0] = replace_equiv_address (part[1][0], base);
           for (i = 1; i < nparts; i++)
             {
+              if (tls_base)
+                base = copy_rtx (base);
               tmp = plus_constant (Pmode, base, UNITS_PER_WORD * i);
               part[1][i] = replace_equiv_address (part[1][i], tmp);
             }
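
As a rough illustration of what the patch aims for, the sketch below formats the memory operand of each split part once the lea has placed the segment-free address in a base register. It is not GCC code: `format_part_operand` is a hypothetical helper, and the register `%eax`, the segment `%gs` and a word size of 4 are assumptions chosen for a 32-bit example. The point it shows is that the segment override is applied to every individual access, with part `i` at displacement `word_size * i`, rather than appearing in the lea itself.

```c
#include <stdio.h>

/* Hypothetical helper: write the AT&T-syntax operand for part number
   PART into BUF, e.g. "%gs:4(%eax)".  SEG is the segment register that
   was stripped from the lea address; BASE_REG holds the lea result.
   Returns the value of snprintf.  */
static int format_part_operand(char *buf, size_t len, const char *seg,
                               const char *base_reg, int word_size,
                               int part)
{
  return snprintf(buf, len, "%s:%d(%s)", seg, word_size * part, base_reg);
}
```

For a two-part move under these assumptions, the helper would be called with `part` equal to 0 and 1, giving displacements 0 and 4 off the same base register and the same segment.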