Message ID | alpine.BSF.2.02.1211112347180.73231@dair.pair.com |
---|---|
State | New |
Headers | show |
> This is a target-specific blockage insn, but with the general form > found in all targets defining it. The default blockage is an empty > asm-volatile, which is what cse_insn recognized. The blockage insn is > there to "prevent scheduling" of the critical insns and register > values. It's almost obvious that an unspec_volatile should have that > effect "too"; at least that's intended by the code in > expand_builtin_setjmp_receiver. Luckily (or unluckily regarding the > presence of the bug) *this* cse code is not run post-frame-layout > (post-reload-cse does not use this code), or it seems people would > soon notice register values used from the wrong side of the blockage, > considering its critical use at the restore of the stack-pointer. > As mentioned in the previous patch, > <http://gcc.gnu.org/ml/gcc-patches/2012-10/msg01172.html>, clobbering > the frame-pointer (and then using it) does not seem the logical > alternative to the patch below; the blockage insn should just do its job. Agreed. > I updated comments and documentation so it's more apparent that > register uses should also not be moved across the blockage; see the > patched comments. > > Tested native x86_64-unknown-linux-gnu w./wo. -m32 and before/after > 192677. Ok to commit? > > gcc: > PR middle-end/55030 > * builtins.c (expand_builtin_setjmp_receiver): Update comment > regarding purpose of blockage. > * emit-rtl.c [!HAVE_blockage] (gen_blockage): Similarly for > the head comment. > * doc/md.texi (blockage): Update similarly. Change wording to > require one of two forms, rather than implying a wider choice. > * cse.c (cse_insn): Where checking for blocking insns, treat > UNSPEC_VOLATILE as blocking, besides volatile ASM. That's fine on principle, but there is a predicate for this (volatile_insn_p) so I think we should use it here. Moreover, cselib_process_insn has the same check so we should adjust it as well, which in turn means that dse.c:scan_insn should probably be adjusted too. OK with these changes, thanks.
Index: gcc/builtins.c =================================================================== --- gcc/builtins.c (revision 192677) +++ gcc/builtins.c (working copy) @@ -963,7 +963,8 @@ expand_builtin_setjmp_receiver (rtx rece /* We must not allow the code we just generated to be reordered by scheduling. Specifically, the update of the frame pointer must - happen immediately, not later. */ + happen immediately, not later. Similarly, we must block + (frame-related) register values to be used across this code. */ emit_insn (gen_blockage ()); } Index: gcc/cse.c =================================================================== --- gcc/cse.c (revision 192677) +++ gcc/cse.c (working copy) @@ -5660,10 +5660,11 @@ cse_insn (rtx insn) invalidate (XEXP (dest, 0), GET_MODE (dest)); } - /* A volatile ASM invalidates everything. */ + /* A volatile ASM or an UNSPEC_VOLATILE invalidates everything. */ if (NONJUMP_INSN_P (insn) - && GET_CODE (PATTERN (insn)) == ASM_OPERANDS - && MEM_VOLATILE_P (PATTERN (insn))) + && ((GET_CODE (PATTERN (insn)) == ASM_OPERANDS + && MEM_VOLATILE_P (PATTERN (insn))) + || GET_CODE (PATTERN (insn)) == UNSPEC_VOLATILE)) flush_hash_table (); /* Don't cse over a call to setjmp; on some machines (eg VAX) Index: gcc/emit-rtl.c =================================================================== --- gcc/emit-rtl.c (revision 192677) +++ gcc/emit-rtl.c (working copy) @@ -364,8 +364,8 @@ get_reg_attrs (tree decl, int offset) #if !HAVE_blockage -/* Generate an empty ASM_INPUT, which is used to block attempts to schedule - across this insn. */ +/* Generate an empty ASM_INPUT, which is used to block attempts to schedule, + and to block register equivalences to be seen across this insn. */ rtx gen_blockage (void) Index: gcc/doc/md.texi =================================================================== --- gcc/doc/md.texi (revision 192677) +++ gcc/doc/md.texi (working copy) @@ -5832,8 +5832,9 @@ the values of operands 1 and 2. @item @samp{blockage} This pattern defines a pseudo insn that prevents the instruction -scheduler from moving instructions across the boundary defined by the -blockage insn. Normally an UNSPEC_VOLATILE pattern. +scheduler and other passes from moving instructions and using register +equivalences across the boundary defined by the blockage insn. +This needs to be an UNSPEC_VOLATILE pattern or a volatile ASM. @cindex @code{memory_barrier} instruction pattern @item @samp{memory_barrier}