Message ID | aa072e76827ffbc48f5abacbca1f3fd5388647a8.1434644736.git.segher@kernel.crashing.org |
---|---|
State | New |
On Thu, Jun 18, 2015 at 1:08 PM, Segher Boessenkool
<segher@kernel.crashing.org> wrote:
> The macro WORD_REGISTER_OPERATIONS, if defined, means that all reg-reg
> operations on data smaller than words are performed on the full word.
> For TARGET_POWERPC64 words are 64 bits; but many operations on SImode
> do not behave as if on DImode.  So rs6000 should not define the macro.
>
> Bootstrapped and tested as usual (-m32,-m32/-mpowerpc64,-m64,-m64/-mlra),
> no regressions.  Is this okay for mainline?
>
> -
>
> I did some analysis on the code differences this causes.
>
> - For both 32-bit and 64-bit, combine can combine more AND instructions,
>   including to a whole bunch of dot forms.  This is mostly because combine
>   thinks it should "simplify" to a smaller mode (because it has more info
>   about zero bits), but we have no compare instructions in smaller modes.
>
> - Range checks (x >= a && x <= b) are problematic.  They are folded (in
>   the frontend already) to the usual  x-a u<= b-a  affair, but often in
>   less than 32 bits.  This survives in that form throughout the middle end,
>   and then expand makes it a minus, a zero_extend from the smaller mode to
>   SImode, and a compare as SI.  Without WORD_REGISTER_OPERATIONS combine
>   can never get rid of the zero_extend (and with it, only sometimes).  Had
>   it been a zero_extend, minus, compare in that order (with slightly
>   modified constants to adjust for the wider mode), the zero_extend can
>   more often be removed.  This happens on almost all targets.
>
> - For 64-bit, many 64-bit loads are changed to 32-bit loads.  This is
>   fine in most places; the one case that looks nasty is where it spills
>   a 64-bit reg to stack and immediately loads it back as 32-bit (with
>   an  ori 2,2,0  in between, thankfully).  Only reload does this; LRA makes
>   better code (with a clrldi), not worse than with W_R_O defined.
>
> In all, you get about 1 in 20000 extra insns (and a bit more for the
> compiler itself, it does a *lot* of range checks).  Following patches
> to improve the rotate insns more than make up for it (and get better
> results than with the macro defined even :-) )
>
>
> Segher
>
>
> 2015-06-18  Segher Boessenkool  <segher@kernel.crashing.org>
>
>         * config/rs6000/rs6000.h (WORD_REGISTER_OPERATIONS): Delete.

Okay.

Thanks, David
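For readers unfamiliar with the range-check folding mentioned in the quoted message, here is a minimal C sketch of the idea. It is an illustration only, not code from GCC or from the patch: the function names are hypothetical, uint8_t stands in for "less than 32 bits", and a <= b is assumed throughout. in_range_folded mirrors the "minus, then zero_extend, then compare" order; in_range_wide mirrors the "zero_extend, minus, compare" order.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Two-comparison form, as written in source code. */
static bool in_range_naive(uint8_t x, uint8_t a, uint8_t b)
{
    return x >= a && x <= b;
}

/* Folded form: subtract in the narrow mode (the cast truncates to 8 bits),
   then do a single unsigned compare.  Assumes a <= b. */
static bool in_range_folded(uint8_t x, uint8_t a, uint8_t b)
{
    return (uint8_t)(x - a) <= (uint8_t)(b - a);
}

/* Widened form: zero-extend x first, then subtract and compare in 32 bits.
   If x < a the subtraction wraps to a huge value and the test fails.
   Assumes a <= b. */
static bool in_range_wide(uint8_t x, uint8_t a, uint8_t b)
{
    uint32_t wx = x;                    /* the zero_extend */
    return wx - a <= (uint32_t)(b - a); /* minus, compare, both 32-bit */
}

/* Exhaustively compare the three forms for all a <= b and all x. */
static long count_range_mismatches(void)
{
    long mismatches = 0;
    for (unsigned a = 0; a < 256; a++)
        for (unsigned b = a; b < 256; b++)
            for (unsigned x = 0; x < 256; x++) {
                bool want = in_range_naive(x, a, b);
                if (in_range_folded(x, a, b) != want
                    || in_range_wide(x, a, b) != want)
                    mismatches++;
            }
    return mismatches;
}
```

All three forms agree for every a <= b; the difference the message describes is only in where the extension sits, which affects how easily combine can remove it.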
diff --git a/gcc/config/rs6000/rs6000.h b/gcc/config/rs6000/rs6000.h
index 1b1145f..ef8ff38 100644
--- a/gcc/config/rs6000/rs6000.h
+++ b/gcc/config/rs6000/rs6000.h
@@ -2039,10 +2039,6 @@ do { \
    is undesirable.  */
 #define SLOW_BYTE_ACCESS 1
 
-/* Define if operations between registers always perform the operation
-   on the full register even if a narrower mode is specified.  */
-#define WORD_REGISTER_OPERATIONS
-
 /* Define if loading in MODE, an integral mode narrower than BITS_PER_WORD
    will either zero-extend or sign-extend.  The value of this macro should
    be the code that says which one of the two operations is implicitly
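
The comment removed above says register operations act on the full register even when a narrower mode is specified. As a hedged illustration of why that need not hold for 32-bit operations on a 64-bit word, the C sketch below compares a 32-bit rotate with the low half of a 64-bit rotate of the zero-extended value. Rotate is an assumed example (the message does not list which SImode operations differ), and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate left within 32 bits, as a 32-bit (SImode-style) operation would. */
static uint32_t rotl32(uint32_t x, unsigned n)
{
    n &= 31;
    return n ? (x << n) | (x >> (32 - n)) : x;
}

/* Rotate left within 64 bits, as a full-word (DImode-style) operation would. */
static uint64_t rotl64(uint64_t x, unsigned n)
{
    n &= 63;
    return n ? (x << n) | (x >> (64 - n)) : x;
}

/* Low 32 bits of the full-word rotate applied to the zero-extended input.
   If narrow operations really were full-word operations, this would
   always equal rotl32. */
static uint32_t rotl_via_full_word(uint32_t x, unsigned n)
{
    return (uint32_t)rotl64((uint64_t)x, n);
}
```

For x = 0x80000001 and n = 1, rotl32 gives 0x00000003 while rotl_via_full_word gives 0x00000002: the bit rotated out of position 31 goes to bit 32 of the wide value instead of wrapping to bit 0.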