Message ID | 20191016133533.GA4483@ibm-toto.the-meissners.org |
---|---|
State | New |
Headers | show |
Series | V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions | expand |
Hi! On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote: > This patch uses the target hook ADJUST_INSN_LENGTH to change the length of > instructions that contain prefixed memory/add instructions. That made this amazingly hard to review. But it might well be worth it, thankfully :-) > There are 2 new insn attributes: > > 1) num_insns: If non-zero, returns the number of machine instructions in an > insn. This simplifies the calculations in rs6000_insn_cost. This is great. > 2) max_prefixed_insns: Returns the maximum number of prefixed instructions in > an insn. Normally this is 1, but in the insns that load up 128-bit values into > GPRs, it will be 2. This one, I am not so sure. > - int n = get_attr_length (insn) / 4; > + /* If the insn tells us how many insns there are, use that. Otherwise use > + the length/4. Adjust the insn length to remove the extra size that > + prefixed instructions take. */ This should be temporary, until we have converted everything to use num_insns, right? > + int n = get_attr_num_insns (insn); > + if (n == 0) > + { > + int length = get_attr_length (insn); > + if (get_attr_prefixed (insn) == PREFIXED_YES) > + { > + int adjust = 0; > + ADJUST_INSN_LENGTH (insn, adjust); > + length -= adjust; > + } > + > + n = length / 4; > + } > --- gcc/config/rs6000/rs6000.h (revision 277017) > +++ gcc/config/rs6000/rs6000.h (working copy) > @@ -1847,9 +1847,30 @@ extern scalar_int_mode rs6000_pmode; > /* Adjust the length of an INSN. LENGTH is the currently-computed length and > should be adjusted to reflect any required changes. This macro is used when > there is some systematic length adjustment required that would be difficult > - to express in the length attribute. */ > + to express in the length attribute. > > -/* #define ADJUST_INSN_LENGTH(X,LENGTH) */ > + In the PowerPC, we use this to adjust the length of an instruction if one or > + more prefixed instructions are generated, using the attribute > + num_prefixed_insns. A prefixed instruction is 8 bytes instead of 4, but the > + hardware requires that a prefied instruciton not cross a 64-byte boundary. "prefixed instruction does not" > + This means the compiler has to assume the length of the first prefixed > + instruction is 12 bytes instead of 8 bytes. Since the length is already set > + for the non-prefixed instruction, we just need to udpate for the > + difference. */ > + > +#define ADJUST_INSN_LENGTH(INSN,LENGTH) \ > +{ \ > + if (NONJUMP_INSN_P (INSN)) \ > + { \ > + rtx pattern = PATTERN (INSN); \ > + if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER \ > + && get_attr_prefixed (INSN) == PREFIXED_YES) \ > + { \ > + int num_prefixed = get_attr_max_prefixed_insns (INSN); \ > + (LENGTH) += 4 * (num_prefixed + 1); \ > + } \ > + } \ > +} Please use a function, not a function-like macro. So this computes the *maximum* RTL instruction length, not considering how many of the machine insns in it need a prefix insn. Can't we do better? Hrm, I guess in all cases that matter we will split early anyway. > +;; Return the number of real hardware instructions in a combined insn. If it > +;; is 0, just use the length / 4. > +(define_attr "num_insns" "" (const_int 0)) So we could have the default value *be* length/4, not 0? > +;; If an insn is prefixed, return the maximum number of prefixed instructions > +;; in the insn. The macro ADJUST_INSN_LENGTH uses this number to adjust the > +;; insn length. > +(define_attr "max_prefixed_insns" "" (const_int 1)) "maximum number of prefixed machine instructions in the RTL instruction". > +;; Length of the instruction (in bytes). This length does not consider the > +;; length for prefixed instructions. The macro ADJUST_INSN_LENGTH will adjust > +;; the length if there are prefixed instructions. That is not what it does... it uses the maximum number of prefixed insns there could be. > +;; While it might be tempting to use num_insns to calculate the length, it can > +;; be problematical unless all insn lengths are adjusted to use num_insns > +;; (i.e. if num_insns is 0, it will get the length, which in turn will get > +;; num_insns and recurse). > +(define_attr "length" "" (const_int 4)) Yes, and not only is it tempting, it is what we are going to do! Right? :-) Just not just yet. So please use a function for ADJUST_INSN_LENGTH. Okay for trunk like that. Thanks! And sorry this took so long. Segher
On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote: > Hi! > > On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote: > > This patch uses the target hook ADJUST_INSN_LENGTH to change the length of > > instructions that contain prefixed memory/add instructions. > > That made this amazingly hard to review. But it might well be worth it, > thankfully :-) > > > There are 2 new insn attributes: > > > > 1) num_insns: If non-zero, returns the number of machine instructions in an > > insn. This simplifies the calculations in rs6000_insn_cost. > > This is great. > > > 2) max_prefixed_insns: Returns the maximum number of prefixed instructions in > > an insn. Normally this is 1, but in the insns that load up 128-bit values into > > GPRs, it will be 2. > > This one, I am not so sure. I wanted it to be simple, so in general it was just a constant. Since the only user of it has already checked that the insn is prefixed, I didn't think it needed the prefixed test to set it to 0. > > - int n = get_attr_length (insn) / 4; > > + /* If the insn tells us how many insns there are, use that. Otherwise use > > + the length/4. Adjust the insn length to remove the extra size that > > + prefixed instructions take. */ > > This should be temporary, until we have converted everything to use > num_insns, right? Well there were some 200+ places where length was set. > > --- gcc/config/rs6000/rs6000.h (revision 277017) > > +++ gcc/config/rs6000/rs6000.h (working copy) > > @@ -1847,9 +1847,30 @@ extern scalar_int_mode rs6000_pmode; > > /* Adjust the length of an INSN. LENGTH is the currently-computed length and > > should be adjusted to reflect any required changes. This macro is used when > > there is some systematic length adjustment required that would be difficult > > - to express in the length attribute. */ > > + to express in the length attribute. > > > > -/* #define ADJUST_INSN_LENGTH(X,LENGTH) */ > > + In the PowerPC, we use this to adjust the length of an instruction if one or > > + more prefixed instructions are generated, using the attribute > > + num_prefixed_insns. A prefixed instruction is 8 bytes instead of 4, but the > > + hardware requires that a prefied instruciton not cross a 64-byte boundary. > > "prefixed instruction does not" Thanks. > > + This means the compiler has to assume the length of the first prefixed > > + instruction is 12 bytes instead of 8 bytes. Since the length is already set > > + for the non-prefixed instruction, we just need to udpate for the > > + difference. */ > > + > > +#define ADJUST_INSN_LENGTH(INSN,LENGTH) \ > > +{ \ > > + if (NONJUMP_INSN_P (INSN)) \ > > + { \ > > + rtx pattern = PATTERN (INSN); \ > > + if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER \ > > + && get_attr_prefixed (INSN) == PREFIXED_YES) \ > > + { \ > > + int num_prefixed = get_attr_max_prefixed_insns (INSN); \ > > + (LENGTH) += 4 * (num_prefixed + 1); \ > > + } \ > > + } \ > > +} > > Please use a function, not a function-like macro. Ok, I added rs6000_adjust_insn_length in rs6000.c. > So this computes the *maximum* RTL instruction length, not considering how > many of the machine insns in it need a prefix insn. Can't we do better? > Hrm, I guess in all cases that matter we will split early anyway. Well before register allocation for the 128-bit types, you really can't say what the precise length is, even if it is not prefixed. And of course even after register allocation, it isn't precise, since the length of a prefixed instruction is normally 8, but sometimes 12. So we have to use 12. > > > +;; Return the number of real hardware instructions in a combined insn. If it > > +;; is 0, just use the length / 4. > > +(define_attr "num_insns" "" (const_int 0)) > > So we could have the default value *be* length/4, not 0? Only if you make sure that every place sets num_insns. As the comment says, until it is set every where, you run the risk of a deadly embrace. > > +;; If an insn is prefixed, return the maximum number of prefixed instructions > > +;; in the insn. The macro ADJUST_INSN_LENGTH uses this number to adjust the > > +;; insn length. > > +(define_attr "max_prefixed_insns" "" (const_int 1)) > > "maximum number of prefixed machine instructions in the RTL instruction". > > So please use a function for ADJUST_INSN_LENGTH. Okay for trunk like that. > Thanks! And sorry this took so long. Checked in, thanks.
On Wed, Oct 23, 2019 at 05:00:58PM -0400, Michael Meissner wrote: > On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote: > > On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote: > > > - int n = get_attr_length (insn) / 4; > > > + /* If the insn tells us how many insns there are, use that. Otherwise use > > > + the length/4. Adjust the insn length to remove the extra size that > > > + prefixed instructions take. */ > > > > This should be temporary, until we have converted everything to use > > num_insns, right? > > Well there were some 200+ places where length was set. Yes, and I did volunteer to do this work, if needed / wanted. > > Please use a function, not a function-like macro. > > Ok, I added rs6000_adjust_insn_length in rs6000.c. Thanks. > > > +;; Return the number of real hardware instructions in a combined insn. If it > > > +;; is 0, just use the length / 4. > > > +(define_attr "num_insns" "" (const_int 0)) > > > > So we could have the default value *be* length/4, not 0? > > Only if you make sure that every place sets num_insns. As the comment says, > until it is set every where, you run the risk of a deadly embrace. Sure :-) Segher
On Wed, Oct 23, 2019 at 04:42:19PM -0500, Segher Boessenkool wrote: > On Wed, Oct 23, 2019 at 05:00:58PM -0400, Michael Meissner wrote: > > On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote: > > > On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote: > > > > - int n = get_attr_length (insn) / 4; > > > > + /* If the insn tells us how many insns there are, use that. Otherwise use > > > > + the length/4. Adjust the insn length to remove the extra size that > > > > + prefixed instructions take. */ > > > > > > This should be temporary, until we have converted everything to use > > > num_insns, right? > > > > Well there were some 200+ places where length was set. > > Yes, and I did volunteer to do this work, if needed / wanted. So I did this, but it does not work: I end up with out-of-range branches. I rewrote it in a different way: same thing. And again, and again. The fundamental problem is that the "length" attribute is special. It actually is four attributes: insn_current_length (that's the one you expect) insn_variable_length_p (true if the length of this insn depends on the (relative) addresses of any (other) insns) insn_min_length (minimum length: minimum of the alternatives) insn_default_length (which really should be called insn_max_length) which are used in the shorten pass (which actually makes branches *longer* in the normal case). The problem is that these are only computed for static values that are set in the "length" attribute in the machine description, while we actually want to set "length" based on some other attributes ("num_insns", "prefixed", "max_prefixed_insns"), either directly or via adjust_insn_length. The shorten pass (which lenghtens branches, unless -O0) does not notice most insns that can become longer, and eventually we ICE because of an out-of-range branch. Hrm, maybe if I force insn_variable_length_p to "true". Let me try that. Segher
Index: gcc/config/rs6000/rs6000.c =================================================================== --- gcc/config/rs6000/rs6000.c (revision 277017) +++ gcc/config/rs6000/rs6000.c (working copy) @@ -20973,14 +20973,32 @@ rs6000_insn_cost (rtx_insn *insn, bool s if (recog_memoized (insn) < 0) return 0; + /* If we are optimizing for size, just use the length. */ if (!speed) return get_attr_length (insn); + /* Use the cost if provided. */ int cost = get_attr_cost (insn); if (cost > 0) return cost; - int n = get_attr_length (insn) / 4; + /* If the insn tells us how many insns there are, use that. Otherwise use + the length/4. Adjust the insn length to remove the extra size that + prefixed instructions take. */ + int n = get_attr_num_insns (insn); + if (n == 0) + { + int length = get_attr_length (insn); + if (get_attr_prefixed (insn) == PREFIXED_YES) + { + int adjust = 0; + ADJUST_INSN_LENGTH (insn, adjust); + length -= adjust; + } + + n = length / 4; + } + enum attr_type type = get_attr_type (insn); switch (type) Index: gcc/config/rs6000/rs6000.h =================================================================== --- gcc/config/rs6000/rs6000.h (revision 277017) +++ gcc/config/rs6000/rs6000.h (working copy) @@ -1847,9 +1847,30 @@ extern scalar_int_mode rs6000_pmode; /* Adjust the length of an INSN. LENGTH is the currently-computed length and should be adjusted to reflect any required changes. This macro is used when there is some systematic length adjustment required that would be difficult - to express in the length attribute. */ + to express in the length attribute. -/* #define ADJUST_INSN_LENGTH(X,LENGTH) */ + In the PowerPC, we use this to adjust the length of an instruction if one or + more prefixed instructions are generated, using the attribute + num_prefixed_insns. A prefixed instruction is 8 bytes instead of 4, but the + hardware requires that a prefied instruciton not cross a 64-byte boundary. + This means the compiler has to assume the length of the first prefixed + instruction is 12 bytes instead of 8 bytes. Since the length is already set + for the non-prefixed instruction, we just need to udpate for the + difference. */ + +#define ADJUST_INSN_LENGTH(INSN,LENGTH) \ +{ \ + if (NONJUMP_INSN_P (INSN)) \ + { \ + rtx pattern = PATTERN (INSN); \ + if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER \ + && get_attr_prefixed (INSN) == PREFIXED_YES) \ + { \ + int num_prefixed = get_attr_max_prefixed_insns (INSN); \ + (LENGTH) += 4 * (num_prefixed + 1); \ + } \ + } \ +} /* Given a comparison code (EQ, NE, etc.) and the first operand of a COMPARE, return the mode to be used for the comparison. For Index: gcc/config/rs6000/rs6000.md =================================================================== --- gcc/config/rs6000/rs6000.md (revision 277017) +++ gcc/config/rs6000/rs6000.md (working copy) @@ -285,20 +285,24 @@ (define_attr "prefixed" "no,yes" (const_string "no"))) -;; Length in bytes of instructions that use prefixed addressing and length in -;; bytes of instructions that does not use prefixed addressing. This allows -;; both lengths to be defined as constants, and the length attribute can pick -;; the size as appropriate. -(define_attr "prefixed_length" "" (const_int 12)) -(define_attr "non_prefixed_length" "" (const_int 4)) - -;; Length of the instruction (in bytes). Prefixed insns are 8 bytes, but the -;; assembler might issue need to issue a NOP so that the prefixed instruction -;; does not cross a cache boundary, which makes them possibly 12 bytes. -(define_attr "length" "" - (if_then_else (eq_attr "prefixed" "yes") - (attr "prefixed_length") - (attr "non_prefixed_length"))) +;; Return the number of real hardware instructions in a combined insn. If it +;; is 0, just use the length / 4. +(define_attr "num_insns" "" (const_int 0)) + +;; If an insn is prefixed, return the maximum number of prefixed instructions +;; in the insn. The macro ADJUST_INSN_LENGTH uses this number to adjust the +;; insn length. +(define_attr "max_prefixed_insns" "" (const_int 1)) + +;; Length of the instruction (in bytes). This length does not consider the +;; length for prefixed instructions. The macro ADJUST_INSN_LENGTH will adjust +;; the length if there are prefixed instructions. +;; +;; While it might be tempting to use num_insns to calculate the length, it can +;; be problematical unless all insn lengths are adjusted to use num_insns +;; (i.e. if num_insns is 0, it will get the length, which in turn will get +;; num_insns and recurse). +(define_attr "length" "" (const_int 4)) ;; Processor type -- this attribute must exactly match the processor_type ;; enumeration in rs6000-opts.h. @@ -7775,7 +7779,9 @@ (define_insn_and_split "*mov<mode>_64bit [(pc)] { rs6000_split_multireg_move (operands[0], operands[1]); DONE; } [(set_attr "length" "8") - (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")]) + (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v") + (set_attr "max_prefixed_insns" "2") + (set_attr "num_insns" "2")]) (define_insn_and_split "*movtd_64bit_nodm" [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r") @@ -7787,7 +7793,9 @@ (define_insn_and_split "*movtd_64bit_nod "&& reload_completed" [(pc)] { rs6000_split_multireg_move (operands[0], operands[1]); DONE; } - [(set_attr "length" "8,8,8,12,12,8")]) + [(set_attr "length" "8,8,8,12,12,8") + (set_attr "max_prefixed_insns" "2") + (set_attr "num_insns" "2,2,2,3,3,2")]) (define_insn_and_split "*mov<mode>_32bit" [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r") @@ -8984,7 +8992,8 @@ (define_insn "*mov<mode>_ppc64" return rs6000_output_move_128bit (operands); } [(set_attr "type" "store,store,load,load,*,*") - (set_attr "length" "8")]) + (set_attr "length" "8") + (set_attr "max_prefixed_insns" "2")]) (define_split [(set (match_operand:TI2 0 "int_reg_operand") Index: gcc/config/rs6000/vsx.md =================================================================== --- gcc/config/rs6000/vsx.md (revision 277017) +++ gcc/config/rs6000/vsx.md (working copy) @@ -1149,6 +1149,14 @@ (define_insn "vsx_mov<mode>_64bit" "vecstore, vecload, vecsimple, mffgpr, mftgpr, load, store, load, store, *, vecsimple, vecsimple, vecsimple, *, *, vecstore, vecload") + (set_attr "num_insns" + "*, *, *, 2, *, 2, + 2, 2, 2, 2, *, *, + *, 5, 2, *, *") + (set_attr "max_prefixed_insns" + "*, *, *, *, *, 2, + 2, 2, 2, 2, *, *, + *, *, *, *, *") (set_attr "length" "*, *, *, 8, *, 8, 8, 8, 8, 8, *, *,