diff mbox series

V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions

Message ID 20191016133533.GA4483@ibm-toto.the-meissners.org
State New
Headers show
Series V6, #1 of 17: Use ADJUST_INSN_LENGTH for prefixed instructions | expand

Commit Message

Michael Meissner Oct. 16, 2019, 1:35 p.m. UTC
This patch uses the target hook ADJUST_INSN_LENGTH to change the length of
instructions that contain prefixed memory/add instructions.

There are 2 new insn attributes:

1) num_insns: If non-zero, returns the number of machine instructions in an
insn.  This simplifies the calculations in rs6000_insn_cost.

2) max_prefixed_insns: Returns the maximum number of prefixed instructions in
an insn.  Normally this is 1, but in the insns that load up 128-bit values into
GPRs, it will be 2.

This patch replaces patches #2, #3, and #6 from the V5 patch series.

Along with the other patches, I have done bootstraps on a little endian power8
system, and there were no regressions in the test suite.  I have built both
Spec 2006 and Spec 2017 with all of these patches installed using -mcpu=future,
and there were no failures.  Can I check this into the trunk?

Note, I may have limited email access on October 17th and 18th, 2019.

2019-10-15  Michael Meissner  <meissner@linux.ibm.com>

	* config/rs6000/rs6000.c (rs6000_insn_cost): Use num_insns insn
	attribute if it exists, rather than the insn size.  If we use the
	insn size, adjust the size to remove the extra size that prefixed
	instructions take.
	* config/rs6000/rs6000.h (ADJUST_INSN_LENGTH): New target hook to
	update the instruction sized if prefixed instructions are used.
	* config/rs6000/rs6000.md (prefixed_length attribute): Delete.
	(non_prefixed_length attribute): Delete.
	(num_insns attribute): New insn attribute to return the number of
	instructions.
	(max_prefixed_insns attribute): New insn attribute to return the
	maximum number of prefixed instructions in an insn.
	(length attribute): Do not adjust for prefix instructions here,
	punt to ADJUST_INSN_LENGTH.
	(mov<mode>_64bit): Set max_prefixed_insns and num_insns.
	(movtd_64bit_nodm): Set max_prefixed_insns and num_insns.
	(mov<mode>_ppc64): Set max_prefixed_insns and num_insns.
	* config/rs6000/vsx.md: (vsx_mov<mode>_64bit): Set
	max_prefixed_insns and num_insns.

Comments

Segher Boessenkool Oct. 22, 2019, 10:27 p.m. UTC | #1
Hi!

On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote:
> This patch uses the target hook ADJUST_INSN_LENGTH to change the length of
> instructions that contain prefixed memory/add instructions.

That made this amazingly hard to review.  But it might well be worth it,
thankfully :-)

> There are 2 new insn attributes:
> 
> 1) num_insns: If non-zero, returns the number of machine instructions in an
> insn.  This simplifies the calculations in rs6000_insn_cost.

This is great.

> 2) max_prefixed_insns: Returns the maximum number of prefixed instructions in
> an insn.  Normally this is 1, but in the insns that load up 128-bit values into
> GPRs, it will be 2.

This one, I am not so sure.

> -  int n = get_attr_length (insn) / 4;
> +  /* If the insn tells us how many insns there are, use that.  Otherwise use
> +     the length/4.  Adjust the insn length to remove the extra size that
> +     prefixed instructions take.  */

This should be temporary, until we have converted everything to use
num_insns, right?

> +  int n = get_attr_num_insns (insn);
> +  if (n == 0)
> +    {
> +      int length = get_attr_length (insn);
> +      if (get_attr_prefixed (insn) == PREFIXED_YES)
> +	{
> +	  int adjust = 0;
> +	  ADJUST_INSN_LENGTH (insn, adjust);
> +	  length -= adjust;
> +	}
> +
> +      n = length / 4;
> +    }

> --- gcc/config/rs6000/rs6000.h	(revision 277017)
> +++ gcc/config/rs6000/rs6000.h	(working copy)
> @@ -1847,9 +1847,30 @@ extern scalar_int_mode rs6000_pmode;
>  /* Adjust the length of an INSN.  LENGTH is the currently-computed length and
>     should be adjusted to reflect any required changes.  This macro is used when
>     there is some systematic length adjustment required that would be difficult
> -   to express in the length attribute.  */
> +   to express in the length attribute.
>  
> -/* #define ADJUST_INSN_LENGTH(X,LENGTH) */
> +   In the PowerPC, we use this to adjust the length of an instruction if one or
> +   more prefixed instructions are generated, using the attribute
> +   num_prefixed_insns.  A prefixed instruction is 8 bytes instead of 4, but the
> +   hardware requires that a prefied instruciton not cross a 64-byte boundary.

"prefixed instruction does not"

> +   This means the compiler has to assume the length of the first prefixed
> +   instruction is 12 bytes instead of 8 bytes.  Since the length is already set
> +   for the non-prefixed instruction, we just need to udpate for the
> +   difference.  */
> +
> +#define ADJUST_INSN_LENGTH(INSN,LENGTH)					\
> +{									\
> +  if (NONJUMP_INSN_P (INSN))						\
> +    {									\
> +      rtx pattern = PATTERN (INSN);					\
> +      if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER	\
> +	  && get_attr_prefixed (INSN) == PREFIXED_YES)			\
> +	{								\
> +	  int num_prefixed = get_attr_max_prefixed_insns (INSN);	\
> +	  (LENGTH) += 4 * (num_prefixed + 1);				\
> +	}								\
> +    }									\
> +}

Please use a function, not a function-like macro.

So this computes the *maximum* RTL instruction length, not considering how
many of the machine insns in it need a prefix insn.  Can't we do better?
Hrm, I guess in all cases that matter we will split early anyway.


> +;; Return the number of real hardware instructions in a combined insn.  If it
> +;; is 0, just use the length / 4.
> +(define_attr "num_insns" "" (const_int 0))

So we could have the default value *be* length/4, not 0?

> +;; If an insn is prefixed, return the maximum number of prefixed instructions
> +;; in the insn.  The macro ADJUST_INSN_LENGTH uses this number to adjust the
> +;; insn length.
> +(define_attr "max_prefixed_insns" "" (const_int 1))

"maximum number of prefixed machine instructions in the RTL instruction".

> +;; Length of the instruction (in bytes).  This length does not consider the
> +;; length for prefixed instructions.  The macro ADJUST_INSN_LENGTH will adjust
> +;; the length if there are prefixed instructions.

That is not what it does...  it uses the maximum number of prefixed insns
there could be.

> +;; While it might be tempting to use num_insns to calculate the length, it can
> +;; be problematical unless all insn lengths are adjusted to use num_insns
> +;; (i.e. if num_insns is 0, it will get the length, which in turn will get
> +;; num_insns and recurse).
> +(define_attr "length" "" (const_int 4))

Yes, and not only is it tempting, it is what we are going to do!  Right?
:-)  Just not just yet.

So please use a function for ADJUST_INSN_LENGTH.  Okay for trunk like that.
Thanks!  And sorry this took so long.


Segher
Michael Meissner Oct. 23, 2019, 9 p.m. UTC | #2
On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote:
> Hi!
> 
> On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote:
> > This patch uses the target hook ADJUST_INSN_LENGTH to change the length of
> > instructions that contain prefixed memory/add instructions.
> 
> That made this amazingly hard to review.  But it might well be worth it,
> thankfully :-)
> 
> > There are 2 new insn attributes:
> > 
> > 1) num_insns: If non-zero, returns the number of machine instructions in an
> > insn.  This simplifies the calculations in rs6000_insn_cost.
> 
> This is great.
> 
> > 2) max_prefixed_insns: Returns the maximum number of prefixed instructions in
> > an insn.  Normally this is 1, but in the insns that load up 128-bit values into
> > GPRs, it will be 2.
> 
> This one, I am not so sure.

I wanted it to be simple, so in general it was just a constant.  Since the only
user of it has already checked that the insn is prefixed, I didn't think it
needed the prefixed test to set it to 0.

> > -  int n = get_attr_length (insn) / 4;
> > +  /* If the insn tells us how many insns there are, use that.  Otherwise use
> > +     the length/4.  Adjust the insn length to remove the extra size that
> > +     prefixed instructions take.  */
> 
> This should be temporary, until we have converted everything to use
> num_insns, right?

Well there were some 200+ places where length was set.

> > --- gcc/config/rs6000/rs6000.h	(revision 277017)
> > +++ gcc/config/rs6000/rs6000.h	(working copy)
> > @@ -1847,9 +1847,30 @@ extern scalar_int_mode rs6000_pmode;
> >  /* Adjust the length of an INSN.  LENGTH is the currently-computed length and
> >     should be adjusted to reflect any required changes.  This macro is used when
> >     there is some systematic length adjustment required that would be difficult
> > -   to express in the length attribute.  */
> > +   to express in the length attribute.
> >  
> > -/* #define ADJUST_INSN_LENGTH(X,LENGTH) */
> > +   In the PowerPC, we use this to adjust the length of an instruction if one or
> > +   more prefixed instructions are generated, using the attribute
> > +   num_prefixed_insns.  A prefixed instruction is 8 bytes instead of 4, but the
> > +   hardware requires that a prefied instruciton not cross a 64-byte boundary.
> 
> "prefixed instruction does not"

Thanks.

> > +   This means the compiler has to assume the length of the first prefixed
> > +   instruction is 12 bytes instead of 8 bytes.  Since the length is already set
> > +   for the non-prefixed instruction, we just need to udpate for the
> > +   difference.  */
> > +
> > +#define ADJUST_INSN_LENGTH(INSN,LENGTH)					\
> > +{									\
> > +  if (NONJUMP_INSN_P (INSN))						\
> > +    {									\
> > +      rtx pattern = PATTERN (INSN);					\
> > +      if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER	\
> > +	  && get_attr_prefixed (INSN) == PREFIXED_YES)			\
> > +	{								\
> > +	  int num_prefixed = get_attr_max_prefixed_insns (INSN);	\
> > +	  (LENGTH) += 4 * (num_prefixed + 1);				\
> > +	}								\
> > +    }									\
> > +}
> 
> Please use a function, not a function-like macro.

Ok, I added rs6000_adjust_insn_length in rs6000.c.

> So this computes the *maximum* RTL instruction length, not considering how
> many of the machine insns in it need a prefix insn.  Can't we do better?
> Hrm, I guess in all cases that matter we will split early anyway.

Well before register allocation for the 128-bit types, you really can't say
what the precise length is, even if it is not prefixed.

And of course even after register allocation, it isn't precise, since the
length of a prefixed instruction is normally 8, but sometimes 12.  So we have
to use 12.

> 
> > +;; Return the number of real hardware instructions in a combined insn.  If it
> > +;; is 0, just use the length / 4.
> > +(define_attr "num_insns" "" (const_int 0))
> 
> So we could have the default value *be* length/4, not 0?

Only if you make sure that every place sets num_insns.  As the comment says,
until it is set every where, you run the risk of a deadly embrace.

> > +;; If an insn is prefixed, return the maximum number of prefixed instructions
> > +;; in the insn.  The macro ADJUST_INSN_LENGTH uses this number to adjust the
> > +;; insn length.
> > +(define_attr "max_prefixed_insns" "" (const_int 1))
> 
> "maximum number of prefixed machine instructions in the RTL instruction".
> 
> So please use a function for ADJUST_INSN_LENGTH.  Okay for trunk like that.
> Thanks!  And sorry this took so long.

Checked in, thanks.
Segher Boessenkool Oct. 23, 2019, 9:42 p.m. UTC | #3
On Wed, Oct 23, 2019 at 05:00:58PM -0400, Michael Meissner wrote:
> On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote:
> > On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote:
> > > -  int n = get_attr_length (insn) / 4;
> > > +  /* If the insn tells us how many insns there are, use that.  Otherwise use
> > > +     the length/4.  Adjust the insn length to remove the extra size that
> > > +     prefixed instructions take.  */
> > 
> > This should be temporary, until we have converted everything to use
> > num_insns, right?
> 
> Well there were some 200+ places where length was set.

Yes, and I did volunteer to do this work, if needed / wanted.

> > Please use a function, not a function-like macro.
> 
> Ok, I added rs6000_adjust_insn_length in rs6000.c.

Thanks.

> > > +;; Return the number of real hardware instructions in a combined insn.  If it
> > > +;; is 0, just use the length / 4.
> > > +(define_attr "num_insns" "" (const_int 0))
> > 
> > So we could have the default value *be* length/4, not 0?
> 
> Only if you make sure that every place sets num_insns.  As the comment says,
> until it is set every where, you run the risk of a deadly embrace.

Sure :-)


Segher
Segher Boessenkool Oct. 31, 2019, 10:46 p.m. UTC | #4
On Wed, Oct 23, 2019 at 04:42:19PM -0500, Segher Boessenkool wrote:
> On Wed, Oct 23, 2019 at 05:00:58PM -0400, Michael Meissner wrote:
> > On Tue, Oct 22, 2019 at 05:27:19PM -0500, Segher Boessenkool wrote:
> > > On Wed, Oct 16, 2019 at 09:35:33AM -0400, Michael Meissner wrote:
> > > > -  int n = get_attr_length (insn) / 4;
> > > > +  /* If the insn tells us how many insns there are, use that.  Otherwise use
> > > > +     the length/4.  Adjust the insn length to remove the extra size that
> > > > +     prefixed instructions take.  */
> > > 
> > > This should be temporary, until we have converted everything to use
> > > num_insns, right?
> > 
> > Well there were some 200+ places where length was set.
> 
> Yes, and I did volunteer to do this work, if needed / wanted.

So I did this, but it does not work: I end up with out-of-range branches.
I rewrote it in a different way: same thing.  And again, and again.

The fundamental problem is that the "length" attribute is special.  It
actually is four attributes:

insn_current_length     (that's the one you expect)
insn_variable_length_p  (true if the length of this insn depends on
                         the (relative) addresses of any (other) insns)
insn_min_length         (minimum length: minimum of the alternatives)
insn_default_length     (which really should be called insn_max_length)

which are used in the shorten pass (which actually makes branches *longer*
in the normal case).  The problem is that these are only computed for static
values that are set in the "length" attribute in the machine description,
while we actually want to set "length" based on some other attributes
("num_insns", "prefixed", "max_prefixed_insns"), either directly or via
adjust_insn_length.

The shorten pass (which lenghtens branches, unless -O0) does not notice
most insns that can become longer, and eventually we ICE because of an
out-of-range branch.

Hrm, maybe if I force insn_variable_length_p to "true".  Let me try that.


Segher
diff mbox series

Patch

Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 277017)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -20973,14 +20973,32 @@  rs6000_insn_cost (rtx_insn *insn, bool s
   if (recog_memoized (insn) < 0)
     return 0;
 
+  /* If we are optimizing for size, just use the length.  */
   if (!speed)
     return get_attr_length (insn);
 
+  /* Use the cost if provided.  */
   int cost = get_attr_cost (insn);
   if (cost > 0)
     return cost;
 
-  int n = get_attr_length (insn) / 4;
+  /* If the insn tells us how many insns there are, use that.  Otherwise use
+     the length/4.  Adjust the insn length to remove the extra size that
+     prefixed instructions take.  */
+  int n = get_attr_num_insns (insn);
+  if (n == 0)
+    {
+      int length = get_attr_length (insn);
+      if (get_attr_prefixed (insn) == PREFIXED_YES)
+	{
+	  int adjust = 0;
+	  ADJUST_INSN_LENGTH (insn, adjust);
+	  length -= adjust;
+	}
+
+      n = length / 4;
+    }
+
   enum attr_type type = get_attr_type (insn);
 
   switch (type)
Index: gcc/config/rs6000/rs6000.h
===================================================================
--- gcc/config/rs6000/rs6000.h	(revision 277017)
+++ gcc/config/rs6000/rs6000.h	(working copy)
@@ -1847,9 +1847,30 @@  extern scalar_int_mode rs6000_pmode;
 /* Adjust the length of an INSN.  LENGTH is the currently-computed length and
    should be adjusted to reflect any required changes.  This macro is used when
    there is some systematic length adjustment required that would be difficult
-   to express in the length attribute.  */
+   to express in the length attribute.
 
-/* #define ADJUST_INSN_LENGTH(X,LENGTH) */
+   In the PowerPC, we use this to adjust the length of an instruction if one or
+   more prefixed instructions are generated, using the attribute
+   num_prefixed_insns.  A prefixed instruction is 8 bytes instead of 4, but the
+   hardware requires that a prefied instruciton not cross a 64-byte boundary.
+   This means the compiler has to assume the length of the first prefixed
+   instruction is 12 bytes instead of 8 bytes.  Since the length is already set
+   for the non-prefixed instruction, we just need to udpate for the
+   difference.  */
+
+#define ADJUST_INSN_LENGTH(INSN,LENGTH)					\
+{									\
+  if (NONJUMP_INSN_P (INSN))						\
+    {									\
+      rtx pattern = PATTERN (INSN);					\
+      if (GET_CODE (pattern) != USE && GET_CODE (pattern) != CLOBBER	\
+	  && get_attr_prefixed (INSN) == PREFIXED_YES)			\
+	{								\
+	  int num_prefixed = get_attr_max_prefixed_insns (INSN);	\
+	  (LENGTH) += 4 * (num_prefixed + 1);				\
+	}								\
+    }									\
+}
 
 /* Given a comparison code (EQ, NE, etc.) and the first operand of a
    COMPARE, return the mode to be used for the comparison.  For
Index: gcc/config/rs6000/rs6000.md
===================================================================
--- gcc/config/rs6000/rs6000.md	(revision 277017)
+++ gcc/config/rs6000/rs6000.md	(working copy)
@@ -285,20 +285,24 @@  (define_attr "prefixed" "no,yes"
 
 	(const_string "no")))
 
-;; Length in bytes of instructions that use prefixed addressing and length in
-;; bytes of instructions that does not use prefixed addressing.  This allows
-;; both lengths to be defined as constants, and the length attribute can pick
-;; the size as appropriate.
-(define_attr "prefixed_length" "" (const_int 12))
-(define_attr "non_prefixed_length" "" (const_int 4))
-
-;; Length of the instruction (in bytes).  Prefixed insns are 8 bytes, but the
-;; assembler might issue need to issue a NOP so that the prefixed instruction
-;; does not cross a cache boundary, which makes them possibly 12 bytes.
-(define_attr "length" ""
-  (if_then_else (eq_attr "prefixed" "yes")
-		(attr "prefixed_length")
-		(attr "non_prefixed_length")))
+;; Return the number of real hardware instructions in a combined insn.  If it
+;; is 0, just use the length / 4.
+(define_attr "num_insns" "" (const_int 0))
+
+;; If an insn is prefixed, return the maximum number of prefixed instructions
+;; in the insn.  The macro ADJUST_INSN_LENGTH uses this number to adjust the
+;; insn length.
+(define_attr "max_prefixed_insns" "" (const_int 1))
+
+;; Length of the instruction (in bytes).  This length does not consider the
+;; length for prefixed instructions.  The macro ADJUST_INSN_LENGTH will adjust
+;; the length if there are prefixed instructions.
+;;
+;; While it might be tempting to use num_insns to calculate the length, it can
+;; be problematical unless all insn lengths are adjusted to use num_insns
+;; (i.e. if num_insns is 0, it will get the length, which in turn will get
+;; num_insns and recurse).
+(define_attr "length" "" (const_int 4))
 
 ;; Processor type -- this attribute must exactly match the processor_type
 ;; enumeration in rs6000-opts.h.
@@ -7775,7 +7779,9 @@  (define_insn_and_split "*mov<mode>_64bit
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
   [(set_attr "length" "8")
-   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")])
+   (set_attr "isa" "*,*,*,*,*,*,*,*,p8v,p8v")
+   (set_attr "max_prefixed_insns" "2")
+   (set_attr "num_insns" "2")])
 
 (define_insn_and_split "*movtd_64bit_nodm"
   [(set (match_operand:TD 0 "nonimmediate_operand" "=m,d,d,Y,r,r")
@@ -7787,7 +7793,9 @@  (define_insn_and_split "*movtd_64bit_nod
   "&& reload_completed"
   [(pc)]
 { rs6000_split_multireg_move (operands[0], operands[1]); DONE; }
-  [(set_attr "length" "8,8,8,12,12,8")])
+  [(set_attr "length" "8,8,8,12,12,8")
+   (set_attr "max_prefixed_insns" "2")
+   (set_attr "num_insns" "2,2,2,3,3,2")])
 
 (define_insn_and_split "*mov<mode>_32bit"
   [(set (match_operand:FMOVE128_FPR 0 "nonimmediate_operand" "=m,d,d,d,Y,r,r")
@@ -8984,7 +8992,8 @@  (define_insn "*mov<mode>_ppc64"
   return rs6000_output_move_128bit (operands);
 }
   [(set_attr "type" "store,store,load,load,*,*")
-   (set_attr "length" "8")])
+   (set_attr "length" "8")
+   (set_attr "max_prefixed_insns" "2")])
 
 (define_split
   [(set (match_operand:TI2 0 "int_reg_operand")
Index: gcc/config/rs6000/vsx.md
===================================================================
--- gcc/config/rs6000/vsx.md	(revision 277017)
+++ gcc/config/rs6000/vsx.md	(working copy)
@@ -1149,6 +1149,14 @@  (define_insn "vsx_mov<mode>_64bit"
                "vecstore,  vecload,   vecsimple, mffgpr,    mftgpr,    load,
                 store,     load,      store,     *,         vecsimple, vecsimple,
                 vecsimple, *,         *,         vecstore,  vecload")
+   (set_attr "num_insns"
+               "*,         *,         *,         2,         *,         2,
+                2,         2,         2,         2,         *,         *,
+                *,         5,         2,         *,         *")
+   (set_attr "max_prefixed_insns"
+               "*,         *,         *,         *,         *,         2,
+                2,         2,         2,         2,         *,         *,
+                *,         *,         *,         *,         *")
    (set_attr "length"
                "*,         *,         *,         8,         *,         8,
                 8,         8,         8,         8,         *,         *,