[PATCH v3, rs6000] Replace X-form addressing with D-form addressing in new pass for Power9

Message ID 35585daa-8e45-1eae-b1a4-48a7da6f6011@linux.ibm.com
State New

Commit Message

Kelvin Nilsen Oct. 9, 2019, 8:28 p.m. UTC
This patch is a refinement of a patch first submitted to this list on Nov. 10, 2018, with revisions submitted to this list on Dec. 13, 2018 and Sep. 3, 2019.

This new pass scans existing rtl expressions and replaces them with rtl expressions that favor selection of the D-form instructions in contexts for which the D-form instructions are preferred.  The new pass runs after the RTL loop optimizations since loop unrolling often introduces opportunities for beneficial replacements of X-form addressing instructions.

For each of the new tests, multiple X-form instructions are replaced with D-form instructions, some addi instructions are replaced with add instructions, and some addi instructions are eliminated.  The typical improvement for the included tests is a decrease of 4.28% to 12.12% in the number of instructions executed on each iteration of the loop.  The optimization has not shown measurable improvement on specmark tests, presumably because the typical loops that are benefited by this optimization are memory bounded and this optimization does not eliminate memory loads or stores.  However, it is anticipated that multi-threaded workloads and measurements of total power and cooling costs for heavy server workloads would benefit.

This version 3 patch responds to feedback and numerous suggestions by Segher:

1. Fixed multiple typos.

2. Improved comments and added discussion of computational complexity.

3. Added a field to the indexing_web_entry class, allowing constant-time test for dominance of instructions within a common basic block.

4. Improved implementation of the equivalence hash function.

5. Refactored the code to divide into smaller functions and provide more descriptive commentary.

6. Improved indentation.

7. Corrected definition of max_16bit_signed value.

8. Added To-do comment in rs6000_target_supports_dform_offset_p, to alert maintainers that adding support for future hardware architectures will require code to be added to this function.

9. Simplified the dg directives in the new test cases. 
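
Items 7 and 8 concern the range of offsets that a D-form instruction can encode. As a rough illustration only (this is not the code in rs6000_target_supports_dform_offset_p), the sketch below assumes a signed 16-bit displacement plus an instruction-specific alignment requirement: 1 for plain D-form, 4 for DS-form instructions such as ld and std, and 16 for DQ-form instructions such as lxv. The function name, macro names, and parameters are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical helper, not taken from the patch.  */
#define MAX_16BIT_SIGNED 32767L
#define MIN_16BIT_SIGNED (-32768L)

/* Return true if OFFSET fits in a signed 16-bit displacement and is a
   multiple of REQUIRED_MULTIPLE (which must be positive: 1 for D-form,
   4 for DS-form, 16 for DQ-form).  */
bool
offset_fits_dform (long offset, long required_multiple)
{
  if (offset < MIN_16BIT_SIGNED || offset > MAX_16BIT_SIGNED)
    return false;
  return offset % required_multiple == 0;
}
```

A real implementation would also have to consider the machine mode and the target CPU, which is why the To-do comment in item 8 matters.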

I have built and regression tested this patch on powerpc64le-unknown-linux target with no regressions.

Is this ok for trunk?

gcc/ChangeLog:

2019-10-09  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* config/rs6000/rs6000-p9dform.c: New file.
	* config/rs6000/rs6000-passes.def: Add pass_insert_dform.
	* config/rs6000/rs6000-protos.h
	(rs6000_target_supports_dform_offset_p): New function prototype.
	(make_pass_insert_dform): Likewise.
	* config/rs6000/rs6000.c (rs6000_target_supports_dform_offset_p):
	New function.
	* config/rs6000/t-rs6000 (rs6000-p9dform.o): New build target.
	* config.gcc: Add rs6000-p9dform.o object file.

gcc/testsuite/ChangeLog:

2019-10-09  Kelvin Nilsen  <kelvin@gcc.gnu.org>

	* gcc.target/powerpc/p9-dform-0.c: New test.
	* gcc.target/powerpc/p9-dform-1.c: New test.
	* gcc.target/powerpc/p9-dform-10.c: New test.
	* gcc.target/powerpc/p9-dform-11.c: New test.
	* gcc.target/powerpc/p9-dform-12.c: New test.
	* gcc.target/powerpc/p9-dform-13.c: New test.
	* gcc.target/powerpc/p9-dform-14.c: New test.
	* gcc.target/powerpc/p9-dform-15.c: New test.
	* gcc.target/powerpc/p9-dform-2.c: New test.
	* gcc.target/powerpc/p9-dform-3.c: New test.
	* gcc.target/powerpc/p9-dform-4.c: New test.
	* gcc.target/powerpc/p9-dform-5.c: New test.
	* gcc.target/powerpc/p9-dform-6.c: New test.
	* gcc.target/powerpc/p9-dform-7.c: New test.
	* gcc.target/powerpc/p9-dform-8.c: New test.
	* gcc.target/powerpc/p9-dform-9.c: New test.
	* gcc.target/powerpc/p9-dform-generic.h: New test.

Comments

Segher Boessenkool Oct. 17, 2019, 10:57 p.m. UTC | #1
Hi Kelvin,

On Wed, Oct 09, 2019 at 03:28:45PM -0500, Kelvin Nilsen wrote:
> This new pass scans existing rtl expressions and replaces them with rtl expressions that favor selection of the D-form instructions in contexts for which the D-form instructions are preferred.  The new pass runs after the RTL loop optimizations since loop unrolling often introduces opportunities for beneficial replacements of X-form addressing instructions.
> 
> For each of the new tests, multiple X-form instructions are replaced with D-form instructions, some addi instructions are replaced with add instructions, and some addi instructions are eliminated.  The typical improvement for the included tests is a decrease of 4.28% to 12.12% in the number of instructions executed on each iteration of the loop.  The optimization has not shown measurable improvement on specmark tests, presumably because the typical loops that are benefited by this optimization are memory bounded and this optimization does not eliminate memory loads or stores.  However, it is anticipated that multi-threaded workloads and measurements of total power and cooling costs for heavy server workloads would benefit.

My first question is, why did ivopts choose the suboptimal solution?
_Did_ it, or did something later mess things up?

This new pass can help us investigate that.  It certainly sounds like we
could do better earlier already.

I think it is a good design to make fixes late in the pass pipeline, *but*
we should try to make good choices earlier, too -- the "late tweaks" should
be just that, tweaks; 4%-12% is a bit much.

(It's not that super late here; but still, why does it help so much?)

> 2. Improved comments and added discussion of computational complexity.

It's really not good at all to have anything that is quadratic in the size
of the program, or the size of a function, or the size of a basic block:
there always show up real programs which then take approximately infinitely
long to compile.

If there are good arguments why some parameter can not be bigger than 100
in reality, or maybe even 1000, it is different of course; but things like
number of functions, number of basic blocks, or number of instructions (per
function or bb or loop) are not naturally limited.
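
To illustrate the concern in general terms: grouping memory accesses by comparing every pair is quadratic in the number of accesses, whereas sorting (or hashing) on the grouping key is not. The sketch below is a generic illustration and is not the algorithm used in the patch; the struct, its field names, and the function names are hypothetical.

```c
#include <stdlib.h>

/* Hypothetical record: one memory access of the form
   base + index + offset.  Not taken from the patch.  */
struct access
{
  unsigned base_id;
  unsigned index_id;
  long offset;
};

/* Order by (base_id, index_id, offset).  */
static int
compare_access (const void *a, const void *b)
{
  const struct access *x = a, *y = b;
  if (x->base_id != y->base_id)
    return x->base_id < y->base_id ? -1 : 1;
  if (x->index_id != y->index_id)
    return x->index_id < y->index_id ? -1 : 1;
  return (x->offset > y->offset) - (x->offset < y->offset);
}

/* Sort ACCESSES so that accesses sharing a base and index are adjacent,
   then count the groups in one linear scan.  Returns the group count.  */
size_t
count_access_groups (struct access *accesses, size_t n)
{
  size_t groups = 0;
  qsort (accesses, n, sizeof *accesses, compare_access);
  for (size_t i = 0; i < n; i++)
    if (i == 0
        || accesses[i].base_id != accesses[i - 1].base_id
        || accesses[i].index_id != accesses[i - 1].index_id)
      groups++;
  return groups;
}
```

The cost here is dominated by the sort rather than by pairwise comparison of all accesses.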

> 5. Refactored the code to divide into smaller functions and provide more descriptive commentary.

Many thanks for this :-)

> +   This pass replaces the above-matched sequences with:
> +
> +   Ai: derived_pointer = array_base + offset
> +       *(derived_pointer)
> +
> +   Aij: leave these alone.  expect that subsequent optimization deletes
> +        this code as it may become dead (since we don't use the
> +        indexing expression following our code transformations.)
> +
> +   Ai:
> +   *(derived_pointer + constant_i)
> +     (where constant_i equals sum of constant (n,j) for all n from 1
> +      to i paired with all j from 1 to Kn,

So if I understand this correctly, if the code is

  x0 = [base+8]
  x1 = [base]
  x2 = [base+16]

this pass will change it to

  p = base+8
  x0 = [p]
  x1 = [p-8]
  x2 = [p+8]

Should it always pick the first access as the new base pointer?  Should it
use the lowest offset instead?

(Maybe the code does something more advanced than picking the first; not
clear from this comment though).
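
The two choices in question can be made concrete with a small sketch. It only models the offset arithmetic and assumes nothing about how the pass itself selects a base; both function names are hypothetical.

```c
#include <stddef.h>

/* Rewrite OFFSETS[0..n-1] to be relative to OFFSETS[base_pos] and
   return the original offset that was chosen as the new base.  */
long
rebase_offsets (long *offsets, size_t n, size_t base_pos)
{
  long base = offsets[base_pos];
  for (size_t i = 0; i < n; i++)
    offsets[i] -= base;
  return base;
}

/* Position of the smallest offset, i.e. the "lowest offset"
   alternative.  N must be at least 1.  */
size_t
lowest_offset_pos (const long *offsets, size_t n)
{
  size_t best = 0;
  for (size_t i = 1; i < n; i++)
    if (offsets[i] < offsets[best])
      best = i;
  return best;
}
```

With the offsets 8, 0, 16 from the example, picking the first access as the base gives relative offsets 0, -8, 8, while picking the lowest offset gives 8, 0, 16 with no negative displacements.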

> +class indexing_web_entry: public web_entry_base
> +{
> + public:
> +  rtx_insn *insn;		/* Pointer to the insn */
> +  basic_block bb;		/* Pointer to the enclosing basic block */

The rest of the fields have the comment before the declaration.  I would
just lose the comments here though: if something called "insn" is not an
insn, or something called "bb" is not a block, ... :-)

> +  /* A unique sequence number is assigned to each instruction for the
> +     purpose of simplifying domination tests.  Within each basic
> +     block, sequence numbers areassigned in strictly increasing order.
> +     Thus, for any two instructions known to reside in the same basic
> +     block, the instruction with a lower insn_sequence_no is kknown
> +     to dominate the instruction with a higher insn_sequence_no.  */
> +  unsigned int insn_sequence_no;

Many existing passes call this "luid" (for "local unique id").

(Typos: "are assigned", "known").
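
For readers unfamiliar with the idiom, the constant-time test that such a number enables looks roughly like the sketch below. The struct and function names are hypothetical and do not come from the patch or from existing passes.

```c
#include <stdbool.h>

/* Hypothetical per-insn record.  */
struct insn_info
{
  int bb_index;    /* number of the enclosing basic block */
  unsigned luid;   /* strictly increasing within one basic block */
};

/* Return true if A and B are in the same basic block and A comes
   before B.  For insns in different blocks this returns false; a
   dominator-tree query would be needed to decide that case.  */
bool
dominates_in_same_bb (const struct insn_info *a, const struct insn_info *b)
{
  return a->bb_index == b->bb_index && a->luid < b->luid;
}
```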

> +  /* If this insn is relevant, it is a load or store with a memory
> +     address that is comprised of a base pointer (e.g. the address of
> +     an array or array slice) and an index expression (e.g. an index
> +     within the array).  The original_base_use and original_index_use
> +     fields represent the numbers of the instructions that define the
> +     base and index values which are summed together with a constant
> +     value to determine the value of this instruction's memory
> +     address.  */
> +  unsigned int original_base_use;
> +  unsigned int original_index_use;

I wonder how you determine what is base and what is index?

(I'll review the rest later).


Segher
Kelvin Nilsen Oct. 22, 2019, 8:06 p.m. UTC | #2
On 10/17/19 5:57 PM, Segher Boessenkool wrote:
> Hi Kelvin,
> 
> On Wed, Oct 09, 2019 at 03:28:45PM -0500, Kelvin Nilsen wrote:
>> This new pass scans existing rtl expressions and replaces them with rtl expressions that favor selection of the D-form instructions in contexts for which the D-form instructions are preferred.  The new pass runs after the RTL loop optimizations since loop unrolling often introduces opportunities for beneficial replacements of X-form addressing instructions.
>>
>> For each of the new tests, multiple X-form instructions are replaced with D-form instructions, some addi instructions are replaced with add instructions, and some addi instructions are eliminated.  The typical improvement for the included tests is a decrease of 4.28% to 12.12% in the number of instructions executed on each iteration of the loop.  The optimization has not shown measurable improvement on specmark tests, presumably because the typical loops that are benefited by this optimization are memory bounded and this optimization does not eliminate memory loads or stores.  However, it is anticipated that multi-threaded workloads and measurements of total power and cooling costs for heavy server workloads would benefit.
> 
> My first question is, why did ivopts choose the suboptimal solution?
> _Did_ it, or did something later mess things up?
> 
> This new pass can help us investigate that.  It certainly sounds like we
> could do better earlier already.
> 
> I think it is a good design to make fixes late in the pass pipeline, *but*
> we should try to make good choices earlier, too -- the "late tweaks" should
> be just that, tweaks; 4%-12% is a bit much.
> 
> (It's not that super late here; but still, why does it help so much?)
> 

Thanks, Segher, for looking over my draft patch and providing your comments. When I first began work on this reported performance problem, I did look at the earlier passes in hopes of identifying a better place to address the poor instruction selection.

It is difficult to know exactly where we want to accomplish the improved code generation.  Some of the "earlier" candidate passes are disadvantaged because they are "blind" to instruction costs and do not even have an awareness of which addressing modes are supported by which instructions.

Below, I'm providing some of the earlier pass information for one of the sample programs that motivates this patch.  Please feel free to comment.  I welcome suggestions as to alternative ways to attack this.

Thanks.

--------------------------------------------
 
Consider the following program:

extern void first_dummy ();
extern void dummy (double sacc, int n);
extern void other_dummy ();

extern float opt_value;
extern char *opt_desc;

#define M 128
#define N 512

double x [N];
double y [N];

int main (int argc, char *argv []) {
  double sacc;

  first_dummy ();
  for (int j = 0; j < M; j++) {
    sacc = 0.00;
    for (unsigned long long int i = 0; i < N; i++)
      sacc += x[i] * y[i];
    dummy (sacc, N);
  }
  opt_value = ((float) N) * 2 * ((float) M);
  opt_desc = "flops";
  other_dummy ();
}


Compile this with the following command-line options on a Power target:

xgcc p9-dform-0.c -da -m64 -fdump-tree-all -fno-diagnostics-show-caret \
  -fno-diagnostics-show-line-numbers -fdiagnostics-color=never -O3 \
  -mcpu=power9 -mtune=power9 -funroll-loops -ffat-lto-objects -fno-ident


*********************************
* Auto-vectorization transforms this program into approximately the
* following C code
*********************************

int main (int argc, char *argv []) {
  double sacc;
  vector double x_values, y_values, xy_product;
  vector double *vectp_x, *vectp_y;

  first_dummy ();
  for (int j = 0; j < M; j++) {
    sacc = 0.00;
    vectp_x = x;
    vectp_y = y;
    for (unsigned int ivtmp_31 = 0; ivtmp_31 != N / 2; ivtmp_31++) {
      x_values = *vectp_x;
      y_values = *vectp_y;
      xy_product = x_values * y_values;
      sacc += xy_product[0];
      sacc += xy_product[1];
      vectp_x++;
      vectp_y++;
    }
    dummy (sacc, N);
  }
  opt_value = ((float) N) * 2 * ((float) M);
  opt_desc = "flops";
  other_dummy ();
}

*********************************
* Induction variable optimization transforms this program into approximately
* the following C code
*********************************


int main (int argc, char *argv []) {
  double sacc;
  vector double x_values, y_values, xy_product;

  first_dummy ();
  for (int j = 0; j < M; j++) {
    sacc = 0.00;
    for (unsigned int ivtmp_14 = 0; ivtmp_14 != 4096; ivtmp_14 += 16) {
      x_values = x [ivtmp_14];
      y_values = y [ivtmp_14];
      xy_product = x_values * y_values;
      sacc += xy_product[0];
      sacc += xy_product[1];
      /* Note: induction variable optimization has removed 2 pointer
       * increments of the form "vectp_x++" at the "cost" of replacing
       * two direct memory fetches of the form "*vectp_x" with indexed
       * memory fetches of the form "x[i]".  Since most popular
       * architectures support no-cost indexed load instructions, and
       * the induction-variable optimization pass does not have
       * specific information about instruction costs, it's "difficult"
       * to fault its choices here...  What might be a good induction
       * variable choice on one target may not be so good on a
       * different target.  */
    }
    dummy (sacc, N);
  }
  opt_value = ((float) N) * 2 * ((float) M);
  opt_desc = "flops";
  other_dummy ();
}

*********************************
* Loop unrolling turns this code into the following:
*********************************


int main (int argc, char *argv []) {
  double sacc;
  vector double x_values, y_values, xy_product;
  vector double *vectp_x, *vectp_y;

bb2:
  first_dummy ();
  ivtmp_28 = 128;
  sacc = 0.00;		// reg:DF 146
  vectp_y = y;		// vectp_y is reg_di_145
  vectp_x = x;		// vectp_x is reg_di_144
  reg_df_146 = 0.0D;
  goto bb5;

bb7:
bb5:			// Top of outer loop (from bb2 and bb4 via bb7)
  // prepare/initialize for the inner loop
  ivtmp_14 = 0;
  sacc = reg_df_146;
  goto bb3

bb3:			// Top of inner loop (bb23 iterates to here)
  // Unroll 1
  y_values = vectp_y [ivtmp_14];
  x_values = vectp_x [ivtmp_14];
  xy_product = x_values * y_values;
  sacc += xy_product[0];
  sacc += xy_product[1];
  ivtmp_14_base = ivtmp_14 + 16;
  ivtmp_14 = ivtmp_14_base;
  cr_135 = (ivtmp_14 >= 4096);	// apparently dead code
  goto bb8;

bb8:				// out of order
  goto bb10

bb10:			// loop body, comes from bb8
  // Unroll 2
  y_values = vectp_y [ivtmp_14];
  x_values = vectp_x [ivtmp_14];
  xy_product = x_values * y_values;
  sacc += xy_product[0];
  sacc += xy_product[1];
  ivtmp_14 = ivtmp_14_base + 16;
  cr_135 = (ivtmp_14 >= 4096);	// apparently dead code, to be removed
  goto bb11

bb11:
  goto bb12

bb12:
  // Unroll 3
  y_values = vectp_y [ivtmp_14];
  x_values = vectp_x [ivtmp_14];
  xy_product = x_values * y_values;
  sacc += xy_product[0];
  sacc += xy_product[1];
  ivtmp_14 = ivtmp_14_base + 32;
  cr_135 = (ivtmp_14 >= 4096);	// apparently dead code, to be removed
  goto bb13

bb13:
  goto bb14

bb14:
  // Unroll 4
  y_values = vectp_y [ivtmp_14];
  x_values = vectp_x [ivtmp_14];
  xy_product = x_values * y_values;
  sacc += xy_product[0];
  sacc += xy_product[1];
  ivtmp_14 = ivtmp_14_base + 48;
  cr_135 = (ivtmp_14 >= 4096);	// apparently dead code, to be removed
  goto bb15

bb15:
  goto bb16

bb16:
  // Unroll 5
  y_values = vectp_y [ivtmp_14];
  x_values = vectp_x [ivtmp_14];
  xy_product = x_values * y_values;
  sacc += xy_product[0];
  sacc += xy_product[1];
  ivtmp_14 = ivtmp_14_base + 64;
  cr_135 = (ivtmp_14 >= 4096);	// apparently dead code, to be removed
  goto bb17

bb17:
  goto bb18

bb18:
  // Unroll 6
  y_values = vectp_y [ivtmp_14];
  x_values = vectp_x [ivtmp_14];
  xy_product = x_values * y_values;
  sacc += xy_product[0];
  sacc += xy_product[1];
  ivtmp_14 = ivtmp_14_base + 80;
  cr_135 = (ivtmp_14 >= 4096);	// apparently dead code, to be removed
  goto bb19

bb19:
  goto bb20

bb20:
  // Unroll 7
  y_values = vectp_y [ivtmp_14];
  x_values = vectp_x [ivtmp_14];
  xy_product = x_values * y_values;
  sacc += xy_product[0];
  sacc += xy_product[1];
  ivtmp_14 = ivtmp_14_base + 96;
  cr_135 = (ivtmp_14 >= 4096);	// apparently dead code, to be removed
  goto bb21

bb21:
  goto bb22

bb22:
  // Unroll 8
  y_values = vectp_y [ivtmp_14];
  x_values = vectp_x [ivtmp_14];
  xy_product = x_values * y_values;
  sacc += xy_product[0];
  sacc += xy_product[1];
  ivtmp_14 = ivtmp_14_base + 112;
  if (ivtmp_14 == 4096)
    goto bb9			// exit the inner loop
  else
    goto bb23			// continue the inner loop

bb23:
  goto bb3

bb9:				// outside inner loop, bottom of outer loop
  goto bb4

bb4:
  dummy (sacc, N);
  ivtmp_28 -= 1;
  if (ivtmp_28 == 0)
    goto bb6		// break out of outer loop
  else
    goto bb7		// continue the outer loop

bb6:	 		// end this function
  opt_value = ((float) N) * 2 * ((float) M);
  opt_desc = "flops";
  other_dummy ();
  goto EXIT

}
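
For comparison, the shape that the new pass aims for can be sketched at source level as follows. This is only an analogy written in scalar C, unrolled four times rather than eight; it is not output from the pass, and the function name is hypothetical. The point is that each unrolled body reaches its operands through one derived pointer plus a constant offset instead of recomputing base + index each time.

```c
#include <stddef.h>

/* Dot product with the loop unrolled four times.  One derived pointer
   per array is computed per iteration; the individual elements are
   then reached with constant offsets.  N must be a multiple of 4.  */
double
dot_unrolled (const double *x, const double *y, size_t n)
{
  double sacc = 0.0;
  for (size_t i = 0; i < n; i += 4)
    {
      const double *px = x + i;   /* derived_pointer = array_base + offset */
      const double *py = y + i;
      sacc += px[0] * py[0];      /* *(derived_pointer + constant) */
      sacc += px[1] * py[1];
      sacc += px[2] * py[2];
      sacc += px[3] * py[3];
    }
  return sacc;
}
```

Whether a given compiler actually selects D-form loads for this source depends on the target and the optimization passes involved.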

Attachments:

******************************************
* p9-dform-0.c (original source)
******************************************

/* { dg-do compile { target { powerpc*-*-* } } } */
/* { dg-require-effective-target powerpc_p9vector_ok } */
/* { dg-skip-if "do not override -mcpu" { powerpc*-*-* } { "-mcpu=*" } { "-mcpu=power9" } } */
/* { dg-skip-if "" { powerpc*-*-aix* } } */
/* { dg-options "-O3 -mcpu=power9 -mtune=power9 -funroll-loops" } */

/* This test confirms that the dform instructions are selected in the
   translation of this main program.  */

extern void first_dummy ();
extern void dummy (double sacc, int n);
extern void other_dummy ();

extern float opt_value;
extern char *opt_desc;

#define M 128
#define N 512

double x [N];
double y [N];

int main (int argc, char *argv []) {
  double sacc;

  first_dummy ();
  for (int j = 0; j < M; j++) {

    sacc = 0.00;
    for (unsigned long long int i = 0; i < N; i++) {
      sacc += x[i] * y[i];
    }
    dummy (sacc, N);
  }
  opt_value = ((float) N) * 2 * ((float) M);
  opt_desc = "flops";
  other_dummy ();
}

/* At the time the dform optimization pass was merged with trunk, 12
   lxv instructions were emitted in place of the same number of lxvx
   instructions.  No need to require exactly this number, as it may
   change when other optimization passes evolve.  */

/* { dg-final { scan-assembler {\mlxv\M} } } */


******************************************
* p9-dform-0.c.162t.slp1 (auto-vectorization)
******************************************

;; Function main (main, funcdef_no=0, decl_uid=2861, cgraph_uid=1, symbol_order=2) (executed once)

main (int argc, char * * argv)
{
  double stmp_sacc_16.10;
  vector(2) double vect__3.9;
  vector(2) double vect__2.8;
  vector(2) double * vectp_y.7;
  vector(2) double * vectp_y.6;
  vector(2) double vect__1.5;
  vector(2) double * vectp_x.4;
  vector(2) double * vectp_x.3;
  long long unsigned int i;
  int j;
  double sacc;
  unsigned int ivtmp_28;
  long long unsigned int ivtmp_29;
  long long unsigned int ivtmp_30;
  unsigned int ivtmp_31;

  <bb 2> [local count: 108459]:
  first_dummy ();
  goto <bb 5>; [100.00%]

  <bb 8> [local count: 526133483]:

  <bb 3> [local count: 536870902]:
  # sacc_24 = PHI <sacc_16(8), 0.0(5)>
  # vectp_x.3_23 = PHI <vectp_x.3_22(8), &x(5)>
  # vectp_y.6_20 = PHI <vectp_y.6_19(8), &y(5)>
  # ivtmp_30 = PHI <ivtmp_29(8), 0(5)>
  vect__1.5_21 = MEM <vector(2) double> [(double *)vectp_x.3_23];
  vect__2.8_18 = MEM <vector(2) double> [(double *)vectp_y.6_20];
  vect__3.9_10 = vect__1.5_21 * vect__2.8_18;
  stmp_sacc_16.10_7 = BIT_FIELD_REF <vect__3.9_10, 64, 0>;
  stmp_sacc_16.10_6 = sacc_24 + stmp_sacc_16.10_7;
  stmp_sacc_16.10_5 = BIT_FIELD_REF <vect__3.9_10, 64, 64>;
  sacc_16 = stmp_sacc_16.10_6 + stmp_sacc_16.10_5;
  vectp_x.3_22 = vectp_x.3_23 + 16;
  vectp_y.6_19 = vectp_y.6_20 + 16;
  ivtmp_29 = ivtmp_30 + 1;
  if (ivtmp_29 < 256)
    goto <bb 8>; [98.00%]
  else
    goto <bb 4>; [2.00%]

  <bb 4> [local count: 10737418]:
  # sacc_34 = PHI <sacc_16(3)>
  dummy (sacc_34, 512);
  ivtmp_28 = ivtmp_31 - 1;
  if (ivtmp_28 != 0)
    goto <bb 7>; [98.99%]
  else
    goto <bb 6>; [1.01%]

  <bb 7> [local count: 10628959]:

  <bb 5> [local count: 10737418]:
  # ivtmp_31 = PHI <128(2), ivtmp_28(7)>
  goto <bb 3>; [100.00%]

  <bb 6> [local count: 108459]:
  opt_value = 1.31072e+5;
  opt_desc = "flops";
  other_dummy ();
  return 0;

}


******************************************
* p9-dform-0.c.164t.ivopts (induction variable optimizations)
******************************************

;; Function main (main, funcdef_no=0, decl_uid=2861, cgraph_uid=1, symbol_order=2) (executed once)

main (int argc, char * * argv)
{
  sizetype ivtmp.14;
  double stmp_sacc_16.10;
  vector(2) double vect__3.9;
  vector(2) double vect__2.8;
  vector(2) double * vectp_y.7;
  vector(2) double * vectp_y.6;
  vector(2) double vect__1.5;
  vector(2) double * vectp_x.4;
  vector(2) double * vectp_x.3;
  long long unsigned int i;
  int j;
  double sacc;
  unsigned int ivtmp_28;
  unsigned int ivtmp_31;

  <bb 2> [local count: 108459]:
  first_dummy ();
  goto <bb 5>; [100.00%]

  <bb 8> [local count: 526133483]:

  <bb 3> [local count: 536870902]:
  # sacc_24 = PHI <sacc_16(8), 0.0(5)>
  # ivtmp.14_25 = PHI <ivtmp.14_33(8), 0(5)>
  vect__1.5_21 = MEM[symbol: x, index: ivtmp.14_25, offset: 0B];
  vect__2.8_18 = MEM[symbol: y, index: ivtmp.14_25, offset: 0B];
  vect__3.9_10 = vect__1.5_21 * vect__2.8_18;
  stmp_sacc_16.10_7 = BIT_FIELD_REF <vect__3.9_10, 64, 0>;
  stmp_sacc_16.10_6 = sacc_24 + stmp_sacc_16.10_7;
  stmp_sacc_16.10_5 = BIT_FIELD_REF <vect__3.9_10, 64, 64>;
  sacc_16 = stmp_sacc_16.10_6 + stmp_sacc_16.10_5;
  ivtmp.14_33 = ivtmp.14_25 + 16;
  if (ivtmp.14_33 != 4096)
    goto <bb 8>; [98.00%]
  else
    goto <bb 4>; [2.00%]

  <bb 4> [local count: 10737418]:
  # sacc_34 = PHI <sacc_16(3)>
  dummy (sacc_34, 512);
  ivtmp_28 = ivtmp_31 - 1;
  if (ivtmp_28 != 0)
    goto <bb 7>; [98.99%]
  else
    goto <bb 6>; [1.01%]

  <bb 7> [local count: 10628959]:

  <bb 5> [local count: 10737418]:
  # ivtmp_31 = PHI <128(2), ivtmp_28(7)>
  goto <bb 3>; [100.00%]

  <bb 6> [local count: 108459]:
  opt_value = 1.31072e+5;
  opt_desc = "flops";
  other_dummy ();
  return 0;

}

******************************************
* p9-dform-0.c.253r.loop2_unroll (after vectorized inner loop unrolled 8 times)
******************************************


;; Function main (main, funcdef_no=0, decl_uid=2861, cgraph_uid=1, symbol_order=2) (executed once)



main

Dataflow summary:
;;  invalidated by call 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr]
;;  hardware regs used 	 1 [1] 2 [2] 99 [ap] 109 [vscr] 110 [sfp]
;;  regular block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  eh block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  entry block defs 	 1 [1] 2 [2] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 31 [31] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 96 [lr] 99 [ap] 109 [vscr] 110 [sfp]
;;  exit block uses 	 1 [1] 2 [2] 3 [3] 31 [31] 108 [vrsave] 109 [vscr] 110 [sfp]
;;  regs ever live 	 1 [1] 2 [2] 3 [3] 4 [4] 33 [1] 96 [lr] 109 [vscr]
;;  ref usage 	r0={3d} r1={1d,11u} r2={1d,17u} r3={5d,2u} r4={5d,1u} r5={4d} r6={4d} r7={4d} r8={4d} r9={4d} r10={4d} r11={3d} r12={3d} r13={3d} r31={1d,8u} r32={3d} r33={5d,1u} r34={4d} r35={4d} r36={4d} r37={4d} r38={4d} r39={4d} r40={4d} r41={4d} r42={4d} r43={4d} r44={4d} r45={4d} r64={3d} r65={3d} r66={4d} r67={4d} r68={4d} r69={4d} r70={4d} r71={4d} r72={4d} r73={4d} r74={4d} r75={4d} r76={4d} r77={4d} r78={3d} r79={3d} r80={3d} r81={3d} r82={3d} r83={3d} r96={4d} r97={3d} r98={3d} r99={1d,7u} r100={3d} r101={3d} r105={3d} r106={3d} r107={3d} r108={1u} r109={4d,4u} r110={1d,8u} r119={1d,1u} r120={1d,1u} r121={1d,2u} r122={2d,2u} r125={2d,4u,2e} r126={2d,2u} r131={1d,1u} r133={1d,1u} r134={1d,1u} r135={1d,1u} r136={1d,1u} r137={1d,1u} r138={1d,1u} r140={1d,1u} r141={1d,1u} r142={1d,1u} r144={1d,1u} r145={1d,1u} r146={1d,1u} 
;;    total ref usage 317{230d,85u,2e} in 33{30 regular + 3 call} insns.

( )->[0]->( 2 )
;; bb 0 artificial_defs: { d-1(1){ }d-1(2){ }d-1(3){ }d-1(4){ }d-1(5){ }d-1(6){ }d-1(7){ }d-1(8){ }d-1(9){ }d-1(10){ }d-1(31){ }d-1(33){ }d-1(34){ }d-1(35){ }d-1(36){ }d-1(37){ }d-1(38){ }d-1(39){ }d-1(40){ }d-1(41){ }d-1(42){ }d-1(43){ }d-1(44){ }d-1(45){ }d-1(66){ }d-1(67){ }d-1(68){ }d-1(69){ }d-1(70){ }d-1(71){ }d-1(72){ }d-1(73){ }d-1(74){ }d-1(75){ }d-1(76){ }d-1(77){ }d-1(96){ }d-1(99){ }d-1(109){ }d-1(110){ }}
;; bb 0 artificial_uses: { }
;; lr  in  	 108 [vrsave]
;; lr  use 	
;; lr  def 	 1 [1] 2 [2] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 31 [31] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 96 [lr] 99 [ap] 109 [vscr] 110 [sfp]
;; live  in  	
;; live  gen 	 1 [1] 2 [2] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 31 [31] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 96 [lr] 99 [ap] 109 [vscr] 110 [sfp]
;; live  kill	
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp]
;; live  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]

( 0 )->[2]->( 5 )
;; bb 2 artificial_defs: { }
;; bb 2 artificial_uses: { u0(1){ }u1(2){ }u2(31){ }u3(99){ }u4(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp]
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]
;; lr  def 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr] 126 144 145
;; live  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]
;; live  gen 	 109 [vscr] 126 144 145
;; live  kill	 96 [lr]
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp] 126 144 145
;; live  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145

( 5 8 )->[3]->( 8 4 )
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u8(1){ }u9(2){ }u10(31){ }u11(99){ }u12(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp] 122 125 144 145
;; lr  def 	 119 120 121 122 125 131 133 134 135
;; live  in  	 109 [vscr] 122 125 126
;; live  gen 	 119 120 121 122 125 131 133 134 135
;; live  kill	
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145
;; live  out 	 109 [vscr] 122 125 126

( 3 )->[8]->( 3 )
;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	
;; live  in  	 109 [vscr] 122 125 126
;; live  gen 	
;; live  kill	
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145
;; live  out 	 109 [vscr] 122 125 126

( 3 )->[4]->( 7 6 )
;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u30(1){ }u31(2){ }u32(31){ }u33(99){ }u34(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 126 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 126
;; lr  def 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr] 126 136 137
;; live  in  	 109 [vscr] 122 126
;; live  gen 	 4 [4] 33 [1] 109 [vscr] 126 136 137
;; live  kill	 96 [lr]
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145
;; live  out 	 109 [vscr] 126

( 4 )->[7]->( 5 )
;; bb 7 artificial_defs: { }
;; bb 7 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	
;; live  in  	 109 [vscr] 126
;; live  gen 	
;; live  kill	
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145
;; live  out 	 109 [vscr] 126

( 2 7 )->[5]->( 3 )
;; bb 5 artificial_defs: { }
;; bb 5 artificial_uses: { u45(1){ }u46(2){ }u47(31){ }u48(99){ }u49(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	 122 125
;; live  in  	 109 [vscr] 126
;; live  gen 	 122 125
;; live  kill	
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145
;; live  out 	 109 [vscr] 122 125 126

( 4 )->[6]->( 1 )
;; bb 6 artificial_defs: { }
;; bb 6 artificial_uses: { u50(1){ }u51(2){ }u52(31){ }u53(99){ }u54(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp]
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]
;; lr  def 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr] 138 140 141 142
;; live  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]
;; live  gen 	 3 [3] 109 [vscr] 138 140 141 142
;; live  kill	 96 [lr]
;; lr  out 	 1 [1] 2 [2] 3 [3] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp]
;; live  out 	 1 [1] 2 [2] 3 [3] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]

( 6 )->[1]->( )
;; bb 1 artificial_defs: { }
;; bb 1 artificial_uses: { u68(1){ }u69(2){ }u70(3){ }u71(31){ }u72(108){ }u73(109){ }u74(110){ }}
;; lr  in  	 1 [1] 2 [2] 3 [3] 31 [31] 108 [vrsave] 109 [vscr] 110 [sfp]
;; lr  use 	 1 [1] 2 [2] 3 [3] 31 [31] 108 [vrsave] 109 [vscr] 110 [sfp]
;; lr  def 	
;; live  in  	 1 [1] 2 [2] 3 [3] 31 [31] 109 [vscr] 110 [sfp]
;; live  gen 	
;; live  kill	
;; lr  out 	
;; live  out 	

starting the processing of deferred insns
ending the processing of deferred insns
setting blocks to analyze 3, 8
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 3 ( 0.33)
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 2 ( 0.22)
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 3 ( 0.33)


starting region dump


main

Dataflow summary:
def_info->table_size = 9, use_info->table_size = 76
;;  invalidated by call 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr]
;;  hardware regs used 	 1 [1] 2 [2] 99 [ap] 109 [vscr] 110 [sfp]
;;  regular block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  eh block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  entry block defs 	 1 [1] 2 [2] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 31 [31] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 96 [lr] 99 [ap] 109 [vscr] 110 [sfp]
;;  exit block uses 	 1 [1] 2 [2] 3 [3] 31 [31] 108 [vrsave] 109 [vscr] 110 [sfp]
;;  regs ever live 	 1 [1] 2 [2] 3 [3] 4 [4] 33 [1] 96 [lr] 109 [vscr]
;;  ref usage 	r0={3d} r1={1d,11u} r2={1d,17u} r3={5d,2u} r4={5d,1u} r5={4d} r6={4d} r7={4d} r8={4d} r9={4d} r10={4d} r11={3d} r12={3d} r13={3d} r31={1d,8u} r32={3d} r33={5d,1u} r34={4d} r35={4d} r36={4d} r37={4d} r38={4d} r39={4d} r40={4d} r41={4d} r42={4d} r43={4d} r44={4d} r45={4d} r64={3d} r65={3d} r66={4d} r67={4d} r68={4d} r69={4d} r70={4d} r71={4d} r72={4d} r73={4d} r74={4d} r75={4d} r76={4d} r77={4d} r78={3d} r79={3d} r80={3d} r81={3d} r82={3d} r83={3d} r96={4d} r97={3d} r98={3d} r99={1d,7u} r100={3d} r101={3d} r105={3d} r106={3d} r107={3d} r108={1u} r109={4d,4u} r110={1d,8u} r119={1d,1u} r120={1d,1u} r121={1d,2u} r122={2d,2u} r125={2d,4u,2e} r126={2d,2u} r131={1d,1u} r133={1d,1u} r134={1d,1u} r135={1d,1u} r136={1d,1u} r137={1d,1u} r138={1d,1u} r140={1d,1u} r141={1d,1u} r142={1d,1u} r144={1d,1u} r145={1d,1u} r146={1d,1u} 
;;    total ref usage 317{230d,85u,2e} in 33{30 regular + 3 call} insns.
;; Reaching defs:
;;  sparse invalidated 	
;;  dense invalidated 	
;;  reg->defs[] map:	119[0,0] 120[1,1] 121[2,2] 122[3,3] 125[4,4] 131[5,5] 133[6,6] 134[7,7] 135[8,8] 
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u8(1){ }u9(2){ }u10(31){ }u11(99){ }u12(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp] 122 125 144 145
;; lr  def 	 119 120 121 122 125 131 133 134 135
;; live  in  	 122 125
;; live  gen 	 119 120 121 122 125 131 133 134 135
;; live  kill	
;; rd  in  	(2) 122[3],125[4]
;; rd  gen 	(9) 119[0],120[1],121[2],122[3],125[4],131[5],133[6],134[7],135[8]
;; rd  kill	(9) 119[0],120[1],121[2],122[3],125[4],131[5],133[6],134[7],135[8]
;;  UD chains for artificial uses at top

(code_label 24 6 13 3 3 (nil) [0 uses])
(note 13 24 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
;;   UD chains for insn luid 0 uid 15
;;      reg 125 { d4(bb 3 insn 23) }
;;      reg 145 { }
;;   eq_note reg 125 { d4(bb 3 insn 23) }
(insn 15 13 17 3 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (expr_list:REG_EQUAL (mem:V2DF (plus:DI (reg:DI 125 [ ivtmp.14 ])
                    (symbol_ref:DI ("y") [flags 0x80]  <var_decl 0x3fff88d405a0 y>)) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])
            (nil))))
;;   UD chains for insn luid 1 uid 17
;;      reg 125 { d4(bb 3 insn 23) }
;;      reg 144 { }
;;   eq_note reg 125 { d4(bb 3 insn 23) }
(insn 17 15 18 3 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (expr_list:REG_EQUAL (mem:V2DF (plus:DI (reg:DI 125 [ ivtmp.14 ])
                    (symbol_ref:DI ("x") [flags 0x80]  <var_decl 0x3fff88d40510 x>)) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])
            (nil))))
;;   UD chains for insn luid 2 uid 18
;;      reg 131 { d5(bb 3 insn 15) }
;;      reg 133 { d6(bb 3 insn 17) }
(insn 18 17 19 3 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))
;;   UD chains for insn luid 3 uid 19
;;      reg 121 { d2(bb 3 insn 18) }
(insn 19 18 20 3 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))
;;   UD chains for insn luid 4 uid 20
;;      reg 120 { d1(bb 3 insn 19) }
;;      reg 122 { d3(bb 3 insn 22) }
(insn 20 19 21 3 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))
;;   UD chains for insn luid 5 uid 21
;;      reg 121 { d2(bb 3 insn 18) }
(insn 21 20 22 3 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))
;;   UD chains for insn luid 6 uid 22
;;      reg 119 { d0(bb 3 insn 20) }
;;      reg 134 { d7(bb 3 insn 21) }
(insn 22 21 23 3 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))
;;   UD chains for insn luid 7 uid 23
;;      reg 125 { d4(bb 3 insn 23) }
(insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
;;   UD chains for insn luid 8 uid 25
;;      reg 125 { d4(bb 3 insn 23) }
(insn 25 23 26 3 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
;;   UD chains for insn luid 9 uid 26
;;      reg 135 { d8(bb 3 insn 25) }
(jump_insn 26 25 67 3 (set (pc)
        (if_then_else (ne (reg:CCUNS 135)
                (const_int 0 [0]))
            (label_ref:DI 67)
            (pc))) 794 {*cbranch}
     (expr_list:REG_DEAD (reg:CCUNS 135)
        (int_list:REG_BR_PROB 1052266990 (nil)))
 -> 67)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; live  out 	 122 125
;; rd  out 	(2) 122[3],125[4]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }


;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	
;; live  in  	 122 125
;; live  gen 	
;; live  kill	
;; rd  in  	(9) 119[0],120[1],121[2],122[3],125[4],131[5],133[6],134[7],135[8]
;; rd  gen 	(0) 
;; rd  kill	(0) 
;;  UD chains for artificial uses at top

(code_label 67 26 66 8 5 (nil) [1 uses])
(note 66 67 27 8 [bb 8] NOTE_INSN_BASIC_BLOCK)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; live  out 	 122 125
;; rd  out 	(2) 122[3],125[4]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }



Analyzing operand (reg:DI 125 [ ivtmp.14 ]) of insn (insn 25 23 26 3 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
Analyzing def of (reg:DI 125 [ ivtmp.14 ]) in insn (insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
Analyzing operand (reg:DI 125 [ ivtmp.14 ]) of insn (insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
Analyzing (reg:DI 125 [ ivtmp.14 ]) for bivness.
  (reg:DI 125 [ ivtmp.14 ]) + (const_int 16 [0x10]) * iteration (in DI)
Analyzing operand (const_int 16 [0x10]) of insn (insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
  invariant (const_int 16 [0x10]) (in DI)
(reg:DI 125 [ ivtmp.14 ]) in insn (insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
  is (plus:DI (reg:DI 125 [ ivtmp.14 ])
    (const_int 16 [0x10])) + (const_int 16 [0x10]) * iteration (in DI)
Analyzing operand (const_int 4096 [0x1000]) of insn (insn 25 23 26 3 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
  invariant (const_int 4096 [0x1000]) (in DI)
Loop 2 is simple:
  simple exit 3 -> 4
  number of iterations: (const_int 255 [0xff])
  upper bound: 255
  likely upper bound: 255
  realistic bound: 255
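
Note on the count reported for Loop 2: the figure of 255 can be reproduced by hand from the insns in the dump. The induction variable ivtmp.14 (reg 125) is set to 0 by insn 5, incremented by 16 by insn 23, and compared against 4096 by insn 25. The body therefore runs 4096 / 16 = 256 times, and the back edge is taken 255 times, which appears to be what the pass reports as the number of iterations. A minimal C sketch of that arithmetic follows; the helper name count_latch_executions is hypothetical and is not part of GCC or of the patch.

```c
/* Hypothetical helper, not part of GCC: models the control flow of
   basic block 3 and counts how often the back edge is taken.  */
static unsigned long
count_latch_executions (unsigned long start, unsigned long step,
                        unsigned long bound)
{
  unsigned long iv = start;     /* insn 5: ivtmp.14 = 0 */
  unsigned long latch = 0;

  for (;;)
    {
      /* Loop body: the two V2DF loads would index with iv here.  */
      iv += step;               /* insn 23: ivtmp.14 += 16 */
      if (iv == bound)          /* insns 25/26: loop back while iv != 4096 */
        break;
      latch++;                  /* back edge through bb 8 to bb 3 */
    }
  return latch;
}
```

Calling count_latch_executions (0, 16, 4096) returns 255, matching the dump. The sketch assumes that bound is reached exactly by repeated addition of step, as is the case for these constants.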
starting the processing of deferred insns
ending the processing of deferred insns
setting blocks to analyze 3, 4, 5, 7, 8
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 9 (    1)
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 9 (    1)
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 11 (  1.2)


starting region dump


main

Dataflow summary:
def_info->table_size = 71, use_info->table_size = 76
;;  invalidated by call 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr]
;;  hardware regs used 	 1 [1] 2 [2] 99 [ap] 109 [vscr] 110 [sfp]
;;  regular block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  eh block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  entry block defs 	 1 [1] 2 [2] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 31 [31] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 96 [lr] 99 [ap] 109 [vscr] 110 [sfp]
;;  exit block uses 	 1 [1] 2 [2] 3 [3] 31 [31] 108 [vrsave] 109 [vscr] 110 [sfp]
;;  regs ever live 	 1 [1] 2 [2] 3 [3] 4 [4] 33 [1] 96 [lr] 109 [vscr]
;;  ref usage 	r0={3d} r1={1d,11u} r2={1d,17u} r3={5d,2u} r4={5d,1u} r5={4d} r6={4d} r7={4d} r8={4d} r9={4d} r10={4d} r11={3d} r12={3d} r13={3d} r31={1d,8u} r32={3d} r33={5d,1u} r34={4d} r35={4d} r36={4d} r37={4d} r38={4d} r39={4d} r40={4d} r41={4d} r42={4d} r43={4d} r44={4d} r45={4d} r64={3d} r65={3d} r66={4d} r67={4d} r68={4d} r69={4d} r70={4d} r71={4d} r72={4d} r73={4d} r74={4d} r75={4d} r76={4d} r77={4d} r78={3d} r79={3d} r80={3d} r81={3d} r82={3d} r83={3d} r96={4d} r97={3d} r98={3d} r99={1d,7u} r100={3d} r101={3d} r105={3d} r106={3d} r107={3d} r108={1u} r109={4d,4u} r110={1d,8u} r119={1d,1u} r120={1d,1u} r121={1d,2u} r122={2d,2u} r125={2d,4u,2e} r126={2d,2u} r131={1d,1u} r133={1d,1u} r134={1d,1u} r135={1d,1u} r136={1d,1u} r137={1d,1u} r138={1d,1u} r140={1d,1u} r141={1d,1u} r142={1d,1u} r144={1d,1u} r145={1d,1u} r146={1d,1u} 
;;    total ref usage 317{230d,85u,2e} in 33{30 regular + 3 call} insns.
;; Reaching defs:
;;  sparse invalidated 	
;;  dense invalidated 	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56
;;  reg->defs[] map:	0[0,0] 3[1,1] 4[2,3] 5[4,4] 6[5,5] 7[6,6] 8[7,7] 9[8,8] 10[9,9] 11[10,10] 12[11,11] 13[12,12] 32[13,13] 33[14,15] 34[16,16] 35[17,17] 36[18,18] 37[19,19] 38[20,20] 39[21,21] 40[22,22] 41[23,23] 42[24,24] 43[25,25] 44[26,26] 45[27,27] 64[28,28] 65[29,29] 66[30,30] 67[31,31] 68[32,32] 69[33,33] 70[34,34] 71[35,35] 72[36,36] 73[37,37] 74[38,38] 75[39,39] 76[40,40] 77[41,41] 78[42,42] 79[43,43] 80[44,44] 81[45,45] 82[46,46] 83[47,47] 96[48,48] 97[49,49] 98[50,50] 100[51,51] 101[52,52] 105[53,53] 106[54,54] 107[55,55] 109[56,56] 119[57,57] 120[58,58] 121[59,59] 122[60,61] 125[62,63] 126[64,64] 131[65,65] 133[66,66] 134[67,67] 135[68,68] 136[69,69] 137[70,70] 
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u8(1){ }u9(2){ }u10(31){ }u11(99){ }u12(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145 146
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp] 122 125 144 145
;; lr  def 	 119 120 121 122 125 131 133 134 135
;; live  in  	 109 [vscr] 122 125 126
;; live  gen 	 119 120 121 122 125 131 133 134 135
;; live  kill	
;; rd  in  	(6) 109[56],122[60,61],125[62,63],126[64]
;; rd  gen 	(9) 119[57],120[58],121[59],122[60],125[62],131[65],133[66],134[67],135[68]
;; rd  kill	(11) 119[57],120[58],121[59],122[60,61],125[62,63],131[65],133[66],134[67],135[68]
;;  UD chains for artificial uses at top

(code_label 24 6 13 3 3 (nil) [0 uses])
(note 13 24 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
;;   UD chains for insn luid 0 uid 15
;;      reg 125 { d63(bb 5 insn 5) d62(bb 3 insn 23) }
;;      reg 145 { }
;;   eq_note reg 125 { d63(bb 5 insn 5) d62(bb 3 insn 23) }
(insn 15 13 17 3 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (expr_list:REG_EQUAL (mem:V2DF (plus:DI (reg:DI 125 [ ivtmp.14 ])
                    (symbol_ref:DI ("y") [flags 0x80]  <var_decl 0x3fff88d405a0 y>)) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])
            (nil))))
;;   UD chains for insn luid 1 uid 17
;;      reg 125 { d63(bb 5 insn 5) d62(bb 3 insn 23) }
;;      reg 144 { }
;;   eq_note reg 125 { d63(bb 5 insn 5) d62(bb 3 insn 23) }
(insn 17 15 18 3 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (expr_list:REG_EQUAL (mem:V2DF (plus:DI (reg:DI 125 [ ivtmp.14 ])
                    (symbol_ref:DI ("x") [flags 0x80]  <var_decl 0x3fff88d40510 x>)) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])
            (nil))))
;;   UD chains for insn luid 2 uid 18
;;      reg 131 { d65(bb 3 insn 15) }
;;      reg 133 { d66(bb 3 insn 17) }
(insn 18 17 19 3 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))
;;   UD chains for insn luid 3 uid 19
;;      reg 121 { d59(bb 3 insn 18) }
(insn 19 18 20 3 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))
;;   UD chains for insn luid 4 uid 20
;;      reg 120 { d58(bb 3 insn 19) }
;;      reg 122 { d61(bb 5 insn 68) d60(bb 3 insn 22) }
(insn 20 19 21 3 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))
;;   UD chains for insn luid 5 uid 21
;;      reg 121 { d59(bb 3 insn 18) }
(insn 21 20 22 3 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))
;;   UD chains for insn luid 6 uid 22
;;      reg 119 { d57(bb 3 insn 20) }
;;      reg 134 { d67(bb 3 insn 21) }
(insn 22 21 23 3 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))
;;   UD chains for insn luid 7 uid 23
;;      reg 125 { d63(bb 5 insn 5) d62(bb 3 insn 23) }
(insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
;;   UD chains for insn luid 8 uid 25
;;      reg 125 { d62(bb 3 insn 23) }
(insn 25 23 26 3 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
;;   UD chains for insn luid 9 uid 26
;;      reg 135 { d68(bb 3 insn 25) }
(jump_insn 26 25 67 3 (set (pc)
        (if_then_else (ne (reg:CCUNS 135)
                (const_int 0 [0]))
            (label_ref:DI 67)
            (pc))) 794 {*cbranch}
     (expr_list:REG_DEAD (reg:CCUNS 135)
        (int_list:REG_BR_PROB 1052266990 (nil)))
 -> 67)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145 146
;; live  out 	 109 [vscr] 122 125 126
;; rd  out 	(4) 109[56],122[60],125[62],126[64]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }


;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u30(1){ }u31(2){ }u32(31){ }u33(99){ }u34(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 126 144 145 146
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 126
;; lr  def 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr] 126 136 137
;; live  in  	 109 [vscr] 122 126
;; live  gen 	 4 [4] 33 [1] 109 [vscr] 126 136 137
;; live  kill	 96 [lr]
;; rd  in  	(4) 109[56],122[60],125[62],126[64]
;; rd  gen 	(4) 109[56],126[64],136[69],137[70]
;; rd  kill	(5) 96[48],109[56],126[64],136[69],137[70]
;;  UD chains for artificial uses at top

(note 27 66 28 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
;;   UD chains for insn luid 0 uid 28
(insn 28 27 29 4 (set (reg:DI 4 4)
        (const_int 512 [0x200])) "p9-dform-0.c":33:5 609 {*movdi_internal64}
     (nil))
;;   UD chains for insn luid 1 uid 29
;;      reg 122 { d60(bb 3 insn 22) }
(insn 29 28 30 4 (set (reg:DF 33 1)
        (reg/v:DF 122 [ sacc ])) "p9-dform-0.c":33:5 512 {*movdf_hardfloat64}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (nil)))
;;   UD chains for insn luid 2 uid 30
;;      reg 1 { }
;;      reg 109 { d56(bb 4 insn 30) }
;;      reg 2 { }
;;      reg 4 { d2(bb 4 insn 28) }
;;      reg 33 { d14(bb 4 insn 29) }
(call_insn 30 29 31 4 (parallel [
            (call (mem:SI (symbol_ref:DI ("dummy") [flags 0x41]  <function_decl 0x3fff863be100 dummy>) [0 dummy S4 A8])
                (const_int 0 [0]))
            (clobber (reg:DI 96 lr))
        ]) "p9-dform-0.c":33:5 704 {*call_nonlocal_aixdi}
     (expr_list:REG_DEAD (reg:DF 33 1)
        (expr_list:REG_DEAD (reg:DI 4 4)
            (expr_list:REG_CALL_DECL (symbol_ref:DI ("dummy") [flags 0x41]  <function_decl 0x3fff863be100 dummy>)
                (nil))))
    (expr_list (use (reg:DI 2 2))
        (expr_list:DF (use (reg:DF 33 1))
            (expr_list:SI (use (reg:DI 4 4))
                (nil)))))
;;   UD chains for insn luid 3 uid 31
;;      reg 126 { d64(bb 4 insn 32) }
(insn 31 30 32 4 (set (reg:SI 136)
        (plus:SI (subreg/s/v:SI (reg:DI 126 [ ivtmp_28 ]) 0)
            (const_int -1 [0xffffffffffffffff]))) "p9-dform-0.c":27:3 68 {*addsi3}
     (expr_list:REG_DEAD (reg:DI 126 [ ivtmp_28 ])
        (nil)))
;;   UD chains for insn luid 4 uid 32
;;      reg 136 { d69(bb 4 insn 31) }
(insn 32 31 33 4 (set (reg:DI 126 [ ivtmp_28 ])
        (zero_extend:DI (reg:SI 136))) "p9-dform-0.c":27:3 19 {zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 136)
        (nil)))
;;   UD chains for insn luid 5 uid 33
;;      reg 126 { d64(bb 4 insn 32) }
(insn 33 32 34 4 (set (reg:CC 137)
        (compare:CC (reg:DI 126 [ ivtmp_28 ])
            (const_int 0 [0]))) "p9-dform-0.c":27:3 730 {*cmpdi_signed}
     (nil))
;;   UD chains for insn luid 6 uid 34
;;      reg 137 { d70(bb 4 insn 33) }
(jump_insn 34 33 65 4 (set (pc)
        (if_then_else (eq (reg:CC 137)
                (const_int 0 [0]))
            (label_ref 39)
            (pc))) "p9-dform-0.c":27:3 794 {*cbranch}
     (expr_list:REG_DEAD (reg:CC 137)
        (int_list:REG_BR_PROB 10845908 (nil)))
 -> 39)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145 146
;; live  out 	 109 [vscr] 126
;; rd  out 	(2) 109[56],126[64]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }


;; bb 5 artificial_defs: { }
;; bb 5 artificial_uses: { u45(1){ }u46(2){ }u47(31){ }u48(99){ }u49(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145 146
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp] 146
;; lr  def 	 122 125
;; live  in  	 109 [vscr] 126
;; live  gen 	 122 125
;; live  kill	
;; rd  in  	(2) 109[56],126[64]
;; rd  gen 	(2) 122[61],125[63]
;; rd  kill	(4) 122[60,61],125[62,63]
;;  UD chains for artificial uses at top

(code_label 35 65 36 5 2 (nil) [0 uses])
(note 36 35 5 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
;;   UD chains for insn luid 0 uid 5
(insn 5 36 68 5 (set (reg:DI 125 [ ivtmp.14 ])
        (const_int 0 [0])) "p9-dform-0.c":23:36 609 {*movdi_internal64}
     (nil))
;;   UD chains for insn luid 1 uid 68
;;      reg 146 { }
(insn 68 5 39 5 (set (reg/v:DF 122 [ sacc ])
        (reg:DF 146 [ sacc ])) "p9-dform-0.c":29:10 -1
     (expr_list:REG_DEAD (reg:DF 146 [ sacc ])
        (nil)))
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145 146
;; live  out 	 109 [vscr] 122 125 126
;; rd  out 	(4) 109[56],122[61],125[63],126[64]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }


;; bb 7 artificial_defs: { }
;; bb 7 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145 146
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	
;; live  in  	 109 [vscr] 126
;; live  gen 	
;; live  kill	
;; rd  in  	(2) 109[56],126[64]
;; rd  gen 	(0) 
;; rd  kill	(0) 
;;  UD chains for artificial uses at top

(note 65 34 35 7 [bb 7] NOTE_INSN_BASIC_BLOCK)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145 146
;; live  out 	 109 [vscr] 126
;; rd  out 	(2) 109[56],126[64]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }


;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145 146
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	
;; live  in  	 109 [vscr] 122 125 126
;; live  gen 	
;; live  kill	
;; rd  in  	(11) 109[56],119[57],120[58],121[59],122[60],125[62],126[64],131[65],133[66],134[67],135[68]
;; rd  gen 	(0) 
;; rd  kill	(0) 
;;  UD chains for artificial uses at top

(code_label 67 26 66 8 5 (nil) [1 uses])
(note 66 67 27 8 [bb 8] NOTE_INSN_BASIC_BLOCK)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145 146
;; live  out 	 109 [vscr] 122 125 126
;; rd  out 	(4) 109[56],122[60],125[62],126[64]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }



Analyzing operand (reg:DI 126 [ ivtmp_28 ]) of insn (insn 33 32 34 4 (set (reg:CC 137)
        (compare:CC (reg:DI 126 [ ivtmp_28 ])
            (const_int 0 [0]))) "p9-dform-0.c":27:3 730 {*cmpdi_signed}
     (nil))
Analyzing def of (reg:DI 126 [ ivtmp_28 ]) in insn (insn 32 31 33 4 (set (reg:DI 126 [ ivtmp_28 ])
        (zero_extend:DI (reg:SI 136))) "p9-dform-0.c":27:3 19 {zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 136)
        (nil)))
Analyzing operand (reg:SI 136) of insn (insn 32 31 33 4 (set (reg:DI 126 [ ivtmp_28 ])
        (zero_extend:DI (reg:SI 136))) "p9-dform-0.c":27:3 19 {zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 136)
        (nil)))
Analyzing def of (reg:SI 136) in insn (insn 31 30 32 4 (set (reg:SI 136)
        (plus:SI (subreg/s/v:SI (reg:DI 126 [ ivtmp_28 ]) 0)
            (const_int -1 [0xffffffffffffffff]))) "p9-dform-0.c":27:3 68 {*addsi3}
     (expr_list:REG_DEAD (reg:DI 126 [ ivtmp_28 ])
        (nil)))
Analyzing operand (subreg/s/v:SI (reg:DI 126 [ ivtmp_28 ]) 0) of insn (insn 31 30 32 4 (set (reg:SI 136)
        (plus:SI (subreg/s/v:SI (reg:DI 126 [ ivtmp_28 ]) 0)
            (const_int -1 [0xffffffffffffffff]))) "p9-dform-0.c":27:3 68 {*addsi3}
     (expr_list:REG_DEAD (reg:DI 126 [ ivtmp_28 ])
        (nil)))
Analyzing operand (reg:DI 126 [ ivtmp_28 ]) of insn (insn 31 30 32 4 (set (reg:SI 136)
        (plus:SI (subreg/s/v:SI (reg:DI 126 [ ivtmp_28 ]) 0)
            (const_int -1 [0xffffffffffffffff]))) "p9-dform-0.c":27:3 68 {*addsi3}
     (expr_list:REG_DEAD (reg:DI 126 [ ivtmp_28 ])
        (nil)))
Analyzing (reg:DI 126 [ ivtmp_28 ]) for bivness.
  (reg:DI 126 [ ivtmp_28 ]) + (const_int -1 [0xffffffffffffffff]) * iteration (in SI) zero_extend to DI (first special)
Analyzing operand (const_int -1 [0xffffffffffffffff]) of insn (insn 31 30 32 4 (set (reg:SI 136)
        (plus:SI (subreg/s/v:SI (reg:DI 126 [ ivtmp_28 ]) 0)
            (const_int -1 [0xffffffffffffffff]))) "p9-dform-0.c":27:3 68 {*addsi3}
     (expr_list:REG_DEAD (reg:DI 126 [ ivtmp_28 ])
        (nil)))
  invariant (const_int -1 [0xffffffffffffffff]) (in SI)
(reg:SI 136) in insn (insn 31 30 32 4 (set (reg:SI 136)
        (plus:SI (subreg/s/v:SI (reg:DI 126 [ ivtmp_28 ]) 0)
            (const_int -1 [0xffffffffffffffff]))) "p9-dform-0.c":27:3 68 {*addsi3}
     (expr_list:REG_DEAD (reg:DI 126 [ ivtmp_28 ])
        (nil)))
  is (plus:DI (reg:DI 126 [ ivtmp_28 ])
    (const_int 4294967295 [0xffffffff])) + (const_int -1 [0xffffffffffffffff]) * iteration (in SI) UnKnown to DI
(reg:DI 126 [ ivtmp_28 ]) in insn (insn 32 31 33 4 (set (reg:DI 126 [ ivtmp_28 ])
        (zero_extend:DI (reg:SI 136))) "p9-dform-0.c":27:3 19 {zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 136)
        (nil)))
  is (plus:DI (reg:DI 126 [ ivtmp_28 ])
    (const_int 4294967295 [0xffffffff])) + (const_int -1 [0xffffffffffffffff]) * iteration (in SI) zero_extend to DI
Analyzing operand (const_int 0 [0]) of insn (insn 33 32 34 4 (set (reg:CC 137)
        (compare:CC (reg:DI 126 [ ivtmp_28 ])
            (const_int 0 [0]))) "p9-dform-0.c":27:3 730 {*cmpdi_signed}
     (nil))
  invariant (const_int 0 [0]) (in DI)
Loop 1 is simple:
  simple exit 4 -> 6
  number of iterations: (const_int 127 [0x7f])
  upper bound: 127
  likely upper bound: 127
  realistic bound: 127
;; Not considering loop, is not innermost
starting the processing of deferred insns
ending the processing of deferred insns
setting blocks to analyze 3, 8
starting the processing of deferred insns
ending the processing of deferred insns
df_analyze called
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 3 ( 0.33)
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 2 ( 0.22)
df_worklist_dataflow_doublequeue: n_basic_blocks 9 n_edges 10 count 3 ( 0.33)


starting region dump


main

Dataflow summary:
def_info->table_size = 9, use_info->table_size = 76
;;  invalidated by call 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr]
;;  hardware regs used 	 1 [1] 2 [2] 99 [ap] 109 [vscr] 110 [sfp]
;;  regular block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  eh block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  entry block defs 	 1 [1] 2 [2] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 31 [31] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 96 [lr] 99 [ap] 109 [vscr] 110 [sfp]
;;  exit block uses 	 1 [1] 2 [2] 3 [3] 31 [31] 108 [vrsave] 109 [vscr] 110 [sfp]
;;  regs ever live 	 1 [1] 2 [2] 3 [3] 4 [4] 33 [1] 96 [lr] 109 [vscr]
;;  ref usage 	r0={3d} r1={1d,11u} r2={1d,17u} r3={5d,2u} r4={5d,1u} r5={4d} r6={4d} r7={4d} r8={4d} r9={4d} r10={4d} r11={3d} r12={3d} r13={3d} r31={1d,8u} r32={3d} r33={5d,1u} r34={4d} r35={4d} r36={4d} r37={4d} r38={4d} r39={4d} r40={4d} r41={4d} r42={4d} r43={4d} r44={4d} r45={4d} r64={3d} r65={3d} r66={4d} r67={4d} r68={4d} r69={4d} r70={4d} r71={4d} r72={4d} r73={4d} r74={4d} r75={4d} r76={4d} r77={4d} r78={3d} r79={3d} r80={3d} r81={3d} r82={3d} r83={3d} r96={4d} r97={3d} r98={3d} r99={1d,7u} r100={3d} r101={3d} r105={3d} r106={3d} r107={3d} r108={1u} r109={4d,4u} r110={1d,8u} r119={1d,1u} r120={1d,1u} r121={1d,2u} r122={2d,2u} r125={2d,4u,2e} r126={2d,2u} r131={1d,1u} r133={1d,1u} r134={1d,1u} r135={1d,1u} r136={1d,1u} r137={1d,1u} r138={1d,1u} r140={1d,1u} r141={1d,1u} r142={1d,1u} r144={1d,1u} r145={1d,1u} r146={1d,1u} 
;;    total ref usage 317{230d,85u,2e} in 33{30 regular + 3 call} insns.
;; Reaching defs:
;;  sparse invalidated 	
;;  dense invalidated 	
;;  reg->defs[] map:	119[0,0] 120[1,1] 121[2,2] 122[3,3] 125[4,4] 131[5,5] 133[6,6] 134[7,7] 135[8,8] 
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u8(1){ }u9(2){ }u10(31){ }u11(99){ }u12(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp] 122 125 144 145
;; lr  def 	 119 120 121 122 125 131 133 134 135
;; live  in  	 122 125
;; live  gen 	 119 120 121 122 125 131 133 134 135
;; live  kill	
;; rd  in  	(2) 122[3],125[4]
;; rd  gen 	(9) 119[0],120[1],121[2],122[3],125[4],131[5],133[6],134[7],135[8]
;; rd  kill	(9) 119[0],120[1],121[2],122[3],125[4],131[5],133[6],134[7],135[8]
;;  UD chains for artificial uses at top

(code_label 24 6 13 3 3 (nil) [0 uses])
(note 13 24 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
;;   UD chains for insn luid 0 uid 15
;;      reg 125 { d4(bb 3 insn 23) }
;;      reg 145 { }
;;   eq_note reg 125 { d4(bb 3 insn 23) }
(insn 15 13 17 3 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (expr_list:REG_EQUAL (mem:V2DF (plus:DI (reg:DI 125 [ ivtmp.14 ])
                    (symbol_ref:DI ("y") [flags 0x80]  <var_decl 0x3fff88d405a0 y>)) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])
            (nil))))
;;   UD chains for insn luid 1 uid 17
;;      reg 125 { d4(bb 3 insn 23) }
;;      reg 144 { }
;;   eq_note reg 125 { d4(bb 3 insn 23) }
(insn 17 15 18 3 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (expr_list:REG_EQUAL (mem:V2DF (plus:DI (reg:DI 125 [ ivtmp.14 ])
                    (symbol_ref:DI ("x") [flags 0x80]  <var_decl 0x3fff88d40510 x>)) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])
            (nil))))
;;   UD chains for insn luid 2 uid 18
;;      reg 131 { d5(bb 3 insn 15) }
;;      reg 133 { d6(bb 3 insn 17) }
(insn 18 17 19 3 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))
;;   UD chains for insn luid 3 uid 19
;;      reg 121 { d2(bb 3 insn 18) }
(insn 19 18 20 3 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))
;;   UD chains for insn luid 4 uid 20
;;      reg 120 { d1(bb 3 insn 19) }
;;      reg 122 { d3(bb 3 insn 22) }
(insn 20 19 21 3 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))
;;   UD chains for insn luid 5 uid 21
;;      reg 121 { d2(bb 3 insn 18) }
(insn 21 20 22 3 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))
;;   UD chains for insn luid 6 uid 22
;;      reg 119 { d0(bb 3 insn 20) }
;;      reg 134 { d7(bb 3 insn 21) }
(insn 22 21 23 3 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))
;;   UD chains for insn luid 7 uid 23
;;      reg 125 { d4(bb 3 insn 23) }
(insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
;;   UD chains for insn luid 8 uid 25
;;      reg 125 { d4(bb 3 insn 23) }
(insn 25 23 26 3 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
;;   UD chains for insn luid 9 uid 26
;;      reg 135 { d8(bb 3 insn 25) }
(jump_insn 26 25 67 3 (set (pc)
        (if_then_else (ne (reg:CCUNS 135)
                (const_int 0 [0]))
            (label_ref:DI 67)
            (pc))) 794 {*cbranch}
     (expr_list:REG_DEAD (reg:CCUNS 135)
        (int_list:REG_BR_PROB 1052266990 (nil)))
 -> 67)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; live  out 	 122 125
;; rd  out 	(2) 122[3],125[4]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }


;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	
;; live  in  	 122 125
;; live  gen 	
;; live  kill	
;; rd  in  	(9) 119[0],120[1],121[2],122[3],125[4],131[5],133[6],134[7],135[8]
;; rd  gen 	(0) 
;; rd  kill	(0) 
;;  UD chains for artificial uses at top

(code_label 67 26 66 8 5 (nil) [1 uses])
(note 66 67 27 8 [bb 8] NOTE_INSN_BASIC_BLOCK)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; live  out 	 122 125
;; rd  out 	(2) 122[3],125[4]
;;  UD chains for artificial uses at bottom
;;   reg 1 { }
;;   reg 2 { }
;;   reg 31 { }
;;   reg 99 { }
;;   reg 110 { }



changing bb of uid 69
  unscanned insn
Redirecting fallthru edge 3->4 to 9
Analyzing (reg:DI 125 [ ivtmp.14 ]) for bivness.
  (reg:DI 125 [ ivtmp.14 ]) + (const_int 16 [0x10]) * iteration (in DI)
Analyzing def of (reg:DI 125 [ ivtmp.14 ]) in insn (insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
Analyzing operand (reg:DI 125 [ ivtmp.14 ]) of insn (insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
Analyzing (reg:DI 125 [ ivtmp.14 ]) for bivness.
  already analysed.
Analyzing operand (const_int 16 [0x10]) of insn (insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
  invariant (const_int 16 [0x10]) (in DI)
(reg:DI 125 [ ivtmp.14 ]) in insn (insn 23 22 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))
  is (plus:DI (reg:DI 125 [ ivtmp.14 ])
    (const_int 16 [0x10])) + (const_int 16 [0x10]) * iteration (in DI)
;; Condition at end of loop.
changing bb of uid 81
  unscanned insn
changing bb of uid 71
  unscanned insn
deferring rescan insn with uid = 71.
changing bb of uid 72
  unscanned insn
deferring rescan insn with uid = 72.
changing bb of uid 73
  unscanned insn
deferring rescan insn with uid = 73.
changing bb of uid 74
  unscanned insn
deferring rescan insn with uid = 74.
changing bb of uid 75
  unscanned insn
deferring rescan insn with uid = 75.
changing bb of uid 76
  unscanned insn
deferring rescan insn with uid = 76.
changing bb of uid 77
  unscanned insn
deferring rescan insn with uid = 77.
changing bb of uid 78
  unscanned insn
deferring rescan insn with uid = 78.
changing bb of uid 79
  unscanned insn
deferring rescan insn with uid = 79.
changing bb of uid 80
  unscanned insn
deferring rescan insn with uid = 80.
changing bb of uid 83
  unscanned insn
deferring rescan insn with uid = 80.
Edge 10->8 redirected to 11
Redirecting fallthru edge 11->3 to 10
Redirecting fallthru edge 8->3 to 10
Redirecting fallthru edge 11->10 to 3
Making edge 10->9 impossible by redistributing probability to other edges.
changing bb of uid 96
  unscanned insn
changing bb of uid 86
  unscanned insn
deferring rescan insn with uid = 86.
changing bb of uid 87
  unscanned insn
deferring rescan insn with uid = 87.
changing bb of uid 88
  unscanned insn
deferring rescan insn with uid = 88.
changing bb of uid 89
  unscanned insn
deferring rescan insn with uid = 89.
changing bb of uid 90
  unscanned insn
deferring rescan insn with uid = 90.
changing bb of uid 91
  unscanned insn
deferring rescan insn with uid = 91.
changing bb of uid 92
  unscanned insn
deferring rescan insn with uid = 92.
changing bb of uid 93
  unscanned insn
deferring rescan insn with uid = 93.
changing bb of uid 94
  unscanned insn
deferring rescan insn with uid = 94.
changing bb of uid 95
  unscanned insn
deferring rescan insn with uid = 95.
changing bb of uid 98
  unscanned insn
deferring rescan insn with uid = 95.
Edge 12->8 redirected to 13
Redirecting fallthru edge 11->3 to 12
Redirecting fallthru edge 13->10 to 3
Making edge 12->9 impossible by redistributing probability to other edges.
changing bb of uid 111
  unscanned insn
changing bb of uid 101
  unscanned insn
deferring rescan insn with uid = 101.
changing bb of uid 102
  unscanned insn
deferring rescan insn with uid = 102.
changing bb of uid 103
  unscanned insn
deferring rescan insn with uid = 103.
changing bb of uid 104
  unscanned insn
deferring rescan insn with uid = 104.
changing bb of uid 105
  unscanned insn
deferring rescan insn with uid = 105.
changing bb of uid 106
  unscanned insn
deferring rescan insn with uid = 106.
changing bb of uid 107
  unscanned insn
deferring rescan insn with uid = 107.
changing bb of uid 108
  unscanned insn
deferring rescan insn with uid = 108.
changing bb of uid 109
  unscanned insn
deferring rescan insn with uid = 109.
changing bb of uid 110
  unscanned insn
deferring rescan insn with uid = 110.
changing bb of uid 113
  unscanned insn
deferring rescan insn with uid = 110.
Edge 14->8 redirected to 15
Redirecting fallthru edge 13->3 to 14
Redirecting fallthru edge 15->10 to 3
Making edge 14->9 impossible by redistributing probability to other edges.
changing bb of uid 126
  unscanned insn
changing bb of uid 116
  unscanned insn
deferring rescan insn with uid = 116.
changing bb of uid 117
  unscanned insn
deferring rescan insn with uid = 117.
changing bb of uid 118
  unscanned insn
deferring rescan insn with uid = 118.
changing bb of uid 119
  unscanned insn
deferring rescan insn with uid = 119.
changing bb of uid 120
  unscanned insn
deferring rescan insn with uid = 120.
changing bb of uid 121
  unscanned insn
deferring rescan insn with uid = 121.
changing bb of uid 122
  unscanned insn
deferring rescan insn with uid = 122.
changing bb of uid 123
  unscanned insn
deferring rescan insn with uid = 123.
changing bb of uid 124
  unscanned insn
deferring rescan insn with uid = 124.
changing bb of uid 125
  unscanned insn
deferring rescan insn with uid = 125.
changing bb of uid 128
  unscanned insn
deferring rescan insn with uid = 125.
Edge 16->8 redirected to 17
Redirecting fallthru edge 15->3 to 16
Redirecting fallthru edge 17->10 to 3
Making edge 16->9 impossible by redistributing probability to other edges.
changing bb of uid 141
  unscanned insn
changing bb of uid 131
  unscanned insn
deferring rescan insn with uid = 131.
changing bb of uid 132
  unscanned insn
deferring rescan insn with uid = 132.
changing bb of uid 133
  unscanned insn
deferring rescan insn with uid = 133.
changing bb of uid 134
  unscanned insn
deferring rescan insn with uid = 134.
changing bb of uid 135
  unscanned insn
deferring rescan insn with uid = 135.
changing bb of uid 136
  unscanned insn
deferring rescan insn with uid = 136.
changing bb of uid 137
  unscanned insn
deferring rescan insn with uid = 137.
changing bb of uid 138
  unscanned insn
deferring rescan insn with uid = 138.
changing bb of uid 139
  unscanned insn
deferring rescan insn with uid = 139.
changing bb of uid 140
  unscanned insn
deferring rescan insn with uid = 140.
changing bb of uid 143
  unscanned insn
deferring rescan insn with uid = 140.
Edge 18->8 redirected to 19
Redirecting fallthru edge 17->3 to 18
Redirecting fallthru edge 19->10 to 3
Making edge 18->9 impossible by redistributing probability to other edges.
changing bb of uid 156
  unscanned insn
changing bb of uid 146
  unscanned insn
deferring rescan insn with uid = 146.
changing bb of uid 147
  unscanned insn
deferring rescan insn with uid = 147.
changing bb of uid 148
  unscanned insn
deferring rescan insn with uid = 148.
changing bb of uid 149
  unscanned insn
deferring rescan insn with uid = 149.
changing bb of uid 150
  unscanned insn
deferring rescan insn with uid = 150.
changing bb of uid 151
  unscanned insn
deferring rescan insn with uid = 151.
changing bb of uid 152
  unscanned insn
deferring rescan insn with uid = 152.
changing bb of uid 153
  unscanned insn
deferring rescan insn with uid = 153.
changing bb of uid 154
  unscanned insn
deferring rescan insn with uid = 154.
changing bb of uid 155
  unscanned insn
deferring rescan insn with uid = 155.
changing bb of uid 158
  unscanned insn
deferring rescan insn with uid = 155.
Edge 20->8 redirected to 21
Redirecting fallthru edge 19->3 to 20
Redirecting fallthru edge 21->10 to 3
Making edge 20->9 impossible by redistributing probability to other edges.
changing bb of uid 171
  unscanned insn
changing bb of uid 161
  unscanned insn
deferring rescan insn with uid = 161.
changing bb of uid 162
  unscanned insn
deferring rescan insn with uid = 162.
changing bb of uid 163
  unscanned insn
deferring rescan insn with uid = 163.
changing bb of uid 164
  unscanned insn
deferring rescan insn with uid = 164.
changing bb of uid 165
  unscanned insn
deferring rescan insn with uid = 165.
changing bb of uid 166
  unscanned insn
deferring rescan insn with uid = 166.
changing bb of uid 167
  unscanned insn
deferring rescan insn with uid = 167.
changing bb of uid 168
  unscanned insn
deferring rescan insn with uid = 168.
changing bb of uid 169
  unscanned insn
deferring rescan insn with uid = 169.
changing bb of uid 170
  unscanned insn
deferring rescan insn with uid = 170.
changing bb of uid 173
  unscanned insn
deferring rescan insn with uid = 170.
Edge 22->8 redirected to 23
Redirecting fallthru edge 21->3 to 22
Redirecting fallthru edge 23->10 to 3
Making edge 3->9 impossible by redistributing probability to other edges.
deferring rescan insn with uid = 78.
deferring rescan insn with uid = 93.
deferring rescan insn with uid = 108.
deferring rescan insn with uid = 123.
deferring rescan insn with uid = 138.
deferring rescan insn with uid = 153.
deferring rescan insn with uid = 168.
deferring rescan insn with uid = 175.
deferring rescan insn with uid = 23.
changing bb of uid 176
  unscanned insn
Redirecting fallthru edge 10->9 to 24
Removing jump 80.
deferring deletion of insn with uid = 80.
deleting block 24
changing bb of uid 177
  unscanned insn
Redirecting fallthru edge 12->9 to 25
Removing jump 95.
deferring deletion of insn with uid = 95.
deleting block 25
changing bb of uid 178
  unscanned insn
Redirecting fallthru edge 14->9 to 26
Removing jump 110.
deferring deletion of insn with uid = 110.
deleting block 26
changing bb of uid 179
  unscanned insn
Redirecting fallthru edge 16->9 to 27
Removing jump 125.
deferring deletion of insn with uid = 125.
deleting block 27
changing bb of uid 180
  unscanned insn
Redirecting fallthru edge 18->9 to 28
Removing jump 140.
deferring deletion of insn with uid = 140.
deleting block 28
changing bb of uid 181
  unscanned insn
Redirecting fallthru edge 20->9 to 29
Removing jump 155.
deferring deletion of insn with uid = 155.
deleting block 29
changing bb of uid 182
  unscanned insn
Redirecting fallthru edge 3->9 to 30
Removing jump 26.
deferring deletion of insn with uid = 26.
deleting block 30
;; Unrolled loop 7 times, constant # of iterations 74 insns
fix_loop_structure: fixing up loops for function
starting the processing of deferred insns
rescanning insn with uid = 23.
rescanning insn with uid = 71.
rescanning insn with uid = 72.
rescanning insn with uid = 73.
rescanning insn with uid = 74.
rescanning insn with uid = 75.
rescanning insn with uid = 76.
rescanning insn with uid = 77.
rescanning insn with uid = 78.
rescanning insn with uid = 79.
rescanning insn with uid = 86.
rescanning insn with uid = 87.
rescanning insn with uid = 88.
rescanning insn with uid = 89.
rescanning insn with uid = 90.
rescanning insn with uid = 91.
rescanning insn with uid = 92.
rescanning insn with uid = 93.
rescanning insn with uid = 94.
rescanning insn with uid = 101.
rescanning insn with uid = 102.
rescanning insn with uid = 103.
rescanning insn with uid = 104.
rescanning insn with uid = 105.
rescanning insn with uid = 106.
rescanning insn with uid = 107.
rescanning insn with uid = 108.
rescanning insn with uid = 109.
rescanning insn with uid = 116.
rescanning insn with uid = 117.
rescanning insn with uid = 118.
rescanning insn with uid = 119.
rescanning insn with uid = 120.
rescanning insn with uid = 121.
rescanning insn with uid = 122.
rescanning insn with uid = 123.
rescanning insn with uid = 124.
rescanning insn with uid = 131.
rescanning insn with uid = 132.
rescanning insn with uid = 133.
rescanning insn with uid = 134.
rescanning insn with uid = 135.
rescanning insn with uid = 136.
rescanning insn with uid = 137.
rescanning insn with uid = 138.
rescanning insn with uid = 139.
rescanning insn with uid = 146.
rescanning insn with uid = 147.
rescanning insn with uid = 148.
rescanning insn with uid = 149.
rescanning insn with uid = 150.
rescanning insn with uid = 151.
rescanning insn with uid = 152.
rescanning insn with uid = 153.
rescanning insn with uid = 154.
rescanning insn with uid = 161.
rescanning insn with uid = 162.
rescanning insn with uid = 163.
rescanning insn with uid = 164.
rescanning insn with uid = 165.
rescanning insn with uid = 166.
rescanning insn with uid = 167.
rescanning insn with uid = 168.
rescanning insn with uid = 169.
rescanning insn with uid = 170.
rescanning insn with uid = 175.
ending the processing of deferred insns


main

Dataflow summary:
;;  invalidated by call 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr]
;;  hardware regs used 	 1 [1] 2 [2] 99 [ap] 109 [vscr] 110 [sfp]
;;  regular block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  eh block artificial uses 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;;  entry block defs 	 1 [1] 2 [2] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 31 [31] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 96 [lr] 99 [ap] 109 [vscr] 110 [sfp]
;;  exit block uses 	 1 [1] 2 [2] 3 [3] 31 [31] 108 [vrsave] 109 [vscr] 110 [sfp]
;;  regs ever live 	 1 [1] 2 [2] 3 [3] 4 [4] 33 [1] 96 [lr] 109 [vscr]
;;  ref usage 	r0={3d} r1={1d,26u} r2={1d,32u} r3={5d,2u} r4={5d,1u} r5={4d} r6={4d} r7={4d} r8={4d} r9={4d} r10={4d} r11={3d} r12={3d} r13={3d} r31={1d,23u} r32={3d} r33={5d,1u} r34={4d} r35={4d} r36={4d} r37={4d} r38={4d} r39={4d} r40={4d} r41={4d} r42={4d} r43={4d} r44={4d} r45={4d} r64={3d} r65={3d} r66={4d} r67={4d} r68={4d} r69={4d} r70={4d} r71={4d} r72={4d} r73={4d} r74={4d} r75={4d} r76={4d} r77={4d} r78={3d} r79={3d} r80={3d} r81={3d} r82={3d} r83={3d} r96={4d} r97={3d} r98={3d} r99={1d,22u} r100={3d} r101={3d} r105={3d} r106={3d} r107={3d} r108={1u} r109={4d,4u} r110={1d,23u} r119={8d,8u} r120={8d,8u} r121={8d,16u} r122={9d,9u} r125={9d,25u} r126={2d,2u} r131={8d,8u} r133={8d,8u} r134={8d,8u} r135={8d,1u} r136={1d,1u} r137={1d,1u} r138={1d,1u} r140={1d,1u} r141={1d,1u} r142={1d,1u} r144={1d,8u} r145={1d,8u} r146={1d,1u} r147={1d,8u} 
;;    total ref usage 553{294d,259u,0e} in 97{94 regular + 3 call} insns.
;; basic block 2, loop depth 0, count 108459 (estimated locally), maybe hot
;;  prev block 0, next block 3, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       ENTRY [always]  count:108459 (estimated locally) (FALLTHRU)
;; bb 2 artificial_defs: { }
;; bb 2 artificial_uses: { u0(1){ }u1(2){ }u2(31){ }u3(99){ }u4(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp]
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]
;; lr  def 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr] 126 144 145 146
;; live  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]
;; live  gen 	 109 [vscr] 126 144 145 146
;; live  kill	 96 [lr]
(note 8 0 4 2 [bb 2] NOTE_INSN_BASIC_BLOCK)
(note 4 8 10 2 NOTE_INSN_FUNCTION_BEG)
(call_insn 10 4 7 2 (parallel [
            (call (mem:SI (symbol_ref:DI ("first_dummy") [flags 0x41]  <function_decl 0x3fff863be000 first_dummy>) [0 first_dummy S4 A8])
                (const_int 64 [0x40]))
            (clobber (reg:DI 96 lr))
        ]) "p9-dform-0.c":26:3 704 {*call_nonlocal_aixdi}
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("first_dummy") [flags 0x41]  <function_decl 0x3fff863be000 first_dummy>)
        (nil))
    (expr_list (use (reg:DI 2 2))
        (nil)))
(insn 7 10 63 2 (set (reg:DI 126 [ ivtmp_28 ])
        (const_int 128 [0x80])) "p9-dform-0.c":26:3 609 {*movdi_internal64}
     (nil))

(insn 63 7 64 2 (set (reg/f:DI 145)
        (mem/u/c:DI (unspec:DI [
                    (symbol_ref/u:DI ("*.LC0") [flags 0x2])
                    (reg:DI 2 2)
                ] UNSPEC_TOCREL) [4  S8 A8])) 609 {*movdi_internal64}
     (expr_list:REG_EQUAL (symbol_ref:DI ("y") [flags 0x80]  <var_decl 0x3fff88d405a0 y>)
        (nil)))

(insn 64 63 6 2 (set (reg/f:DI 144)
        (mem/u/c:DI (unspec:DI [
                    (symbol_ref/u:DI ("*.LC1") [flags 0x2])
                    (reg:DI 2 2)
                ] UNSPEC_TOCREL) [4  S8 A8])) 609 {*movdi_internal64}
     (expr_list:REG_EQUAL (symbol_ref:DI ("x") [flags 0x80]  <var_decl 0x3fff88d40510 x>)
        (nil)))
(insn 6 64 24 2 (set (reg:DF 146 [ sacc ])
        (const_double:DF 0.0 [0x0.0p+0])) "p9-dform-0.c":29:10 512 {*movdf_hardfloat64}
     (nil))
;;  succ:       5 [always]  count:108459 (estimated locally) (FALLTHRU)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp] 126 144 145
;; live  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145

;; basic block 3, loop depth 2, count 67108863 (estimated locally), maybe hot
;; Invalid sum of incoming counts 76504103 (estimated locally), should be 67108863 (estimated locally)
;;  prev block 2, next block 9, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       5 [always]  count:10737418 (estimated locally) (FALLTHRU)
;;              23 [always]  count:65766685 (estimated locally) (FALLTHRU)
;; bb 3 artificial_defs: { }
;; bb 3 artificial_uses: { u8(1){ }u9(2){ }u10(31){ }u11(99){ }u12(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp] 122 125 144 145
;; lr  def 	 119 120 121 122 125 131 133 134 135
;; live  in  	 122 125
;; live  gen 	 119 120 121 122 125 131 133 134 135
;; live  kill	
(code_label 24 6 13 3 3 (nil) [0 uses])
(note 13 24 15 3 [bb 3] NOTE_INSN_BASIC_BLOCK)
(insn 15 13 17 3 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (nil)))
(insn 17 15 18 3 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (nil)))
(insn 18 17 19 3 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))
(insn 19 18 20 3 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))
(insn 20 19 21 3 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))
(insn 21 20 22 3 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))
(insn 22 21 175 3 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))
(insn 175 22 23 3 (set (reg:DI 147)
        (plus:DI (reg:DI 125 [ ivtmp.14 ])
            (const_int 16 [0x10]))) -1
     (nil))
(insn 23 175 25 3 (set (reg:DI 125 [ ivtmp.14 ])
        (reg:DI 147)) 609 {*movdi_internal64}
     (nil))
(insn 25 23 69 3 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
;;  succ:       8 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; live  out 	 122 125

;; basic block 9, loop depth 1, count 10737420 (estimated locally), maybe hot
;; Invalid sum of incoming counts 1342177 (estimated locally), should be 10737420 (estimated locally)
;;  prev block 3, next block 8, flags: (NEW, RTL, MODIFIED)
;;  pred:       22 [2.0% (adjusted)]  count:1342177 (estimated locally) (FALLTHRU,LOOP_EXIT)
;; bb 9 artificial_defs: { }
;; bb 9 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	
;; lr  use 	
;; lr  def 	
;; live  in  	
;; live  gen 	
;; live  kill	
(note 69 25 67 9 [bb 9] NOTE_INSN_BASIC_BLOCK)
;;  succ:       4 [always]  count:10737420 (estimated locally) (FALLTHRU)
;; lr  out 	
;; live  out 	

;; basic block 8, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 9, next block 4, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       3 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 8 artificial_defs: { }
;; bb 8 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	
;; live  in  	 122 125
;; live  gen 	
;; live  kill	
(code_label 67 69 66 8 5 (nil) [0 uses])
(note 66 67 27 8 [bb 8] NOTE_INSN_BASIC_BLOCK)
;;  succ:       10 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 144 145
;; live  out 	 122 125

;; basic block 4, loop depth 1, count 10737418 (estimated locally), maybe hot
;;  prev block 8, next block 7, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       9 [always]  count:10737420 (estimated locally) (FALLTHRU)
;; bb 4 artificial_defs: { }
;; bb 4 artificial_uses: { u30(1){ }u31(2){ }u32(31){ }u33(99){ }u34(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 126 144 145 146
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 126
;; lr  def 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr] 126 136 137
;; live  in  	 109 [vscr] 122 126
;; live  gen 	 4 [4] 33 [1] 109 [vscr] 126 136 137
;; live  kill	 96 [lr]
(note 27 66 28 4 [bb 4] NOTE_INSN_BASIC_BLOCK)
(insn 28 27 29 4 (set (reg:DI 4 4)
        (const_int 512 [0x200])) "p9-dform-0.c":33:5 609 {*movdi_internal64}
     (nil))
(insn 29 28 30 4 (set (reg:DF 33 1)
        (reg/v:DF 122 [ sacc ])) "p9-dform-0.c":33:5 512 {*movdf_hardfloat64}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (nil)))
(call_insn 30 29 31 4 (parallel [
            (call (mem:SI (symbol_ref:DI ("dummy") [flags 0x41]  <function_decl 0x3fff863be100 dummy>) [0 dummy S4 A8])
                (const_int 0 [0]))
            (clobber (reg:DI 96 lr))
        ]) "p9-dform-0.c":33:5 704 {*call_nonlocal_aixdi}
     (expr_list:REG_DEAD (reg:DF 33 1)
        (expr_list:REG_DEAD (reg:DI 4 4)
            (expr_list:REG_CALL_DECL (symbol_ref:DI ("dummy") [flags 0x41]  <function_decl 0x3fff863be100 dummy>)
                (nil))))
    (expr_list (use (reg:DI 2 2))
        (expr_list:DF (use (reg:DF 33 1))
            (expr_list:SI (use (reg:DI 4 4))
                (nil)))))
(insn 31 30 32 4 (set (reg:SI 136)
        (plus:SI (subreg/s/v:SI (reg:DI 126 [ ivtmp_28 ]) 0)
            (const_int -1 [0xffffffffffffffff]))) "p9-dform-0.c":27:3 68 {*addsi3}
     (expr_list:REG_DEAD (reg:DI 126 [ ivtmp_28 ])
        (nil)))
(insn 32 31 33 4 (set (reg:DI 126 [ ivtmp_28 ])
        (zero_extend:DI (reg:SI 136))) "p9-dform-0.c":27:3 19 {zero_extendsidi2}
     (expr_list:REG_DEAD (reg:SI 136)
        (nil)))
(insn 33 32 34 4 (set (reg:CC 137)
        (compare:CC (reg:DI 126 [ ivtmp_28 ])
            (const_int 0 [0]))) "p9-dform-0.c":27:3 730 {*cmpdi_signed}
     (nil))
(jump_insn 34 33 65 4 (set (pc)
        (if_then_else (eq (reg:CC 137)
                (const_int 0 [0]))
            (label_ref 39)
            (pc))) "p9-dform-0.c":27:3 794 {*cbranch}
     (expr_list:REG_DEAD (reg:CC 137)
        (int_list:REG_BR_PROB 10845908 (nil)))
 -> 39)
;;  succ:       7 [99.0% (guessed)]  count:10628959 (estimated locally) (FALLTHRU,DFS_BACK)
;;              6 [1.0% (guessed)]  count:108459 (estimated locally) (LOOP_EXIT)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145 146
;; live  out 	 109 [vscr] 126

;; basic block 7, loop depth 1, count 10628959 (estimated locally), maybe hot
;;  prev block 4, next block 5, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       4 [99.0% (guessed)]  count:10628959 (estimated locally) (FALLTHRU,DFS_BACK)
;; bb 7 artificial_defs: { }
;; bb 7 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145 146
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp]
;; lr  def 	
;; live  in  	 109 [vscr] 126
;; live  gen 	
;; live  kill	
(note 65 34 35 7 [bb 7] NOTE_INSN_BASIC_BLOCK)
;;  succ:       5 [always]  count:10628959 (estimated locally) (FALLTHRU)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145 146
;; live  out 	 109 [vscr] 126

;; basic block 5, loop depth 1, count 10737418 (estimated locally), maybe hot
;;  prev block 7, next block 6, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       2 [always]  count:108459 (estimated locally) (FALLTHRU)
;;              7 [always]  count:10628959 (estimated locally) (FALLTHRU)
;; bb 5 artificial_defs: { }
;; bb 5 artificial_uses: { u45(1){ }u46(2){ }u47(31){ }u48(99){ }u49(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 126 144 145 146
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 110 [sfp] 146
;; lr  def 	 122 125
;; live  in  	 109 [vscr] 126
;; live  gen 	 122 125
;; live  kill	
(code_label 35 65 36 5 2 (nil) [0 uses])
(note 36 35 5 5 [bb 5] NOTE_INSN_BASIC_BLOCK)
(insn 5 36 68 5 (set (reg:DI 125 [ ivtmp.14 ])
        (const_int 0 [0])) "p9-dform-0.c":23:36 609 {*movdi_internal64}
     (nil))
(insn 68 5 39 5 (set (reg/v:DF 122 [ sacc ])
        (reg:DF 146 [ sacc ])) "p9-dform-0.c":29:10 512 {*movdf_hardfloat64}
     (expr_list:REG_DEAD (reg:DF 146 [ sacc ])
        (nil)))
;;  succ:       3 [always]  count:10737418 (estimated locally) (FALLTHRU)
;; lr  out 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp] 122 125 126 144 145 146
;; live  out 	 109 [vscr] 122 125 126

;; basic block 6, loop depth 0, count 108459 (estimated locally), maybe hot
;;  prev block 5, next block 10, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       4 [1.0% (guessed)]  count:108459 (estimated locally) (LOOP_EXIT)
;; bb 6 artificial_defs: { }
;; bb 6 artificial_uses: { u50(1){ }u51(2){ }u52(31){ }u53(99){ }u54(110){ }}
;; lr  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp]
;; lr  use 	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]
;; lr  def 	 0 [0] 3 [3] 4 [4] 5 [5] 6 [6] 7 [7] 8 [8] 9 [9] 10 [10] 11 [11] 12 [12] 13 [13] 32 [0] 33 [1] 34 [2] 35 [3] 36 [4] 37 [5] 38 [6] 39 [7] 40 [8] 41 [9] 42 [10] 43 [11] 44 [12] 45 [13] 64 [0] 65 [1] 66 [2] 67 [3] 68 [4] 69 [5] 70 [6] 71 [7] 72 [8] 73 [9] 74 [10] 75 [11] 76 [12] 77 [13] 78 [14] 79 [15] 80 [16] 81 [17] 82 [18] 83 [19] 96 [lr] 97 [ctr] 98 [ca] 100 [0] 101 [1] 105 [5] 106 [6] 107 [7] 109 [vscr] 138 140 141 142
;; live  in  	 1 [1] 2 [2] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]
;; live  gen 	 3 [3] 109 [vscr] 138 140 141 142
;; live  kill	 96 [lr]
(code_label 39 68 40 6 4 (nil) [1 uses])
(note 40 39 41 6 [bb 6] NOTE_INSN_BASIC_BLOCK)

;; get address of opt_value
(insn 41 40 43 6 (set (reg/f:DI 138)
        (mem/u/c:DI (unspec:DI [
                    (symbol_ref/u:DI ("*.LC2") [flags 0x2])
                    (reg:DI 2 2)
                ] UNSPEC_TOCREL) [4  S8 A8])) "p9-dform-0.c":35:13 609 {*movdi_internal64}
     (expr_list:REG_EQUAL (symbol_ref:DI ("opt_value") [flags 0xc0]  <var_decl 0x3fff88d403f0 opt_value>)
        (nil)))

;; load a float constant (presumably a folded constant expression)
(insn 43 41 44 6 (set (reg:SF 140)
        (mem/u/c:SF (unspec:DI [
                    (symbol_ref/u:DI ("*.LC3") [flags 0x82])
                    (reg:DI 2 2)
                ] UNSPEC_TOCREL) [0  S4 A32])) "p9-dform-0.c":35:13 503 {movsf_hardfloat}
     (expr_list:REG_EQUAL (const_double:SF 1.31072e+5 [0x0.8p+18])
        (nil)))

;; overwrite opt_value
(insn 44 43 45 6 (set (mem/c:SF (reg/f:DI 138) [2 opt_value+0 S4 A32])
        (reg:SF 140)) "p9-dform-0.c":35:13 503 {movsf_hardfloat}
     (expr_list:REG_DEAD (reg:SF 140)
        (expr_list:REG_DEAD (reg/f:DI 138)
            (nil))))

;; get address of opt_desc
(insn 45 44 46 6 (set (reg/f:DI 141)
        (mem/u/c:DI (unspec:DI [
                    (symbol_ref/u:DI ("*.LC4") [flags 0x2])
                    (reg:DI 2 2)
                ] UNSPEC_TOCREL) [4  S8 A8])) "p9-dform-0.c":36:12 609 {*movdi_internal64}
     (expr_list:REG_EQUAL (symbol_ref:DI ("opt_desc") [flags 0xc0]  <var_decl 0x3fff88d40480 opt_desc>)
        (nil)))

;; load the address of *.LC5, then store it into opt_desc (insns 46 and 47)
(insn 46 45 47 6 (set (reg/f:DI 142)
        (unspec:DI [
                (symbol_ref/f:DI ("*.LC5") [flags 0x82]  <var_decl 0x3fff88d41b00 *.LC5>)
                (reg:DI 2 2)
            ] UNSPEC_TOCREL)) "p9-dform-0.c":36:12 685 {*tocrefdi}
     (expr_list:REG_EQUAL (symbol_ref/f:DI ("*.LC5") [flags 0x82]  <var_decl 0x3fff88d41b00 *.LC5>)
        (nil)))

(insn 47 46 48 6 (set (mem/f/c:DI (reg/f:DI 141) [3 opt_desc+0 S8 A64])
        (reg/f:DI 142)) "p9-dform-0.c":36:12 609 {*movdi_internal64}
     (expr_list:REG_DEAD (reg/f:DI 142)
        (expr_list:REG_DEAD (reg/f:DI 141)
            (nil))))

;; call other_dummy

(call_insn 48 47 53 6 (parallel [
            (call (mem:SI (symbol_ref:DI ("other_dummy") [flags 0x41]  <function_decl 0x3fff863be200 other_dummy>) [0 other_dummy S4 A8])
                (const_int 64 [0x40]))
            (clobber (reg:DI 96 lr))
        ]) "p9-dform-0.c":37:3 704 {*call_nonlocal_aixdi}
     (expr_list:REG_CALL_DECL (symbol_ref:DI ("other_dummy") [flags 0x41]  <function_decl 0x3fff863be200 other_dummy>)
        (nil))
    (expr_list (use (reg:DI 2 2))
        (nil)))
(insn 53 48 54 6 (set (reg/i:DI 3 3)
        (const_int 0 [0])) "p9-dform-0.c":38:1 609 {*movdi_internal64}
     (nil))
(insn 54 53 81 6 (use (reg/i:DI 3 3)) "p9-dform-0.c":38:1 -1
     (nil))
;;  succ:       EXIT [always]  count:108459 (estimated locally) (FALLTHRU)
;; lr  out 	 1 [1] 2 [2] 3 [3] 31 [31] 99 [ap] 108 [vrsave] 109 [vscr] 110 [sfp]
;; live  out 	 1 [1] 2 [2] 3 [3] 31 [31] 99 [ap] 109 [vscr] 110 [sfp]

;; basic block 10, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 6, next block 11, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       8 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 10 artificial_defs: { }
;; bb 10 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(note 81 54 71 10 [bb 10] NOTE_INSN_BASIC_BLOCK)

(insn 71 81 72 10 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (nil)))

(insn 72 71 73 10 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (nil)))

(insn 73 72 74 10 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))
(insn 74 73 75 10 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))

(insn 75 74 76 10 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))

(insn 76 75 77 10 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))

(insn 77 76 78 10 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))

(insn 78 77 79 10 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 147)
            (const_int 16 [0x10]))) 69 {*adddi3}
     (nil))

(insn 79 78 84 10 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
;;  succ:       11 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 11, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 10, next block 12, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       10 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 11 artificial_defs: { }
;; bb 11 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(code_label 84 79 83 11 6 (nil) [0 uses])
(note 83 84 96 11 [bb 11] NOTE_INSN_BASIC_BLOCK)
;;  succ:       12 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 12, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 11, next block 13, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       11 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 12 artificial_defs: { }
;; bb 12 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(note 96 83 86 12 [bb 12] NOTE_INSN_BASIC_BLOCK)
(insn 86 96 87 12 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (nil)))

(insn 87 86 88 12 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (nil)))

(insn 88 87 89 12 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))

(insn 89 88 90 12 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))

(insn 90 89 91 12 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))

(insn 91 90 92 12 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))

(insn 92 91 93 12 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))

(insn 93 92 94 12 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 147)
            (const_int 32 [0x20]))) 69 {*adddi3}
     (nil))

(insn 94 93 99 12 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
;;  succ:       13 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 13, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 12, next block 14, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       12 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 13 artificial_defs: { }
;; bb 13 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(code_label 99 94 98 13 7 (nil) [0 uses])
(note 98 99 111 13 [bb 13] NOTE_INSN_BASIC_BLOCK)
;;  succ:       14 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 14, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 13, next block 15, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       13 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 14 artificial_defs: { }
;; bb 14 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(note 111 98 101 14 [bb 14] NOTE_INSN_BASIC_BLOCK)
(insn 101 111 102 14 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (nil)))

(insn 102 101 103 14 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (nil)))

(insn 103 102 104 14 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))

(insn 104 103 105 14 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))

(insn 105 104 106 14 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))

(insn 106 105 107 14 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))

(insn 107 106 108 14 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))

(insn 108 107 109 14 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 147)
            (const_int 48 [0x30]))) 69 {*adddi3}
     (nil))

(insn 109 108 114 14 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
;;  succ:       15 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 15, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 14, next block 16, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       14 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 15 artificial_defs: { }
;; bb 15 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(code_label 114 109 113 15 8 (nil) [0 uses])
(note 113 114 126 15 [bb 15] NOTE_INSN_BASIC_BLOCK)
;;  succ:       16 [always]  count:67108863 (estimated locally) (FALLTHRU)


;; basic block 16, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 15, next block 17, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       15 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 16 artificial_defs: { }
;; bb 16 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(note 126 113 116 16 [bb 16] NOTE_INSN_BASIC_BLOCK)

(insn 116 126 117 16 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (nil)))

(insn 117 116 118 16 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (nil)))

(insn 118 117 119 16 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))

(insn 119 118 120 16 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))

(insn 120 119 121 16 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))

(insn 121 120 122 16 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))

(insn 122 121 123 16 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))

(insn 123 122 124 16 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 147)
            (const_int 64 [0x40]))) 69 {*adddi3}
     (nil))

(insn 124 123 129 16 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))

;;  succ:       17 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 17, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 16, next block 18, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       16 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 17 artificial_defs: { }
;; bb 17 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(code_label 129 124 128 17 9 (nil) [0 uses])
(note 128 129 141 17 [bb 17] NOTE_INSN_BASIC_BLOCK)
;;  succ:       18 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 18, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 17, next block 19, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       17 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 18 artificial_defs: { }
;; bb 18 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(note 141 128 131 18 [bb 18] NOTE_INSN_BASIC_BLOCK)

(insn 131 141 132 18 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (nil)))

(insn 132 131 133 18 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (nil)))

(insn 133 132 134 18 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))

(insn 134 133 135 18 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))

(insn 135 134 136 18 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))

(insn 136 135 137 18 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))

(insn 137 136 138 18 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))

(insn 138 137 139 18 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 147)
            (const_int 80 [0x50]))) 69 {*adddi3}
     (nil))

(insn 139 138 144 18 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))
;;  succ:       19 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 19, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 18, next block 20, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       18 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 19 artificial_defs: { }
;; bb 19 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(code_label 144 139 143 19 10 (nil) [0 uses])
(note 143 144 156 19 [bb 19] NOTE_INSN_BASIC_BLOCK)
;;  succ:       20 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 20, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 19, next block 21, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       19 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 20 artificial_defs: { }
;; bb 20 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(note 156 143 146 20 [bb 20] NOTE_INSN_BASIC_BLOCK)

(insn 146 156 147 20 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (nil)))

(insn 147 146 148 20 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (nil)))

(insn 148 147 149 20 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))

(insn 149 148 150 20 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))

(insn 150 149 151 20 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))

(insn 151 150 152 20 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))

(insn 152 151 153 20 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))

(insn 153 152 154 20 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 147)
            (const_int 96 [0x60]))) 69 {*adddi3}
     (nil))

(insn 154 153 159 20 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))

;;  succ:       21 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 21, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 20, next block 22, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       20 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 21 artificial_defs: { }
;; bb 21 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(code_label 159 154 158 21 11 (nil) [0 uses])
(note 158 159 171 21 [bb 21] NOTE_INSN_BASIC_BLOCK)
;;  succ:       22 [always]  count:67108863 (estimated locally) (FALLTHRU)

;; basic block 22, loop depth 2, count 67108863 (estimated locally), maybe hot
;;  prev block 21, next block 23, flags: (REACHABLE, RTL, MODIFIED)
;;  pred:       21 [always]  count:67108863 (estimated locally) (FALLTHRU)
;; bb 22 artificial_defs: { }
;; bb 22 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(note 171 158 161 22 [bb 22] NOTE_INSN_BASIC_BLOCK)

(insn 161 171 162 22 (set (reg:V2DF 131 [ vect__2.8 ])
        (mem:V2DF (plus:DI (reg/f:DI 145)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: y, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:23 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 145)
        (nil)))

(insn 162 161 163 22 (set (reg:V2DF 133 [ vect__1.5 ])
        (mem:V2DF (plus:DI (reg/f:DI 144)
                (reg:DI 125 [ ivtmp.14 ])) [1 MEM[symbol: x, index: ivtmp.14_25, offset: 0B]+0 S16 A64])) "p9-dform-0.c":31:16 1073 {vsx_movv2df_64bit}
     (expr_list:REG_DEAD (reg/f:DI 144)
        (nil)))

(insn 163 162 164 22 (set (reg:V2DF 121 [ vect__3.9 ])
        (mult:V2DF (reg:V2DF 131 [ vect__2.8 ])
            (reg:V2DF 133 [ vect__1.5 ]))) "p9-dform-0.c":31:20 1108 {*vsx_mulv2df3}
     (expr_list:REG_DEAD (reg:V2DF 133 [ vect__1.5 ])
        (expr_list:REG_DEAD (reg:V2DF 131 [ vect__2.8 ])
            (nil))))

(insn 164 163 165 22 (set (reg:DF 120 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 0 [0])
                ]))) 1259 {vsx_extract_v2df}
     (nil))

(insn 165 164 166 22 (set (reg:DF 119 [ stmp_sacc_16.10 ])
        (plus:DF (reg:DF 120 [ stmp_sacc_16.10 ])
            (reg/v:DF 122 [ sacc ]))) 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg/v:DF 122 [ sacc ])
        (expr_list:REG_DEAD (reg:DF 120 [ stmp_sacc_16.10 ])
            (nil))))

(insn 166 165 167 22 (set (reg:DF 134 [ stmp_sacc_16.10 ])
        (vec_select:DF (reg:V2DF 121 [ vect__3.9 ])
            (parallel [
                    (const_int 1 [0x1])
                ]))) "p9-dform-0.c":31:12 1259 {vsx_extract_v2df}
     (expr_list:REG_DEAD (reg:V2DF 121 [ vect__3.9 ])
        (nil)))

(insn 167 166 168 22 (set (reg/v:DF 122 [ sacc ])
        (plus:DF (reg:DF 134 [ stmp_sacc_16.10 ])
            (reg:DF 119 [ stmp_sacc_16.10 ]))) "p9-dform-0.c":31:12 289 {*adddf3_fpr}
     (expr_list:REG_DEAD (reg:DF 134 [ stmp_sacc_16.10 ])
        (expr_list:REG_DEAD (reg:DF 119 [ stmp_sacc_16.10 ])
            (nil))))

(insn 168 167 169 22 (set (reg:DI 125 [ ivtmp.14 ])
        (plus:DI (reg:DI 147)
            (const_int 112 [0x70]))) 69 {*adddi3}
     (nil))

(insn 169 168 170 22 (set (reg:CCUNS 135)
        (compare:CCUNS (reg:DI 125 [ ivtmp.14 ])
            (const_int 4096 [0x1000]))) 732 {*cmpdi_unsigned}
     (nil))

(jump_insn 170 169 174 22 (set (pc)
        (if_then_else (ne (reg:CCUNS 135)
                (const_int 0 [0]))
            (label_ref:DI 174)
            (pc))) 794 {*cbranch}
     (expr_list:REG_DEAD (reg:CCUNS 135)
        (int_list:REG_BR_PROB 1052266990 (nil)))
 -> 174)
;;  succ:       23 [98.0% (adjusted)]  count:65766686 (estimated locally) (DFS_BACK)
;;              9 [2.0% (adjusted)]  count:1342177 (estimated locally) (FALLTHRU,LOOP_EXIT)

;; basic block 23, loop depth 2, count 65766685 (estimated locally), maybe hot
;;  prev block 22, next block 1, flags: (NEW, REACHABLE, RTL, MODIFIED)
;;  pred:       22 [98.0% (adjusted)]  count:65766686 (estimated locally) (DFS_BACK)
;; bb 23 artificial_defs: { }
;; bb 23 artificial_uses: { u-1(1){ }u-1(2){ }u-1(31){ }u-1(99){ }u-1(110){ }}
(code_label 174 170 173 23 12 (nil) [1 uses])
(note 173 174 0 23 [bb 23] NOTE_INSN_BASIC_BLOCK)
;;  succ:       3 [always]  count:65766685 (estimated locally) (FALLTHRU)
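
In the dump above, each unrolled copy recomputes reg 125 as reg 147 plus a constant (insns 78, 93, 108, ...), and the loads that follow address memory as base register plus reg 125. As a rough illustration of what the pass is after, here is a small C model of the two addressing styles. It is only a sketch and not code from the patch: the function names and the helper dq_form_offset_ok are hypothetical, and it assumes the usual DQ-form constraint for lxv/stxv (signed 16-bit displacement, multiple of 16). It also simplifies by using a single derived base for every copy.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of the patch): true if DISP can be
   encoded as the displacement of a DQ-form access such as lxv or stxv,
   i.e. it fits in a signed 16-bit field and is a multiple of 16.  */
static bool
dq_form_offset_ok (int64_t disp)
{
  return disp >= -32768 && disp <= 32767 && disp % 16 == 0;
}

/* X-form style: every unrolled copy first forms its own index
   (iv + 16 * k, one addi each) and then addresses base + index.  */
static double
sum_xform (const char *base, int64_t iv, int copies)
{
  double acc = 0.0;
  for (int k = 0; k < copies; k++)
    {
      int64_t index = iv + 16 * k;
      double v;
      memcpy (&v, base + index, sizeof v);
      acc += v;
    }
  return acc;
}

/* D-form style: one add forms a derived base, and each copy reaches
   its data through a constant displacement from that base.  */
static double
sum_dform (const char *base, int64_t iv, int copies)
{
  const char *derived = base + iv;
  double acc = 0.0;
  for (int k = 0; k < copies; k++)
    {
      int64_t disp = 16 * k;
      assert (dq_form_offset_ok (disp));
      double v;
      memcpy (&v, derived + disp, sizeof v);
      acc += v;
    }
  return acc;
}
```

Both functions read the same bytes. The difference is that the second needs one address computation per group of accesses instead of one per access, which is where the savings in addi instructions come from.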

Patch

Index: gcc/config/rs6000/rs6000-p9dform.c
===================================================================
--- gcc/config/rs6000/rs6000-p9dform.c	(nonexistent)
+++ gcc/config/rs6000/rs6000-p9dform.c	(working copy)
@@ -0,0 +1,1623 @@ 
+/* Subroutines used to transform array subscripting expressions into
+   forms that are more amenable to D-form instruction selection for
+   Power9 little-endian VSX code.
+   Copyright (C) 1991-2019 Free Software Foundation, Inc.
+
+   This file is part of GCC.
+
+   GCC is free software; you can redistribute it and/or modify it
+   under the terms of the GNU General Public License as published
+   by the Free Software Foundation; either version 3, or (at your
+   option) any later version.
+
+   GCC is distributed in the hope that it will be useful, but WITHOUT
+   ANY WARRANTY; without even the implied warranty of MERCHANTABILITY
+   or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public
+   License for more details.
+
+   You should have received a copy of the GNU General Public License
+   along with GCC; see the file COPYING3.  If not see
+   <http://www.gnu.org/licenses/>.  */
+
+#include "config.h"
+#include "system.h"
+#include "coretypes.h"
+#include "backend.h"
+#include "rtl.h"
+#include "tree.h"
+#include "memmodel.h"
+#include "df.h"
+#include "tm_p.h"
+#include "ira.h"
+#include "print-tree.h"
+#include "varasm.h"
+#include "explow.h"
+#include "expr.h"
+#include "output.h"
+#include "tree-pass.h"
+#include "rtx-vector-builder.h"
+#include "cfgloop.h"
+
+#include "insn-config.h"
+#include "recog.h"
+
+#include "print-rtl.h"
+#include "tree-pretty-print.h"
+
+#include "genrtl.h"
+
+/* This pass transforms array indexing expressions from a form that
+   favors selection of X-form instructions into a form that favors
+   selection of D-form instructions.
+
+   Favoring D-form instructions is especially important when
+   targeting Power9, as the Power9 architecture added a number of new
+   D-form instruction capabilities.
+
+   Consider, for example, the following loop, excerpted from an actual
+   program:
+
+    double sacc, x[], y[], z[];
+    sacc = 0.00;
+    for (unsigned long long int i = 0; i < N; i++) {
+      z[i] = x[i] * y[i];
+      sacc += z[i];
+    }
+
+   Compile this program with the following GCC options, which enable
+   both vectorization and loop unrolling:
+    -m64 -fdump-rtl-all-details -mcpu=power9 -mtune=power9 -funroll-loops -O3
+
+   Without this pass, the loop compiles to the following instruction mix:
+
+   	lxvx:	       16
+	addi:		8
+	xvmuldp:	8
+	stxvx:		8
+	fmr:		8
+	xxpermdi:	8
+	fadd:	       16
+	bdnz:		1
+		      ___
+	      total:   73 instructions
+
+.L3:
+	lxvx 0,29,11
+	lxvx 12,30,11
+	addi 12,11,16
+	addi 0,11,48
+	addi 5,11,64
+	addi 9,11,32
+	addi 6,11,80
+	addi 7,11,96
+	addi 8,11,112
+	lxvx 2,29,12
+	lxvx 3,30,12
+	lxvx 4,29,0
+	lxvx 5,30,0
+	lxvx 10,30,9
+	lxvx 11,29,5
+	xvmuldp 6,0,12
+	lxvx 13,30,5
+	lxvx 8,29,9
+	lxvx 27,29,6
+	lxvx 28,30,6
+	xvmuldp 7,2,3
+	lxvx 29,29,7
+	lxvx 30,30,7
+	xvmuldp 9,4,5
+	lxvx 3,30,8
+	lxvx 0,29,8
+	xvmuldp 8,8,10
+	xvmuldp 10,11,13
+	xvmuldp 11,27,28
+	xxpermdi 26,6,6,3
+	fmr 2,6
+	stxvx 6,31,11
+	xvmuldp 12,29,30
+	addi 11,11,128
+	fadd 1,26,1
+	xxpermdi 26,7,7,3
+	stxvx 7,31,12
+	fmr 27,7
+	xvmuldp 0,0,3
+	xxpermdi 30,9,9,3
+	fmr 31,9
+	stxvx 8,31,9
+	xxpermdi 13,10,10,3
+	xxpermdi 28,8,8,3
+	stxvx 9,31,0
+	stxvx 10,31,5
+	fadd 1,2,1
+	fmr 2,10
+	xxpermdi 3,11,11,3
+	stxvx 11,31,6
+	fmr 4,11
+	fmr 29,8
+	xxpermdi 5,12,12,3
+	fmr 6,12
+	stxvx 12,31,7
+	xxpermdi 9,0,0,3
+	fmr 8,0
+	fadd 7,26,1
+	stxvx 0,31,8
+	fadd 10,27,7
+	fadd 11,28,10
+	fadd 12,29,11
+	fadd 26,30,12
+	fadd 27,31,26
+	fadd 30,13,27
+	fadd 31,2,30
+	fadd 0,3,31
+	fadd 28,4,0
+	fadd 29,5,28
+	fadd 13,6,29
+	fadd 1,9,13
+	fadd 1,8,1
+	bdnz .L3
+
+   With this pass, the same loop is represented by:
+
+   	lxvx:	        4
+	lxv:	       12
+	addi:		2
+	add:		3
+	xvmuldp:	8
+	stxvx:		2
+	stxv:		6
+	fmr:		8
+	xxpermdi:	8
+	fadd:	       16
+	bdnz:		1
+		      ___
+	      total:   70 instructions
+
+.L3:
+	lxvx 0,29,6
+	lxvx 12,30,6
+	addi 10,6,16
+	add 7,29,10
+	add 8,30,10
+	lxvx 2,29,10
+	lxvx 3,30,10
+	add 11,31,10
+	lxv 4,16(7)
+	lxv 9,16(8)
+	xvmuldp 6,0,12
+	lxv 13,64(8)
+	lxv 5,32(7)
+	lxv 28,32(8)
+	lxv 11,64(7)
+	xvmuldp 7,2,3
+	lxv 30,80(7)
+	lxv 12,80(8)
+	lxv 31,48(8)
+	lxv 10,48(7)
+	xvmuldp 8,4,9
+	lxv 3,96(8)
+	lxv 0,96(7)
+	xvmuldp 9,5,28
+	xvmuldp 11,11,13
+	xxpermdi 29,6,6,3
+	fmr 4,6
+	stxvx 6,31,6
+	xvmuldp 12,30,12
+	addi 6,6,128
+	fadd 1,29,1
+	xxpermdi 28,7,7,3
+	fmr 29,7
+	stxvx 7,31,10
+	xvmuldp 10,10,31
+	xxpermdi 30,8,8,3
+	fmr 31,8
+	stxv 8,16(11)
+	xvmuldp 0,0,3
+	xxpermdi 5,11,11,3
+	fmr 6,11
+	xxpermdi 13,9,9,3
+	fadd 1,4,1
+	stxv 11,64(11)
+	xxpermdi 7,12,12,3
+	fmr 8,12
+	stxv 12,80(11)
+	fmr 2,9
+	xxpermdi 3,10,10,3
+	fmr 4,10
+	stxv 9,32(11)
+	stxv 10,48(11)
+	fadd 28,28,1
+	xxpermdi 9,0,0,3
+	fmr 10,0
+	stxv 0,96(11)
+	fadd 11,29,28
+	fadd 29,30,11
+	fadd 12,31,29
+	fadd 30,13,12
+	fadd 31,2,30
+	fadd 13,3,31
+	fadd 2,4,13
+	fadd 1,5,2
+	fadd 0,6,1
+	fadd 3,7,0
+	fadd 4,8,3
+	fadd 5,9,4
+	fadd 1,10,5
+	bdnz .L3
+
+   The optimized loop body replaces 12 lxvx instructions with lxv
+   instructions and 6 stxvx instructions with stxv, and it executes 3
+   fewer add/addi instructions (5 rather than 8).
+
+   This pass runs immediately after pass_loop2.  Loops have already
+   been unrolled.  The pass searches for sequences of code of the following
+   form.  These code sequences often appear within the expanded loop bodies
+   that result from unrolling.  The memory access patterns below match
+   both load and store instructions.  The set of memory operations
+   that derive from the same originating expression are grouped together
+   by this algorithm into a collection identified within the code as an
+   equivalence class.
+
+   A0: *(array_base + offset)
+   ;; The above is known as an originating access to memory.
+
+   Aij: offset += constant (i, j)
+   ;; Between consecutive accesses to memory, there may appear zero or
+   ;; more constant adjustments to the memory offset subexpression.
+
+   Ai: *(array_base + offset)
+   ;; The memory address for each subsequent access to memory differs
+   ;; from the originating memory access by a constant offset, which
+   ;; is computed by adding together all of the preceding constant
+   ;; (i,j) values.
+
+   ;; In any given equivalence class, there may be multiple subsequent
+   ;; memory accesses, identified as A1, A2, ... AN, and there may be
+   ;; multiple constant adjustments to the offset expression between
+   ;; each pair Ai-1 and Ai, where the Ki intervening constant
+   ;; adjustments are identified as Aij for j ranging from 1 to Ki.
+
+   ;; It is required that each element of the matched pattern dominate
+   ;; the element that follows.  In other words, the flow through the
+   ;; various matched elements must be unconditional.  Otherwise, the
+   ;; matched elements cannot be considered to reside within the same
+   ;; equivalence class for purposes of this optimization.
+
+   This pass replaces the above-matched sequences with:
+
+   A0: derived_pointer = array_base + offset
+       *(derived_pointer)
+
+   Aij: leave these alone.  Expect that subsequent optimization deletes
+        this code as it may become dead (since we don't use the
+        indexing expression following our code transformations).
+
+   Ai:
+   *(derived_pointer + constant_i)
+     (where constant_i equals the sum of constant (n,j) for all n from 1
+      to i paired with all j from 1 to Kn).
+
+   Note that there may be multiple equivalence classes, each
+   associated with the same or possibly a different array_base value
+   within each function that is processed by this optimization pass.  */
+
+/* This is based on the union-find logic in web.c.  web_entry_base is
+   defined in df.h.  */
+class indexing_web_entry: public web_entry_base
+{
+ public:
+  rtx_insn *insn;		/* Pointer to the insn */
+  basic_block bb;		/* Pointer to the enclosing basic block */
+
+  /* A unique sequence number is assigned to each instruction for the
+     purpose of simplifying domination tests.  Within each basic
+     block, sequence numbers are assigned in strictly increasing order.
+     Thus, for any two instructions known to reside in the same basic
+     block, the instruction with a lower insn_sequence_no is known
+     to dominate the instruction with a higher insn_sequence_no.  */
+  unsigned int insn_sequence_no;
+
+  /* If this insn is relevant, it is a load or store with a memory
+     address that is comprised of a base pointer (e.g. the address of
+     an array or array slice) and an index expression (e.g. an index
+     within the array).  The original_base_use and original_index_use
+     fields represent the numbers of the instructions that define the
+     base and index values which are summed together with a constant
+     value to determine the value of this instruction's memory
+     address.  */
+  unsigned int original_base_use;
+  unsigned int original_index_use;
+
+  /* If this insn is relevant, the register assigned by insn
+     original_base_use is original_base_reg.  The register assigned by
+     insn original_index_use is original_index_reg.  */
+  unsigned int original_base_reg;
+  unsigned int original_index_reg;
+
+  /* If this insn is relevant, these are the constants that are added
+     to the originating base and index expressions, respectively, to
+     calculate the value of this insn's memory address.  */
+  int base_delta;
+  int index_delta;
+
+  /* If this insn is relevant, it belongs to an equivalence class.
+     The equivalence classes are identified by the definitions that
+     define the inputs to this insn.   */
+  unsigned int base_equivalence_hash;
+  unsigned int index_equivalence_hash;
+
+  /* When multiple insns fall within the same equivalence class, they
+     are linked together through this field.  The value UINT_MAX
+     represents the end of this list.  */
+  unsigned int equivalent_sibling;
+
+  /* Only instructions that represent loads or stores for which the
+     memory address computation is in a particular simple form are
+     considered relevant to this d-form optimization pass.
+
+     If a particular entry is identified as is_relevant == false, the
+     values of the following fields are all undefined: is_load,
+     is_store, is_originating, original_base_use, original_index_use,
+     original_base_reg, original_index_reg, base_delta, index_delta,
+     base_equivalence_hash, index_equivalence_hash, and
+     equivalent_sibling.  */
+  unsigned int is_relevant : 1;
+  unsigned int is_load : 1;
+  unsigned int is_store : 1;
+  unsigned int is_originating : 1;
+};
+
+/* Count how many definitions reach the use that is represented by the
+   DEF_LINK argument.  */
+static unsigned int
+count_links (struct df_link *def_link)
+{
+  unsigned int result;
+  for (result = 0; def_link != NULL; result++)
+    def_link = def_link->next;
+  return result;
+}
+
+static unsigned int max_use_links = 0;
+
+/* Helper comparison function for use by qsort.  */
+static int
+int_compare (const void *v1, const void *v2)
+{
+  const int *i1 = (const int *) v1;
+  const int *i2 = (const int *) v2;
+  return *i1 - *i2;
+}
+
+/* Calculate the hash value for the use represented by DEF_LINK, given
+   that COUNT definitions are known to reach this use.
+
+   Complexity is n*log(n) (for qsort) where n is COUNT.  */
+static unsigned int
+help_hash (unsigned int count, struct df_link *def_link)
+{
+  int *ids;
+  int i = 0;
+
+  ids = (int *) alloca (count * sizeof (ids[0]));
+  if (count > max_use_links)
+    max_use_links = count;
+
+  while (def_link != NULL)
+    {
+      ids[i++] = DF_REF_ID (def_link->ref);
+      def_link = def_link->next;
+    }
+
+  /* Sort to put ids in ascending order.  */
+  qsort ((void *) ids, count, sizeof (ids[0]), int_compare);
+
+  unsigned int result = 0;
+  for (unsigned i = 0; i < count; i++)
+    {
+      result = (result << 6) ^ ((result >> 28) & 0x0f);
+      result += ids[i];
+    }
+  return result;
+}
+
+/* Calculate a hash code that represents all of the definitions that
+   contribute to a given variable's computed value.  */
+static unsigned int
+equivalence_hash (struct df_link *def_link)
+{
+  unsigned int count = count_links (def_link);
+  return help_hash (count, def_link);
+}
+
+static void
+overwrite_defs_header (indexing_web_entry *insn_entry, unsigned int uid,
+		       df_ref use, struct df_link **header)
+{
+  struct df_link *def_link = DF_REF_CHAIN (use);
+
+  /* If there is no def, or if the single used variable has
+     multiple definitions, or the single used variable's
+     definition is artificial, or if there are multiple used
+     variables, this is an originating use.  */
+  if (!def_link || !def_link->ref
+      || DF_REF_IS_ARTIFICIAL (def_link->ref) || def_link->next)
+    *header = def_link;
+  else
+    {
+      unsigned int uid2 = insn_entry[uid].original_base_use;
+      df_ref use2;
+      if (uid2 > 0)
+	{
+	  rtx_insn *insn2 = insn_entry[uid2].insn;
+	  struct df_insn_info *insn_info2 = DF_INSN_INFO_GET (insn2);
+	  use2 = DF_INSN_INFO_USES (insn_info2);
+	  if (use2)
+	    *header = DF_REF_CHAIN (use2);
+	  else
+	    *header = NULL;
+	}
+    }
+}
+
+static void
+help_find_defs (indexing_web_entry *insn_entry,
+		unsigned int uid, rtx base_reg, rtx index_reg,
+		struct df_link **insn_base, struct df_link **insn_index)
+{
+  rtx_insn *insn = insn_entry[uid].insn;
+  struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+  df_ref use;
+
+  /* Iterate over the definitions used by this insn: loop iterates at
+     most 3 times, once for definitions of base register, once for
+     definitions of index register, and, for store operations, once
+     for the value stored to memory.  */
+  FOR_EACH_INSN_INFO_USE (use, insn_info)
+    {
+      if (rtx_equal_p (DF_REF_REG (use), base_reg))
+	overwrite_defs_header (insn_entry, uid, use, insn_base);
+      else if (rtx_equal_p (DF_REF_REG (use), index_reg))
+	overwrite_defs_header (insn_entry, uid, use, insn_index);
+    }
+}
+
+/* Find the linked list of definitions that define the base register
+   and the linked list of definitions that define the index register
+   where INSN is known to represent a load or store operation and the
+   relevant memory address is represented as a sum or difference of a
+   base register and an index offset register.  Pointers to the
+   respective linked lists of definitions are saved to the *INSN_BASE
+   and *INSN_INDEX locations respectively.  */
+static void
+find_defs (indexing_web_entry *insn_entry, rtx_insn *insn,
+	   struct df_link **insn_base, struct df_link **insn_index)
+{
+  unsigned int uid = INSN_UID (insn);
+  rtx body = PATTERN (insn);
+  rtx mem = NULL;
+
+  gcc_assert (GET_CODE (body) == SET);
+
+  if (MEM_P (SET_SRC (body)))
+    mem = XEXP (SET_SRC (body), 0);
+  else if (MEM_P (SET_DEST (body)))
+    mem = XEXP (SET_DEST (body), 0);
+  else
+    gcc_unreachable ();
+
+  enum rtx_code code = GET_CODE (mem);
+  gcc_assert ((code == PLUS) || (code == MINUS));
+
+  rtx base_reg = XEXP (mem, 0);
+  rtx index_reg = XEXP (mem, 1);
+  if (REG_P (base_reg) && REG_P (index_reg))
+    help_find_defs (insn_entry, uid,
+		    base_reg, index_reg, insn_base, insn_index);
+}
+
+/* Return true if and only if USE represents a compile-time constant.  */
+static bool
+represents_constant_p (df_ref use)
+{
+  struct df_link *def_link = DF_REF_CHAIN (use);
+
+  /* If there is no definition, or there are multiple definitions,
+     or the definition is artificial, this is an originating use which
+     is not a constant.  */
+  if (!def_link || !def_link->ref
+      || DF_REF_IS_ARTIFICIAL (def_link->ref) || def_link->next)
+    return false;
+  else
+    {
+      rtx def_insn = DF_REF_INSN (def_link->ref);
+      rtx body = PATTERN (def_insn);
+      if (CONST_INT_P (body))
+	return true;
+      else if (GET_CODE (body) == SET)
+	{
+	  /* Recurse on the use that defines this variable.  */
+	  struct df_insn_info *inner_insn_info = DF_INSN_INFO_GET (def_insn);
+	  df_ref inner_use;
+	  FOR_EACH_INSN_INFO_USE (inner_use, inner_insn_info)
+	    {
+	      if (!represents_constant_p (inner_use))
+		return false;
+	    }
+	  /* All of the definitions used are constant.  */
+	  return true;
+	}
+      else
+	return false;	/* Treat unrecognized codes as not constant.  */
+    }
+}
+
+/* This function helps analyze opportunities for copy propagation and
+   constant folding.
+
+   An originator represents the first point at which the value of
+   DEF_LINK is derived from potentially more than one input
+   definition, or the point at which DEF_LINK's value is defined by an
+   algebraic expression involving only constants.
+
+   If DEF_LINK's value depends on a constant combined with a single
+   variable or a simple propagation of a single variable, continue
+   the search for the originator by examining the origin of the source
+   variable's value.
+
+   The value of *ADJUSTMENT is overwritten with the constant value that is
+   added to the originator expression to obtain the value intended to
+   be represented by DEF_LINK.  In the case that find_true_originator
+   returns NULL, the value held in *ADJUSTMENT is undefined.
+
+   Returns NULL if there is no single true originator.  In general, the search
+   for an originator expression only spans SET operations that are
+   based on simple algebraic expressions, each of which involves no
+   more than one variable input.
+
+   Complexity: Linear in the number of definitions used by this
+    instruction multiplied (for recursive calls) by the maximum depth
+    of the potential copy propagation chain.
+  */
+static rtx
+find_true_originator (struct df_link *def_link, long long int *adjustment)
+{
+  rtx def_insn = DF_REF_INSN (def_link->ref);
+
+  rtx inner_body = PATTERN (def_insn);
+  if (GET_CODE (inner_body) == SET)
+    {
+      struct df_insn_info *inner_insn_info = DF_INSN_INFO_GET (def_insn);
+      df_ref inner_use;
+
+      /* We're only happy with multiple uses if all but one represent
+	 constant values.  */
+      int non_constant_uses = 0;
+      rtx result = NULL;
+      FOR_EACH_INSN_INFO_USE (inner_use, inner_insn_info)
+	{
+	  if (!represents_constant_p (inner_use))
+	    {
+	      non_constant_uses++;
+	      /* There should be only one non-constant use, and it should
+		 satisfy find_true_originator.  */
+	      struct df_link *def_link = DF_REF_CHAIN (inner_use);
+
+	      /* If there is no definition, or there are multiple definitions,
+		 or the definition is artificial, this is an
+		 originating use.  */
+	      if (!def_link || !def_link->ref
+		  || DF_REF_IS_ARTIFICIAL (def_link->ref) || def_link->next)
+		result = def_insn;
+	      else
+		result = find_true_originator (def_link, adjustment);
+	    }
+	}
+
+      /* If non_constant_uses > 1, the value of result is not well
+	 defined because it is overwritten during multiple iterations
+	 of the above loop.  In the case that non_constant_uses > 1,
+	 we ignore the result value computed above.  */
+      if (non_constant_uses == 1) {
+
+	/* If my SET looks like a simple register copy, or if it looks
+	   like PLUS or MINUS of a constant and a register, this is
+	   what we optimize.  Otherwise, punt.  */
+
+	if (result == NULL)
+	  /* Doing constant arithmetic with unknown originator is not
+	     useful.  */
+	  return def_insn;
+
+	rtx source_expr = SET_SRC (inner_body);
+	int source_code = GET_CODE (source_expr);
+	if (source_code == PLUS)
+	  {
+	    rtx op1 = XEXP (source_expr, 0);
+	    rtx op2 = XEXP (source_expr, 1);
+
+	    if (CONST_INT_P (op1) && !CONST_INT_P (op2))
+	      *adjustment += INTVAL (op1);
+	    else if (!CONST_INT_P (op1) && CONST_INT_P (op2))
+	      *adjustment += INTVAL (op2);
+	  }
+	else if (source_code == MINUS)
+	  {
+	    rtx op1 = XEXP (source_expr, 0);
+	    rtx op2 = XEXP (source_expr, 1);
+
+	    if (!CONST_INT_P (op1) && CONST_INT_P (op2))
+	      *adjustment -= INTVAL (op2);
+	    else
+	      /* assumption is that *adjustment is added to a positive variable
+		 expression, so don't optimize this rare condition.  */
+	      result = def_insn;
+	  }
+	else if (source_code != REG)
+	  /* We don't handle ASHIFT, UNSPEC, etc.  */
+	  result = def_insn;
+	/* else, register copy expression does not impact adjustment.  */
+
+	return result;
+      }
+      else
+	/* Same behavior if there are too many non-constant inputs or if
+	   all inputs are constant.  */
+	return def_insn;
+    }
+  else
+    /* This is not a SET.  It does not serve as a true originator. */
+    return NULL;
+}
+
+/* The size of the insn_entry array.  Note that this array does not
+   represent instructions created during this optimization pass.  */
+static unsigned int max_uid_at_start;
+
+/* Return true if and only if ELEMENT is on LIST.  */
+static bool
+in_use_list (struct df_link *list, struct df_link *element)
+{
+  while (list != NULL)
+    {
+      if (element->ref == list->ref)
+	return true;
+      list = list->next;
+    }
+  /* Got to end of list without finding element.  */
+  return false;
+}
+
+/* Return true iff the instruction represented by uid_1 is in the same
+   equivalence class as the instruction represented by uid_2.
+
+   Returns false generally in constant time (based on unequal hash
+   values).  Returning true, as currently implemented, requires
+   quadratic time in the number of definitions that contribute to
+   either the base or index expressions of the equivalence class.
+
+   To improve complexity to linear in the number of contributing
+   definitions, allocate and remember the sorted list of all
+   definitions for each relevant value.   */
+static bool
+equivalent_p (indexing_web_entry *insn_entry,
+	      unsigned int uid_1, unsigned int uid_2)
+{
+  if ((insn_entry[uid_1].base_equivalence_hash
+       != insn_entry[uid_2].base_equivalence_hash)
+      || (insn_entry[uid_1].index_equivalence_hash
+	  != insn_entry[uid_2].index_equivalence_hash))
+    return false;
+
+  /* Hash codes match.  Check details.  */
+  rtx_insn *insn_1, *insn_2;
+  insn_1 = insn_entry[uid_1].insn;
+  insn_2 = insn_entry[uid_2].insn;
+
+  struct df_link *insn1_base_defs, *insn1_index_defs;
+  struct df_link *insn2_base_defs, *insn2_index_defs;
+
+  find_defs (insn_entry, insn_1, &insn1_base_defs, &insn1_index_defs);
+  find_defs (insn_entry, insn_2, &insn2_base_defs, &insn2_index_defs);
+
+  int base_count_1 = count_links (insn1_base_defs);
+  int index_count_1 = count_links (insn1_index_defs);
+  int base_count_2 = count_links (insn2_base_defs);
+  int index_count_2 = count_links (insn2_index_defs);
+
+  if ((base_count_1 != base_count_2) || (index_count_1 != index_count_2))
+    return false;
+
+  if (insn_entry [uid_1].original_base_reg
+      != insn_entry [uid_2].original_base_reg)
+    return false;
+  else if (insn_entry [uid_1].original_index_reg
+	   != insn_entry [uid_2].original_index_reg)
+    return false;
+
+  /* Counts are the same.  Make sure elements match.  */
+  /* The following comparison code is quadratic in counts.  Improve to
+     n*log(n) by sorting the four arrays and comparing elements pairwise.
+     Improve to linear by remembering the sorted contents of each of
+     the four arrays from when the hash values were first computed.
+     Since the typical sizes of the count variables are fairly small,
+     leave as is unless performance measurements justify increased
+     complexity.  */
+  while (insn1_base_defs != NULL)
+    {
+      if (!in_use_list (insn2_base_defs, insn1_base_defs))
+	return false;
+      insn1_base_defs = insn1_base_defs->next;
+    }
+  /* Base patterns match, but still need to consider index matches.  */
+  while (insn1_index_defs != NULL)
+    {
+      if (!in_use_list (insn2_index_defs, insn1_index_defs))
+	return false;
+      insn1_index_defs = insn1_index_defs->next;
+    }
+
+  return true;
+}
+
+/* Return true iff instruction E2 dominates instruction E1.  Note
+   that insn_dominated_by_p defined in ira.c is declared static and
+   requires initialization of auxiliary data not computed in this
+   context.  */
+static bool
+insn_dominated_by_p (indexing_web_entry *e1, indexing_web_entry *e2)
+{
+  basic_block bb1 = e1->bb;
+  basic_block bb2 = e2->bb;
+
+  if (bb1 == bb2)
+    return e2->insn_sequence_no <= e1->insn_sequence_no;
+  else
+    return dominated_by_p (CDI_DOMINATORS, bb1, bb2);
+}
+
+/* Confirm that everything in the equivalence class is eligible for
+   representation as a d-form insn.  Return REPLACEMENT_COUNT minus
+   the number of entries whose offset from THE_DOMINATOR is not
+   supported as a d-form offset.  */
+static unsigned int
+confirm_dform_insns (indexing_web_entry *insn_entry,
+		     unsigned int *equivalence_hash, unsigned int index,
+		     unsigned int the_dominator, unsigned replacement_count)
+{
+  unsigned int uid;
+  long long int dominator_delta = (insn_entry [the_dominator].base_delta
+				   + insn_entry [the_dominator].index_delta);
+  for (uid = equivalence_hash [index]; uid != UINT_MAX;
+       uid = insn_entry [uid].equivalent_sibling)
+    {
+      if (uid != the_dominator)
+	{
+	  long long int dominated_delta = (insn_entry [uid].base_delta
+					   + insn_entry [uid].index_delta);
+	  dominated_delta -= dominator_delta;
+
+	  rtx_insn *insn = insn_entry [uid].insn;
+	  rtx body = PATTERN (insn);
+	  rtx mem;
+
+	  gcc_assert (GET_CODE (body) == SET);
+
+	  if (MEM_P (SET_SRC (body))) /* load */
+	    {
+	      mem = SET_SRC (body);
+	      if (!rs6000_target_supports_dform_offset_p
+		  (GET_MODE (mem), dominated_delta))
+		replacement_count--;
+	    }
+	  else
+	    {
+	      mem = SET_DEST (body); /* store */
+	      if (!rs6000_target_supports_dform_offset_p
+		  (GET_MODE (mem), dominated_delta))
+		replacement_count--;
+	    }
+	}
+    }
+  return replacement_count;
+}
+
+/* Replace all xform insns in equivalence class with dform insns.  */
+static void
+replace_xform_insns (indexing_web_entry *insn_entry,
+		     unsigned int *equivalence_hash,
+		     unsigned int index, unsigned int the_dominator)
+{
+  /* First, fix up the_dominator instruction.  */
+  rtx derived_ptr_reg = gen_reg_rtx (Pmode);
+  rtx_insn *insn = insn_entry [the_dominator].insn;
+  rtx body = PATTERN (insn);
+  rtx base_reg, index_reg;
+  rtx addr, mem;
+  rtx new_init_expr;
+
+  if (dump_file)
+    {
+      fprintf (dump_file,
+	       "Endeavoring to replace originating insn %d: ", the_dominator);
+      print_inline_rtx (dump_file, insn, 2);
+      fprintf (dump_file, "\n");
+    }
+
+  gcc_assert (GET_CODE (body) == SET);
+  if (MEM_P (SET_SRC (body)))
+    {
+      /* originating instruction is a load */
+      mem = SET_SRC (body);
+      addr = XEXP (SET_SRC (body), 0);
+    }
+  else
+    { /* originating instruction is a store */
+      gcc_assert (MEM_P (SET_DEST (body)));
+      mem = SET_DEST (body);
+      addr = XEXP (SET_DEST (body), 0);
+    }
+
+  enum rtx_code code = GET_CODE (addr);
+  gcc_assert ((code == PLUS) || (code == MINUS));
+  base_reg = XEXP (addr, 0);
+  index_reg = XEXP (addr, 1);
+
+  if (code == PLUS)
+    new_init_expr = gen_rtx_PLUS (Pmode, base_reg, index_reg);
+  else
+    new_init_expr = gen_rtx_MINUS (Pmode, base_reg, index_reg);
+  new_init_expr = gen_rtx_SET (derived_ptr_reg, new_init_expr);
+
+  rtx_insn *new_insn = emit_insn_before (new_init_expr, insn);
+  set_block_for_insn (new_insn, BLOCK_FOR_INSN (insn));
+  INSN_CODE (new_insn) = -1; /* Force re-recognition.  */
+  df_insn_rescan (new_insn);
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "with insn %d: ", INSN_UID (new_insn));
+      print_inline_rtx (dump_file, new_insn, 2);
+      fprintf (dump_file, "\n");
+    }
+
+  /* If dominator_delta != 0, we need to make adjustments for dominator_delta
+     in the D-form constant offsets associated with the propagating
+     instructions.  */
+
+  rtx new_mem = gen_rtx_MEM (GET_MODE (mem), derived_ptr_reg);
+  MEM_COPY_ATTRIBUTES (new_mem, mem);
+  rtx new_expr;
+  if (insn_entry [the_dominator].is_load)
+    new_expr = gen_rtx_SET (SET_DEST (body), new_mem);
+  else
+    new_expr = gen_rtx_SET (new_mem, SET_SRC (body));
+
+  if (!validate_change (insn, &PATTERN(insn), new_expr, false))
+    {	/* proposed change was rejected */
+      if (dump_file)
+	{
+	  fprintf (dump_file,
+		   "Dform optimization rejected by validate_change\n");
+	  print_inline_rtx (dump_file, new_expr, 2);
+	  fprintf (dump_file, "\n");
+	}
+    }
+  else if (dump_file)
+    {
+      fprintf (dump_file, "and with insn %d: ", INSN_UID (insn));
+      print_inline_rtx (dump_file, insn, 2);
+      fprintf (dump_file, "\n");
+    }
+
+  for (unsigned int uid = equivalence_hash [index]; uid != UINT_MAX;
+       uid = insn_entry [uid].equivalent_sibling)
+    {
+      if (uid != the_dominator)
+	{
+	  long long int dominated_delta = (insn_entry [uid].base_delta
+					   + insn_entry [uid].index_delta);
+	  long long int dominator_delta
+	    = (insn_entry [the_dominator].base_delta
+	       + insn_entry [the_dominator].index_delta);
+	  dominated_delta -= dominator_delta;
+
+	  rtx_insn *insn = insn_entry [uid].insn;
+	  rtx body = PATTERN (insn);
+	  rtx mem;
+
+	  if (dump_file)
+	    {
+	      fprintf (dump_file,
+		       "Endeavoring to replace propagating insn %d: ", uid);
+	      print_inline_rtx (dump_file, insn, 2);
+	      fprintf (dump_file, "\n");
+	    }
+
+	  gcc_assert (GET_CODE (body) == SET);
+	  if (MEM_P (SET_SRC (body))) /* load */
+	    mem = SET_SRC (body);
+	  else
+	    mem = SET_DEST (body); /* store */
+
+	  rtx ci = GEN_INT (dominated_delta);
+	  rtx addr_expr = gen_rtx_PLUS (Pmode,
+					derived_ptr_reg, ci);
+	  rtx new_mem = gen_rtx_MEM (GET_MODE (mem), addr_expr);
+	  MEM_COPY_ATTRIBUTES (new_mem, mem);
+
+	  rtx new_expr;
+	  if (insn_entry [uid].is_load)
+	    new_expr = gen_rtx_SET (SET_DEST (body), new_mem);
+	  else
+	    new_expr = gen_rtx_SET (new_mem, SET_SRC (body));
+
+	  if (!validate_change (insn, &PATTERN(insn), new_expr, false))
+	    {	/* proposed change was rejected */
+	      if (dump_file)
+		{
+		  fprintf (dump_file,
+			   "Dform optimization rejected by validate_change\n");
+		  print_inline_rtx (dump_file, new_expr, 2);
+		  fprintf (dump_file, "\n");
+		}
+	    }
+	  else if (dump_file)
+	    {
+	      fprintf (dump_file, "with insn %d: ", INSN_UID (insn));
+	      print_inline_rtx (dump_file, insn, 2);
+	      fprintf (dump_file, "\n");
+	    }
+	}
+    }
+}
+
+/* Organize all "relevant" instructions into equivalence classes.
+   Relevant instructions are instructions that load or store memory
+   where the memory address is represented by a sum or difference of a
+   base address register and an integer offset register.
+
+   An equivalence class holds all of the load and store operations
+   that refer to the same computed base and/or index variables plus or
+   minus some constant value.
+
+   All expressions in each equivalence class are replaced with d-form
+   instructions in the emitted code on P9 or above.
+
+   Calculation of hash functions is linear in the number of
+   definitions that potentially contribute to the computation of a
+   particular variable's value.
+
+   Assume hash table insertion and lookup is constant time (i.e. we
+   normally do not experience collision on hash values).
+
+   Complexity of this function is O(m*n) where m is number of
+   equivalence classes and n is maximum number of entries in the
+   equivalence class.  */
+static void
+build_and_fixup_equivalence_classes (indexing_web_entry *insn_entry)
+{
+  unsigned int i;
+  /* There can be no more equivalence classes than the total number of
+     instructions in the analyzed function.  Usually, there are far
+     fewer classes because many of the instructions are not load
+     or store instructions, and some of those that are load and store
+     instructions may end up in the same equivalence class.  */
+  unsigned int *equivalence_hash =
+    (unsigned int *) alloca (max_uid_at_start * sizeof (unsigned int));
+
+  /* Initialize the equivalence_hash array.  */
+  for (i = 0; i < max_uid_at_start; i++)
+    equivalence_hash [i] = UINT_MAX;
+
+  /* Place each relevant instruction into an equivalence class, either
+     a class consisting only of itself, or a class that includes other
+     relevant instructions.
+
+     Complexity of this loop is O(m*n^2) where m is the number of
+     relevant instructions and n is number of definitions of the
+     registers that hold the base or index components of the memory
+     operation's address.  */
+  for (unsigned int uid = 0; uid < max_uid_at_start; uid++)
+    {
+      if (insn_entry [uid].is_relevant)
+	{
+	  unsigned int hash = ((insn_entry [uid].base_equivalence_hash
+				+ insn_entry [uid].index_equivalence_hash)
+			       % max_uid_at_start);
+
+	  if (equivalence_hash [hash] == UINT_MAX)
+	    {			/* first mention of this class */
+	      equivalence_hash [hash] = uid;
+	      insn_entry [uid].equivalent_sibling = UINT_MAX;
+	    }
+	  else
+	    {
+	      while ((equivalence_hash [hash] != UINT_MAX)
+		     && !equivalent_p (insn_entry, uid,
+				       equivalence_hash [hash]))
+		hash = (hash + 1) % max_uid_at_start;
+
+	      if (equivalence_hash [hash] != UINT_MAX)
+		{
+		  /* Found an equivalent instruction. */
+		  insn_entry [uid].equivalent_sibling =
+		    equivalence_hash [hash];
+		  equivalence_hash [hash] = uid;
+		}
+	      else
+		{
+		  /* Equivalence class doesn't yet exist.  */
+		  equivalence_hash [hash] = uid;
+		  insn_entry [uid].equivalent_sibling = UINT_MAX;
+		}
+	    }
+	}
+    }
+
+  /* Scrutinize each equivalence class.  For any entries in the
+     equivalence class that are on conditional control flows (such
+     that they do not dominate the other entries), remove these
+     entries from the equivalence class.  */
+  for (unsigned int i = 0; i < max_uid_at_start; i++)
+    {
+      while (equivalence_hash [i] != UINT_MAX)
+	{
+	  unsigned int the_dominator = equivalence_hash [i];
+	  unsigned int uid;
+
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, "Equivalence class consists of\n");
+
+	  /* Find the dominator for this equivalence class.
+
+	     Complexity of following loop body is O(m*n) where m is number
+	     of equivalence classes and n is maximum number of entries
+	     in the equivalence class.  */
+	  for (uid = the_dominator; uid != UINT_MAX;
+	       uid = insn_entry [uid].equivalent_sibling)
+	    {
+	      if (insn_dominated_by_p (&insn_entry [the_dominator],
+				       &insn_entry [uid]))
+		the_dominator = uid;
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		fprintf (dump_file, "  member: %d\n", uid);
+	    }
+
+	  unsigned int size_of_equivalence = 0;
+	  unsigned int removed_partition = UINT_MAX;
+	  unsigned int preceding_uid = UINT_MAX;
+	  unsigned int next_uid;
+
+	  /* Having found a dominator, remove from this equivalence
+	     class any element that is not dominated by the_dominator.
+
+	     Complexity of following loop body is O(m*n) where m is
+	     number of equivalence classes and n is maximum number of
+	     entries in the equivalence class.  */
+	  for (uid = equivalence_hash [i]; uid != UINT_MAX; uid = next_uid)
+	    {
+	      next_uid = insn_entry [uid].equivalent_sibling;
+	      if (!insn_dominated_by_p (&insn_entry [uid],
+					&insn_entry [the_dominator]))
+		{
+		  /* insn uid thinks it's in this equivalence class, but
+		     it's not dominated by the_dominator, so remove it.  */
+		  insn_entry [uid].equivalent_sibling = removed_partition;
+		  removed_partition = uid;
+		  if (preceding_uid == UINT_MAX)
+		    equivalence_hash [i] = next_uid;
+		  else
+		    insn_entry [preceding_uid].equivalent_sibling = next_uid;
+		}
+	      else
+		{
+		  size_of_equivalence++;
+		  preceding_uid = uid;
+		}
+	    }
+
+	  /* Complexity contribution of confirm_dform_insns is O(m*n) where
+	     m is number of equivalence classes and n is maximum
+	     number of entries in the equivalence class.  */
+	  if (confirm_dform_insns (insn_entry, equivalence_hash, i,
+				   the_dominator, size_of_equivalence) > 1)
+	    {
+
+	      /* Complexity contribution of replace_xform_insns is
+		 O(m*n) where m is number of equivalence classes and n
+		 is maximum number of entries in the equivalence class.  */
+	      replace_xform_insns (insn_entry, equivalence_hash,
+				   i, the_dominator);
+	    }
+	  else if (dump_file)
+	    {
+	      fprintf (dump_file,
+		       "Abandoning dform optimization: too few dform insns\n");
+	    }
+
+	  /* If removed_partition != UINT_MAX, we need to reprocess the
+	     contents of the removed_partition.  There may be
+	     additional opportunity to optimize within the set of
+	     insns that were not dominated by the selected dominator.
+
+	     Each time through this loop, at least one dominator and
+	     any instructions it dominates are "processed".  Anything
+	     not dominated by the selected dominator remains in the
+	     "removed partition".  The "removed partition" gets
+	     smaller on each iteration, assuring eventual termination.  */
+	  equivalence_hash [i] = removed_partition;
+	}
+    }
+}
+
+/* Assess whether the instruction represented by uid is relevant,
+   setting *is_relevant to true if so, and setting *is_originating to
+   true if this use is an originating definition.
+
+   If the insn is determined to be relevant and is_base is true, overwrite
+   the base_delta, original_base_reg, original_base_use, and
+   base_equivalence_hash fields of insn_entry[uid].
+
+   Otherwise, if the insn is determined to be relevant and is_base is
+   not true, overwrite the index_delta, original_index_reg,
+   original_index_use, and index_equivalence_hash fields of
+   insn_entry[uid].
+
+   In case the insn is not determined to be relevant, certain
+   fields of insn_entry[uid] may be overwritten with scratch values
+   that have no significance, as these fields are not consulted
+   subsequently in the case that the insn is not relevant.
+
+   Complexity: linear in the number of entries on the
+   use-definition chain, as represented by argument USE.
+*/
+
+static void
+assess_use_relevance (bool is_base, rtx reg, bool *is_relevant,
+		      bool *is_originating, df_ref use,
+		      unsigned int uid, indexing_web_entry *insn_entry)
+{
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, "Found use corresponding to %s\n",
+	       is_base? "base": "index");
+      df_ref_debug (use, dump_file);
+    }
+  struct df_link *def_link = DF_REF_CHAIN (use);
+
+  /* If there is no definition, or there are multiple definitions, or
+     the definition is artificial, this is an originating use.  */
+  if (!def_link || !def_link->ref
+      || DF_REF_IS_ARTIFICIAL (def_link->ref) || def_link->next)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "Use is originating!\n");
+      *is_relevant = true;
+      *is_originating = true;
+      unsigned int hash = equivalence_hash (def_link);
+
+      if (is_base)
+	{
+	  insn_entry[uid].base_delta = 0;
+	  insn_entry[uid].original_base_reg = REGNO (reg);
+	  insn_entry[uid].original_base_use = uid;
+	  insn_entry[uid].base_equivalence_hash = hash;
+	}
+      else
+	{
+	  insn_entry[uid].index_delta = 0;
+	  insn_entry[uid].original_index_reg = REGNO (reg);
+	  insn_entry[uid].original_index_use = uid;
+	  insn_entry[uid].index_equivalence_hash = hash;
+	}
+    }
+  else
+    {
+      /* Only one definition.  Dig deeper.  */
+      long long int delta = 0;
+      rtx insn2 = find_true_originator (def_link, &delta);
+      if (insn2)
+	{
+	  unsigned uid2 = INSN_UID (insn2);
+	  df_ref use2;
+
+	  if (dump_file  && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, "Use may propagate from %d\n", uid2);
+
+	  struct df_insn_info *insn_info2 = DF_INSN_INFO_GET (insn2);
+
+	  if (insn_info2)
+	    use2 = DF_INSN_INFO_USES (insn_info2);
+	  else
+	    use2 = NULL;
+
+	  if (!use2 || !DF_REF_NEXT_LOC (use2))
+	    {
+	      *is_originating = false;
+
+	      rtx body = PATTERN (insn2);
+	      gcc_assert (GET_CODE (body) == SET);
+	      gcc_assert (REG_P (SET_DEST (body)));
+
+	      if (is_base)
+		{
+		  insn_entry[uid].original_base_reg = REGNO (SET_DEST (body));
+		  insn_entry[uid].original_base_use  = uid2;
+		  insn_entry[uid].base_delta = delta;
+		}
+	      else		/* !is_base means is_index.  */
+		{
+		  insn_entry[uid].original_index_reg = REGNO (SET_DEST(body));
+		  insn_entry[uid].original_index_use = uid2;
+		  insn_entry[uid].index_delta = delta;
+		}
+
+	      if (use2)
+		{
+		  struct df_link *def_link = DF_REF_CHAIN (use2);
+		  unsigned int hash = equivalence_hash (def_link);
+		  *is_relevant = true;
+		  if (is_base)
+		    insn_entry[uid].base_equivalence_hash = hash;
+		  else
+		    insn_entry[uid].index_equivalence_hash = hash;
+		}
+	      /* else, use is not relevant.  */
+
+	      if (dump_file && (dump_flags & TDF_DETAILS))
+		fprintf (dump_file,
+			 " propagates from originating insn %d"
+			 " with delta: %lld\n", uid2, delta);
+	    }
+	  else if (dump_file && (dump_flags & TDF_DETAILS))
+	    fprintf (dump_file, " Dependencies are too"
+		     " complicated for this optimization\n");
+	}
+    }
+}
+
+/* Given that insn represents a memory store or load operation, that
+   mem is the expression that computes the address to or from which
+   memory is transferred and insn_entry holds the array representing
+   all of the indexing_web_entry structures associated with the
+   instructions of this function, set the is_relevant field of
+   insn_entry[uid] if the form of the memory address expression is
+   compatible with dform optimization.
+
+   If the insn is considered relevant, this function also initializes
+   the following fields of the corresponding insn_entry:
+
+	is_originating
+
+   If is_relevant and is_originating, we set:
+
+	original_base_reg
+	original_base_use
+	base_delta
+	base_equivalence_hash
+
+	original_index_reg = REGNO (index_reg);
+	original_index_use = uid;
+	index_delta = 0;
+	index_equivalence_hash
+
+   If is_relevant and !is_originating, we set:
+
+        original_index_reg
+	original_index_use
+	index_delta = (non-zero)
+	index_equivalence_hash (only if use2 != NULL)
+
+
+   Complexity is linear in the number of variables "used" by this
+   instruction, multiplied by the number of definitions of each
+   variable.  */
+static void
+assess_relevance (rtx mem, rtx_insn *insn, indexing_web_entry *insn_entry)
+{
+  unsigned int uid = INSN_UID (insn);
+  rtx base_reg = XEXP (mem,0);
+  rtx index_reg = XEXP (mem, 1);
+
+  if (dump_file && (dump_flags & TDF_DETAILS))
+    {
+      fprintf (dump_file, " memory is base +/- index, ");
+      fprintf (dump_file, "base: ");
+      print_inline_rtx (dump_file, base_reg, 2);
+      fprintf (dump_file, "\n index: ");
+      print_inline_rtx (dump_file, index_reg, 2);
+      fprintf (dump_file, "\n");
+    }
+
+  if (REG_P (base_reg) && REG_P (index_reg))
+    {
+      struct df_insn_info *insn_info = DF_INSN_INFO_GET (insn);
+      /* Since insn is known to represent a sum or
+	 difference, this insn is likely to use at
+	 least two input variables.  */
+
+      int num_base_defs = 0;
+      int num_index_defs = 0;
+      bool base_is_relevant = false;
+      bool index_is_relevant = false;
+      bool base_is_originating = false;
+      bool index_is_originating = false;
+      df_ref use;
+
+      /* Iterate over the uses in this instruction to find the
+	 ones that correspond to the base register and the index
+	 register.  */
+      FOR_EACH_INSN_INFO_USE (use, insn_info)
+	{
+	  if (rtx_equal_p (DF_REF_REG (use), base_reg))
+	    {
+	      assess_use_relevance (true, base_reg, &base_is_relevant,
+				   &base_is_originating,
+				   use, uid, insn_entry);
+	      num_base_defs++;
+	    }
+	  else if (rtx_equal_p (DF_REF_REG (use), index_reg))
+	    {
+	      assess_use_relevance (false, index_reg, &index_is_relevant,
+				   &index_is_originating,
+				   use, uid, insn_entry);
+	      num_index_defs++;
+	    }
+	}
+
+      /* This insn is only relevant if there is exactly one definition of
+	 base and one definition of index and they are both considered to
+	 be relevant.  */
+      if ((num_base_defs == 1) && (num_index_defs == 1) &&
+	  base_is_relevant && index_is_relevant)
+	{
+	  insn_entry[uid].is_relevant = true;
+	  insn_entry[uid].is_originating =
+	    (base_is_originating && index_is_originating);
+	}
+      else if (dump_file)
+	{
+	  fprintf (dump_file,
+		   "Rejecting dform optimization of insn %d\n", uid);
+	  if (num_base_defs != 1)
+	    fprintf (dump_file, "Too %s (%d) base definitions\n",
+		     (num_base_defs > 1)? "many": "few", num_base_defs);
+	  if (num_index_defs != 1)
+	    fprintf (dump_file, "Too %s (%d) index definitions\n",
+		     (num_index_defs > 1)? "many": "few", num_index_defs);
+	  if (!base_is_relevant)
+	    fprintf (dump_file,
+		     "The available base definition is not relevant\n");
+	  if (!index_is_relevant)
+	    fprintf (dump_file,
+		     "The available index definition is not relevant\n");
+	}
+    }
+  else if (dump_file && (dump_flags & TDF_DETAILS))
+    fprintf (dump_file, " punting because base or index not registers\n");
+}
+
+
+/* Main entry point for this pass.
+
+   Complexity is linear in the number of instructions in the function
+   plus the complexity of build_and_fixup_equivalence_classes, which is
+   O(m*n) where m is number of equivalence classes and n is maximum
+   number of entries in the equivalence class.  */
+unsigned int
+rs6000_insert_dform (function *fun)
+{
+  basic_block bb;
+  rtx_insn *insn, *curr_insn = 0;
+  indexing_web_entry *insn_entry;
+  unsigned int insn_sequence_no = 0;
+
+  calculate_dominance_info (CDI_DOMINATORS);
+
+  /* Dataflow analysis for use-def chains.  */
+  df_set_flags (DF_RD_PRUNE_DEAD_DEFS);
+  df_chain_add_problem (DF_DU_CHAIN | DF_UD_CHAIN);
+  df_analyze ();
+
+  /* Since this pass creates new instructions, get_max_uid () may
+     return different values at different times during this pass.  The
+     insn_entry array represents only the instructions that were
+     present in this function's representation at the start of this
+     pass.  */
+  max_uid_at_start = get_max_uid ();
+  insn_entry = XCNEWVEC (indexing_web_entry, max_uid_at_start);
+
+  if (dump_file)
+    {
+      fprintf (dump_file, "Creating insn_entry array with %d entries\n",
+	       max_uid_at_start);
+    }
+
+  /* The general approach is to:
+
+       1. Look for multiple array indexing expressions that refer to
+          the same array base address such as are represented by the
+	  rtl excerpts below.
+
+       2. Group these into subsets for which the indexing expression
+          derives from the same initial_value + some accumulation of
+          constant values added thereto.
+
+      (cinsn 2 (set (reg/v/f:DI <27> [ x ])
+                    (reg:DI 3 [ x ])) "ddot-c.c":12
+       (expr_list:REG_DEAD (reg:DI 3 [ x ])))
+
+      ...
+
+      (cinsn 31 (set (reg:V2DF <35> [ vect__3.7 ])
+                     (mem:V2DF (plus:DI (reg/v/f:DI <27> [ x ])
+                                        (reg:DI <9> [ ivtmp.18 ]))
+                      [1 MEM[base: x_20(D), index: ivtmp.18_35,
+                       offset: 0B]+0 S16 A64])) "ddot-c.c":18)
+
+      ...
+
+      (cinsn 304 (set (reg:DI <70>)
+                      (plus:DI (reg:DI <9> [ ivtmp.18 ])
+                               (const_int 16)))
+       (expr_list:REG_DEAD (reg:DI <9> [ ivtmp.18 ])))
+      (cinsn 34 (set (reg:DI <9> [ ivtmp.18 ])
+                     (reg:DI <70>)))
+   */
+
+  /* Walk the insns to gather basic data: complexity of inner-nested
+     loop body is linear in total number of insns within function.  */
+  FOR_ALL_BB_FN (bb, fun)
+    {
+      if (dump_file && (dump_flags & TDF_DETAILS))
+	fprintf (dump_file, "Scrutinizing bb %d\n", bb->index);
+
+      FOR_BB_INSNS_SAFE (bb, insn, curr_insn)
+	{
+	  unsigned int uid = INSN_UID (insn);
+
+	  insn_entry[uid].insn = insn;
+	  insn_entry[uid].bb = BLOCK_FOR_INSN (insn);
+	  insn_entry[uid].insn_sequence_no = insn_sequence_no++;
+	  insn_entry[uid].is_relevant = false;
+
+	  if (dump_file && (dump_flags & TDF_DETAILS))
+	    {
+	      fprintf (dump_file, "\nLooking at insn: %d\n", uid);
+	      df_dump_insn_top (insn, dump_file);
+	      dump_insn_slim (dump_file, insn);
+	      df_dump_insn_bottom (insn, dump_file);
+	    }
+
+	  /* First, look for all memory[base + index] expressions.
+	   * Then group these by base.
+	   * Then for all instructions in each group, scrutinize the index
+	   * definition. Partition this group according to the origin
+	   * variable upon which the definitions of i are based.
+	   *
+	   * How do we define "origin variable"?
+	   *
+	   *  If i has multiple definitions, it is its own origin
+	   *  variable.  Likewise, if i has a single definition and the
+	   *  definition is NOT the sum or difference of a constant value
+	   *  and some other variable, then i is its own origin variable.
+	   *
+	   *  Otherwise, i has the same origin variable as the expression
+	   *  that represents its definition.
+	   *
+	   * After we've created these partitions, for each partition
+	   * whose size is greater than 1:
+	   *
+	   *  1. introduce derived_ptr = base + origin_variable
+	   *     immediately following the instruction that defines
+	   *     origin_variable.
+	   *
+	   *  2. for each member of the partition, replace the expression
+	   *     memory [base + index] with derived_ptr [constant], where
+	   *     constant is the sum of all constant values added to the
+	   *     origin variable to represent this particular value of i.  */
+	  if (NONDEBUG_INSN_P (insn))
+	    {
+	      rtx body = PATTERN (insn);
+	      rtx mem;
+	      if ((GET_CODE (body) == SET) && MEM_P (SET_SRC (body)))
+		{
+		  mem = XEXP (SET_SRC (body), 0);
+		  insn_entry[uid].is_load = true;
+		  insn_entry[uid].is_store = false;
+		  if (dump_file && (dump_flags & TDF_DETAILS))
+		    {
+		      fprintf (dump_file,
+			       " this insn is fetching data from memory: ");
+		      print_inline_rtx (dump_file, mem, 2);
+		      fprintf (dump_file, "\n");
+		    }
+		}
+	      else if ((GET_CODE (body) == SET) && MEM_P (SET_DEST (body)))
+		{
+		  mem = XEXP (SET_DEST (body), 0);
+		  insn_entry[uid].is_load = false;
+		  insn_entry[uid].is_store = true;
+		  if (dump_file && (dump_flags & TDF_DETAILS))
+		    {
+		      fprintf (dump_file,
+			       " this insn is storing data to memory: ");
+		      print_inline_rtx (dump_file, mem, 2);
+		      fprintf (dump_file, "\n");
+		    }
+		}
+	      else
+		{
+		  if (dump_file && (dump_flags & TDF_DETAILS))
+		    fprintf (dump_file,
+			     " this insn is neither load nor store\n");
+		  continue;		/* Not a load or store */
+		}
+
+	      enum rtx_code code = GET_CODE (mem);
+	      if ((code == PLUS) || (code == MINUS))
+		assess_relevance (mem, insn, insn_entry);
+	      else if (dump_file && (dump_flags & TDF_DETAILS))
+		fprintf (dump_file,
+			 " address not sum or difference of values\n");
+	    }
+	  /* else, this is a DEBUG_INSN_P (insn) so ignore it.  */
+	}
+
+	if (dump_file && (dump_flags & TDF_DETAILS))
+	  fprintf (dump_file, "\n");
+    }
+
+  build_and_fixup_equivalence_classes (insn_entry);
+  free_dominance_info (CDI_DOMINATORS);
+  return 0;
+}  // anon namespace
+
+
+const pass_data pass_data_insert_dform =
+{
+  RTL_PASS, /* type */
+  "dform", /* name */
+  OPTGROUP_NONE, /* optinfo_flags, or could use OPTGROUP_LOOP */
+  TV_NONE, /* tv_id, or could use TV_LOOP_UNROLL */
+  0, /* properties_required */
+  0, /* properties_provided */
+  0, /* properties_destroyed */
+  0, /* todo_flags_start */
+  TODO_df_finish, /* todo_flags_finish */
+};
+
+class pass_insert_dform: public rtl_opt_pass
+{
+public:
+  pass_insert_dform(gcc::context *ctxt)
+    : rtl_opt_pass(pass_data_insert_dform, ctxt)
+  {}
+
+  /* opt_pass methods: */
+  virtual bool gate (function *)
+    {
+      // This is most relevant to P9 and subsequent targets since P9
+      // introduces new D-form instructions, but this may pay off on
+      // other architectures as well.  Additional experimentation with
+      // other targets may be worthwhile.
+      return (optimize > 0 && !BYTES_BIG_ENDIAN && TARGET_VSX
+	      && TARGET_P9_VECTOR);
+    }
+
+  virtual unsigned int execute (function *fun)
+    {
+      return rs6000_insert_dform (fun);
+    }
+
+  opt_pass *clone ()
+    {
+      return new pass_insert_dform (m_ctxt);
+    }
+
+}; // class pass_insert_dform
+
+rtl_opt_pass *make_pass_insert_dform (gcc::context *ctxt)
+{
+  return new pass_insert_dform (ctxt);
+}
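Illustrative note (not part of the patch): the equivalence-class construction at the top of build_and_fixup_equivalence_classes combines an open-addressing hash table with per-class sibling chains.  The standalone C sketch below models only that idea.  The names struct entry, same_class, build_classes, class_size, NONE and TABLE_SIZE are hypothetical stand-ins; in particular, same_class is a simplification and does not reproduce the patch's equivalent_p, and TABLE_SIZE plays the role of max_uid_at_start.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define NONE UINT_MAX
#define TABLE_SIZE 8u	/* Hypothetical; stands in for max_uid_at_start.  */

/* Hypothetical, reduced stand-in for indexing_web_entry.  */
struct entry
{
  unsigned int base_hash;
  unsigned int index_hash;
  unsigned int sibling;	/* Next member of the same class, or NONE.  */
};

/* Assumption: two entries are equivalent when both hashes match.  The
   real equivalent_p in the patch compares more than this.  */
static bool
same_class (const struct entry *e, unsigned int a, unsigned int b)
{
  return (e[a].base_hash == e[b].base_hash
	  && e[a].index_hash == e[b].index_hash);
}

/* Insert entries 0..n-1 into TABLE using linear probing.  Each occupied
   slot holds the most recently inserted member of one class; older
   members are reachable through the sibling links.  Requires
   n <= TABLE_SIZE so that probing always finds a free or matching slot.  */
static void
build_classes (struct entry *e, unsigned int n, unsigned int *table)
{
  for (unsigned int i = 0; i < TABLE_SIZE; i++)
    table[i] = NONE;

  for (unsigned int uid = 0; uid < n; uid++)
    {
      unsigned int slot = (e[uid].base_hash + e[uid].index_hash) % TABLE_SIZE;

      /* Skip slots occupied by other classes.  */
      while (table[slot] != NONE && !same_class (e, uid, table[slot]))
	slot = (slot + 1) % TABLE_SIZE;

      /* Either prepend to an existing chain or start a new one; in the
	 latter case table[slot] is NONE, which terminates the chain.  */
      e[uid].sibling = table[slot];
      table[slot] = uid;
    }
}

/* Count the members of the class whose chain starts at HEAD.  */
static unsigned int
class_size (const struct entry *e, unsigned int head)
{
  unsigned int count = 0;
  for (unsigned int uid = head; uid != NONE; uid = e[uid].sibling)
    count++;
  return count;
}
```

In this model a colliding but non-equivalent entry simply moves to the next slot, so every occupied slot ends up heading exactly one class.  The dominator-based fix-up that the patch performs afterwards is not modeled here.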
Index: gcc/config/rs6000/rs6000-passes.def
===================================================================
--- gcc/config/rs6000/rs6000-passes.def	(revision 275051)
+++ gcc/config/rs6000/rs6000-passes.def	(working copy)
@@ -22,6 +22,8 @@  along with GCC; see the file COPYING3.  If not see
    INSERT_PASS_AFTER (PASS, INSTANCE, TGT_PASS)
    INSERT_PASS_BEFORE (PASS, INSTANCE, TGT_PASS)
    REPLACE_PASS (PASS, INSTANCE, TGT_PASS)
+   Be advised: gawk program does not parse C comments if inserted below.
  */
 
   INSERT_PASS_BEFORE (pass_cse, 1, pass_analyze_swaps);
+  INSERT_PASS_AFTER (pass_loop2, 1, pass_insert_dform);
Index: gcc/config/rs6000/rs6000-protos.h
===================================================================
--- gcc/config/rs6000/rs6000-protos.h	(revision 275051)
+++ gcc/config/rs6000/rs6000-protos.h	(working copy)
@@ -47,6 +47,8 @@  extern bool legitimate_indirect_address_p (rtx, in
 extern bool legitimate_indexed_address_p (rtx, int);
 extern bool avoiding_indexed_address_p (machine_mode);
 extern rtx rs6000_force_indexed_or_indirect_mem (rtx x);
+extern bool rs6000_target_supports_dform_offset_p (machine_mode,
+						   HOST_WIDE_INT);
 
 extern rtx rs6000_got_register (rtx);
 extern rtx find_addr_reg (rtx);
@@ -246,6 +248,8 @@  namespace gcc { class context; }
 class rtl_opt_pass;
 
 extern rtl_opt_pass *make_pass_analyze_swaps (gcc::context *);
+extern rtl_opt_pass *make_pass_insert_dform (gcc::context *);
+
 extern bool rs6000_sum_of_two_registers_p (const_rtx expr);
 extern bool rs6000_quadword_masked_address_p (const_rtx exp);
 extern rtx rs6000_gen_lvx (enum machine_mode, rtx, rtx);
Index: gcc/config/rs6000/rs6000.c
===================================================================
--- gcc/config/rs6000/rs6000.c	(revision 275051)
+++ gcc/config/rs6000/rs6000.c	(working copy)
@@ -8725,6 +8725,152 @@  rs6000_debug_legitimate_address_p (machine_mode mo
   return ret;
 }
 
+/* This function provides an approximation of which d-form addressing
+   expressions are valid on any given target configuration.  This
+   approximation guides optimization choices.  Secondary validation
+   of the addressing mode is performed before code generation.
+
+   Return true iff target has instructions to perform a memory
+   operation at the specified BYTE_OFFSET from an address held
+   in a general purpose register.  */
+bool
+rs6000_target_supports_dform_offset_p (machine_mode mode,
+				       HOST_WIDE_INT byte_offset)
+{
+  const HOST_WIDE_INT max_16bit_signed = (0x7fff);
+  const HOST_WIDE_INT min_16bit_signed = -1 - max_16bit_signed;
+
+  /* Available d-form instructions with P1 (the original Power architecture):
+
+     lbz RT,D(RA) - load byte and zero d-form
+     lhz RT,D(RA) - load half word and zero d-form
+     lha RT,D(RA) - load half word algebraic d-form
+     lwz RT,D(RA) - load word and zero d-form
+     lfs FRT,D(RA) - load floating-point single d-form
+     lfd FRT,D(RA) - load floating-point double d-form
+
+     stb RS,D(RA) - store byte d-form
+     sth RS,D(RA) - store half word d-form
+     stfs FRS,D(RA) - store floating point single d-form
+     stfd FRS,D(RA) - store floating point double d-form  */
+
+  /* Available d-form instructions with PPC (prior to v2.00):
+     (option mpowerpc "existed in the past" but is now "always on")
+
+     lwa RT,DS(RA) - load word algebraic ds-form (2 bottom bits zero)
+     ld RT,DS(RA) - load double word ds-form (2 bottom bits zero)
+
+     std RS,DS(RA) - store double word ds-form (2 bottom bits zero)
+
+     Consider lwa redundant with insn available in prior processors.  */
+  switch (mode)
+    {
+    case E_QImode:
+    case E_HImode:
+    case E_SImode:
+    case E_SFmode:
+    case E_DFmode:
+      if (IN_RANGE (byte_offset, min_16bit_signed, max_16bit_signed))
+	return true;
+      break;
+
+    case E_DImode:
+      if (IN_RANGE (byte_offset, min_16bit_signed, max_16bit_signed)
+	  && ((byte_offset & 0x03) == 0))
+	return true;
+      break;
+
+    default:
+      ;	   /* Fall through to see if other instructions will work.  */
+
+    }
+
+  /* Available d-form instructions with v2.03:
+
+     lq RTp,DQ(RA) - load quadword dq-form (4 bottom bits zero)
+
+     stq RSp,DS(RA) - store quadword ds-form (2 bottom bits zero)
+
+     These instructions are not recommended for general use as they
+     are expected to be very inefficient.  Their design was apparently
+     motivated by a need to support atomic quad-word access, which is
+     difficult to implement even in hardware on some architectures.
+     Furthermore, the design of these instructions apparently does the
+     "wrong" thing with regards to swapping of double words on load
+     and store for little-endian targets.
+
+     Therefore, this routine assumes v2.03 does NOT support quadword
+     d-form addressing.  */
+
+  /* Available d-form instructions with v2.05
+
+     (There are some floating-point load and store double-pair
+      instructions.  Consider them "not available".  They are
+      described as being phased out, which means they are expected
+      to have poor performance.)  */
+
+  /* Available d-form instructions with 3.0
+
+     lxsd VRT,DS(RA) - Load VSX scalar double word ds-form (2 bottom bits zero)
+                       (redundant with lfd from P1)
+     lxssp VRT,DS(RA) - Load VSX scalar single precision ds-form
+                        (bottom 2 bits zero)
+                        (redundant with lfs from P1)
+     lxv XT,DQ(RA) - Load VSX Vector dq-form (4 bottom bits zero)
+                     (Works on little endian for any element type, but
+		      does not preserve lanes.)
+
+     stxsd VRS,DS(RA) - Store VSX scalar double-word DS form
+                        (bottom 2 bits zero)
+                        (redundant with stfd from P1)
+     stxssp VRS,DS(RA) - Store VSX scalar single precision DS-form
+                         (bottom 2 bits zero)
+                         (redundant with stfs from P1)
+     stxv XS,DQ(RA) - Store VSX vector dq-form (4 bottom bits zero)
+                      (Works on little endian for any element type,
+		       but does not preserve lanes.)
+
+     lxv and stxv load/store to/from any VSX register, including
+     registers that overlay with floating point and altivec register
+     sets.  */
+
+  if (rs6000_isa_flags & OPTION_MASK_MODULO) /* ISA 3.0 */
+    {
+      switch (mode)
+	{
+	case E_V16QImode:
+	case E_V8HImode:
+	case E_V4SFmode:
+	case E_V4SImode:
+	case E_V2DFmode:
+	case E_V2DImode:
+	case E_V1TImode:
+	case E_TImode:
+
+	case E_KFmode:		/* ieee 754 128-bit floating point */
+	case E_IFmode:		/* IBM extended 128-bit double */
+	case E_TFmode:		/* 128-bit double (form depends on
+				   gcc command line, which may be
+				   either -mabi=ieeelongdouble (KF)
+				   or -mabi=ibmlongdouble (IF)).  */
+	  /* All 128-bit loads and stores are handled by lxv and stxv.  */
+	  if (IN_RANGE (byte_offset, min_16bit_signed, max_16bit_signed)
+	      && ((byte_offset & 0x0f) == 0))
+	    return true;
+	  break;
+
+	default:
+	  ; /* fall through to see if other instructions will work.  */
+	}
+    }
+
+  /* Todo: add support for any new instructions provided by future
+     architectures when support for those future architectures is
+     enabled.  */
+
+  return false;
+}
+
 /* Implement TARGET_MODE_DEPENDENT_ADDRESS_P.  */
 
 static bool
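Illustrative note (not part of the patch): the test applied by rs6000_target_supports_dform_offset_p reduces to a signed 16-bit range check plus a requirement that some low-order bits of the offset be zero.  The standalone C sketch below shows that arithmetic in isolation.  The enum access_kind and the function offset_fits are hypothetical names introduced for this example; the mapping from machine modes to the three kinds, and the ISA 3.0 gate, are left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of the displacement encodings discussed
   in the comments above: D (no alignment requirement), DS (2 bottom
   bits zero), DQ (4 bottom bits zero).  */
enum access_kind { ACCESS_D, ACCESS_DS, ACCESS_DQ };

/* Return true if OFFSET is representable for an access of kind KIND:
   it must lie in the signed 16-bit range and have the required number
   of low-order zero bits.  */
static bool
offset_fits (enum access_kind kind, long offset)
{
  const long max_16bit_signed = 0x7fff;
  const long min_16bit_signed = -1 - max_16bit_signed;
  long low_mask;

  if (offset < min_16bit_signed || offset > max_16bit_signed)
    return false;

  switch (kind)
    {
    case ACCESS_D:
      low_mask = 0x0;
      break;
    case ACCESS_DS:
      low_mask = 0x3;
      break;
    case ACCESS_DQ:
      low_mask = 0xf;
      break;
    default:
      return false;
    }

  return (offset & low_mask) == 0;
}
```

With this sketch, offset_fits (ACCESS_DQ, 16) holds while offset_fits (ACCESS_DQ, 8) does not, which mirrors the (byte_offset & 0x0f) == 0 condition applied to the 128-bit modes in the function above.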
Index: gcc/config/rs6000/t-rs6000
===================================================================
--- gcc/config/rs6000/t-rs6000	(revision 275051)
+++ gcc/config/rs6000/t-rs6000	(working copy)
@@ -47,6 +47,10 @@  rs6000-call.o: $(srcdir)/config/rs6000/rs6000-call
 	$(COMPILE) $<
 	$(POSTCOMPILE)
 
+rs6000-p9dform.o: $(srcdir)/config/rs6000/rs6000-p9dform.c
+	$(COMPILE) $<
+	$(POSTCOMPILE)
+
 $(srcdir)/config/rs6000/rs6000-tables.opt: $(srcdir)/config/rs6000/genopt.sh \
   $(srcdir)/config/rs6000/rs6000-cpus.def
 	$(SHELL) $(srcdir)/config/rs6000/genopt.sh $(srcdir)/config/rs6000 > \
Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc	(revision 275051)
+++ gcc/config.gcc	(working copy)
@@ -499,7 +499,7 @@  or1k*-*-*)
 	;;
 powerpc*-*-*)
 	cpu_type=rs6000
-	extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-logue.o rs6000-call.o"
+	extra_objs="rs6000-string.o rs6000-p8swap.o rs6000-p9dform.o rs6000-logue.o rs6000-call.o"
 	extra_headers="ppc-asm.h altivec.h htmintrin.h htmxlintrin.h"
 	extra_headers="${extra_headers} bmi2intrin.h bmiintrin.h"
 	extra_headers="${extra_headers} xmmintrin.h mm_malloc.h emmintrin.h"
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-0.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-0.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-0.c	(working copy)
@@ -0,0 +1,44 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops" } */
+
+/* This test confirms that the dform instructions are selected in the
+   translation of this main program.  */
+
+extern void first_dummy ();
+extern void dummy (double sacc, int n);
+extern void other_dummy ();
+
+extern float opt_value;
+extern char *opt_desc;
+
+#define M 128
+#define N 512
+
+double x [N];
+double y [N];
+
+int main (int argc, char *argv []) {
+  double sacc;
+
+  first_dummy ();
+  for (int j = 0; j < M; j++) {
+
+    sacc = 0.00;
+    for (unsigned long long int i = 0; i < N; i++) {
+      sacc += x[i] * y[i];
+    }
+    dummy (sacc, N);
+  }
+  opt_value = ((float) N) * 2 * ((float) M);
+  opt_desc = "flops";
+  other_dummy ();
+}
+
+/* At the time the dform optimization pass was merged with trunk, 12
+   lxv instructions were emitted in place of the same number of lxvx
+   instructions.  No need to require exactly this number, as it may
+   change when other optimization passes evolve.  */
+
+/* { dg-final { scan-assembler {\mlxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-1.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-1.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-1.c	(working copy)
@@ -0,0 +1,56 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops" } */
+
+/* This test confirms that the dform instructions are selected in the
+   translation of this main program.  */
+
+extern void first_dummy ();
+extern void dummy (double sacc, int n);
+extern void other_dummy ();
+
+extern float opt_value;
+extern char *opt_desc;
+
+#define M 128
+#define N 512
+
+double x [N];
+double y [N];
+double z [N];
+
+int main (int argc, char *argv []) {
+  double sacc;
+
+  first_dummy ();
+  for (int j = 0; j < M; j++) {
+
+    sacc = 0.00;
+    for (unsigned long long int i = 0; i < N; i++) {
+      z[i] = x[i] * y[i];
+      sacc += z[i];
+    }
+    dummy (sacc, N);
+  }
+  opt_value = ((float) N) * 2 * ((float) M);
+  opt_desc = "flops";
+  other_dummy ();
+}
+
+
+
+/* At the time the dform optimization pass was merged with trunk, 12
+   lxv instructions were emitted in place of the same number of lxvx
+   instructions.  No need to require exactly this number, as it may
+   change when other optimization passes evolve.  */
+
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+
+/* At the time the dform optimization pass was merged with trunk, 6
+   stxv instructions were emitted in place of the same number of stxvx
+   instructions.  No need to require exactly this number, as it may
+   change when other optimization passes evolve.  */
+
+/* { dg-final { scan-assembler {\mstxv\M} } } */
+
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-10.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-10.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-10.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE signed int
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-11.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-11.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-11.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE unsigned long long
+#include "p9-dform-generic.h"
+
+/* The precise number of ld and std instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mld\M} } } */
+/* { dg-final { scan-assembler {\mstd\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-12.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-12.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-12.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE signed long long
+#include "p9-dform-generic.h"
+
+/* The precise number of ld and std instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mld\M} } } */
+/* { dg-final { scan-assembler {\mstd\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-13.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-13.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-13.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE unsigned __int128
+#include "p9-dform-generic.h"
+
+/* The precise number of ld and std instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mld\M} } } */
+/* { dg-final { scan-assembler {\mstd\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-14.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-14.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-14.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE signed __int128
+#include "p9-dform-generic.h"
+
+/* The precise number of ld and std instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mld\M} } } */
+/* { dg-final { scan-assembler {\mstd\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-15.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-15.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-15.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie -mfloat128" } */
+
+#define TYPE __float128
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-2.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-2.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-2.c	(working copy)
@@ -0,0 +1,14 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+
+#define TYPE float
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-3.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-3.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-3.c	(working copy)
@@ -0,0 +1,16 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE double
+#include "p9-dform-generic.h"
+
+/* At the time the D-form optimization pass was merged into trunk, 6
+   lxv instructions were emitted in place of the same number of lxvx
+   instructions and 8 stxv instructions replaced the same number of
+   stxvx instructions.  There is no need to require exactly these
+   counts, as they may change when other optimization passes evolve.  */
+
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-4.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-4.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-4.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile { target { powerpc*-*-* } } } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE long double
+#include "p9-dform-generic.h"
+
+/* The precise number of lfd and stfd instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlfd\M} } } */
+/* { dg-final { scan-assembler {\mstfd\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-5.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-5.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-5.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE unsigned char
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-6.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-6.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-6.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE signed char
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-7.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-7.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-7.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE unsigned short
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-8.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-8.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-8.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE signed short
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-9.c
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-9.c	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-9.c	(working copy)
@@ -0,0 +1,13 @@ 
+/* { dg-do compile } */
+/* { dg-require-effective-target powerpc_p9vector_ok } */
+/* { dg-skip-if "" { powerpc*-*-aix* } } */
+/* { dg-options "-O3 -mdejagnu-cpu=power9 -funroll-loops -Wall -no-pie" } */
+
+#define TYPE unsigned int
+#include "p9-dform-generic.h"
+
+/* The precise number of lxv and stxv instructions may be impacted by
+   complex interactions between optimization passes, but we expect at
+   least one of each.  */
+/* { dg-final { scan-assembler {\mlxv\M} } } */
+/* { dg-final { scan-assembler {\mstxv\M} } } */
Index: gcc/testsuite/gcc.target/powerpc/p9-dform-generic.h
===================================================================
--- gcc/testsuite/gcc.target/powerpc/p9-dform-generic.h	(nonexistent)
+++ gcc/testsuite/gcc.target/powerpc/p9-dform-generic.h	(working copy)
@@ -0,0 +1,34 @@ 
+
+#define ITERATIONS 1000000
+
+#define SIZE (16384/sizeof(TYPE))
+
+static TYPE x[SIZE] __attribute__ ((aligned (16)));
+static TYPE y[SIZE] __attribute__ ((aligned (16)));
+static TYPE a;
+
+void obfuscate(void *a, ...);
+
+static void __attribute__((noinline)) do_one(void)
+{
+  unsigned long i;
+
+  obfuscate(x, y, &a);
+
+  for (i = 0; i < SIZE; i++)
+    y[i] = a * x[i];
+
+  obfuscate(x, y, &a);
+
+}
+
+int main(void)
+{
+  unsigned long i;
+
+  for (i = 0; i < ITERATIONS; i++)
+    do_one();
+
+  return 0;
+
+}
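
Note on p9-dform-generic.h: the header declares obfuscate () but never defines it, which is fine for the compile-only tests above. For anyone who wants to link and run the kernel locally, the following is a minimal sketch and not part of the patch. It assumes TYPE is double; the obfuscate () stub and the count_mismatches () helper are hypothetical names added only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: TYPE is fixed to double; each test file defines its own.  */
#define TYPE double
#define SIZE (16384 / sizeof (TYPE))

static TYPE x[SIZE] __attribute__ ((aligned (16)));
static TYPE y[SIZE] __attribute__ ((aligned (16)));
static TYPE a;

/* Hypothetical stub: the header only declares obfuscate ().  Keeping it
   out of line means the arrays and the scalar escape through a call.  */
void __attribute__ ((noinline)) obfuscate (void *p, ...)
{
  (void) p;
}

/* Same loop as do_one () in p9-dform-generic.h.  */
static void __attribute__ ((noinline)) do_one (void)
{
  unsigned long i;

  obfuscate (x, y, &a);

  for (i = 0; i < SIZE; i++)
    y[i] = a * x[i];

  obfuscate (x, y, &a);
}

/* Hypothetical helper: fill x with small integers, run the kernel once,
   and return how many elements of y differ from factor * x[i].  */
size_t count_mismatches (TYPE factor)
{
  size_t i, bad = 0;

  for (i = 0; i < SIZE; i++)
    x[i] = (TYPE) (i % 7);
  a = factor;

  do_one ();

  for (i = 0; i < SIZE; i++)
    if (y[i] != factor * x[i])
      bad++;

  return bad;
}
```

Calling assert (count_mismatches (3) == 0) from main () exercises the loop once. The small integer inputs keep every product exact, so the equality comparison is safe for double.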