Message ID | 53764FB9.6000000@redhat.com |
---|---|
State | New |
Headers | show |
On Fri, May 16, 2014 at 06:49:45PM +0100, Vladimir Makarov wrote: > The following patch fixes > > https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60969 > > The patch was bootstrapped and tested on x86/x86-64. > > Committed as rev. 210519 to gcc 4.9 branch and as rev. 210520 to trunk. > > 2014-05-16 Vladimir Makarov <vmakarov@redhat.com> > > PR rtl-optimization/60969 > * ira-costs.c (record_reg_classes): Allow only memory for pseudo. > Calculate costs for this case. > > 2014-05-16 Vladimir Makarov <vmakarov@redhat.com> > > PR rtl-optimization/60969 > * g++.dg/pr60969.C: New. This seems to have cause gcc.target/aarch64/vect-abs-compile.c to begin failing on aarch64-none-elf: FAIL: gcc.target/aarch64/table-intrinsics.c (internal compiler error) FAIL: gcc.target/aarch64/table-intrinsics.c (test for excess errors) Excess errors: /work/gcc-clean/src/gcc/gcc/testsuite/gcc.target/aarch64/table-intrinsics.c:172:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90) 0x8923cd lra_constraints(bool) /work/gcc-clean/src/gcc/gcc/lra-constraints.c:4140 0x882f62 lra(_IO_FILE*) /work/gcc-clean/src/gcc/gcc/lra.c:2353 0x8453f6 do_reload /work/gcc-clean/src/gcc/gcc/ira.c:5457 0x8453f6 execute /work/gcc-clean/src/gcc/gcc/ira.c:5618 Thanks, James
On Mon, May 19, 2014 at 2:37 PM, James Greenhalgh <james.greenhalgh@arm.com> wrote: > On Fri, May 16, 2014 at 06:49:45PM +0100, Vladimir Makarov wrote: >> The following patch fixes >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60969 >> >> The patch was bootstrapped and tested on x86/x86-64. >> >> Committed as rev. 210519 to gcc 4.9 branch and as rev. 210520 to trunk. >> >> 2014-05-16 Vladimir Makarov <vmakarov@redhat.com> >> >> PR rtl-optimization/60969 >> * ira-costs.c (record_reg_classes): Allow only memory for pseudo. >> Calculate costs for this case. >> >> 2014-05-16 Vladimir Makarov <vmakarov@redhat.com> >> >> PR rtl-optimization/60969 >> * g++.dg/pr60969.C: New. > > This seems to have cause gcc.target/aarch64/vect-abs-compile.c to begin > failing on aarch64-none-elf: > > FAIL: gcc.target/aarch64/table-intrinsics.c (internal compiler error) > FAIL: gcc.target/aarch64/table-intrinsics.c (test for excess errors) > Excess errors: > /work/gcc-clean/src/gcc/gcc/testsuite/gcc.target/aarch64/table-intrinsics.c:172:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90) > 0x8923cd lra_constraints(bool) > /work/gcc-clean/src/gcc/gcc/lra-constraints.c:4140 > 0x882f62 lra(_IO_FILE*) > /work/gcc-clean/src/gcc/gcc/lra.c:2353 > 0x8453f6 do_reload > /work/gcc-clean/src/gcc/gcc/ira.c:5457 > 0x8453f6 execute > /work/gcc-clean/src/gcc/gcc/ira.c:5618 > I think x86 backend should disable 3DNOW mode if 3DNOW isn't enabled. Allowing SFmode with MMX doesn't buy us anything, but trouble.
On 05/19/2014 05:37 PM, James Greenhalgh wrote: > On Fri, May 16, 2014 at 06:49:45PM +0100, Vladimir Makarov wrote: >> The following patch fixes >> >> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60969 >> >> The patch was bootstrapped and tested on x86/x86-64. >> >> Committed as rev. 210519 to gcc 4.9 branch and as rev. 210520 to trunk. >> >> 2014-05-16 Vladimir Makarov <vmakarov@redhat.com> >> >> PR rtl-optimization/60969 >> * ira-costs.c (record_reg_classes): Allow only memory for pseudo. >> Calculate costs for this case. >> >> 2014-05-16 Vladimir Makarov <vmakarov@redhat.com> >> >> PR rtl-optimization/60969 >> * g++.dg/pr60969.C: New. > This seems to have cause gcc.target/aarch64/vect-abs-compile.c to begin > failing on aarch64-none-elf: > > FAIL: gcc.target/aarch64/table-intrinsics.c (internal compiler error) > FAIL: gcc.target/aarch64/table-intrinsics.c (test for excess errors) > Excess errors: > /work/gcc-clean/src/gcc/gcc/testsuite/gcc.target/aarch64/table-intrinsics.c:172:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90) > 0x8923cd lra_constraints(bool) > /work/gcc-clean/src/gcc/gcc/lra-constraints.c:4140 > 0x882f62 lra(_IO_FILE*) > /work/gcc-clean/src/gcc/gcc/lra.c:2353 > 0x8453f6 do_reload > /work/gcc-clean/src/gcc/gcc/ira.c:5457 > 0x8453f6 execute > /work/gcc-clean/src/gcc/gcc/ira.c:5618 > > Sorry, I have no aarch64 machine. Could you sent me the pre-processed file of the test.
>>> https://gcc.gnu.org/bugzilla/show_bug.cgi?id=60969 >> FAIL: gcc.target/aarch64/table-intrinsics.c (internal compiler error) >> FAIL: gcc.target/aarch64/table-intrinsics.c (test for excess errors) >> Excess errors: >> /work/gcc-clean/src/gcc/gcc/testsuite/gcc.target/aarch64/table-intrinsics.c:172:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90) >> 0x8923cd lra_constraints(bool) >> /work/gcc-clean/src/gcc/gcc/lra-constraints.c:4140 >> 0x882f62 lra(_IO_FILE*) >> /work/gcc-clean/src/gcc/gcc/lra.c:2353 >> 0x8453f6 do_reload >> /work/gcc-clean/src/gcc/gcc/ira.c:5457 >> 0x8453f6 execute >> /work/gcc-clean/src/gcc/gcc/ira.c:5618 >> >> > Sorry, I have no aarch64 machine. Could you sent me the pre-processed > file of the test. Please find inline a reduced testcase that fails. Compiler configured with $SRC/gcc/configure --target=aarch64-none-elf $>./xgcc -B`pwd` -S -O2 try.c try.c: In function 'qtbl_tests8_2': try.c:26:1: internal compiler error: Max. number of generated reload insns per insn is achieved (90) } ^ 0x8653f7 lra_constraints(bool) /work/wa1/src/gcc/gcc/lra-constraints.c:4140 0x855ca6 lra(_IO_FILE*) /work/wa1/src/gcc/gcc/lra.c:2353 0x81eada do_reload /work/wa1/src/gcc/gcc/ira.c:5457 0x81eada execute /work/wa1/src/gcc/gcc/ira.c:5618 Please submit a full bug report, with preprocessed source if appropriate. Please include the complete backtrace with any bug report. See <http://gcc.gnu.org/bugs.html> for instructions. compilation status=1 $>cat try.c typedef __builtin_aarch64_simd_qi int8x8_t __attribute__ ((__vector_size__ (8))); typedef __builtin_aarch64_simd_uqi uint8x8_t __attribute__ ((__vector_size__ (8))); typedef __builtin_aarch64_simd_qi int8x16_t __attribute__ ((__vector_size__ (16))); typedef struct int8x16x2_t { int8x16_t val[2]; } int8x16x2_t; __extension__ static __inline int8x8_t __attribute__ ((__always_inline__)) vqtbl2_s8 (int8x16x2_t tab, uint8x8_t idx) { int8x8_t result; __asm__ ("ld1 {v16.16b, v17.16b}, %1\n\t" :"=w"(result) :"Q"(tab),"w"(idx) :"memory", "v16", "v17"); return result; } int8x8_t qtbl_tests8_2 (int8x16x2_t tab, uint8x8_t idx) { return vqtbl2_s8 (tab, idx); } > >
Index: ira-costs.c =================================================================== --- ira-costs.c (revision 210069) +++ ira-costs.c (working copy) @@ -762,10 +762,11 @@ record_reg_classes (int n_alts, int n_op into that class. */ if (REG_P (op) && REGNO (op) >= FIRST_PSEUDO_REGISTER) { - if (classes[i] == NO_REGS) + if (classes[i] == NO_REGS && ! allows_mem[i]) { /* We must always fail if the operand is a REG, but - we did not find a suitable class. + we did not find a suitable class and memory is + not allowed. Otherwise we may perform an uninitialized read from this_op_costs after the `continue' statement @@ -783,50 +784,90 @@ record_reg_classes (int n_alts, int n_op bool out_p = recog_data.operand_type[i] != OP_IN; enum reg_class op_class = classes[i]; move_table *move_in_cost, *move_out_cost; + short (*mem_cost)[2]; ira_init_register_move_cost_if_necessary (mode); if (! in_p) { ira_assert (out_p); - move_out_cost = ira_may_move_out_cost[mode]; - for (k = cost_classes_ptr->num - 1; k >= 0; k--) + if (op_class == NO_REGS) { - rclass = cost_classes[k]; - pp_costs[k] - = move_out_cost[op_class][rclass] * frequency; + mem_cost = ira_memory_move_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] = mem_cost[rclass][0] * frequency; + } + } + else + { + move_out_cost = ira_may_move_out_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] + = move_out_cost[op_class][rclass] * frequency; + } } } else if (! out_p) { ira_assert (in_p); - move_in_cost = ira_may_move_in_cost[mode]; - for (k = cost_classes_ptr->num - 1; k >= 0; k--) + if (op_class == NO_REGS) { - rclass = cost_classes[k]; - pp_costs[k] - = move_in_cost[rclass][op_class] * frequency; + mem_cost = ira_memory_move_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] = mem_cost[rclass][1] * frequency; + } + } + else + { + move_in_cost = ira_may_move_in_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] + = move_in_cost[rclass][op_class] * frequency; + } } } else { - move_in_cost = ira_may_move_in_cost[mode]; - move_out_cost = ira_may_move_out_cost[mode]; - for (k = cost_classes_ptr->num - 1; k >= 0; k--) - { - rclass = cost_classes[k]; - pp_costs[k] = ((move_in_cost[rclass][op_class] - + move_out_cost[op_class][rclass]) - * frequency); + if (op_class == NO_REGS) + { + mem_cost = ira_memory_move_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] = ((mem_cost[rclass][0] + + mem_cost[rclass][1]) + * frequency); + } + } + else + { + move_in_cost = ira_may_move_in_cost[mode]; + move_out_cost = ira_may_move_out_cost[mode]; + for (k = cost_classes_ptr->num - 1; k >= 0; k--) + { + rclass = cost_classes[k]; + pp_costs[k] = ((move_in_cost[rclass][op_class] + + move_out_cost[op_class][rclass]) + * frequency); + } } } /* If the alternative actually allows memory, make things a bit cheaper since we won't need an extra insn to load it. */ - pp->mem_cost - = ((out_p ? ira_memory_move_cost[mode][op_class][0] : 0) - + (in_p ? ira_memory_move_cost[mode][op_class][1] : 0) - - allows_mem[i]) * frequency; + if (op_class != NO_REGS) + pp->mem_cost + = ((out_p ? ira_memory_move_cost[mode][op_class][0] : 0) + + (in_p ? ira_memory_move_cost[mode][op_class][1] : 0) + - allows_mem[i]) * frequency; /* If we have assigned a class to this allocno in our first pass, add a cost to this alternative corresponding to what we would add if this @@ -836,15 +877,28 @@ record_reg_classes (int n_alts, int n_op enum reg_class pref_class = pref[COST_INDEX (REGNO (op))]; if (pref_class == NO_REGS) + { + if (op_class != NO_REGS) + alt_cost + += ((out_p + ? ira_memory_move_cost[mode][op_class][0] + : 0) + + (in_p + ? ira_memory_move_cost[mode][op_class][1] + : 0)); + } + else if (op_class == NO_REGS) alt_cost += ((out_p - ? ira_memory_move_cost[mode][op_class][0] : 0) + ? ira_memory_move_cost[mode][pref_class][1] + : 0) + (in_p - ? ira_memory_move_cost[mode][op_class][1] + ? ira_memory_move_cost[mode][pref_class][0] : 0)); else if (ira_reg_class_intersect[pref_class][op_class] == NO_REGS) - alt_cost += ira_register_move_cost[mode][pref_class][op_class]; + alt_cost += (ira_register_move_cost + [mode][pref_class][op_class]); } } } Index: testsuite/g++.dg/pr60969.C =================================================================== --- testsuite/g++.dg/pr60969.C (revision 0) +++ testsuite/g++.dg/pr60969.C (working copy) @@ -0,0 +1,30 @@ +/* { dg-do compile { target i?86-*-* } } */ +/* { dg-options "-O2 -ftree-vectorize -march=pentium4" } */ + +struct A +{ + float f, g, h, k; + A () {} + A (float v0, float x, float y) : f(v0), g(x), h(y), k(0.0f) {} + A bar (A &a, float t) { return A (f + a.f * t, g + a.g * t, h + a.h * t); } +}; + +A +baz (A &x, A &y, float t) +{ + return x.bar (y, t); +} + +A * +foo (A &s, A &t, A &u, A &v, int y, int z) +{ + A *x = new A[y * z]; + for (int i = 0; i < 7; i++) + { + A s = baz (s, u, i / (float) z); + A t = baz (t, v, i / (float) z); + for (int j = 0; j < 7; j++) + x[i * y + j] = baz (s, t, j / (float) y); + } + return x; +}