Message ID | CAFULd4YsmWZORJRX+4+HVEV3zWLco8by1iWsvFLvTXXd+OrU4A@mail.gmail.com |
---|---|
State | New |
Headers | show |
On Thu, Jul 09, 2015 at 05:22:33PM +0200, Uros Bizjak wrote: > Hello! > > This ICE was caused by a peephole2 pattern that allowed non-general > regs arguments. > > 2015-07-08 Uros Bizjak <ubizjak@gmail.com> > > PR target/66814 > * config/i386/predicates.md (nonimmediate_gr_operand): New predicate. > * config/i386/i386.md (not peephole2): Use nonimmediate_gr_operand. > (varous peephole2s): Use {GENERAL,SSE,MMX}_REGNO_P instead of > {GENERAL_SSE_MMX}_REG_P where appropriate. This patch breaks bootstrap with rtl checking on x86_64-linux. ../../gcc/tree-vect-loop.c: In function ‘void calc_vec_perm_mask_for_shift(machine_mode, unsigned int, unsigned char*)’: ../../gcc/tree-vect-loop.c:3190:1: internal compiler error: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1782 } ^ 0x11a44ef rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, char const*) ../../gcc/rtl.c:786 0x19e917c rhs_regno ../../gcc/rtl.h:1782 0x1d20eea peephole2_10 ../../gcc/config/i386/i386.md:17699 0x1d2ccc8 peephole2_insns(rtx_def*, rtx_insn*, int*) ../../gcc/config/i386/i386.md:5050 0x11190e0 peephole2_optimize ../../gcc/recog.c:3627 0x111a81f rest_of_handle_peephole2 ../../gcc/recog.c:3807 0x111a8d4 execute ../../gcc/recog.c:3841 register_operand allows a SUBREG even after a reload, as long as it is a SUBREG of a REG_P. insn in question is: (insn 75 71 76 2 (set (subreg:SI (reg:DI 2 cx) 0) (const_int -1 [0xffffffffffffffff])) ../../gcc/tree-vect-loop.c:3189 -1 (nil)) and has been created by the: ;; After splitting up read-modify operations, array accesses with memory ;; operands might end up in form: ;; sall $2, %eax ;; movl 4(%esp), %edx ;; addl %edx, %eax ;; instead of pre-splitting: ;; sall $2, %eax ;; addl 4(%esp), %eax ;; Turn it into: ;; movl 4(%esp), %edx ;; leal (%edx,%eax,4), %eax peephole2. Wonder if if (mode != word_mode) operands[1] = gen_rtx_SUBREG (mode, operands[1], 0); if (op1mode != word_mode) operands[5] = gen_rtx_SUBREG (op1mode, operands[5], 0); should not have been gen_lowpart instead of gen_rtx_SUBREG. In any case, it wouldn't surprise me if there aren't many other ways how to get a SUBREG in there. Jakub
On Thu, Jul 9, 2015 at 9:38 PM, Jakub Jelinek <jakub@redhat.com> wrote: > On Thu, Jul 09, 2015 at 05:22:33PM +0200, Uros Bizjak wrote: >> Hello! >> >> This ICE was caused by a peephole2 pattern that allowed non-general >> regs arguments. >> >> 2015-07-08 Uros Bizjak <ubizjak@gmail.com> >> >> PR target/66814 >> * config/i386/predicates.md (nonimmediate_gr_operand): New predicate. >> * config/i386/i386.md (not peephole2): Use nonimmediate_gr_operand. >> (varous peephole2s): Use {GENERAL,SSE,MMX}_REGNO_P instead of >> {GENERAL_SSE_MMX}_REG_P where appropriate. > > This patch breaks bootstrap with rtl checking on x86_64-linux. > > ../../gcc/tree-vect-loop.c: In function ‘void calc_vec_perm_mask_for_shift(machine_mode, unsigned int, unsigned char*)’: > ../../gcc/tree-vect-loop.c:3190:1: internal compiler error: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1782 > } > ^ > 0x11a44ef rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, char const*) > ../../gcc/rtl.c:786 > 0x19e917c rhs_regno > ../../gcc/rtl.h:1782 > 0x1d20eea peephole2_10 > ../../gcc/config/i386/i386.md:17699 > 0x1d2ccc8 peephole2_insns(rtx_def*, rtx_insn*, int*) > ../../gcc/config/i386/i386.md:5050 > 0x11190e0 peephole2_optimize > ../../gcc/recog.c:3627 > 0x111a81f rest_of_handle_peephole2 > ../../gcc/recog.c:3807 > 0x111a8d4 execute > ../../gcc/recog.c:3841 > > register_operand allows a SUBREG even after a reload, as long as > it is a SUBREG of a REG_P. > > insn in question is: > (insn 75 71 76 2 (set (subreg:SI (reg:DI 2 cx) 0) > (const_int -1 [0xffffffffffffffff])) ../../gcc/tree-vect-loop.c:3189 -1 > (nil)) > > and has been created by the: > ;; After splitting up read-modify operations, array accesses with memory > ;; operands might end up in form: > ;; sall $2, %eax > ;; movl 4(%esp), %edx > ;; addl %edx, %eax > ;; instead of pre-splitting: > ;; sall $2, %eax > ;; addl 4(%esp), %eax > ;; Turn it into: > ;; movl 4(%esp), %edx > ;; leal (%edx,%eax,4), %eax > peephole2. > Wonder if > if (mode != word_mode) > operands[1] = gen_rtx_SUBREG (mode, operands[1], 0); > if (op1mode != word_mode) > operands[5] = gen_rtx_SUBREG (op1mode, operands[5], 0); > should not have been gen_lowpart instead of gen_rtx_SUBREG. > In any case, it wouldn't surprise me if there aren't many other ways > how to get a SUBREG in there. I was under impression that peephole2 pass doesn't see subregs of hard regs (all x86 predicates are written in this way). Even documentation somehow agrees with this: '(subreg:M1 REG:M2 BYTENUM)' 'subreg' expressions are used to refer to a register in a machine mode other than its natural one, or to refer to one register of a multi-part 'reg' that actually refers to several registers. Each pseudo register has a natural mode. If it is necessary to operate on it in a different mode, the register must be enclosed in a 'subreg'. There are currently three supported types for the first operand of a 'subreg': [...] * hard registers It is seldom necessary to wrap hard registers in 'subreg's; such registers would normally reduce to a single 'reg' rtx. This use of 'subreg's is discouraged and may not be supported in the future. So, I'd say that generating naked SUBREG after reload should be avoided and gen_lowpart should be used in the code above. Uros. > Jakub
On Thu, Jul 09, 2015 at 10:13:49PM +0200, Uros Bizjak wrote: > I was under impression that peephole2 pass doesn't see subregs of hard > regs (all x86 predicates are written in this way). Even documentation > somehow agrees with this: > > '(subreg:M1 REG:M2 BYTENUM)' > > 'subreg' expressions are used to refer to a register in a machine > mode other than its natural one, or to refer to one register of a > multi-part 'reg' that actually refers to several registers. > > Each pseudo register has a natural mode. If it is necessary to > operate on it in a different mode, the register must be enclosed in > a 'subreg'. > > There are currently three supported types for the first operand of > a 'subreg': > > [...] > > * hard registers It is seldom necessary to wrap hard registers > in 'subreg's; such registers would normally reduce to a single > 'reg' rtx. This use of 'subreg's is discouraged and may not > be supported in the future. > > So, I'd say that generating naked SUBREG after reload should be > avoided and gen_lowpart should be used in the code above. There is also: emit_insn (gen_sse2_loadld (operands[3], CONST0_RTX (V4SImode), gen_rtx_SUBREG (SImode, operands[1], 0))); emit_insn (gen_sse2_loadld (operands[4], CONST0_RTX (V4SImode), gen_rtx_SUBREG (SImode, operands[1], 4))); in some splitters (also post-reload). Jakub
On Thu, Jul 9, 2015 at 10:17 PM, Jakub Jelinek <jakub@redhat.com> wrote: > On Thu, Jul 09, 2015 at 10:13:49PM +0200, Uros Bizjak wrote: >> I was under impression that peephole2 pass doesn't see subregs of hard >> regs (all x86 predicates are written in this way). Even documentation >> somehow agrees with this: >> >> '(subreg:M1 REG:M2 BYTENUM)' >> >> 'subreg' expressions are used to refer to a register in a machine >> mode other than its natural one, or to refer to one register of a >> multi-part 'reg' that actually refers to several registers. >> >> Each pseudo register has a natural mode. If it is necessary to >> operate on it in a different mode, the register must be enclosed in >> a 'subreg'. >> >> There are currently three supported types for the first operand of >> a 'subreg': >> >> [...] >> >> * hard registers It is seldom necessary to wrap hard registers >> in 'subreg's; such registers would normally reduce to a single >> 'reg' rtx. This use of 'subreg's is discouraged and may not >> be supported in the future. >> >> So, I'd say that generating naked SUBREG after reload should be >> avoided and gen_lowpart should be used in the code above. > > There is also: > emit_insn (gen_sse2_loadld (operands[3], CONST0_RTX (V4SImode), > gen_rtx_SUBREG (SImode, operands[1], 0))); > emit_insn (gen_sse2_loadld (operands[4], CONST0_RTX (V4SImode), > gen_rtx_SUBREG (SImode, operands[1], 4))); > in some splitters (also post-reload). These won't hit the peephole2, changed by the patch, but let's fix these before they start to break. Uros.
Index: testsuite/gcc.target/i386/pr66814.c =================================================================== --- testsuite/gcc.target/i386/pr66814.c (revision 0) +++ testsuite/gcc.target/i386/pr66814.c (revision 0) @@ -0,0 +1,4 @@ +/* { dg-do compile { target { ia32 } } } */ +/* { dg-options "-march=i586 -mavx512f -O2" } */ + +#include "avx512f-klogic-2.c" Index: config/i386/i386.md =================================================================== --- config/i386/i386.md (revision 225599) +++ config/i386/i386.md (working copy) @@ -17379,8 +17379,8 @@ ;; lifetime information then. (define_peephole2 - [(set (match_operand:SWI124 0 "nonimmediate_operand") - (not:SWI124 (match_operand:SWI124 1 "nonimmediate_operand")))] + [(set (match_operand:SWI124 0 "nonimmediate_gr_operand") + (not:SWI124 (match_operand:SWI124 1 "nonimmediate_gr_operand")))] "optimize_insn_for_speed_p () && ((TARGET_NOT_UNPAIRABLE && (!MEM_P (operands[0]) @@ -17524,8 +17524,10 @@ [(match_dup 0) (match_operand 2 "memory_operand")]))] "REGNO (operands[0]) != REGNO (operands[1]) - && ((MMX_REG_P (operands[0]) && MMX_REG_P (operands[1])) - || (SSE_REG_P (operands[0]) && SSE_REG_P (operands[1])))" + && ((MMX_REGNO_P (REGNO (operands[0])) + && MMX_REGNO_P (REGNO (operands[1]))) + || (SSE_REGNO_P (REGNO (operands[0])) + && SSE_REGNO_P (REGNO (operands[1]))))" [(set (match_dup 0) (match_dup 2)) (set (match_dup 0) (match_op_dup 3 [(match_dup 0) (match_dup 1)]))]) @@ -17673,7 +17675,7 @@ (match_operand 1 "const0_operand"))] "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ()) - && GENERAL_REG_P (operands[0]) + && GENERAL_REGNO_P (REGNO (operands[0])) && peep2_regno_dead_p (0, FLAGS_REG)" [(parallel [(set (match_dup 0) (const_int 0)) (clobber (reg:CC FLAGS_REG))])] @@ -17694,6 +17696,7 @@ [(set (match_operand:SWI248 0 "register_operand") (const_int -1))] "(optimize_insn_for_size_p () || TARGET_MOVE_M1_VIA_OR) + && GENERAL_REGNO_P (REGNO (operands[0])) && peep2_regno_dead_p (0, FLAGS_REG)" [(parallel [(set (match_dup 0) (const_int -1)) (clobber (reg:CC FLAGS_REG))])] Index: config/i386/predicates.md =================================================================== --- config/i386/predicates.md (revision 225599) +++ config/i386/predicates.md (working copy) @@ -37,6 +37,12 @@ (and (match_code "reg") (match_test "GENERAL_REGNO_P (REGNO (op))"))) +;; True if the operand is a nonimmediate operand with GENERAL class register. +(define_predicate "nonimmediate_gr_operand" + (if_then_else (match_code "reg") + (match_test "GENERAL_REGNO_P (REGNO (op))") + (match_operand 0 "nonimmediate_operand"))) + ;; Return true if OP is a register operand other than an i387 fp register. (define_predicate "register_and_not_fp_reg_operand" (and (match_code "reg")