diff mbox

[i386] : Fix PR 66814, ICE: gcc.target/i386/avx512f-klogic-2.c

Message ID CAFULd4YsmWZORJRX+4+HVEV3zWLco8by1iWsvFLvTXXd+OrU4A@mail.gmail.com
State New
Headers show

Commit Message

Uros Bizjak July 9, 2015, 3:22 p.m. UTC
Hello!

This ICE was caused by a peephole2 pattern that allowed non-general
regs arguments.

2015-07-08  Uros Bizjak  <ubizjak@gmail.com>

    PR target/66814
    * config/i386/predicates.md (nonimmediate_gr_operand): New predicate.
    * config/i386/i386.md (not peephole2): Use nonimmediate_gr_operand.
    (varous peephole2s): Use {GENERAL,SSE,MMX}_REGNO_P instead of
    {GENERAL_SSE_MMX}_REG_P where appropriate.

testsuite/ChangeLog:

2015-07-09  Uros Bizjak  <ubizjak@gmail.com>

    PR target/66814
    * gcc.target/i386/pr66814.c: New test.

Bootstrapped and regression tested on x86_64-linux-gnu {,-m32}.

Committed to mainline SVN, will be backported to gcc-5, when the branch opens.

Uros.

Comments

Jakub Jelinek July 9, 2015, 7:38 p.m. UTC | #1
On Thu, Jul 09, 2015 at 05:22:33PM +0200, Uros Bizjak wrote:
> Hello!
> 
> This ICE was caused by a peephole2 pattern that allowed non-general
> regs arguments.
> 
> 2015-07-08  Uros Bizjak  <ubizjak@gmail.com>
> 
>     PR target/66814
>     * config/i386/predicates.md (nonimmediate_gr_operand): New predicate.
>     * config/i386/i386.md (not peephole2): Use nonimmediate_gr_operand.
>     (varous peephole2s): Use {GENERAL,SSE,MMX}_REGNO_P instead of
>     {GENERAL_SSE_MMX}_REG_P where appropriate.

This patch breaks bootstrap with rtl checking on x86_64-linux.

../../gcc/tree-vect-loop.c: In function ‘void calc_vec_perm_mask_for_shift(machine_mode, unsigned int, unsigned char*)’:
../../gcc/tree-vect-loop.c:3190:1: internal compiler error: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1782
 }
 ^
0x11a44ef rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, char const*)
        ../../gcc/rtl.c:786
0x19e917c rhs_regno
        ../../gcc/rtl.h:1782
0x1d20eea peephole2_10
        ../../gcc/config/i386/i386.md:17699
0x1d2ccc8 peephole2_insns(rtx_def*, rtx_insn*, int*)
        ../../gcc/config/i386/i386.md:5050
0x11190e0 peephole2_optimize
        ../../gcc/recog.c:3627
0x111a81f rest_of_handle_peephole2
        ../../gcc/recog.c:3807
0x111a8d4 execute
        ../../gcc/recog.c:3841

register_operand allows a SUBREG even after a reload, as long as
it is a SUBREG of a REG_P.

insn in question is:
(insn 75 71 76 2 (set (subreg:SI (reg:DI 2 cx) 0)
        (const_int -1 [0xffffffffffffffff])) ../../gcc/tree-vect-loop.c:3189 -1
     (nil))

and has been created by the:
;; After splitting up read-modify operations, array accesses with memory
;; operands might end up in form:
;;  sall    $2, %eax
;;  movl    4(%esp), %edx
;;  addl    %edx, %eax
;; instead of pre-splitting:
;;  sall    $2, %eax
;;  addl    4(%esp), %eax
;; Turn it into:
;;  movl    4(%esp), %edx
;;  leal    (%edx,%eax,4), %eax
peephole2.
Wonder if
  if (mode != word_mode)
    operands[1] = gen_rtx_SUBREG (mode, operands[1], 0);
  if (op1mode != word_mode)
    operands[5] = gen_rtx_SUBREG (op1mode, operands[5], 0);  
should not have been gen_lowpart instead of gen_rtx_SUBREG.
In any case, it wouldn't surprise me if there aren't many other ways
how to get a SUBREG in there.

	Jakub
Uros Bizjak July 9, 2015, 8:13 p.m. UTC | #2
On Thu, Jul 9, 2015 at 9:38 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Jul 09, 2015 at 05:22:33PM +0200, Uros Bizjak wrote:
>> Hello!
>>
>> This ICE was caused by a peephole2 pattern that allowed non-general
>> regs arguments.
>>
>> 2015-07-08  Uros Bizjak  <ubizjak@gmail.com>
>>
>>     PR target/66814
>>     * config/i386/predicates.md (nonimmediate_gr_operand): New predicate.
>>     * config/i386/i386.md (not peephole2): Use nonimmediate_gr_operand.
>>     (varous peephole2s): Use {GENERAL,SSE,MMX}_REGNO_P instead of
>>     {GENERAL_SSE_MMX}_REG_P where appropriate.
>
> This patch breaks bootstrap with rtl checking on x86_64-linux.
>
> ../../gcc/tree-vect-loop.c: In function ‘void calc_vec_perm_mask_for_shift(machine_mode, unsigned int, unsigned char*)’:
> ../../gcc/tree-vect-loop.c:3190:1: internal compiler error: RTL check: expected code 'reg', have 'subreg' in rhs_regno, at rtl.h:1782
>  }
>  ^
> 0x11a44ef rtl_check_failed_code1(rtx_def const*, rtx_code, char const*, int, char const*)
>         ../../gcc/rtl.c:786
> 0x19e917c rhs_regno
>         ../../gcc/rtl.h:1782
> 0x1d20eea peephole2_10
>         ../../gcc/config/i386/i386.md:17699
> 0x1d2ccc8 peephole2_insns(rtx_def*, rtx_insn*, int*)
>         ../../gcc/config/i386/i386.md:5050
> 0x11190e0 peephole2_optimize
>         ../../gcc/recog.c:3627
> 0x111a81f rest_of_handle_peephole2
>         ../../gcc/recog.c:3807
> 0x111a8d4 execute
>         ../../gcc/recog.c:3841
>
> register_operand allows a SUBREG even after a reload, as long as
> it is a SUBREG of a REG_P.
>
> insn in question is:
> (insn 75 71 76 2 (set (subreg:SI (reg:DI 2 cx) 0)
>         (const_int -1 [0xffffffffffffffff])) ../../gcc/tree-vect-loop.c:3189 -1
>      (nil))
>
> and has been created by the:
> ;; After splitting up read-modify operations, array accesses with memory
> ;; operands might end up in form:
> ;;  sall    $2, %eax
> ;;  movl    4(%esp), %edx
> ;;  addl    %edx, %eax
> ;; instead of pre-splitting:
> ;;  sall    $2, %eax
> ;;  addl    4(%esp), %eax
> ;; Turn it into:
> ;;  movl    4(%esp), %edx
> ;;  leal    (%edx,%eax,4), %eax
> peephole2.
> Wonder if
>   if (mode != word_mode)
>     operands[1] = gen_rtx_SUBREG (mode, operands[1], 0);
>   if (op1mode != word_mode)
>     operands[5] = gen_rtx_SUBREG (op1mode, operands[5], 0);
> should not have been gen_lowpart instead of gen_rtx_SUBREG.
> In any case, it wouldn't surprise me if there aren't many other ways
> how to get a SUBREG in there.

I was under impression that peephole2 pass doesn't see subregs of hard
regs (all x86 predicates are written in this way). Even documentation
somehow agrees with this:

'(subreg:M1 REG:M2 BYTENUM)'

    'subreg' expressions are used to refer to a register in a machine
    mode other than its natural one, or to refer to one register of a
    multi-part 'reg' that actually refers to several registers.

    Each pseudo register has a natural mode.  If it is necessary to
    operate on it in a different mode, the register must be enclosed in
    a 'subreg'.

    There are currently three supported types for the first operand of
    a 'subreg':

[...]

       * hard registers It is seldom necessary to wrap hard registers
         in 'subreg's; such registers would normally reduce to a single
         'reg' rtx.  This use of 'subreg's is discouraged and may not
         be supported in the future.

So, I'd say that generating naked SUBREG after reload should be
avoided and gen_lowpart should be used in the code above.

Uros.


>         Jakub
Jakub Jelinek July 9, 2015, 8:17 p.m. UTC | #3
On Thu, Jul 09, 2015 at 10:13:49PM +0200, Uros Bizjak wrote:
> I was under impression that peephole2 pass doesn't see subregs of hard
> regs (all x86 predicates are written in this way). Even documentation
> somehow agrees with this:
> 
> '(subreg:M1 REG:M2 BYTENUM)'
> 
>     'subreg' expressions are used to refer to a register in a machine
>     mode other than its natural one, or to refer to one register of a
>     multi-part 'reg' that actually refers to several registers.
> 
>     Each pseudo register has a natural mode.  If it is necessary to
>     operate on it in a different mode, the register must be enclosed in
>     a 'subreg'.
> 
>     There are currently three supported types for the first operand of
>     a 'subreg':
> 
> [...]
> 
>        * hard registers It is seldom necessary to wrap hard registers
>          in 'subreg's; such registers would normally reduce to a single
>          'reg' rtx.  This use of 'subreg's is discouraged and may not
>          be supported in the future.
> 
> So, I'd say that generating naked SUBREG after reload should be
> avoided and gen_lowpart should be used in the code above.

There is also:
  emit_insn (gen_sse2_loadld (operands[3], CONST0_RTX (V4SImode),
                              gen_rtx_SUBREG (SImode, operands[1], 0)));
  emit_insn (gen_sse2_loadld (operands[4], CONST0_RTX (V4SImode),
                              gen_rtx_SUBREG (SImode, operands[1], 4)));
in some splitters (also post-reload).

	Jakub
Uros Bizjak July 9, 2015, 8:23 p.m. UTC | #4
On Thu, Jul 9, 2015 at 10:17 PM, Jakub Jelinek <jakub@redhat.com> wrote:
> On Thu, Jul 09, 2015 at 10:13:49PM +0200, Uros Bizjak wrote:
>> I was under impression that peephole2 pass doesn't see subregs of hard
>> regs (all x86 predicates are written in this way). Even documentation
>> somehow agrees with this:
>>
>> '(subreg:M1 REG:M2 BYTENUM)'
>>
>>     'subreg' expressions are used to refer to a register in a machine
>>     mode other than its natural one, or to refer to one register of a
>>     multi-part 'reg' that actually refers to several registers.
>>
>>     Each pseudo register has a natural mode.  If it is necessary to
>>     operate on it in a different mode, the register must be enclosed in
>>     a 'subreg'.
>>
>>     There are currently three supported types for the first operand of
>>     a 'subreg':
>>
>> [...]
>>
>>        * hard registers It is seldom necessary to wrap hard registers
>>          in 'subreg's; such registers would normally reduce to a single
>>          'reg' rtx.  This use of 'subreg's is discouraged and may not
>>          be supported in the future.
>>
>> So, I'd say that generating naked SUBREG after reload should be
>> avoided and gen_lowpart should be used in the code above.
>
> There is also:
>   emit_insn (gen_sse2_loadld (operands[3], CONST0_RTX (V4SImode),
>                               gen_rtx_SUBREG (SImode, operands[1], 0)));
>   emit_insn (gen_sse2_loadld (operands[4], CONST0_RTX (V4SImode),
>                               gen_rtx_SUBREG (SImode, operands[1], 4)));
> in some splitters (also post-reload).

These won't hit the peephole2, changed by the patch, but let's fix
these before they start to break.

Uros.
diff mbox

Patch

Index: testsuite/gcc.target/i386/pr66814.c
===================================================================
--- testsuite/gcc.target/i386/pr66814.c	(revision 0)
+++ testsuite/gcc.target/i386/pr66814.c	(revision 0)
@@ -0,0 +1,4 @@ 
+/* { dg-do compile { target { ia32 } } } */
+/* { dg-options "-march=i586 -mavx512f -O2" } */
+
+#include "avx512f-klogic-2.c"
Index: config/i386/i386.md
===================================================================
--- config/i386/i386.md	(revision 225599)
+++ config/i386/i386.md	(working copy)
@@ -17379,8 +17379,8 @@ 
 ;; lifetime information then.
 
 (define_peephole2
-  [(set (match_operand:SWI124 0 "nonimmediate_operand")
-	(not:SWI124 (match_operand:SWI124 1 "nonimmediate_operand")))]
+  [(set (match_operand:SWI124 0 "nonimmediate_gr_operand")
+	(not:SWI124 (match_operand:SWI124 1 "nonimmediate_gr_operand")))]
   "optimize_insn_for_speed_p ()
    && ((TARGET_NOT_UNPAIRABLE
 	&& (!MEM_P (operands[0])
@@ -17524,8 +17524,10 @@ 
                      [(match_dup 0)
                       (match_operand 2 "memory_operand")]))]
   "REGNO (operands[0]) != REGNO (operands[1])
-   && ((MMX_REG_P (operands[0]) && MMX_REG_P (operands[1])) 
-       || (SSE_REG_P (operands[0]) && SSE_REG_P (operands[1])))"
+   && ((MMX_REGNO_P (REGNO (operands[0]))
+        && MMX_REGNO_P (REGNO (operands[1]))) 
+       || (SSE_REGNO_P (REGNO (operands[0]))
+           && SSE_REGNO_P (REGNO (operands[1]))))"
   [(set (match_dup 0) (match_dup 2))
    (set (match_dup 0)
         (match_op_dup 3 [(match_dup 0) (match_dup 1)]))])
@@ -17673,7 +17675,7 @@ 
 	(match_operand 1 "const0_operand"))]
   "GET_MODE_SIZE (GET_MODE (operands[0])) <= UNITS_PER_WORD
    && (! TARGET_USE_MOV0 || optimize_insn_for_size_p ())
-   && GENERAL_REG_P (operands[0])
+   && GENERAL_REGNO_P (REGNO (operands[0]))
    && peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (match_dup 0) (const_int 0))
 	      (clobber (reg:CC FLAGS_REG))])]
@@ -17694,6 +17696,7 @@ 
   [(set (match_operand:SWI248 0 "register_operand")
 	(const_int -1))]
   "(optimize_insn_for_size_p () || TARGET_MOVE_M1_VIA_OR)
+   && GENERAL_REGNO_P (REGNO (operands[0]))
    && peep2_regno_dead_p (0, FLAGS_REG)"
   [(parallel [(set (match_dup 0) (const_int -1))
 	      (clobber (reg:CC FLAGS_REG))])]
Index: config/i386/predicates.md
===================================================================
--- config/i386/predicates.md	(revision 225599)
+++ config/i386/predicates.md	(working copy)
@@ -37,6 +37,12 @@ 
   (and (match_code "reg")
        (match_test "GENERAL_REGNO_P (REGNO (op))")))
 
+;; True if the operand is a nonimmediate operand with GENERAL class register.
+(define_predicate "nonimmediate_gr_operand"
+  (if_then_else (match_code "reg")
+    (match_test "GENERAL_REGNO_P (REGNO (op))")
+    (match_operand 0 "nonimmediate_operand")))
+
 ;; Return true if OP is a register operand other than an i387 fp register.
 (define_predicate "register_and_not_fp_reg_operand"
   (and (match_code "reg")