ARM: Don't clobber CC reg when it is live after the peephole window

Submitted by Meador Inge on May 29, 2013, 5:15 p.m.

Details

Message ID 1369847707-8357-1-git-send-email-meadori@codesourcery.com
State New
Headers show

Commit Message

Meador Inge May 29, 2013, 5:15 p.m.
Hi All,

This patch fixes a bug in one of the ARM peephole2 optimizations.  The
peephole2 optimization in question was changed to use the CC-updating
form for all of the instructions produced by the peephole so that the
encoding will be smaller when compiling for thumb [1].  However, I don't
think that is always safe.

For example, the CC register might be used by something *after* the
peephole window.  The current peephole will transform:


    (insn:TI 7 49 18 2 (set (reg:CC 24 cc)
            (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
                (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn}
         (nil))

    (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc)
                (const_int 0 [0]))
            (set (reg:SI 3 r3 [140])
                (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp}
         (expr_list:REG_EQUIV (const_int 0 [0])
            (nil)))

    (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc)
                (const_int 0 [0]))
            (set (reg:SI 3 r3 [138])
                (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp}
         (expr_list:REG_EQUIV (const_int 1 [0x1])
            (nil)))

    (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc)
                (const_int 0 [0]))
            (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
                (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
         (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
            (nil)))

    (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
                (const_int 0 [0]))
            (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
                (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
         (nil))

    (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
                (const_int 0 [0]))
            (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
                (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
         (expr_list:REG_DEAD (reg:CC 24 cc)
            (expr_list:REG_DEAD (reg:QI 3 r3 [140])
                (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
                    (nil)))))

into the following:


    (insn 59 49 60 2 (parallel [
                (set (reg:CC 24 cc)
                    (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
                        (const_int 0 [0])))
                (set (reg:SI 1 r1)
                    (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
                        (const_int 0 [0])))
            ]) repro.c:6 -1
         (nil))

    (insn 60 59 61 2 (parallel [
                (set (reg:CC 24 cc)
                    (compare:CC (const_int 0 [0])
                        (reg:SI 1 r1)))
                (set (reg:SI 3 r3 [140])
                    (minus:SI (const_int 0 [0])
                        (reg:SI 1 r1)))
            ]) repro.c:6 -1
         (nil))

    (insn 61 60 19 2 (parallel [
                (set (reg:SI 3 r3 [140])
                    (plus:SI (plus:SI (reg:SI 3 r3 [140])
                            (reg:SI 1 r1))
                        (geu:SI (reg:CC 24 cc)
                            (const_int 0 [0]))))
                (clobber (reg:CC 24 cc))
            ]) repro.c:6 -1
         (nil))

    (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc)
                (const_int 0 [0]))
            (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
                (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
         (nil))

    (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
                (const_int 0 [0]))
            (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
                (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
         (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
            (nil)))

    (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
                (const_int 0 [0]))
            (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
                (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
         (expr_list:REG_DEAD (reg:CC 24 cc)
            (expr_list:REG_DEAD (reg:QI 3 r3 [140])
                (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
                    (nil)))))


This gets compiled into the incorrect sequence:

                       
    ldrb    r3, [r0, #0]
    ldr     r2, .L4
    subs    r1, r3, #0
    rsbs    r3, r1, #0
    adcs    r3, r3, r1
    strne   r3, [r2, #0]
    streq   r3, [r2, #0]
    strneb r3, [r0, #0]


The conditional stores are now dealing with an incorrect condition state.

This patch fixes the problem by ensuring that the CC reg is dead after the
peephole window for the current peephole definition and falls back on the
original pre-PR46975 peephole when it is live.  Unfortunately I had trouble
coming up with a reproduction case against mainline.  I only noticed the bug
while working with some local changes that exposed it.

Built and tested a full ARM GNU/Linux toolchain.  No regressions in the GCC
test suite.

OK?

gcc/

2013-05-29  Meador Inge  <meadori@codesourcery.com>

	* config/arm/arm.md (conditional move peephole2): Only clobber CC
	register when it is dead after the peephole window.

[1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html

Comments

Meador Inge June 5, 2013, 8:36 p.m.
Ping.

On 05/29/2013 12:15 PM, Meador Inge wrote:
> Hi All,
> 
> This patch fixes a bug in one of the ARM peephole2 optimizations.  The
> peephole2 optimization in question was changed to use the CC-updating
> form for all of the instructions produced by the peephole so that the
> encoding will be smaller when compiling for thumb [1].  However, I don't
> think that is always safe.
> 
> For example, the CC register might be used by something *after* the
> peephole window.  The current peephole will transform:
> 
> 
>     (insn:TI 7 49 18 2 (set (reg:CC 24 cc)
>             (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
>                 (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn}
>          (nil))
> 
>     (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc)
>                 (const_int 0 [0]))
>             (set (reg:SI 3 r3 [140])
>                 (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp}
>          (expr_list:REG_EQUIV (const_int 0 [0])
>             (nil)))
> 
>     (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc)
>                 (const_int 0 [0]))
>             (set (reg:SI 3 r3 [138])
>                 (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp}
>          (expr_list:REG_EQUIV (const_int 1 [0x1])
>             (nil)))
> 
>     (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc)
>                 (const_int 0 [0]))
>             (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
>                 (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
>          (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
>             (nil)))
> 
>     (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
>                 (const_int 0 [0]))
>             (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
>                 (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
>          (nil))
> 
>     (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
>                 (const_int 0 [0]))
>             (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
>                 (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
>          (expr_list:REG_DEAD (reg:CC 24 cc)
>             (expr_list:REG_DEAD (reg:QI 3 r3 [140])
>                 (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
>                     (nil)))))
> 
> into the following:
> 
> 
>     (insn 59 49 60 2 (parallel [
>                 (set (reg:CC 24 cc)
>                     (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
>                         (const_int 0 [0])))
>                 (set (reg:SI 1 r1)
>                     (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
>                         (const_int 0 [0])))
>             ]) repro.c:6 -1
>          (nil))
> 
>     (insn 60 59 61 2 (parallel [
>                 (set (reg:CC 24 cc)
>                     (compare:CC (const_int 0 [0])
>                         (reg:SI 1 r1)))
>                 (set (reg:SI 3 r3 [140])
>                     (minus:SI (const_int 0 [0])
>                         (reg:SI 1 r1)))
>             ]) repro.c:6 -1
>          (nil))
> 
>     (insn 61 60 19 2 (parallel [
>                 (set (reg:SI 3 r3 [140])
>                     (plus:SI (plus:SI (reg:SI 3 r3 [140])
>                             (reg:SI 1 r1))
>                         (geu:SI (reg:CC 24 cc)
>                             (const_int 0 [0]))))
>                 (clobber (reg:CC 24 cc))
>             ]) repro.c:6 -1
>          (nil))
> 
>     (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc)
>                 (const_int 0 [0]))
>             (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
>                 (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
>          (nil))
> 
>     (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
>                 (const_int 0 [0]))
>             (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
>                 (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
>          (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
>             (nil)))
> 
>     (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
>                 (const_int 0 [0]))
>             (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
>                 (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
>          (expr_list:REG_DEAD (reg:CC 24 cc)
>             (expr_list:REG_DEAD (reg:QI 3 r3 [140])
>                 (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
>                     (nil)))))
> 
> 
> This gets compiled into the incorrect sequence:
> 
>                        
>     ldrb    r3, [r0, #0]
>     ldr     r2, .L4
>     subs    r1, r3, #0
>     rsbs    r3, r1, #0
>     adcs    r3, r3, r1
>     strne   r3, [r2, #0]
>     streq   r3, [r2, #0]
>     strneb r3, [r0, #0]
> 
> 
> The conditional stores are now dealing with an incorrect condition state.
> 
> This patch fixes the problem by ensuring that the CC reg is dead after the
> peephole window for the current peephole definition and falls back on the
> original pre-PR46975 peephole when it is live.  Unfortunately I had trouble
> coming up with a reproduction case against mainline.  I only noticed the bug
> while working with some local changes that exposed it.
> 
> Built and tested a full ARM GNU/Linux toolchain.  No regressions in the GCC
> test suite.
> 
> OK?
> 
> gcc/
> 
> 2013-05-29  Meador Inge  <meadori@codesourcery.com>
> 
> 	* config/arm/arm.md (conditional move peephole2): Only clobber CC
> 	register when it is dead after the peephole window.
> 
> [1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html
> 
> Index: gcc/config/arm/arm.md
> ===================================================================
> --- gcc/config/arm/arm.md	(revision 199414)
> +++ gcc/config/arm/arm.md	(working copy)
> @@ -9978,29 +9978,48 @@
>  ;; Attempt to improve the sequence generated by the compare_scc splitters
>  ;; not to use conditional execution.
>  (define_peephole2
> -  [(set (reg:CC CC_REGNUM)
> +  [(set (match_operand 0 "cc_register" "")
>  	(compare:CC (match_operand:SI 1 "register_operand" "")
>  		    (match_operand:SI 2 "arm_rhs_operand" "")))
>     (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
> -	      (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
> +	      (set (match_operand:SI 3 "register_operand" "") (const_int 0)))
>     (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
> -	      (set (match_dup 0) (const_int 1)))
> -   (match_scratch:SI 3 "r")]
> -  "TARGET_32BIT"
> +	      (set (match_dup 3) (const_int 1)))
> +   (match_scratch:SI 4 "r")]
> +  "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])"
>    [(parallel
>      [(set (reg:CC CC_REGNUM)
>  	  (compare:CC (match_dup 1) (match_dup 2)))
> -     (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))])
> +     (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))])
>     (parallel
>      [(set (reg:CC CC_REGNUM)
> -	  (compare:CC (const_int 0) (match_dup 3)))
> -     (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))])
> +	  (compare:CC (const_int 0) (match_dup 4)))
> +     (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))])
>     (parallel
> -    [(set (match_dup 0)
> -	  (plus:SI (plus:SI (match_dup 0) (match_dup 3))
> +    [(set (match_dup 3)
> +	  (plus:SI (plus:SI (match_dup 3) (match_dup 4))
>  		   (geu:SI (reg:CC CC_REGNUM) (const_int 0))))
>       (clobber (reg:CC CC_REGNUM))])])
>  
> +(define_peephole2
> +  [(set (reg:CC CC_REGNUM)
> +	(compare:CC (match_operand:SI 0 "register_operand" "")
> +		    (match_operand:SI 1 "arm_rhs_operand" "")))
> +   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
> +	      (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
> +   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
> +	      (set (match_dup 2) (const_int 1)))
> +   (match_scratch:SI 3 "r")]
> +  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
> +  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
> +   (parallel
> +    [(set (reg:CC CC_REGNUM)
> +	  (compare:CC (const_int 0) (match_dup 3)))
> +     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
> +   (set (match_dup 2)
> +	(plus:SI (plus:SI (match_dup 2) (match_dup 3))
> +		 (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
> +
>  (define_insn "*cond_move"
>    [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
>  	(if_then_else:SI (match_operator 3 "equality_operator"
>
Richard Earnshaw June 6, 2013, 1:11 p.m.
On 29/05/13 18:15, Meador Inge wrote:
> Hi All,
>
> This patch fixes a bug in one of the ARM peephole2 optimizations.  The
> peephole2 optimization in question was changed to use the CC-updating
> form for all of the instructions produced by the peephole so that the
> encoding will be smaller when compiling for thumb [1].  However, I don't
> think that is always safe.
>
> For example, the CC register might be used by something *after* the
> peephole window.  The current peephole will transform:
>
>
>      (insn:TI 7 49 18 2 (set (reg:CC 24 cc)
>              (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
>                  (const_int 0 [0]))) repro.c:5 212 {*arm_cmpsi_insn}
>           (nil))
>
>      (insn:TI 18 7 11 2 (cond_exec (ne (reg:CC 24 cc)
>                  (const_int 0 [0]))
>              (set (reg:SI 3 r3 [140])
>                  (const_int 0 [0]))) repro.c:8 3366 {*p *arm_movsi_vfp}
>           (expr_list:REG_EQUIV (const_int 0 [0])
>              (nil)))
>
>      (insn 11 18 19 2 (cond_exec (eq (reg:CC 24 cc)
>                  (const_int 0 [0]))
>              (set (reg:SI 3 r3 [138])
>                  (const_int 1 [0x1]))) repro.c:6 3366 {*p *arm_movsi_vfp}
>           (expr_list:REG_EQUIV (const_int 1 [0x1])
>              (nil)))
>
>      (insn:TI 19 11 12 2 (cond_exec (ne (reg:CC 24 cc)
>                  (const_int 0 [0]))
>              (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
>                  (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
>           (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
>              (nil)))
>
>      (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
>                  (const_int 0 [0]))
>              (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
>                  (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
>           (nil))
>
>      (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
>                  (const_int 0 [0]))
>              (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
>                  (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
>           (expr_list:REG_DEAD (reg:CC 24 cc)
>              (expr_list:REG_DEAD (reg:QI 3 r3 [140])
>                  (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
>                      (nil)))))
>
> into the following:
>
>
>      (insn 59 49 60 2 (parallel [
>                  (set (reg:CC 24 cc)
>                      (compare:CC (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
>                          (const_int 0 [0])))
>                  (set (reg:SI 1 r1)
>                      (minus:SI (reg:SI 3 r3 [orig:136 *endname_1(D) ] [136])
>                          (const_int 0 [0])))
>              ]) repro.c:6 -1
>           (nil))
>
>      (insn 60 59 61 2 (parallel [
>                  (set (reg:CC 24 cc)
>                      (compare:CC (const_int 0 [0])
>                          (reg:SI 1 r1)))
>                  (set (reg:SI 3 r3 [140])
>                      (minus:SI (const_int 0 [0])
>                          (reg:SI 1 r1)))
>              ]) repro.c:6 -1
>           (nil))
>
>      (insn 61 60 19 2 (parallel [
>                  (set (reg:SI 3 r3 [140])
>                      (plus:SI (plus:SI (reg:SI 3 r3 [140])
>                              (reg:SI 1 r1))
>                          (geu:SI (reg:CC 24 cc)
>                              (const_int 0 [0]))))
>                  (clobber (reg:CC 24 cc))
>              ]) repro.c:6 -1
>           (nil))
>
>      (insn:TI 19 61 12 2 (cond_exec (ne (reg:CC 24 cc)
>                  (const_int 0 [0]))
>              (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
>                  (reg:SI 3 r3 [140]))) repro.c:8 3366 {*p *arm_movsi_vfp}
>           (nil))
>
>      (insn:TI 12 19 22 2 (cond_exec (eq (reg:CC 24 cc)
>                  (const_int 0 [0]))
>              (set (mem/c:SI (reg/f:SI 2 r2 [143]) [2 atend+0 S4 A32])
>                  (reg:SI 3 r3 [138]))) repro.c:6 3366 {*p *arm_movsi_vfp}
>           (expr_list:REG_DEAD (reg/f:SI 2 r2 [143])
>              (nil)))
>
>      (insn:TI 22 12 58 2 (cond_exec (ne (reg:CC 24 cc)
>                  (const_int 0 [0]))
>              (set (mem:QI (reg/v/f:SI 0 r0 [orig:135 endname ] [135]) [0 *endname_1(D)+0 S1 A8])
>                  (reg:QI 3 r3 [140]))) repro.c:9 3115 {*p *arm_movqi_insn}
>           (expr_list:REG_DEAD (reg:CC 24 cc)
>              (expr_list:REG_DEAD (reg:QI 3 r3 [140])
>                  (expr_list:REG_DEAD (reg/v/f:SI 0 r0 [orig:135 endname ] [135])
>                      (nil)))))
>
>
> This gets compiled into the incorrect sequence:
>
>
>      ldrb    r3, [r0, #0]
>      ldr     r2, .L4
>      subs    r1, r3, #0
>      rsbs    r3, r1, #0
>      adcs    r3, r3, r1
>      strne   r3, [r2, #0]
>      streq   r3, [r2, #0]
>      strneb r3, [r0, #0]
>
>
> The conditional stores are now dealing with an incorrect condition state.
>
> This patch fixes the problem by ensuring that the CC reg is dead after the
> peephole window for the current peephole definition and falls back on the
> original pre-PR46975 peephole when it is live.  Unfortunately I had trouble
> coming up with a reproduction case against mainline.  I only noticed the bug
> while working with some local changes that exposed it.
>
> Built and tested a full ARM GNU/Linux toolchain.  No regressions in the GCC
> test suite.
>
> OK?
>
> gcc/
>
> 2013-05-29  Meador Inge  <meadori@codesourcery.com>
>
> 	* config/arm/arm.md (conditional move peephole2): Only clobber CC
> 	register when it is dead after the peephole window.
>
> [1] http://gcc.gnu.org/ml/gcc-patches/2010-12/msg01336.html
>
> Index: gcc/config/arm/arm.md
> ===================================================================
> --- gcc/config/arm/arm.md	(revision 199414)
> +++ gcc/config/arm/arm.md	(working copy)
> @@ -9978,29 +9978,48 @@
>   ;; Attempt to improve the sequence generated by the compare_scc splitters
>   ;; not to use conditional execution.
>   (define_peephole2
> -  [(set (reg:CC CC_REGNUM)
> +  [(set (match_operand 0 "cc_register" "")
>   	(compare:CC (match_operand:SI 1 "register_operand" "")
>   		    (match_operand:SI 2 "arm_rhs_operand" "")))
>      (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
> -	      (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
> +	      (set (match_operand:SI 3 "register_operand" "") (const_int 0)))
>      (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
> -	      (set (match_dup 0) (const_int 1)))
> -   (match_scratch:SI 3 "r")]
> -  "TARGET_32BIT"
> +	      (set (match_dup 3) (const_int 1)))
> +   (match_scratch:SI 4 "r")]
> +  "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])"
>     [(parallel
>       [(set (reg:CC CC_REGNUM)
>   	  (compare:CC (match_dup 1) (match_dup 2)))
> -     (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))])
> +     (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))])
>      (parallel
>       [(set (reg:CC CC_REGNUM)
> -	  (compare:CC (const_int 0) (match_dup 3)))
> -     (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))])
> +	  (compare:CC (const_int 0) (match_dup 4)))
> +     (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))])
>      (parallel
> -    [(set (match_dup 0)
> -	  (plus:SI (plus:SI (match_dup 0) (match_dup 3))
> +    [(set (match_dup 3)
> +	  (plus:SI (plus:SI (match_dup 3) (match_dup 4))
>   		   (geu:SI (reg:CC CC_REGNUM) (const_int 0))))
>        (clobber (reg:CC CC_REGNUM))])])
>

I understand (and agree with) this bit...

> +(define_peephole2
> +  [(set (reg:CC CC_REGNUM)
> +	(compare:CC (match_operand:SI 0 "register_operand" "")
> +		    (match_operand:SI 1 "arm_rhs_operand" "")))
> +   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
> +	      (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
> +   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
> +	      (set (match_dup 2) (const_int 1)))
> +   (match_scratch:SI 3 "r")]
> +  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
> +  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
> +   (parallel
> +    [(set (reg:CC CC_REGNUM)
> +	  (compare:CC (const_int 0) (match_dup 3)))
> +     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
> +   (set (match_dup 2)
> +	(plus:SI (plus:SI (match_dup 2) (match_dup 3))
> +		 (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
> +

... but what's this bit about?

R.
Meador Inge June 6, 2013, 6:23 p.m.
On 06/06/2013 08:11 AM, Richard Earnshaw wrote:

> I understand (and agree with) this bit...
> 
>> +(define_peephole2
>> +  [(set (reg:CC CC_REGNUM)
>> +    (compare:CC (match_operand:SI 0 "register_operand" "")
>> +            (match_operand:SI 1 "arm_rhs_operand" "")))
>> +   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
>> +          (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
>> +   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
>> +          (set (match_dup 2) (const_int 1)))
>> +   (match_scratch:SI 3 "r")]
>> +  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
>> +  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
>> +   (parallel
>> +    [(set (reg:CC CC_REGNUM)
>> +      (compare:CC (const_int 0) (match_dup 3)))
>> +     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
>> +   (set (match_dup 2)
>> +    (plus:SI (plus:SI (match_dup 2) (match_dup 3))
>> +         (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
>> +
> 
> ... but what's this bit about?

The original intent was to revert back to the original peephole pattern
(pre-PR 46975) when the CC reg is still live, but that doesn't properly
maintain the CC state either (it just happened to pass in the test
case I was looking at because I only cared about the Z flag, which is
maintained the same).

OK with the above bit left out?
Meador Inge June 11, 2013, 4:47 a.m.
On 06/06/2013 01:23 PM, Meador Inge wrote:

> On 06/06/2013 08:11 AM, Richard Earnshaw wrote:
> 
>> I understand (and agree with) this bit...
>>
>>> +(define_peephole2
>>> +  [(set (reg:CC CC_REGNUM)
>>> +    (compare:CC (match_operand:SI 0 "register_operand" "")
>>> +            (match_operand:SI 1 "arm_rhs_operand" "")))
>>> +   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
>>> +          (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
>>> +   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
>>> +          (set (match_dup 2) (const_int 1)))
>>> +   (match_scratch:SI 3 "r")]
>>> +  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
>>> +  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
>>> +   (parallel
>>> +    [(set (reg:CC CC_REGNUM)
>>> +      (compare:CC (const_int 0) (match_dup 3)))
>>> +     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
>>> +   (set (match_dup 2)
>>> +    (plus:SI (plus:SI (match_dup 2) (match_dup 3))
>>> +         (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
>>> +
>>
>> ... but what's this bit about?
> 
> The original intent was to revert back to the original peephole pattern
> (pre-PR 46975) when the CC reg is still live, but that doesn't properly
> maintain the CC state either (it just happened to pass in the test
> case I was looking at because I only cared about the Z flag, which is
> maintained the same).
> 
> OK with the above bit left out?

OK?
Meador Inge June 18, 2013, 4:22 p.m.
Ping.

On 06/06/2013 01:23 PM, Meador Inge wrote:
> On 06/06/2013 08:11 AM, Richard Earnshaw wrote:
> 
>> I understand (and agree with) this bit...
>>
>>> +(define_peephole2
>>> +  [(set (reg:CC CC_REGNUM)
>>> +    (compare:CC (match_operand:SI 0 "register_operand" "")
>>> +            (match_operand:SI 1 "arm_rhs_operand" "")))
>>> +   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
>>> +          (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
>>> +   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
>>> +          (set (match_dup 2) (const_int 1)))
>>> +   (match_scratch:SI 3 "r")]
>>> +  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
>>> +  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
>>> +   (parallel
>>> +    [(set (reg:CC CC_REGNUM)
>>> +      (compare:CC (const_int 0) (match_dup 3)))
>>> +     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
>>> +   (set (match_dup 2)
>>> +    (plus:SI (plus:SI (match_dup 2) (match_dup 3))
>>> +         (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
>>> +
>>
>> ... but what's this bit about?
> 
> The original intent was to revert back to the original peephole pattern
> (pre-PR 46975) when the CC reg is still live, but that doesn't properly
> maintain the CC state either (it just happened to pass in the test
> case I was looking at because I only cared about the Z flag, which is
> maintained the same).
> 
> OK with the above bit left out?
>

Patch hide | download patch | download mbox

Index: gcc/config/arm/arm.md
===================================================================
--- gcc/config/arm/arm.md	(revision 199414)
+++ gcc/config/arm/arm.md	(working copy)
@@ -9978,29 +9978,48 @@ 
 ;; Attempt to improve the sequence generated by the compare_scc splitters
 ;; not to use conditional execution.
 (define_peephole2
-  [(set (reg:CC CC_REGNUM)
+  [(set (match_operand 0 "cc_register" "")
 	(compare:CC (match_operand:SI 1 "register_operand" "")
 		    (match_operand:SI 2 "arm_rhs_operand" "")))
    (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
-	      (set (match_operand:SI 0 "register_operand" "") (const_int 0)))
+	      (set (match_operand:SI 3 "register_operand" "") (const_int 0)))
    (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
-	      (set (match_dup 0) (const_int 1)))
-   (match_scratch:SI 3 "r")]
-  "TARGET_32BIT"
+	      (set (match_dup 3) (const_int 1)))
+   (match_scratch:SI 4 "r")]
+  "TARGET_32BIT && peep2_reg_dead_p (3, operands[0])"
   [(parallel
     [(set (reg:CC CC_REGNUM)
 	  (compare:CC (match_dup 1) (match_dup 2)))
-     (set (match_dup 3) (minus:SI (match_dup 1) (match_dup 2)))])
+     (set (match_dup 4) (minus:SI (match_dup 1) (match_dup 2)))])
    (parallel
     [(set (reg:CC CC_REGNUM)
-	  (compare:CC (const_int 0) (match_dup 3)))
-     (set (match_dup 0) (minus:SI (const_int 0) (match_dup 3)))])
+	  (compare:CC (const_int 0) (match_dup 4)))
+     (set (match_dup 3) (minus:SI (const_int 0) (match_dup 4)))])
    (parallel
-    [(set (match_dup 0)
-	  (plus:SI (plus:SI (match_dup 0) (match_dup 3))
+    [(set (match_dup 3)
+	  (plus:SI (plus:SI (match_dup 3) (match_dup 4))
 		   (geu:SI (reg:CC CC_REGNUM) (const_int 0))))
      (clobber (reg:CC CC_REGNUM))])])
 
+(define_peephole2
+  [(set (reg:CC CC_REGNUM)
+	(compare:CC (match_operand:SI 0 "register_operand" "")
+		    (match_operand:SI 1 "arm_rhs_operand" "")))
+   (cond_exec (ne (reg:CC CC_REGNUM) (const_int 0))
+	      (set (match_operand:SI 2 "register_operand" "") (const_int 0)))
+   (cond_exec (eq (reg:CC CC_REGNUM) (const_int 0))
+	      (set (match_dup 2) (const_int 1)))
+   (match_scratch:SI 3 "r")]
+  "TARGET_32BIT && !peep2_reg_dead_p (3, operands[0])"
+  [(set (match_dup 3) (minus:SI (match_dup 0) (match_dup 1)))
+   (parallel
+    [(set (reg:CC CC_REGNUM)
+	  (compare:CC (const_int 0) (match_dup 3)))
+     (set (match_dup 2) (minus:SI (const_int 0) (match_dup 3)))])
+   (set (match_dup 2)
+	(plus:SI (plus:SI (match_dup 2) (match_dup 3))
+		 (geu:SI (reg:CC CC_REGNUM) (const_int 0))))])
+
 (define_insn "*cond_move"
   [(set (match_operand:SI 0 "s_register_operand" "=r,r,r")
 	(if_then_else:SI (match_operator 3 "equality_operator"