Message ID | 4F3E927C.5060802@gjlay.de |
---|---|
State | New |
Headers | show |
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00956.html Georg-Johann Lay wrote: > Spill failure PR52148 occurs for movmem insn that allocates 2 of AVR's 3 > pointer registers. Register allocator is at it's limits and the patch tries to > cure the situation by replacing > > (match_operand:HI 0 "register_operand" "x") > > with explicit > > (reg:HI REG_X) > > and similar for Z Register classes "x" and "z" contain only one HI register. > > This PR and PR50925 show that register allocator has some problems. > Even though this patch is not a fix of the root cause, it allows the PR's test > case to compile. > > Anyways, the patch simplifies the backend and replaces an insn with 11(!) > operands with an insn with only 2 operands so that the patch is improvement of > the backend. > > The hard registers are already known at expand time so there is no need for > match_operands. > > Passes without regression. > > Ok for trunk? > > Johann > > PR target/52148 > * config/avr/avr.md (movmem_<mode>): Replace match_operand that > match only one single hard register with respective hard reg rtx. > (movmemx_<mode>): Ditto. > * config/avr/avr.c (avr_emit_movmemhi): Adapt expanding to new > insn anatomy of movmem[x]_<mode>. > (avr_out_movmem): Same for printing assembler and operand usage.
2012/2/24 Georg-Johann Lay <avr@gjlay.de>: > http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00956.html > > Georg-Johann Lay wrote: >> Spill failure PR52148 occurs for movmem insn that allocates 2 of AVR's 3 >> pointer registers. Register allocator is at it's limits and the patch tries to >> cure the situation by replacing >> >> (match_operand:HI 0 "register_operand" "x") >> >> with explicit >> >> (reg:HI REG_X) >> >> and similar for Z Register classes "x" and "z" contain only one HI register. >> >> This PR and PR50925 show that register allocator has some problems. >> Even though this patch is not a fix of the root cause, it allows the PR's test >> case to compile. >> >> Anyways, the patch simplifies the backend and replaces an insn with 11(!) >> operands with an insn with only 2 operands so that the patch is improvement of >> the backend. >> >> The hard registers are already known at expand time so there is no need for >> match_operands. >> >> Passes without regression. >> >> Ok for trunk? >> >> Johann >> >> PR target/52148 >> * config/avr/avr.md (movmem_<mode>): Replace match_operand that >> match only one single hard register with respective hard reg rtx. >> (movmemx_<mode>): Ditto. >> * config/avr/avr.c (avr_emit_movmemhi): Adapt expanding to new >> insn anatomy of movmem[x]_<mode>. >> (avr_out_movmem): Same for printing assembler and operand usage. Approved. Denis. PS: I have not understood this patch from first look
Denis Chertykov schrieb: > 2012/2/24 Georg-Johann Lay: > >>http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00956.html >> >>Georg-Johann Lay wrote: >> >>>Spill failure PR52148 occurs for movmem insn that allocates 2 of AVR's 3 >>>pointer registers. Register allocator is at it's limits and the patch tries to >>>cure the situation by replacing >>> >>>(match_operand:HI 0 "register_operand" "x") >>> >>>with explicit >>> >>>(reg:HI REG_X) >>> >>>and similar for Z Register classes "x" and "z" contain only one HI register. >>> >>>This PR and PR50925 show that register allocator has some problems. >>>Even though this patch is not a fix of the root cause, it allows the PR's test >>>case to compile. >>> >>>Anyways, the patch simplifies the backend and replaces an insn with 11(!) >>>operands with an insn with only 2 operands so that the patch is improvement of >>>the backend. >>> >>>The hard registers are already known at expand time so there is no need for >>>match_operands. >>> >>>Passes without regression. >>> >>>Ok for trunk? >>> >>>Johann >>> >>> PR target/52148 >>> * config/avr/avr.md (movmem_<mode>): Replace match_operand that >>> match only one single hard register with respective hard reg rtx. >>> (movmemx_<mode>): Ditto. >>> * config/avr/avr.c (avr_emit_movmemhi): Adapt expanding to new >>> insn anatomy of movmem[x]_<mode>. >>> (avr_out_movmem): Same for printing assembler and operand usage. > > Approved. > > Denis. > > PS: I have not understood this patch from first look In 4.6, movmemhi expands the copy loop explicitly: /* Move one byte into scratch and inc pointer. */ emit_move_insn (tmp_reg_rtx, gen_rtx_MEM (QImode, addr1)); emit_move_insn (addr1, gen_rtx_PLUS (Pmode, addr1, const1_rtx)); /* Move to mem and inc pointer. */ emit_move_insn (gen_rtx_MEM (QImode, addr0), tmp_reg_rtx); emit_move_insn (addr0, gen_rtx_PLUS (Pmode, addr0, const1_rtx)); tmp_reg is fixed and used implicitly by insn templates; the assertion is that the lifetime of tmp_reg does never extend over one insn. The 4.6 movmemhi violates that assertion, and in 4.7 I saw situations where reload generates insns that use (i.e. clobber) tmp_reg so that explicit expanding is wrong. This happend together with movmemhi on address spaces: /* FIXME: Register allocator does a bad job and might spill address register(s) inside the loop leading to additional move instruction to/from stack which could clobber tmp_reg. Thus, do *not* emit load and store as seperate insns. Instead, we perform the copy by means of one monolithic insn. */ To fix it, I put all the movmemhi stuff into an unspec insn, but that insn is not nice: The address-space version has 11 operands to describe the insns action and the ordinary version 7 operands: ;; $0, $4 : & dest (REG_X) ;; $1, $5 : & src (REG_Z) ;; $2 : Address Space ;; $3, $7 : Loop register ;; $6 : Scratch register ;; "movmem_qi" ;; "movmem_hi" (define_insn "movmem_<mode>" [(set (mem:BLK (match_operand:HI 0 "register_operand" "x")) (mem:BLK (match_operand:HI 1 "register_operand" "z"))) (unspec [(match_operand:QI 2 "const_int_operand" "n")] UNSPEC_MOVMEM) (use (match_operand:QIHI 3 "register_operand" "<MOVMEM_r_d>")) (clobber (match_operand:HI 4 "register_operand" "=0")) (clobber (match_operand:HI 5 "register_operand" "=1")) (clobber (match_operand:QI 6 "register_operand" "=&r")) (clobber (match_operand:QIHI 7 "register_operand" "=3"))] "" { return avr_out_movmem (insn, operands, NULL); } [(set_attr "adjust_len" "movmem") (set_attr "cc" "clobber")]) Even more issues are that the expand machinery expects a specific style for movmem patterns (or was it clrstrhi?) and that with "free" register assignments, i.e. with constrints "b" for the addresses, I saw spill fails. Outcome was the pattern above, that only allows one specific allocation of the pointer registers involved; similar for the address space version: ;; Ditto and ;; $3, $7 : Loop register = R24 ;; $8, $9 : hh8 (& src) = R23 ;; $10 : RAMPZ_ADDR ;; "movmemx_qi" ;; "movmemx_hi" (define_insn "movmemx_<mode>" [(set (mem:BLK (match_operand:HI 0 "register_operand" "x")) (mem:BLK (lo_sum:PSI (match_operand:QI 8 "register_operand" "r") (match_operand:HI 1 "register_operand" "z")))) (unspec [(match_operand:QI 2 "const_int_operand""n")] UNSPEC_MOVMEM) (use (match_operand:QIHI 3 "register_operand" "w")) (clobber (match_operand:HI 4 "register_operand" "=0")) (clobber (match_operand:HI 5 "register_operand" "=1")) (clobber (match_operand:QI 6 "register_operand" "=&r")) (clobber (match_operand:HI 7 "register_operand" "=3")) (clobber (match_operand:QI 9 "register_operand" "=8")) (clobber (mem:QI (match_operand:QI 10 "io_address_operand" "n")))] "" "%~call __movmemx_<mode>" [(set_attr "type" "xcall") (set_attr "cc" "clobber")]) With these patterns the test suite passed but are were more FIXMEs in the source: /* FIXME: Register allocator might come up with spill fails if it is left on its own. Thus, we allocate the pointer registers by hand: Z = source address X = destination address */ However, there is this PR now and I used it to clean up the code. As the intention is to hard-assign the pointer registers, anyway, there is no need to have operands for them. Obviously, reload tries to spill some of the addresses which is no good idea. As said, this is not a fix of the root cause of the spill fail like PR50925, which is still open as you know. But this small patch removes the spill fail for the test case at least, and in itself it is a code clean-up that I preferred over the original version. The pattern for the address spaces is now as simple as ;; $0 : Address Space ;; $1 : RAMPZ RAM address ;; R24 : #bytes and loop register ;; R23:Z : 24-bit source address ;; R26 : 16-bit destination address ;; "movmemx_qi" ;; "movmemx_hi" (define_insn "movmemx_<mode>" [(set (mem:BLK (reg:HI REG_X)) (mem:BLK (lo_sum:PSI (reg:QI 23) (reg:HI REG_Z)))) (unspec [(match_operand:QI 0 "const_int_operand" "n")] UNSPEC_MOVMEM) (use (reg:QIHI 24)) (clobber (reg:HI REG_X)) (clobber (reg:HI REG_Z)) (clobber (reg:QI LPM_REGNO)) (clobber (reg:HI 24)) (clobber (reg:QI 23)) (clobber (mem:QI (match_operand:QI 1 "io_address_operand" "n")))] "" "%~call __movmemx_<mode>" [(set_attr "type" "xcall") Johann
2012/2/25 Georg-Johann Lay <avr@gjlay.de>: [...] > The pattern for the address spaces is now as simple as > > ;; $0 : Address Space > ;; $1 : RAMPZ RAM address > ;; R24 : #bytes and loop register > ;; R23:Z : 24-bit source address > ;; R26 : 16-bit destination address > > ;; "movmemx_qi" > ;; "movmemx_hi" > (define_insn "movmemx_<mode>" > [(set (mem:BLK (reg:HI REG_X)) > (mem:BLK (lo_sum:PSI (reg:QI 23) > (reg:HI REG_Z)))) > (unspec [(match_operand:QI 0 "const_int_operand" "n")] > UNSPEC_MOVMEM) > (use (reg:QIHI 24)) > (clobber (reg:HI REG_X)) > (clobber (reg:HI REG_Z)) > (clobber (reg:QI LPM_REGNO)) > (clobber (reg:HI 24)) > (clobber (reg:QI 23)) > (clobber (mem:QI (match_operand:QI 1 "io_address_operand" "n")))] > "" > "%~call __movmemx_<mode>" > [(set_attr "type" "xcall") Thanks for explanation. This technique similar to mulhi3, mulhi3_call ... IMHO this is a better solution then very strict constraint 'x' or 'y' or 'z'. I.e. better to have (at least from reload's point of view): mov r26, "r" mov r27, "r" r30,r31 = do-something-with r26, r27 mov "r", r30 mov "r", r31 Than: "z" = do-something-with "x" Denis.
Index: config/avr/avr.md =================================================================== --- config/avr/avr.md (revision 184269) +++ config/avr/avr.md (working copy) @@ -461,6 +461,9 @@ (define_insn "xload_8" (set_attr "isa" "lpmx,lpm") (set_attr "cc" "none")]) +;; R21:Z : 24-bit source address +;; R22 : 1-4 byte output + ;; "xload_qi_libgcc" ;; "xload_hi_libgcc" ;; "xload_psi_libgcc" @@ -841,24 +844,23 @@ (define_expand "movmemhi" (define_mode_attr MOVMEM_r_d [(QI "r") (HI "wd")]) -;; $0, $4 : & dest (REG_X) -;; $1, $5 : & src (REG_Z) -;; $2 : Address Space -;; $3, $7 : Loop register -;; $6 : Scratch register +;; $0 : Address Space +;; $1, $2 : Loop register +;; R30 : source address +;; R26 : destination address ;; "movmem_qi" ;; "movmem_hi" (define_insn "movmem_<mode>" - [(set (mem:BLK (match_operand:HI 0 "register_operand" "x")) - (mem:BLK (match_operand:HI 1 "register_operand" "z"))) - (unspec [(match_operand:QI 2 "const_int_operand" "n")] + [(set (mem:BLK (reg:HI REG_X)) + (mem:BLK (reg:HI REG_Z))) + (unspec [(match_operand:QI 0 "const_int_operand" "n")] UNSPEC_MOVMEM) - (use (match_operand:QIHI 3 "register_operand" "<MOVMEM_r_d>")) - (clobber (match_operand:HI 4 "register_operand" "=0")) - (clobber (match_operand:HI 5 "register_operand" "=1")) - (clobber (match_operand:QI 6 "register_operand" "=&r")) - (clobber (match_operand:QIHI 7 "register_operand" "=3"))] + (use (match_operand:QIHI 1 "register_operand" "<MOVMEM_r_d>")) + (clobber (reg:HI REG_X)) + (clobber (reg:HI REG_Z)) + (clobber (reg:QI LPM_REGNO)) + (clobber (match_operand:QIHI 2 "register_operand" "=1"))] "" { return avr_out_movmem (insn, operands, NULL); @@ -866,26 +868,28 @@ (define_insn "movmem_<mode>" [(set_attr "adjust_len" "movmem") (set_attr "cc" "clobber")]) -;; Ditto and -;; $3, $7 : Loop register = R24 -;; $8, $9 : hh8 (& src) = R23 -;; $10 : RAMPZ_ADDR + +;; $0 : Address Space +;; $1 : RAMPZ RAM address +;; R24 : #bytes and loop register +;; R23:Z : 24-bit source address +;; R26 : 16-bit destination address ;; "movmemx_qi" ;; "movmemx_hi" (define_insn "movmemx_<mode>" - [(set (mem:BLK (match_operand:HI 0 "register_operand" "x")) - (mem:BLK (lo_sum:PSI (match_operand:QI 8 "register_operand" "r") - (match_operand:HI 1 "register_operand" "z")))) - (unspec [(match_operand:QI 2 "const_int_operand" "n")] + [(set (mem:BLK (reg:HI REG_X)) + (mem:BLK (lo_sum:PSI (reg:QI 23) + (reg:HI REG_Z)))) + (unspec [(match_operand:QI 0 "const_int_operand" "n")] UNSPEC_MOVMEM) - (use (match_operand:QIHI 3 "register_operand" "w")) - (clobber (match_operand:HI 4 "register_operand" "=0")) - (clobber (match_operand:HI 5 "register_operand" "=1")) - (clobber (match_operand:QI 6 "register_operand" "=&r")) - (clobber (match_operand:HI 7 "register_operand" "=3")) - (clobber (match_operand:QI 9 "register_operand" "=8")) - (clobber (mem:QI (match_operand:QI 10 "io_address_operand" "n")))] + (use (reg:QIHI 24)) + (clobber (reg:HI REG_X)) + (clobber (reg:HI REG_Z)) + (clobber (reg:QI LPM_REGNO)) + (clobber (reg:HI 24)) + (clobber (reg:QI 23)) + (clobber (mem:QI (match_operand:QI 1 "io_address_operand" "n")))] "" "%~call __movmemx_<mode>" [(set_attr "type" "xcall") Index: config/avr/avr.c =================================================================== --- config/avr/avr.c (revision 184269) +++ config/avr/avr.c (working copy) @@ -9898,7 +9898,7 @@ avr_emit_movmemhi (rtx *xop) HOST_WIDE_INT count; enum machine_mode loop_mode; addr_space_t as = MEM_ADDR_SPACE (xop[1]); - rtx loop_reg, addr0, addr1, a_src, a_dest, insn, xas, reg_x; + rtx loop_reg, addr1, a_src, a_dest, insn, xas; rtx a_hi8 = NULL_RTX; if (avr_mem_flash_p (xop[0])) @@ -9927,7 +9927,7 @@ avr_emit_movmemhi (rtx *xop) } else { - int segment = avr_addrspace[as].segment % avr_current_arch->n_segments; + int segment = avr_addrspace[as].segment; if (segment && avr_current_arch->n_segments > 1) @@ -9954,11 +9954,7 @@ avr_emit_movmemhi (rtx *xop) X = destination address */ emit_move_insn (lpm_addr_reg_rtx, addr1); - addr1 = lpm_addr_reg_rtx; - - reg_x = gen_rtx_REG (HImode, REG_X); - emit_move_insn (reg_x, a_dest); - addr0 = reg_x; + emit_move_insn (gen_rtx_REG (HImode, REG_X), a_dest); /* FIXME: Register allocator does a bad job and might spill address register(s) inside the loop leading to additional move instruction @@ -9973,23 +9969,19 @@ avr_emit_movmemhi (rtx *xop) /* Load instruction ([E]LPM or LD) is known at compile time: Do the copy-loop inline. */ - rtx (*fun) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx) + rtx (*fun) (rtx, rtx, rtx) = QImode == loop_mode ? gen_movmem_qi : gen_movmem_hi; - insn = fun (addr0, addr1, xas, loop_reg, - addr0, addr1, tmp_reg_rtx, loop_reg); + insn = fun (xas, loop_reg, loop_reg); } else { - rtx loop_reg16 = gen_rtx_REG (HImode, 24); - rtx r23 = gen_rtx_REG (QImode, 23); - rtx (*fun) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx) + rtx (*fun) (rtx, rtx) = QImode == loop_mode ? gen_movmemx_qi : gen_movmemx_hi; - emit_move_insn (r23, a_hi8); + emit_move_insn (gen_rtx_REG (QImode, 23), a_hi8); - insn = fun (addr0, addr1, xas, loop_reg, addr0, addr1, - lpm_reg_rtx, loop_reg16, r23, r23, GEN_INT (avr_addr.rampz)); + insn = fun (xas, GEN_INT (avr_addr.rampz)); } set_mem_addr_space (SET_SRC (XVECEXP (insn, 0, 0)), as); @@ -10000,32 +9992,27 @@ avr_emit_movmemhi (rtx *xop) /* Print assembler for movmem_qi, movmem_hi insns... - $0, $4 : & dest - $1, $5 : & src - $2 : Address Space - $3, $7 : Loop register - $6 : Scratch register - - ...and movmem_qi_elpm, movmem_hi_elpm insns. - - $8, $9 : hh8 (& src) - $10 : RAMPZ_ADDR + $0 : Address Space + $1, $2 : Loop register + Z : Source address + X : Destination address */ const char* -avr_out_movmem (rtx insn ATTRIBUTE_UNUSED, rtx *xop, int *plen) +avr_out_movmem (rtx insn ATTRIBUTE_UNUSED, rtx *op, int *plen) { - addr_space_t as = (addr_space_t) INTVAL (xop[2]); - enum machine_mode loop_mode = GET_MODE (xop[3]); - - bool sbiw_p = test_hard_reg_class (ADDW_REGS, xop[3]); - - gcc_assert (REG_X == REGNO (xop[0]) - && REG_Z == REGNO (xop[1])); + addr_space_t as = (addr_space_t) INTVAL (op[0]); + enum machine_mode loop_mode = GET_MODE (op[1]); + bool sbiw_p = test_hard_reg_class (ADDW_REGS, op[1]); + rtx xop[3]; if (plen) *plen = 0; + xop[0] = op[0]; + xop[1] = op[1]; + xop[2] = tmp_reg_rtx; + /* Loop label */ avr_asm_len ("0:", xop, plen, 0); @@ -10039,16 +10026,16 @@ avr_out_movmem (rtx insn ATTRIBUTE_UNUSE case ADDR_SPACE_GENERIC: - avr_asm_len ("ld %6,%a1+", xop, plen, 1); + avr_asm_len ("ld %2,Z+", xop, plen, 1); break; case ADDR_SPACE_FLASH: if (AVR_HAVE_LPMX) - avr_asm_len ("lpm %6,%a1+", xop, plen, 1); + avr_asm_len ("lpm %2,%Z+", xop, plen, 1); else avr_asm_len ("lpm" CR_TAB - "adiw %1,1", xop, plen, 2); + "adiw r30,1", xop, plen, 2); break; case ADDR_SPACE_FLASH1: @@ -10058,31 +10045,31 @@ avr_out_movmem (rtx insn ATTRIBUTE_UNUSE case ADDR_SPACE_FLASH5: if (AVR_HAVE_ELPMX) - avr_asm_len ("elpm %6,%a1+", xop, plen, 1); + avr_asm_len ("elpm %2,Z+", xop, plen, 1); else avr_asm_len ("elpm" CR_TAB - "adiw %1,1", xop, plen, 2); + "adiw r30,1", xop, plen, 2); break; } /* Store with post-increment */ - avr_asm_len ("st %a0+,%6", xop, plen, 1); + avr_asm_len ("st X+,%2", xop, plen, 1); /* Decrement loop-counter and set Z-flag */ if (QImode == loop_mode) { - avr_asm_len ("dec %3", xop, plen, 1); + avr_asm_len ("dec %1", xop, plen, 1); } else if (sbiw_p) { - avr_asm_len ("sbiw %3,1", xop, plen, 1); + avr_asm_len ("sbiw %1,1", xop, plen, 1); } else { - avr_asm_len ("subi %A3,1" CR_TAB - "sbci %B3,0", xop, plen, 2); + avr_asm_len ("subi %A1,1" CR_TAB + "sbci %B1,0", xop, plen, 2); } /* Loop until zero */