diff mbox

[AVR] Fix/hack around spill fail ICE PR52148

Message ID 4F3E927C.5060802@gjlay.de
State New
Headers show

Commit Message

Georg-Johann Lay Feb. 17, 2012, 5:46 p.m. UTC
Spill failure PR52148 occurs for movmem insn that allocates 2 of AVR's 3
pointer registers. Register allocator is at it's limits and the patch tries to
cure the situation by replacing

(match_operand:HI 0 "register_operand" "x")

with explicit

(reg:HI REG_X)

and similar for Z Register classes "x" and "z" contain only one HI register.

This PR and PR50925 show that register allocator has some problems.
Even though this patch is not a fix of the root cause, it allows the PR's test
case to compile.

Anyways, the patch simplifies the backend and replaces an insn with 11(!)
operands with an insn with only 2 operands so that the patch is improvement of
the backend.

The hard registers are already known at expand time so there is no need for
match_operands.

Passes without regression.

Ok for trunk?

Johann

	PR target/52148
	* config/avr/avr.md (movmem_<mode>): Replace match_operand that
	match only one single hard register with respective hard reg rtx.
	(movmemx_<mode>): Ditto.
	* config/avr/avr.c (avr_emit_movmemhi): Adapt expanding to new
	insn anatomy of movmem[x]_<mode>.
	(avr_out_movmem): Same for printing assembler and operand usage.

Comments

Georg-Johann Lay Feb. 24, 2012, 6:43 p.m. UTC | #1
http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00956.html

Georg-Johann Lay wrote:
> Spill failure PR52148 occurs for movmem insn that allocates 2 of AVR's 3
> pointer registers. Register allocator is at it's limits and the patch tries to
> cure the situation by replacing
> 
> (match_operand:HI 0 "register_operand" "x")
> 
> with explicit
> 
> (reg:HI REG_X)
> 
> and similar for Z Register classes "x" and "z" contain only one HI register.
> 
> This PR and PR50925 show that register allocator has some problems.
> Even though this patch is not a fix of the root cause, it allows the PR's test
> case to compile.
> 
> Anyways, the patch simplifies the backend and replaces an insn with 11(!)
> operands with an insn with only 2 operands so that the patch is improvement of
> the backend.
> 
> The hard registers are already known at expand time so there is no need for
> match_operands.
> 
> Passes without regression.
> 
> Ok for trunk?
> 
> Johann
> 
> 	PR target/52148
> 	* config/avr/avr.md (movmem_<mode>): Replace match_operand that
> 	match only one single hard register with respective hard reg rtx.
> 	(movmemx_<mode>): Ditto.
> 	* config/avr/avr.c (avr_emit_movmemhi): Adapt expanding to new
> 	insn anatomy of movmem[x]_<mode>.
> 	(avr_out_movmem): Same for printing assembler and operand usage.
Denis Chertykov Feb. 25, 2012, 11:09 a.m. UTC | #2
2012/2/24 Georg-Johann Lay <avr@gjlay.de>:
> http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00956.html
>
> Georg-Johann Lay wrote:
>> Spill failure PR52148 occurs for movmem insn that allocates 2 of AVR's 3
>> pointer registers. Register allocator is at it's limits and the patch tries to
>> cure the situation by replacing
>>
>> (match_operand:HI 0 "register_operand" "x")
>>
>> with explicit
>>
>> (reg:HI REG_X)
>>
>> and similar for Z Register classes "x" and "z" contain only one HI register.
>>
>> This PR and PR50925 show that register allocator has some problems.
>> Even though this patch is not a fix of the root cause, it allows the PR's test
>> case to compile.
>>
>> Anyways, the patch simplifies the backend and replaces an insn with 11(!)
>> operands with an insn with only 2 operands so that the patch is improvement of
>> the backend.
>>
>> The hard registers are already known at expand time so there is no need for
>> match_operands.
>>
>> Passes without regression.
>>
>> Ok for trunk?
>>
>> Johann
>>
>>       PR target/52148
>>       * config/avr/avr.md (movmem_<mode>): Replace match_operand that
>>       match only one single hard register with respective hard reg rtx.
>>       (movmemx_<mode>): Ditto.
>>       * config/avr/avr.c (avr_emit_movmemhi): Adapt expanding to new
>>       insn anatomy of movmem[x]_<mode>.
>>       (avr_out_movmem): Same for printing assembler and operand usage.


Approved.

Denis.

PS: I have not understood this patch from first look
Georg-Johann Lay Feb. 25, 2012, 2:19 p.m. UTC | #3
Denis Chertykov schrieb:
> 2012/2/24 Georg-Johann Lay:
> 
>>http://gcc.gnu.org/ml/gcc-patches/2012-02/msg00956.html
>>
>>Georg-Johann Lay wrote:
>>
>>>Spill failure PR52148 occurs for movmem insn that allocates 2 of AVR's 3
>>>pointer registers. Register allocator is at it's limits and the patch tries to
>>>cure the situation by replacing
>>>
>>>(match_operand:HI 0 "register_operand" "x")
>>>
>>>with explicit
>>>
>>>(reg:HI REG_X)
>>>
>>>and similar for Z Register classes "x" and "z" contain only one HI register.
>>>
>>>This PR and PR50925 show that register allocator has some problems.
>>>Even though this patch is not a fix of the root cause, it allows the PR's test
>>>case to compile.
>>>
>>>Anyways, the patch simplifies the backend and replaces an insn with 11(!)
>>>operands with an insn with only 2 operands so that the patch is improvement of
>>>the backend.
>>>
>>>The hard registers are already known at expand time so there is no need for
>>>match_operands.
>>>
>>>Passes without regression.
>>>
>>>Ok for trunk?
>>>
>>>Johann
>>>
>>>      PR target/52148
>>>      * config/avr/avr.md (movmem_<mode>): Replace match_operand that
>>>      match only one single hard register with respective hard reg rtx.
>>>      (movmemx_<mode>): Ditto.
>>>      * config/avr/avr.c (avr_emit_movmemhi): Adapt expanding to new
>>>      insn anatomy of movmem[x]_<mode>.
>>>      (avr_out_movmem): Same for printing assembler and operand usage.
> 
> Approved.
> 
> Denis.
> 
> PS: I have not understood this patch from first look

In 4.6, movmemhi expands the copy loop explicitly:

   /* Move one byte into scratch and inc pointer.  */
   emit_move_insn (tmp_reg_rtx, gen_rtx_MEM (QImode, addr1));
   emit_move_insn (addr1, gen_rtx_PLUS (Pmode, addr1, const1_rtx));

   /* Move to mem and inc pointer.  */
   emit_move_insn (gen_rtx_MEM (QImode, addr0), tmp_reg_rtx);
   emit_move_insn (addr0, gen_rtx_PLUS (Pmode, addr0, const1_rtx));

tmp_reg is fixed and used implicitly by insn templates; the
assertion is that the lifetime of tmp_reg does never
extend over one insn.

The 4.6 movmemhi violates that assertion, and in 4.7 I saw situations
where reload generates insns that use (i.e. clobber) tmp_reg so that
explicit expanding is wrong. This happend together with movmemhi on
address spaces:

   /* FIXME: Register allocator does a bad job and might spill
      address register(s) inside the loop leading to additional
      move instruction to/from stack which could clobber tmp_reg.
      Thus, do *not* emit load and store as seperate insns.
      Instead, we perform the copy by means of one
      monolithic insn.  */

To fix it, I put all the movmemhi stuff into an unspec insn, but that
insn is not nice: The address-space version has 11 operands to
describe the insns action and the ordinary version 7 operands:

;; $0, $4 : & dest (REG_X)
;; $1, $5 : & src  (REG_Z)
;; $2     : Address Space
;; $3, $7 : Loop register
;; $6     : Scratch register

;; "movmem_qi"
;; "movmem_hi"
(define_insn "movmem_<mode>"
   [(set (mem:BLK (match_operand:HI 0 "register_operand" "x"))
         (mem:BLK (match_operand:HI 1 "register_operand" "z")))
    (unspec [(match_operand:QI 2 "const_int_operand"     "n")]
            UNSPEC_MOVMEM)
    (use (match_operand:QIHI 3 "register_operand"       "<MOVMEM_r_d>"))
    (clobber (match_operand:HI 4 "register_operand"     "=0"))
    (clobber (match_operand:HI 5 "register_operand"     "=1"))
    (clobber (match_operand:QI 6 "register_operand"     "=&r"))
    (clobber (match_operand:QIHI 7 "register_operand"   "=3"))]
   ""
   {
     return avr_out_movmem (insn, operands, NULL);
   }
   [(set_attr "adjust_len" "movmem")
    (set_attr "cc" "clobber")])

Even more issues are that the expand machinery expects a specific style
for movmem patterns (or was it clrstrhi?) and that with "free" register
assignments, i.e. with constrints "b" for the addresses,
I saw spill fails.

Outcome was the pattern above, that only allows one specific allocation
of the pointer registers involved; similar for the address space version:

;; Ditto and
;; $3, $7 : Loop register = R24
;; $8, $9 : hh8 (& src)   = R23
;; $10    : RAMPZ_ADDR

;; "movmemx_qi"
;; "movmemx_hi"
(define_insn "movmemx_<mode>"
  [(set (mem:BLK (match_operand:HI 0 "register_operand" "x"))
        (mem:BLK (lo_sum:PSI (match_operand:QI 8 "register_operand" "r")
                             (match_operand:HI 1 "register_operand" "z"))))
    (unspec [(match_operand:QI 2 "const_int_operand""n")]
            UNSPEC_MOVMEM)
    (use (match_operand:QIHI 3 "register_operand"              "w"))
    (clobber (match_operand:HI 4 "register_operand"           "=0"))
    (clobber (match_operand:HI 5 "register_operand"           "=1"))
    (clobber (match_operand:QI 6 "register_operand"           "=&r"))
    (clobber (match_operand:HI 7 "register_operand"           "=3"))
    (clobber (match_operand:QI 9 "register_operand"           "=8"))
    (clobber (mem:QI (match_operand:QI 10 "io_address_operand" "n")))]
   ""
   "%~call __movmemx_<mode>"
   [(set_attr "type" "xcall")
    (set_attr "cc" "clobber")])

With these patterns the test suite passed but are were more FIXMEs
in the source:

   /* FIXME: Register allocator might come up with spill fails
      if it is left on its own.
      Thus, we allocate the pointer registers by hand:
         Z = source address
         X = destination address  */

However, there is this PR now and I used it to clean up the code.
As the intention is to hard-assign the pointer registers, anyway,
there is no need to have operands for them. Obviously, reload tries
to spill some of the addresses which is no good idea.

As said, this is not a fix of the root cause of the spill fail
like PR50925, which is still open as you know.

But this small patch removes the spill fail for the test case at
least, and in itself it is a code clean-up that I preferred
over the original version.

The pattern for the address spaces is now as simple as

;; $0    : Address Space
;; $1    : RAMPZ RAM address
;; R24   : #bytes and loop register
;; R23:Z : 24-bit source address
;; R26   : 16-bit destination address

;; "movmemx_qi"
;; "movmemx_hi"
(define_insn "movmemx_<mode>"
   [(set (mem:BLK (reg:HI REG_X))
         (mem:BLK (lo_sum:PSI (reg:QI 23)
                              (reg:HI REG_Z))))
    (unspec [(match_operand:QI 0 "const_int_operand" "n")]
             UNSPEC_MOVMEM)
    (use (reg:QIHI 24))
    (clobber (reg:HI REG_X))
    (clobber (reg:HI REG_Z))
    (clobber (reg:QI LPM_REGNO))
    (clobber (reg:HI 24))
    (clobber (reg:QI 23))
    (clobber (mem:QI (match_operand:QI 1 "io_address_operand" "n")))]
   ""
   "%~call __movmemx_<mode>"
   [(set_attr "type" "xcall")

Johann
Denis Chertykov Feb. 25, 2012, 3:27 p.m. UTC | #4
2012/2/25 Georg-Johann Lay <avr@gjlay.de>:
[...]
> The pattern for the address spaces is now as simple as
>
> ;; $0    : Address Space
> ;; $1    : RAMPZ RAM address
> ;; R24   : #bytes and loop register
> ;; R23:Z : 24-bit source address
> ;; R26   : 16-bit destination address
>
> ;; "movmemx_qi"
> ;; "movmemx_hi"
> (define_insn "movmemx_<mode>"
>  [(set (mem:BLK (reg:HI REG_X))
>        (mem:BLK (lo_sum:PSI (reg:QI 23)
>                             (reg:HI REG_Z))))
>   (unspec [(match_operand:QI 0 "const_int_operand" "n")]
>            UNSPEC_MOVMEM)
>   (use (reg:QIHI 24))
>   (clobber (reg:HI REG_X))
>   (clobber (reg:HI REG_Z))
>   (clobber (reg:QI LPM_REGNO))
>   (clobber (reg:HI 24))
>   (clobber (reg:QI 23))
>   (clobber (mem:QI (match_operand:QI 1 "io_address_operand" "n")))]
>  ""
>  "%~call __movmemx_<mode>"
>  [(set_attr "type" "xcall")

Thanks for explanation.
This technique similar to mulhi3, mulhi3_call ...
IMHO this is a better solution then very strict constraint 'x' or 'y' or 'z'.
I.e. better to have (at least from reload's point of view):
mov r26, "r"
mov r27, "r"
r30,r31 = do-something-with r26, r27
mov "r", r30
mov "r", r31
Than:
"z" = do-something-with "x"

Denis.
diff mbox

Patch

Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(revision 184269)
+++ config/avr/avr.md	(working copy)
@@ -461,6 +461,9 @@  (define_insn "xload_8"
    (set_attr "isa" "lpmx,lpm")
    (set_attr "cc" "none")])
 
+;; R21:Z : 24-bit source address
+;; R22   : 1-4 byte output
+
 ;; "xload_qi_libgcc"
 ;; "xload_hi_libgcc"
 ;; "xload_psi_libgcc"
@@ -841,24 +844,23 @@  (define_expand "movmemhi"
 (define_mode_attr MOVMEM_r_d [(QI "r")
                               (HI "wd")])
 
-;; $0, $4 : & dest (REG_X)
-;; $1, $5 : & src  (REG_Z)
-;; $2     : Address Space
-;; $3, $7 : Loop register
-;; $6     : Scratch register
+;; $0     : Address Space
+;; $1, $2 : Loop register
+;; R30    : source address
+;; R26    : destination address
 
 ;; "movmem_qi"
 ;; "movmem_hi"
 (define_insn "movmem_<mode>"
-  [(set (mem:BLK (match_operand:HI 0 "register_operand" "x"))
-        (mem:BLK (match_operand:HI 1 "register_operand" "z")))
-   (unspec [(match_operand:QI 2 "const_int_operand"     "n")]
+  [(set (mem:BLK (reg:HI REG_X))
+        (mem:BLK (reg:HI REG_Z)))
+   (unspec [(match_operand:QI 0 "const_int_operand" "n")]
            UNSPEC_MOVMEM)
-   (use (match_operand:QIHI 3 "register_operand"       "<MOVMEM_r_d>"))
-   (clobber (match_operand:HI 4 "register_operand"     "=0"))
-   (clobber (match_operand:HI 5 "register_operand"     "=1"))
-   (clobber (match_operand:QI 6 "register_operand"     "=&r"))
-   (clobber (match_operand:QIHI 7 "register_operand"   "=3"))]
+   (use (match_operand:QIHI 1 "register_operand" "<MOVMEM_r_d>"))
+   (clobber (reg:HI REG_X))
+   (clobber (reg:HI REG_Z))
+   (clobber (reg:QI LPM_REGNO))
+   (clobber (match_operand:QIHI 2 "register_operand" "=1"))]
   ""
   {
     return avr_out_movmem (insn, operands, NULL);
@@ -866,26 +868,28 @@  (define_insn "movmem_<mode>"
   [(set_attr "adjust_len" "movmem")
    (set_attr "cc" "clobber")])
 
-;; Ditto and
-;; $3, $7 : Loop register = R24
-;; $8, $9 : hh8 (& src)   = R23
-;; $10    : RAMPZ_ADDR
+
+;; $0    : Address Space
+;; $1    : RAMPZ RAM address
+;; R24   : #bytes and loop register
+;; R23:Z : 24-bit source address
+;; R26   : 16-bit destination address
 
 ;; "movmemx_qi"
 ;; "movmemx_hi"
 (define_insn "movmemx_<mode>"
-  [(set (mem:BLK (match_operand:HI 0 "register_operand"             "x"))
-        (mem:BLK (lo_sum:PSI (match_operand:QI 8 "register_operand" "r")
-                             (match_operand:HI 1 "register_operand" "z"))))
-   (unspec [(match_operand:QI 2 "const_int_operand"                 "n")]
+  [(set (mem:BLK (reg:HI REG_X))
+        (mem:BLK (lo_sum:PSI (reg:QI 23)
+                             (reg:HI REG_Z))))
+   (unspec [(match_operand:QI 0 "const_int_operand" "n")]
            UNSPEC_MOVMEM)
-   (use (match_operand:QIHI 3 "register_operand"                   "w"))
-   (clobber (match_operand:HI 4 "register_operand"                 "=0"))
-   (clobber (match_operand:HI 5 "register_operand"                 "=1"))
-   (clobber (match_operand:QI 6 "register_operand"                 "=&r"))
-   (clobber (match_operand:HI 7 "register_operand"                 "=3"))
-   (clobber (match_operand:QI 9 "register_operand"                 "=8"))
-   (clobber (mem:QI (match_operand:QI 10 "io_address_operand"       "n")))]
+   (use (reg:QIHI 24))
+   (clobber (reg:HI REG_X))
+   (clobber (reg:HI REG_Z))
+   (clobber (reg:QI LPM_REGNO))
+   (clobber (reg:HI 24))
+   (clobber (reg:QI 23))
+   (clobber (mem:QI (match_operand:QI 1 "io_address_operand" "n")))]
   ""
   "%~call __movmemx_<mode>"
   [(set_attr "type" "xcall")
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(revision 184269)
+++ config/avr/avr.c	(working copy)
@@ -9898,7 +9898,7 @@  avr_emit_movmemhi (rtx *xop)
   HOST_WIDE_INT count;
   enum machine_mode loop_mode;
   addr_space_t as = MEM_ADDR_SPACE (xop[1]);
-  rtx loop_reg, addr0, addr1, a_src, a_dest, insn, xas, reg_x;
+  rtx loop_reg, addr1, a_src, a_dest, insn, xas;
   rtx a_hi8 = NULL_RTX;
 
   if (avr_mem_flash_p (xop[0]))
@@ -9927,7 +9927,7 @@  avr_emit_movmemhi (rtx *xop)
     }
   else
     {
-      int segment = avr_addrspace[as].segment % avr_current_arch->n_segments;
+      int segment = avr_addrspace[as].segment;
       
       if (segment
           && avr_current_arch->n_segments > 1)
@@ -9954,11 +9954,7 @@  avr_emit_movmemhi (rtx *xop)
         X = destination address  */
 
   emit_move_insn (lpm_addr_reg_rtx, addr1);
-  addr1 = lpm_addr_reg_rtx;
-
-  reg_x = gen_rtx_REG (HImode, REG_X);
-  emit_move_insn (reg_x, a_dest);
-  addr0 = reg_x;
+  emit_move_insn (gen_rtx_REG (HImode, REG_X), a_dest);
 
   /* FIXME: Register allocator does a bad job and might spill address
         register(s) inside the loop leading to additional move instruction
@@ -9973,23 +9969,19 @@  avr_emit_movmemhi (rtx *xop)
       /* Load instruction ([E]LPM or LD) is known at compile time:
          Do the copy-loop inline.  */
       
-      rtx (*fun) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx)
+      rtx (*fun) (rtx, rtx, rtx)
         = QImode == loop_mode ? gen_movmem_qi : gen_movmem_hi;
 
-      insn = fun (addr0, addr1, xas, loop_reg,
-                  addr0, addr1, tmp_reg_rtx, loop_reg);
+      insn = fun (xas, loop_reg, loop_reg);
     }
   else
     {
-      rtx loop_reg16 = gen_rtx_REG (HImode, 24);
-      rtx r23 = gen_rtx_REG (QImode, 23);
-      rtx (*fun) (rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx, rtx)
+      rtx (*fun) (rtx, rtx)
         = QImode == loop_mode ? gen_movmemx_qi : gen_movmemx_hi;
 
-      emit_move_insn (r23, a_hi8);
+      emit_move_insn (gen_rtx_REG (QImode, 23), a_hi8);
       
-      insn = fun (addr0, addr1, xas, loop_reg, addr0, addr1,
-                  lpm_reg_rtx, loop_reg16, r23, r23, GEN_INT (avr_addr.rampz));
+      insn = fun (xas, GEN_INT (avr_addr.rampz));
     }
 
   set_mem_addr_space (SET_SRC (XVECEXP (insn, 0, 0)), as);
@@ -10000,32 +9992,27 @@  avr_emit_movmemhi (rtx *xop)
 
 
 /* Print assembler for movmem_qi, movmem_hi insns...
-       $0, $4 : & dest
-       $1, $5 : & src
-       $2     : Address Space
-       $3, $7 : Loop register
-       $6     : Scratch register
-
-   ...and movmem_qi_elpm, movmem_hi_elpm insns.
-   
-       $8, $9 : hh8 (& src)
-       $10    : RAMPZ_ADDR
+       $0     : Address Space
+       $1, $2 : Loop register
+       Z      : Source address
+       X      : Destination address
 */
 
 const char*
-avr_out_movmem (rtx insn ATTRIBUTE_UNUSED, rtx *xop, int *plen)
+avr_out_movmem (rtx insn ATTRIBUTE_UNUSED, rtx *op, int *plen)
 {
-  addr_space_t as = (addr_space_t) INTVAL (xop[2]);
-  enum machine_mode loop_mode = GET_MODE (xop[3]);
-
-  bool sbiw_p = test_hard_reg_class (ADDW_REGS, xop[3]);
-
-  gcc_assert (REG_X == REGNO (xop[0])
-              && REG_Z == REGNO (xop[1]));
+  addr_space_t as = (addr_space_t) INTVAL (op[0]);
+  enum machine_mode loop_mode = GET_MODE (op[1]);
+  bool sbiw_p = test_hard_reg_class (ADDW_REGS, op[1]);
+  rtx xop[3];
 
   if (plen)
     *plen = 0;
 
+  xop[0] = op[0];
+  xop[1] = op[1];
+  xop[2] = tmp_reg_rtx;
+
   /* Loop label */
 
   avr_asm_len ("0:", xop, plen, 0);
@@ -10039,16 +10026,16 @@  avr_out_movmem (rtx insn ATTRIBUTE_UNUSE
       
     case ADDR_SPACE_GENERIC:
 
-      avr_asm_len ("ld %6,%a1+", xop, plen, 1);
+      avr_asm_len ("ld %2,Z+", xop, plen, 1);
       break;
       
     case ADDR_SPACE_FLASH:
 
       if (AVR_HAVE_LPMX)
-        avr_asm_len ("lpm %6,%a1+", xop, plen, 1);
+        avr_asm_len ("lpm %2,%Z+", xop, plen, 1);
       else
         avr_asm_len ("lpm" CR_TAB
-                     "adiw %1,1", xop, plen, 2);
+                     "adiw r30,1", xop, plen, 2);
       break;
       
     case ADDR_SPACE_FLASH1:
@@ -10058,31 +10045,31 @@  avr_out_movmem (rtx insn ATTRIBUTE_UNUSE
     case ADDR_SPACE_FLASH5:
 
       if (AVR_HAVE_ELPMX)
-        avr_asm_len ("elpm %6,%a1+", xop, plen, 1);
+        avr_asm_len ("elpm %2,Z+", xop, plen, 1);
       else
         avr_asm_len ("elpm" CR_TAB
-                     "adiw %1,1", xop, plen, 2);
+                     "adiw r30,1", xop, plen, 2);
       break;
     }
 
   /* Store with post-increment */
 
-  avr_asm_len ("st %a0+,%6", xop, plen, 1);
+  avr_asm_len ("st X+,%2", xop, plen, 1);
 
   /* Decrement loop-counter and set Z-flag */
 
   if (QImode == loop_mode)
     {
-      avr_asm_len ("dec %3", xop, plen, 1);
+      avr_asm_len ("dec %1", xop, plen, 1);
     }
   else if (sbiw_p)
     {
-      avr_asm_len ("sbiw %3,1", xop, plen, 1);
+      avr_asm_len ("sbiw %1,1", xop, plen, 1);
     }
   else
     {
-      avr_asm_len ("subi %A3,1" CR_TAB
-                   "sbci %B3,0", xop, plen, 2);
+      avr_asm_len ("subi %A1,1" CR_TAB
+                   "sbci %B1,0", xop, plen, 2);
     }
 
   /* Loop until zero */