[AVR] : Fix PR46779

Message ID	4DF0FAB5.6070704@gjlay.de
State	New
Headers	show Return-Path: <gcc-patches-return-294064-incoming=patchwork.ozlabs.org@gcc.gnu.org> Message-ID: <4DF0FAB5.6070704@gjlay.de> Date: Thu, 09 Jun 2011 18:54:13 +0200 From: Georg-Johann Lay <avr@gjlay.de> User-Agent: Thunderbird 2.0.0.24 (X11/20100302) MIME-Version: 1.0 To: gcc-patches@gcc.gnu.org CC: Anatoly Sokolov <aesok@post.ru>, Denis Chertykov <chertykov@gmail.com>, "Eric B. Weddington" <eric.weddington@atmel.com> Subject: [Patch, AVR]: Fix PR46779 Content-Type: multipart/mixed; boundary="------------050208020902060002050104" Mailing-List: contact gcc-patches-help@gcc.gnu.org; run by ezmlm Precedence: bulk Sender: gcc-patches-owner@gcc.gnu.org

Georg-Johann Lay June 9, 2011, 4:54 p.m. UTC

This is a tentative patch to fix PR46779 and hopefully also related
issues like PR45291.

In the present version of avr_hard_regno_mode_ok, QImode is forbidden
in r29/r28. If a 16-bit value or composite is allocated to r28, this
can lead to odd subregs like
  (set (subreg:QI (reg:HI r28) 0) ...)
These subregs are produced by IRA and reload treats the subreg with a
RELOAD_WRITE. As reload spills r28 to another hard reg that can access
QI, there will be no input reload. Therefore, if r29 already contains
data that data will get garbaged. See also the discussion around

http://gcc.gnu.org/ml/gcc/2011-06/msg00005.html

Tested with two regressions less compared to unpatched compiler.
Testcases that now pass are:

* gcc.target/avr/pr46779-1.c
* gcc.target/avr/pr46779-2.c

Johann

--
	PR target/46779
	* config/avr/avr.c (avr_hard_regno_mode_ok): Rewrite.
	In particular, allow 8-bit values in r28 and r29.
	(avr_hard_regno_scratch_ok): Disallow any register that might be
	part of the frame pointer.
	(avr_hard_regno_rename_ok): Same.

Denis Chertykov June 9, 2011, 6:34 p.m. UTC | #1

2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
> This is a tentative patch to fix PR46779 and hopefully also related
> issues like PR45291.
>
-  /* Disallow QImode in stack pointer regs.  */
-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
+  /* Don't allocate data to non-GENERAL_REGS registers.  */
+
+  if (regno >= 32)
     return 0;

I think that we not need in this code.
GCC core must bother about this.

+
+  if (GET_MODE_SIZE (mode) == 1)
     return 1;

I'm agree with this.

+
+  /* Disallow big registers that overlap the frame pointer.
+     This will hopefully reduce the number of spill failures.  */
+
+  if (GET_MODE_SIZE (mode) > 2
+      && regno <= REG_Y
+      && regno + GET_MODE_SIZE (mode) >= REG_Y + 1)
+    {
+      return 0;
+    }

Fragment from GCC info:
--------------------------------------
HARD_REGNO_MODE_OK (regno, mode)A C expression that is nonzero if it
is permissible to store a value of mode mode in hard register number
regno (or in several registers starting with that one). For a machine
where all registers are equivalent, a suitable definition is

#define HARD_REGNO_MODE_OK(REGNO, MODE) 1

You need not include code to check for the numbers of fixed registers,
because the allocation mechanism considers them to be always occupied.
-----------------------------------------
Again, GCC core must bother about this.

-  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
-  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
-    return 0;
-
-  if (mode == QImode)
-    return 1;
-
-  /* Modes larger than QImode occupy consecutive registers.  */
-  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
-    return 0;
-

This is a right change.

Denis.

Georg-Johann Lay June 9, 2011, 7:21 p.m. UTC | #2

Denis Chertykov schrieb:
> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>> This is a tentative patch to fix PR46779 and hopefully also related
>> issues like PR45291.
>>
> -  /* Disallow QImode in stack pointer regs.  */
> -  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
> +  /* Don't allocate data to non-GENERAL_REGS registers.  */
> +
> +  if (regno >= 32)
>      return 0;
> 
> I think that we not need in this code.
> GCC core must bother about this.

I am unsure about that. There is the "q" constraint that is used for
SP. register_operand will accept SP because regno_reg_class does not
return NO_REGS for SP. So something must stop the register allocator
from allocating ordinary data to SP.

In the current md there is "*movhi_sp" insn prior to "*movhi" insn.
Besides that it is strongly discouraged to have more than one move
insn for one mode, not each and every place looks at insn conditions.

Moreover, if there is an insn like
(set (reg:HI 100)
     (reg:HI 101))
that matches "*movhi", what will keep IRA from allocating one of the
values to SP, i.e. chose alternative "q,r" or "r,q"?

> +
> +  if (GET_MODE_SIZE (mode) == 1)
>      return 1;
> 
> I'm agree with this.

> +  /* Disallow big registers that overlap the frame pointer.
> +     This will hopefully reduce the number of spill failures.  */
> +
> +  if (GET_MODE_SIZE (mode) > 2
> +      && regno <= REG_Y
> +      && regno + GET_MODE_SIZE (mode) >= REG_Y + 1)
> +    {
> +      return 0;
> +    }
> 
> Fragment from GCC info:
> --------------------------------------
> HARD_REGNO_MODE_OK (regno, mode)A C expression that is nonzero if it
> is permissible to store a value of mode mode in hard register number
> regno (or in several registers starting with that one). For a machine
> where all registers are equivalent, a suitable definition is
> 
> #define HARD_REGNO_MODE_OK(REGNO, MODE) 1
> 
> You need not include code to check for the numbers of fixed registers,
> because the allocation mechanism considers them to be always occupied.
> -----------------------------------------
> Again, GCC core must bother about this.

I agree with you. However, I think that the risk of spill failure
should be minimized. I have no idea how to fix a splill failure like
the following that occurs in testsuite (with -Os, no matter if the
patch is applied or not):

./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: unable
to find a register to spill in class 'POINTER_REGS'
./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: this is
the insn:
(insn 61 60 62 10 (set (reg/v:SI 12 r12 [orig:73 b0 ] [73])
        (mem:SI (subreg:HI (reg/v:SI 70 [ srcp2 ]) 0) [2 *D.2218_56+0
S4 A8])) ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:64 12 {*movsi}
     (nil))
./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: internal
compiler error: in spill_failure, at reload1.c:2113

Actually I have no idea if the snip in avr_hard_regno_mode_ok actually
would reduce the risk of spill failure :-/

> -  /* Otherwise disallow all regno/mode combinations that span r28:r29.  */
> -  if (regno <= (REG_Y + 1) && (regno + GET_MODE_SIZE (mode)) >= (REG_Y + 1))
> -    return 0;
> -
> -  if (mode == QImode)
> -    return 1;
> -
> -  /* Modes larger than QImode occupy consecutive registers.  */
> -  if (regno + GET_MODE_SIZE (mode) > FIRST_PSEUDO_REGISTER)
> -    return 0;
> -
> 
> This is a right change.
> 
> Denis.

Johann

Georg-Johann Lay June 9, 2011, 7:39 p.m. UTC | #3

Georg-Johann Lay schrieb:
> Denis Chertykov schrieb:
> 
>>2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>>
>>>This is a tentative patch to fix PR46779 and hopefully also related
>>>issues like PR45291.
>>>
>>
>>-  /* Disallow QImode in stack pointer regs.  */
>>-  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
>>+  /* Don't allocate data to non-GENERAL_REGS registers.  */
>>+
>>+  if (regno >= 32)
>>     return 0;
>>
>>I think that we not need in this code.
>>GCC core must bother about this.
> 
> 
> I am unsure about that. There is the "q" constraint that is used for
> SP. register_operand will accept SP because regno_reg_class does not
> return NO_REGS for SP. So something must stop the register allocator
> from allocating ordinary data to SP.
> 
> In the current md there is "*movhi_sp" insn prior to "*movhi" insn.
> Besides that it is strongly discouraged to have more than one move
> insn for one mode, not each and every place looks at insn conditions.
> 
> Moreover, if there is an insn like
> (set (reg:HI 100)
>      (reg:HI 101))
> that matches "*movhi", what will keep IRA from allocating one of the
> values to SP, i.e. chose alternative "q,r" or "r,q"?

Ok, SP is a fixed register, I missed that above :-)

For the "*movhi_sp" I will provide a patch too as soon as I find the time.

Johann

Denis Chertykov June 9, 2011, 7:43 p.m. UTC | #4

2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
> Denis Chertykov schrieb:
>> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>>> This is a tentative patch to fix PR46779 and hopefully also related
>>> issues like PR45291.
>>>
>> -  /* Disallow QImode in stack pointer regs.  */
>> -  if ((regno == REG_SP || regno == (REG_SP + 1)) && mode == QImode)
>> +  /* Don't allocate data to non-GENERAL_REGS registers.  */
>> +
>> +  if (regno >= 32)
>>      return 0;
>>
>> I think that we not need in this code.
>> GCC core must bother about this.
>
> I am unsure about that. There is the "q" constraint that is used for
> SP. register_operand will accept SP because regno_reg_class does not
> return NO_REGS for SP. So something must stop the register allocator
> from allocating ordinary data to SP.

I know, this is a FIXED_REGISTERS.

> In the current md there is "*movhi_sp" insn prior to "*movhi" insn.
> Besides that it is strongly discouraged to have more than one move
> insn for one mode, not each and every place looks at insn conditions.

I'm agree. May be better to rewrite it.

>
> Moreover, if there is an insn like
> (set (reg:HI 100)
>     (reg:HI 101))
> that matches "*movhi", what will keep IRA from allocating one of the
> values to SP, i.e. chose alternative "q,r" or "r,q"?

FIXED_REGISTERS.

>> +    {
>> +      return 0;
>> +    }
>>
>> Fragment from GCC info:
>> --------------------------------------
>> HARD_REGNO_MODE_OK (regno, mode)A C expression that is nonzero if it
>> is permissible to store a value of mode mode in hard register number
>> regno (or in several registers starting with that one). For a machine
>> where all registers are equivalent, a suitable definition is
>>
>> #define HARD_REGNO_MODE_OK(REGNO, MODE) 1
>>
>> You need not include code to check for the numbers of fixed registers,
>> because the allocation mechanism considers them to be always occupied.
>> -----------------------------------------
>> Again, GCC core must bother about this.
>
> I agree with you. However, I think that the risk of spill failure
> should be minimized. I have no idea how to fix a splill failure like
> the following that occurs in testsuite (with -Os, no matter if the
> patch is applied or not):
>
> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: unable
> to find a register to spill in class 'POINTER_REGS'
> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: this is
> the insn:
> (insn 61 60 62 10 (set (reg/v:SI 12 r12 [orig:73 b0 ] [73])
>        (mem:SI (subreg:HI (reg/v:SI 70 [ srcp2 ]) 0) [2 *D.2218_56+0
> S4 A8])) ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:64 12 {*movsi}
>     (nil))
> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: internal
> compiler error: in spill_failure, at reload1.c:2113
>
> Actually I have no idea if the snip in avr_hard_regno_mode_ok actually
> would reduce the risk of spill failure :-/

I think, no.
I will try to debug reload for pr38051.c (It will be a long process 1-2 weeks)

Denis.

Georg-Johann Lay June 10, 2011, 10:09 a.m. UTC | #5

Denis Chertykov schrieb:
> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>> Denis Chertykov schrieb:
>>> 2011/6/9 Georg-Johann Lay <avr@gjlay.de>:
>> I agree with you. However, I think that the risk of spill failure
>> should be minimized. I have no idea how to fix a splill failure like
>> the following that occurs in testsuite (with -Os, no matter if the
>> patch is applied or not):
>>
>> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: unable
>> to find a register to spill in class 'POINTER_REGS'
>> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: error: this is
>> the insn:
>> (insn 61 60 62 10 (set (reg/v:SI 12 r12 [orig:73 b0 ] [73])
>>        (mem:SI (subreg:HI (reg/v:SI 70 [ srcp2 ]) 0) [2 *D.2218_56+0
>> S4 A8])) ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:64 12 {*movsi}
>>     (nil))
>> ./gcc/testsuite/gcc.c-torture/execute/pr38051.c:189:1: internal
>> compiler error: in spill_failure, at reload1.c:2113
>>
>> Actually I have no idea if the snip in avr_hard_regno_mode_ok actually
>> would reduce the risk of spill failure :-/
> 
> I think, no.
> I will try to debug reload for pr38051.c (It will be a long process 1-2 weeks)
> 
> Denis.

Thanks for doing that! Debugging reload is really no fun...

I see spill fails for these test cases with -mmcu=atmega128

* gcc.c-torture/execute/pr38051.c (-Os)
* gcc.dg/pr43670.c      (-O -ftree-vrp -fcompare-debug)
* gcc.dg/20030324-1.c   (-O -fstrict-aliasing -fgcse)

all complainins contain RTX like

   (mem:SI/SF (subreg:HI (reg:SI xxx) 0/2))


Then I have a question on spill failures:

There is PR46278, an optimization flaw that goes as follows:

The avr BE defines fake addressing mode X+const that has to be written
down in asm as
  X += const
  a = *X
  X -= const

The comment says that this is just to cover a corner case, however the
new register allocator uses this case more often or even greedily.
There is no way to describe the cost for such an access, and as X has
lower prologue/epilogue cost than Y, X is preferred over Y often.

In 4.7, I see that flaw less often than in 4.5.  However, I think the
best way is not to lie at the register allocator and not to supply a
fake instruction like that.

So I started to work on a fix. The changes in avr.h are:

	* config/avr/avr.h (BASE_REG_CLASS): Remove.
	(REG_OK_FOR_BASE_NOSTRICT_P): Remove.
	(REG_OK_FOR_BASE_STRICT_P): Remove.
	(MODE_CODE_BASE_REG_CLASS): New Define.
	(REGNO_MODE_CODE_OK_FOR_BASE_P): New Define.

The macros REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS
allow a better, fine grained control over addressing modes for each
hard register and allow to support X without fake instructions. The
code quality actually improves, but I see a new spill failure for the
test case

* gcc.c-torture/compile/950612-1.c

On the one hand, that test case has heavy long long use and is no
"real world code" to run on AVR. On the other hand, I don't think
trading code quality here against ICE there is a good idea.

What do you think about that? As I have no idea how to fix a spill
failure in the backend, I stopped working on a patch.

Then I observed trouble with DI patterns during libgcc build and had
to remove

* "zero_extendqidi2"
* "zero_extendhidi2"
* "zero_extendsidi2"

These are "orphan" insns: they deal with DI without having movdi
support so I removed them.

Johann

Denis Chertykov June 12, 2011, 9:01 a.m. UTC | #6

2011/6/10 Georg-Johann Lay <avr@gjlay.de>:
> Then I have a question on spill failures:
>
> There is PR46278, an optimization flaw that goes as follows:
>
> The avr BE defines fake addressing mode X+const that has to be written
> down in asm as
>  X += const
>  a = *X
>  X -= const
>
> The comment says that this is just to cover a corner case, however the
> new register allocator uses this case more often or even greedily.
> There is no way to describe the cost for such an access, and as X has
> lower prologue/epilogue cost than Y, X is preferred over Y often.
>
> In 4.7, I see that flaw less often than in 4.5.  However, I think the
> best way is not to lie at the register allocator and not to supply a
> fake instruction like that.
>
> So I started to work on a fix. The changes in avr.h are:
>
>        * config/avr/avr.h (BASE_REG_CLASS): Remove.
>        (REG_OK_FOR_BASE_NOSTRICT_P): Remove.
>        (REG_OK_FOR_BASE_STRICT_P): Remove.
>        (MODE_CODE_BASE_REG_CLASS): New Define.
>        (REGNO_MODE_CODE_OK_FOR_BASE_P): New Define.
>
> The macros REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS
> allow a better, fine grained control over addressing modes for each
> hard register and allow to support X without fake instructions. The
> code quality actually improves, but I see a new spill failure for the
> test case
>
> * gcc.c-torture/compile/950612-1.c
>
> On the one hand, that test case has heavy long long use and is no
> "real world code" to run on AVR. On the other hand, I don't think
> trading code quality here against ICE there is a good idea.
>
> What do you think about that? As I have no idea how to fix a spill
> failure in the backend, I stopped working on a patch.

Ohhh. It's a most complicated case for GCC for AVR.
Look carefully at `out_movqi_r_mr'.
There are even two fake addressing modes:
1. [Y + infinite-dslacement];
2. [X + (0...63)].
I have spent a many hours (days, months) to debug GCC (especially avr port
and reload) for right addressing modes.
I have stopped on this code.
AVR have a limited memory addressing and GCC can't handle it in native form.
Because of that I have supported a fake adddressing modes.
(Richard Henderson have a different oppinion: GCC can, AVR port can't)
IMHO that three limited pointer registers is not enough for C compiler.
Even more with frame pointer it's only two and X is a very limited.

1. [Y + infinite-dslacement] it's helper for reload addressing.
For addressing too long local/spilled variable. (Y + more-than-63)

2. [X + (0...63)] for another one "normal" pointer address.

> Then I observed trouble with DI patterns during libgcc build and had
> to remove
>
> * "zero_extendqidi2"
> * "zero_extendhidi2"
> * "zero_extendsidi2"
>
> These are "orphan" insns: they deal with DI without having movdi
> support so I removed them.

This seems strange for me.

Denis.

Georg-Johann Lay June 13, 2011, 6:02 p.m. UTC | #7

[In CCing Richard Henderson]

Denis Chertykov schrieb:
> 2011/6/10 Georg-Johann Lay <avr@gjlay.de>:
> 
>>Then I have a question on spill failures:
>>
>>There is PR46278, an optimization flaw that goes as follows:
>>
>>The avr BE defines fake addressing mode X+const that has to be written
>>down in asm as
>> X += const
>> a = *X
>> X -= const
>>
>>The comment says that this is just to cover a corner case, however the
>>new register allocator uses this case more often or even greedily.
>>There is no way to describe the cost for such an access, and as X has
>>lower prologue/epilogue cost than Y, X is preferred over Y often.
>>
>>In 4.7, I see that flaw less often than in 4.5.  However, I think the
>>best way is not to lie at the register allocator and not to supply a
>>fake instruction like that.
>>
>>So I started to work on a fix. The changes in avr.h are:
>>
>>       * config/avr/avr.h (BASE_REG_CLASS): Remove.
>>       (REG_OK_FOR_BASE_NOSTRICT_P): Remove.
>>       (REG_OK_FOR_BASE_STRICT_P): Remove.
>>       (MODE_CODE_BASE_REG_CLASS): New Define.
>>       (REGNO_MODE_CODE_OK_FOR_BASE_P): New Define.
>>
>>The macros REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS
>>allow a better, fine grained control over addressing modes for each
>>hard register and allow to support X without fake instructions. The
>>code quality actually improves, but I see a new spill failure for the
>>test case
>>
>>* gcc.c-torture/compile/950612-1.c
>>
>>On the one hand, that test case has heavy long long use and is no
>>"real world code" to run on AVR. On the other hand, I don't think
>>trading code quality here against ICE there is a good idea.
>>
>>What do you think about that? As I have no idea how to fix a spill
>>failure in the backend, I stopped working on a patch.
> 
> Ohhh. It's a most complicated case for GCC for AVR.

So you think is is pointless/discouraged to give a more realistic 
description of AVR addressing be means of MODE_CODE_BASE_REG_CLASS 
(instead of BASE_REG_CLASS) resp. REGNO_MODE_CODE_OK_FOR_BASE_P?

> Look carefully at `out_movqi_r_mr'.
> There are even two fake addressing modes:
> 1. [Y + infinite-dslacement];
> 2. [X + (0...63)].

Yes, I know. The first is introduced by avr_legitimate_address_p and the 
second appears to be artifact of LEGITIMIZE_RELOAD_ADDRESS.

The changes are basically MODE_CODE_BASE_REG_CLASS (introduced in 4.2) 
and a rewrite of avr_legitimate_address_p. The changes aim at a better 
addressing for X and to minimize fake addresses.

> I have spent a many hours (days, months) to debug GCC (especially avr port
> and reload) for right addressing modes.
> I have stopped on this code.
> AVR have a limited memory addressing and GCC can't handle it in native form.
> Because of that I have supported a fake adddressing modes.

I assume the code is from prior to 4.2 when 
REGNO_MODE_CODE_OK_FOR_BASE_P and MODE_CODE_BASE_REG_CLASS had not been 
available so that supporting X required some hacking.
All that would still be fine; however the new register allocator leads 
to code that noone would accept. Accessing a structure through a pointer 
is not uncommon, not even on AVR. So if Z is used for, say accessing 
flash, X appears to be the best register.

The shortcoming in GCC is that there is no way to give costs of 
addressing (TARGET_ADDRESS_COST does different things).

So take a look what avr-gcc compiles here:
   http://gcc.gnu.org/bugzilla/attachment.cgi?id=22242
I saw similar complains in forums on the web.

> (Richard Henderson have a different opinion: GCC can, AVR port can't)

What does he mean with that?

> IMHO that three limited pointer registers is not enough for C compiler.
> Even more with frame pointer it's only two and X is a very limited.

The current implementation has several oddities like

* allowing SUBREGs in avr-legitimate_address_p
* changing BASE_REG_CLASS on the fly (by means of reload_completed)

isn't that supposed to cause trouble?

> 1. [Y + infinite-dslacement] it's helper for reload addressing.
> For addressing too long local/spilled variable. (Y + more-than-63)
> 
> 2. [X + (0...63)] for another one "normal" pointer address.

Denis Chertykov June 13, 2011, 8:11 p.m. UTC | #8

2011/6/13 Georg-Johann Lay <avr@gjlay.de>:
>
> So you think is is pointless/discouraged to give a more realistic
> description of AVR addressing be means of MODE_CODE_BASE_REG_CLASS (instead
> of BASE_REG_CLASS) resp. REGNO_MODE_CODE_OK_FOR_BASE_P?
>
>> Look carefully at `out_movqi_r_mr'.
>> There are even two fake addressing modes:
>> 1. [Y + infinite-dslacement];
>> 2. [X + (0...63)].
>
> Yes, I know. The first is introduced by avr_legitimate_address_p and the
> second appears to be artifact of LEGITIMIZE_RELOAD_ADDRESS.
>
> The changes are basically MODE_CODE_BASE_REG_CLASS (introduced in 4.2) and a
> rewrite of avr_legitimate_address_p. The changes aim at a better addressing
> for X and to minimize fake addresses.
>
>> I have spent a many hours (days, months) to debug GCC (especially avr port
>> and reload) for right addressing modes.
>> I have stopped on this code.
>> AVR have a limited memory addressing and GCC can't handle it in native
>> form.
>> Because of that I have supported a fake adddressing modes.
>
> I assume the code is from prior to 4.2 when REGNO_MODE_CODE_OK_FOR_BASE_P
> and MODE_CODE_BASE_REG_CLASS had not been available so that supporting X
> required some hacking.
> All that would still be fine; however the new register allocator leads to
> code that noone would accept. Accessing a structure through a pointer is not
> uncommon, not even on AVR. So if Z is used for, say accessing flash, X
> appears to be the best register.
>
> The shortcoming in GCC is that there is no way to give costs of addressing
> (TARGET_ADDRESS_COST does different things).
>
> So take a look what avr-gcc compiles here:
>  http://gcc.gnu.org/bugzilla/attachment.cgi?id=22242
> I saw similar complains in forums on the web.
>
>> (Richard Henderson have a different opinion: GCC can, AVR port can't)
>
> What does he mean with that?
>
>> IMHO that three limited pointer registers is not enough for C compiler.
>> Even more with frame pointer it's only two and X is a very limited.
>
> The current implementation has several oddities like
>
> * allowing SUBREGs in avr-legitimate_address_p
> * changing BASE_REG_CLASS on the fly (by means of reload_completed)
>
> isn't that supposed to cause trouble?

You can try to remove all oddities and check results.
Definitely something changed in GCC core since I wrote addressing code.

>>>
>>> * "zero_extendqidi2"
>>> * "zero_extendhidi2"
>>> * "zero_extendsidi2"
>>>
>>> These are "orphan" insns: they deal with DI without having movdi
>>> support so I removed them.
>>
>> This seems strange for me.
>
> As far as I know, to support a mode a respective mov insn is needed, which
> is not the case for DI. I don't know the exact rationale behind that
> (reloading?), just read is on gcc list by Ian Taylor (and also that it is
> stronly discouraged to have more than one mov insn per mode).
>
> So if the requirement to have mov insn is dropped and without the burden to
> implement movdi, it would be rather easy to implement adddi3 and subdi3 for
> avr...

Personally, I don't like to maintain 64bits integers in AVR port, but
may be somebody use it.

Denis.

Hans-Peter Nilsson June 22, 2011, 12:58 a.m. UTC | #9

On Mon, 13 Jun 2011, Georg-Johann Lay wrote:
> [In CCing Richard Henderson]
> Denis Chertykov schrieb:
> > 2011/6/10 Georg-Johann Lay <avr@gjlay.de>:

> > > Then I observed trouble with DI patterns during libgcc build and had
> > > to remove
> > >
> > > * "zero_extendqidi2"
> > > * "zero_extendhidi2"
> > > * "zero_extendsidi2"
> > >
> > > These are "orphan" insns: they deal with DI without having movdi
> > > support so I removed them.
> >
> > This seems strange for me.
>
> As far as I know, to support a mode a respective mov insn is needed,

For the record, not in general, just if you have patterns
operating on DImode.  I.e. if you always have to call into
libgcc for every operation, you're fine with just SImode, as the
access will be split into SImode accesses.  (That reload can't
split the access is arguably a wart.)

It's even documented, "node Standard Names" for mov@var{m}:
"If there are patterns accepting operands in larger modes,
@samp{mov@var{m}} must be defined for integer modes of those
sizes."

> which is
> not the case for DI. I don't know the exact rationale behind that
> (reloading?),

Yes.  (I ran into problems with this myself long ago.)

> just read is on gcc list by Ian Taylor (and also that it is
> stronly discouraged to have more than one mov insn per mode).

That is correct.

> So if the requirement to have mov insn is dropped and without the burden to
> implement movdi, it would be rather easy to implement adddi3 and subdi3 for
> avr...

Resist the temptation... I see you did. :)

brgds, H-P

Georg-Johann Lay June 22, 2011, 4:36 p.m. UTC | #10

Hans-Peter Nilsson schrieb:
> On Mon, 13 Jun 2011, Georg-Johann Lay wrote:
>> [In CCing Richard Henderson]
>> Denis Chertykov schrieb:
>>> 2011/6/10 Georg-Johann Lay <avr@gjlay.de>:
> 
>>>> Then I observed trouble with DI patterns during libgcc build and had
>>>> to remove
>>>>
>>>> * "zero_extendqidi2"
>>>> * "zero_extendhidi2"
>>>> * "zero_extendsidi2"
>>>>
>>>> These are "orphan" insns: they deal with DI without having movdi
>>>> support so I removed them.
>>> This seems strange for me.
>> As far as I know, to support a mode a respective mov insn is needed,
> 
> For the record, not in general, just if you have patterns
> operating on DImode.  I.e. if you always have to call into
> libgcc for every operation, you're fine with just SImode, as the
> access will be split into SImode accesses.  (That reload can't
> split the access is arguably a wart.)

For avr it's actually split in QImode (word_mode), SImode would be
more efficient.

> It's even documented, "node Standard Names" for mov@var{m}:
> "If there are patterns accepting operands in larger modes,
> @samp{mov@var{m}} must be defined for integer modes of those
> sizes."

Thanks for pointing that out.

For avr that means:  There is movsf pattern that is implemented less
efficient than movsi.  So removing movsf could improve code a bit.
Besides efficiency, code for movsi and movsf can be the same on avr.

>> which is
>> not the case for DI. I don't know the exact rationale behind that
>> (reloading?),
> 
> Yes.  (I ran into problems with this myself long ago.)

So the zero_extend*di2 pattern are bogus because there is no movdi.

>> just read is on gcc list by Ian Taylor (and also that it is
>> stronly discouraged to have more than one mov insn per mode).
> 
> That is correct.
> 
>> So if the requirement to have mov insn is dropped and without the burden to
>> implement movdi, it would be rather easy to implement adddi3 and subdi3 for
>> avr...
> 
> Resist the temptation... I see you did. :)

The preferred handling is still that optabs cared for calling __adddi3
if there is no adddi3 pattern... The target would have to care for
implementing __adddi3 so generic libgcc need not to be changed and IMO
changing libgcc for that would not be adequate.

Johann

> brgds, H-P

Hans-Peter Nilsson June 23, 2011, 11:56 a.m. UTC | #11

On Wed, 22 Jun 2011, Georg-Johann Lay wrote:

> Hans-Peter Nilsson schrieb:
> > On Mon, 13 Jun 2011, Georg-Johann Lay wrote:
> >> [In CCing Richard Henderson]
> >> Denis Chertykov schrieb:
> >>> 2011/6/10 Georg-Johann Lay <avr@gjlay.de>:
> >
> >>>> Then I observed trouble with DI patterns during libgcc build and had
> >>>> to remove
> >>>>
> >>>> * "zero_extendqidi2"
> >>>> * "zero_extendhidi2"
> >>>> * "zero_extendsidi2"
> >>>>
> >>>> These are "orphan" insns: they deal with DI without having movdi
> >>>> support so I removed them.
> >>> This seems strange for me.
> >> As far as I know, to support a mode a respective mov insn is needed,
> >
> > For the record, not in general, just if you have patterns
> > operating on DImode.  I.e. if you always have to call into
> > libgcc for every operation, you're fine with just SImode, as the
> > access will be split into SImode accesses.  (That reload can't
> > split the access is arguably a wart.)
>
> For avr it's actually split in QImode (word_mode), SImode would be
> more efficient.
>
> > It's even documented, "node Standard Names" for mov@var{m}:
> > "If there are patterns accepting operands in larger modes,
> > @samp{mov@var{m}} must be defined for integer modes of those
> > sizes."
>
> Thanks for pointing that out.
>
> For avr that means:  There is movsf pattern that is implemented less
> efficient than movsi.  So removing movsf could improve code a bit.
> Besides efficiency, code for movsi and movsf can be the same on avr.
>
> >> which is
> >> not the case for DI. I don't know the exact rationale behind that
> >> (reloading?),
> >
> > Yes.  (I ran into problems with this myself long ago.)
>
> So the zero_extend*di2 pattern are bogus because there is no movdi.
>
> >> just read is on gcc list by Ian Taylor (and also that it is
> >> stronly discouraged to have more than one mov insn per mode).
> >
> > That is correct.
> >
> >> So if the requirement to have mov insn is dropped and without the burden to
> >> implement movdi, it would be rather easy to implement adddi3 and subdi3 for
> >> avr...
> >
> > Resist the temptation... I see you did. :)
>
> The preferred handling is still that optabs cared for calling __adddi3
> if there is no adddi3 pattern... The target would have to care for
> implementing __adddi3 so generic libgcc need not to be changed and IMO
> changing libgcc for that would not be adequate.
>
> Johann
>
> > brgds, H-P
>

Hans-Peter Nilsson June 23, 2011, 11:58 a.m. UTC | #12

Sorry for the earlier semi-empty mail (just quoting G-J), I
meant to cancel it.  Happy midsummer.

brgds, H-P

[AVR] : Fix PR46779

Commit Message

Comments

Patch