Message ID | 4D5C104D.7050707@redhat.com |
---|---|
State | New |
Headers | show |
On 16 February 2011 18:58, Richard Henderson <rth@redhat.com> wrote: > Anitha pointed out to me via gcc pr17994 that AVR uses post-decrement > for its pushes. I had a brief read over the AVR insn manual, and it's > not crystal clear how multi-byte post-decrement pushes operate. > > I've made an assumption that it happens as-if each byte is pushed > separately. I.e. > > caller: callee: > save rN > save rM > trash <- SP hi(ret) <- CFA > lo(ret) > trash <- SP > > This is the only way I can imagine that call insns interoperate with > byte push/pop insns. Yes, except the bytes of return address are in big-endian. See avr-tdep.c in function avr_frame_prev_register(): /* ... And to confuse matters even more, the return address stored on the stack is in big endian byte order, even though most everything else about the avr is little endian. Ick! */ > All of which means that the return address is at a different offset > from the CFA than I originally thought. This ought to be fixed in > the following. > > Can someone please test these two patches and see if they actually > agree with the hardware? What should I look for when testing? The layout of stack you draw is correct (minus the endianity), see avr-tdep.c: /* Any function with a frame looks like this ...
On 16 February 2011 23:28, Richard Henderson <rth@redhat.com> wrote: > On 02/15/2011 02:44 PM, Petr Hluzín wrote: >> In avr-tdep.c [1] near avr_dwarf_reg_to_regnum(): >> /* Unfortunately dwarf2 register for SP is 32. */ > > Excellent. We're all on the same page for this. > >> (I can't help you with the value for #define DWARF2_DEFAULT_RETURN_COLUMN 36) >> AFAIK there is no written ABI. Only the calling convention is >> documented (and only the easy cases), the rest is in gdb/gcc/binutils >> sources and people's heads. > > As I recall, GCC defaults to using FIRST_PSEUDO_REGISTER for this, > so as to not overlap any hard registers. I'll continue to so the same. > >> /* Avr-6 call instructions save 3 bytes. */ >> switch (info.bfd_arch_info->mach) > > Thanks. That value is readily available in the assembler as well. > > Anitha pointed out to me via gcc pr17994 that AVR uses post-decrement > for its pushes. I had a brief read over the AVR insn manual, and it's > not crystal clear how multi-byte post-decrement pushes operate. > > I've made an assumption that it happens as-if each byte is pushed > separately. I.e. > > caller: callee: > save rN > save rM > trash <- SP hi(ret) <- CFA > lo(ret) > trash <- SP > > This is the only way I can imagine that call insns interoperate with > byte push/pop insns. > The stack layout is correct. For call/rcall instructions, PC-low is pushed first followed by a PC-high. (I just verified by running/debugging a small app on the device) > All of which means that the return address is at a different offset > from the CFA than I originally thought. This ought to be fixed in > the following. Can you please explain the logic behind the following lines in gcc patch: - offset = -cfa_store.offset; + if (GET_CODE (XEXP (dest, 0)) == POST_DEC) + offset += -cfa_store.offset; + else + offset = -cfa_store.offset; > > Can someone please test these two patches and see if they actually > agree with the hardware? I have tried only compiler patch. Please refer to the attached output for a small testcase. (avr-objdump -WfF). It appeared correct to me. However I have one simple question with regarding the output: The CFI instructions for registers have changed only after the prologue. (For convenience I have attached disassembly too). As far as I understand, DWARF2 spec emits CFI instructions immediately. (Appendix 5 of DWARF2 specification) The other scenario is - how about functions with signals/interrupts? The compiler will give an ICE compiling a function as below: void my_interrupt_handler() __attribute__ (("interrupt")); Likewise, for signal attribute too. I am going to apply assembler patch and test it. Will get back on it shortly. Anitha > > r~ > call-saved: file format elf32-avr Contents of the .debug_frame section: 00000000 00000010 ffffffff CIE Version: 1 Augmentation: "" Code alignment factor: 1 Data alignment factor: -1 Return address column: 36 DW_CFA_def_cfa: r32 ofs 2 DW_CFA_offset: r36 at cfa-1 DW_CFA_nop DW_CFA_nop 00000014 0000006c 00000000 FDE cie=00000000 pc=00000000..000000ce DW_CFA_advance_loc: 2 to 00000002 DW_CFA_def_cfa_offset: 3 DW_CFA_advance_loc: 2 to 00000004 DW_CFA_def_cfa_offset: 4 DW_CFA_advance_loc: 2 to 00000006 DW_CFA_def_cfa_offset: 5 DW_CFA_advance_loc: 2 to 00000008 DW_CFA_def_cfa_offset: 6 DW_CFA_advance_loc: 2 to 0000000a DW_CFA_def_cfa_offset: 7 DW_CFA_advance_loc: 2 to 0000000c DW_CFA_def_cfa_offset: 8 DW_CFA_advance_loc: 2 to 0000000e DW_CFA_def_cfa_offset: 9 DW_CFA_advance_loc: 2 to 00000010 DW_CFA_def_cfa_offset: 10 DW_CFA_advance_loc: 2 to 00000012 DW_CFA_def_cfa_offset: 11 DW_CFA_advance_loc: 2 to 00000014 DW_CFA_def_cfa_offset: 12 DW_CFA_advance_loc: 2 to 00000016 DW_CFA_def_cfa_offset: 13 DW_CFA_advance_loc: 2 to 00000018 DW_CFA_def_cfa_offset: 14 DW_CFA_advance_loc: 2 to 0000001a DW_CFA_def_cfa_offset: 15 DW_CFA_advance_loc: 2 to 0000001c DW_CFA_def_cfa_offset: 16 DW_CFA_advance_loc: 2 to 0000001e DW_CFA_def_cfa_offset: 17 DW_CFA_advance_loc: 2 to 00000020 DW_CFA_def_cfa_offset: 18 DW_CFA_advance_loc: 4 to 00000024 DW_CFA_def_cfa_offset: 20 DW_CFA_offset: r28 at cfa-18 DW_CFA_offset: r17 at cfa-17 DW_CFA_offset: r16 at cfa-16 DW_CFA_offset: r15 at cfa-15 DW_CFA_offset: r14 at cfa-14 DW_CFA_offset: r13 at cfa-13 DW_CFA_offset: r12 at cfa-12 DW_CFA_offset: r11 at cfa-11 DW_CFA_offset: r10 at cfa-10 DW_CFA_offset: r9 at cfa-9 DW_CFA_offset: r8 at cfa-8 DW_CFA_offset: r7 at cfa-7 DW_CFA_offset: r6 at cfa-6 DW_CFA_offset: r5 at cfa-5 DW_CFA_offset: r4 at cfa-4 DW_CFA_offset: r3 at cfa-3 DW_CFA_offset: r2 at cfa-2 DW_CFA_advance_loc: 4 to 00000028 DW_CFA_def_cfa_register: r28 DW_CFA_advance_loc: 2 to 0000002a DW_CFA_def_cfa_offset: 36 DW_CFA_advance_loc: 10 to 00000034 DW_CFA_def_cfa_register: r32 DW_CFA_nop DW_CFA_nop 00000084 00000014 00000000 FDE cie=00000000 pc=000000ce..000000de DW_CFA_advance_loc: 4 to 000000d2 DW_CFA_def_cfa_offset: 4 DW_CFA_offset: r28 at cfa-2 DW_CFA_advance_loc: 4 to 000000d6 DW_CFA_def_cfa_register: r28 call-saved: file format elf32-avr Contents of the .debug_frame section: 00000000 00000010 ffffffff CIE "" cf=1 df=-1 ra=36 LOC CFA ra 00000000 r32+2 c-1 00000014 0000006c 00000000 FDE cie=00000000 pc=00000000..000000ce LOC CFA r2 r3 r4 r5 r6 r7 r8 r9 r10 r11 r12 r13 r14 r15 r16 r17 r28 ra 00000000 r32+2 u u u u u u u u u u u u u u u u u c-1 00000002 r32+3 u u u u u u u u u u u u u u u u u c-1 00000004 r32+4 u u u u u u u u u u u u u u u u u c-1 00000006 r32+5 u u u u u u u u u u u u u u u u u c-1 00000008 r32+6 u u u u u u u u u u u u u u u u u c-1 0000000a r32+7 u u u u u u u u u u u u u u u u u c-1 0000000c r32+8 u u u u u u u u u u u u u u u u u c-1 0000000e r32+9 u u u u u u u u u u u u u u u u u c-1 00000010 r32+10 u u u u u u u u u u u u u u u u u c-1 00000012 r32+11 u u u u u u u u u u u u u u u u u c-1 00000014 r32+12 u u u u u u u u u u u u u u u u u c-1 00000016 r32+13 u u u u u u u u u u u u u u u u u c-1 00000018 r32+14 u u u u u u u u u u u u u u u u u c-1 0000001a r32+15 u u u u u u u u u u u u u u u u u c-1 0000001c r32+16 u u u u u u u u u u u u u u u u u c-1 0000001e r32+17 u u u u u u u u u u u u u u u u u c-1 00000020 r32+18 u u u u u u u u u u u u u u u u u c-1 00000024 r32+20 c-2 c-3 c-4 c-5 c-6 c-7 c-8 c-9 c-10 c-11 c-12 c-13 c-14 c-15 c-16 c-17 c-18 c-1 00000028 r28+20 c-2 c-3 c-4 c-5 c-6 c-7 c-8 c-9 c-10 c-11 c-12 c-13 c-14 c-15 c-16 c-17 c-18 c-1 0000002a r28+36 c-2 c-3 c-4 c-5 c-6 c-7 c-8 c-9 c-10 c-11 c-12 c-13 c-14 c-15 c-16 c-17 c-18 c-1 00000034 r32+36 c-2 c-3 c-4 c-5 c-6 c-7 c-8 c-9 c-10 c-11 c-12 c-13 c-14 c-15 c-16 c-17 c-18 c-1 00000084 00000014 00000000 FDE cie=00000000 pc=000000ce..000000de LOC CFA r28 ra 000000ce r32+2 u c-1 000000d2 r32+4 c-2 c-1 000000d6 r28+4 c-2 c-1 call-saved: file format elf32-avr Disassembly of section .text: 00000000 <foo>: 0: 2f 92 push r2 2: 3f 92 push r3 4: 4f 92 push r4 6: 5f 92 push r5 8: 6f 92 push r6 a: 7f 92 push r7 c: 8f 92 push r8 e: 9f 92 push r9 10: af 92 push r10 12: bf 92 push r11 14: cf 92 push r12 16: df 92 push r13 18: ef 92 push r14 1a: ff 92 push r15 1c: 0f 93 push r16 1e: 1f 93 push r17 20: df 93 push r29 22: cf 93 push r28 24: cd b7 in r28, 0x3d ; 61 26: de b7 in r29, 0x3e ; 62 28: 60 97 sbiw r28, 0x10 ; 16 2a: 0f b6 in r0, 0x3f ; 63 2c: f8 94 cli 2e: de bf out 0x3e, r29 ; 62 30: 0f be out 0x3f, r0 ; 63 32: cd bf out 0x3d, r28 ; 61 34: 8e 2d mov r24, r14 36: 9f 2d mov r25, r15 38: 80 0f add r24, r16 3a: 91 1f adc r25, r17 3c: 29 81 ldd r18, Y+1 ; 0x01 3e: 3a 81 ldd r19, Y+2 ; 0x02 40: 82 0f add r24, r18 42: 93 1f adc r25, r19 44: 2b 81 ldd r18, Y+3 ; 0x03 46: 3c 81 ldd r19, Y+4 ; 0x04 48: 82 0f add r24, r18 4a: 93 1f adc r25, r19 4c: 2d 81 ldd r18, Y+5 ; 0x05 4e: 3e 81 ldd r19, Y+6 ; 0x06 50: 82 0f add r24, r18 52: 93 1f adc r25, r19 54: 2f 81 ldd r18, Y+7 ; 0x07 56: 38 85 ldd r19, Y+8 ; 0x08 58: 82 0f add r24, r18 5a: 93 1f adc r25, r19 5c: 29 85 ldd r18, Y+9 ; 0x09 5e: 3a 85 ldd r19, Y+10 ; 0x0a 60: 82 0f add r24, r18 62: 93 1f adc r25, r19 64: eb 84 ldd r14, Y+11 ; 0x0b 66: fc 84 ldd r15, Y+12 ; 0x0c 68: e8 0e add r14, r24 6a: f9 1e adc r15, r25 6c: 8d 85 ldd r24, Y+13 ; 0x0d 6e: 9e 85 ldd r25, Y+14 ; 0x0e 70: 2f 85 ldd r18, Y+15 ; 0x0f 72: 38 89 ldd r19, Y+16 ; 0x10 74: 82 0f add r24, r18 76: 93 1f adc r25, r19 78: 82 0d add r24, r2 7a: 93 1d adc r25, r3 7c: 84 0d add r24, r4 7e: 95 1d adc r25, r5 80: 86 0d add r24, r6 82: 97 1d adc r25, r7 84: 88 0d add r24, r8 86: 99 1d adc r25, r9 88: 8a 0d add r24, r10 8a: 9b 1d adc r25, r11 8c: 08 2f mov r16, r24 8e: 19 2f mov r17, r25 90: 0c 0d add r16, r12 92: 1d 1d adc r17, r13 94: 0e 0d add r16, r14 96: 1f 1d adc r17, r15 98: 80 2f mov r24, r16 9a: 91 2f mov r25, r17 9c: 60 96 adiw r28, 0x10 ; 16 9e: 0f b6 in r0, 0x3f ; 63 a0: f8 94 cli a2: de bf out 0x3e, r29 ; 62 a4: 0f be out 0x3f, r0 ; 63 a6: cd bf out 0x3d, r28 ; 61 a8: cf 91 pop r28 aa: df 91 pop r29 ac: 1f 91 pop r17 ae: 0f 91 pop r16 b0: ff 90 pop r15 b2: ef 90 pop r14 b4: df 90 pop r13 b6: cf 90 pop r12 b8: bf 90 pop r11 ba: af 90 pop r10 bc: 9f 90 pop r9 be: 8f 90 pop r8 c0: 7f 90 pop r7 c2: 6f 90 pop r6 c4: 5f 90 pop r5 c6: 4f 90 pop r4 c8: 3f 90 pop r3 ca: 2f 90 pop r2 cc: 08 95 ret 000000ce <main>: ce: df 93 push r29 d0: cf 93 push r28 d2: cd b7 in r28, 0x3d ; 61 d4: de b7 in r29, 0x3e ; 62 d6: 94 df rcall .-216 ; 0x0 <foo> d8: cf 91 pop r28 da: df 91 pop r29 dc: 08 95 ret
On 02/17/2011 07:35 AM, Anitha Boyapati wrote: > Can you please explain the logic behind the following lines in gcc patch: > > > - offset = -cfa_store.offset; > + if (GET_CODE (XEXP (dest, 0)) == POST_DEC) > + offset += -cfa_store.offset; > + else > + offset = -cfa_store.offset; This is differentiating between pre-dec and post-dec. pre-dec post-dec before stuff stuff stuff <-sp trash <-sp trash after stuff stuff stuff value value <-sp trash <-sp We've just decremented cfa_store.offset by the size of the pushed value, and we're computing the offset of VALUE from the CFA. For pre-decrement, the value is stored at the CFA offset (the else); for post-decrement, the value is stored just above the CFA offset. I admit the logic is confusing here, because we're storing some quantities as positive offsets, and some quantities as negative offsets, and we're also using the same variable for two different things over two sections of code. Perhaps it would have been less obtuse if I had written offset = -(cfa_store.offset - offset); > However I have one simple question with regarding the output: The CFI > instructions for registers have changed only after the prologue. (For > convenience I have attached disassembly too). As far as I understand, > DWARF2 spec emits CFI instructions immediately. (Appendix 5 of DWARF2 > specification) GCC is attempting to minimize the number of advance opcodes by grouping the DW_CFA_offset opcodes. We can delay emission of such until the registers in question are actually clobbered. In this case this delay tactic failed because of the push -- we cannot delay changes to the CFA. > The other scenario is - how about functions with signals/interrupts? > The compiler will give an ICE compiling a function as below: > > void my_interrupt_handler() __attribute__ (("interrupt")); It's a bug in the avr backend; the enable_interrupt insn should not be marked as frame-related. We should fix this separately. r~
On 02/16/2011 02:47 PM, Petr Hluzín wrote:
> What should I look for when testing?
Run the gdb testsuite with dwarf-3 enabled. Either by editing the
default in the compiler, or by some dejagnu argument that compiles
the tests with -gdwarf-3.
The use of Dwarf3 enables the use of DW_OP_call_frame_cfa in the
DW_AT_frame_base field, which means that the .debug_frame info will
be used every time local variables are referenced, as well as for
unwinding the stack.
If lots of tests fail, see if you can determine if the address
computed for the local variable -- or even more particularly a
function parameter -- is off by a byte. I have an idea that we're
missing a definition of
#define ARG_POINTER_CFA_OFFSET(FNDECL) -1
in GCC.
r~
On Feb 17, 2011, at 5:12 PM, Richard Henderson wrote: > On 02/16/2011 02:47 PM, Petr Hluzín wrote: >> What should I look for when testing? > > Run the gdb testsuite with dwarf-3 enabled. Either by editing the > default in the compiler, or by some dejagnu argument that compiles > the tests with -gdwarf-3. Note that gdb includes an avr simulator (if this could simplify your testing).
Index: config/tc-avr.c =================================================================== RCS file: /cvs/src/src/gas/config/tc-avr.c,v retrieving revision 1.74 diff -u -p -r1.74 tc-avr.c --- config/tc-avr.c 28 Jun 2010 14:06:57 -0000 1.74 +++ config/tc-avr.c 16 Feb 2011 17:39:55 -0000 @@ -24,6 +24,8 @@ #include "as.h" #include "safe-ctype.h" #include "subsegs.h" +#include "dw2gencfi.h" + struct avr_opcodes_s { @@ -1488,3 +1490,18 @@ avr_cons_fix_new (fragS *frag, exp_mod_pm = 0; } } + +void +tc_cfi_frame_initial_instructions (void) +{ + /* AVR6 pushes 3 bytes for calls. */ + int return_size = (avr_mcu->mach == bfd_mach_avr6 ? 3 : 2); + + /* The CFA is the caller's stack location before the call insn. */ + /* Note that the stack pointer is dwarf register number 32. */ + cfi_add_CFA_def_cfa (32, return_size); + + /* Note that AVR consistently uses post-decrement, which means that things + do not line up the same way as for targers that use pre-decrement. */ + cfi_add_CFA_offset (DWARF2_DEFAULT_RETURN_COLUMN, 1-return_size); +} Index: config/tc-avr.h =================================================================== RCS file: /cvs/src/src/gas/config/tc-avr.h,v retrieving revision 1.17 diff -u -p -r1.17 tc-avr.h --- config/tc-avr.h 27 Oct 2009 15:39:27 -0000 1.17 +++ config/tc-avr.h 16 Feb 2011 17:39:55 -0000 @@ -153,3 +153,16 @@ extern long md_pcrel_from_section (struc /* 32 bits pseudo-addresses are used on AVR. */ #define DWARF2_ADDR_SIZE(bfd) 4 + +/* Enable cfi directives. */ +#define TARGET_USE_CFIPOP 1 + +/* The stack grows down, and is only byte aligned. */ +#define DWARF2_CIE_DATA_ALIGNMENT -1 + +/* Define the column that represents the PC. */ +#define DWARF2_DEFAULT_RETURN_COLUMN 36 + +/* Define a hook to setup initial CFI state. */ +extern void tc_cfi_frame_initial_instructions (void); +#define tc_cfi_frame_initial_instructions tc_cfi_frame_initial_instructions