[AVR]: Support tail calls

Message ID 4D7A2711.4060801@gjlay.de
State New

Commit Message

Georg-Johann Lay March 11, 2011, 1:43 p.m. UTC
This is a patch to test/review/comment on. It adds tail call
optimization to the avr backend.

The implementation uses struct machine_function to pass information
around, i.e. from avr_function_arg_advance to avr_function_ok_for_sibcall.

Tail call support is more general than avr-ld's --relax replacement of
call/ret sequences, which is sometimes wrong; see
http://sourceware.org/PR12494

gcc can, e.g. tail-call bar1 in

void bar0 (void);
int bar1 (int);

int foo (int x)
{
  bar0();
  return bar1 (x);
}
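
For readers who want to try the example on a host compiler, here is a self-contained variant. The bodies of bar0 and bar1, the counter variable and the second function foo_not_tail are assumptions made for illustration and are not part of the patch. The call in foo is in tail position; the one in foo_not_tail is not, because an addition still has to happen after bar1 returns.

```c
/* Illustrative sketch only: the bodies below and foo_not_tail are
   assumed, they do not come from the patch.  */

static int counter;

void bar0 (void)
{
  counter++;
}

int bar1 (int x)
{
  return x + counter;
}

/* The call to bar1 is the last action of foo, so its result can be
   returned to foo's caller directly: a tail-call candidate.  */
int foo (int x)
{
  bar0 ();
  return bar1 (x);
}

/* Here foo_not_tail still has work to do after bar1 returns (the +1),
   so the call is not in tail position.  */
int foo_not_tail (int x)
{
  bar0 ();
  return bar1 (x) + 1;
}
```

Whether a compiler actually emits a jump for foo depends on the target and the optimization flags in use.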

I did not find a way to make this work together with -mcall-prologues.
Please let me know if you have a suggestion on how call prologues can be
combined with tail calls.

Regards, Johann


2011-03-10  Georg-Johann Lay  <avr@gjlay.de>

	* config/avr/avr-protos.h (expand_epilogue): Change prototype.
	* config/avr/avr.h (struct machine_function): Add field
	sibcall_fails.
	* config/avr/avr.c (init_cumulative_args,
	avr_function_arg_advance): Use it.
	* config/avr/avr.c (expand_epilogue): Add bool parameter. Handle
	sibcall epilogues.
	(TARGET_FUNCTION_OK_FOR_SIBCALL): Define to...
	(avr_function_ok_for_sibcall): ...this new function.
	(avr_lookup_function_attribute1): New static function.
	(avr_naked_function_p, interrupt_function_p,
	signal_function_p, avr_OS_task_function_p,
	avr_OS_main_function_p): Use it.
	* config/avr/avr.md ("sibcall", "sibcall_value",
	"sibcall_epilogue"): New expanders.
	("*call_insn", "*call_value_insn"): New insns.
	("call_insn", "call_value_insn"): Remove.
	("call", "call_value", "epilogue"): Change expander to handle
	sibling calls.

Comments

Weddington, Eric March 11, 2011, 2:28 p.m. UTC | #1
> -----Original Message-----
> From: Georg-Johann Lay [mailto:avr@gjlay.de]
> Sent: Friday, March 11, 2011 6:44 AM
> To: gcc-patches@gcc.gnu.org
> Cc: Denis Chertykov; Anatoly Sokolov; Weddington, Eric; Boyapati, Anitha
> Subject: [Patch][AVR]: Support tail calls
> 
> This is a patch to test/review/comment on. It adds tail call
> optimization to avr backend.

<snip>

> I did not find a way to make this work together with -mcall-prologues.
> Please let me know if you have suggestion on how call prologues can be
> combine with tail calls.

Yeah, we're going to have to find a way to make this work with -mcall-prologues because that is a very commonly used target optimization switch.

Eric
Georg-Johann Lay March 11, 2011, 2:59 p.m. UTC | #2
Weddington, Eric wrote:
> 
>> -----Original Message-----
>> From: Georg-Johann Lay [mailto:avr@gjlay.de]
>> Sent: Friday, March 11, 2011 6:44 AM
>> To: gcc-patches@gcc.gnu.org
>> Cc: Denis Chertykov; Anatoly Sokolov; Weddington, Eric; Boyapati, Anitha
>> Subject: [Patch][AVR]: Support tail calls
>>
>> This is a patch to test/review/comment on. It adds tail call
>> optimization to avr backend.
> 
> <snip>
> 
>> I did not find a way to make this work together with -mcall-prologues.
>> Please let me know if you have suggestion on how call prologues can be
>> combine with tail calls.
> 
> Yeah, we're going to have to find a way to make this work with -mcall-prologues because that is a very commonly used target optimization switch.

"Does not work" means here that tail-call optimization won't be
applied in the presence of -mcall-prologues. The code will be correct,
of course. For any module that is compiled with -mcall-prologues the
patch has no effect; see avr.c:avr_function_ok_for_sibcall():


+  /* Tail-calling must fail if callee-saved regs are used to pass
+     function args.  We must not tail-call when `epilogue_restores'
+     is used.  Unfortunately, we cannot tell at this point if that
+     actually will happen or not, and we cannot step back from
+     tail-calling. Thus, we inhibit tail-calling with -mcall-prologues. */
+
+  if (cfun->machine->sibcall_fails
+      || TARGET_CALL_PROLOGUES)
+    {
+      return false;
+    }



Johann
Weddington, Eric March 11, 2011, 6:01 p.m. UTC | #3
> -----Original Message-----
> From: Georg-Johann Lay [mailto:avr@gjlay.de]
> Sent: Friday, March 11, 2011 7:59 AM
> To: Weddington, Eric
> Cc: gcc-patches@gcc.gnu.org; Denis Chertykov; Anatoly Sokolov; Boyapati,
> Anitha; Joerg Wunsch
> Subject: Re: [Patch][AVR]: Support tail calls
> 
> "Does not work" means here that tail-call optimization won't be
> applied in the presence of -mcall-prologues. The code will be correct,
> of course. For any module that is compiled with -mcall-prologues the
> patch hat no effect, see avr.c:avr_function_ok_for_sibcall()

Thanks for the clarification! :-)

Eric
Boyapati, Anitha March 14, 2011, 6:34 a.m. UTC | #4
Hi Georg,

>This is a patch to test/review/comment on. It adds tail call
>optimization to avr backend.
>
>The implementation uses struct machine_function to pass information
>around, i.e. from avr_function_arg_advance to avr_function_ok_for_sibcall.
>
>Tail call support is more general than avr-ld's replacement of
>call/ret sequences with --relax which are sometimes wrong, see
>http://sourceware.org/PR12494
>
>gcc can, e.g. tail-call bar1 in
>
>void bar0 (void);
>void bar1 (int);
>
>int foo (int x)
>{
>  bar0();
>  return bar1 (x);
>}

To be on the same page, can you explain how gcc optimizes the above case? As I understand it, in a tail-call optimization, bar1 can return directly to the caller of foo(). There can be different ways of handling this. But how is this handled in gcc after recognizing that foo() is a candidate for a tail call?

Also, I have applied the patch, and used it for a small test case as below:

int bar1(int x) {
        x++;
        return x;
}

int foo (int x)
{
  return bar1 (x);
}

int main() {
        volatile int i;
        return foo(i);

}

avr-gcc -S -foptimize-sibling-calls tail-call.c


I find no difference in the code generated with and without tail call optimization. (I am assuming -foptimize-sibling-calls should turn this on.) Let me know if I am doing something wrong.

Anitha
Georg-Johann Lay March 14, 2011, 2:31 p.m. UTC | #5
Boyapati, Anitha wrote:

> To be on same page, can you explain how gcc optimizes above case?

In

void bar0 (void);
int bar1 (int);

int foo (int x)
{
    bar0();
    return bar1 (x);
}

x must be saved somewhere; avr-gcc chooses Y (r28/r29).
Compiled with -Os -mmcu=atmega8 -fno-optimize-sibling-calls, this reads

foo:
	push r28	 ;
	push r29	 ;
.L__stack_usage = 2
	movw r28,r24	 ;  x, x
	rcall bar0	 ;
	movw r24,r28	 ; , x
	rcall bar1	 ;
	pop r29	 ;
	pop r28	 ;
	ret

and with -Os -mmcu=atmega8 -foptimize-sibling-calls

foo:
	push r28	 ;
	push r29	 ;
.L__stack_usage = 2
	movw r28,r24	 ;  x, x
	rcall bar0	 ;
	movw r24,r28	 ; , x
	pop r29	 ;
	pop r28	 ;
	rjmp bar1	 ;

> As I understand, in a tail-call optimization, bar1 can return to
> the caller of foo(). There can be different cases of handling this.
> But how is this handled in gcc after recognizing that foo() is a
> candidate for tail call?

gcc recognizes most cases where tail call optimization must not be
applied. But in some cases the backend has to impose more restrictions;
this is what TARGET_FUNCTION_OK_FOR_SIBCALL is for. E.g. an ISR must
not tail-call an ordinary function because the epilogues must be
compatible, but ISR and non-ISR functions have incompatible epilogues.

gcc also evaluates the standard insns "sibcall", "sibcall_value" and
"sibcall_epilogue", which are the analogues of "call", "call_value" and
"epilogue", respectively.
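
The epilogue-compatibility rule can be modelled outside the compiler. The following is a minimal host-side sketch that mirrors the final test in the patch's avr_function_ok_for_sibcall; struct fn_attrs and epilogues_compatible are hypothetical names invented here, standing in for the attribute lookups that the real hook performs on tree nodes.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the function attributes that the patch
   looks up on the caller and the callee.  */
struct fn_attrs
{
  bool interrupt;
  bool signal;
  bool naked;
  bool os_task;
  bool os_main;
};

/* Returns true if, by the rules of the patch, the caller's epilogue is
   compatible with tail-calling the callee.  */
bool
epilogues_compatible (struct fn_attrs caller, struct fn_attrs callee)
{
  /* An ISR has its own epilogue and must not tail-call.  */
  if (caller.interrupt || caller.signal)
    return false;

  /* Naked functions have no compiler-generated epilogue at all.  */
  if (caller.naked || callee.naked)
    return false;

  /* OS_task and OS_main must agree between caller and callee.  */
  if (caller.os_task != callee.os_task)
    return false;
  if (caller.os_main != callee.os_main)
    return false;

  return true;
}
```

This covers only the attribute checks; the argument-register and -mcall-prologues conditions are tested separately in the patch.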

> Also, I have applied the patch, and used it for a small test case
> as below:
> 
> int bar1(int x) { x++; return x; }
> 
> int foo (int x) { return bar1 (x); }
> 
> int main() { volatile int i; return foo(i);
> 
> }
> 
> avr-gcc -S -foptimize-sibling-calls tail-call.c
> 
> 
> I find no difference in the code generated with and without tail
> call optimization. (I am assuming -foptimize-sibling-calls should
> turn on this). Let me know if I am doing something wrong.
> 
> Anitha

Like all other optimization options/passes, this one is only applied
when optimization is enabled, i.e. with -O0, options like
-foptimize-sibling-calls have no effect. You will have to specify at
least -O1 to see an effect.
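
A variant of the quoted test case that is more likely to show a difference is sketched below. The noinline attribute is an addition made here, on the assumption that the optimizer might otherwise inline bar1 into foo and leave no call to optimize; the command line in the comment uses only flags that appear in this thread.

```c
/* Possible invocation, for comparison with and without the option:
     avr-gcc -S -Os -mmcu=atmega8 -foptimize-sibling-calls tail-call.c
     avr-gcc -S -Os -mmcu=atmega8 -fno-optimize-sibling-calls tail-call.c  */

/* noinline is an assumption added for this sketch so that a real call
   to bar1 remains in foo when optimizing.  */
__attribute__((noinline)) int bar1 (int x)
{
  x++;
  return x;
}

/* The call to bar1 is in tail position.  */
int foo (int x)
{
  return bar1 (x);
}
```

foo can be called from a main like the one in the quoted test case.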

Johann
Boyapati, Anitha March 15, 2011, 1:47 p.m. UTC | #6
>
>Boyapati, Anitha wrote:
>
>> To be on same page, can you explain how gcc optimizes above case?
>
>in
>
>void bar0 (void);
>int bar1 (int);
>
>int foo (int x)
>{
>    bar0();
>    return bar1 (x);
>}
>
>x must be saved somewhere. avr-gcc choses Y.
>Compiled -Os -mmcu=atmega8 -fno-optimize-sibling-calls reads

OK. I was trying to understand the basic strategies for handling a tail-called function. Broadly speaking,

[1]. In the case of a normal tail call, the stack frame of the caller is freed and then a jump to the callee is made.
[2]. In the case of tail recursion, the recursion is made iterative.
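
Case [2] can be illustrated with a small example. The gcd functions below are chosen purely for illustration and do not come from the patch or from GCC; the second function is a hand-written loop that computes the same result, roughly the shape a compiler can give the first one.

```c
/* Tail-recursive form: the recursive call is the last action, so no
   state of the current invocation is needed afterwards.  */
unsigned
gcd_rec (unsigned a, unsigned b)
{
  if (b == 0)
    return a;
  return gcd_rec (b, a % b);
}

/* Hand-written iterative equivalent: the recursive call becomes an
   update of the parameters followed by a jump back to the top.  */
unsigned
gcd_iter (unsigned a, unsigned b)
{
  while (b != 0)
    {
      unsigned t = a % b;
      a = b;
      b = t;
    }
  return a;
}
```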


>gcc recognizes most cases where tail call optimization must not be
>applied. But in some cases backend has to impose more restrictions,
>this is what TARGET_FUNCTION_OK_FOR_SIBCALL is for. E.g. An ISR must
>not tail-call an ordinary function because the epilogues must be
>compatible bit ISR resp. non-ISR have incompatible epilogues.
>


<snip>

>
>As with all other optimization options/passes, they are only applied
>in the presence of optimization, i.e. with -O0 options like
>-foptimize-sibling-calls have no effect. You will have to specify at
>least -O1 to see effects.
>

Thanks. I tried various cases (mutually recursive functions, interrupts and signals, variable arguments). It required a bit of wandering through tree-tailcall.c to find out which other cases do not qualify for tail calls (variable arguments are one such case). The code generated is as expected.

Coming to the -mcall-prologues issue, I agree that whenever prologue_saves and epilogue_restores are used, treating the call as a tail call is not OK (or it requires different handling). Going by the backend code, they (prologue_saves and epilogue_restores) are now emitted whenever -mcall-prologues is used. Ditto when callee-saved registers are used for argument passing.


I think it is nice work!


Anitha
Richard Henderson March 15, 2011, 10:39 p.m. UTC | #7
On 03/11/2011 05:43 AM, Georg-Johann Lay wrote:
> I did not find a way to make this work together with -mcall-prologues.
> Please let me know if you have suggestion on how call prologues can be
> combine with tail calls.

You need a new symbol in libgcc for this.  It should be easy enough to have
the sibcall epilogue load up Z+EIND before jumping to the new symbol
(perhaps called __sibcall_restores__).  This new symbol would be just like
the existing __epilogue_restores__ except that it would finish with an
eijmp/ijmp instruction (depending on multilib) instead of a ret instruction.

> The implementation uses struct machine_function to pass information
> around, i.e. from avr_function_arg_advance to avr_function_ok_for_sibcall.

Look at how the s390 port handles this exact problem.

  /* Register 6 on s390 is available as an argument register but unfortunately
     "caller saved". This makes functions needing this register for arguments
     not suitable for sibcalls.  */
  return !s390_call_saved_register_used (exp);

I'll admit that it would be helpful if the cumulative_args pointer was passed
into the ok_for_sibcall hook, but it's not *that* hard to recreate that value
by hand.  This is what the s390_call_saved_register_used function does.
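
To make the idea of recreating the argument-register walk concrete, here is a host-side sketch that mirrors the bookkeeping in the patch's avr_function_arg_advance hunk. It is a simplified model and not GCC code: the function names are invented here, and the initial register count of 18, the rounding of argument sizes to an even number of bytes and the call-used register set (r0, r1, r18 to r27, r30, r31) are assumptions about the AVR ABI rather than values shown in the patch.

```c
#include <stdbool.h>
#include <stddef.h>

#define FIRST_CUM_REG 26   /* as in avr.c: arguments go into r25 down to r8 */

/* Assumed AVR call-used set; r2..r17 and r28..r29 are call-saved.  */
static bool
call_used_reg (int regno)
{
  return regno <= 1
         || (regno >= 18 && regno <= 27)
         || regno == 30
         || regno == 31;
}

/* Hypothetical helper: returns true if passing arguments of the given
   sizes (in bytes) would touch a call-saved register, i.e. the case in
   which the patch sets sibcall_fails.  */
bool
args_use_call_saved_reg (const int *arg_bytes, size_t n_args)
{
  int regno = FIRST_CUM_REG;
  int nregs = 18;          /* assumed initial value of cum->nregs */

  for (size_t i = 0; i < n_args; i++)
    {
      /* Assumption: each argument occupies an even number of bytes.  */
      int bytes = (arg_bytes[i] + 1) & ~1;

      nregs -= bytes;
      regno -= bytes;

      if (regno >= 0 && !call_used_reg (regno))
        return true;

      if (nregs <= 0)
        {
          nregs = 0;
          regno = FIRST_CUM_REG;
        }
    }

  return false;
}
```

Under these assumptions, four int arguments still fit into r18 to r25, while a fifth one would land in r16/r17 and rule out a tail call.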

> +      || (avr_OS_task_function_p (decl_callee) ^ avr_OS_task_function_p (current_function_decl))

Please just use != instead of ^ here.  Also, needs line wrapping.


I do like very much how you've cleaned up the call patterns.  IMO this should
be committed as a separate patch; I'll let the AVR maintainers approve it though.


r~

> 
> Regards, Johann
> 
> 
> 2011-03-10  Georg-Johann Lay  <avr@gjlay.de>
> 
> 	* config/avr/avr-protos.h (expand_epilogue): Change prototype
> 	* config/avr/avr.h (struct machine_function): Add field
> 	sibcall_fails.
> 	* config/avr/avr.c (init_cumulative_args,
> 	avr_function_arg_advance): Use it.
> 	* config/avr/avr.c (expand_epilogue): Add bool parameter. Handle
> 	sibcall	epilogues.
> 	(TARGET_FUNCTION_OK_FOR_SIBCALL): Define to...
> 	(avr_function_ok_for_sibcall): ...this new function.
> 	(avr_lookup_function_attribute1): New static Function.
> 	(avr_naked_function_p, interrupt_function_p,
> 	signal_function_p, avr_OS_task_function_p,
> 	avr_OS_main_function_p): Use it.
> 	* config/avr/avr.md ("sibcall", "sibcall_value",
> 	"sibcall_epilogue"): New expander.
> 	("*call_insn", "*call_value_insn"): New insn.
> 	("call_insn", "call_value_insn"): Remove
> 	("call", "call_value", "epilogue"): Change expander to handle
> 	sibling calls.
Patch

Index: config/avr/avr-protos.h
===================================================================
--- config/avr/avr-protos.h	(revision 170814)
+++ config/avr/avr-protos.h	(working copy)
@@ -76,7 +76,7 @@  extern const char *lshrsi3_out (rtx insn
 extern bool avr_rotate_bytes (rtx operands[]);
 
 extern void expand_prologue (void);
-extern void expand_epilogue (void);
+extern void expand_epilogue (bool);
 extern int avr_epilogue_uses (int regno);
 
 extern void avr_output_bld (rtx operands[], int bit_nr);
Index: config/avr/avr.md
===================================================================
--- config/avr/avr.md	(revision 170814)
+++ config/avr/avr.md	(working copy)
@@ -2647,94 +2647,85 @@ 
 ;; call
 
 (define_expand "call"
-  [(call (match_operand:HI 0 "call_insn_operand" "")
-         (match_operand:HI 1 "general_operand" ""))]
+  [(parallel[(call (match_operand:HI 0 "call_insn_operand" "")
+                   (match_operand:HI 1 "general_operand" ""))
+             (use (const_int 0))])]
   ;; Operand 1 not used on the AVR.
   ""
   "")
 
+(define_expand "sibcall"
+  [(parallel[(call (match_operand:HI 0 "call_insn_operand" "")
+                   (match_operand:HI 1 "general_operand" ""))
+             (use (const_int 1))])]
+  ""
+  "")
+
 ;; call value
 
 (define_expand "call_value"
-  [(set (match_operand 0 "register_operand" "")
-        (call (match_operand:HI 1 "call_insn_operand" "")
-              (match_operand:HI 2 "general_operand" "")))]
+  [(parallel[(set (match_operand 0 "register_operand" "")
+                  (call (match_operand:HI 1 "call_insn_operand" "")
+                        (match_operand:HI 2 "general_operand" "")))
+             (use (const_int 0))])]
   ;; Operand 2 not used on the AVR.
   ""
   "")
 
-(define_insn "call_insn"
-  [(call (mem:HI (match_operand:HI 0 "nonmemory_operand" "!z,*r,s,n"))
-         (match_operand:HI 1 "general_operand" "X,X,X,X"))]
-;; We don't need in saving Z register because r30,r31 is a call used registers
+(define_expand "sibcall_value"
+  [(parallel[(set (match_operand 0 "register_operand" "")
+                  (call (match_operand:HI 1 "call_insn_operand" "")
+                        (match_operand:HI 2 "general_operand" "")))
+             (use (const_int 1))])]
+  ""
+  "")
+
+(define_insn "*call_insn"
+  [(parallel[(call (mem:HI (match_operand:HI 0 "nonmemory_operand" "z,s,z,s"))
+                   (match_operand:HI 1 "general_operand"           "X,X,X,X"))
+             (use (match_operand:HI 2 "const_int_operand"          "L,L,P,P"))])]
   ;; Operand 1 not used on the AVR.
+  ;; Operand 2 is 1 for tail-call, 0 otherwise.
   "(register_operand (operands[0], HImode) || CONSTANT_P (operands[0]))"
-  "*{
-  if (which_alternative==0)
-     return \"%!icall\";
-  else if (which_alternative==1)
-    {
-      if (AVR_HAVE_MOVW)
-	return (AS2 (movw, r30, %0) CR_TAB
-               \"%!icall\");
-      else
-	return (AS2 (mov, r30, %A0) CR_TAB
-		AS2 (mov, r31, %B0) CR_TAB
-		\"%!icall\");
-    }
-  else if (which_alternative==2)
-    return AS1(%~call,%x0);
-  return (AS2 (ldi,r30,lo8(%0)) CR_TAB
-          AS2 (ldi,r31,hi8(%0)) CR_TAB
-          \"%!icall\");
-}"
-  [(set_attr "cc" "clobber,clobber,clobber,clobber")
+  "@
+    %!icall
+    %~call %x0
+    %!ijmp
+    %~jmp %x0"
+  [(set_attr "cc" "clobber")
    (set_attr_alternative "length"
-			 [(const_int 1)
-			  (if_then_else (eq_attr "mcu_have_movw" "yes")
-					(const_int 2)
-					(const_int 3))
-			  (if_then_else (eq_attr "mcu_mega" "yes")
-					(const_int 2)
-					(const_int 1))
-			  (const_int 3)])])
-
-(define_insn "call_value_insn"
-  [(set (match_operand 0 "register_operand" "=r,r,r,r")
-        (call (mem:HI (match_operand:HI 1 "nonmemory_operand" "!z,*r,s,n"))
-;; We don't need in saving Z register because r30,r31 is a call used registers
-              (match_operand:HI 2 "general_operand" "X,X,X,X")))]
+                         [(const_int 1)
+                          (if_then_else (eq_attr "mcu_mega" "yes")
+                                        (const_int 2)
+                                        (const_int 1))
+                          (const_int 1)
+                          (if_then_else (eq_attr "mcu_mega" "yes")
+                                        (const_int 2)
+                                        (const_int 1))])])
+
+(define_insn "*call_value_insn"
+  [(parallel[(set (match_operand 0 "register_operand"                   "=r,r,r,r")
+                  (call (mem:HI (match_operand:HI 1 "nonmemory_operand"  "z,s,z,s"))
+                        (match_operand:HI 2 "general_operand"            "X,X,X,X")))
+             (use (match_operand:HI 3 "const_int_operand"                "L,L,P,P"))])]
   ;; Operand 2 not used on the AVR.
-  "(register_operand (operands[0], VOIDmode) || CONSTANT_P (operands[0]))"
-  "*{
-  if (which_alternative==0)
-     return \"%!icall\";
-  else if (which_alternative==1)
-    {
-      if (AVR_HAVE_MOVW)
-	return (AS2 (movw, r30, %1) CR_TAB
-		\"%!icall\");
-      else
-	return (AS2 (mov, r30, %A1) CR_TAB
-		AS2 (mov, r31, %B1) CR_TAB
-		\"%!icall\");
-    }
-  else if (which_alternative==2)
-    return AS1(%~call,%x1);
-  return (AS2 (ldi, r30, lo8(%1)) CR_TAB
-          AS2 (ldi, r31, hi8(%1)) CR_TAB
-          \"%!icall\");
-}"
-  [(set_attr "cc" "clobber,clobber,clobber,clobber")
+  ;; Operand 3 is 1 for tail-call, 0 otherwise.
+  ""
+  "@
+    %!icall
+    %~call %x1
+    %!ijmp
+    %~jmp %x1"
+  [(set_attr "cc" "clobber")
    (set_attr_alternative "length"
-			 [(const_int 1)
-			  (if_then_else (eq_attr "mcu_have_movw" "yes")
-					(const_int 2)
-					(const_int 3))
-			  (if_then_else (eq_attr "mcu_mega" "yes")
-					(const_int 2)
-					(const_int 1))
-			  (const_int 3)])])
+                         [(const_int 1)
+                          (if_then_else (eq_attr "mcu_mega" "yes")
+                                        (const_int 2)
+                                        (const_int 1))
+                          (const_int 1)
+                          (if_then_else (eq_attr "mcu_mega" "yes")
+                                        (const_int 2)
+                                        (const_int 1))])])
 
 (define_insn "nop"
   [(const_int 0)]
@@ -3246,8 +3237,15 @@ 
 (define_expand "epilogue"
   [(const_int 0)]
   ""
-  "
   {
-    expand_epilogue (); 
+    expand_epilogue (false /* sibcall_p */);
     DONE;
-  }")
+  })
+
+(define_expand "sibcall_epilogue"
+  [(const_int 0)]
+  ""
+  {
+    expand_epilogue (true /* sibcall_p */);
+    DONE;
+  })
Index: config/avr/avr.c
===================================================================
--- config/avr/avr.c	(revision 170814)
+++ config/avr/avr.c	(working copy)
@@ -100,6 +100,7 @@  static rtx avr_function_arg (CUMULATIVE_
 static void avr_function_arg_advance (CUMULATIVE_ARGS *, enum machine_mode,
 				      const_tree, bool);
 static void avr_help (void);
+static bool avr_function_ok_for_sibcall (tree, tree);
 
 /* Allocate registers from r25 to r8 for parameters for function calls.  */
 #define FIRST_CUM_REG 26
@@ -231,6 +232,9 @@  static const struct default_options avr_
 #undef TARGET_HELP
 #define TARGET_HELP avr_help
 
+#undef TARGET_FUNCTION_OK_FOR_SIBCALL
+#define TARGET_FUNCTION_OK_FOR_SIBCALL avr_function_ok_for_sibcall
+
 struct gcc_target targetm = TARGET_INITIALIZER;
 
 static void
@@ -330,17 +334,34 @@  avr_regno_reg_class (int r)
   return ALL_REGS;
 }
 
+/* A helper for the subsequent function-attribute predicates, used to
+   dig for attribute 'name' in a FUNCTION_DECL or FUNCTION_TYPE.  */
+
+static inline int
+avr_lookup_function_attribute1 (const_tree func, const char *name)
+{
+  if (FUNCTION_DECL == TREE_CODE (func))
+    {
+      if (NULL_TREE != lookup_attribute (name, DECL_ATTRIBUTES (func)))
+        {
+          return true;
+        }
+      
+      func = TREE_TYPE (func);
+    }
+
+  gcc_assert (TREE_CODE (func) == FUNCTION_TYPE
+              || TREE_CODE (func) == METHOD_TYPE);
+  
+  return NULL_TREE != lookup_attribute (name, TYPE_ATTRIBUTES (func));
+}
+
 /* Return nonzero if FUNC is a naked function.  */
 
 static int
 avr_naked_function_p (tree func)
 {
-  tree a;
-
-  gcc_assert (TREE_CODE (func) == FUNCTION_DECL);
-  
-  a = lookup_attribute ("naked", TYPE_ATTRIBUTES (TREE_TYPE (func)));
-  return a != NULL_TREE;
+  return avr_lookup_function_attribute1 (func, "naked");
 }
 
 /* Return nonzero if FUNC is an interrupt function as specified
@@ -349,13 +370,7 @@  avr_naked_function_p (tree func)
 static int
 interrupt_function_p (tree func)
 {
-  tree a;
-
-  if (TREE_CODE (func) != FUNCTION_DECL)
-    return 0;
-
-  a = lookup_attribute ("interrupt", DECL_ATTRIBUTES (func));
-  return a != NULL_TREE;
+  return avr_lookup_function_attribute1 (func, "interrupt");
 }
 
 /* Return nonzero if FUNC is a signal function as specified
@@ -364,13 +379,7 @@  interrupt_function_p (tree func)
 static int
 signal_function_p (tree func)
 {
-  tree a;
-
-  if (TREE_CODE (func) != FUNCTION_DECL)
-    return 0;
-
-  a = lookup_attribute ("signal", DECL_ATTRIBUTES (func));
-  return a != NULL_TREE;
+  return avr_lookup_function_attribute1 (func, "signal");
 }
 
 /* Return nonzero if FUNC is a OS_task function.  */
@@ -378,12 +387,7 @@  signal_function_p (tree func)
 static int
 avr_OS_task_function_p (tree func)
 {
-  tree a;
-
-  gcc_assert (TREE_CODE (func) == FUNCTION_DECL);
-  
-  a = lookup_attribute ("OS_task", TYPE_ATTRIBUTES (TREE_TYPE (func)));
-  return a != NULL_TREE;
+  return avr_lookup_function_attribute1 (func, "OS_task");
 }
 
 /* Return nonzero if FUNC is a OS_main function.  */
@@ -391,12 +395,7 @@  avr_OS_task_function_p (tree func)
 static int
 avr_OS_main_function_p (tree func)
 {
-  tree a;
-
-  gcc_assert (TREE_CODE (func) == FUNCTION_DECL);
-  
-  a = lookup_attribute ("OS_main", TYPE_ATTRIBUTES (TREE_TYPE (func)));
-  return a != NULL_TREE;
+  return avr_lookup_function_attribute1 (func, "OS_main");
 }
 
 /* Return the number of hard registers to push/pop in the prologue/epilogue
@@ -864,7 +863,7 @@  avr_epilogue_uses (int regno ATTRIBUTE_U
 /*  Output RTL epilogue.  */
 
 void
-expand_epilogue (void)
+expand_epilogue (bool sibcall_p)
 {
   int reg;
   int live_seq;
@@ -875,6 +874,8 @@  expand_epilogue (void)
   /* epilogue: naked  */
   if (cfun->machine->is_naked)
     {
+      gcc_assert (!sibcall_p);
+      
       emit_jump_insn (gen_return ());
       return;
     }
@@ -1016,7 +1017,8 @@  expand_epilogue (void)
           emit_insn (gen_popqi (zero_reg_rtx));
         }
 
-      emit_jump_insn (gen_return ());
+      if (!sibcall_p)
+        emit_jump_insn (gen_return ());
     }
 }
 
@@ -1629,6 +1631,10 @@  init_cumulative_args (CUMULATIVE_ARGS *c
   cum->regno = FIRST_CUM_REG;
   if (!libname && stdarg_p (fntype))
     cum->nregs = 0;
+
+  /* Assume the callee may be tail called */
+  
+  cfun->machine->sibcall_fails = 0;
 }
 
 /* Returns the number of registers to allocate for a function argument.  */
@@ -1676,6 +1682,23 @@  avr_function_arg_advance (CUMULATIVE_ARG
   cum->nregs -= bytes;
   cum->regno -= bytes;
 
+  /* A parameter is being passed in a call-saved register. As the original
+     contents of these regs have to be restored before leaving the function,
+     a function must not pass arguments in call-saved regs in order to get
+     tail-called. */
+  
+  if (cum->regno >= 0
+      && !call_used_regs[cum->regno])
+    {
+      /* FIXME: We ship info on failing tail-call in struct machine_function.
+         This uses internals of calls.c:expand_call() and the way args_so_far
+         is used. targetm.function_ok_for_sibcall() needs to be extended to
+         pass &args_so_far, too. At present, CUMULATIVE_ARGS is target
+         dependent so that such an extension is not wanted. */
+      
+      cfun->machine->sibcall_fails = 1;
+    }
+
   if (cum->nregs <= 0)
     {
       cum->nregs = 0;
@@ -1683,6 +1706,60 @@  avr_function_arg_advance (CUMULATIVE_ARG
     }
 }
 
+/* Implement `TARGET_FUNCTION_OK_FOR_SIBCALL' */
+/* Decide whether we can make a sibling call to a function.  DECL is the
+   declaration of the function being targeted by the call and EXP is the
+   CALL_EXPR representing the call. */
+
+static bool
+avr_function_ok_for_sibcall (tree decl_callee, tree exp_callee)
+{
+  tree fntype_callee;
+
+  /* Tail-calling must fail if callee-saved regs are used to pass
+     function args.  We must not tail-call when `epilogue_restores'
+     is used.  Unfortunately, we cannot tell at this point if that
+     actually will happen or not, and we cannot step back from
+     tail-calling. Thus, we inhibit tail-calling with -mcall-prologues. */
+  
+  if (cfun->machine->sibcall_fails
+      || TARGET_CALL_PROLOGUES)
+    {
+      return false;
+    }
+  
+  fntype_callee = TREE_TYPE (CALL_EXPR_FN (exp_callee));
+
+  if (decl_callee)
+    {
+      decl_callee = TREE_TYPE (decl_callee);
+    }
+  else
+    {
+      decl_callee = fntype_callee;
+      
+      while (FUNCTION_TYPE != TREE_CODE (decl_callee)
+             && METHOD_TYPE != TREE_CODE (decl_callee))
+        {
+          decl_callee = TREE_TYPE (decl_callee);
+        }
+    }
+
+  /* Ensure that caller and callee have compatible epilogues */
+  
+  if (interrupt_function_p (current_function_decl)
+      || signal_function_p (current_function_decl)
+      || avr_naked_function_p (decl_callee)
+      || avr_naked_function_p (current_function_decl)
+      || (avr_OS_task_function_p (decl_callee) ^ avr_OS_task_function_p (current_function_decl))
+      || (avr_OS_main_function_p (decl_callee) ^ avr_OS_main_function_p (current_function_decl)))
+    {
+      return false;
+    }
+ 
+  return true;
+}
+
 /***********************************************************************
   Functions for outputting various mov's for a various modes
 ************************************************************************/
Index: config/avr/avr.h
===================================================================
--- config/avr/avr.h	(revision 170814)
+++ config/avr/avr.h	(working copy)
@@ -828,4 +828,7 @@  struct GTY(()) machine_function
   
   /* Current function stack size.  */
   int stack_usage;
+
+  /* 'true' if the call being expanded must not be tail called */
+  int sibcall_fails;
 };