Patchwork [avr] Implement PR56254

login
register
mail settings
Submitter Georg-Johann Lay
Date Feb. 8, 2013, 12:53 p.m.
Message ID <5114F540.2080905@gjlay.de>
Download mbox | patch
Permalink /patch/219136/
State New
Headers show

Comments

Georg-Johann Lay - Feb. 8, 2013, 12:53 p.m.
This adds variable delays to __builtin_avr_delay_cycles.

Is this okay?

Johann

gcc/
	PR target/56254
	* config/avr/avr.c (avr_expand_builtin)	<AVR_BUILTIN_DELAY_CYCLES>:
	Expand to delay_cycles for non-const delays.
	* config/avr/avr.md (delay_cycles): New expander.
	(*delay_cycles.libgcc): New insn.
	* config/avr/builtins.def (DELAY_CYCLES) <LIBNAME>: Set to
	__delay_cycles.
	* doc/extend.texi (AVR Built-in Functions)
	<__builtin_avr_delay_cycles>: Adjust documentation.

libgcc/
	PR target/56254
	* config/avr/t-avr (LIB1ASMFUNCS): Add _delay_cycles.
	* config/avr/lib1funcs.S (__delay_cycles): Implement it.
	(skip): New macro.
Weddington, Eric - Feb. 8, 2013, 1:26 p.m.
> -----Original Message-----

> From: Georg-Johann Lay 

> Sent: Friday, February 08, 2013 5:53 AM

> To: gcc-patches@gcc.gnu.org

> Cc: Denis Chertykov; Weddington, Eric

> Subject: [Patch,avr] Implement PR56254

> 

> This adds variable delays to __builtin_avr_delay_cycles.

> 

> Is this okay?

> 


What does this do? How does it work?

Could you explain the statement in the documentation "if ticks is not a compile-time constant, the delay might be overestimated by up to 15 cycles". Why would we want to do that? I thought the whole point of a builtin delay cycles function was to have an accurate delay.

Eric
Georg-Johann Lay - Feb. 8, 2013, 3:59 p.m.
Weddington, Eric wrote:

> From: Georg-Johann Lay
>> This adds variable delays to __builtin_avr_delay_cycles.
>> 
>> Is this okay?
> 
> What does this do? How does it work?
> 
> Could you explain the statement in the documentation "if ticks is not a
> compile-time constant, the delay might be overestimated by up to 15 cycles".
> Why would we want to do that? I thought the whole point of a builtin delay
> cycles function was to have an accurate delay.

It's not uncommon that users complain that non-constant delays are not supported.

It works that same way like const delays except that no constant has to be
loaded and the delay is not 100% exact, i.e. might be up to 15 ticks too slow.

For delays like 100000 it does not matter whether or not it is 15 ticks longer,
most crystals are not exact enough you'll ever notice the difference.

IMHO, using delays in a application is a design flaw in the first place (except
for very short delays to wait for slow I/O), but the users want it, as you know
very well.


In cases of variable delays the inexactness comes from computing the amount of
ticks at run time, not __builtin_avr_delay_cycles itself.  But that's up the
user that calls __builtin_avr_delay_cycles, it's not a problem of
__builtin_avr_delay_cycles.


Johann
Weddington, Eric - Feb. 8, 2013, 5:25 p.m.
> -----Original Message-----

> From: Georg-Johann Lay 

> Sent: Friday, February 08, 2013 9:00 AM

> To: Weddington, Eric

> Cc: gcc-patches@gcc.gnu.org; Denis Chertykov; Joerg Wunsch

> Subject: Re: [Patch,avr] Implement PR56254

> 

> Weddington, Eric wrote:

> 

> >

> > What does this do? How does it work?

> >

> > Could you explain the statement in the documentation "if ticks is not

> a

> > compile-time constant, the delay might be overestimated by up to 15

> cycles".

> > Why would we want to do that? I thought the whole point of a builtin

> delay

> > cycles function was to have an accurate delay.

> 

> It's not uncommon that users complain that non-constant delays are not

> supported.


Agreed.
 
> It works that same way like const delays except that no constant has to

> be

> loaded and the delay is not 100% exact, i.e. might be up to 15 ticks

> too slow.

> 

> For delays like 100000 it does not matter whether or not it is 15 ticks

> longer,

> most crystals are not exact enough you'll ever notice the difference.


True.

> IMHO, using delays in a application is a design flaw in the first place

> (except

> for very short delays to wait for slow I/O), but the users want it, as

> you know

> very well.


It is not necessarily a design flaw. We cannot judge someone else's application in that way.

And this is what I have a problem with: As you said, most of the time users will implement a very short delay waiting for I/O. It is exactly these short delays where being 15 cycles too slow will have the biggest impact, as it is a bigger percentage of the overall time being delayed.

We already have variable delays in avr-libc. How does this patch solve an existing problem? Does this patch make it more accurate than avr-libc?

Eric
 
> 

> In cases of variable delays the inexactness comes from computing the

> amount of

> ticks at run time, not __builtin_avr_delay_cycles itself.  But that's

> up the

> user that calls __builtin_avr_delay_cycles, it's not a problem of

> __builtin_avr_delay_cycles.

> 

> 

> Johann

>
Georg-Johann Lay - Feb. 8, 2013, 5:46 p.m.
Weddington, Eric wrote:
>> -----Original Message----- From: Georg-Johann Lay Sent: Friday, February
>> 08, 2013 9:00 AM To: Weddington, Eric Cc: gcc-patches@gcc.gnu.org; Denis
>> Chertykov; Joerg Wunsch Subject: Re: [Patch,avr] Implement PR56254
>> 
>> Weddington, Eric wrote:
>> 
>>> What does this do? How does it work?
>>> 
>>> Could you explain the statement in the documentation "if ticks is not
>> a
>>> compile-time constant, the delay might be overestimated by up to 15
>> cycles".
>>> Why would we want to do that? I thought the whole point of a builtin
>> delay
>>> cycles function was to have an accurate delay.
>> It's not uncommon that users complain that non-constant delays are not 
>> supported.
> 
> Agreed.
> 
>> It works that same way like const delays except that no constant has to be
>>  loaded and the delay is not 100% exact, i.e. might be up to 15 ticks too
>> slow.
>> 
>> For delays like 100000 it does not matter whether or not it is 15 ticks 
>> longer, most crystals are not exact enough you'll ever notice the
>> difference.
> 
> True.
> 
>> IMHO, using delays in a application is a design flaw in the first place 
>> (except for very short delays to wait for slow I/O), but the users want
>> it, as you know very well.
> 
> It is not necessarily a design flaw. We cannot judge someone else's
> application in that way.

Ya, you are right.  But for the same reason the compiler should not throw an
error if the user wants a non-const delay.  It could war, yes, but error() is
not nice.

IMO either support a feature properly or not at all.

> And this is what I have a problem with: As you said, most of the time users
> will implement a very short delay waiting for I/O. It is exactly these short
> delays where being 15 cycles too slow will have the biggest impact, as it is
> a bigger percentage of the overall time being delayed.
> 
> We already have variable delays in avr-libc. How does this patch solve an
> existing problem? Does this patch make it more accurate than avr-libc?

This is a patch for gcc, not libc.

libc can define any interface it wants and use, e.g. attribute error, attribute
warning, builtin_constant_p or whatever to wrap around builtin_delay_cycles.

But throwing an error so that the application no more compiles with -O0 is not
nice, i.e. it blocks compiling the stuff for debugging.

Moreover, most applications want a minimum delay, not a maximum delay.


Johann

Patch

Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 195736)
+++ gcc/doc/extend.texi	(working copy)
@@ -9030,8 +9030,8 @@  void __builtin_avr_delay_cycles (unsigne
 @noindent
 @code{ticks} is the number of ticks to delay execution. Note that this
 built-in does not take into account the effect of interrupts that
-might increase delay time. @code{ticks} must be a compile-time
-integer constant; delays with a variable number of cycles are not supported.
+might increase delay time. If @code{ticks} is not a compile-time
+constant, the delay might be overestimated by up to 15@tie{}cycles.
 
 @smallexample
 char __builtin_avr_flash_segment (const __memx void*)
Index: gcc/config/avr/builtins.def
===================================================================
--- gcc/config/avr/builtins.def	(revision 195878)
+++ gcc/config/avr/builtins.def	(working copy)
@@ -50,7 +50,7 @@  DEF_BUILTIN (FMULSU, 2, int_ftype_char_u
 
 /* More complex stuff that cannot be mapped 1:1 to an instruction.  */
 
-DEF_BUILTIN (DELAY_CYCLES, -1, void_ftype_ulong, nothing, NULL)
+DEF_BUILTIN (DELAY_CYCLES, -1, void_ftype_ulong, nothing, "__delay_cycles")
 DEF_BUILTIN (INSERT_BITS, 3, uchar_ftype_ulong_uchar_uchar, insert_bits, NULL)
 DEF_BUILTIN (FLASH_SEGMENT, 1, char_ftype_const_memx_ptr, flash_segment, NULL)
 
Index: gcc/config/avr/avr.md
===================================================================
--- gcc/config/avr/avr.md	(revision 195878)
+++ gcc/config/avr/avr.md	(working copy)
@@ -5495,6 +5495,27 @@  (define_expand "sibcall_epilogue"
 ;; Some instructions resp. instruction sequences available
 ;; via builtins.
 
+(define_expand "delay_cycles"
+  [(set (reg:SI 22)
+        (match_operand:SI 0 "register_operand" ""))
+   (parallel [(set (reg:SI 22)
+                   (unspec_volatile:SI [(reg:SI 22)]
+                                       UNSPECV_DELAY_CYCLES))
+              (set (match_operand:BLK 1 "" "")
+                   (unspec_volatile:BLK [(match_dup 1)]
+                                        UNSPECV_MEMORY_BARRIER))])])
+
+(define_insn "*delay_cycles.libgcc"
+  [(set (reg:SI 22)
+        (unspec_volatile:SI [(reg:SI 22)]
+                            UNSPECV_DELAY_CYCLES))
+   (set (match_operand:BLK 0 "" "")
+        (unspec_volatile:BLK [(match_dup 0)] UNSPECV_MEMORY_BARRIER))]
+  ""
+  "%~call __delay_cycles"
+  [(set_attr "type" "xcall")
+   (set_attr "cc" "clobber")])
+
 (define_insn "delay_cycles_1"
   [(unspec_volatile [(match_operand:QI 0 "const_int_operand" "n")
                      (const_int 1)]
Index: gcc/config/avr/avr.c
===================================================================
--- gcc/config/avr/avr.c	(revision 195878)
+++ gcc/config/avr/avr.c	(working copy)
@@ -11707,18 +11707,18 @@  avr_expand_builtin (tree exp, rtx target
     {
     case AVR_BUILTIN_NOP:
       emit_insn (gen_nopv (GEN_INT(1)));
-      return 0;
+      return NULL_RTX;
 
     case AVR_BUILTIN_DELAY_CYCLES:
       {
         arg0 = CALL_EXPR_ARG (exp, 0);
         op0 = expand_expr (arg0, NULL_RTX, VOIDmode, EXPAND_NORMAL);
 
-        if (!CONST_INT_P (op0))
-          error ("%s expects a compile time integer constant", bname);
-        else
+        if (CONST_INT_P (op0))
           avr_expand_delay_cycles (op0);
-
+        else
+          emit_insn (gen_delay_cycles (force_reg (SImode, op0),
+                                       avr_mem_clobber()));
         return NULL_RTX;
       }
 
Index: libgcc/config/avr/lib1funcs.S
===================================================================
--- libgcc/config/avr/lib1funcs.S	(revision 195878)
+++ libgcc/config/avr/lib1funcs.S	(working copy)
@@ -128,6 +128,9 @@  see the files COPYING3 and COPYING.RUNTI
 #define exp_lo(N)  hlo8 ((N) << 23)
 #define exp_hi(N)  hhi8 ((N) << 23)
 
+;; Skip the next instruction, typically a jump target
+#define skip cpse 0,0
+
 
 .section .text.libgcc.mul, "ax", @progbits
 
@@ -2752,6 +2755,30 @@  DEFUN __bswapdi2
 ENDF __bswapdi2
 #endif /* defined (L_bswapdi2) */
 
+#if defined (L_delay_cycles)
+
+#define A0 22
+#define A1 A0+1
+#define A2 A0+2
+#define A3 A0+3
+
+DEFUN __delay_cycles
+    subi    A0, 14
+    skip
+0:  subi    A0, 6
+    sbci    A1, 0
+    sbci    A2, 0
+    sbci    A3, 0
+    brcc 0b
+    ret
+ENDF __delay_cycles
+
+#undef A0
+#undef A1
+#undef A2
+#undef A3
+#endif  /* L_delay_cycles */
+
 
 /**********************************
  * 64-bit shifts
Index: libgcc/config/avr/t-avr
===================================================================
--- libgcc/config/avr/t-avr	(revision 195878)
+++ libgcc/config/avr/t-avr	(working copy)
@@ -51,6 +51,7 @@  LIB1ASMFUNCS = \
 	_popcountqi2 \
 	_bswapsi2 \
 	_bswapdi2 \
+	_delay_cycles \
 	_ashldi3 _ashrdi3 _lshrdi3 _rotldi3 \
 	_adddi3 _adddi3_s8 _subdi3 \
 	_cmpdi2 _cmpdi2_s8 \