RFA: Add Epiphany port

Message ID 20111104094354.cl551itv9c8c4g0o-nzlynne@webmail.spamcop.net
State New

Commit Message

Joern Rennecke Nov. 4, 2011, 1:43 p.m. UTC
Thanks for the helpful comments.

Attached is the revised version.
2011-11-04  Joern Rennecke <joern.rennecke@embecosm.com>

toplevel: (subject to SC approval)
	MAINTAINERS: Move myself from Write After Approval to CPU Port
	Maintainers section, as Epiphany maintainer.
gcc:
	* config.gcc (epiphany-*-*): New architecture.
	(epiphany-*-elf): New configuration.
	* config/epiphany, common/config/epiphany : New directories.
	* doc/extend.texi (disinterrupt attribute): Add Epiphany.
	(interrupt attribute): Add Epiphany.
	(long_call, short_call attribute): Add Epiphany.
	* doc/invoke.texi (Options): Add Epiphany options.
	* doc/md.texi (Machine Constraints): Add Epiphany constraints.
	* doc/install.texi (Options specification):
	Add --with-stack-offset=@var{num} description.
	(host/target specific issues): Add epiphany-*-elf.
	* doc/contrib.texi (Contributors): Mention Epiphany port.
gcc/testsuite:
	* gcc.c-torture/execute/ieee/mul-subnormal-single-1.x:
	Disable test on Epiphany.
	* gcc.c-torture/execute/20101011-1.c: Disable test on Epiphany.
	* gcc.dg/stack-usage-1.c [__epiphany__] (SIZE): Define.
	* gcc.dg/pragma-pack-3.c: Disable test on Epiphany.
	* g++.dg/parse/pragma3.C: Likewise.
	* stackalign/builtin-apply-2.c (STACK_ARGUMENTS_SIZE): Define.
	(bar): Use it.
	* gcc.dg/weak/typeof-2.c [epiphany-*-*]: Add option -mshort-calls.
	* gcc.dg/tls/thr-cse-1.c: Likewise.
	* g++.dg/opt/devirt2.C: Likewise.
	* gcc.dg/20020312-2.c [epiphany-*-*] (PIC_REG): Define.
	* gcc.dg/builtin-apply2.c [__epiphany__]: (STACK_ARGUMENTS_SIZE): 20.
	* gcc.target/epiphany: New directory.
libgcc:
	* config.host (epiphany-*-elf*): New configuration.
	* config/epiphany: New Directory.
contrib:
	* config-list.mk: Add Epiphany configurations.

Index: htdocs/backends.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/backends.html,v
retrieving revision 1.41
diff -u -r1.41 backends.html
--- htdocs/backends.html	15 Jul 2011 09:48:14 -0000	1.41
+++ htdocs/backends.html	4 Nov 2011 13:27:01 -0000
@@ -73,6 +73,7 @@
 c4x      |  ??  N I BD       g  d  e 
 c6x      |   S     CB      p g bda 
 cris     |       F  B     cp g b a  s
+epiphany |         C       p g  da  s
 fr30     | ??    FI B        gm     s
 frv      | ??       B      p    da  s
 h8300    |       FI       cp g      s
Index: htdocs/index.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/index.html,v
retrieving revision 1.819
diff -u -r1.819 index.html
--- htdocs/index.html	28 Oct 2011 21:47:30 -0000	1.819
+++ htdocs/index.html	4 Nov 2011 13:27:01 -0000
@@ -53,6 +53,11 @@
 
 <dl class="news">
 
+<dt><span>Epiphany processor support</span>
+    <span class="date">[2011-11-03]</span></dt>
+<dd>A port for Adapteva's Epiphany multicore processor has been contributed by
+Embecosm.</dd>
+
 <dt><span><a href="gcc-4.6/">GCC 4.6.2</a> released</span>
     <span class="date">[2011-10-26]</span></dt>
 <dd></dd>
Index: htdocs/readings.html
===================================================================
RCS file: /cvs/gcc/wwwdocs/htdocs/readings.html,v
retrieving revision 1.215
diff -u -r1.215 readings.html
--- htdocs/readings.html	15 Jul 2011 09:48:15 -0000	1.215
+++ htdocs/readings.html	4 Nov 2011 13:27:01 -0000
@@ -114,6 +114,12 @@
    <br /><a href="http://developer.axis.com/">Site with CPU documentation</a>
  </li>
  
+ <li>Epiphany
+  <br />Manufacturer: Adapteva
+  <br /><a href="http://www.adapteva.com/">Manufacturer's website</a> with
+  additional information about the Epiphany architecture.
+ </li>
+ 
  <li>fr30
    <br />Manufacturer: Fujitsu
    <br />Acronym stands for: Fujitsu RISC

Comments

Richard Henderson Nov. 4, 2011, 6:15 p.m. UTC | #1
> (define_predicate "call_address_operand"
>   (match_code "symbol_ref,const,reg")
> {
>   return (symbolic_operand (op, mode) || (GET_CODE (op) == REG));
> })

Nit.

(define_predicate "call_address_operand"
  (ior (match_code "reg")
       (match_operand 0 "symbolic_operand")))

> (define_special_predicate "any_gpr_operand"
>   (match_code "subreg,reg")
> {
>   return gpr_operand (op, mode);
> })

(define_special_predicate "any_gpr_operand"
  (match_operand 0 "gpr_operand"))

though, I'm not sure what this achieves...  while any_gpr_operand
will ignore its mode, passing on that same mode to gpr_operand won't.

> (define_insn_and_split "move_frame"
>   [(set (match_operand:SI 0 "gpr_operand" "=r")
>         (match_operand:SI 1 "register_operand" "r"))
>    (clobber (reg:CC CC_REGNUM))]
>   "operands[1] == frame_pointer_rtx || operands[1] == arg_pointer_rtx"

It looks to me that there are several places that could be tidied up
if you have a frame_register_operand predicate to use here instead of
testing operands[] in the extra_constraints field.

> ;; If the frame pointer elimination offset is zero, we'll use this pattern.
> (define_insn_and_split "*move_frame_1"
>   [(set (match_operand:SI 0 "gpr_operand" "=r")
>         (match_operand:SI 1 "register_operand" "r"))
>    (clobber (reg:CC CC_REGNUM))]
>   "(reload_in_progress || reload_completed)
>    && (operands[1] == stack_pointer_rtx
>        || operands[1] == hard_frame_pointer_rtx)"
>   "#"
>   "&& 1"
>   [(set (match_dup 0) (match_dup 1))])

Duplicate of the above?  I really don't see how they differ...

> ;; reload uses gen_addsi2 because it doesn't understand the need for
> ;; the clobber.
> (define_peephole2
>   [(set (match_operand:SI 0 "gpr_operand" "")
>         (match_operand:SI 1 "const_int_operand" ""))
>    (parallel [(set (match_dup 0)
>                    (plus:SI (match_dup 0)
>                             (match_operand:SI 2 "gpr_operand")))
>               (clobber (reg:CC UNKNOWN_REGNUM))])]

I'd like to understand all of this a little more.

Given that reload *would* generate an add3 (with no clobber), and
add-without-clobber isn't ordinarily a valid insn, why not just go
ahead and use that as your reload pattern instead of this 
clobber placeholder?

Wouldn't a bare plus pattern with that same reload_in_progress || reload_completed
check work just as well for all the pattern matching you want to do
during peephole analysis?

>   int scratch = (0x17
>                  ^ (true_regnum (operands[0]) & 1)
>                  ^ (true_regnum (operands[1]) & 2)
>                  ^ (true_regnum (operands[2]) & 4));
>   asm_fprintf (asm_out_file, \"\tstr r%d,[sp,#0]\n\", scratch);
>   asm_fprintf (asm_out_file, \"\tmovfs r%d,status\n\", scratch);
>   output_asm_insn (\"add %0,%1,%2\", operands);
>   asm_fprintf (asm_out_file, \"\tmovts status,r%d\n\", scratch);
>   asm_fprintf (asm_out_file, \"\tldr r%d,[sp,#0]\n\", scratch);
>   return \"\";

It does seem like you'd do well to split this pattern.  If you do, then you'll
automatically get the right changes to debugging unwind info across this.

A test for epilogue_completed in the split condition should be sufficient to
wait until after peephole2 has finished, so that you don't interfere with the
transformations you want there.

I'm also interested in hearing about how well this whole scheme works in 
practice, as opposed to merely waiting until after reload to split and flags
users.  There are certainly lots of other ports that are in the same boat
with respect to only having a flags-clobbering add.
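
The scratch register number in the quoted output code appears to be chosen so that it can never coincide with any of the three operands: each XOR term flips one low bit of 0x17 against the matching bit of one operand. A minimal C sketch of that arithmetic follows; `pick_scratch` and the helper names are hypothetical, and plain integers stand in for `true_regnum (operands[N])`.

```c
#include <assert.h>

/* Same expression as in the quoted output code, with register numbers
   passed in directly instead of coming from true_regnum ().  */
static int pick_scratch (int r0, int r1, int r2)
{
  return 0x17 ^ (r0 & 1) ^ (r1 & 2) ^ (r2 & 4);
}

/* Returns 1 when the chosen scratch differs from all three operands
   and lies in the range r16..r23.  */
static int scratch_is_safe (int r0, int r1, int r2)
{
  int s = pick_scratch (r0, r1, r2);
  return s != r0 && s != r1 && s != r2 && s >= 16 && s <= 23;
}

/* Exhaustively checks every operand triple drawn from NREGS registers.  */
static int all_choices_safe (int nregs)
{
  for (int r0 = 0; r0 < nregs; r0++)
    for (int r1 = 0; r1 < nregs; r1++)
      for (int r2 = 0; r2 < nregs; r2++)
        if (!scratch_is_safe (r0, r1, r2))
          return 0;
  return 1;
}
```

The scratch differs from operand 0 in bit 0, from operand 1 in bit 1, and from operand 2 in bit 2. Bit 4 is always set and bit 3 always clear, so the result stays within r16..r23.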

> (define_insn_and_split "*recipsf2_1"
>   [(match_parallel 4 "float_operation"
>      [(set (match_operand:SF 0 "gpr_operand" "=r,r")
>            (div:SF (match_operand:SF 1 "const_float_1_operand" "")
>                    (match_operand:SF 2 "move_src_operand" "rU16m,rU16mCal")))
>       (use (match_operand:SI 3 "move_src_operand" "rU16m,rU16mCal"))

How do you prevent a post-reload copy propagation pass from putting things
back just the way before you split them?  It seems to me that's the primary
reason to use specific register constraints here.

> (define_insn "fmadd"
>   [(match_parallel 4 "float_operation"
>      [(set (match_operand:SF 0 "gpr_operand" "=r")
>            (fma:SF (match_operand:SF 2 "gpr_operand" "%r")
>                    (match_operand:SF 3 "gpr_operand" "r")
>                    (match_operand:SF 1 "gpr_operand" "0")))
>       (clobber (reg:CC_FP CCFP_REGNUM))])]

Presumably the strange operand ordering is left over from your port-specific
builtins?  Also, the % is extraneous since the constraints are identical.

> ; combiner pattern, also used by vector combiner pattern
> (define_expand "maddsf"
>   [(parallel
>      [(set (match_operand:SF 0 "gpr_operand" "=r")
>            (plus:SF (mult:SF (match_operand:SF 1 "gpr_operand" "%r")
>                              (match_operand:SF 2 "gpr_operand" "r"))
>                     (match_operand:SF 3 "gpr_operand" "0")))
>       (clobber (reg:CC_FP CCFP_REGNUM))])]
>   "TARGET_FUSED_MADD")

I suspect these aren't needed anymore.  Anything the combiner could have
found should be able to be found by the SSA optimizers.

> ; leave to desaster.

lead to disaster

>         (and:SI (match_operand:SI 1 "gpr_operand" "%r")
>                 (match_operand:SI 2 "gpr_operand" "r")))

More useless %.  And more in the other logicals.

> (define_expand "one_cmplsi2"
>   [(set (match_operand:SI 0 "gpr_operand" "")
>         (xor:SI (match_operand:SI 1 "gpr_operand" "")
>                 (match_dup 2)))]
>   ""
>   "emit_insn (gen_xorsi3 (operands[0], operands[1],
>                           force_reg (SImode, GEN_INT (-1))));
>    DONE;")
> 
> (define_insn "*one_cmplsi2_i"
>   [(set (match_operand:SI 0 "gpr_operand" "=r")
>         (not:SI (match_operand:SI 1 "gpr_operand" "r")))
>    (clobber (reg:CC CC_REGNUM))]
>   "epiphany_m1reg >= 0"
>   "eor %0,%1,%-")

Why not combine these?  I'm pretty sure that expand_binop will try the xor
solution all on its own.  I'd think you'd only want to not use that when
m1reg is available.

Of course, if gpr_operand were to include -1 when m1reg is available, and
you swapped all the constraints to something other than "r", you get even
this for free.
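
The xor solution rests on the identity that a bitwise complement equals an exclusive-or with an all-ones word, which is why a register permanently holding -1 makes the `eor` form usable. A small C sketch, assuming SImode can be modelled as `uint32_t`; the function and macro names are illustrative only and not part of the port.

```c
#include <assert.h>
#include <stdint.h>

/* All-ones word, standing in for the register that holds -1.  */
#define M1_VALUE UINT32_C (0xFFFFFFFF)

/* Complement computed directly, as (not:SI x) would.  */
static uint32_t complement_direct (uint32_t x)
{
  return ~x;
}

/* Complement computed as (xor:SI x -1).  */
static uint32_t complement_via_xor (uint32_t x)
{
  return x ^ M1_VALUE;
}

/* Returns 1 when both forms give the same result for X.  */
static int complements_agree (uint32_t x)
{
  return complement_direct (x) == complement_via_xor (x);
}
```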

>   "*
> {
>   rtx xop[3];
> 
>   xop[0] = operands[0];
>   xop[1] = operands[1];
>   xop[2] = GEN_INT (31-INTVAL (operands[2]));
>   output_asm_insn (\"lsl %0,%1,%2\", xop);
>   return \"\";
> }")

Please don't use "* in new code; { } is sufficient, and you get to remove
all of the gross \" bits.

> (define_insn "*mov<mode>cc_insn"
>   [(set (match_operand:WMODE 0 "gpr_operand" "=r")
>         (if_then_else:WMODE (match_operator 3 "proper_comparison_operator"
>                               [(match_operand 4 "cc_operand") (const_int 0)])
>                             (match_operand:WMODE 1 "nonmemory_operand" "r")
>                             (match_operand:WMODE 2 "gpr_operand" "0")))]

If there's a good reason for nonmemory_operand and not gpr_operand here,
you should add a big comment.  It looks like a mistake.

> static bool
> epiphany_frame_pointer_required (void)
> {
>   return cfun->calls_alloca;

Isn't this automatic?

> epiphany-load-combiner.o : $(srcdir)/config/epiphany/epiphany-load-combiner.c \

Missing file?

> #define IMM16(X)     ((unsigned)(X) <= 0xFFFF)
> #define IMM5(X)      ((unsigned)(X) <= 0x1F)

These need to be unsigned HOST_WIDE_INT, at minimum.
Preferably no cast at all and compare vs 0 as well.
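
To make the concern concrete: when the argument is wider than `unsigned`, the cast drops the high bits before the comparison, so an out-of-range value can pass the test. A minimal sketch, assuming a 32-bit `unsigned` and using `long long` as a stand-in for HOST_WIDE_INT; the `_NARROW` and `_WIDE` macro names are hypothetical.

```c
#include <assert.h>

/* Stand-in for GCC's HOST_WIDE_INT; assumed to be 64 bits here.  */
typedef long long hwi;

/* The test as quoted: the cast truncates to the width of unsigned
   (assumed to be 32 bits).  */
#define IMM16_NARROW(X) ((unsigned) (X) <= 0xFFFF)

/* The same test with the cast widened to the full host-wide type.  */
#define IMM16_WIDE(X) ((unsigned long long) (X) <= 0xFFFF)

/* Returns 1 when the two macros disagree for V.  */
static int imm16_macros_disagree (hwi v)
{
  return IMM16_NARROW (v) != IMM16_WIDE (v);
}
```

With these definitions, a value such as 0x100000000 truncates to zero under the narrow cast and is wrongly accepted, while negative values are rejected by both forms.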

> /* ??? This currently isn't used.  Waiting for PIC.  */
> #if 0
> #define EXTRA_CONSTRAINT(VALUE, C) \
> ((C) == 'R' ? (SYMBOL_REF_FUNCTION_P (VALUE) || GET_CODE (VALUE) == LABEL_REF) \

Just remove it.

There appear to be quite a number of macros in this file that have been
migrated to target hooks.  Please move them.

> /* For DWARF.  Marginally different than default so output is "prettier"
>    (and consistent with above).  */
> #define PUSHSECTION_ASM_OP "\t.section "

Unused?

> /* Tell crtstuff.c we're using ELF.  */
> #define OBJECT_FORMAT_ELF

Why wouldn't you get this from config/elfos.h?



r~

Joern Rennecke Nov. 5, 2011, midnight UTC | #2
Quoting Richard Henderson <rth@redhat.com>:

>> (define_special_predicate "any_gpr_operand"
>>   (match_code "subreg,reg")
>> {
>>   return gpr_operand (op, mode);
>> })
>
> (define_special_predicate "any_gpr_operand"
>   (match_operand 0 "gpr_operand"))
>
> though, I'm not sure what this achieves...  while any_gpr_operand
> will ignore its mode, passing on that same mode to gpr_operand won't.

It does get rid of the warning in the md file about a 'missing' mode, when
I really want to match registers in various modes.
When you look at the SH port, you'll see that I have used this trick of
having an "any_*" predicate to silence a warning about using VOIDmode in
a couple of places.

>> (define_insn_and_split "move_frame"
>>   [(set (match_operand:SI 0 "gpr_operand" "=r")
>>         (match_operand:SI 1 "register_operand" "r"))
>>    (clobber (reg:CC CC_REGNUM))]
>>   "operands[1] == frame_pointer_rtx || operands[1] == arg_pointer_rtx"
>
> It looks to me that there are several places that could be tidied up
> if you have a frame_register_operand predicate to use here instead of
> testing operands[] in the extra_constraints field.

There are two mentions of frame_pointer_rtx, and three of  
hard_frame_pointer_rtx, in the entirety of the md file.
So I don't think having such a predicate - which would have to do
different things before, during, and after reload, and would need to
be grokked before understanding the md file - would make the port
easier to understand.
It could make it easier for some optimizers to manipulate these patterns.
OTOH, judging from past performance, I'm not sure that that would be a
good thing.  There are inherent contradictions in the assumptions
different compiler passes make regarding frame pointer based addresses
on RISC machines - e.g. we don't actually know the valid add offset or  
addressing ranges off the frame / arg pointer until after they have
been eliminated.
A few weeks of performance and regression testing, in a toolchain that has
the currently known correctness issues fixed, could provide some data to
make an informed decision.
So, while your suggestion might have merit, I can't pursue it at the moment.

>> ;; If the frame pointer elimination offset is zero, we'll use this pattern.
>> (define_insn_and_split "*move_frame_1"
>>   [(set (match_operand:SI 0 "gpr_operand" "=r")
>>         (match_operand:SI 1 "register_operand" "r"))
>>    (clobber (reg:CC CC_REGNUM))]
>>   "(reload_in_progress || reload_completed)
>>    && (operands[1] == stack_pointer_rtx
>>        || operands[1] == hard_frame_pointer_rtx)"
>>   "#"
>>   "&& 1"
>>   [(set (match_dup 0) (match_dup 1))])
>
> Duplicate of the above?  I really don't see how they differ...

The first is for a pre-reload frame pointer reference, which is subject to
register elimination.  The original idea is that it morphs into an
add by the effect of register elimination.  Two years later I added the
..._and_split part for the case that there is a zero offset.
I can't remember why I did it, or if there was some actual problem I was
trying to solve.  Looking at it now in conjunction with *move_frame_1, I think
I probably just added the splitter because I thought it should have one,
forgetting about *move_frame_1.

The second is for a reload_in_progress / reload_completed frame
pointer reference.  I introduced it as a define_insn_and_split at the
same time as I introduced move_frame as an insn pattern.

move_frame is supposed to have precedence in insn recognition above
*movsi_insn, which in turn is supposed to have precedence above
*move_frame_1

Also, combining these two patterns into a single insn pattern would give
reload more freedom to change operands into each other (when it doesn't
re-recognize insns), which is unwanted in this case.

I suppose I should remove the _and_split part from move_frame again, and
put a comment there that the post-reload/elimination recognition and
splitting is in *move_frame_1.

>> ;; reload uses gen_addsi2 because it doesn't understand the need for
>> ;; the clobber.
>> (define_peephole2
>>   [(set (match_operand:SI 0 "gpr_operand" "")
>>         (match_operand:SI 1 "const_int_operand" ""))
>>    (parallel [(set (match_dup 0)
>>                    (plus:SI (match_dup 0)
>>                             (match_operand:SI 2 "gpr_operand")))
>>               (clobber (reg:CC UNKNOWN_REGNUM))])]
>
> I'd like to understand all of this a little more.
>
> Given that reload *would* generate an add3 (with no clobber), and
> add-without-clobber isn't ordinarily a valid insn, why not just go
> ahead and use that as your reload pattern instead of this
> clobber placeholder?

Having add3 without clobber would mean it's a free-for-all for the optimizers
to generate it.  Which would be disastrous for code speed and size.
Also, it would be harder to make sure the rescanning peephole2 terminates
when there is a pattern that looks so simple but is so wrong.

> Wouldn't a bare plus pattern with that same reload_in_progress ||   
> reload_completed
> check work just as well for all the pattern matching you want to do
> during peephole analysis?

It would stop the pre-reload optimizers from goofing up, but not the
post-reload ones.

>>   int scratch = (0x17
>>                  ^ (true_regnum (operands[0]) & 1)
>>                  ^ (true_regnum (operands[1]) & 2)
>>                  ^ (true_regnum (operands[2]) & 4));
>>   asm_fprintf (asm_out_file, \"\tstr r%d,[sp,#0]\n\", scratch);
>>   asm_fprintf (asm_out_file, \"\tmovfs r%d,status\n\", scratch);
>>   output_asm_insn (\"add %0,%1,%2\", operands);
>>   asm_fprintf (asm_out_file, \"\tmovts status,r%d\n\", scratch);
>>   asm_fprintf (asm_out_file, \"\tldr r%d,[sp,#0]\n\", scratch);
>>   return \"\";
>
> It does seem like you'd do well to split this pattern.  If you do,   
> then you'll
> automatically get the right changes to debugging unwind info across this.

The only possible unwind issue I see here is if an interrupt or exception
triggers inside this sequence.  And that is assuming that we can unwind
the interrupt / exception otherwise.
I think the impact on debuggability of this pattern is minimal.  It is
not supposed to be emitted frequently, it is only there as a last resort
when all optimization attempts fail.  In fact, I have yet to see a testcase
where that happens at -O2 or higher.  If the integer flags register is free,
we split.  If the integer flags register is set in the previous insn, and data
flow allows re-ordering, we split.  If we can find any free register
to save the flags into, we split.

> A test for epilogue_completed in the split condition should be sufficient to
> wait until after peephole2 has finished, so that you don't interfere with the
> transformations you want there.

Actually, peephole2 runs after thread_prologue_and_epilogue.

If we found / made another suitable variable to make the splitter
conditional on, I suppose it could help debugging at -O0 .  In the above
stated corner case.  There is the drawback that it makes it less easy to
spot these sequences when they are split.  OTOH I could make the splitter
optional, and/or have an optional warning when it triggers.
But I couldn't sufficiently test such a change with mainline
GCC at the moment, because there are too many unresolved other issues.
And testing it in another branch and then re-merging to mainline is
likely to miss the stage1 window.
For now, I can add a comment about the possible benefit of a splitter.

> I'm also interested in hearing about how well this whole scheme works in
> practice, as opposed to merely waiting until after reload to split and flags
> users.

"and flags users"?   I think I get the general idea what you are trying to
express, but I can't reconstruct what exactly you wanted to say.

>  There are certainly lots of other ports that are in the same boat
> with respect to only having a flags-clobbering add.

Well, actually, the Epiphany has two different flags registers,
and a biased stack pointer and post-modify can be used to add
a constant to the (value of the) stack pointer without clobbering
any flags, so the potential for peephole2 patterns to fix things up
is probably above average.

>> (define_insn_and_split "*recipsf2_1"
>>   [(match_parallel 4 "float_operation"
>>      [(set (match_operand:SF 0 "gpr_operand" "=r,r")
>>            (div:SF (match_operand:SF 1 "const_float_1_operand" "")
>>                    (match_operand:SF 2 "move_src_operand"   
>> "rU16m,rU16mCal")))
>>       (use (match_operand:SI 3 "move_src_operand" "rU16m,rU16mCal"))
>
> How do you prevent a post-reload copy propagation pass from putting things
> back just the way before you split them?  It seems to me that's the primary
> reason to use specific register constraints here.

They don't know how to put back the hard reg clobbers of
(reg:SF 0) and (reg:SI 1) .
I suppose they might do that in the future, then I'd have to find a safer
way to avoid the recombination.  But I can't test that till the copy
propagation has learned this new trick, so for the time being I think
I better leave the pattern alone.

Well, if it did get recombined, it would just be split again later.
But I'll change the output pattern to "#" to make sure this will
always happen.

>> (define_insn "fmadd"
>>   [(match_parallel 4 "float_operation"
>>      [(set (match_operand:SF 0 "gpr_operand" "=r")
>>            (fma:SF (match_operand:SF 2 "gpr_operand" "%r")
>>                    (match_operand:SF 3 "gpr_operand" "r")
>>                    (match_operand:SF 1 "gpr_operand" "0")))
>>       (clobber (reg:CC_FP CCFP_REGNUM))])]
>
> Presumably the strange operand ordering is left over from your port-specific
> builtins?  Also, the % is extraneous since the constraints are identical.

True.  And as I have removed the last use of gen_fmadd / gen_fmsub, I can
remove the expanders.

>> ; combiner pattern, also used by vector combiner pattern
>> (define_expand "maddsf"
>>   [(parallel
>>      [(set (match_operand:SF 0 "gpr_operand" "=r")
>>            (plus:SF (mult:SF (match_operand:SF 1 "gpr_operand" "%r")
>>                              (match_operand:SF 2 "gpr_operand" "r"))
>>                     (match_operand:SF 3 "gpr_operand" "0")))
>>       (clobber (reg:CC_FP CCFP_REGNUM))])]
>>   "TARGET_FUSED_MADD")
>
> I suspect these aren't needed anymore.  Anything the combiner could have
> found should be able to be found by the SSA optimizers.

I'm not sure how well this works with vectorized code.  There are splitters
from vectorized versions of these patterns.
Generating vectorized code is also still pretty hit-and-miss-and-miss,
there are lots of conditions where it gives up for modest SIMD mode sizes
even though a vectorization would be possible.

I suppose I should put a comment on these patterns to review their usefulness
once we can get consistent vectorization.

>> (define_expand "one_cmplsi2"

...

> Why not combine these?  I'm pretty sure that expand_binop will try the xor
> solution all on its own.

Actually, it's expand_unop, and it doesn't.  Left to its own devices, it
will generate a libcall instead.


> Of course, if gpr_operand were to include -1 when m1reg is available, and
> you swapped all the constraints to something other than "r", you get even
> this for free.

That makes sense.  IIRC I did something similar to that for the SH64.
Except it's a lot of trouble to swap out all the constraints (reload still
needs r, so all affected constraint strings become longer), so there was a
separate predicate to be used in the insns where this was profitable.
AFAICT the opportunities for -1 to be useful are likewise there, but limited -
xor, mul, shift.
OTOH there might be more constants later that can be available in a register,
so I'm thinking of "ggpr_operand" for general gpr operand (well, there's
two general in there, but gpr_or_reg_cst_operand seems too long) which
is gpr_operand or any constant that's available in a fixed register.
Then rRcf (Rcf : non-regclass Register constraint: constant in fixed register)
can be the associated constraint.

>> (define_insn "*mov<mode>cc_insn"
>>   [(set (match_operand:WMODE 0 "gpr_operand" "=r")
>>         (if_then_else:WMODE (match_operator 3 "proper_comparison_operator"
>>                               [(match_operand 4 "cc_operand")   
>> (const_int 0)])
>>                             (match_operand:WMODE 1 "nonmemory_operand" "r")
>>                             (match_operand:WMODE 2 "gpr_operand" "0")))]
>
> If there's a good reason for nonmemory_operand and not gpr_operand here,
> you should add a big comment.  It looks like a mistake.

I dimly remember tweaking stuff to make the movcc operands match, but if
it was this, or even that architecture, I'm not sure.  It might also have
to do with the three-insn capacity of combine.  At any rate, it loses
cse / loop hoisting opportunities.
So if there is a good reason to allow immediates, that's best rediscovered
so it can be properly documented.  Or the optimizers improved to make that
unnecessary.


> you should add a big comment.  It looks like a mistake.
>
>> static bool
>> epiphany_frame_pointer_required (void)
>> {
>>   return cfun->calls_alloca;
>
> Isn't this automatic?
>
>> epiphany-load-combiner.o :   
>> $(srcdir)/config/epiphany/epiphany-load-combiner.c

The pass didn't work, so I removed it before submission.
I might add it back later when I can get it to work and do something
useful, but it's not ready for gcc 4.7 .
I've already removed this vestige in t-epiphany a while back.

>> #define IMM16(X)     ((unsigned)(X) <= 0xFFFF)
>> #define IMM5(X)      ((unsigned)(X) <= 0x1F)
>
> These need to be unsigned HOST_WIDE_INT, at minimum.
> Preferably no cast at all and compare vs 0 as well.

Compare vs. 0 gives warnings when the input type is unsigned.
There is a reason IN_RANGE uses the cast to unsigned HOST_WIDE_INT.

IN_RANGE might actually be the better choice for these tests.
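
A rough sketch of an IN_RANGE-style test, modelled on the single-subtraction idiom: both bounds are checked with one unsigned comparison, so there is no separate compare against zero to trigger warnings for unsigned inputs. The typedefs stand in for HOST_WIDE_INT and its unsigned variant, and `IN_RANGE_SKETCH`, `is_imm16` and `is_imm5` are hypothetical names, not the actual definitions from GCC's headers.

```c
#include <assert.h>

/* Stand-ins for HOST_WIDE_INT and unsigned HOST_WIDE_INT.  */
typedef long long hwi;
typedef unsigned long long uhwi;

/* One unsigned comparison covers both bounds: values below LOWER wrap
   around to a huge number and fail the test.  */
#define IN_RANGE_SKETCH(VALUE, LOWER, UPPER) \
  ((uhwi) (VALUE) - (uhwi) (LOWER) <= (uhwi) (UPPER) - (uhwi) (LOWER))

#define IMM16(X) IN_RANGE_SKETCH ((X), 0, 0xFFFF)
#define IMM5(X)  IN_RANGE_SKETCH ((X), 0, 0x1F)

/* Function wrappers so the checks can be exercised with host-wide values.  */
static int is_imm16 (hwi v)
{
  return IMM16 (v);
}

static int is_imm5 (hwi v)
{
  return IMM5 (v);
}
```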

I'll go over your points and prepare a new version now...

Richard Henderson Nov. 5, 2011, 12:24 a.m. UTC | #3
On 11/04/2011 05:00 PM, Joern Rennecke wrote:
>>> (define_insn_and_split "move_frame"
>>>   [(set (match_operand:SI 0 "gpr_operand" "=r")
>>>         (match_operand:SI 1 "register_operand" "r"))
>>>    (clobber (reg:CC CC_REGNUM))]
>>>   "operands[1] == frame_pointer_rtx || operands[1] == arg_pointer_rtx"
...
>>> (define_insn_and_split "*move_frame_1"
>>>   [(set (match_operand:SI 0 "gpr_operand" "=r")
>>>         (match_operand:SI 1 "register_operand" "r"))
>>>    (clobber (reg:CC CC_REGNUM))]
>>>   "(reload_in_progress || reload_completed)
>>>    && (operands[1] == stack_pointer_rtx
>>>        || operands[1] == hard_frame_pointer_rtx)"
>>>   "#"
>>>   "&& 1"
>>>   [(set (match_dup 0) (match_dup 1))])
...
> The second is for a reload_in_progress / reload_completed frame
> pointer reference.  I introduced it as a define_insn_and_split at the
> same time as I introduced move_frame as an insn pattern.

Look much closer.  These patterns are 100% identical.  The *move_frame_1
pattern will _never_ be matched, because such an insn will always be 
matched by move_frame.


>> Given that reload *would* generate an add3 (with no clobber), and
>> add-without-clobber isn't ordinarily a valid insn, why not just go
>> ahead and use that as your reload pattern instead of this
>> clobber placeholder?
...
> It would stop the pre-reload optimizers from goofing up, but not the
> post-reload ones.

Fair enough.

>> It does seem like you'd do well to split this pattern.  If you do,  then you'll
>> automatically get the right changes to debugging unwind info across this.
> 
> The only possible unwind issue I see here is if an interrupt or exception
> triggers inside this sequence.  And that is assuming that we can unwind
> the interrupt / exception otherwise.

stepi with gdb was the case I had in mind.

>> A test for epilogue_completed in the split condition should be sufficient to
>> wait until after peephole2 has finished, so that you don't interfere with the
>> transformations you want there.
> 
> Actually, peephole2 runs after thread_prologue_and_epilogue.

Yes, but there are no splits between thread_prologue_and_epilogue and
peephole2.  The first split pass after t_p_a_e is just before sched2.

Alpha relies on this distinction already.

> For now, I can add a comment about the possible benefit of a splitter.

Fair enough.

> 
>> I'm also interested in hearing about how well this whole scheme works in
>> practice, as opposed to merely waiting until after reload to split and flags
>> users.
> 
> "and flags users"?   I think I get the general idea what you are trying to
> express, but I can't reconstruct what exactly you wanted to say.

On mn10300 (and other targets), cbranch, cstore, adddi3, et al, remain an
indivisible pattern until after reload.  Thus reload can generate a
flags-clobbering addition at any point, because the flags register is
never live between insns at that point.  After reload, the patterns that
require the use of flags are split, both to make it easy to compute
instruction sizes and for scheduling.

I'm asking what kind of benefits you see from splitting these patterns early,
and then working so hard to fix up the problems that might be caused.

> They don't know how to put back the hard reg clobbers of
> (reg:SF 0) and (reg:SI 1) .

Ah, I missed that difference in the patterns.

> Actually, it's expand_unop, and it doesn't.  Left to its own devices, it
> will generate a libcall instead.

That's a shame.  I wonder how many instances we could fix.
Something for a later cleanup, then.


r~

Joern Rennecke Nov. 5, 2011, 12:40 a.m. UTC | #4
Quoting Joern Rennecke <amylaar@spamcop.net>:

> Quoting Richard Henderson <rth@redhat.com>:
  >>> (define_expand "one_cmplsi2"
>
> ...
>
>> Why not combine these?  I'm pretty sure that expand_binop will try the xor
>> solution all on its own.
>
> Actually, it's expand_unop, and it doesn't.  Left to its own devices, it
> will generate a libcall instead.
>
>
>> Of course, if gpr_operand were to include -1 when m1reg is available, and
>> you swapped all the constraints to something other than "r", you get even
>> this for free.
>
> That makes sense.  IIRC I did something similar to that for the SH64.
> Except it's a lot of trouble to swap out all the constraints (reload still
> needs r, so all affected constraint strings become longer), so there was a
> separate predicate to be used in the insns where this was profitable.
> AFAICT the opportunities for -1 to be useful are likewise there, but  
>  limited -
> xor, mul, shift.
> OTOH there might be more constants later that can be available in a register,
> so I'm thinking of "ggpr_operand" for general gpr operand (well, there's
> two general in there, but gpr_or_reg_cst_operand seems too long) which
> is gpr_operand or any constant that's available in a fixed register.
> Then rRcf (Rcf : non-regclass Register constraint: constant in fixed  
>  register)
> can be the associated constraint.

Actually, it's not that simple.  The not:SI operation can be generated
by combine.  But it won't generate (xor:SI (reg:SI foo) (const_int -1)) .

So I need one_cmplsi2_i as combiner pattern no matter what.

It makes sense to expand straight to that pattern though, to avoid hoisting -1
out of a loop.

Richard Henderson Nov. 5, 2011, 12:59 a.m. UTC | #5
On 11/04/2011 05:40 PM, Joern Rennecke wrote:
> Actually, it's not that simple.  The not:SI operation can be generated
> by combine.  But it won't generate (xor:SI (reg:SI foo) (const_int -1)) .
> 
> So I need one_cmplsi2_i as combiner pattern no matter what.

Oh, right.  I forgot about combine not liking the absence of neg/not.


r~

Joern Rennecke Nov. 5, 2011, 1:02 a.m. UTC | #6
Quoting Richard Henderson <rth@redhat.com>:

>> static bool
>> epiphany_frame_pointer_required (void)
>> {
>>   return cfun->calls_alloca;
>
> Isn't this automatic?

Only if EXIT_IGNORE_STACK is nonzero; it's zero for epiphany.
But I think I can change that.
Joern Rennecke Nov. 5, 2011, 1:33 a.m. UTC | #7
Quoting Richard Henderson <rth@redhat.com>:

> Look much closer.  These patterns are 100% identical.  The *move_frame_1
> pattern will _never_ be matched, because such an insn will always be
> matched by move_frame.

move_frame only matches if:
   operands[1] == frame_pointer_rtx || operands[1] == arg_pointer_rtx

move_frame_1 instead matches if:
   (reload_in_progress || reload_completed)
    && (operands[1] == stack_pointer_rtx
        || operands[1] == hard_frame_pointer_rtx)
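
A small C model of the two conditions quoted above may make the distinction easier to see.  This is a sketch under stated assumptions, not code from the port: the enum values are hypothetical stand-ins for the rtx objects, they are assumed to be four distinct objects, and in_or_after_reload models (reload_in_progress || reload_completed).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the rtx objects named in the insn conditions;
   assumed here to be pairwise distinct.  */
enum reg_kind
{
  FRAME_POINTER,
  ARG_POINTER,
  STACK_POINTER,
  HARD_FRAME_POINTER,
  OTHER_REG
};

/* Model of the condition quoted for move_frame.  */
bool move_frame_cond (enum reg_kind op1)
{
  return op1 == FRAME_POINTER || op1 == ARG_POINTER;
}

/* Model of the condition quoted for *move_frame_1.  */
bool move_frame_1_cond (enum reg_kind op1, bool in_or_after_reload)
{
  return in_or_after_reload
         && (op1 == STACK_POINTER || op1 == HARD_FRAME_POINTER);
}

/* Under the distinctness assumption, no operand satisfies both conditions.  */
void check_conditions_disjoint (void)
{
  for (int r = FRAME_POINTER; r <= OTHER_REG; r++)
    for (int reload = 0; reload <= 1; reload++)
      assert (!(move_frame_cond ((enum reg_kind) r)
                && move_frame_1_cond ((enum reg_kind) r, reload != 0)));
}
```

In this model the RTL templates can be identical while the insn conditions select disjoint sets of operands, which is the point being made about move_frame_1 still being reachable.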

Patch

Index: contrib/config-list.mk
===================================================================
--- contrib/config-list.mk	(revision 180924)
+++ contrib/config-list.mk	(working copy)
@@ -18,7 +18,8 @@  LIST = alpha-linux-gnu alpha-freebsd6 al
   arm-linux-androideabi arm-uclinux_eabi arm-ecos-elf arm-eabi \
   arm-symbianelf arm-rtems arm-elf arm-wince-pe avr-rtems avr-elf \
   bfin-elf bfin-uclinux bfin-linux-uclibc bfin-rtems bfin-openbsd \
-  c6x-elf c6x-uclinux cris-elf cris-linux crisv32-elf crisv32-linux fido-elf \
+  c6x-elf c6x-uclinux cris-elf cris-linux crisv32-elf crisv32-linux \
+  epiphany-elf epiphany-elfOPT-with-stack-offset=16 fido-elf \
   fr30-elf frv-elf frv-linux h8300-elf h8300-rtems hppa-linux-gnu \
   hppa-linux-gnuOPT-enable-sjlj-exceptions=yes hppa64-linux-gnu \
   hppa2.0-hpux10.1 hppa64-hpux11.3 \
Index: MAINTAINERS
===================================================================
--- MAINTAINERS	(revision 180924)
+++ MAINTAINERS	(working copy)
@@ -55,6 +55,7 @@ 
 bfin port		Jie Zhang		jzhang918@gmail.com
 c6x port		Bernd Schmidt		bernds@codesourcery.com
 cris port		Hans-Peter Nilsson	hp@axis.com
+epiphany port		Joern Rennecke		joern.rennecke@embecosm.com
 fr30 port		Nick Clifton		nickc@redhat.com
 frv port		Nick Clifton		nickc@redhat.com
 frv port		Alexandre Oliva		aoliva@redhat.com
@@ -466,8 +467,6 @@  build machinery (*.in)	Ralf Wildenhues
 Easwaran Raman					eraman@google.com
 Rolf Rasmussen					rolfwr@gcc.gnu.org
 Volker Reichelt					v.reichelt@netcologne.de
-Joern Rennecke					amylaar@spamcop.net
-Joern Rennecke					joern.rennecke@embecosm.com
 Bernhard Reutner-Fischer			rep.dot.nop@gmail.com
 Tom Rix						trix@redhat.com
 Craig Rodrigues					rodrigc@gcc.gnu.org
Index: libgcc/config.host
===================================================================
--- libgcc/config.host	(revision 180924)
+++ libgcc/config.host	(working copy)
@@ -433,6 +433,10 @@ 
 cris-*-linux* | crisv32-*-linux*)
 	tmake_file="$tmake_file cris/t-cris t-fdpbit cris/t-linux"
 	;;
+epiphany-*-elf*)
+	tmake_file="epiphany/t-epiphany t-fdpbit epiphany/t-custom-eqsf"
+	extra_parts="$extra_parts crti.o crtint.o crtrunc.o crtm1reg-r43.o crtm1reg-r63.o crtn.o"
+	;;
 fr30-*-elf)
 	tmake_file="$tmake_file fr30/t-fr30 t-fdpbit"
 	extra_parts="$extra_parts crti.o crtn.o"
Index: gcc/doc/extend.texi
===================================================================
--- gcc/doc/extend.texi	(revision 180924)
+++ gcc/doc/extend.texi	(working copy)
@@ -2192,7 +2192,7 @@  types (@pxref{Variable Attributes}, @pxr
 
 @item disinterrupt
 @cindex @code{disinterrupt} attribute
-On MeP targets, this attribute causes the compiler to emit
+On Epiphany and MeP targets, this attribute causes the compiler to emit
 instructions to disable interrupts for the duration of the given
 function.
 
@@ -2551,7 +2551,7 @@  void bar (void)
 
 @item interrupt
 @cindex interrupt handler functions
-Use this attribute on the ARM, AVR, M32C, M32R/D, m68k, MeP, MIPS,
+Use this attribute on the ARM, AVR, Epiphany, M32C, M32R/D, m68k, MeP, MIPS,
 RX and Xstormy16 ports to indicate that the specified function is an
 interrupt handler.  The compiler will generate function entry and exit
 sequences suitable for use in an interrupt handler when this attribute
@@ -2723,7 +2723,8 @@  least version 2.20.1), and GNU C library
 @item long_call/short_call
 @cindex indirect calls on ARM
 This attribute specifies how a particular function is called on
-ARM@.  Both attributes override the @option{-mlong-calls} (@pxref{ARM Options})
+ARM and Epiphany.  Both attributes override the
+@option{-mlong-calls} (@pxref{ARM Options})
 command-line switch and @code{#pragma long_calls} settings.  The
 @code{long_call} attribute indicates that the function might be far
 away from the call site and require a different (more expensive)
Index: gcc/doc/invoke.texi
===================================================================
--- gcc/doc/invoke.texi	(revision 180924)
+++ gcc/doc/invoke.texi	(working copy)
@@ -458,6 +458,14 @@  cpp(1), gcov(1), as(1), ld(1), gdb(1), a
 @c Try and put the significant identifier (CPU or system) first,
 @c so users have a clue at guessing where the ones they want will be.
 
+@emph{Adapteva Epiphany Options}
+@gccoptlist{-mhalf-reg-file -mprefer-short-insn-regs @gol
+-mbranch-cost=@var{num} -mcmove -mnops=@var{num} -msoft-cmpsf @gol
+-msplit-lohi -mpost-inc -mpost-modify -mstack-offset=@var{num} @gol
+-mround-nearest -mlong-calls -mshort-calls -msmall16 @gol
+-mfp-mode=@var{mode} -mvect-double -max-vect-align=@var{num} @gol
+-msplit-vecmove-early -m1reg-@var{reg}}
+
 @emph{ARM Options}
 @gccoptlist{-mapcs-frame  -mno-apcs-frame @gol
 -mabi=@var{name} @gol
@@ -10226,6 +10234,7 @@  finds any @option{-l} options and any no
 @c in Machine Dependent Options
 
 @menu
+* Adapteva Epiphany Options::
 * ARM Options::
 * AVR Options::
 * Blackfin Options::
@@ -10274,6 +10283,161 @@  finds any @option{-l} options and any no
 * zSeries Options::
 @end menu
 
+@node Adapteva Epiphany Options
+@subsection Adapteva Epiphany Options
+
+These @samp{-m} options are defined for Adapteva Epiphany:
+
+@table @gcctabopt
+@item -mhalf-reg-file
+@opindex mhalf-reg-file
+Don't allocate any register in the range @code{r32}@dots{}@code{r63}.
+That allows code to run on hardware variants that lack these registers.
+
+@item -mprefer-short-insn-regs
+@opindex mprefer-short-insn-regs
+Preferentially allocate registers that allow short instruction generation.
+This can result in increased instruction count, so whether this reduces or
+increases code size may vary from case to case.
+
+@item -mbranch-cost=@var{num}
+@opindex mbranch-cost
+Set the cost of branches to roughly @var{num} ``simple'' instructions.
+This cost is only a heuristic and is not guaranteed to produce
+consistent results across releases.
+
+@item -mcmove
+@opindex mcmove
+Enable the generation of conditional moves.
+
+@item -mnops=@var{num}
+@opindex mnops
+Emit @var{num} nops before every other generated instruction.
+
+@item -mno-soft-cmpsf
+@opindex mno-soft-cmpsf
+For single-precision floating point comparisons, emit an fsub instruction
+and test the flags.  This is faster than a software comparison, but can
+get incorrect results in the presence of NaNs, or when two different small
+numbers are compared such that their difference is calculated as zero.
+The default is @option{-msoft-cmpsf}, which uses slower, but IEEE-compliant,
+software comparisons.
+
+@item -mstack-offset=@var{num}
+@opindex mstack-offset
+Set the offset between the top of the stack and the stack pointer.
+E.g., a value of 8 means that the eight bytes in the range sp+0@dots{}sp+7
+can be used by leaf functions without stack allocation.
+Values other than @samp{8} or @samp{16} are untested and unlikely to work.
+Note also that this option changes the ABI; compiling a program with a
+different stack offset than the libraries have been compiled with
+will generally not work.
+This option can be useful if you want to evaluate if a different stack
+offset would give you better code, but to actually use a different stack
+offset to build working programs, it is recommended to configure the
+toolchain with the appropriate @samp{--with-stack-offset=@var{num}} option.
+
+@item -mno-round-nearest
+@opindex mno-round-nearest
+Make the scheduler assume that the rounding mode has been set to
+truncating.  The default is @option{-mround-nearest}.
+
+@item -mlong-calls
+@opindex mlong-calls
+If not otherwise specified by an attribute, assume all calls might be beyond
+the offset range of the b / bl instructions, and therefore load the
+function address into a register before performing a (otherwise direct) call.
+This is the default.
+
+@item -mshort-calls
+@opindex mshort-calls
+If not otherwise specified by an attribute, assume all direct calls are
+in the range of the b / bl instructions, so use these instructions
+for direct calls.  The default is @option{-mlong-calls}.
+
+@item -msmall16
+@opindex msmall16
+Assume addresses can be loaded as 16-bit unsigned values.  This does not
+apply to function addresses for which @option{-mlong-calls} semantics
+are in effect.
+
+@item -mfp-mode=@var{mode}
+@opindex mfp-mode
+Set the prevailing mode of the floating point unit.
+This determines the floating point mode that is provided and expected
+at function call and return time.  Making this mode match the mode you
+predominantly need at function start can make your programs smaller and
+faster by avoiding unnecessary mode switches.
+
+@var{mode} can be set to one of the following values:
+
+@table @samp
+@item caller
+Any mode at function entry is valid, and retained or restored when
+the function returns, and when it calls other functions.
+This mode is useful for compiling libraries or other compilation units
+you might want to incorporate into different programs with different
+prevailing FPU modes, and the convenience of being able to use a single
+object file outweighs the size and speed overhead for any extra
+mode switching that might be needed, compared with what would be needed
+with a more specific choice of prevailing FPU mode.
+
+@item truncate
+This is the mode used for floating point calculations with
+truncating (i.e.@: round towards zero) rounding mode.  That includes
+conversion from floating point to integer.
+
+@item round-nearest
+This is the mode used for floating point calculations with
+round-to-nearest-or-even rounding mode.
+
+@item int
+This is the mode used to perform integer calculations in the FPU, e.g.@:
+integer multiply, or integer multiply-and-accumulate.
+@end table
+
+The default is @option{-mfp-mode=caller}.
+
+@item -mno-split-lohi
+@opindex mno-split-lohi
+@itemx -mno-post-inc
+@opindex mno-post-inc
+@itemx -mno-post-modify
+@opindex mno-post-modify
+Code generation tweaks that disable, respectively, splitting of 32-bit
+loads, generation of post-increment addresses, and generation of
+post-modify addresses.  The defaults are @option{-msplit-lohi},
+@option{-mpost-inc}, and @option{-mpost-modify}.
+
+@item -mno-vect-double
+@opindex mno-vect-double
+Change the preferred SIMD mode to SImode.  The default is
+@option{-mvect-double}, which uses DImode as preferred SIMD mode.
+
+@item -max-vect-align=@var{num}
+@opindex max-vect-align
+The maximum alignment for SIMD vector mode types.
+@var{num} may be 4 or 8.  The default is 8.
+Note that this is an ABI change, even though many library function
+interfaces will be unaffected, if they don't use SIMD vector modes
+in places where they affect size and/or alignment of relevant types.
+
+@item -msplit-vecmove-early
+@opindex msplit-vecmove-early
+Split vector moves into single word moves before reload.  In theory this
+could give better register allocation, but so far the reverse seems to be
+generally the case.
+
+@item -m1reg-@var{reg}
+@opindex m1reg-
+Specify a register to hold the constant @minus{}1, which makes loading small negative
+constants and certain bitmasks faster.
+Allowable values for @var{reg} are @samp{r43} and @samp{r63}, which specify
+to use that register as a fixed register, and @samp{none}, which means that no register is used for this
+purpose.  The default is @option{-m1reg-none}.
+
+@end table
+
 @node ARM Options
 @subsection ARM Options
 @cindex ARM options
Index: gcc/doc/contrib.texi
===================================================================
--- gcc/doc/contrib.texi	(revision 180924)
+++ gcc/doc/contrib.texi	(working copy)
@@ -746,7 +746,7 @@  documentation (DESIGN, CHECKLIST, and so
 
 @item
 Joern Rennecke for maintaining the sh port, loop, regmove & reload
-hacking.
+hacking and developing and maintaining the Epiphany port.
 
 @item
 Loren J. Rittle for improvements to libstdc++-v3 including the FreeBSD
Index: gcc/doc/md.texi
===================================================================
--- gcc/doc/md.texi	(revision 180924)
+++ gcc/doc/md.texi	(working copy)
@@ -1,5 +1,5 @@ 
 @c Copyright (C) 1988, 1989, 1992, 1993, 1994, 1996, 1998, 1999, 2000, 2001,
-@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010
+@c 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011
 @c Free Software Foundation, Inc.
 @c This is part of the GCC manual.
 @c For copying conditions, see the file gcc.texi.
@@ -1778,6 +1778,77 @@  Register pair Z (r31:r30)
 Constant integer 4
 @end table
 
+@item Epiphany---@file{config/epiphany/constraints.md}
+@table @code
+@item U16
+An unsigned 16-bit constant.
+
+@item K
+An unsigned 5-bit constant.
+
+@item L
+A signed 11-bit constant.
+
+@item Cm1
+A signed 11-bit constant added to @minus{}1.
+Can only match when the @option{-m1reg-@var{reg}} option is active.
+
+@item Cl1
+Left-shift of @minus{}1, i.e., a bit mask with a block of leading ones, the rest
+being a block of trailing zeroes.
+Can only match when the @option{-m1reg-@var{reg}} option is active.
+
+@item Cr1
+Right-shift of @minus{}1, i.e., a bit mask with a trailing block of ones, the
+rest being zeroes.  Or to put it another way, one less than a power of two.
+Can only match when the @option{-m1reg-@var{reg}} option is active.
+
+@item Cal
+Constant for arithmetic/logical operations.
+This is like @code{i}, except that for position independent code,
+no symbols / expressions needing relocations are allowed.
+
+@item Csy
+Symbolic constant for call/jump instruction.
+
+@item Rcs
+The register class usable in short insns.  This is a register class
+constraint, and can thus drive register allocation.
+This constraint won't match unless @option{-mprefer-short-insn-regs} is
+in effect.
+
+@item Rsc
+The register class of registers that can be used to hold a
+sibcall call address.  I.e., a caller-saved register.
+
+@item Rct
+Core control register class.
+
+@item Rgs
+The register group usable in short insns.
+This constraint does not use a register class, so that it only
+passively matches suitable registers, and doesn't drive register allocation.
+
+@ifset INTERNALS
+@item Car
+Constant suitable for the addsi3_r pattern.  This is a valid offset
+for byte, halfword, or word addressing.
+@end ifset
+
+@item Rra
+Matches the return address if it can be replaced with the link register.
+
+@item Rcc
+Matches the integer condition code register.
+
+@item Sra
+Matches the return address if it is in a stack slot.
+
+@item Cfm
+Matches control register values to switch fp mode, which are encapsulated in
+@code{UNSPEC_FP_MODE}.
+@end table
+
 @item Hewlett-Packard PA-RISC---@file{config/pa/pa.h}
 @table @code
 @item a
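
As an aside on the Cl1 and Cr1 descriptions in the md.texi hunk above, the two bit-mask shapes can be tested with a couple of bit tricks.  This is a minimal sketch, not the port's actual predicates: the function names are hypothetical, values are treated as 32-bit, and shift counts are assumed to lie in 0..31, so zero matches neither shape here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if v == (uint32_t) -1 << n for some n in 0..31, i.e. a block of
   leading ones followed by a block of trailing zeroes.  */
bool is_left_shift_of_m1 (uint32_t v)
{
  uint32_t inv = ~v;  /* a block of trailing ones if v qualifies */
  return v != 0 && (inv & (inv + 1)) == 0;
}

/* True if v == (uint32_t) -1 >> n for some n in 0..31, i.e. a trailing
   block of ones, or one less than a power of two.  */
bool is_right_shift_of_m1 (uint32_t v)
{
  return v != 0 && (v & (v + 1)) == 0;
}

/* Spot checks for both shapes.  */
void check_mask_shapes (void)
{
  assert (is_left_shift_of_m1 (0xfffffff0u));
  assert (!is_left_shift_of_m1 (0xff00ff00u));
  assert (is_right_shift_of_m1 (0x000000ffu));
  assert (!is_right_shift_of_m1 (0x00000005u));
}
```

Both tests rely on the fact that adding 1 to a block of trailing ones clears every bit of that block, so the AND is zero exactly when no other bits are set.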
Index: gcc/doc/install.texi
===================================================================
--- gcc/doc/install.texi	(revision 180924)
+++ gcc/doc/install.texi	(working copy)
@@ -1208,6 +1208,11 @@  Specify that the target supports TLS (Th
 Specify if the compiler should default to @option{-marm} or @option{-mthumb}.
 This option is only supported on ARM targets.
 
+@item --with-stack-offset=@var{num}
+This option sets the default for the @option{-mstack-offset=@var{num}} option,
+and will thus generally also control the setting of this option for
+libraries.  This option is only supported on Epiphany targets.
+
 @item --with-fpmath=@var{isa}
 This options sets @option{-mfpmath=sse} by default and specifies the default
 ISA for floating-point arithmetics.  You can select either @samp{sse} which
@@ -3314,6 +3319,13 @@  Collection (GCC)},
 
 @html
 <hr />
+@end html
+@heading @anchor{epiphany-x-elf}epiphany-*-elf
+Adapteva Epiphany.
+This configuration is intended for embedded systems.
+
+@html
+<hr />
 @end html
 @heading @anchor{x-x-freebsd}*-*-freebsd*
 
Index: gcc/testsuite/gcc.c-torture/execute/ieee/mul-subnormal-single-1.x
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/ieee/mul-subnormal-single-1.x	(revision 180924)
+++ gcc/testsuite/gcc.c-torture/execute/ieee/mul-subnormal-single-1.x	(working copy)
@@ -1,3 +1,8 @@ 
+if [istarget "epiphany-*-*"] {
+    # The Epiphany single-precision floating point format does not
+    # support subnormals.
+    return 1
+}
 if [istarget "mips-sgi-irix6*"] {
     # IRIX 6 sets the MIPS IV flush to zero bit by default, so this test
     # isn't expected to work for n32 and n64 on MIPS IV targets.
Index: gcc/testsuite/gcc.c-torture/execute/20101011-1.c
===================================================================
--- gcc/testsuite/gcc.c-torture/execute/20101011-1.c	(revision 180924)
+++ gcc/testsuite/gcc.c-torture/execute/20101011-1.c	(working copy)
@@ -28,6 +28,10 @@ 
   /* Not all Linux kernels deal correctly the breakpoints generated by
      MIPS16 divisions by zero.  They show up as a SIGTRAP instead.  */
 # define DO_TEST 0
+#elif defined (__epiphany__)
+  /* Epiphany does not have hardware division, and the software implementation
+     has truly undefined behaviour for division by 0.  */
+# define DO_TEST 0
 #else
 # define DO_TEST 1
 #endif
Index: gcc/testsuite/gcc.dg/stack-usage-1.c
===================================================================
--- gcc/testsuite/gcc.dg/stack-usage-1.c	(revision 180924)
+++ gcc/testsuite/gcc.dg/stack-usage-1.c	(working copy)
@@ -52,6 +52,8 @@ 
 #  define SIZE 160 /* 256 -  96 bytes for register save area */
 #elif defined (__SPU__)
 #  define SIZE 224
+#elif defined (__epiphany__)
+#  define SIZE (256 - __EPIPHANY_STACK_OFFSET__)
 #else
 #  define SIZE 256
 #endif
Index: gcc/testsuite/gcc.dg/pragma-pack-3.c
===================================================================
--- gcc/testsuite/gcc.dg/pragma-pack-3.c	(revision 180924)
+++ gcc/testsuite/gcc.dg/pragma-pack-3.c	(working copy)
@@ -1,6 +1,7 @@ 
 /* PR c++/25294 */
 /* { dg-options "-std=gnu99" } */
-/* { dg-do run } */
+/* Epiphany makes struct S 8-byte aligned.  */
+/* { dg-do run { target { ! epiphany-*-* } } } */
 
 extern void abort (void);
 
Index: gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c
===================================================================
--- gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c	(revision 180924)
+++ gcc/testsuite/gcc.dg/torture/stackalign/builtin-apply-2.c	(working copy)
@@ -9,6 +9,15 @@ 
 
 #define INTEGER_ARG  5
 
+#if defined(__ARM_PCS) || defined(__epiphany__)
+/* For Base AAPCS, NAME is passed in r0.  D is passed in r2 and r3.
+   E, F and G are passed on stack.  So the size of the stack argument
+   data is 20.  */
+#define STACK_ARGUMENTS_SIZE  20
+#else
+#define STACK_ARGUMENTS_SIZE  64
+#endif
+
 extern void abort(void);
 
 void foo(char *name, double d, double e, double f, int g)
@@ -19,7 +28,7 @@  void foo(char *name, double d, double e,
 
 void bar(char *name, ...)
 {
-  __builtin_apply(foo, __builtin_apply_args(), 64);
+  __builtin_apply(foo, __builtin_apply_args(), STACK_ARGUMENTS_SIZE);
 }
 
 int main(void)
Index: gcc/testsuite/gcc.dg/weak/typeof-2.c
===================================================================
--- gcc/testsuite/gcc.dg/weak/typeof-2.c	(revision 180924)
+++ gcc/testsuite/gcc.dg/weak/typeof-2.c	(working copy)
@@ -5,6 +5,9 @@ 
 /* { dg-require-weak "" } */
 /* { dg-require-alias "" } */
 /* { dg-options "-O2" } */
+/* Using -mshort-calls avoids loading the function addresses in
+   registers and thus getting the counts wrong.  */
+/* { dg-additional-options "-mshort-calls" { target epiphany-*-* } } */
 
 extern int foo1 (int x) __asm ("baz1");
 int bar1 (int x) { return x; }
Index: gcc/testsuite/gcc.dg/tls/thr-cse-1.c
===================================================================
--- gcc/testsuite/gcc.dg/tls/thr-cse-1.c	(revision 180924)
+++ gcc/testsuite/gcc.dg/tls/thr-cse-1.c	(working copy)
@@ -1,5 +1,8 @@ 
 /* { dg-do compile } */
 /* { dg-options "-O1" } */
+/* Using -mshort-calls avoids loading the function addresses in
+   registers and thus getting the counts wrong.  */
+/* { dg-additional-options "-mshort-calls" { target epiphany-*-* } } */
 /* { dg-require-effective-target tls_emulated } */
 
 /* Test that we only get one call to emutls_get_address when CSE is
Index: gcc/testsuite/gcc.dg/20020312-2.c
===================================================================
--- gcc/testsuite/gcc.dg/20020312-2.c	(revision 180924)
+++ gcc/testsuite/gcc.dg/20020312-2.c	(working copy)
@@ -20,6 +20,8 @@  extern void abort (void);
 /* No pic register.  */
 #elif defined(__cris__)
 # define PIC_REG  "0"
+#elif defined(__epiphany__)
+#define PIC_REG "r28"
 #elif defined(__fr30__)
 /* No pic register.  */
 #elif defined(__H8300__) || defined(__H8300H__) || defined(__H8300S__)
Index: gcc/testsuite/gcc.dg/builtin-apply2.c
===================================================================
--- gcc/testsuite/gcc.dg/builtin-apply2.c	(revision 180924)
+++ gcc/testsuite/gcc.dg/builtin-apply2.c	(working copy)
@@ -12,7 +12,7 @@ 
 
 #define INTEGER_ARG  5
 
-#ifdef __ARM_PCS
+#if defined(__ARM_PCS) || defined(__epiphany__)
 /* For Base AAPCS, NAME is passed in r0.  D is passed in r2 and r3.
    E, F and G are passed on stack.  So the size of the stack argument
    data is 20.  */
Index: gcc/testsuite/g++.dg/opt/devirt2.C
===================================================================
--- gcc/testsuite/g++.dg/opt/devirt2.C	(revision 180924)
+++ gcc/testsuite/g++.dg/opt/devirt2.C	(working copy)
@@ -1,5 +1,8 @@ 
 // { dg-do compile }
 // { dg-options "-O2" }
+/* Using -mshort-calls avoids loading the function addresses in
+   registers and thus getting the counts wrong.  */
+// { dg-additional-options "-mshort-calls" {target epiphany-*-*} }
 // { dg-final { scan-assembler-times "xyzzy" 2 { target { ! { alpha*-*-* hppa*-*-* ia64*-*-hpux* sparc*-*-* } } } } }
 // The IA64 and HPPA compilers generate external declarations in addition
 // to the call so those scans need to be more specific.
Index: gcc/testsuite/g++.dg/parse/pragma3.C
===================================================================
--- gcc/testsuite/g++.dg/parse/pragma3.C	(revision 180924)
+++ gcc/testsuite/g++.dg/parse/pragma3.C	(working copy)
@@ -1,5 +1,6 @@ 
 // PR c++/25294
-// { dg-do run }
+// Epiphany makes struct S 8-byte aligned.
+// { dg-do run { target { ! epiphany-*-* } } }
 
 extern "C" void abort (void);
 
Index: gcc/config.gcc
===================================================================
--- gcc/config.gcc	(revision 180924)
+++ gcc/config.gcc	(working copy)
@@ -967,6 +967,14 @@ 
 		;;
 	esac
 	;;
+epiphany-*-elf )
+	tm_file="dbxelf.h elfos.h newlib-stdint.h ${tm_file}"
+	tmake_file="epiphany/t-epiphany"
+	extra_options="${extra_options} fused-madd.opt"
+	extra_objs="$extra_objs mode-switch-use.o resolve-sw-modes.o"
+	tm_defines="${tm_defines} EPIPHANY_STACK_OFFSET=${with_stack_offset:-8}"
+	extra_headers="epiphany_intrinsics.h"
+	;;
 fr30-*-elf)
 	tm_file="dbxelf.h elfos.h newlib-stdint.h ${tm_file}"
 	;;